The weak lensing radial acceleration relation Constraining modified gravity and cold dark matter theories with KiDS-1000

We present measurements of the radial gravitational acceleration around isolated galaxies, comparing the expected gravitational acceleration given the baryonic matter ( g bar ) with the observed gravitational acceleration ( g obs ), using weak lensing measurements from the fourth data release of the Kilo-Degree Survey (KiDS-1000). These measurements extend the radial acceleration relation (RAR), traditionally measured using galaxy rotation curves, by 2 decades in g obs into the low-acceleration regime beyond the outskirts of the observable galaxy. We compare our RAR measurements to the predictions of two modiﬁed gravity (MG) theories: modiﬁed Newtonian dynamics and Verlinde’s emergent gravity (EG). We ﬁnd that the measured relation between g obs and g bar agrees well with the MG predictions. In addition, we ﬁnd a di ﬀ erence of at least 6 σ between the RARs of early-and late-type galaxies (split by Sérsic index and u − r colour) with the same stellar mass. Current MG theories involve a gravity modiﬁcation that is independent of other galaxy properties, which would be unable to explain this behaviour, although the EG theory is still limited to spherically symmetric static mass models. The di ﬀ erence might be explained if only the early-type galaxies have signiﬁcant ( M gas ≈ M


Introduction
It has been known for almost a century that the outer regions of galaxies rotate faster than would be expected from Newtonian dynamics based on their luminous, or 'baryonic', mass (Kapteyn 1922;Oort 1932Oort , 1940;;Babcock 1939).This was also demonstrated by Gottesman et al. (1966) and Bosma (1981) through measurements of hydrogen profiles at radii beyond the optical discs of galaxies, and by Rubin (1983) through measurements of galactic rotation curves within the optical discs.The excess gravity implied by these measurements has generally been attributed to an unknown and invisible substance named dark matter (DM), a term coined more than 40 years prior by Zwicky (1933) when he discovered the so-called missing mass problem through the dynamics of galaxies in clusters.More recently, new methods such as weak gravitational lensing (Hoekstra et al. 2004;Mandelbaum et al. 2006;Clowe et al. 2006;Heymans et al. 2013;von der Linden et al. 2014), baryon acoustic oscillations (Eisenstein et al. 2005;Blake et al. 2011), and the cosmic microwave background (CMB; de Bernardis et al. 2000;Spergel et al. 2003;Planck XVI 2014) have contributed unique evidence to the missing mass problem.
Among many others, these observations have contributed to the fact that cold dark matter1 (CDM) has become a key ingredient of the current standard model of cosmology: the ΛCDM model.In this paradigm, CDM accounts for a fraction Ω CDM = 0.266 of the critical density ρ crit = 3H2 0 /8πG in the Universe, while baryonic matter only accounts for Ω bar = 0.049 (Planck VI 2020).The cosmological constant Λ, which is necessary to explain the accelerated expansion of the Universe (Riess et al. 1998;Perlmutter et al. 1999) and is a special case of dark energy (DE), accounts for the remaining Ω Λ = 0.685 in our flat space-time (de Bernardis et al. 2000).
Although the ΛCDM model successfully describes the observations on a wide range of scales, no conclusive direct evidence for the existence of DM particles has been found so far (despite years of enormous effort; for an overview, see Bertone et al. 2005;Bertone & Tait 2018).Combined with other current open questions in physics, such as the elusive unification of general relativity (GR) with quantum mechanics and the mysterious nature of DE, this leaves room for alternative theories of gravity.Two modified gravity (MG) theories that do not require the existence of particle DM are modified Newtonian dynamics (MOND ;Milgrom 1983) and the more recent theory of emergent gravity (EG; Verlinde 2017).In these theories all gravity is due to the baryonic matter (or, in the case of EG, the interaction between baryons and the entropy associated with DE).Hence, one of the main properties of these theories is that the mass discrepancy in galaxies correlates strongly with their baryonic mass distribution.
Such a correlation has indeed been observed, such as via the Tully-Fisher relation (Tully & Fisher 1977) between the luminosity of a spiral galaxy and its asymptotic rotation velocity (Pierce & Tully 1988;Bernstein et al. 1994).This relation was later generalised as the baryonic Tully-Fisher relation (McGaugh et al. 2000;McGaugh 2012) to include non-stellar forms of baryonic matter.Even earlier, astronomers had found a strong correlation between the observed rotation velocity as a function of galaxy radius v obs (r) and the enclosed luminous mass M bar (< r) (Sanders 1986(Sanders , 1996;;McGaugh 2004;Sanders & Noordermeer 2007;Wu & Kroupa 2015).Since M bar (< r) corresponds to the expected gravitational acceleration g bar (r) from baryonic matter, and the observed gravitational acceleration can be calculated through g obs (r) = v 2 obs (r)/r, this relation has also been named the radial acceleration relation (RAR) 2 .McGaugh et al. (2016, hereafter M16) in particular measured the RAR with unprecedented accuracy, using the Spitzer Photometry and Accurate Rotation Curves (SPARC; Lelli et al. 2016) data of 153 late-type galaxies.Their results again showed a tight correlation between g obs and g bar , which they could describe using a simple double power law (eq.4 in M16) that depends only on g bar and one free parameter: the acceleration scale g † where Newtonian gravity appears to break down.This rekindled the interest of scientists working on alternative theories of gravity (Lelli et al. 2017a,b;Burrage et al. 2017;Li et al. 2018;O'Brien et al. 2019), but also of those seeking an explanation of the RAR within the ΛCDM framework, employing correlations between the masses, sizes, and DM content of galaxies (Di Cintio & Lelli 2016;Keller & Wadsley 2017;Desmond 2017;Ludlow et al. 2017;Navarro et al. 2017;Tenneti et al. 2018).Navarro et al. (2017, hereafter N17) used a range of simplifying assumptions based on galaxy observations and DM simulations in order to create an analytical galaxy model including the baryonic and halo components.With this model they reconstruct the RAR inside galaxy discs, in particular the value of a 0 , the acceleration scale where the relation transitions from the baryondominated to the DM-dominated regime (which is equivalent to g † ), and a min , the minimum acceleration probed by galaxy discs.Based on their results, they claim that the RAR can be explained within the ΛCDM framework at the accelerations probed by galaxy rotation curves (within the galaxy disc, i.e. g obs > a min ).However, since their model relies on the fact that luminous kinematic tracers in galaxies only probe a limited radial range, N17 predicted that extending observations to radii beyond the disc (which correspond to lower gravitational accelerations) would lead to systematic deviations from the simple double power law proposed by M16.Although some progress has been made using globular clusters (Bílek et al. 2019b,a;Müller et al. 2021), using kinematic tracers to measure the RAR beyond the outskirts of visible galaxies remains difficult.
The goal of this work is to extend observations of the RAR to extremely low accelerations that cannot currently be detected through galaxy rotation curves or any other kinematic measurement.To this end, we use gravitational lensing: the perturbation of light inside a gravitational potential as described by relativistic theories such as GR.Both weak and strong gravitational lensing were used by Tian et al. (2020) to measure the RAR from observations of 20 galaxy clusters targeted by the CLASH survey.However, due to the high cluster masses, the accelerations probed by these measurements were of the same order as those measurable with galaxy rotation curves.In this work, we use the method of galaxy-galaxy lensing (GGL): the statistical measurement of the coherent image distortion (shear) of a field of background galaxies (sources) by the gravitational potential of a sample of individual foreground galaxies (lenses; for examples, see e.g.Brainerd et al. 1996;Fischer et al. 2000;Hoekstra et al. 2004;Mandelbaum et al. 2006;van Uitert et al. 2016).Using GGL we can measure the average (apparent) density distribution of isolated galaxies up to a radius of 3 Mpc, roughly 100 times larger than the radius of the luminous disc (∼ 30 kpc).At our stellar mass scale of interest -log(M / h −2 70 M ) ≈ 10.5 -this radius corresponds to g bar ≈ 10 −15 m s −2 , which is three orders of magnitude lower than the baryonic accelerations of the M16 rotation curves3 .
Our main goal is to use the lensing RAR of isolated galaxies at lower accelerations (beyond the observable galaxy disc) to distinguish which of the aforementioned MG and ΛCDM models best describe this result.To achieve this, we first measure the total and baryonic density profiles of our galaxies through their GGL profiles and luminosities.These measurements will be performed using 1006 deg 2 of weak lensing data from the Kilo-Degree Survey (KiDS-1000;de Jong et al. 2013;Kuijken et al. 2019), and nine-band photometric data from KiDS and the VISTA Kilo-Degree Infrared Galaxy Survey (Edge et al. 2013, VIKING).We then translate these measurements into the observed and baryonic radial accelerations, g obs and g bar .Finally, we compare the resulting RAR to predictions from different MG theories (MOND and EG) and ΛCDM.To test the MG theories, we need to make the assumption that the deflection of light by gravitational potentials (as described in GR) holds in these modified theories, which we motivate in the relevant sections.This work can be seen as an extension of Brouwer et al. (2017), where we tested the predictions of EG using KiDS GGL on foreground galaxies from 180 deg 2 of the Galaxy and Mass Assembly (GAMA) survey.Instead of GAMA, we now use a selection of ∼ 1 million foreground galaxies from KiDS-1000 to achieve a fivefold increase in survey area.
The ΛCDM predictions will not only be provided by the N17 analytical model, but also by mock galaxy catalogues based on two different DM simulations.One is the Marenostrum Institut de Ciències de l'Espai (MICE) Galaxy and Halo Light-cone catalogue (Carretero et al. 2015;Hoffmann et al. 2015), which is based on the MICE Grand Challenge lightcone simulation (Fosalba et al. 2015a,b;Crocce et al. 2015).The other mock galaxy catalogue is based on a suite of large-volume cosmological hydrodynamical simulations, called the BAryons and HAloes of MAssive Systems (BAHAMAS) project (McCarthy et al. 2017).
Having ∼ 1 million foreground galaxies at our disposal allows us to select specific galaxy samples, designed to optimally test the predictions from the aforementioned MG and ΛCDM models.Particularly, we note that the analytical models (MOND, EG and N17) mostly focus on the description of individual, isolated galaxies.In order to test them, we select a sample of galaxies whose GGL profiles are minimally affected by neighbouring galaxies (e.g.satellites) within the radius of our measurement.In contrast, the predictions from simulations can be tested with both isolated and non-isolated galaxy samples.
In addition, our sample of ∼ 350 000 isolated lens galaxies allows us to analyse the RAR as a function of colour, Sérsic index and stellar mass.Because MG and ΛCDM give different predictions regarding the dependence of the RAR on these observables, this allows us to better distinguish between the different models.Specifically: according to the MOND and EG theories the relation between g bar and g obs should remain fixed in the regime beyond the baryon-dominated galaxy disc, and hence be independent of galaxy observables.Within the ΛCDM paradigm, the relation between g bar and g obs is related to the stellar-to-halo-mass relation (SHMR) that is not necessarily constant as a function of galaxy stellar mass or other observables.
Our paper is structured as follows: In Section 2 we describe the methodology behind the GGL measurements and their conversion into the RAR, in addition to the theoretical predictions to which we compare our observations: MOND, EG and the N17 analytical DM model.In Section 3 we introduce the KiDS-1000 and GAMA galaxy surveys used to perform both the GGL and stellar mass measurements.Section 4 describes the MICE and BAHAMAS simulations and mock galaxy catalogues to which we compare our results.In Section 5 we present our lensing RAR measurements and compare them to the different models, first using all isolated galaxies and then separating the galaxies by different observables.Section 6 contains the discussion and conclusion.In Appendix A we validate our isolated galaxy selection, and Appendix B contains a description of the piecewise-powerlaw method of translating the lensing measurement into g obs .Finally, Appendix C shows the comparison of the N17 analytical DM model with our lensing RAR.
Throughout this work we adopt the WMAP 9-year (Hinshaw et al. 2013) cosmological parameters: Ω m = 0.2793, Ω b = 0.0463, Ω Λ = 0.7207, σ 8 = 0.821 and H 0 = 70 km s −1 Mpc −1 , which were used as the basis of the BAHAMAS simulation.When analysing the MICE simulations we use the cosmological parameters used in creating MICE, which are: Ω m = 0.25, σ 8 = 0.8, Ω Λ = 0.75, and H 0 = 70 km s −1 Mpc −1 .Throughout the paper we use the reduced Hubble constant h 70 = H 0 /(70 km s −1 Mpc −1 ).Due to the relatively low redshift of our lens galaxies (z ∼ 0.2) the effect of differences in the cosmological parameters on our results is small.

Mass measurements with weak gravitational lensing
To estimate the gravitational acceleration around galaxies we used GGL: the measurement of the coherent image distortion of a field of background galaxies (sources) by the gravitational potential of a sample of foreground galaxies (lenses).Because the individual image distortions are very small (only ∼ 1% compared to the galaxy's unknown original shape), this method can only be performed statistically for a large sample of sources.We averaged their projected ellipticity component tangential to the direction of the lens galaxy, t , which is the sum of the intrinsic tangential ellipticity component int t and the tangential shear γ t caused by weak lensing.Assuming no preferential alignment in the intrinsic galaxy shapes ( int t = 0), the average t is an estimator for γ t .By measuring this averaged quantity in circular annuli around the lens centre, we obtained the tangential shear profile γ t (R) as a function of projected radius R. Because our final goal is to compute the observed gravitational acceleration g obs as a function of that expected from baryonic matter g bar , we chose our R-bins such that they corresponded to 15 logarithmic bins between 1×10 −15 < g bar < 5×10 −12 m s −2 .For each individual lens the calculation of these g bar -bins was based on the baryonic mass of the galaxy M gal (see Section 3.3).In real space this binning approximately corresponds to the distance range used in Brouwer et al. (2017): 0.03 < R < 3 h −1 70 Mpc.The lensing shear profile can be related to the physical excess surface density (ESD, denoted ∆Σ) profile through the critical surface density Σ crit : which is the surface density Σ(R) at projected radius R, subtracted from the average surface density Σ (< R) within R. See Section 3.1 for more information on how this is computed.
The error values on the ESD profile were estimated by the square-root of the diagonal of the analytical covariance matrix, which is described in section 3.4 of Viola et al. (2015).The full covariance matrix was calculated based on the contribution of each individual source to the ESD profile, and incorporates the correlation between sources that contribute to the ESD in multiple bins, both in projected distance R and in galaxy observable.

The radial acceleration relation (RAR)
After measuring the lensing profile around a galaxy sample, the next step is to convert it into the corresponding RAR.We started from the ESD as a function of projected radius ∆Σ(R) and the measured stellar masses of the lens galaxies M , aiming to arrive at their observed radial acceleration g obs as a function of their expected baryonic radial acceleration g bar .The latter can be calculated using Newton's law of universal gravitation: which defines the radial acceleration g in terms of the gravitational constant G and the enclosed mass M(< r) within spherical radius r.Assuming spherical symmetry here is reasonable, given that for lensing measurements thousands of galaxies are stacked under many different angles to create one average halo profile.The calculation of g bar requires the enclosed baryonic mass M bar (< r) of all galaxies.We discuss our construction of M bar (< r) in Section 3.3.The calculation of g obs requires the enclosed observed mass M obs (< r) of the galaxy sample, which we obtained through the conversion of our observed ESD profile ∆Σ(R).
When calculating g obs we started from our ESD profile measurement, which consists of the value ∆Σ(R) measured in a set of radial bins.At our measurement radii (R > 30 h −1 70 kpc) the ESD is dominated by the excess gravity, which means the contribution from baryonic matter can be neglected.We adopted the simple assumption that our observed density profile ρ obs (r) is roughly described by a Singular Isothermal Sphere (SIS) model: The SIS is generally considered to be the simplest parametrisation of the spatial distribution of matter in an astronomical system (such as galaxies, clusters, etc.).If interpreted in a ΛCDM context, the SIS implies the assumption that the DM particles have a Gaussian velocity distribution analogous to an ideal gas that is confined by their combined spherically symmetric gravitational potential, where σ is the total velocity dispersion of the particles.In a MG context, however, the SIS profile can be considered to represent a simple r −2 density profile as predicted by MOND and EG in the low-acceleration regime outside a baryonic mass distribution, with σ as a normalisation constant.The ESD derived from the SIS profile is: From Brouwer et al. (2017) we know that, despite its simple form, it provides a good approximation of the GGL measurements around isolated galaxies.The SIS profile is therefore wellsuited to analytically model the total enclosed mass distribution of our lenses, which can then be derived as follows: Now, for each individual observed ESD value ∆Σ obs,m at certain projected radius R m , we assumed that the density distribution within R m is described by an SIS profile with σ normalised such that ∆Σ SIS (R m ) = ∆Σ obs,m .Under this approximation, we combined equations 4 and 5 to give a relation between the lensing measurement ∆Σ and the deprojected, spherically enclosed mass M obs : Through Eq. 2, this results in a very simple expression for the observed gravitational acceleration: Throughout this work, we have used the SIS approximation to convert the ESD into g obs .In Section 4.4 we validate this approach by comparing it to a more elaborate method and testing both on the BAHAMAS simulation.

The RAR with modified Newtonian dynamics
With his theory, MOND, Milgrom (1983) postulated that the missing mass problem in galaxies is not caused by an undiscovered fundamental particle, but that instead our current gravitational theory should be revised.Since MOND is a nonrelativistic theory, performing GGL measurements to test it requires the assumption that light is curved by a MONDian gravitational potential in the same way as in GR.This assumption is justified since Milgrom (2013, while testing the MOND paradigm using GGL data from the Canada-France-Hawaii Telescope Lensing survey), states that non-relativistic MOND is a limit of relativistic versions that predict that gravitational potentials determine lensing in the same way as Newtonian potentials in GR.For this reason GGL surveys can be used as valuable tools to test MOND and similar MG theories, as was done for instance by Tian et al. (2009) using Sloan Digital Sky Survey (SDSS) and Red-sequence Cluster Survey data.MOND's basic premise is that one can adjust Newton's second law of motion (F = ma) by inserting a general function µ(a/a 0 ), which only comes into play when the acceleration a of a test mass m is much smaller than a critical acceleration scale a 0 .This function predicts the observed flat rotation curves in the outskirts of galaxies, while still reproducing the Newtonian behaviour of the inner disc.In short, the force F becomes: This implies that a a 0 represents the Newtonian regime where F N = m a N as expected, while a a 0 represents the 'deep-MOND' regime where F MOND = m a 2 MOND /a 0 .In a circular orbit, this is reflected in the deep-MOND gravitational acceleration g MOND ≡ a MOND as follows: This can be written in terms of the expected baryonic acceleration g bar = GM/r 2 as follows: This demonstrates that MOND predicts a very simple relation for the RAR: g obs = g bar in the Newtonian regime (g obs a 0 ) and Eq. 9 in the deep-MOND regime (g obs a 0 ).However, since µ(a/a 0 ), also known as the interpolating function, is not specified by Milgrom (1983), there is no specific constraint on the behaviour of this relation in between the two regimes.In the work of Milgrom & Sanders (2008), several families of interpolation functions are discussed.Selecting the third family (given by their eq.13) with constant parameter α = 1/2, provides the function that M16 later used to fit to their measurement of the RAR using rotation curves of 153 galaxies.This relation can be written as: where a 0 ≡ g † corresponds to the fitting parameter constrained by M16 to be g † = 1.20 ± 0.26 × 10 −10 m s −2 .Since Eq. 11 (equal to eq. 4 in M16) is also considered a viable version of the MOND interpolation function by Milgrom & Sanders (2008), we will consider it the baseline prediction of MOND in this work.As the baseline value of a 0 , we will likewise use the value of g † measured by M16 since it exactly corresponds to the value of a 0 = 1.2 × 10 −10 m s −2 considered canonical in MOND since its first measurement by Begeman et al. (1991), using the rotation curves of 10 galaxies.One of the main characteristics of the MOND paradigm, is that it gives a direct and fixed prediction for the total acceleration based only on the system's baryonic mass, given by Eq. 11.The main exception to this rule is the possible influence by neighbouring mass distributions through the external field effect (EFE), predicted by Milgrom (1983) and studied analytically, observationally and in simulations by Banik & Zhao (2015); Banik et al. (2020); Chae et al. (2020).Since we explicitly selected isolated galaxies in this work (see Appendix A), this effect is minimised as much as possible.However, since total isolation cannot be guaranteed, a small EFE might remain.In order to describe this effect, we used eq.6 from Chae et al. (2020): with: Here z ≡ g bar /g † , A e ≡ e(1 + e/2)/(1 + e), and B e ≡ (1 + e).The strength of the EFE is parametrised through: e = g ext /g † , determined by the external gravitational acceleration g ext .Although the interpolation functions differ, the result of Eq. 13 corresponds almost exactly to the M16 fitting function given in Eq. 11 in the limit e = 0 (no EFE).Positive values of e result in reduced values of the predicted g obs at very low accelerations (see Fig. 4 in Section 5.2, and fig. 1 of Chae et al. 2020).It should be noted that this fitting function represents an idealised model and could be subject to deviations in real, complex, 3D galaxies.

The RAR with emergent gravity
The work of Verlinde (2017, V17 hereafter), which is embedded in the framework of string theory and holography, shares the view that the missing mass problem is to be solved through a revision of our current gravitational theory.Building on the ideas from Jacobson (1995Jacobson ( , 2016)); Padmanabhan (2010); Verlinde (2011); Faulkner et al. (2014), V17 abandons the notion of gravity as a fundamental force.Instead, it emerges from an underlying microscopic description of space-time, in which the notion of gravity has no a priori meaning.V17 shows that constructing an EG theory in a universe with a negative cosmological constant ('anti-de Sitter') allows for the re-derivation of Einstein's laws of GR.A distinguishing feature of V17 is that it attempts to describe a universe with a positive cosmological constant ('de Sitter'), that is, one that is filled with a DE component.This results in a new volume law for gravitational entropy caused by DE, in addition to the area law normally used to retrieve Einsteinian gravity.According to V17, energy that is concentrated in the form of a baryonic mass distribution causes an elastic response in the entropy of the surrounding DE.This results in an additional gravitational component at scales set by the Hubble acceleration scale a 0 = cH 0 /6.Here c is the speed of light, and H 0 is the current Hubble constant that measures the Universe's expansion velocity.
Because this extra gravitational component aims to explain the effects usually attributed to DM, it is conveniently expressed as an apparent dark matter (ADM) distribution: Thus the ADM distribution is completely defined by the baryonic mass distribution M bar (r) as a function of the spherical radius r, and a set of known physical constants.Since we measured the ESD profiles of galaxies at projected radial distances R > 30 h −1 70 kpc, we can follow Brouwer et al. (2017) in assuming that their baryonic component is equal to the stars+cold gas mass enclosed within the minimal measurement radius (for further justification of this assumption, see Section 4.3).This is equivalent to describing the galaxy as a point mass M bar , which allows us to simplify Eq. 14 to: Now the total enclosed mass M EG (r) = M bar + M ADM (r) can be used to calculate the gravitational acceleration g EG (r) predicted by EG, as follows: In terms of the expected baryonic acceleration g bar (r) = GM bar /r 2 , this simplifies even further to: We emphasise that Eq. 14 is only a macroscopic approximation of the underlying microscopic phenomena described in V17, and is thus only valid for static, spherically symmetric and isolated baryonic mass distributions.For this reason, we selected only the most isolated galaxies from our sample (see Appendix A), such that our GGL measurements are not unduly influenced by neighbouring galaxies.Furthermore, the current EG theory is only valid in the acceleration range g bar < a 0 , often called the deep-MOND regime.Therefore, the prediction of Eq. 17 should be taken with a grain of salt for accelerations g bar > 1.2 × 10 −10 m s −2 .This will not affect our analysis since weak lensing takes place in the weak gravity regime.In addition, cosmological evolution of the H 0 parameter is not yet implemented in the theory, restricting its validity to galaxies with relatively low redshifts.However, we calculated that at our mean lens redshift, z ∼ 0.2, using an evolving H(z) would result in only a ∼ 5% difference in our ESD measurements, based on the background cosmology used in this work.
In order to test EG using the standard GGL methodology, we needed to assume that the deflection of photons by a gravitational potential in this alternative theory corresponds to that in GR.This assumption is justified because, in EG's original (anti-de Sitter) form, Einstein's laws emerge from its underlying description of space-time.The additional gravitational force described by ADM does not affect this underlying theory, which is an effective description of GR.Therefore, we assumed that the gravitational potential of an ADM distribution produces the same lensing shear as an equivalent distribution of actual matter.

The RAR in ΛCDM
To help guide an intuitive interpretation of the lensing RAR within the framework of the ΛCDM theory, we made use of the simple model of N17, which combines a basic model of galactic structure and scaling relations to predict the RAR.We refer to N17 for a full description, but give a summary here.A galaxy of a given stellar (or baryonic -there is no distinction in this model) mass occupies a DM halo of a mass fixed by the abundance matching relation of Behroozi et al. (2013).The dark halo concentration is fixed to the cosmological mean for haloes of that mass (Ludlow et al. 2014).The baryonic disc follows an exponential surface density profile with a half-mass size fixed to 0.2× the scale radius of the dark halo.This model is sufficient to specify the cumulative mass profile of both the baryonic and dark components of the model galaxy; calculating g obs and g bar is then straightforward.However, since the N17 model is merely a simple analytical description, our main ΛCDM test utilised more elaborate numerical simulations (see Section 4).

The Kilo-Degree Survey (KiDS)
We measured the gravitational potential around a sample of foreground galaxies (lenses), by measuring the image distortion (shear) of a field of background galaxies (sources).These sources were observed using OmegaCAM (Kuijken 2011): a 268-million pixel CCD mosaic camera mounted on the Very Large Telescope (VLT) Survey Telescope (Capaccioli & Schipani 2011).Over the past ten years these instruments have performed KiDS, a photometric survey in the ugri bands, which was especially designed to perform weak lensing measurements (de Jong et al. 2013).
GGL studies with KiDS have hitherto been performed in combination with the spectroscopic GAMA survey (see Section 3.2), with the KiDS survey covering 180 deg 2 of the GAMA area.Although the final KiDS survey will span 1350 deg 2 on the sky, the current state-of-the-art is the 4 th Data Release (KiDS-1000; Kuijken et al. 2019) containing observations from 1006 square-degree survey tiles.We therefore used a photometrically selected 'KiDS-bright' sample of lens galaxies from the full KiDS-1000 release, as described in Section 3.3.The measurement and calibration of the source shapes and photometric redshifts are described in Kuijken et al. (2019); Giblin et al. (2021) and Hildebrandt et al. (2021).
The measurements of the galaxy shapes are based on the rband data since this filter was used during the darkest time (moon distance > 90 deg) and with the best atmospheric seeing con-ditions (< 0.8 arcsec).The r-band observations were co-added using the Theli pipeline (Erben et al. 2013).From these images the galaxy positions were detected through the SExtractor algorithm (Bertin & Arnouts 1996).After detection, the shapes of the galaxies were measured using the lensfit pipeline (Miller et al. 2007(Miller et al. , 2013)), which includes a self-calibration algorithm based on Fenech Conti et al. (2017) that was validated in Kannawadi et al. (2019).Each shape is accompanied by a lensfit weight w s , which was used as an estimate of the precision of the ellipticity measurement.
For the purpose of creating the photometric redshift and stellar mass estimates, 9 bands were observed in total.The ugri bands were observed by KiDS, while the VIKING survey (Edge et al. 2013) performed on the VISTA telescope adds the ZY JHK s bands.All KiDS bands were reduced and co-added using the Astro-WISE pipeline (AW; McFarland et al. 2013).The galaxy colours, which form the basis of the photometric redshift measurements, were measured from these images using the Gaussian Aperture and PSF pipeline (GAaP; Kuijken 2008;Kuijken et al. 2015).
The addition of the lower frequency VISTA data allowed us to extend the redshift estimates out to 0.1 < z B < 1.2, where z B is the best-fit photometric redshift of the sources (Benítez 2000;Hildebrandt et al. 2012).However, when performing our lensing measurements (see Section 2.1) we used the total redshift probability distribution function n(z s ) of the full source population.This n(z s ) was calculated using a direct calibration method (see Hildebrandt et al. 2017 for details), and circumvents the inherent bias related to photometric redshift estimates of individual sources.
We note that this is a different redshift calibration method than that used by the KiDS-1000 cosmology analyses (Asgari et al. 2021;Heymans et al. 2021;Tröster et al. 2020), who used a self-organising map to remove (primarily high-redshift) sources whose redshifts could not be accurately calibrated due to incompleteness in the spectroscopic sample (Wright et al. 2020;Hildebrandt et al. 2021).Following Robertson et al. (in prep.)we prioritised precision by analysing the full KiDS-1000 source sample (calibrated using the direct calibration method) since percent-level biases in the mean source redshifts do not significantly impact our analysis.
For the lens redshifts z l , we used the ANNz2 (Artificial Neural Network) machine-learning redshifts of the KiDS foreground galaxy sample (KiDS-bright; see Section 3.3).We implemented the contribution of z l by integrating over the individual redshift probability distributions p(z l ) of each lens.This p(z l ) is defined by a normal distribution centred at the lens' z ANN redshift, with a standard deviation: σ z /(1 + z) = 0.02 (which is equal to the standard deviation of the KiDS-bright redshifts compared to their matched spectroscopic GAMA redshifts).For the source redshifts z s we followed the method used in Dvornik et al. (2018), integrating over the part of the redshift probability distribution n(z s ) where z s > z l .In addition, sources only contribute their shear to the lensing signal when z B + ∆z > z l -when the sum of their best-fit photometric redshift z B and the redshift buffer ∆z = 0.2 is greater than the lens redshift.Hence, when performing the lensing measurement in Section 2.1 the critical surface density 4 (the conversion factor between γ t and ∆Σ, whose inverse is also called the lensing efficiency) was calculated as fol-lows: Here D(z l ) and D(z s ) are the angular diameter distances to the lens and the source respectively, and D(z l , z s ) the distance between them.The constant multiplication factor is defined by Newton's gravitational constant G and the speed of light c.
The ESD profile was averaged (or 'stacked') for large samples of lenses to increase the signal-to-noise (S /N) ratio of the lensing signal.We defined a lensing weight W ls that depends on both the lensfit weight w s and the lensing efficiency Σ −1 crit : and used it to optimally sum the measurements from all lenssource pairs into the average ESD: Here the factor (1+µ) calibrates the shear estimates (Fenech Conti et al. 2017;Kannawadi et al. 2019).Extending the method of Dvornik et al. (2017) to the higher KiDS-1000 redshifts, µ denotes the mean multiplicative calibration correction calculated in 11 linear redshift bins between 0.1 < z B < 1.2 from the individual source calibration values m: The value of this correction is µ ≈ 0.014, independent of the projected distance from the lens.We also corrected our lensing signal for sample variance on large scales by subtracting the ESD profile measured around ∼ 5 million uniform random coordinates, 50 times the size of our total KiDS-bright sample.These random coordinates mimic the exact footprint of KiDS, excluding the areas masked by the 'nine-band no AW-r-band' mask that we applied to the KiDSbright lenses (see Section 3.3).In order to create random redshift values that mimic the true distribution, we created a histogram of the KiDS-bright redshifts divided into 80 linear bins between 0.1 < z ANN < 0.5.In each bin, we created random redshift values equal to the number of real lenses in that bin.Because of the large contiguous area of KiDS-1000, we found that the random ESD profile is very small at all projected radii R, with a mean absolute value of only 1.85 ± 0.75% of the lensing signal of the full sample of isolated KiDS-bright galaxies.

The Galaxy and Mass Assembly (GAMA) survey
Although the most contraining RAR measurements below were performed using exclusively KiDS-1000 data, the smaller set of foreground galaxies observed by the spectroscopic GAMA survey (Driver et al. 2011) functions both as a model and validation sample for the KiDS foreground galaxies.The survey was performed by the Anglo-Australian Telescope with the AAOmega spectrograph, and targeted more than 238 000 galaxies selected from the Sloan Digital Sky Survey (Abazajian et al. 2009, SDSS;).For this study we used GAMA II observations (Liske et al. 2015) from three equatorial regions (G09, G12, and G15) range, the measured ESD profiles are expected to be approximately stationary in proper coordinates.
containing more than 180 000 galaxies.These regions span a total area of ∼ 180 deg 2 on the sky, completely overlapping with KiDS.
GAMA has a redshift range of 0 < z < 0.5, with a mean redshift of z = 0.22.The survey has a redshift completeness of 98.5% down to Petrosian r-band magnitude m r,Petro = 19.8mag.We limited our GAMA foreground sample to galaxies with the recommended redshift quality: n Q ≥ 3. Despite being a smaller survey, GAMA's accurate spectroscopic redshifts were highly advantageous when measuring the lensing profiles of galaxies (see Section 2.1).The GAMA redshifts were used to train the photometric machine-learning (ML) redshifts of our larger sample of KiDS foreground galaxies (see Section 3.3).Also, in combination with its high redshift completeness, GAMA allows for a more accurate selection of isolated galaxies.We therefore checked that the results from the KiDS-only measurements are consistent with those from KiDS-GAMA.
To measure the RAR with KiDS-GAMA, we need individual stellar masses M for each GAMA galaxy.We used the Taylor et al. (2011) stellar masses, which are calculated from ugrizZY spectral energy distributions5 measured by SDSS and VIKING by fitting them with Bruzual & Charlot (2003) Stellar Population Synthesis (SPS) models, using the Initial Mass Function (IMF) of Chabrier (2003).Following the procedure described by Taylor et al. (2011), we accounted for flux falling outside the automatically selected aperture using the 'flux-scale' correction.

Selecting isolated lens galaxies with accurate redshifts and stellar masses
Because of its accurate spectroscopic redshifts, the GAMA lenses would be an ideal sample for the selection of isolated galaxies and the measurement of accurate stellar masses (as was done in Brouwer et al. 2017).However, since the current KiDS survey area is > 5 times larger than that of GAMA, we selected a KiDS-bright sample of foreground galaxies from KiDS-1000 that resembles the GAMA survey.We then used the GAMA redshifts as a training sample to compute neural-net redshifts for the KiDS-bright lenses (see e.g.Bilicki et al. 2018), from which accurate stellar masses could subsequently be derived.The details of the specific sample used in this work are provided in Bilicki et al. (2021).Here we give an overview relevant for this paper.
To mimic the magnitude limit of GAMA (m r,Petro < 19.8 mag), we applied a similar cut to the (much deeper) KiDS survey.Because the KiDS catalogue does not contain Petrosian magnitudes we used the Kron-like elliptical aperture r-band magnitudes from SExtractor, calibrated for r-band extinction and zero-point offset6 , which have a very similar magnitude distribution.Through matching the KiDS and GAMA galaxies and seeking the best trade-off between completeness and purity, we decided to limit our KiDS-bright sample to m r,auto < 20.0.In addition we removed KiDS galaxies with a photometric redshift z > 0.5, where GAMA becomes very incomplete.
To remove stars from our galaxy sample, we applied a cut based on galaxy morphology, nine-band photometry and the SExtractor star-galaxy classifier 7 .Through applying the IMAFLAGS_ISO=0 flag, we also removed galaxies that are af-fected by readout and diffraction spikes, saturation cores, bad pixels, or by primary, secondary or tertiary haloes of bright stars8 .We applied the recommended mask that was also used to create the KiDS-1000 shear catalogues9 .In addition, objects that are not detected in all 9 bands were removed from the sample.Our final sample of KiDS-bright lenses consists of ∼ 1 million galaxies, more than fivefold the number of GAMA galaxies.This increased lens sample allowed us to verify the results from Brouwer et al. (2017) with increased statistics, and to study possible dependencies of the RAR on galaxy observables.
To use the KiDS-bright sample as lenses to measure g obs , we needed accurate individual redshifts for all galaxies in our sample.These photometric redshifts z ANN were derived from the full nine-band KiDS+VIKING photometry by training on the spectroscopic GAMA redshifts (see Section 3.2) using the ANNz2 (Artificial Neural Network) machine learning method (Sadeh et al. 2016).When comparing this z ANN to the spectroscopic GAMA redshifts z G measured for the same galaxies, we found that their mean offset However, this offset is mainly caused by the low-redshift galaxies: z ANN < 0.1.Removing these reduces the mean offset to δz/(1 + z G ) = −6 × 10 −5 , with a standard deviation σ z = σ(δz) = 0.026.This corresponds to a redshift-dependent deviation of σ z /(1 + z ANN ) = 0.02 based on the mean redshift z ANN = 0.25 of KiDS-bright between 0.1 < z < 0.5, which is the lens redshift range used throughout this work for all lens samples.
In order to measure the expected baryonic acceleration g bar , we computed the KiDS-bright stellar masses M based on these ANNz2 redshifts and the nine-band GAaP photometry.Because the GAaP photometry only measures the galaxy magnitude within a specific aperture size, the stellar mass was corrected using the 'fluxscale' parameter 10 The stellar masses were computed using the LePhare algorithm (Arnouts et al. 1999;Ilbert et al. 2006), which performs SPS model fits on the stellar component of the galaxy spectral energy distribution.We used the Bruzual & Charlot (2003) SPS model, with the IMF from Chabrier (2003, equal to those used for the GAMA stellar masses).LePhare provides both the best-fit logarithmic stellar mass value 'MASS_BEST' of the galaxy template's probability distribution function, and the 68% confidence level upper and lower limits.We used the latter to estimate the statistical uncertainty on M .For both the upper and lower limit, the mean difference with the best-fit mass is approximately: Another way of estimating the statistical uncertainty in the stellar mass is to combine the estimated uncertainties from the input: the redshifts and magnitudes.The redshift uncertainty σ z / z G = 0.11 corresponds to an uncertainty in the luminosity distance of: σ(δD L )/ D L = 0.12.We took0 the flux F to remain constant between measurements, such that: Assuming that approximately L ∝ M leads to an estimate: which finally gives our adopted stellar mass uncertainty resulting from the KiDS-bright redshifts: log 10 (1 + δM /M ) = 0.11 dex.The uncertainty resulting from the KiDS-bright magnitudes is best estimated by comparing two different KiDS apparent magnitude measurements: the elliptical aperture magnitudes 'MAG_AUTO_CALIB' from SExtractor and the Sérsic magnitudes 'MAG_2dphot' from 2DPHOT (La Barbera et al. 2008).The standard deviation of their difference, δm = m 2dphot − m calib , is σ(δm) = 0.69, which corresponds to a flux ratio of F 2dphot /F calib = 1.88 (or 0.27 dex).Using the same assumption, now taking D L to remain constant, results in: 4πD 2 L F ∝ F ∝ L ∝ M .This means our flux ratio uncertainty directly corresponds to our estimate of the M uncertainty.Quadratically combining the 0.11 dex uncertainty from the redshifts and the 0.27 dex uncertainty from the magnitudes gives an estimate of the total statistical uncertainty on the stellar mass of ∼ 0.29 dex.This is much larger than that from the LePhare code.Taking a middle ground between these two, we have assumed twice the LePhare estimate: σ M = 0.12 dex.However, we have confirmed that using the maximal estimate σ M = 0.29 dex throughout our analysis does not change the conclusions of this work, in particular those of Section 5.4.
When comparing M ,ANN with the GAMA stellar masses M ,G of matched galaxies, we found that its distribution is very similar, with a standard deviation of 0.21 dex around the mean.Nevertheless there exists a systematic offset of log(M ,ANN ) − log(M ,G ) = −0.056dex, which is caused by the differences in the adopted stellar mass estimation methods.In general, it has been found impossible to constrain stellar masses to within better than a systematic uncertainty of ∆M ≈ 0.2 dex when applying different methods, even when the same SPS, IMF and data are used (Taylor et al. 2011;Wright et al. 2017).We therefore normalised the M ,ANN values of our KiDS-bright sample to the mean M ,G of GAMA, while indicating throughout our results the range of possible bias due to a ∆M = 0.2 dex systematic shift in M .We estimated the effect of this bias by computing the RAR with log 10 (M ) ± ∆M as upper and lower limits.
In order to compare our observations to the MG theories, the measured lensing profiles of our galaxies should not be significantly affected by neighbouring galaxies, which we call 'satellites'.We defined our isolated lenses (Appendix A) such that they do not have any satellites with more than a fraction f M ≡ M ,sat /M ,lens of their stellar mass within a spherical radius r sat (where r sat was calculated from the projected and redshift distances between the galaxies).We chose f M = 0.1, which corresponds to 10% of the lens stellar mass, and r sat = 3 h −1 70 Mpc, which is equal to the maximum projected radius of our measurement.In short: r sat ( f M > 0.1) > 3 h −1 70 Mpc.We also restricted our lens stellar masses to M < 10 11 h −2 70 M since galaxies with higher masses have significantly more satellites (see Section 2.2.3 of Brouwer et al. 2017).This provided us with an isolated lens sample of 259 383 galaxies.We provide full details of our choice of isolation criterion and an extensive validation of the isolated galaxy sample in Appendix A. Based on tests with KiDS, GAMA and MICE data we found that this is the optimal isolation criterion for our data.The ESD profile of our isolated sample is not significantly affected by satellite galaxies and that our sample is accurate to ∼ 80%, in spite of it being flux-limited.Using the MICE simulation we also estimated that the effect of the photometric redshift error is limited.

Simulations
In order to compare our observations to ΛCDM-based predictions, we used two different sets of simulations: MICE and BA-HAMAS.Here MICE is an N-body simulation, which means that galaxies are added to the DM haloes afterwards, while BA-HAMAS is a hydrodynamical simulation that incorporates both stars and gas through sub-grid physics.MICE, however, has a simulation volume at least two orders of magnitude larger than BAHAMAS.Below we explain the details of each simulation, and how we utilised their unique qualities for our analysis.

MICE mock catalogues
The MICE N-body simulation contains ∼ 7 × 10 10 DM particles in a (3072 h −1 70 Mpc) 3 comoving volume (Fosalba et al. 2015a).From this simulation the MICE collaboration constructed a ∼ 5000 deg 2 lightcone with a maximum redshift of z = 1.4.The DM haloes in this lightcone were identified using a Friend-of-Friend algorithm on the particles.These DM haloes were populated with galaxies using a hybrid halo occupation distribution (HOD) and halo abundance matching (HAM) prescription (Carretero et al. 2015;Crocce et al. 2015).The galaxy luminosity function and colour distribution of these galaxies were constructed to reproduce local observational constraints from SDSS (Blanton et al. 2003b(Blanton et al. ,a, 2005)).
In the MICECATv2.0catalogue11 , every galaxy had sky coordinates, redshifts, comoving distances, apparent magnitudes and absolute magnitudes assigned to them.Of the total MICE lightcone we used 1024 deg 2 , an area similar to the KiDS-1000 survey.We used the SDSS apparent r-band magnitudes m r as these most closely match those from KiDS (see Brouwer et al. 2018).We could therefore limit the MICE galaxies to the same apparent magnitude as the KiDS-bright sample: m r < 20 mag, in order to create a MICE foreground galaxy (lens) sample.We used the same redshift limit: 0.1 < z < 0.5, resulting in a mean MICE lens redshift z = 0.23, almost equal to that of GAMA and KiDS-bright within this range.The absolute magnitudes of the mock galaxies go down to M r − 5 log 10 (h 100 ) < −14 mag, which corresponds to the faintest GAMA and KiDSbright galaxies.Each galaxy was also assigned a stellar mass M , which is needed to compute the RAR (see Section 2.2).These stellar masses were determined from the galaxy luminosities L using Bell & de Jong (2001) M /L ratios.
In addition, each galaxy had a pair of lensing shear values associated with it (γ 1 and γ 2 , with respect to the Cartesian coordinate system).These shear values were calculated from healpix weak lensing maps that were constructed using the 'onion shell method' (Fosalba et al. 2008(Fosalba et al. , 2015b)).The lensing map of MICE-CATv2.0 has a pixel size of 0.43 arcmin.We did not use MICE results within a radius R res corresponding to 3 times this resolution.We calculated R res and the corresponding g bar using the mean angular diameter distance and baryonic mass of the MICE lens sample.For the full sample of isolated MICE galaxies these values are: R res = 0.25 h −1 70 Mpc and g bar = 6.60 × 10 −14 m s −2 .At scales larger than this resolution limit, the MICE shears allowed us to emulate the GGL analysis and conversion to the RAR that we performed on our KiDS-1000 data (as described in Section 2) using the MICE simulation.To create a sample of MICE background galaxies (sources) for the lensing analysis, we applied limits on the MICE mock galaxies' redshifts and apparent magnitudes, which are analogous to those applied to the KiDS source sample: 0.1 < z < 1.2, m r > 20 (see Hildebrandt et al. 2017 and Section 3.1; uncertainties in the KiDS z B are not accounted for in this selection).We also applied an absolute magnitude cut of M r > −18.5 mag, in order to reproduce the KiDS source redshift distribution more closely.
The MICE mock catalogue also features very accurate clustering.At lower redshifts (z < 0.25) the clustering of the mock galaxies as a function of luminosity was constructed to reproduce the Zehavi et al. (2011) clustering observations, while at higher redshifts (0.45 < z < 1.1) the MICE clustering was validated against the Cosmic Evolution Survey (COSMOS; Ilbert et al. 2009).The accurate MICE galaxy clustering allowed us to analyse the RAR at larger scales (> 0.3 h −1 70 Mpc) where clustered neighbouring galaxies start to affect the lensing signal.MICE also allowed us to test our criteria defining galaxy isolation (see Appendix.A).

BAHAMAS mock catalogue
The second set of simulations that we utilised is BAHAMAS (McCarthy et al. 2017).The BAHAMAS suite are smoothedparticle hydrodynamical realisations of (400 h −1 100 Mpc) 3 volumes and include prescriptions for radiative cooling and heating, ionising background radiation, star formation, stellar evolution and chemical enrichment, (kinetic wind) supernova feedback, supermassive black hole accretion, and merging and thermal feedback from active galactic nuclei (AGN).The simulations were calibrated to reproduce the stellar and hot gas content of massive haloes, which makes them particularly well suited for our study of the matter content around haloes out to distances of 1-3 h −1 70 Mpc.The masses of DM and baryonic resolution elements are 3.85×10 9 h −1 100 M and 7.66×10 8 h −1 100 M respectively, and the gravitational softening is fixed at = 4 h −1 100 kpc = 5.71 h −1 70 kpc.Haloes and galaxies were identified in the simulations using the friends-of-friends (Davis et al. 1985) and Subfind (Springel et al. 2001;Dolag et al. 2009) algorithms.We labeled the most massive sub-halo in each Friend-of-Friend group as the 'central' and other sub-haloes as 'satellites'.We constructed an 'isolated' galaxy sample by restricting the selection to central sub-haloes that have no other sub-haloes (satellites or centrals) more massive than 10% of their mass within 3 h −1 70 Mpc.We randomly selected 100 galaxies per 0.25 dex bin in M 200 between 10 12 and 10 13.5 h −2 70 M .In the last two bins there were fewer than 100 candidates, so we selected them all.All galaxies have a redshift z = 0.25.For each selected galaxy we constructed an integrated surface density map, integrated along the line-of-sight for ±15 comoving h −1 100 Mpc around the target halo.We also extracted the cumulative spherically averaged mass profile of each target sub-halo, decomposed into DM, stars, and gas.For both the maps and profiles, we included mass contributions from all surrounding (sub)structures: we did not isolate the haloes from their surrounding environment.
We used the integrated surface density map of each galaxy to calculate its mock ESD profile as a function of the projected distance R from the lens centre, in order to mimic the effect of GGL and the conversion to the RAR on the BAHAMAS results.Each pixel on these maps corresponds to 15 comoving h −1 100 kpc, which in our physical units is: 15/(1 + z) 0.7 −1 h −1 70 kpc = 17.14 h −1 70 kpc.The density maps each have a dimensionality of 400 × 400 pixels.Hence the total area of each map is (6.86 h −1 70 Mpc) 2 .In calculating the lensing profiles and RAR with BAHAMAS we followed, as closely as possible, the GGL procedure and conver-sion to the RAR as described in Section 2. We truncated our lensing profiles at 10 times the gravitational softening length: 10 = 0.057 h −1 70 Mpc, to avoid the numerically poorly converged central region (Power et al. 2003).For a typical galaxy in our sample of isolated BAHAMAS galaxies, this corresponds to g bar ∼ 2.38 × 10 −12 m s −2 .

The BAHAMAS RAR: Quantifying the missing baryon effect
The calculation of the expected baryonic radial acceleration g bar requires the enclosed baryonic mass M bar (< r) within a spherical radius r around the galaxy centre.Since we are dealing with measurements around isolated galaxies at R > 30 h −1 70 kpc, we can approximate M bar (< r) as a point mass M gal mainly composed of the mass of the lens galaxy itself.M gal can be subdivided into stars and gas, and the latter further decomposed into cold and hot gas.
How we obtained the stellar masses of our GAMA, KiDSbright, MICE and BAHAMAS galaxies is described in Sections 3 and 4. From these M values, the fraction of cold gas f cold = M cold /M can be estimated using scaling relations based on H i and CO observations.Following Brouwer et al. (2017) we used the best-fit scaling relation found by Boselli et al. (2014), based on the Herschel Reference Survey (Boselli et al. 2010): We applied this equation to all observed and simulated values of M in order to arrive at the total galaxy mass: The spatial distribution of the stellar and cold gas mass are similar (Pohlen et al. 2010;Crocker et al. 2011;Mentuch Cooper et al. 2012;Davis et al. 2013) and can therefore be considered a single mass distribution, especially for the purposes of GGL, which only measures the ESD profile at scales larger than the galaxy disc (R > 30 h −1 70 kpc).We illustrate this in Fig. 1, which shows the enclosed mass profiles (upper panel) and RAR (lower panel) for different baryonic components in the BAHAMAS simulation.For these mock galaxies, the stellar mass within 30 h −1 70 kpc (red star) gives a good approximation of the M distribution across all radii that we consider.We therefore modeled the baryonic mass of our galaxies as a point mass M gal , containing both the stellar and cold gas mass.
We recognise that the total baryonic mass distribution M bar of galaxies may include a significant amount of additional mass at larger distances, notably in the hot gas phase.This is illustrated in Fig. 1.In the upper panel, we show the average baryonic mass profile for BAHAMAS galaxies with 1 < M 200 /(10 12 h −2 70 M ) < 3.In addition, we show an estimate of the typical baryonic mass profile for galaxies in the same mass range, based on an extrapolation to larger radii of the compilation of observations in Tumlinson et al. (2017); including stars, cold gas (< 10 4 K, traced by absorption lines such as H i, Na i and Ca ii), cool gas (10 4 -10 5 K, traced by many UV absorption lines, e.g.Mg ii, C ii, C iii, Si ii, Si iii, N ii, N iii), warm gas (10 5 -10 6 K, traced by C iv, N v, O vi and Ne vii absorption lines), hot gas (> 10 6 K, traced by its X-ray emission) and dust (estimated from the reddening of background QSOs, and Ca ii absorption).The light blue shaded region therefore illustrates a component of missing baryons predicted by these simulations but not (yet) observed, possibly related to the cosmological missing baryons (e.g.Fukugita et al. 1998;Fukugita & Peebles 2004;Shull et al. 2012).There are several possibilities: (i) there may be additional gas present in a difficult-to-observe phase (e.g.hot, low-density gas, see for instance Nicastro et al. 2018); (ii) the simulations do not accurately  (2017).In the inner galaxy the discrepancy (light blue shaded region) between the observed and simulated M bar is relatively small, but in the outer galaxy the majority of the baryons predicted to be present in BA-HAMAS consist of currently unobserved, missing baryons.The orange dashed line shows the expected baryonic mass profile if the baryon density is everywhere equal to a fixed fraction f b = Ω b /Ω m of the local DM density.At large enough radii ( 2 h −1 70 Mpc), the baryon-to-DM ratio converges to the cosmic average.Lower panel: As in upper panel, but in acceleration space.The cosmic baryon fraction provides a strong theoretical upper limit on g bar at low accelerations in the context of the ΛCDM cosmology.reflect reality, for example: galaxies may eject substantially more gas from their surroundings than is predicted by these simulations; (iii) there may be less baryonic matter in the Universe than expected in the standard cosmology based on big bang nucleosynthesis (BBN; Kirkman et al. 2003) calculations and CMB measurements (Spergel et al. 2003;Planck XVI 2014).
The lower panel of Fig. 1 illustrates the magnitude of the resulting systematic uncertainties in g bar .In the ΛCDM cosmology, the expectation at sufficiently large radii is given by g obs = f −1 b g bar where f b is the cosmic baryon fraction f b = Ω b /Ω m = 0.17 (Hinshaw et al. 2013).BAHAMAS, and generically any ΛCDM galaxy formation simulation, converges to this density at low enough accelerations (large enough radii).The most optimistic extrapolation of currently observed baryons falls a factor of ∼ 3 short of this expectation, while the stellar mass alone is a further factor of ∼ 3 lower.The unresolved uncertainty around these missing baryons is the single most severe limitation of our analysis.Given that we are interested in both ΛCDM and alternative cosmologies, we will use the stellar+cold gas mass M gal as our fiducial estimate of the total baryonic mass M bar , which is translated into the baryonic acceleration g bar , throughout this work.This serves as a secure lower limit on g bar .We note that the eventual detection, or robust non-detection, of the missing baryons has direct implications for the interpretation of the results presented in Section 5.In Section 5.2 we address the possible effect of extended hot gas haloes on g bar .We discuss this issue further in Section 6.
Concerning g obs , omitting the contribution of hot gas will not have a large effect on the prediction within the ΛCDM framework (e.g. from simulations) since the total mass distribution at the considered scales is heavily dominated by DM.Within MG frameworks such as EG and MOND, where the excess gravity is sourced by the baryonic matter, it is slightly more complicated.(Brouwer et al. 2017, see section 2.2) carefully modelled the distribution of all baryonic components, based on observations from both GAMA and the literature, including their effect on the excess gravity in the EG framework.They found that, for galaxies with M < 10 11 h −2 70 M , the contribution to the ESD profile (and hence to g obs ) from hot gas and satellites was small compared to that of the stars and cold gas.Although this analysis was done for the EG theory, the effect of these extended mass distributions within MOND are similar or even less.This allows us to use a point mass M gal as a reasonable approximation for the baryonic mass distribution M bar (< r) within our measurement range when computing g obs as predicted by MOND and EG (see Section 2.3 and 2.4).

The BAHAMAS RAR: Testing the ESD to RAR conversion
We used BAHAMAS to test the accuracy of our SIS method (outlined in Section 2.2) in estimating g obs from our GGL measurement of ∆Σ obs , by comparing it against the more sophisticated piece-wise power law (PPL) method outlined in Appendix B. As a test system, we used the 28 galaxies from our BAHAMAS sample with 10 13 < M 200 /( h −2 70 M ) < 10 13.1 .We combined these into a stacked object by averaging the individual ESD profiles as derived from their mock lensing maps.The stacked ESD as measured from the lensing mocks is shown in the left panel of Fig. 2. Since the mock ESD profiles are derived from convergence maps (rather than the shapes of background galaxies), they have no associated measurement uncertainty -for simplicity, we assumed a constant 0.1 dex uncertainty, which is similar to that for the KiDS measurements.We also combined the spherically averaged enclosed mass profiles of the galaxies out to 3 h −1 70 Mpc by averaging them.From this average mass profile we analytically calculated the ESD profile shown in the left panel of Fig. 2. We found that the ∆Σ calculated from the spherically averaged mass profile is ∼ 0.05 dex higher than the direct measurement of the stacked lensing mocks.This primarily results from the fact that the spherically averaged mass profile does not take into account the additional matter outside the 3 h −1 70 Mpc spherical aperture, whereas the mock surface density maps are integrated along the line-of-sight for ±15 comoving h −1 100 Mpc around the lens.The PPL method described in Appendix B attempts to reproduce the ESD profile by converging to an appropriate volume density profile.The resulting recovered ESD profile and its 68% confidence interval is shown with blue points and error bars in the left panel of Fig. 2 -the fit to the mock data is excellent.In the centre panel we show the enclosed mass profile as recovered by both the PPL and SIS methods, in addition to the true enclosed mass profile.Both estimators recover the profile within their stated errors.The PPL method systematically underestimates it by ∼ 0.1 dex across most of the radial range.This is directly caused by the difference between the spherically averaged and mock lensing ESD profiles (left panel).The somewhat wider confidence intervals at small radii are caused by the lack of information in the mock data as to the behaviour of the profile at r < 30 h −1 70 kpc; the PPL model marginalises over all possibilities.Once the enclosed mass is dominated by the contribution at radii covered by the measurement, the uncertainties shrink.To account for the added uncertainty resulting from the conversion to the RAR, we added 0.1 dex to the error bars of our RAR measurements throughout this work.
The SIS method instead slightly underestimates the enclosed mass at small radii, and overestimates it at large radii.The apparent improved performance relative to the PPL method is actually due to a fortuitous partial cancellation of two errors.First, the SIS calculation suffers from the same underestimation of the spherically averaged enclosed mass profile as the PPL method, due to the difference between the mock lensing and spherically averaged ESD profiles.However, in addition to this, the SIS method assumes a density profile ρ(r) ∝ r −2 at all radii.At small radii, the power-law slope is in reality about −2.1.This results in a slight overestimate of the enclosed mass, which partially compensates the underestimate described above, resulting in a net underestimate.At larger radii, the slope of the density profile becomes progressively steeper, such that the assumption of an r −2 profile increasingly overestimates the enclosed mass, eventually resulting in a net overestimate.
The right panel of Fig. 2 illustrates the resulting uncertainty in the measurement of the RAR.To focus on the influence of the method used to recover g obs , we simply used the exact spherically averaged stellar mass profile to calculate g , plotted on the x-axis12 .We found that, for mock lenses within the BAHAMAS simulation, both the SIS and the PPL method yield acceptable and consistent estimates of g obs .We note that the BAHAMAS g obs (g ) is significantly offset from the RAR as measured by M16; we will return to this point when we compare BAHAMAS to our observations in Section 5.3.Fig. 2. Illustration of the recovery of the acceleration profile from simulated weak lensing observations.Left: Average ESD profile of a subset of our sample of BAHAMAS galaxies with 10 13 < M 200 /( h −2 70 M ) < 10 13.1 , derived from the spherically averaged mass profile (red line) and the mock lensing maps (yellow line, with an assumed 0.1 dex Gaussian uncertainty).The PPL method recovery of the ESD profile is shown with the blue points; error bars represent 68% confidence intervals.Centre: The SIS (light blue squares) and PPL (dark blue points) method recover the spherically averaged enclosed mass profile.The uncertainties on the SIS points are derived by sampling the uncertainties on the mock lensing ESD profile.Right: The resulting dynamical acceleration profile g obs and uncertainties, plotted as a function of the acceleration due to stars g = GM (< r)/r 2 .

Lensing rotation curves
As a final consistency check between the SIS assumption and the PPL method, we applied both methods to the true KiDS-1000 data.Since these methods are only used to convert ∆Σ(R) into g obs (r), we can leave g bar out of the comparison and plot our results as a function of R.An observable closely related to the RAR that is usually plotted as a function of radius, is the traditional circular velocity curve: an observable that indeed served as input to the M16 RAR measurement.We applied the SIS method described in Section 2.2 to convert our ESD profiles ∆Σ(R) into v circ (R) since substituting Eq. 6 into Eq.24 gives: We also applied Eq. 24 to compute v circ (R) from the M(< R) calculated through the PPL method described in Appendix B. We note that both the SIS and PPL method assume spherical symmetry, while in simulations DM haloes are found to deviate from sphericity, which could lead to deviations in the lensing rotation curves (Cuddeford 1993).However, the mean ellipticity of haloes is observed to be small ( | | = 0.174 ± 0.046, Schrabback et al. 2021).The stacking of thousands of lenses with approximately random orientations further reduces the impact on the lensing signal, which means the halo ellipticity will not significantly change our results.Fig. 3 shows the lensing rotation curves for isolated KiDSbright galaxies, divided into four stellar mass bins using the following limits: log 10 (M / h −2 70 M ) = [8.5, 10.3, 10.6, 10.8, 11.0].
For each bin the mean galaxy mass (stars+cold gas) of the lenses, log 10 M gal / h −2 70 M = [10.14, 10.57, 10.78, 10.96], is shown at the top of the panel.Showing the data in this way allows us to observe for the first time in this intuitive manner how the circular velocity curves of isolated galaxies continue beyond the observable disc (r > 30 h −1 70 kpc).In addition, it provides a consistency check against the SPARC rotation curves (Lelli et al. 2016) that form the basis for the M16 RAR measurement.It is remarkable how well the mean of the SPARC rotation curves and our lensing results correspond at their intersection (r ∼ 30 h −1 70 kpc).But most importantly, we find that the 'lensing rotation curves' from the SIS assumption are consistent with the ones from the PPL method.Although the SIS assumption results in slightly more scatter, there is very little systematic bias between the results from the two methods, which have a fractional difference of log(v circ,SIS /v circ,PPL ) = 0.017 dex.Since this measurement is merely a different way of presenting the observed acceleration, which equals g obs (r) = v 2 circ /r, we can easily compute that the expected difference in g obs would be log(g obs,SIS /g obs,PPL ) = 0.038 dex.
The consistency between the two conversion methods allows us to use the SIS assumption throughout this work.The great advantage of this method is that it allows us to convert GGL profiles binned by baryonic acceleration ∆Σ(g bar ), into the RAR: g obs (g bar ).This is not the case for the PPL method, which only works on ∆Σ(R) binned by radius.The former can therefore be applied to any lens sample; the latter only to lenses within a narrow mass range (in order to convert R into g bar using the mean M gal ).As explained in Section 4.4 we added 0.1 dex to the error bars of all RAR measurements in this work, to account for the added uncertainty from the conversion of the ESD to the RAR.After showing that both methods yield acceptable and consistent estimates of g obs , we will show only the SIS measurement when presenting our results in this section to reduce clutter in the figures.Appendix A).The black points (with 1σ error bars) show the result calculated using the SIS assumption, while the blue points (with error bars representing the 16th and 84th percentile of the fits) show the result from the more sophisticated PPL method.Our measurements are consistent between the two methods, and also with the rotation curves from SPARC (all data as the blue 2D histogram, the mean as red squares).

The RAR of KiDS compared to MG theories
In Fig. 4 we show the RAR, with the observed radial acceleration computed from our lensing measurements through Eq. 7) on the y-axis.The x-axis shows the expected baryonic (star+cold gas) radial acceleration, where the label serves as a reminder throughout this work that g bar is only computed from the measured stellar masses of the galaxies and an estimate of their cold gas component.
The lensing g obs was measured using the GAMA and KiDSbright isolated galaxy samples, respectively.Due to its smaller survey area (180 vs. 1006 deg 2 ), the error bars using GAMA lenses are larger than those using KiDS-bright lenses.However, as explained in Appendix A, the spectroscopic redshifts of the GAMA survey allow for a more reliable selection of the isolated lenses compared to KiDS (which measures photometric redshifts with a σ z = 0.02 uncertainty).The effect of this uncertainty on the measured lensing profiles is modelled in Fig. A.3, which shows that the ESD profile of the 'offset' MICE sample diverges from the truly isolated MICE galaxies at radius R > 0.3 h −1 70 Mpc.At these large scales, the effect of satellite galaxies on the lensing signal result in a ∼ 30% increase in ∆Σ due to the contribution of satellite galaxies.We translated this radius into a gravitational acceleration value using Eq. 2, based on the average M gal of the lens sample.In this way we estimate that, for the full sample of isolated KiDS-bright galaxies, the isolation criterion is no longer reliable when g bar 10 −13 m s −2 , as indicated by the light blue shaded region in Fig. 4. We note that the GAMA results, which are based on accurate spectroscopic redshift measurements, are still reliable within this region.
The grey band shows the range of possible bias due to a ∆M = ±0.2dex systematic shift in stellar mass.We estimated this range by performing our analysis assuming stellar masses that are 0.2 dex higher than, and then 0.2 dex lower than, their best-fitting M values (see Section 3.3).We only show this band once, for the KiDS-bright result, but note that this uncertainty equally affects the GAMA stellar masses (and, indeed, any stellar mass measurement; see Wright et al. 2017).
We compare our results to the M16 RAR measurements (both the full dataset: blue 2D histogram, and the mean: red squares), from SPARC galaxy rotation curves, which cover higher accelerations than our lensing measurements (corresponding to smaller scales: R < 30 h −1 70 kpc).At the highestacceleration end (smallest scales), where g obs is dominated by g bar , they follow a one-to-one relation.At lower accelerations (larger scales) their results quickly diverge from unity, signifying the start of the DM dominated regime.We find that these two fully independent RAR observations, respectively from rotation curves and lensing, are in strong agreement 13 . 13Because the blinding intended to avoid observer bias in the KiDS-1000 cosmological constraints (Asgari et al. 2021;Heymans et al. 2021;  4. Measured RAR, which compares the total gravitational acceleration g obs with the expected baryonic acceleration g bar of galaxies.At high accelerations we show the M16 RAR measurements from galaxy rotation curves (all data as the blue 2D histogram, the mean as red squares).Using weak gravitational lensing we were able to extend this measurement to lower accelerations, using both the spectroscopic GAMA and the photometric KiDS-bright isolated lens samples (blue and black points with 1σ error bars).Comparing our lensing observations to two MG models: MOND (the M16 fitting function; grey solid line) and EG (assuming a point mass; red dashed line) we find that GAMA results are in agreement with the two models, while those from KiDS-bright are systematically higher.At very low accelerations (corresponding to R > 0.3 h −1 70 Mpc, light blue shaded region) the uncertainty in the photometric KiDS redshifts affects the isolated lens selection, resulting in systematically higher values of g obs due to the possible contribution of satellites.The results from the spectroscopic GAMA survey, however, are still reliable within this region.The impact of stellar mass uncertainty (∆M = 0.2 dex) on the measurement is shown as the grey band.We show the MOND prediction including the EFE (with e = 0.003, see Eq. 13) as the grey dashed line.In addition, we show the effect on the RAR of KiDS-bright galaxies if g bar contained an additional isothermal hot gas contribution within a 100 h −1 70 kpc radius, with a nominal gas mass equal to the stellar mass (orange crosses with 1σ error bars).We emphasise that this is only a rough order of magnitude estimate of the possible effect of gaseous haloes, which are extremely difficult to observe.Fig. 4 also compares the two MG models, EG and MOND, to our lensing results (for a comparison of these two models with the RAR from SPARC, see Lelli et al. 2017a).As explained in Sections 2.3 and 2.4, we took the MOND prediction to be equal to the extrapolated M16 fitting function (Eq.11), and that of EG as the prediction from Verlinde (2017) for a point mass (Eq.17).At high accelerations, the prediction from EG appears to lie above that of MOND and the SPARC data.However, as explained in Section 2.4, the prediction of Eq. 17 should be taken with a grain of salt for accelerations g bar > 1.2 × 10 −10 m s −2 .Within our measurement range, the two predictions are almost indistinguishable.Both models are compatible with the GAMA data.The KiDS-bright data points, however, lie systematically above the MG predictions.Tröster et al. 2020) only has a small effect on GGL observations, this agreement has been present since the start of our analysis (before the data were un-blinded).
To quantify the level of agreement between the acceleration predicted by the different models g mod and the observed g obs , we calculated the χ 2 value: where C −1 is the inverse of the analytical covariance matrix (see Section 2.1).We divided this quantity by the number of degrees of freedom N DOF of the model, which gives the reduced χ 2 statistic: Here N data is the number of data points in the measurement and N param is the number of free parameters in the model.Since none of the models have free parameters, N DOF is simply the total number of g bar -bins (in this case N data = 15).
Comparing the GAMA data to the two MG models results in χ 2 red -values of 0.8 for both MOND and EG, corresponding to a standard deviation of 0.4σ.This confirms that both models agree well with the GAMA data.When using the KiDS-bright results, neither model provides a good description of the data with: χ 2 red = 4.6 and 5.0 for MOND and EG respectively, corresponding to ∼ 6 standard deviations (∼ 6σ).Taking into account the effect of the photometric redshift uncertainty of KiDS-bright by only using the seven data points within the isolation criterion limit (R < 3 h −1 70 Mpc) we find: χ 2 red = 4.0 for MOND and χ 2 red = 4.4 for EG, ∼ 3.8σ away from a good fit.Considering the ∆M = ±0.2dex uncertainty shown by the grey band (with the data points beyond the isolation criterion limit still removed) leads to χ 2 red = 1.5 for ∆M = +0.2dex and χ 2 red = 14 for ∆M = −0.2dex with respect to MOND, with similar results for EG.Thus, the MOND and EG predictions are able to describe our measurements within the statistical and systematic uncertainties.Whether these models are confirmed or excluded relies heavily on the systematic bias in the stellar mass measurements.This highlights the general point that GGL measurements are now so accurate in determining the total observed mass distribution that improving the RAR measurement primarily depends on obtaining better constraints on the baryonic mass distribution.
This point is highlighted further by the fact that we cannot incorporate measurements of the total baryonic mass distribution into our comparison, in particular those components that have not been detected, such as hot gaseous haloes and missing baryons.This remains a fundamental limitation of all work testing DM or MG theories at large scales (see Section 4.3).Although there have been very recent fruitful attempts at a first detection of this barely visible baryonic component (Macquart et al. 2020;Tanimura et al. 2020), there exist no accurate measurements of its distribution around isolated galaxies.However, we can safely continue as long as all estimates of g bar (in the measurements, models and simulations) are based on the same components (in our case: stars+cold gas).This way our RAR results remain purely observational, based on actual measurements along both axes.
However, a qualitative idea of the possible effect of an additional extended ionised gas component on g bar is depicted in Fig. 4. In addition to our standard stars-and-cold-gas point mass used to calculate g bar , we modeled the hot gas as a simple isothermal density profile (ρ(r) ∝ r −2 ), truncated at the accretion radius R acc .Based on Valentijn (1988), we derived that R acc ≈ 100 h −1 70 kpc for hot gas haloes around galaxies with M ≈ 10 11 h −2 70 M .Finding an accurate estimate of the additional gas mass M gas within this radius is no easy matter.Brouwer et al. (2017) assumed a total hot gas mass M gas = 3M , based on results from the OWLS hydrodynamical simulations by Fedeli et al. (2014).They found that, in simulations with AGN feedback, OWLS galaxies with a total mass M 200 = 10 12 h −1 70 M (corresponding to M ≈ 10 10 h −2 70 M , a lower limit on the typical stellar masses in our sample) have a gas-to-stellar-mass fraction of M gas /M ≈ 3.One of the few observational scaling relations for hot gas is derived by Babyk et al. (2018), using Chandra X-ray observations of 94 early-type galaxies.In their Fig. 7, which shows the X-ray gas mass versus the total galaxy mass, galaxies with M tot = 10 12 M have gas fractions ranging from 0.1 − 1.However, Babyk et al. (2018) measured both M tot and M gas within 5 effective radii of their galaxies, which means that the hot gas fraction on larger scales could be as high as 3 in extreme cases.These relatively high hot gas masses motivated by the Babyk et al. (2018) observations are possibly biased towards a high X-ray surface brightness and are an order of magnitude higher than the hot gas masses presented in Fig. 7 of Tumlinson et al. (2017).As this gas mass outweighs the possible contribution of various cooler gas and dust components, this case provides a good guide for our evaluation.Based on all these considerations, we assumed a nominal gas-to-stellar-mass fraction of M /M gas = 1, emphasising that this is only an order of magnitude estimate due to the challenging nature of observing circumgalactic gas.
In Fig. 4 we include the RAR of KiDS-bright galaxies with our nominal estimate of the hot gas distribution added to g bar on the x-axis.At the highest accelerations measurable by lensing, we find that these results are almost indistinguishable from the original KiDS-bright measurements.As the acceleration decreases, the g bar values including hot gas shift further to the right (higher values) due to the increased enclosed hot gas mass.This causes a steepening downward slope of the RAR, such that it finally diverges from the g obs ∝ √ g bar relation at very low accelerations (g bar < 10 −14 m s −2 ).The same effect is as also seen in the BAHAMAS results in Fig. 1.As expected, we find that this steepening of the RAR increases for higher assumed gaseous halo masses M gas , and decreases for lower values.This implies that, if gaseous haloes more massive than in our example (M gas M ) were detected directly and incorporated into the measurement, the observed RAR would diverge from the current MOND and EG predictions at low accelerations.
In the case of MOND a steep downward slope at low accelerations is not expected unless, despite our best efforts, our isolated galaxy sample is not truly isolated.In that case undetected satellites might cause an external field effect (EFE).To evaluate this effect we use the results of Chae et al. (2020) for the isolated SPARC galaxies.Based on their results, we have assumed e = g ext /g † = 0.003 as a reasonable estimate of the external gravitational acceleration g ext compared to the critical acceleration scale g † (see Section 2.3) for our isolated lenses.We use the fitting function in Eq. 13, which represents the EFE for an idealised model of galaxies within their environment, to depict the EFE on the predicted MOND RAR in Fig. 4. The extrapolated M16 fitting function represents the MOND prediction without any EFE (e = 0).As expected the MOND prediction including the EFE diverges from the one without, tending towards a steeper downward slope at low accelerations (g bar < 10 −12 m s −2 ).Hence the EFE moves the MOND prediction away from our main observational result: the lensing RAR from the KiDS-bright sample without an estimate for the additional hot gas, which we explore throughout the rest of this work.We will therefore maintain the use of the M16 fitting function as our main MOND prediction since this represents the optimal case considering our observations.Regarding the KiDS-bright result including an estimate for the hot gas, it turns out that the steeper downward slope resulting from the MOND EFE is not steep enough to be consistent with our measured RAR including an estimate of the additional hot gas.This is illustrated by the fact that, for our chosen value e = 0.003, the MOND prediction including EFE and our RAR observation including hot gas reach the same value of g obs at g bar ≈ 10 −15 m s −2 .However, the observation reaches this depth within a much smaller span in g bar (−15 < log 10 (g bar / m s −2 ) < −14).Choosing a different value for the EFE strength e does not solve this problem, and the effect becomes stronger for higher assumed values of M gas .It is therefore unlikely that the MOND EFE can explain the effect of massive (M gas M ) hot gaseous haloes, if such haloes are detected.In the case of EG it is not yet known whether and, if so, how external gravitational fields affect its prediction (Verlinde, priv. comm.).

The RAR of KiDS compared to ΛCDM simulations
In this section we compare the KiDS-1000 RAR with numerical ΛCDM simulations14 .In order to obtain the predictions from these simulations, we applied the same isolation criterion, GGL procedures and RAR conversion to mock galaxy samples from the MICE and BAHAMAS simulations (see Section 4).In Fig. 5, BAHAMAS (orange band) is shown as the median result of all lens galaxies, with the upper and lower limit of the band representing the 16 th and 84 th percentiles.For MICE (red band) we show the result for isolated lenses selected using the true redshifts (lower limit) and using redshifts with a normally distributed random offset of σ z /(1+z) = 0.02 (upper limit), in order to emulate the effect of the redshift uncertainty in KiDS on the isolated galaxy selection (see Appendix A).This means that the upper limit of the MICE prediction is considered reliable even at high accelerations (blue shaded region), where uncertainties in the galaxy isolation could affect the RAR measurement.The RAR observations are the same KiDS-bright lensing and M16 rotation curve results as shown in Fig. 4, this time compared to the predictions from the two simulations.
We find a good agreement between the MICE simulation and our measurements.The MICE measurements are limited to the low g bar regime, owing to the resolution of the MICE simulations.The MICE scale limit of R > 0.25 h −1 70 Mpc is within the angular scale where satellites missed by the isolation criterion might impact the lensing signal (R > 0.3 h −1 70 Mpc, light blue shaded region).The effect of the KiDS-bright redshift uncertainty σ z on the isolation criterion is however mimicked in the MICE simulation (upper limit of the red band), which means we can safely compare MICE with our low-acceleration measurements.The limited width of red band shows that this effect is relatively small (∼ 30%).The MICE prediction (with the σ z offset) results in a reduced χ 2 value of χ 2 red = 2.3, corresponding to 2.3σ.
Figure 5 shows poor agreement between the lensing RAR for isolated BAHAMAS galaxies and the KiDS measurement.The reason for this is straightforward to understand: the BAHAMAS measurement in Fig. 5 runs approximately parallel to both the KiDS and MICE curves, as a result of a constant offset in the stellar-to-halo-mass relation (SHMR) between BAHAMAS and MICE.Both simulations reproduce the observed SHMR in an overall sense, as shown in fig.6 of McCarthy et al. (2017) and Jakobs et al. (2018) for BAHAMAS, and guaranteed by construction as described in Carretero et al. (2015) for MICE.However, while in MICE our isolated galaxy sample follows essentially the same SHMR as the parent sample, in BAHAMAS isolated galaxies have, on average, triple the stellar mass at fixed halo mass compared to the global BAHAMAS galaxy population.This difference fully accounts for the 0.5 dex horizontal offset between the MICE and BAHAMAS curves in Fig. 5.The failure of BAHAMAS to reproduce the observed lensing RAR could therefore be regarded as a possible shortcoming of the galaxy formation model used in those simulations, rather than a general failure of their cosmological paradigm.However, we note that the offset in the SHMR as a function of local galaxy density is theoretically expected, and (indirectly) observed (e.g.Dutton et al. 2010;Correa & Schaye 2020).It is therefore curious that MICE, which does not reproduce this observed bias, turns out to be in reasonable agreement with our measurements.The discrepancy between KiDS-bright and BAHAMAS must therefore arise due to some more subtle underlying reason that we have yet to identify; we hope to follow this up in future work.We initially selected BAHAMAS for our analysis due to its large volume -required to produce enough of the rare isolated, relatively massive galaxies of interest -and readily available mock lensing data.It will be interesting to revisit the lensing RAR as cosmological hydrodynamical galaxy formation simulations continue to improve in terms of realism, simulated volume, and resolution.

The RAR for early-and late-type KiDS galaxies
The large size of the KiDS-bright lens sample gives us the opportunity to divide our lenses into different samples based on observed galaxy parameters.We determined the RAR for isolated galaxies split into two types based on either parameter: bulgedominated and disc-dominated based on their Sérsic index, and red and blue based on their u − r colour.Although these selections are far from perfect representations of true morphological types, the red and bulge-dominated samples can roughly be identified with canonically early-type (pressure supported) galaxies and the blue and disc-dominated samples with late-type (rotationally supported) galaxies (Driver et al. 2006)  15 .
The r-band Sérsic indices n of all KiDS galaxies with S /N > 50 (following Roy et al. 2018) were measured using the 2DPHOT multi-purpose environment for 2D wide-field image analysis (La Barbera et al. 2008).For the colour split, we used the u and r magnitudes measured using the GAaP pipeline (see Section 3.1).In Fig. 6 the u − r colour versus stellar mass distribution of isolated galaxies shows the split based on Sérsic index, which defines early-type galaxies as those with n > 2 and late-type disc-dominated galaxies as those with n < 2. Based on the u − r magnitude distribution of these two populations, we defined our split by galaxy colour as follows: galaxies with m u − m r > 2.5 mag are defined as red, and those with m u − m r < 2.5 mag as blue.
In both cases, we aimed to select two samples with the same stellar mass distribution, in order to isolate any possible effect of galaxy type on the RAR from that of M .In Fig. 7 we show the M histogram of the two types (in this case based on galaxy colour).From both samples, we removed galaxies until only the overlapping section of both mass distributions remained.Ideally this should give us two samples (red and blue galaxies) with equal stellar mass distributions, shown by the light shaded blue region.
Fig. 8 shows the lensing RAR of equal-mass KiDS-bright galaxies split by Sérsic index (left panel) and u − r colour (right panel).For this result, we focus on establishing whether there exists a significant difference between the RAR of the two types.Contrary to previous plots, the effect of a 0.2 dex global systematic bias in M (normally shown by a grey band) is omitted because this affects both measurements in the same way such that their relative difference does not change (the possibility of a colour-or Sérsic index-dependent M bias is discussed below).The accelerations where uncertainty in the photometric KiDS redshifts affects the KiDS-bright isolated lens selection is indicated by the light blue shaded region.The MICE results (red band) emulate the effect of the redshift uncertainty in KiDS, while the BAHAMAS results (orange band) reflect the median and 16 th and 84 th percentiles of the simulated lens galaxies.The MICE simulation, though limited to low accelerations by its resolution, succeeds in reproducing the lensing data.The result from the BAHAMAS simulation runs approximately parallel to the MICE curve, but underestimates our measurement by 0.5 dex due to the biased SHMR of the BAHAMAS isolated galaxies (see Section 5.3).

Baryonic (stars+cold gas) radial acceleration log(g bar
We indeed observe a significant difference between the RAR measurements of early and late galaxy types.To quantify this difference, we measured the reduced χ 2 between the RAR measurements by replacing g obs and g mod in Eq. 26 with g obs,E and g obs,L from the early-type (red or bulge-dominated) and late-type (blue or disc-dominated) galaxy samples.The χ 2 red equals 67.8/15 = 4.5 for the lenses split by Sérsic index, and 134.2/15 = 8.9 for those split by u − r colour.Taking the full covariance matrix into account we find that even the Sérsic index split, which displays the smallest offset, results in RAR difference with a 5.7σ significance.The mean ratio between the RAR measurements of the two types, log 10 (δg E/L obs ) = log 10 g obs,E /g obs,L , is 0.17 dex and 0.27 dex for the Sérsic and colour splits respectively.
We address the question whether the observed difference of the RAR between early and late types could be caused by any bias in the stellar mass.To this end, we estimated the systematic stellar mass bias between the two types, defined as log 10 (δM E/L ) = log 10 ( M E / M L ), that would be required to resolve the difference between their two RAR measurements.When trying to estimate the effect of this bias on the RAR, we had to take into account that δM E/L affects both the estimated acceleration from baryonic mass g bar (directly) and the observed acceleration g obs (indirectly, through the equal-mass selection).The bias in baryonic acceleration scales linearly with the bias in M , such that: log 10 (δg E/L bar ) = log 10 (δM E/L ).Throughout this work, the observed relation between g bar and g obs at the scales measured by lensing has approximately followed g obs ∝ √ g bar .
This means that we can roughly estimate the effect on g obs as: log 10 (δg E/L obs ) ≈ log 10 (δM E/L ) / 2. Since our measured difference δg obs 0.2 dex, this means log 10 (δM E/L ) should be 2 log 10 (δg E/L obs ) = 0.4 dex.That is, the observed difference could be resolved by a systematic stellar mass bias between the two types 0.4 dex.We will now discuss different sources of a possible systematic bias, and estimate whether they could be the cause of the observed difference.
First, the statistical uncertainty in the M measurements could cause a systematic shift in the two M distributions resulting from Eddington bias (Eddington 1913).We estimated the size of this bias by adding a random offset to the true log 10 (M ) measurements of KiDS-bright before selecting the two 'equal' stellar mass distributions for red and blue galaxies.Based on our estimate of the statistical uncertainty in the KiDS-bright M (see Section 3.3), we drew the random offsets from a lognormal distribution with σ = 0.12 dex.When looking at the underlying Disc-dominated (n< 2) Fig. 6. 2D histogram of the u − r colour and stellar mass of isolated KiDS-bright galaxies.We divide our galaxies into canonically earlyand late-type galaxies, based on either Sérsic index n or u−r magnitude.When dividing by Sérsic index, we define bulge-dominated (early-type) galaxies as those with n > 2 and disc-dominated (late-type) galaxies as those with n < 2 (red and blue points).When dividing colour we define red (early-type) galaxies as those with m u − m r > 2.5 and blue (latetype) galaxies as those with m u − m r < 2.5 (above and below the dashed horizontal line).To isolate the effect of galaxy type on the RAR from that of M , we select two samples with the same stellar mass distribution by randomly removing galaxies from both samples until only the overlapping region (light blue shaded region) remains.
true stellar mass distributions we found that they are indeed not equal, but that the mean stellar masses M ,E and M ,L of the red and blue samples differ by only 0.025 dex.Of course, this method overlooks the fact that the measured M distribution already contains scatter, and is therefore not the true M distribution.Indeed when we apply the random offset multiple times, we see the Eddington bias decrease by ∼ 5% after every iteration.Therefore, the true Eddington bias is likely to be slightly larger, around 0.027 dex.This is still very small compared to the 0.4 dex bias needed, thus it is very unlikely that the difference we observe is caused exclusively by Eddington bias.
Second, there could be systematic errors in the KiDS-bright M measurements that differ between red and blue galaxies (due to e.g.systematic variation of the IMF, SPS model inaccuracies, or systematic errors in the measured redshifts or magnitudes).In order to estimate the size of any systematic biases in the stellar mass, we compared KiDS-bright's M ,ANN with GAMA's M ,G of exactly the same galaxies.Here M ,ANN is based on the nine-band KiDS+VIKING photometry and photometric redshifts z ANN derived by training the ANNz2 (Artificial Neural Network) machine learning method on the spectroscopic GAMA redshifts (see Section 3.3), while M ,G is based on the ugrizZY SDSS+VIKING photometry combined with the spectroscopic GAMA redshifts (see Section 3.2).After selecting our samples of blue and red galaxies with the same M ,ANN distribution as described above, we indeed found that the M ,G distributions are not exactly equal: M E / M L = 1.4,corresponding to 0.14 dex.This indicates that using different sets of observations and models to measure M can cause a systematic bias between red and blue galaxies, but that this effect is too small to reach the 0.4 dex difference in M needed to explain the 0.2 dex difference in the measured RAR.
In conclusion, even when combined the Eddington plus overall systematic measurement bias is at most 0.17 dex, not even half of what is needed.We note that this bias estimation has been carried out using the types split by u − r colour; when split by Sérsic index, the Eddington and other systematic biases between bulge-and disc-dominated galaxies are even smaller (0.021 and 0.12 dex respectively).
Domínguez Sánchez et al. ( 2019) reported evidence of a varying IMF in massive early-type galaxies.As seen in fig.19 of their work, this could cause the global mass-to-light-ratio of these galaxies to increase by as much as 0.09 dex compared to a fixed Chabrier IMF.They find this effect only for their high-mass galaxy sample with a stellar mass of at least M > 2 × 10 11 M , and not for their lower-mass sample.Since we limit all our galaxies to M < 10 11 h −2 70 M (see Section 3.3), the varying IMF is not likely to apply to our early-type galaxy sample.However, even if this had been the case, this 0.09 dex difference in M is small compared to the 0.4 dex needed to explain the difference in the RAR of early-and late-type galaxies.
The higher values of g obs for red and bulge-dominated galaxies that we find in Fig. 8 are in qualitative agreement with earlier GGL studies.A recent KiDS-1000 lensing study by Taylor et al. (2020) found that, within a narrow stellar mass range near the knee of the SHMR (M ∼ 2 − 5 × 10 10 h −2 70 M ), galaxy halo mass varied with galaxy colour, specific star formation rate (SSFR), effective radius R e and Sérsic index n.Although not explicitly mentioned, their figures 1 and 6 reveal that their earlytype (red, low-SSFR) galaxies have larger halo masses than their late-type (blue, low-n, high-SSFR) galaxies of the same stellar mass.Sérsic parameter coupling between n and R e , for a fixed galaxy luminosity, may also contribute towards the trends seen among the early-type galaxies in their M halo -n and M halo -R e diagrams 16 .Much earlier Hoekstra et al. (2005) measured the GGL signal of a sample of 'isolated' Red-sequence Cluster Survey galaxies as a function of their rest-frame B-, V-, and R-band luminosity, and found that early-type galaxies have lower stellar Fig. 8. Measured RAR of the KiDS-bright isolated lenses (points with 1σ error bars) divided into canonically early-and late-type galaxies.In the left panel, the lenses are split by Sérsic index (n ≷ 2) into bulge-dominated (red points) and disc-dominated (blue points) galaxies.In the right panel they are split by u − r colour (m u − m r ≷ 2.5) into red and blue galaxies (with correspondingly coloured points).In both panels we find a significant difference between the RAR measurements of early and late galaxy types.The extrapolated MOND and EG predictions (grey solid and red dashed lines) and the SPARC data (red squares with 2D histogram) are shown as a reference.
mass fractions.In contrast, Mandelbaum et al. (2006) found no dependence of the halo mass on morphology for a given stellar mass below M < 10 11 M , although they did find a factor of two difference in halo mass between ellipticals and spirals at fixed luminosity.
Finding a significantly different RAR at equal M would have interesting implications for galaxy formation models in the ΛCDM framework.In the ΛCDM framework it is expected that the galaxy-to-halo-mass relation, and therefore the RAR, can be different for different galaxy types through their galaxy formation history (Dutton et al. 2010;Matthee et al. 2017;Posti et al. 2019;Marasco et al. 2020).Two parameters that correlate heavily with galaxy formation history are Sérsic index and colour.
Current MG theories do not predict any effect of galaxy morphological type on the RAR, at least on large scales.The MOND paradigm gives a fixed prediction for the relation between g bar and g obs given by Eq. 11.Since the RAR is the observation of exactly this relation, in principle MOND gives a fixed prediction, independent of any galaxy characteristic.As discussed in Section 2.3, the main exception is the EFE that could be caused by neighbouring mass distributions.However, Fig. 4 shows that an increase in the EFE only predicts an increase in steepness of the downward RAR slope at low accelerations (g bar < 10 −12 m s −2 ), while the observed RAR of both early-and late-type galaxies follow approximately the same slope across all measured accelerations.It is therefore unlikely that their amplitude difference can be explained through the EFE.
We will next discuss whether the observed difference in RAR between early and late types is at odds with EG, but first emphasise three caveats of this discussion.
First, the derivation of the EG formalism assumes a spherical mass distribution.Solutions for non-spherical systems do not exist yet.It is not excluded that solutions for large-scale triaxial ellipticals will differ from rotationally supported spiral galaxies.This requires further theoretical study.
Second, the current EG theory predicts ADM fields based exclusively on the static baryonic mass distribution, although very large-scale dynamics can potentially influence the excess gravitational force predicted by EG.It is unknown whether large-scale pressure supported (virialised) systems create an ADM distribution similar to that of rotationally supported galaxies.
Third, we assume here that, to first order, the uncertainty in the KiDS photometric redshifts affects the isolated galaxy selection of both galaxy types in the same way, allowing us to include the full acceleration range into our comparison.However, the well established morphology-density relation predicts a higher density of satellite and dwarf galaxies around early-type galaxies compared to the late types (Dressler 1980;Goto et al. 2003), although we have minimised this effect by selecting isolated galaxies (see Appendix A).It is not yet known whether and, if so, how these external gravitational fields affect the EG prediction.
To address this last caveat, the light blue shaded region in Fig. 8 shows the acceleration scales beyond the KiDS isolation criterion limit (g bar < 10 −13 m s −2 ), where the presence of satellites might play a role (see Appendix A).But even when we remove all data points inside this region, we obtain a difference log 10 (δg E/L obs ) of 0.14 dex and 0.19 dex for the Sérsic and colour split respectively, where the latter has a significance of 3.2σ.Therefore, even at the scales where isolation is certain (corresponding to R < 0.3 h −1 70 Mpc), the difference remains significant.To evaluate the possible effect of circumgalactic hot gas, we computed the RAR of early and late-type isolated galaxies (of the same stellar mass) while including a rough estimate of the hot gas contribution to g bar .We used the same model of the nominal hot gas distribution around our galaxies as discussed in Sect.5.2: an isothermal halo within 100 h −1 70 kpc, with a mass M gas = M .When applying the same hot gas model to both early-and late-type galaxies, we find that there remains a > 6σ difference between their RARs, both for the split by Sérsic index and u − r colour.However, for this particular gas model, we find that g bar increases in such a way that the RAR of earlytype galaxies moves to the right, close to the MG predictions where the RAR of late-type galaxies without circumgalactic gas resides.This means that, in the specific case where early-type galaxies have gaseous haloes with M gas = M while late-type galaxies (of the same stellar mass) have negligible hot circumstellar gas, this would reduce the difference in their RARs to ∼ 4σ.Fine-tuning the M gas /M ratio of early-type galaxies to a slightly higher value, while keeping M gas /M ≈ 0 for late types, might remove the difference between their RARs.However, as discussed in Sect.5.2, unbiased X-ray surveys of circumgalactic gas around isolated galaxies are still lacking, which makes it difficult to obtain representative observational data.
In conclusion, unless early-type galaxies have significant circumgalactic gaseous haloes while late types (of the same stellar mass) do not, the difference we find in the RARs of different galaxy types might prove difficult to explain within MG frameworks.In MOND, g bar and g obs should be directly linked through Eq. 11 without any dependence on galaxy type.In EG the effect might be a consequence of yet unexplored aspects of the theory, such as a non-symmetric mass distribution or the effect of large-scale dynamics.To explore whether this is the case, however, more theoretical work is needed.Through the derivative in Eq. 14, EG does include a dependence on the slope of the baryonic density distribution.A shallower slope of M bar (r) increases M ADM and thus g obs , which might solve the current tension if early-type galaxies have significantly shallower baryonic mass distributions that extend far beyond 30 h −1 70 kpc, such as gaseous haloes (although Brouwer et al. 2017 did not find evidence for a significant effect of the baryonic mass distribution on the EG prediction; see their section 4.3).In addition, EG is currently only formulated for spherically symmetric systems.It would be interesting to investigate whether discs and spheroidal galaxies yield different predictions, and whether these differences would extend beyond 30 h −1 70 kpc.In a ΛCDM context, our findings would point to a difference in the SHMR for different galaxy types.Recently Correa & Schaye (2020) used SDSS data with morphological classifications from Galaxy Zoo to find that, at fixed halo mass (in the range 10 11.7 − 10 12.9 M ), the median stellar mass of SDSS disc galaxies was a factor of 1.4 higher than that of ellipticals.They found this to be in agreement with the EAGLE simulations, where haloes hosting disc galaxies are assembled earlier than those hosting ellipticals, therefore having more time for gas accretion and star formation.

The RAR as a function of stellar mass
In addition to splitting by galaxy type, it is interesting to create the RAR for galaxy samples with different stellar mass M (including very low-mass galaxies, 'dwarfs', in Section 5.6).In the ΛCDM paradigm, where baryonic and dark matter are described as separate substances, there can in theory be a difference in the SHMR depending on galaxy observables such as stellar mass, which could cause a shift in the measured RAR.This is in contrast with most MG models, which predict a fixed RAR (as is the case for and for EG at scales beyond the galaxy disc).In this section, we separated our isolated KiDSbright lenses into four samples based on M .We selected our M -bins to obtain a similar S /N ratio of the lensing signal in each bin, resulting in the following limits: log 10 (M / h −2 70 M ) = [8.5, 10.3, 10.6, 10.8, 11.0].Fig. 9 shows the lensing measurements and predictions for isolated galaxies split in four stellar mass bins.For each bin the mean galaxy mass (stars+cold gas) of the lenses, log 10 M gal / h −2 70 M = [10.14, 10.57, 10.78, 10.96], is shown at the top of the panel.Quantifying the difference between MOND (the extended M16 fitting function) and our measurement at all scales results in: χ 2 red = 117.0/60= 1.9, which (noting that the prediction for EG is very similar) excludes both models at the ∼ 4.5σ level.This result should be taken with caution, however, as at accelerations g bar that correspond to scales larger than R > 0.3 h −1 70 Mpc (light blue shaded region) an increasing signal is to be expected since at these distances satellite galaxies missed by our isolation criterion might affect the measurement.Galaxies with higher stellar masses reside in denser neighbourhoods, and therefore tend to have more satellites (see e.g.Baldry et al. 2006;Bolzonella et al. 2010;Brouwer et al. 2016).
The reduced χ 2 values using only the data within R < 0.3 h −1 70 Mpc are χ 2 red = 49.9/31= 1.6 for MOND and 51.7/31 = 1.7 for EG respectively (corresponding to a standard deviation of 2.4 and 2.5σ).Considering the stellar mass uncertainty (∆M = ±0.2dex), which, if it acts to reduce the observed RAR, results in χ 2 red = 0.97 for the extended M16 fitting function (with similar results for EG): a good fit.If the stellar mass uncertainty increases the observed RAR, we find χ 2 red = 4.6: a poor fit.This again highlights the grave importance of accurate baryonic mass measurements in determining the RAR, in addition to deep lensing surveys that can detect satellites down to very faint magnitudes.This could be achieved by future cosmology telescopes such as Euclid (Laureijs et al. 2011) and The Vera C. Rubin Observatory, previously called Large Synoptic Survey Telescope (LSST; Dark Energy Science Collaboration 2012).As for the MICE simulation, it matches our measurements reasonably well in every M bin.For the result that includes the photometric redshift uncertainty σ z in the isolated galaxy selection, we find χ 2 red = 49.7/30= 1.7 (2.5σ).

The RAR of low-mass (dwarf) late-type galaxies
As a final exploration of different galaxy masses, we attempt to measure the RAR for the lightest lenses in KiDS-bright.Low-mass galaxies are of particular interest to DM and MG researchers as extreme examples that might show eccentric behaviour (e.g.Oman et al. 2016;van Dokkum et al. 2018;Guo et al. 2019), as well as those who attempt to extend the RAR to lower accelerations using galaxy rotation curves (Lelli et al. 2017b;Di Paolo et al. 2019).We therefore select a sample of dwarfs: isolated galaxies with a stellar mass M < 10 10 h −2 70 M (whereas the full sample of isolated galaxies has M < 10 11 h −2 70 M , see Section 3.3).As can be seen in Fig. 6, this sample is dominated by blue, disc-dominated galaxies based on their colours and Sérsic indices (m u − m r > 2.5 mag and n < 2), which means they are likely to be late-type.Since these galaxies are few, and have an even smaller effect on the path of light rays than more massive ones, we needed to reduce the number of bins in g bar from 15 to 5 to obtain sufficient S /N radio in each bin.Fig. 10 shows the resulting RAR measurement of dwarfs compared to the full isolated sample.We do not show the effect of the ∆M = ±0.2dex systematic uncertainty because this would affect both results in the same way.We find that, within its large error bars, the RAR of the dwarfs is consistent with that of the full isolated sample; they both approximately follow the predictions, which are shown as a reference.Hence, we do not find a significant difference in the RAR of dwarf galaxies.

Discussion and conclusions
Galaxy-galaxy lensing observations from the fourth data release of the Kilo Degree Survey (KiDS-1000) have extended the RAR of isolated galaxies by nearly 2 orders of magnitude in gravitational acceleration g obs , compared to previous measurements based on rotation curves (most notably McGaugh et al. 2016, M16).To compute the lensing RAR, we converted our ESD profiles ∆Σ(R) into the observed gravitational acceleration g obs , and our galaxy masses (measured using nine-band KiDS+VIKING photometry) into g bar .These measurements allowed us to perform unprecedented tests of two MG models: MOND and EG, as well as tests of DM using the MICE (Nbody + semi-analytic) and BAHAMAS (hydrodynamical) sim-ulations.Our conclusions from these observational tests are as follows: -Fig.3: We find that lensing rotation curves of isolated galaxies, as inferred from GGL measurements, remain approximately flat at scales far beyond the visible disc (0.03 < R < 3 h −1 70 Mpc).At the accelerations corresponding to the outskirts of observable galaxies (R ≈ 30 h −1 70 kpc), our lensing results are in excellent agreement with the SPARC rotation curves (Lelli et al. 2016).These two measurements are obtained by two very different methods, providing independent corroboration of each result.
-Fig.4: At the low accelerations corresponding to GGL scales, the lensing RAR of isolated galaxies approximately follows a g obs ∝ √ g bar relation.This is in agreement with the expectations from EG (Eq.17) and MOND (which we take to be the M16 fitting function, Eq. 11, extrapolated to larger scales).At low accelerations both these models predict a direct relation between observed and baryonic accel- Due to the low S /N ratio of the dwarf lensing signal, the number of g bar -bins is reduced from 15 to 5. We find that the RAR of dwarfs is consistent with that of our regular sample, and with the extrapolated MOND and EG predictions (grey solid and red dashed lines), which are shown as a reference.eration of this form, with a very similar proportionality constant17 of ∼ 1.2 × 10 −10 m s −2 .This reinforces the results of Brouwer et al. (2017), who found that EG provides a good description of ESD profiles measured using 180 deg 2 of KiDS-GAMA data, but with a five times larger survey area.However, this result only remains valid if no massive (M gas M ) extended baryon distributions, such as as-yet undetected gaseous haloes, are common around our isolated lens galaxies.
-Fig.5: We find that the BAHAMAS simulation underestimates our KiDS-bright lensing RAR.The discrepancy relative to MICE is caused by a bias in the stellar-to-halomass-relation (SHMR) of isolated galaxies in BAHAMAS, which is absent in MICE: BAHAMAS galaxies have stellar masses typically three times higher at fixed halo mass than their non-isolated counterparts.Determining which of the two models more accurately captures the true SHMR is clearly crucial to the interpretation of our measurements in the ΛCDM context.Interestingly, the BAHAMAS RAR still has approximately the correct low-acceleration slope, rather than a steeper slope as would naively be predicted based on the ρ ∝ r −3 outer slopes of the simulated DM haloes.The prediction from MICE (only feasible at low accelerations due to the limited resolution of the simulated lensing measurements) matches our RAR measurements very well.-The additional lensing power at large radii with respect to the prediction from Navarro et al. (2017, see Appendix C) might be caused by large-scale structure along the line-ofsight to the source, in spite of our efforts to select isolated galaxies.This highlights the crucial importance of simulating the entire measurement process (where possible) when making theoretical predictions, both in ΛCDM and MG, before they can be ruled out.In addition, the need for accurate isolated galaxy selection highlights the importance of large spectroscopic surveys, such as the upcoming 4MOST (de Jong et al. 2019) and Dark Energy Spectroscopic Instrument (DESI; Ruiz-Macias et al. 2020) surveys.-Fig.8: When we split galaxies into two types based on Sérsic index or u − r colour, we find at least a factor of 1.5 ( 0.2 dex) difference between the respective lensing RAR measurements with a significance of at least 5.7σ.This observed difference could be resolved by a 0.4 dex systematic bias between the stellar masses of the two types.However, we calculated that the expected M bias (due to Eddington bias or systematic biases in the M measurement) is at most 0.17 dex.This variation in the RAR based on galaxy type, which is in agreement with Taylor et al. (2020) and Correa & Schaye (2020), could be difficult to explain for MG models that predict a fixed relation between baryonic mass and the total gravitational potential.-Fig.9: The lensing RAR for galaxy samples split by stellar mass M demonstrated a slight upward trend, away from the fixed predictions of MOND and EG, with increasing M .This could be caused by satellite or companion galaxies missed by the isolated galaxy selection due to the KiDSbright redshift uncertainty, however.With the inclusion of the KiDS isolation criterion limit and accounting for uncertainty in the stellar mass, we find a reasonable agreement between the MG models and observations.This highlights the crucial importance of accurate baryonic mass measurements in determining the RAR, in addition to deep lensing surveys that can detect satellites to down to very faint magnitudes (such as the future Euclid space telescope and Vera C. Rubin Observatory).The MICE prediction, which is corrected for the KiDS-bright redshift uncertainty, again matches well to our data.-Fig.10: We find no significantly different RAR, relative to the entire isolated lens sample, for a subsample of the lightest KiDS-bright lenses: isolated dwarf (M < 10 10 h −2 70 M ) galaxies.
-Throughout this work, we find that the field of GGL has reached a level of accuracy in the measurement of g obs greater than that of the baryonic acceleration g bar .The fact that we have no accurate measurements of the additional hot gas at large radii, and the ambiguity around the cosmological missing baryons, forces us to limit g bar to the contributions of stars and cold gas.In addition, the current 0.2 dex systematic uncertainty in M prevents us from definitively excluding any of the models we test.This shows that, if we want to have any hope of testing DM and MG models using the next generation of cosmological lensing surveys (such as Euclid and LSST), we also need to focus on the models and observations needed to accurately measure the baryonic mass distribution in and around galaxies.
We find that galaxy lensing rotation curves continue approximately flat out to R = 3 h −1 70 Mpc (where observations are bound to encounter lensing due to surrounding galaxies), which is difficult to explain in a ΛCDM framework that predicts simple NFWlike haloes because of their r −3 outer slope (see the N17 model in Appendix C).However, our analysis of the MICE and BA-HAMAS simulations shows that the combination of the lenses and the additional structure along the line-of-sight can yield an ESD profile consistent with an ∼ r −2 density profile for isolated galaxies, even though the lenses have an intrinsic ∼ r −3 outer profile.
Throughout our analysis we find that the extrapolated M16 fitting function (Eq. 11), which approximately corresponds to the prediction of both MG models (EG and MOND), holds to scales of 3 h −1 70 Mpc for isolated galaxies.A fundamental limitation of this measurement is that the additional diffuse gas surrounding galaxies remains difficult to measure, and has therefore not been included in most of this study.By implementing a rough order of magnitude estimate of the hot gas contribution to g bar , an isothermal distribution with M gas = M within 100 h −1 70 kpc, we found that this causes an overall downward shift of the RAR and a steeper downward slope at very low accelerations (see Fig. 4, and also Fig. 1 for a broader discussion of missing baryons).Although the MOND external field effect (EFE) causes a similar steepening of the RAR, we find that the idealised EFE fitting function of Chae et al. (2020) is not steep enough the explain the effect of gaseous haloes.Therefore, a convincing detection of additional gaseous components with a nominal mass of M gas M would move the observed RAR away from the MG predictions (g bar ∝ √ g obs ) at very low accelerations (g bar < 10 −13 m s −2 ) and towards the DM predictions (where g bar and g obs are independent).A robust non-detection of such massive gaseous haloes in general would likely strengthen the position of MG models.
Finding them for early-type galaxies only would reduce the difference between the RAR of early-and late-type galaxies, which otherwise remains unexplained in MG frameworks.
In conclusion, we find that the lensing RAR is a promising method to be used by future cosmological surveys to distinguish between MG and DM models.This can be done by measuring the RAR including large-scale baryonic mass observations; by simply performing the same comparison with even more accurate lensing and stellar mass measurements; or by further exploring the offset that we have found between the RARs of different galaxy types.All these options require that systematic biases in the stellar and other baryonic mass measurements be reduced.

Fig. 1 .
Fig.1.Mass profiles and RAR of BAHAMAS galaxies.Upper panel: Cumulative mass profiles of stars (red dotted line) and total baryons (blue solid line) for BAHAMAS galaxies with 1 < M 200 /(10 12 h −2 70 M ) < 3. The star marker indicates the stellar mass within a 30 h −1 70 kpc aperture, indicative of what is typically regarded as the stellar mass of a galaxy.The blue dash-dotted line shows the typical baryonic mass profile of observed galaxies of similar mass, estimated based on an extrapolation of the compilation in fig.7ofTumlinson et al. (2017).In the inner galaxy the discrepancy (light blue shaded region) between the observed and simulated M bar is relatively small, but in the outer galaxy the majority of the baryons predicted to be present in BA-HAMAS consist of currently unobserved, missing baryons.The orange dashed line shows the expected baryonic mass profile if the baryon density is everywhere equal to a fixed fraction f b = Ω b /Ω m of the local DM density.At large enough radii ( 2 h −1 70 Mpc), the baryon-to-DM ratio converges to the cosmic average.Lower panel: As in upper panel, but in acceleration space.The cosmic baryon fraction provides a strong theoretical upper limit on g bar at low accelerations in the context of the ΛCDM cosmology.

Fig. 3 .
Fig. 3. Measured rotation curves -the circular velocity as a function of radius v circ (R) -of the KiDS-bright isolated lens sample, divided into four stellar mass bins.The mean galaxy mass (stars+cold gas) of the lenses is shown at the top of each panel.The light blue shaded region indicates the radii corresponding to R > 0.3 h −1 70 Mpc, where the uncertainty in the photometric KiDS redshifts can affect the isolated lens selection (seeAppendix A).The black points (with 1σ error bars) show the result calculated using the SIS assumption, while the blue points (with error bars representing the 16th and 84th percentile of the fits) show the result from the more sophisticated PPL method.Our measurements are consistent between the two methods, and also with the rotation curves from SPARC (all data as the blue 2D histogram, the mean as red squares).
Fig. 4. Measured RAR, which compares the total gravitational acceleration g obs with the expected baryonic acceleration g bar of galaxies.At high accelerations we show the M16 RAR measurements from galaxy rotation curves (all data as the blue 2D histogram, the mean as red squares).Using weak gravitational lensing we were able to extend this measurement to lower accelerations, using both the spectroscopic GAMA and the photometric KiDS-bright isolated lens samples (blue and black points with 1σ error bars).Comparing our lensing observations to two MG models: MOND (the M16 fitting function; grey solid line) and EG (assuming a point mass; red dashed line) we find that GAMA results are in agreement with the two models, while those from KiDS-bright are systematically higher.At very low accelerations (corresponding to R > 0.3 h −1 70 Mpc, light blue shaded region) the uncertainty in the photometric KiDS redshifts affects the isolated lens selection, resulting in systematically higher values of g obs due to the possible contribution of satellites.The results from the spectroscopic GAMA survey, however, are still reliable within this region.The impact of stellar mass uncertainty (∆M = 0.2 dex) on the measurement is shown as the grey band.We show the MOND prediction including the EFE (with e = 0.003, see Eq. 13) as the grey dashed line.In addition, we show the effect on the RAR of KiDS-bright galaxies if g bar contained an additional isothermal hot gas contribution within a 100 h −1 70 kpc radius, with a nominal gas mass equal to the stellar mass (orange crosses with 1σ error bars).We emphasise that this is only a rough order of magnitude estimate of the possible effect of gaseous haloes, which are extremely difficult to observe.
Fig. 5. Measured RAR of the KiDS-bright isolated lens sample (black points with 1σ error bars) compared to two ΛCDM simulations: MICE and BAHAMAS.The accelerations where uncertainty in the photometric KiDS redshifts affects the KiDS-bright isolated lens selection is indicated by the light blue shaded region.The MICE results (red band) emulate the effect of the redshift uncertainty in KiDS, while the BAHAMAS results (orange band) reflect the median and 16 th and 84 th percentiles of the simulated lens galaxies.The MICE simulation, though limited to low accelerations by its resolution, succeeds in reproducing the lensing data.The result from the BAHAMAS simulation runs approximately parallel to the MICE curve, but underestimates our measurement by 0.5 dex due to the biased SHMR of the BAHAMAS isolated galaxies (see Section 5.3).

Fig. 7 .
Fig.7.Stellar mass histogram of the red (early-type) and blue (latetype) isolated KiDS-bright galaxies (red and blue lines), divided by u−r colour (m u − m r ≶ 2.5 mag).To isolate the effect of galaxy type on the RAR from that of M , we select two samples with the same stellar mass distribution by randomly removing galaxies from both samples until only the overlapping region (light blue shaded region) remains.

u
Baryonic (stars+cold gas) radial acceleration log(g bar [h 70 m/s 2 ]) Observed radial acceleration log(g − r colour (split at 2.5 mag) GL-KiDS isolated blue/disc-dominated galaxies GL-KiDS isolated red/bulge-dominated galaxies SPARC rotation curves (mean + 2D histogram)MOND (McGaugh+16, extrapolated)    Emergent Gravity (Verlinde+16, point mass) Unity (No dark matter: g obs = g bar ) Fig. 9. Measured RAR of isolated KiDS-bright lenses (black points with 1σ error bars) divided into four stellar mass bins.The mean galaxy mass (stars+cold gas) of the lenses is shown at the top of each panel.At increasing stellar mass, the measurements seem to rise above the predictions from MOND (grey solid line) and EG (red dashed line).However, at scales larger than R > 0.3 h −1 70 Mpc (light blue shaded region) this could be caused by false positives in the isolated galaxy sample due to the KiDS-bright redshift uncertainty.gobs ∝ √ g bar relation expected by the extended MOND and EG Fig. 10.Measured RAR of KiDS-bright lenses (points with 1σ error bars), respectively for isolated dwarfs (log(M / h −2 70 M ) < 10, blue) and the full isolated galaxy sample (log(M / h −2 70 M ) < 11, black).Due to the low S /N ratio of the dwarf lensing signal, the number of g bar -bins is reduced from 15 to 5. We find that the RAR of dwarfs is consistent with that of our regular sample, and with the extrapolated MOND and EG predictions (grey solid and red dashed lines), which are shown as a reference.