Does NGC 6397 contain an intermediate-mass black hole or a more diffuse inner sub-cluster?

The existence of Intermediate Mass Black Holes (IMBHs) in the centers of globular clusters is heavily debated. We analyze proper motions from the Hubble Space Telescope (HST) and the second Gaia data release along with line of sight velocities from the MUSE spectrograph to detect imprints of an IMBH in the center of the nearby, core collapsed, globular cluster NGC 6397. For this, we use the new MAMPOSSt-PM Bayesian mass-modeling code, along with updated estimates of the surface density profile of NGC 6397. We consider different priors on velocity anisotropy and on the size of the central mass, and also separate stars into components of different mean mass to allow for mass segregation. The velocity ellipsoid is very isotropic throughout the cluster, and we argue that this must have been acquired early on. The data prefer the existence of a dark component in this cluster center of 0.8 to 2 percent of the total mass of the cluster. However, we find robust evidence to rule out a central IMBH in NGC 6397, in favor of a diffuse dark inner sub-cluster of unresolved objects with total mass ranging from 1000 to 2000 solar masses, half of it concentrated in a projected radius of 2.5 to 5 arcsec (1 to 2 percent of the stellar effective radius). These results require the combination of HST and Gaia data: HST for the inner diagnostics, and Gaia for the outer surface density and velocity anisotropy profiles. The small effective radius of the diffuse dark component suggests that it is composed of compact stars (white dwarfs and neutron stars) and stellar-mass black holes, whose inner locations are caused by mass segregation from dynamical friction given the high masses of their progenitors. We argue that stellar-mass black holes should dominate the mass of this diffuse dark component.


Introduction
When the Laser Interferometer Gravitational-Wave Observatory (LIGO) first detected gravitational waves coming from a stellarmass black hole merger (Abbott et al. 2016) and then the Event Horizon Telescope (EHT) released the first image of the supermassive black hole in M87 (Event Horizon Telescope Collaboration et al. 2019), astronomers obtained the most compelling evidences about the existence of those intriguing and particular objects. Nevertheless, black holes (BHs) have been treated as more than a theoretical object for a considerable amount of time, starting from the discovery of quasars (Schmidt 1963) and their identification as supermassive black holes (SMBHs, Hoyle & Fowler 1963). This class of black holes, which reside in the centers of massive galaxies, are responsible for extremely luminous sources in the universe, such as quasars and Active Galactic Nuclei (AGN), sometimes unleashing powerful jets of relativistic matter, along with outflows that have a profound impact on star formation and galaxy evolution (e.g. Croton et al. 2006;Hopkins et al. 2006). Moreover, the existence of black holes had been predicted well before AGN studies (Oppenheimer & Snyder 1939), and further confirmed with observations (e.g. Webster & Murdin 1972, Bolton 1972, as the final step of the life of massive stars ( 10 M ) after their final gravitational collapse into stellar-mass black holes, with masses in between ∼ 1 M and ∼ 10 2 M (e.g. Remillard & McClintock 2006).
Since there is no theoretical limit to the mass of a black hole, it would be reasonable to believe that intermediate-mass black Send offprint requests to: Eduardo Vitral, e-mail: vitral@iap.fr holes (IMBHs) could exist, filling the considerable gap between stellar-mass black holes and SMBHs, i.e. 10 2 − 10 5 M . However, there currently is little evidence for IMBHs (see reviews by Volonteri 2010 andGreene, Strader, &Ho 2019), with some important candidates highlighted (e.g. Kaaret et al. 2001, and more recently Chilingarian et al. 2018 andLin et al. 2020) and one gravitational wave confirmation (The LIGO Scientific Collaboration et al. 2020). Furthermore, IMBHs could help to explain many enigmas in astrophysics, such as filling up part of the dark matter mass budget (e.g. Haehnelt &Rees 1993 andLoeb &Rasio 1994) or providing massive seeds for high redshift quasars, whose high masses at such early times represent a challenge to current theories (Haiman 2013). Therefore, great efforts have been undertaken to detect IMBHs to better understand their origin and evolution.
Globular clusters (GCs) appear to be a unique laboratory to test the existence of IMBHs. These quasi-spherical star clusters are known to have old stellar populations, indicating that they formed at early epochs. Their high stellar number densities provide an excellent environment to increase stellar interactions that could give birth to compact objects. More precisely, contrary to galaxies, the rates of stellar encounters in the inner parts of GCs containing half their stellar mass are sufficiently high to statistically affect the orbits of their stars by two-body relaxation (Chandrasekhar 1942). Moreover, after several relaxation times, the interplay between the negative and positive heat capacities of the inner core and outer envelope leads to transfers of energy leading to the gravothermal catastrophe, where the core collapses leading to a steep inner density profile, while the en-Article number, page 1 of 26 arXiv:2010.05532v1 [astro-ph.GA] 12 Oct 2020 A&A proofs: manuscript no. main velope expands. Roughly one-fifth of GCs are believed to have suffered such core-collapse (Djorgovski & King 1986).
Several scenarios have been proposed for the existence of IMBHs in GCs. One is the direct collapse of population III stars (Madau & Rees 2001), but the link of population III stars with GCs is not clear. The other mechanism is the growth of mass by physical collisions. Portegies Zwart & McMillan (2002) proposed a runaway path to IMBH formation in GCs, where an initially massive star suffers multiple physical collisions with other stars during an early phase of the GC core collapse. During these collisions, the most massive star will lose linear momentum, ending up at the bottom of the gravitational potential well. At the same time, its mass will grow during the successive stellar mergers, to the point that it will end up as a BH, possibly reaching 0.1 percent of the GC stellar mass. The authors emphasize that for this scenario to take place, the timescale for the most massive stars to suffer mass segregation must be less than the lifetime of these stars, otherwise they could lose enough energy and heat during their mass loss to reverse core collapse. On the other hand, Miller & Hamilton (2002) proposed a slower process for IMBH formation, where dynamical friction (Chandrasekhar 1943) causes the most massive stellar remnant BHs to sink to the center of the gravitational well. Thus a 50 M stellar remnant BH, sufficiently massive to avoid being ejected from the GC by dynamical interactions, would grow in mass through mergers with these other massive BHs as well as other typically massive stars, reaching a mass of 1000 M over the Hubble time. They suggest that this process generates IMBHs in some ten percent of GCs.
These models present, however, drawbacks: The short relaxation time needed in the Portegies Zwart & McMillan scenario usually requires primordial mass segregation in order not to eliminate too many GCs candidates, while the assumption by Miller & Hamilton (2002) of BH seeds above ≈ 50 M is not expected as the massive progenitors are fully exploded in pairinstability supernovae (e.g. Woosley 2017).
Moreover, many attempts to detect IMBHs have been somewhat inconclusive: Dynamical modeling is still dependent on the assumptions concerning the confusion between the IMBH and a central sub-cluster of stellar remnants (e.g. den Brok et al. 2014, Mann et al. 2019and Zocchi et al. 2019. Furthermore, these analyses usually rely on too few stars inside the sphere of influence of the IMBH, which can lead to false detections (Aros et al. 2020). Besides, searches for signs of accretion indicate no strong evidence for > 1000 M black holes in galactic GCs (e.g. Tremou et al. 2018).
In this paper, we analyze the core-collapsed Milky Way globular cluster NGC 6397, and search for kinematic imprints of a central IMBH or sub-cluster of unresolved objects (hereafter, CUO). We use very precise and deep Hubble Space Telescope (HST) proper motions (PMs) from Bellini et al. (2014) and line-of-sight (LOS) velocities from the Multi Unit Spectroscopic Explorer (MUSE) instrument on the Very Large Telescope, to which we added PMs from the Gaia Data Release 2 (hereafter, Gaia or Gaia DR2).
NGC 6397 has been broadly investigated in previous studies. The PMs lead to plane-of-sky (POS) velocities with equal radial and tangential components, leading to the appearance of isotropic orbits (Heyl et al. 2012 andWatkins et al. 2015a with HST for the inner orbits, and Jindal et al. 2019 with Gaia for the outer orbits). But this projected velocity isotropy can hide threedimensional (3D) variations of the anisotropy of the 3D velocity ellipsoid. Kamann et al. (2016) used the JAM mass-orbit modeling code (Cappellari 2008) to fit the LOS velocities obtained with the MUSE spectrograph, and found that a 600 ± 200 M black hole best fits their data, agreeing with the radio emission upper limit of 610 M for the same cluster provided in Tremou et al. (2018). On the other hand, numerical N−body modeling by Baumgardt (2017) strongly excluded the possibility of an IMBH in NGC 6397, not to say that much of the non-luminous mass measured in this cluster could actually be in the form of unresolved white dwarfs and low-mass stars (Heggie & Hut 1996).
Many of the mass modeling studies performed with this cluster assumed or constrained its surface density parameters to values estimated long ago (e.g. Trager, King, & Djorgovski 1995), that may suffer from problems such as radial incompleteness, which is better addressed by missions such as HST and Gaia. This lack of accuracy on the surface density can strongly impact the dynamical analysis of NGC 6397.
In the present study, we performed state-of-the-art massorbit modeling to analyze the presence of an IMBH or a CUO in the center of NGC 6397 and better measure the radial variation of its velocity anisotropy. Among the different popular mass/orbit modeling (e.g. Mamon, Biviano, & Boué 2013, Watkins et al. 2013, Read & Steger 2017, Vasiliev 2019a; see chap. 5 of Courteau et al. 2014 for a review), we used an extension of the MAMPOSSt: Modeling Anisotropy and Mass Profiles of Observed Spherical Systems (Mamon et al. 2013) Bayesian code, which now takes into account both LOS velocities and PMs (MAMPOSSt-PM, Mamon & Vitral in prep.). MAMPOSSt-PM was found to perform extremely well in a code challenge using mock data for dwarf spheroidal galaxies, with many similarities to GCs (Read et al. 2020).
We profit from the versatility of MAMPOSSt-PM to fit not only cases with a single population of stars, but also with two mass populations, thus allowing for mass segregation. We also provide new fits to the surface density parameters of this cluster by jointly modeling HST and Gaia data.

MAMPOSSt-PM
We summarize here the main aspects of MAMPOSSt-PM; the reader will find more details in Mamon & Vitral, in prep. MAMPOSSt-PM. MAMPOSSt-PM is a Bayesian code that fits parametrized forms of the radial profiles of mass and velocity anisotropy to the distribution of observed tracers (here, GC stars) in four-dimensional projected phase space (PPS). The mass profile is the sum of observed tracers, as well as a possible central BH, and additional dark components. By working on discrete stars, MAMPOSSt-PM can probe the inner regions better than methods where velocity or PM data are binned in radial intervals (e.g., van der Marel & Anderson 2010), and an analysis of dark matter on mock dwarf spheroidals has shown that the conclusions can depend on how the data are binned (Richardson & Fairbairn 2014).
The probability density that a tracer at projected radius R has a velocity vector v is where Σ(R) is the surface density (SD), while g(R, v) is the density in PPS, such that with N p (R) = R 0 2π R Σ(R ) dR being the projected number, i.e. the number of points (e.g. stars in a star cluster) in a cylinder of radius R. The likelihood is then calculated as where the i indices represent individual points (stars). The observed stars tend to be located in in the GC. But, inevitably, some observed stars will be interlopers (here field stars, FS) mostly foreground in our case, but possibly some background too. MAMPOSSt-PM splits the PPS distribution into separate GC and FS components. Eq. (1) becomes where PM is the 2D vector of PMs (µ α, * , µ δ ), η = 4.7405 D is the conversion of PM in mas yr −1 to plane-of-sky velocity in km s −1 given the distance D to the system, in kpc. The 2nd equality of Eq. (4) assumes that the FS surface density is independent of position and displays the probability distribution function for FS LOS velocities and PMs. 1 The GC contribution to the PPS density is the mean local GC velocity distribution function h, averaged along the line of sight (Mamon et al. 2013;Read et al. 2020): where r is the 3D distance to the system center, while ν(r) is the 3D tracer density (here stellar number density) profile. MAMPOSSt-PM assumes that the local velocity ellipsoid, i.e. the local 3D velocity distribution function (VDF) is separable along the three spherical coordinates (r, θ, φ): where θ and φ are the tangential components of the coordinate system and r is the radial component, while σ 2 i stands for the velocity dispersion of the component i of the coordinate system.
When considering spherical symmetry in velocity space, the variances of the composite Gaussian VDF h(v|(R, r) can be written as where Eq. (9a) is from Binney & Mamon (1982), while Eqs. (9b) and (9c) are from Strigari et al. (2007). In Eqs. (9), the radial velocity variance σ 2 r (r) is obtained by solving the spherical stationary Jeans equation with no streaming motions (in particular, rotation): where M(r) is the total mass profile. The distribution of interlopers in PPS is straightforward: the spatial density is assumed uniform (hence Σ FS does not depend on R in eq. [4]). The distribution of LOS velocities, f LOS (v LOS ) in Eq. (4), is assumed Gaussian. The two-dimensional distribution of PMs, f PM (PM) in Eq. (4), is found to have wider wings than a Gaussian (see Sect. 5.2.3 and Appendix B.1). The POS velocity errors are not added in quadrature to the velocity dispersions, because the PM distribution of interlopers has wider tails (Appendix B.2). Instead, MAMPOSSt-PM uses an approximation (Mamon & Vitral, in prep.) for the convolution of PM errors with the PM dispersions.

Dark component
MAMPOSSt-PM allows for a diffuse dark component with a large choice of analytical density profiles. This dark component could represent dark matter or alternatively unseen stars.
Dark matter (DM) may dominate the outskirts of GCs, and perhaps also deeper inside. Contrary to distant GCs (Ibata et al. 2013), the presence of substantial DM in the outskirts of NGC 6397 is unlikely because this GC appears to have an orbit around the Milky Way that takes it from an apocenter of roughly 6.6 kpc to a pericenter of 2.9 kpc in 88 Myr (Gaia Collaboration et al. 2018a), so it should have suffered from dozens of previous episodes of tidal stripping, leaving little DM (Mashchenko & Sills 2005), if there was any in the first place. In particular, the orbital parameters of Gaia Collaboration et al. (2018a), Z = −0.48 ± 0.01 kpc and W = −127.9 ± 3 km s −1 , indicate that NGC 6397 just passed through the Milky Way disk only ≈ 4 Myr ago. 2 Could there be DM in the inner regions of GCs like NGC 6397? Using N-body simulations of isolated ultra-compact dwarf galaxies, which appear to be larger analogs of GCs, but with relaxation times longer than the age of the Universe, A&A proofs: manuscript no. main Baumgardt & Mieske (2008) found that the inner DM component is heated by dynamical friction from the stars, causing a shallower inner DM distribution than that of the stars. Shin et al. (2013) traced the dynamical evolution of NGC 6397 with Fokker-Planck methods, discarding those particles that reach the GC tidal radius. They found that the GC could have contained up to one-quarter of its initial mass in the form of DM and match the present-day surface density and LOS velocity dispersion profiles.
On the other hand, there may be unseen matter in the core of the GC that is not dark matter per se, but composed of unseen stars, i.e. an inner nuclear cluster of faint stars, possibly white dwarfs, neutron stars, or stellar-mass black holes (Zocchi et al. 2019;Mann et al. 2019) and binary stars (Mann et al.). We thus performed MAMPOSSt-PM runs including such a CUO instead of (or in addition to) an IMBH.

Velocity anisotropy profile
With LOS data, MAMPOSSt is able (at least partially) to lift the degeneracy first pointed out by Binney & Mamon (1982) between the radial profiles of mass and velocity anisotropy (Mamon et al. 2013). With the three components of the velocity vector, MAMPOSSt-PM is even much more able to disentangle mass and anisotropy profiles (Read et al. 2020;Mamon & Vitral in prep.). With the data used in this paper, we have thus a unique chance of constraining the mass profile of our sources, which is essential to restrain the estimates on the IMBH.
While GCs are often modeled with isotropic velocities, we performed many runs of MAMPOSSt-PM with freedom in the anisotropy profile. Indeed, a central IMBH may modify the orbital shapes. Furthermore, the outer orbits of isolated GCs are often thought to be quasi radial (Takahashi 1995). Moreover, the frequent passages of NGC 6397 across the Galactic disk (Sect. 1) could alter the orbital shapes of GC stars.
The anisotropic runs of MAMPOSSt-PM used the generalization (hereafter gOM) of the Osipkov-Merritt model (Osipkov 1979;Merritt 1985) for the velocity anisotropy profile: where r β is the anisotropy radius, which can be fixed as the scale radius of the luminous tracer by MAMPOSSt-PM. 3

Global structure and distance of NGC 6397
Our mass-orbit modeling of NGC 6397 (Sect. 2 below) assumes a distance, the knowledge of the GC center, spherical symmetry, and the absence of rotation. We investigate these assumptions for NGC 6397.

Spherical symmetry
NGC 6397 appears to be close to spherical symmetry. Its elongation on the sky is 0.07 (Harris 2010).

Center
Our analysis assumes that the IMBH is located at the GC center, so our choice of center is critical. We have considered two cen-3 Mamon et al. (2019) found no significant change in models of galaxy clusters when using this model for β(r) compared to one with a softer transition: β(r) = β 0 +(β ∞ −β 0 ) r/(r+r β ), first used by Tiret et al. (2007).  (Goldsbury et al. 2010, using HST data) and (RA,Dec) = (265 • .1697, -53 • .6773) (Gaia Collaboration et al. 2018a, using Gaia data), both in epoch J2000. We selected the center with the highest local stellar counts. It is dangerous to use our HST star counts for this, because the positions are relative to the center, hence need to be re-scaled assuming a given center (see Sect. 4.1). We used instead the Gaia star counts, which benefit from absolute positional calibration. Figure 1 shows a Gaussian-smoothed map of Gaia star counts in the inner 100 × 100 (half the HST field of view) of NGC 6397, where we overplotted the centers obtained by Goldsbury et al. (2010) andby Gaia Collaboration et al. (2018a). Clearly, the center of Gaia Collaboration et al. (2018a) is less well aligned with the density map than the center of Goldsbury et al.. We therefore select the Goldsbury et al. center at (265 • .1754, -53 • .6743) in J2000 equatorial coordinates.

Distance
The adopted distance is important, because the mass profile at a given angular radius (e.g. r in arcmin) deduced from the Jeans equation of local dynamical equilibrium (Eq. [10]) varies roughly as M ∝ rσ 2 v , i.e. as distance D for given LOS velocities, and as D 3 for given PMs. Table 1 shows a list of distance estimates for NGC 6397. Recent estimates are bimodal around 2.39 kpc and 2.64 kpc. The CMD-based distance estimates favor the larger distance, the kinematics favor the smaller distance, while the parallax distances point to small (HST) or large (Gaia) distances.
We estimated a kinematical distance by equating the LOS velocity dispersion of stars measured with VLT/MUSE (see Sect. 4.3,below) with the HST PM dispersions of the same stars (see Sect. 4.1). To avoid Milky Way field stars, we used the cleaned MUSE and HST samples, for which we had 692 matches, among which 445 with separations smaller than 0 . 1. 4 For these 445 stars, we measured a LOS velocity dispersion Vitral & Mamon: Does NGC 6397 contain an IMBH or a more diffuse inner cluster?  1)). This yields a kinematic distance of 4.91/[c (0.439 + 0.442)/2] = 2.35 ± 0.10 kpc, where c = 4.7405 is the POS velocity of a star of PM = 1 mas yr −1 located at D = 1 kpc. We adopted the lower distance of 2.39 kpc (i.e. a distance modulus of 11.89), for three reasons: 1) We trust more the HST parallax than the Gaia-DR2 parallax, given that the former is based on a much longer baseline; 2) Our study of NGC 6397 is based on kinematics, and thus the larger distance would lead to abnormally high POS velocity dispersions compared to the LOS velocity dispersions. 3) It is consistent with our kinematic estimate of the distance. We do not adopt our estimated kinematic distance of 2.35 ± 0.10 kpc, because it is based on the perfect equality of LOS and POS velocity dispersions, which supposes velocity isotropy, which in turn is not certain. For our adopted distance of 2.39 kpc, 1 arcmin subtends 0.7 pc.

HST data
The HST data were kindly provided by A. Bellini, who measured PMs for over 1.3 million stars in 22 GCs, including NGC 6397 (Bellini et al. 2014). The data for NGC 6397 has a 202 arcsec square field of view (Wide Field Camera) and reached down to less than 1 arcsec from the Goldsbury et al. (2010) center (see the right panel of Figure 8). This minimum projected radius to the center is smaller than the BH radius of influence r BH ∼ G M BH /σ 2 0 0.11 (M BH /600 M ) pc (using the LOS velocity dispersion of Sect. 3.3), which corresponds to 9 (M BH /600 M ) arcsec, given our adopted distance of 2.39 kpc.
This dataset was provided in a particular master frame shape (for details, see Table 29  zero. The first step in our analysis was to convert the positions and PMs to the absolute frame.

HST absolute positions
We applied the Rodrigues (1840) rotation formula to shift the relative positions back to their original center to translate the GC stars to their true positions on the sky. We used the center of Goldsbury et al. (2010), which was the one considered by Bellini et al. (2014). We then rotated the subset with respect to its true center, so that the stars originally parallel to the dataset's increasing x axis remained parallel to the right ascension increasing direction. We then verified our method by matching the stars in sky position with Gaia (see Sect. 4.4 below).

HST absolute proper motions
The HST PMs were measured relative to the bulk PM of the GC. We corrected the relative PMs of Bellini et al. (2014) with their provided PM corrections. We just added columns 4 and 5 to columns 31 and 32 from Table 29 of Bellini et al. (2014), respectively. We then converted the relative PMs to absolute PMs by computing the bulk PM of NGC 6397 using the stellar PMs provided by Gaia DR2 as explained in Sect. 4.2 below.
The small field of view of HST and the few pointed observations do not allow the observation of sufficiently numerous background quasars to obtain an absolute calibration of HST PMs. On the other hand, the Gaia reference frame obtained with more than half a million quasars provides a median positional uncertainty of 0.12 mas for G < 18 stars (Gaia Collaboration et al. 2018b) and therefore allows us, by combining its accuracy with HST's precision, to know NGC 6397 PMs with unprecedented accuracy. We will compare the PMs of stars measured both by HST and Gaia in Sect. 4.4.
Our HST data had 13 593 stars. The PM precision varies across the field of view because of the different time baselines and the smaller size of the older HST cameras. Figure 2 shows that the PMs are much more accurate in a large square portion of the field of view, including the western part and extending to the central region. At magnitude F606W = 17, the one-dimensional PM precision is 0.02 mas yr −1 in the more precise portion of field of view and 0.08 mas yr −1 outside.

Gaia data
Gaia DR2 presented an overall astrometric coverage of stellar velocities, positions and magnitudes of more than 10 9 stars in the Milky Way and beyond. Gaia measured PMs in over 40 000 stars of magnitude G < 17 within a 1 • cone around NGC 6397. These PMs have a typical precision of 0.17 mas/yr at G = 17. Since the Gaia G and HST F606W magnitudes differ by less than 0.1, one sees that Gaia PMs are less precise than those of HST by factors of 2 to 8 depending on the region of the HST field of view. The Gaia data was also used to infer the number density profile out to large distances from the GC center.

VLT/MUSE data
We complemented the PMs using LOS velocities that Husser et al. (2016) acquired with the MUSE spectrograph on the VLT. A mosaic of 5×5 MUSE pointings led to an effective square field of view of 5 arcmin on the side. The bulk LOS velocity is v LOS = 17.84 ± 0.07 km s −1 . In a companion article, Kamann et al. (2016) assigned membership probabilities to the stars according to their positions in the space of LOS velocity and metallicity, compared to the predictions for the field stars from the Besançon model of the Milky Way (Robin et al. 2003). The data, kindly provided by S. Kamann, contained 7130 LOS velocities, as well as the membership probabilities.

Positional accuracy
We used Tool for OPerations on Catalogues And Tables (TOPCAT, Taylor 2005) to check the match positions between catalogs, which we did in our own Python routines that filtered the original datasets (Sect. 5 below). Performing symmetric matches among each pair of datasets, with a maximum allowed separation of 1 arcsec, we obtained median separations of 0 . 38 for the 4455 stars in common between Gaia and HST, as well as 0 . 01 for the 4440 stars in common between MUSE and HST.

Proper motions
Many of the 4455 stars in common between HST and Gaia have very high Gaia PM uncertainties (we saw in Sect. 4.2 that HST PM uncertainties were much smaller, but a few have large values). Limiting to 1179 stars in common with both Gaia and HST PM uncertainties below 0.4 mas/yr, we find that the HST PMs in the (RA,Dec) frame are (0.006±0.812,0.029±1.239) mas/yr above those from Gaia, where the "errors" represent the standard deviation. The uncertainties on the means are √ 1179 = 34 times lower, i.e. (0.024,0.036) mas/yr. The mean shifts in PMs are (0.3,0.8) times the uncertainties on the mean, and thus not significant. This possible, but not statistically significant offset in PMs, is of no concern as long as we only use HST data, since we analyze them relative to the bulk PM of the GC. But in MAMPOSSt-PM runs with the combined HST+Gaia dataset, it may have a small effect.

Data cleaning
We now describe how we selected stars from each dataset for the mass modeling runs with MAMPOSSt-PM. We also used more liberal criteria to select stars in our estimates of the surface density profile (see Sect. 7.1).
For each dataset, we discarded all stars with PM errors above half of 0.394 mas yr −1 , which is the one-dimensional PM dispersion of NGC 6397 measured by Baumgardt et al. (2019) for the innermost 2000 Gaia stars, whose mean projected radius was 99 . 25.
The PM error is computed as the semi-major axis of the error ellipse (eq. B.2 of Lindegren et al. 2018): where denotes the error or uncertainty and where ρ is the correlation coefficient between µ α * and µ δ . 5 Our HST data does not provide ρ.
We thus select stars with µ < 0.197, mas yr −1 , or equivalently for which the LOS velocity error satisfies v LOS < 0.197 c D = 2.23 km s −1 , for D = 2.39 kpc and where c = 4.7405 (see Sect. 3.3). Figure 3 shows the PM errors versus magnitude for HST (blue) and Gaia (red), as well as the equivalent PM error for the LOS velocity errors of Gaia (purple) and MUSE (green). The figure indicates that our maximum allowed PM errors are reached at magnitudes m F606W = 19.7 for the worst HST data (those with the shortest baseline, see Fig. 2), and for virtually all magnitudes for the other HST stars. Gaia stars reach the maximum allowed PM error at typically G = 17.5. The equivalent LOS velocity error limit is reached at magnitude m F606W = 16.8  Table 2. In blue, we display all the stars in the HST sample, while in red we display the same subset after applying the low PM error cut mentioned in section 5.1.1. The green dashed circle represents the very liberal hard cut on PMs to remove Milky Way field stars and the green cross highlights the mean PM presented in Table 2. for MUSE, but only at G = 13.4 for Gaia. The Gaia radial velocities are thus of little use for our modeling, as too few of them have sufficiently precise values.

HST data cleaning
After selecting the stars with low PM errors, we cleaned our HST data in 3 ways: We discarded stars with 1) PMs far from the bulk PM of the GC; 2) lying off the color-magnitude diagram; 3) associated with X-ray binaries.

HST proper motion filtering
The higher surface number density of stars in the inner regions of NGC 6397 allows to distinguish GC stars with field stars in PM space, as shown in Figure 4. Moreover, some high velocity stars could in principle be caused by very tight (separations smaller than ∼ 0.1 AU) GC binaries. We made a very liberal cut to select GC stars in PM space, using a circle of radius 6 mas yr −1 (green circle in Fig. 4), corresponding to 15 times the PM dispersion measured by Baumgardt et al. (2019). The PM center of the GC was set at the mean of the values in Table 2, below. Our cut in PM space also corresponds to a velocity dispersion of 75 km s −1 , which is over 13 times the highest LOS velocity dispersion measured by Kamann et al. (2016). We chose such a liberal cut to ensure that we would not miss any high velocity GC members, because otherwise our modeling would underestimate the mass. This left us with 9149 stars among the 9624 with low PM errors.
The filtering of field stars in PM space is not fully reliable, because the cloud of field stars in PM space (upper part of Fig-ure 4) may extend into the (smaller) GC cloud (and past it). We thus proceed to another filtering in the color-magnitude diagram (CMD).

HST color-magnitude filtering
The left panel of Figure 5 shows the locations in the CMD of the stars that survived the PM error cut and the filtering in PM space. While most stars follow a tight relation, a non-negligible fraction are outliers. We took a conservative cut of the stars on the CMD using kernel density estimation as displayed in the right panel of Figure 5. 6 This graph displays the 1, 2 and 3 σ contours of the kernel density estimation, drawn as black lines. We selected stars inside the 2 σ region, because the 3 σ region appears too wide to rule out binaries, while the 1 σ region was too conservative, extracting too few stars for our analysis.
CMD filtering not only removes field stars whose PMs coincide by chance with those of GC stars, but also removes GC members that are unresolved binary stars and lie in the edges of the main sequence, as well as particular Blue Stragglers in the 15 < F606W < 16.5 range, which are believed to be the result of past mergers of GC members (Leonard 1989). Removing binaries and stars who have gone through mergers is important, because their kinematics are dominated by two-body interactions, while our modeling assumes that stellar motions are dominated by the global gravitational potential of the GC. As discussed by Bianchini et al. (2016), three types of binaries need to be considered. 1. Resolved (i.e. wide) binaries will produce their own PMs that can be confused with the parallax. But, following Bianchini et al., the PMs of such resolved binaries, of order of a/T where a is the semi-major axis of the binary and T is the time baseline, are negligible in comparison to the GC velocity dispersion. Indeed, to be resolved at the distance of NGC 6397, the binary star should be separated by at least 0 . 1, which corresponds to 24 AU at the distance of NGC 6397; the most massive binaries, with mass below 2 M , will have a period of 90 years, i.e. a/T = 1.26 km s −1 , one quarter of the while less massive binaries will have longer periods, hence lower a/T . 2. Nearly-resolved binaries, will not be deblended, and their distorted image (in comparison to the PSF) will lead to less precise astrometry, which will be flagged. 3. The orbits of unresolved binaries, considered as single entities, will be those of test particles in the gravitational potential. But their higher mass should lead them to have lower velocity dispersions than ordinary stars, in particular in the dense inner regions of GCs where the two-body relaxation time is sufficiently short for mass segregation.
Our CMD filtering left us with 7259 stars among the 9149 surviving the previous filters.

Removal of X-ray binaries
Bahramian et al. (2020) detected nearly 200 X-ray sources within 200 arcsec from the center of NGC 6397, some of which could potentially be background active galactic nuclei. The remaining sources are thought to be X-ray binaries, and as such will have motions perturbed by their invisible compact companion. We therefore deleted the 50 HST stars whose positions coincided within 1 arcsec with any X-ray source.  Bianchini et al. (2016) found that the effect of unresolved binaries on GC dispersion profiles is only important for high initial binary fractions (e.g. ∼ 50%), which can induct a difference of 0.1 − 0.3 km s −1 in the velocity dispersion profile, mainly in the GC inner regions, where the binary fraction should be highest. For lower initial binary fractions, Bianchini et al. found that unresolved binaries do not significantly affect the PM dispersion profile, but only the kinematics error budget. Therefore, the binary fractions calculated by Davis et al. (2008) and Milone et al. (2012b) for NGC 6397 are clearly insufficient (i.e. 5% and 7%, respectively) to require a special treatment.

HST final numbers
After these cuts, we are left with 7209 stars from the HST observations (among the original 13 593). The combination of the CMD and PM filtering along with the removal of X-ray binaries effectively cleans the data of most of the binaries that would affect our modeling, only leaving binaries of different luminosities that are at intermediate separations (≈ 0.15 to 5 AU, causing peculiar motions between 1 and 15 times the GC velocity dispersion). We checked that using a less liberal cut of 7.5 σ instead of 15 σ affects very little our results.

Gaia data cleaning
We followed similar steps in cleaning the Gaia data as we did for the HST data.

Quality flags
We filtered the Gaia stars with µ < 0.197 mas yr −1 using two data quality flags proposed by Lindegren et al. (2018). First, we only kept stars whose astrometric solution presented a sufficiently low goodness of fit: where N is the number of points (epochs) in the astrometric fit of a given star and 5 is the number of free parameters of the astrometric fit (2 for the position, 1 for the parallax and two for the PM). Eq. (16) gives a sharper HR diagram, removing artifacts such as double stars, calibration problems, and astrometric effects from binaries. It is more optimized than the (astrometric_excess_noise< 1) criterion, used in Baumgardt et al. (2019) and Vasiliev (2019b), especially for brighter stars (G 15), according to Lindegren et al. (2018). Second, we only kept stars with good photometry.
where E = (I BP + I RP )/I G is the flux excess factor. Eq. (17) performs an additional filter in the HR diagram, removing stars with considerable photometric errors in the BP and RP photometry, affecting mainly faint sources in crowded areas. This poor photometry broadens the CMD, leading to more confusion with field stars. Those variables correspond to the following quantities in the Gaia DR2 archive:

Maximum projected radius
NGC 6397 likely suffers from tidal heating every time it passes through the Milky Way's disk, in particular during its last passage less than 4 Myr ago (Sect. 2.2). The tidal forces felt by GC stars during passages through the disk will produce velocity impulses that are effective in perturbing the least bound orbits, i.e. those of the stars in the outer envelope of the GC. The Gaia data can trace the effects of such tidal disturbances on the GC kinematics.
We searched for anomalous kinematics in NGC 6397 using Gaia DR2 out to a maximum projected radius of 1 degree with respect to NGC 6397's center. The top panel of Figure 6 indicates that the mean POS velocities in the radial and tangential directions differ beyond 8 . 7 We also checked the concordance of the radial and tangential components of the radial profiles of POS velocity dispersion (or equivalently PM dispersion). The bottom panel of Figure 6 shows excellent agreement between the two components of the velocity dispersion from 2 to 20 , suggesting that the velocity ellipsoid is nearly isotropic in this range of projected radii. We conservatively adopted a maximum projected radius of 8 as set by the divergence of the mean velocity profiles beyond that radius.

Gaia proper motion filtering and the bulk proper motion of NGC 6397
As for HST, we filtered Gaia stars in PM space to later filter them in CMD space. Since Gaia data extends to much greater projected radii from the GC center, i.e. to lower GC surface densities, the GC stands out less prominently from the field stars in PM space. We therefore first estimated the bulk PM of the GC and we assigned a first-order probability of membership using a GC+FS mixture model. We were tempted to assign two-dimensional (2D) Gaussian distributions for both GC stars and interlopers. However, the field star PM distribution has wider tails than a Gaussian. This means that stars on the other side of the GC, relative to the center of the field star component (i.e. its bulk motion) in PM space are more likely to be field stars than assumed by the Gaussian model. We found that the PM-modulus "surface density" profile (the velocity analog of the surface density profile) is well fit by a Pearson type VII distribution (Pearson 1916), as explained in detail in Appendix B.1. This distribution relies on two free parameters, a scale radius a and an outer slope γ, and can be written as: 7 Drukier et al. (1998) had noticed a similar effect in the M15 GC.
where µ = (µ α, * , µ δ ) and where the suffix i stands for the component analyzed, which in the case of Eq. (18) is the interlopers (i.e. field stars, hereafter FS). The reader can verify that, indeed, f µ (µ) dµ = 1, with dµ = 2 π µ dµ. With the respective Gaussian and Pearson VII distributions of PMs of GC stars and field stars, we performed a joint fit to the 2D distribution of PMs, which provided us with a precise bulk PM of the GC. For this, we considered the convolved expressions of the PM distributions of both GC and interlopers with the errors provided by the Gaia archive ( µ α, * , µ δ and ρ µ α, * µ δ ). When passing onto polar coordinates, the uncertainty propagation of where µ α, * µ δ = µ α, * µ δ ρ µ α, * µ δ . The convolution with Gaussian errors was straightforward in the case of GC stars, since their PM distribution was also modeled as a Gaussian, and thus we just added the errors to the dispersions in quadrature: However, the convolution of the field star distribution with Gaussian errors cannot be reduced to an analytic function; numerical evaluation of the convolution integrals for each star would dramatically increase the calculation time. We therefore used the analytical approximation for the ratio of convolved to raw probability distribution functions of PM moduli, (which is also incorporated in MAMPOSSt-PM), as quickly described in Sect. B.2 (details are given in Mamon & Vitral in prep.). This allowed us to perform our mixture model fit to the PM data, using Markov Chain Monte Carlo (MCMC) 8 to estimate bulk motions of both the GC and the field stars and assign probabilities of GC membership for each star. Table 2 displays our estimates of µ α, * and µ δ , which show a good agreement with the literature values for the GCs mean PMs. In the same table, we display the overall average of the literature plus our estimates, with its respective uncertainty , calculated as 2 = i 2 + σ 2 , where i stands for the uncertainties on the estimated values, and σ stands for their standard deviation. This overall average and uncertainties were also later used as priors for the MAMPOSSt-PM mass modeling analysis.
We only kept stars with GC membership probability higher then 0.9 according to our mixture model. This corresponds to a 1.43 mas yr −1 cut in the distribution of PMs of the GC, i.e. a 3.6 σ cut.

Gaia color-magnitude filtering
We CMD-filtered the Gaia data roughly following the same KDE method as we used for the HST data. The only differences were A&A proofs: manuscript no. main that 1) we used the equivalent filters G, B and R, and 2) we selected the 3 σ region of the KDE, instead of 2 σ, since the latter appeared to be too conservative a cut for Gaia. Figure 7 displays the final stars after this filter, along with the previously filtered subsets.

Removal of X-ray binaries
We removed the 5 X-ray binaries that were matched to Gaia stars.

Gaia final numbers and comparison to HST
The cleaned Gaia sample contains 1905 stars. Figure 8 shows that the spatial coverage of Gaia DR2 is indeed poorer than HST when taking into account only stars from the clean samples. While the cleaned HST sample contains 221 stars within the 9 IMBH radius of influence (for an IMBH with a mass of 600 M as measured by Kamann et al. 2016, see Sect. 4.1), the cleaned Gaia presents none, which is clearly insufficient to provide reliable results on the presence of an IMBH.

MUSE data cleaning
We filtered the MUSE sample by removing stars whose probability of being a member of the GC, according to their position in velocity -metallicity space , and kindly provided by S. Kamann), was less than 0.9. This step left us with 6595 stars among the original 7130. We then removed stars with LOS velocity errors greater than 2.232 km s −1 (half of the GC velocity dispersion), as well as stars that did not match the HST stars in a symmetric ≤ 1 arcsec match. We were left with 532 stars, of which 4 were previously identified as X-ray binaries (see section 5.1.3), yielding a final MUSE sample containing 528 stars, thus adding LOS velocities to ∼ 7% of the HST sample.
Our Gaussian prior on v LOS used the mean provided by Husser et al. (2016), i.e. v LOS = 17.84 km s −1 . For the uncertainty on v LOS , we allowed a much wider dispersion of 2.5 km s −1 instead of the 0.07 km s −1 of Husser et al., given the relatively wide range of bulk LOS velocities reported in other studies (Milone et al. 2006;Lind et al. 2008;Carretta et al. 2009a;Lovisi et al. 2012)).

Merging of the different datasets
Although we ran MAMPOSSt-PM on HST and Gaia individually for data homogeneity, we preferred to combine the data from HST, MUSE and Gaia to probe the mass and velocity anisotropy parameters with more accuracy. We started with the HST filtered subset, which accounted for most of the data we would use, and then we merged it to the other datasets following the steps below: 1. We restricted the Gaia stars to those with G magnitudes within the limits of F606W magnitudes from the HST subset, (i.e. 16.11 and 22.14), given the quasi equivalence between these two filters. This step ensured that mass segregation effects would be the same for all data subsets. 2. We removed Gaia stars that were symmetrically matched to HST stars to better than 1 arcsec (see Sect. 4.4.1), since HST PMs presented smaller errors (Fig. 3). 3. We incorporated LOS velocities from MUSE, according to the approach described in 5.3.
After these steps, we were left with 8255 stars, with 7209 of them with PMs from HST and 583 of them presenting LOS velocities.

Rotation
The presence of rotation in quasi-spherical systems makes the kinematical modeling more difficult. Any rotation will be inter-Article number, page 10 of 26 Vitral & Mamon: Does NGC 6397 contain an IMBH or a more diffuse inner cluster? preted by codes neglecting it, such as MAMPOSSt-PM, as disordered motions, and should lead to different mass profiles and deduce velocity anisotropies that are more tangential than in reality.
Fortunately, NGC 6397 does not appear to have significant amount of rotation. Indeed, while Gebhardt et al. (1995) found that the integrated light of the core of NGC 6397 showed rotation with a projected amplitude of 2 km s −1 , Kamann et al. (2016) concluded with individual stars from VLT/MUSE observations (see Sect. 4.3 below) that rotation contributes negligibly to the second velocity moment in this GC. Vasiliev (2019c) reported a typical uncertainty in Gaia DR2 PMs of up to ∼ 0.02 mas yr −1 at any radius, due to systematic errors and considered rotation to be confirmed only when its peak amplitude exceeded ∼ 0.05 mas yr −1 , which was not the case for his NGC 6397 measurements. On the other hand, Bianchini et al. (2018) found POS rotation at a 2 σ level for this GC, but with v/σ = 0.03 only. The cleaned HST data, which dominates the PMs at R < 1 , has v POSt = 0.004 ± 0.061 km s −1 (for a distance of 2.39 kpc), which is negligible compared to the mean POSt velocity dispersion of 5.2 km s −1 . Finally, Sollima, Baumgardt, & Hilker (2019) detected a rotation of 0.48 km s −1 from a 3D analysis, which is less than 10 per cent of its velocity dispersion. We therefore neglect rotation in NGC 6397.

Radial motions
Analogously, at smaller projected radii, the cleaned HST sample yields v POSr = −0.002 ± 0.061 km s −1 (for a distance of 2.39 kpc), calculated in the same fashion as v POSt .

Practical considerations
7.1. Surface density 7.1.1. Basic approach Eq. (4) requires the knowledge of the GC surface density (SD) profile, Σ GC (R). More importantly, Eq. (5) requires the knowledge of the 3D number density ν(r) when integrating the local VDF along the LOS. While MAMPOSSt-PM has a mode where it jointly fits the parameters of ν(r), M(r) and β(r) to the distribution of stars in projected phase space (see eq. [11] of Mamon et al. 2013, which is different from our eq. [4]), this requires the data to have constant completeness as a function of projected radius. If the astrometric measures of PMs are not independent of projected radius, one first needs to estimate (and deproject) the SD profile based on a wider dataset than that with PM values to obtain priors on the parameters of ν(r). Such an analysis had been performed for mass-orbit modeling of galaxy clusters with spectroscopic information whose completeness depended on projected radius (e.g. Mamon et al. 2019).
The PM data are likely to be incomplete at small projected radii, because of the increased crowding of stars as one moves inwards, especially for Gaia, whose mirrors are smaller than that of HST. We noticed that Gaia data with PMs traced less well the inner cusp of the SD than the full Gaia data (requesting only positions and G magnitude). Figure 9 shows that the Gaia PM subsample with G < 17 is indeed incomplete in the inner regions relative to the full G < 17 Gaia sample, since the distribution of projected radii of Gaia stars from the GC center is shallower for stars with precise PMs than for all stars (including those without PM measurements). A Kolmogorov-Smirnov test indicates that there is only 10 −5 probability that the distributions of projected radii of the subsample with precise PMs and that of the full sample (including stars without PM measurements and those with imprecise ones) arise from the same parent population. Another difficulty is to measure the outer envelope given confusion with field stars. This is important because MAMPOSSt-PM computes outward integrals. We therefore needed to provide MAMPOSSt-PM with a precomputed SD profile. In MAMPOSSt-PM, this profile had to be a simple analytical 1-or 2-parameter model (where free parameters are scale and possibly shape). More precisely, MAMPOSSt-PM assumes Gaussian priors on the pre-computed SD profile parameters.

Choice of model and main parameters
The choice of a good model for SD is crucial, because the mass of a possible IMBH is linked to the inner slope of the SD profile (van der Marel & Anderson 2010). NGC 6397 has long been known to have a cuspy (steep) inner SD profile (Auriere 1982;Lauzeral et al. 1992;Lugger et al. 1995;Noyola & Gebhardt 2006;Kamann et al. 2016), which prompted Djorgovski & King (1986) to classify it as core-collapsed. Unfortunately, no simple analytic model was ever fit to the SD profile of NGC 6397 extending to large projected radii. 9 Trager et al. (1995) estimated an effective (projected half-light) radius R e = 2 .90 for NGC 6397 using ground-based data and analyzing the SD profile in different apertures. Using Gaia DR2 data, de Boer et al. (2019) recently inferred R e = 3 .35 (which we infer from their 3D half-luminosity radius r h , coupled with their assumed distance and with the ratio R e /r h 0.732 kindly estimated for us by M. Gieles). But their analysis was based on a fit to two dynamical models that both assumed a cored inner density profile, which clearly does not represent the inner SD profile of NGC 6397.
We thus had to estimate the SD profile ourselves. The cuspy profile led us to fit a Sérsic model (Sérsic 1963, Sersic 1968) to the distribution of projected radii, since the Sérsic model has a shape parameter (Sérsic index) that makes it flexible to handle both cuspy and cored density profiles (e.g. Fig. A.1 of Vitral & Mamon 2020). Furthermore, there are simple analytical forms for the precise (but not exact) deprojection of the Sérsic model. In appendix A, we extend the analysis of Vitral & Mamon (2020) of the precision of the different analytical approximations as a function of R e and n, to even smaller values of R/R e representative of our data. MAMPOSSt-PM will then use the best-fit values of log R e and n and their uncertainties as the respective mean and standard deviation of the Gaussian priors.
We required HST data to fit correctly the inner cusp, and we also needed Gaia data to fit correctly the outer envelope. We therefore fitted the GC surface density profile Σ GC (R) together with the field star surface density Σ FS (assumed constant) from the Gaia DR2 photometric data, supplemented by the HST data to better probe the very inner regions. The two datasets are not cleaned in any way, except for cuts in magnitude and maximum (minimum) projected radius. The HST data may not be radially complete (we did not have a parent sample with all stars including those without PM measurements), but this is the best we can do. Our approach does not remove the Milky Way field stars, but instead incorporates them in our SD fit. It allows us to keep Gaia stars with no PM information, which would be filtered out in a PM cut, even though their contribution to the surface density shape is relevant.

Stitching HST and Gaia
We stitched the HST and Gaia datasets as follows: 1. We cut the Gaia stars (GC stars plus interlopers) at G = 17, where we expect negligible mass segregation. We explored different magnitude limits, but moderate changes affected little our fits. However, using very faint magnitude limits (G = 19 or 20.5), we noticed a lack of continuity in the transition from the HST to the Gaia radial range, which appears to be a consequence of overcrowding in the cluster's center (Arenou et al. 2018), where fainter magnitude stars are more difficultly detected. 2. We counted the number N G of Gaia stars inside the annulus 2 . 7 < R < 100 , corresponding to the region where our HST data seems complete (see Figure 8). 3. We removed the Gaia stars in that annulus and added the N G HST brightest stars (a tiny fraction of which may be interlopers) in that annulus, so we could correctly probe the inner surface density profile.

Surface density results
We performed maximum likelihood estimation (MLE) to fit log R e and n to the distribution of projected radii, using a model SD profile with a Sérsic GC plus a uniform (constant SD) field star contribution, considering the data up to a maximum projected radius, R max . 10 This maximum radius must be chosen large enough to capture the behavior of the SD profile in the outer envelope of the GC as well as the contribution of the field stars, but not too large to be overwhelmed by the field stars in the MLE fit. We performed such MLE fits of the SD profile for 500 values of log-spaced R max between 0 • .2 and 1 • . Figure 10 (left) shows a plateau for R max in the range ∼ 1300 − 1800 , where R e , n and the field star SD are each indeed constant. At lower (resp. higher) R max the field star SD fluctuates at lower (higher) values with corresponding higher (lower) and fluctuating R e and n.
For better precision, we then performed 30 MCMC runs for log-spaced R max between 1300 and 1800 arcsec and assigned the mean of best-fitted values and their uncertainties (green crosses and error bars of Figure 10, respectively) as the global Sérsic radius and index means and uncertainties. The results are displayed in Figure 10 (bottom). We found R e = 4 .51 ± 0 .36 and n = 3.26 ± 0.23. Our effective radius is significantly higher than the estimates of Trager et al. and de Boer et al. (2019), which both fail to properly capture the cusp. Indeed, the Sérsic SD profile with the same index (n = 3.26) but with the Trager et al. effective radius (green curve in the right panel of Fig. 10) fails to reproduce the steep inner SD profile.

Multiple populations
Mass-orbit modeling codes such as MAMPOSSt-PM assume a given population with a given SD profile, as well as given kinematics (i.e. p(v|R) for MAMPOSSt-PM). Systems with multiple populations, each with their SD profiles and kinematics, should be analyzed jointly. MAMPOSSt-PM can handle such multiple populations. In practice, we pre-determine priors for the SD profile of each population, and then run MAMPOSSt-PM with several tracer components.
There are two mechanisms that bring multiple populations in GCs. First, the formation of a GC may occur in several episodes, with different populations with their own chemistry and kinematics. Second, the short two-body relaxation time of GCs in general and especially of core-collapsed GCs such as NGC 6397, allows for energy exchanges between stars that are sufficient to drive them towards energy equipartition. This leads to mass segregation, where massive stars are confined to smaller radii and have lower typical velocities at a given radius. We examine these two mechanisms in turn.

Multiple chemical populations
Many GCs show signs of different chemical populations (e.g., Carretta et al. 2009a), in particular NGC 6397 (Carretta et al. 2009b;Milone et al. 2012a). Unfortunately, the separation of stars into different chemical populations in NGC 6397 was only performed with HST data limited to small projected radii. Such a chemical separation of stars is impossible with the Gaia data, for lack of sufficient wavebands to provide a large enough dimensionality of colors to distinguish chemical populations. Moreover, this requires cleaning the data for differential reddening. Therefore, the splitting of the stars of NGC 6397 into chemical populations, while possible, is beyond the scope of the present article.
Besides, Cordoni et al. (2020), who analyzed the spatial distributions and kinematics of seven GCs (other than NGC 6397) split by their two detected chemical populations, found that only the two GCs with the highest two-body relaxation times showed signs of different kinematics between their two chemical populations. If the two populations have roughly the same mass functions, then two-body relaxation should wash out any differences in their positions in projected phase space (Vesperini et al. 2013), except in their outer regions where two-body relaxation is incomplete. The very low relaxation time of NGC 6397 (600 Myr at its half-mass radius according to Baumgardt  kinematics, at least within its effective radius of 4 .5 as we found above.

Mass segregation
We explored the range of masses of the stars in our datasets by comparing synthetic CMDs to our observed ones.  Table 1). The left panel of Figure 11 shows that the PARSEC model fits well the CMD of the cleaned subsample of HST stars. The mass ratio among the heaviest and the lightest stars in our cleaned HST sample is 2.57, sufficiently high to worry about the possibility of mass segregation in our cleaned HST sample.
In fact, mass segregation is visible in NGC 6397 (Heyl et al. 2012;Martinazzi et al. 2014). Analyzing HST stars in the range 3 < R < 7 .5, Heyl et al. found mass segregation of Main Sequence stars: brighter stars show 4% lower median projected radius and 12% lower median PM moduli than fainter stars. In a subsequent study by the same team, Goldsbury et al. (2013) showed that two characteristic radii scale as a M γ , with γ = −1.0 ± 0.1 for NGC 6397. Martinazzi et al. found that the mean mass of Main Sequence stars drops with physical radius (after deprojection) from > 0.7 M for r < 10 to 0.56 M for r > 100 (after correcting for the completeness with magnitude, estimating it by adding artificial PSF-convolved stars to the images), i.e. a 20% effect.
Since MAMPOSSt-PM is able to treat multiple stellar populations together, we also performed SD fits with two populations of stars, as we explain below.
The magnitude threshold we chose to separate the two population of stars was based on the analysis of Heyl et al. (2012): Their figures 7 and 12 indicate that NGC 6397 Main Sequence stars present a (small) variation of radial distribution and velocity dispersion profiles, respectively, at a magnitude of F814W ∼ 18.75, which is equivalent to F606W = 19.76. We thus used this limit to divide our cleaned subset of 8255 stars into two populations, one with brighter magnitudes (6527 heavier stars) and the other with fainter ones (1728 less massive stars).
Unfortunately, we could not perform a surface density fit as robust as for the single population case, because 1) our data was considerably incomplete at higher magnitudes (the fainter subset) and 2) as mentioned in Sect. 7.1.3, the HST plus Gaia stacked subset presented discontinuous trends when allowing stars with fainter magnitudes (G > 17). We therefore let MAMPOSSt-PM fit the SD profile of each population from the kinematics only, i.e. from the conditional probabilities p(v|R). We adopted Gaussian priors for log R e , with mean equal to that found by our previous SD fit of the single population (Sect. 7.1) and a wide (0.2 dex) uncertainty. We also considered Gaussian priors for the Sérsic indices, but with lower uncertainties to avoid a degeneracy in its marginal distribution.
We helped MAMPOSSt-PM by providing narrow Gaussian mass priors for each population. We derived mass fractions for each population using the power-law Main Sequence stellar mass function of slope α = −0.52, 12 together with the mass limits of our subsets. Indeed, a power-law relation dN/ dm ∝ m α implies a total stellar mass in the range of stellar masses (m 1 , m 2 ) of and thus derive

MAMPOSSt-PM assumptions and priors
To summarize, MAMPOSSt-PM assumes that NGC 6397, lying at a distance of 2.39 kpc, with its center at the position 12 The slope α = −0.52 of the Main Sequence stellar mass function of NGC 6397 is given in H. Baumgardt's very useful web site on GCs, https://people.smp.uq.edu.au/HolgerBaumgardt/globular. found by Goldsbury et al. (2010), is a spherical self-gravitating system with no internal streaming motions (rotation, expansion/collapse, etc.). It also assumes that the local velocity distribution function is an ellipsoidal 3D Gaussian aligned with the spherical coordinates. By relying on the Jeans equation, MAMPOSSt-PM assumes that stars are test particles orbiting the the gravitational potential, and that they do not interact with one another. While binary stars have their own motions and violate the test particle hypothesis, these are in great part filtered out of our data. However, the low two-body relaxation time of NGC 6397 means that star-star interactions are not negligible, which violates the test particle assumption.
We now summarize our adopted priors. The surface density profile of GC stars is assumed to follow the Sérsic law. We adopted Gaussian priors for log R e and for n (see Sect. 7.1) for both single-and 2-population analyses.
Following Sect. 5.2.3, we adopted Gaussian priors on the bulk LOS velocity and PM of the GC and Pearson VII priors on the modulus of the bulk PM of the field stars. The field star LOS distribution is Gaussian with a wide (1.7 dex) uncertainty. The log ratio of field stars to GC stars is flat between -5 and -2.
We also adopted flat priors on the GC stellar log mass (3 to 6) for the single-population runs and Gaussian priors with equal means and wide dispersions for the 2-population runs (Sect. 7.2.2). The (sub-)cluster of unseen objects is modeled with different density models with a wide Gaussian (1 dex) prior on the scale radius uncertainty, assuming isotropic velocities.

Marginal distributions and covariances
We explored parameter space to determine marginal distributions and parameter covariances using the same Markov Chain Monte Carlo (MCMC) approach as Mamon et al. (2013Mamon et al. ( , 2019, see section 4.7 of Mamon et al. (2019). This makes use of the public Fortran CosmoMC code (Lewis & Bridle 2002). 13 In particular, we used 6 MCMC chains run in parallel and stopped the exploration of parameter space after one of the chains reached a number of steps N steps = 10 000 N free , where N free is the number of free parameters of the model. The only difference is that we now discard the 3000 N free first steps of each MCMC chain (burn-in phase) instead of the first 2000 N free as Mamon et al. previously did.

Bayesian information
There are several ways to compare the different results of MAMPOSSt-PM using different priors. The simplest would be to compare the log-likelihoods. But there is then the risk of overfitting (under-fitting) the data when using too many (few) free parameters. We considered two model selection criteria to distinguish our different models and priors. We first considered the corrected Akaike Information Criterion (Sugiura 1978) where AIC is the original Akaike Information Criterion (Akaike 1973) and where L MLE is highest likelihood found when exploring the parameter space, N free is the number of free parameters, and N data the number of data points. 14 We also considered the Bayes Information Criterion (BIC, Schwarz 1978): Given that the relative likelihood (given the data) of one model relative to a reference one is Since BIC penalizes extra parameters more (factor ln N data 9.0) than AICc (factor 2), BIC effectively prefers simpler models than does AICc. BIC is well suited for situations where the true model is among the tested ones, while AIC(c) is more robust in the opposite case (Burnham & Anderson 2002). In practice, we don't expect any of our models to be true: for example, there is no expectation that the surface density of stars in a GC should precisely follow a Sérsic model. To paraphrase the statistician George Box: "All models are wrong, but some are useful" (Box 1979). 15 We therefore present both AICc and BIC in our results, but we give preference to AICc whenever having to decide between different models. Table 3 displays our standard runs, along with some robustness tests on velocity isotropy, mass segregation and presence of central components. Our runs have come with 4 variations for the inner mass distribution: 1) no IMBH nor (sub)-cluster of unresolved objects (CUO, see Sect. 2.2), 2) IMBH with no CUO, 3) CUO without an IMBH, and 4) both IMBH and CUO.

MAMPOSSt-PM runs
The left panel of Figure 12 shows the marginal distributions and covariances from the MCMC output of model 1(IMBH without CUO and with isotropic stars), for a selection of missing parameters (missing are the bulk GC motion and the log ratio of field stars to GC members). The MCMC ran well in the 14 The main results of this paper use N data = 8255 and N free ∼ 10. 15 The angular fluctuation spectrum of the cosmic microwave background appears to be a counter-example. sense that the MLE values (red arrows) are close to the peaks of the marginal distributions (blue shaded histograms). One notices that higher IMBH masses are obtained with higher GC scale radii, independently of the Sérsic index. The reason is that, at given total stellar GC mass, more concentrated stars will lead to more mass in the inner regions, meaning less IMBH mass. This is the opposite of what van der Marel & Anderson (2010) proposed, but consistent with what Baumgardt (2017) found for NGC 6397.
The right panel of Figure 12 shows the same for model 10 (CUO without IMBH and with isotropic stars). Again, the MLE values are coincident with the peaks of the marginal distributions. But surprisingly, the MCMC favors a GC scale radius at the upper edge of the 3 σ prior obtained from the SD profile fit of Sect. 7.1.4. We will discuss this below. The mass of the CUO increases with the GC and CUO scale radii, and decreases with the GC Sérsic index (because low index leads to flatter SD profiles, which are somewhat analog to wider SD profiles). Interestingly, the scale radius of the CUO population is small (7 ), as we will discuss in Sect. 9.4.1.

Velocity anisotropy
The combination of HST data properly probing the inner regions of NGC 6397 with the Gaia data probing the outer regions allows us to estimate the velocity anisotropy across the cluster. We ran MAMPOSSt-PM using different priors on the anisotropy. Our standard prior has isotropic velocities throughout the GC. Our other priors assume the gOM anisotropy model (we found that the softer varying Tiret et al. 2007 model performs almost as well, but not better).
The highest likelihoods (lowest −∆ ln L max ) between the first five models are achieved by model 2 (everything free), but it did not converge (R −1 = 0.07 > 0.02). In particular, model 2 leads to quasi-isotropic inner and outer velocities, with a narrow constraint for β 0 and a quite wide one for β ∞ . The transition radius, r β , is very poorly constrained. Compared to model 2, the isotropic model 1 is very strongly preferred by BIC (according to Eq. [28] for ∆BIC = 26) and marginally so (92% confidence according to Eq. [27] for ∆AIC = 4.87) by AICc. Model 3 with the anisotropy radius tied to the effective radius of the SD profile (TAND) is strongly preferred by BIC Bayesian evidence (with respect to model 2), but not with AICc. Interestingly, model 3 leads to negligible anisotropy in the center with narrow uncertainty. But it points to mildly tangential anisotropy at r = r out = 8 , which corresponds to our maximum projected radius R max . Forcing central isotropy with TAND (model 4) leads to less tangential outer velocities. Conversely, forcing outer isotropy with TAND (model 5) leads to strongly well constrained isotropic inner velocities.
We also ran a CUO case, where the stars had free inner and outer anisotropy, but with TAND. Compared to its isotropic analog (model 10), model 13 with a single population has a slightly worse likelihood (but did not fully converge), and the inner and outer anisotropies are consistent with isotropic orbits (β 0 = −0.02 ± 0.05, β out = −0.03 ± 0.05). An analogous run for the 2-population CUO model also yields close to isotropic orbits for the bright population (β 0 = 0.03 ± 0.05, β out = 0.00 ± 0.12). The faint population has inner velocities consistent with isotropy (β 0 = −0.1 ± 0.1). On the other hand, its outer anisotropy is poorly constrained (β out = 0.1 +0.2 −0.7 ) because of the near complete lack of faint Gaia stars with decent PM errors, hence we are limited to the inner regions. Still, the outer velocities are again consistent with isotropy.
Article number, page 15 of 26 A&A proofs: manuscript no. main  (2): anisotropy radius (where TAND is tied anisotropy number density, r β = r scale ); (3): velocity anisotropy model ("iso" for isotropic); (4): surface density model ('S' for Sérsic, 'P' for Plummer, 'H' for Hernquist); (5): number of free parameters; (6): MCMC convergence criterion (R −1 ≤ 0.02 is considered as properly converged, worse convergence runs are shown in bold red); (7): anisotropy value at r = 0; (8): anisotropy value at r out = 8 ; (9): scale radius (effective -half projected number -radius R e for Sérsic, otherwise radius of 3D density slope -2, with r −2 = 1.63 R e for Plummer and r −2 = 0.28 R e for Hernquist); (10): Sérsic index; (11): black hole mass; (12): mass of inner sub-cluster of unresolved objects (CUO); (13): mass of the stellar population considered; (14): difference in minus natural logarithm of the maximum likelihood relative to model 14; (15): difference in AIC (eq. [25]) relative to best value; (16): difference in BIC (eq. [26]) relative to best value. Blue bold zeros for −∆ ln L max , ∆AICc and ∆BIC represent the reference values among all runs, for each column, respectively. Values in bold gray were fixed parameters. The values of columns (9) to (13) are at maximum likelihood (black) or medians (orange and italics, when the MLE is outside the 16-84 percentiles). The uncertainties are from the 16th and 84th percentiles of the marginal distributions. The number of stars in each subset is 8255, the maximum allowed PM error (eq. [12]) for those runs is 0.197 mas yr −1 , the distance was considered as 2.39 kpc, and the GC center was that of Goldsbury et al. (2010). For models considering two Main Sequence populations, the parameters of the brighter (fainter) population is displayed first (next).
In summary, there is strong evidence for isotropy with BIC, but weaker evidence with AICc. However, even the anisotropic runs produce anisotropy profiles that are very close to isotropic throughout. We thus conclude that the visible stars in NGC 6397 have quasi-isotropic orbits, at least for the stars brighter than F606W = 19.76 and up to R = 8 . We therefore adopted isotropic orbits as our standard when investigating other quantities.

Intermediate mass black hole
We tested the scenario with an IMBH and no CUO (models 1 to 6). Among models 1 to 6, the most likely one and very strongly favored by AICc (model 6, which is the analog of model 1, but with 2 populations for the GC stars) yields an IMBH mass of 511 +158 −207 M , while model 1, which is weakly favored by BIC over model 6, yields an IMBH mass of 658 +70 −338 M . Both models 1 and 6 indicate an IMBH mass above 200 M at 95% con-Article number, page 16 of 26 Vitral & Mamon: Does NGC 6397 contain an IMBH or a more diffuse inner cluster? fidence. Furthermore, AICc (resp. BIC) indicates very strong (resp. quite strong) evidence for the presence of an IMBH in the absence of a central diffuse component (i.e. comparing IMBH models 1 and 6 to models 7 and 8, respectively).
However, both AICc and BIC indicate strong evidence against the IMBH hypothesis in comparison with the presence of a CUO, with differences of 15.9 in AICc and 8.9 in BIC between isotropic, single-population models 1 and 10. With 2-population GC stars, we have differences of 12.2 in AICc between isotropic models 6 and 14. But with BIC, the difference is only 5.2, leading to only moderately strong evidence (92% confidence) of favoring the CUO over the IMBH. But given our preference for AICc for the complex physics of GC kinematics with imperfect models for the surface density profile for example (see Sect. 8), we conclude that the evidence is strong in favor of a dark component that is diffuse for the two-population model. Hence, the unseen inner matter of NGC 6397 is very likely to be diffuse. We therefore now investigate the CUO model in more detail.

CUO density profile
If the dark component is diffuse as a CUO instead of a singular IMBH, we first need to measure its extent. MAMPOSSt-PM can provide constraints on the shape and scale radius of the density profile of the unseen population, through the sole use of the conditional probabilities of velocity at given projected radius, p(v|R), without directly fitting the distribution of projected radii. Marginal distributions of the CUO effective radius and mass, and their covariance, for a preliminary MAMPOSSt-PM run for an isotropic, single-population plus Plummer CUO SD profile, with a prior on the log scale radius centered at r −2,CUO = R e,GC = 4 .51. The notation is the same as in Figure 12.
We first assumed a Plummer (1911) model for the CUO with the same effective radius as that of the GC stars (4 .5), but with a wide standard deviation (1 dex) for the Gaussian prior on log scale radius. Interestingly, as seen in Figure 13, MAMPOSSt-PM converged to r −2 = 7 , i.e. an effective (half-projected num- The Sérsic model (12) is the most likely one, closely followed by the Plummer model (10). AICc has a weak preference for the Plummer model, while BIC prefers the Plummer model, with strong evidence against the more complex Sérsic model, but weak evidence against the Hernquist model (11). Model 12 with the Sérsic CUO density profile produced a low Sérsic index of n = 0.92, which leads to a shallow inner slope that is not too different from the zero slope of the inner Plummer density profile (see Figure A.1 from Vitral & Mamon 2020). The steeper inner profile of the Hernquist density profile makes it less similar to the n = 0.92 Sérsic model than is the Plummer profile, as confirmed by the Hernquist model 11 producing the lowest likelihood (highest − ln L) among the three models.
In summary, it is hard to distinguish which is the best density model for the CUO scenario. There is weak evidence for a shallow slope. We adopted the Plummer model given that it is the preferred of the three density models for both AICc and BIC, albeit with weak evidence for both.

Presence of an IMBH in addition to the CUO
Could the center of NGC 6397 host both an IMBH and a diffuse dark component? Model 9 contains both, but it is somewhat less likely than model 10, and in comparison it is strongly disfavored by BIC, although only weakly disfavored by AICc. Note that the Plummer model used for the CUO is the one that best distinguishes the CUO from a possible additional IMBH. Moreover, the recovered mass of the additional IMBH is so small 42 +92 −26 M that it can no longer be called an IMBH. Table 3 also displays the MAMPOSSt-PM results of 4 models with two populations, split by apparent magnitude, hence by mass (Sect. 7.2.2). Model 14, with CUO but no IMBH, has the highest likelihood of the three models considering isotropy. It is very strongly favored with AICc over models 6 (with IMBH but no CUO) and 8 (no IMBH nor CUO). It is also preferred by BIC, which strongly favors it over model 8, but only marginally favors it over model 6 (∆BIC = 5.16, leading to 92% confidence in preferring model 14 over model 6,according to Eq. [28]). This preference of the CUO model over the IMBH model resembles that found for the single population (Sect. 9.3).

Two-mass populations
But there are differences in the MAMPOSSt-PM results between single-population and 2-population models, both in their Bayesian evidence and in their best-fit parameters. Indeed, the 2-population model 14 shows the highest likelihood and AICc evidence of the first fourteen models listed in Table 3. In particular, comparing 2-population models to their single-population equivalents, there is strong AICc evidence (∆AICc = 13.7) fa-voring 2-population model 14 than 1-population model 10. But there is strong BIC evidence (∆BIC = 7.3) the other way, with model 10 displaying the best BIC evidence of all fourteen models listed in Table 3.
The reader may note that the GC stellar masses summed over the one or two populations are 9% greater in the 2-population runs for CUO without IMBH scenarios (model 14 vs. model 10) and 13% higher for IMBH without CUO scenarios (model 6 vs. model 1). Moreover, the IMBH mass is 22% lower in the 2population model 6 relative to the single-population model 1. Given that the standard deviations in the log GC mass (single or sum of Bright and Faint) are less than 0.03 dex, the difference in the means of log M GC of 0.086 dex is highly significant for samples of order of 10 5 points, as we checked with a Student t test. Similarly, the standard deviations in log M IMBH are of order 0.2 dex; thus the difference of 0.037 dex in their means is again highly significant, as we also checked with the Student t test.
MAMPOSSt-PM yields interesting results on the differences between the bright and faint populations. First, the 2-population runs can be tested for the respective masses in each. Despite our prior of equal masses for each, MAMPOSSt-PM returns bestfit bright population mass fractions of 0.49, 0.44, and 0.57 for models 6 (IMBH without CUO), 8 (no IMBH nor CUO), and 14 (CUO without IMBH). Only model 14 has a bright fraction close to the expected value of 0.56 (Sect. 7.2.2).
Secondly, in all three 2-population models, the brighter population has a much lower scale radius than its fainter counterpart, by factors of 2.5, 2.6 and 2.1 for models 6, 8, and 14, respectively. These lower scale radii for the bright population are highly statistically significant. Indeed, the fractions of MCMC chain elements leading to higher scale radius of the brighter population are less than 0.04% for models 6, 8, and 14. Therefore, MAMPOSSt-PM is able to find very strong kinematic signatures of luminosity (hence mass) segregation, by fitting p(v|R) with the same priors on the scale radii of the two populations, without directly fitting the distribution of projected radii.
Recall that we used the same Gaussian priors on scale radius for both populations. As can be seen in Figure 14, these priors may have been too narrow, and MAMPOSSt-PM may have thus under-estimated the differences in the scale radii of the bright vs. faint populations.

Main results
We ran the MAMPOSSt-PM code on a combined set of PMs from HST and Gaia, as well as with LOS velocities from MUSE. Our results indicate that the GC star velocities are close to isotropic out to 2 effective radii. Our models with inner central components (IMBH or diffuse sub-cluster of unresolved objects, CUO) are strongly preferred over models without any. Models where the central mass is in the form of an IMBH find favored masses of 500 − 650 M , with a 5th lower percentile of 200 M . But we found better likelihoods and Bayesian evidence for a diffuse central unseen mass component (CUO), with effective (projected) radius is of order of 2.5 to 5 and mass of order 1000 to 2000 M .

Robustness
We tested the robustness of our results for several variations in our assumptions and our cuts to the data. Notes: the models are all isotropic single-component. The last two refer to our models 1 and 10, respectively.
First, the preference for the CUO component is relatively robust to our analysis with a single bright population or two populations. It is also quite robust to our assumptions on velocity anisotropy from free inner and outer values to isotropic. The differences in AICc and BIC of IMBH relative to CUO are +16 and +9 for single-population isotropic, +15 and +8 for single-population, free inner and outer anisotropy but fixing the anisotropy transition radius to the density scale radius (TAND), and +12 and +5 for double-population isotropic. In other words, CUO is very strongly preferred to IMBH by both AICc and BIC.
The preference for CUO vs. IMBH is also robust to the choice of the density profile. Indeed, adopting a Sérsic profile, but fixing the effective radius to 2 .9 as found by Trager et al. (1995), produces a worse match to the kinematic data. However, it strongly prefers the CUO model to the IMBH one, but provides much weaker constraints on the central masses.
Finally, the preference for a diffuse central mass (CUO) versus an IMBH is also robust to the choice of maximum allowed PM error. We obtain the same ∆AICc = 12 for both µ < 0.276, mas yr −1 and our standard choice of µ < 0.197, mas yr −1 , as well as the same ∆BIC = 5. On the other hand, to µ < 0.141 mas yr −1 , yields ∆AICc = 6.5 instead, presumably because of the small dataset satisfying this low maximum allowed PM error.

Effects of the dataset
It is useful to check how efficient are our different datasets in reaching our conclusions. Such a comparison will be useful for future mass-orbit modeling of other GCs, for example in the absence of HST and LOS data. Table 4 displays a summary of the masses and CUO size from single-component isotropic runs using different datasets. We used the same SD profile for all samples.
First, with MUSE data alone, we cannot constrain the mass of the IMBH. In contrast, Kamann et al. (2016) found M = 600± 200 M for the IMBH. The difference can be explained by the much more liberal cut in velocity errors used by Kamann et al., 5 km s −1 , compared to our cut at 2.2 km s −1 , which led them to have 4608 stars compared to our 528 stars. Our small MUSE sample prevents us from constraining the CUO mass and size. This is also true using Gaia data alone. It is not surprising, given that we have only 152 Gaia stars within the inner 100 and none within the inner 10 . This low number is caused by our cut on PM errors. Later Gaia releases will have lower PM errors, and will thus have larger numbers of inner stars to use for mass-orbit modeling. Moreover, a subset with just HST and MUSE, extending up to 2 .5 is not able to constrain the outer isotropy indicated by Gaia DR2, since a run similar to model 13 with this dataset yielded β(r = 8 ) = −1.0 +0.5 −0.3 , i.e. a significantly tangential anisotropy, in contrast with β(8 ) = −0.03 +0.11 −0.14 when Gaia data is also considered.
In summary, Gaia offers much improvement to HST(+MUSE) data, by 1) allowing a better estimation of the surface density profile, hence of the 3D density profile, and 2) allow measuring velocity anisotropy well beyond the effective radius. With the prior knowledge of the surface density profile and velocity anisotropy profile, the addition of Gaia provides somewhat better constrained masses for the inner component (IMBH or CUO). On the other hand, Gaia data by itself, has much too few stars with sufficiently accurate PMs at low projected radii to probe, the IMBH or CUO mass, and even less the CUO size. This highlights the requirement of combining HST and Gaia for proper mass-orbit modeling of the PM of stars in GCs.

Velocity anisotropy
Our combination of HST data probing the inner regions of NGC 6397 and Gaia data probing the outer regions, allowed us to obtain a wide-range view of the variation of the orbital anisotropy with radius. Bayesian evidence favors isotropy throughout the cluster. This confirms the projected isotropy previously determined by Heyl et al. (2012) from 3 .5 to 7 using HST (with 10% individual errors), as well as Watkins et al. (2015a): σ POSt = 0.98 σ POSr , up to R e /10 using HST, and by Jindal et al. (2019) at larger radii up to 12 3 R e using Gaia, with signs of radial motions at R > 8 .
But a constant projected isotropy can hide radial variations of the 3D anisotropy β(r). Figure 16 displays the first known con-straints on the velocity anisotropy profile of NGC 6397, which we obtained from our MAMPOSSt-PM analyses from 1 to 8 . Indeed, it is highly consistent with isotropic orbits throughout, as seen in Figure 16. Our parametric representation of β(r) prevents us to see a possible sharp up-or down-turn in this radial range. Nevertheless, MAMPOSSt-PM, like other mass-orbit modeling algorithms, probes physical radii beyond the maximum projected radius. This suggests that the orbits of stars at 2 effective radii are not much disturbed by the Milky Way, despite the differences in the two components of the mean POS velocity beyond R = 8 .
The velocity isotropy of NGC 6397 at fairly large radii appears to contradict models where core-collapsed GCs have isotropic inner motions and very radial outer motions (e.g. Takahashi 1995). But such models are for isolated GCs, whereas NGC 6397 has crossed the Galactic disk many times and as recently as 4 Myr ago (Sect. 2.2). Indeed, we noticed a difference in the mean POS velocities between the radial and tangential components beyond 8 (top panel of Fig. 6), suggesting that the outer parts of NGC 6397 are out of equilibrium.
Only two previous studies have measured velocity anisotropy profiles in GCs. Cluster ω Cen was modeled by van der Marel & Anderson (2010), who fit the PM dispersions measured in bins of projected radius. They found a gOM anisotropy profile with β 0 = 0.13 ± 0.02 and β ∞ = −0.52 ± 0.22, i.e. slightly radial in the center and slightly tangential in the outer regions. Messier 15 was analyzed by den Brok et al. (2014), who used an updated version of JAM (Cappellari 2008). They found a gOM anisotropy profile with β 0 = −0.21 ± 0.30 and β ∞ = 0.015 ± 0.12, i.e. consistent with isotropic at all radii.
In comparison, the present study provides the first Bayesian mass-orbit modeling of discrete GC data. Our model 13, as well as the bright population of model 15, both yield inner anisotropies consistent with zero with ≈ ±0.05 uncertainty (Table 3). Both yield anisotropies at 8 of |β| ≤ 0.03 with uncertainties ≈ 0.14 ( Table 3).
Which physical mechanisms are responsible for isotropic orbits at a few R e in NGC 6397? First, tidal forces produce radial perturbations (e.g., Dekel et al. 2003), so the Milky Way tidal field (from disk or bulge) cannot be blamed for the isotropic orbits. Note that the shape of the Milky Way gravitational potential is not very different from spherical, despite the presence of the disk ( fig. 2.19 of Binney & Tremaine 2008). Second, the passage through the differentially rotating Galactic disk could lead to exchanges of angular momentum leading to tangential orbits, in particular for the outer less bound stars. But on the scale of a GC, the differences in disk rotation velocity must be minute. Third, the stars in the outer parts of the GC must suffer violent relaxation during passages through the Galactic disk, but the passages are so short (a few Myr) that there may not be enough time for violent relaxation to effectively perturb the GC stars. Fourth, solenoidal modes of turbulence on Galactic gas clouds generate angular momentum within these clouds and within the stars formed therein, which in turn can transfer this angular momentum to the passing GC. But again, the short duration of these passages limits the amount of angular momentum that the GC stars can acquire. This suggests the isotropy of this GC and perhaps all was acquired early on. (M IMBH < 610 M ). Our 95% confidence lower limits (i.e. 5th percentiles) are 201 M for both models 1 (single population) and 6 (two-population). But if IMBHs grow by BH mergers, one expects that the gravitational radiation emitted during these mergers are anisotropic and lead to substantial recoil of the BH remnant of the merger (Peres 1962). Such recoils are strong enough for 70% of IMBHs of mass < 1000 M to escape a GC of escape velocity 50 km s −1 over time (Holley-Bockelmann et al. 2008). We can estimate the escape velocity from the GC center, v esc = √ −2 Φ(0), for our best IMBH model 6, given that the central potential is Φ 0 = −(2/π) b n Γ(n)/Γ(2n) G M/R e for the Sérsic model (Ciotti 1991), where b(n) follows the relation given in Ciotti & Bertin (1999). Model 6 then has v esc = 20.9 km s −1 and 13.1 km s −1 for the Bright and Faint components, respectively. Summing these 2 terms quadratically leads to a global escape velocity of v esc = 24.7 km s −1 for model 6. This escape velocity is lower than the 50 km s −1 assumed by Holley-Bockelmann et al., which should lead to an even much higher fraction of escaping IMBHs of mass 1000 M . And our 95% confidence upper limits on the IMBH mass (without a CUO) are lower, again implying that a yet higher fraction of IMBHs escape their GC. A simpler model by Holley-Bockelmann et al. indicates that 25 IMBH+BH mergers should lead to 85% of 500 M IMBHs escaping their GC of escape velocity 17 km s −1 . This weakens our confidence in the central IMBH scenario for NGC 6397.

Cluster of unresolved objects
Our preference for the CUO model agrees with the suggestions by Zocchi et al. (2019) and Mann et al. (2019) that such a CUO can mimic an IMBH. But we go one step further than these authors, by fitting the mass and scale radius of the CUO.
The scale radius of the CUO in NGC 6397 is consistent with the SD profile of NGC 6397. Indeed, as seen in the right panel of Figure 10, the SD profile suggests a separation between two populations at R = 10 , consistent with the CUO effective radius of a 2 . 5 to 5 . We now discuss the nature of the sub-cluster of unresolved objects that we found in the center of NGC 6397. The much smaller scale radius for the CUO indicates that the objects must be more massive than the stars that we studied, i.e. more massive than m bright = 0.77 M (Sect. 7.2.2). Indeed, such unresolved massive objects would sink to the center by dynamical friction Chandrasekhar (1943). The core-collapsed state of NGC 6397 is a strong argument in favor of a population of heavier stars in the cluster's inner regions, as it suggests a short two-body relaxation time, hence mass segregation. These heavier objects constituting the CUO could be white dwarfs, neutron stars, stellar black holes, or unresolved binaries.
We compared the radial distribution from our cleaned merged sample and from X-ray binaries from Bahramian et al. (2020), filtering only stars with probability of being an AGN smaller then 0.5. Figure 17 shows the cumulative distribution function (CDF) of these two datasets in the range of 2.7 − 100 , where our HST data was complete. It is clear that these two populations do not follow the same radial distribution (we find a KS p-value of 8.7 × 10 −5 ). Furthermore, the bulk of the X-ray binaries seems to be located within 6 − 50 arcsec ( Figure 17). This is consistent with the CUO effective radius of 2.5 to 5 ( Table 3).
Which of among white dwarfs, neutron stars, BHs and massive binaries dominates the mass of the CUO? We can first discard unresolved binaries of Main Sequence stars. Indeed, if they are unobserved, their total magnitude must be fainter than F606W = 22, which is the rough magnitude limit of our HST sample (Gaia is not relevant given the very low effective radius of the CUO). There is no way that their total mass can exceed that of the bulk of our sample. We could then have unresolved binaries of a Main Sequence star with a compact star (white dwarf or neutron star) or possibly a BH. But the Main Sequence star will have a mass of m faint = 0.25 M (Sect. 7.2.2), i.e. at least 3 times lower than our cut at m bright = 0.77 M and at least 20 times lower than that of a BH. Thus the Main Sequence mass can be neglected and we are left with the compact star or BH.
We can compare the total mass in white dwarfs, neutron stars and BHs, by integrating over the zero-age (stellar mass function of the) Main Sequence (ZAMS): where n(m) is the ZAMS. We adopted the initial -final (remnant) mass relations from equations (4) . We conservatively assumed a mass gap between 2.15 and 5 M from the lack of LIGO detections in this mass range. We also considered a maximum stellar BH mass above which pair-instability supernovae fully explode the progenitor star leaving no remnant, adopting M BH,max = 45 M (Farmer et al. 2019) or 52 M (Woosley 2017). We ignored the formation of very massive BHs above the pair-instability gap, i.e. M BH > 133 M (Woosley 2017). Summing the masses of each component by integrating Eq. (29) over the ZAMS mass function (i.e. the IMF, which in this mass range always has the Salpeter 1955 slope of -2.3) leads to BHs accounting for ≈ 80% of the CUO, with only 13 to 16% from white dwarfs and 3 to 4% from neutron stars. These fractions vary little with the maximum allowed BH mass: with 81% of the CUO mass in BHs with M BH < 45 M (Farmer et al. 2019) or 84% with M BH < 51 M (Woosley 2017). These fractions are insensitive to the metallicity of NGC 6397 from Z = 0.00013 to 0.0004 (i.e. from 1% to 3% solar, consistent with the metallicities derived for NGC 6397 by Marín-Franch et al. 2009 andJain et al. 2020, respectively).
However, we must take this CUO mass dominance by BHs with caution, because BHs may merge, losing of order of 5% of their mass to gravitational waves (e.g., Abbott et al. 2016), leading to kicks, some of which are strong enough to drive them out of the GC (Sect. 10.5). Still, one part of the BH population may merge and end up escaping the GC, while another part has not merged, but would nevertheless be located at low radii thanks to orbital decay by dynamical friction. BHs could contribute to half the CUO mass, if the maximum surviving BH mass is 30 M according to our integrations of Eq. (29). The more massive BHs would sink faster to the center by dynamical friction and preferentially merge, leaving these lower mass ones. This discussion is a simplification because orbital decay by dynamical friction is stochastic, and one needs to test this with N-body simulations.
A more robust conclusion of our integration of Eq. (29) is that white dwarfs always dominate the neutron stars, by a factor of ≈ 4, regardless of the possible dominance of BHs in the mass of the CUO component. Moreover, neutron stars can also merge together (or with black holes), and such an event has been detected by LIGO (Abbott et al. 2017). Presumably, the gravitational waves emitted will be much weaker and the lower momentum vector of the waves should lead to smaller kicks on the remnant (despite the lower masses of neutron stars compared to BHs).
In summary, the CUO mass should be dominated by BHs, but must also contain white dwarfs, contributing 4 times more mass to the CUO than neutron stars.

Final thoughts
The future of GC and IMBH science is very exciting, thanks to Gaia now supplementing HST proper motions at large projected radii. Our discovery of a diffuse central mass, composed in large part of stellar mass black holes, enrichens the physics of the inner regions of GCs, and renders the search of IMBHs in Milky Way GCs even more delicate. The third data release of the Gaia mission will enable more accurate mass-orbit modeling, not only of nearby GCs such as NGC 6397, but also of more distant ones, with sufficient accurate PMs for ambitious massorbit modeling as done here. Continued pointings of GCs with HST and soon James Webb Space Telescope will lead to longer baselines and more accurate PMs. Future gravitational wave missions such as the Laser Interferometer Space Antenna (LISA) will probe IMBHs as well as stellar BHs, including in our Milky Way (Sesana et al. 2020). The physics of the inner parts of globular clusters with possible intermediate-mass black holes as well as sub-clusters of stellar-mass black holes is truly enticing.
We downloaded Gaia DR2 PM data in four regions around NGC 6397 (and also for two other GCs, M4 and NGC 6752, in order to check for generality). These four regions were chosen by doing a cone search, with a 30 aperture, for positions 5 degrees distant from the GC centers (α GC , δ GC ), North, South, East, and West. We applied the quality flags that we had applied to the Gaia data (Sect. 5.2.1), i.e. the inequalities of Eqs. (12), (16) and (17).
We then fitted the distribution of PM moduli using both a Gaussian and the form of Eq. (18). We estimated the mean µ α, * and mean µ δ for both distribution functions, a dispersion σ for the Gaussian assumption and for Eq. (18), we estimated the scale radius and outer slope. Finally, we took into account the convolution of both distributions with Gaussian errors. Figure B.1 shows the distribution of PM moduli, for the four regions NGC 6397. One can easily verify that the new expression fits extremely better than a Gaussian in Figure B.1. Moreover, the calculated kurtosis excess of both µ α, * and µ δ always gave huge values (from 16 to over 400, compared to 0 expected for a Gaussian), which clearly implies a non-Gaussian behavior.
To check for robustness of our fits, we also verified that whenever limiting the fitting range to exclude the wider wings of the FS distribution (which cannot be done when considering a GC relatively separated from the FS in PM space), a Gaussian distribution is well fitted. Thus, we decided to employ Eq. (18) throughout our study in order to account for the entire PM range.

Appendix B.2: Convolution of field stars distribution
For the GC component, the local Gaussian-like velocity distribution function (VDF) must be convolved by the Gaussian LOS and POS velocity errors, as in Eq. (7). Similarly, since the field star LOS VDF is a Gaussian, the LOS velocity errors are added in quadrature to the LOS velocity dispersion.
On the other hand, since the interloper PM distribution function (eq. [18]) is not Gaussian-like shaped, we need to perform the integral of its convolution. Calling R and R o the respective true and observed PMs and the PM error, it is straightforward to follow the recipe of Binney & Mamon (1982, appendix C) for the convolution of two-dimensional data with circular symmetry: [19]) for the four 5 degrees distant regions around NGC 6397 (for simplicity, we call them SOUTH, EAST, WEST and NORTH), represented by red crosses. Solid green curves display the best MLE fit for a Gaussian distribution, while dashed green curves display the best Gaussian MLE fit when only considering the regions with proper motions inside the limit of the dashed blue vertical line. The best MLE fit using Eq. (18) is displayed as the black curves.