Long-term X-ray spectral evolution of Ultraluminous X-ray sources: implications on the accretion flow geometry and the nature of the accretor

The discovery of pulsations in several Ultraluminous X-ray sources (ULXs) demonstrated that a fraction of ULXs are powered by super-Eddington accretion onto neutron stars (NSs). This opened the debate as to what is the NS to black hole (BH) ratio within the ULX population and what physical mechanism allows ULXs to reach luminosities well in excess of their Eddington luminosity: strong magnetic fields or rather strong outflows that collimate the emission towards the observer. To distinguish between supercritically accreting BHs, weakly or strongly magnetised NSs, we study the long-term X-ray spectral evolution of a sample of 17 ULXs, 6 of which are known to host NSs. We combine archival data from \chandra, \xmm\ and \nustar\ observatories to sample a wide range of spectral states for each source and track each source's evolution in a hardness-luminosity diagram (HLD). We find NS-ULXs to be among the hardest sources in our sample with highly variable high-energy emission. On this basis, we identify M81 X-6 as a strong NS-ULX candidate, whose variability is shown to be akin to that seen in NGC 1313 X-2. Most softer sources with unknown accretor show the presence of three markedly different spectral states that we interpret invoking changes in the mass-accretion rate and obscuration by the supercritical wind/funnel structure. Finally, we report on a lack of variability at high-energies ($\gtrsim$ 10 keV) in NGC 1313 X-1 and Holmberg IX X-1, which we argue may offer means to differentiate BH from NS-ULXs. We argue that the hardest sources in our sample might harbour strongly magnetised NSs, while softer sources may be explained by weakly magnetised NSs or BHs, in which the presence of outflows naturally explains their softer spectra through Compton down-scattering, their spectral transitions and the dilution of the pulsed-emission, should some of these sources contain NSs.


Introduction
Ultraluminous X-ray sources are defined as extragalactic offnuclear point-like sources with an X-ray luminosity in excess of ∼ 10 39 erg/s (see Kaaret et al. 2017, for a review), albeit there is now evidence for a Galactic ULX (Wilson-Hodge et al. 2018). Their nature remains for the most part unknown, and given this empirical definition they likely constitute a heterogeneous population of objects. It was initially proposed that ULXs could be powered by accreting intermediate-mass black holes (IMBH:∼ 100 -10 5 M ; see Mezcua 2017, for a review) in the sub-Eddington regime (Colbert & Mushotzky 1999;Matsumoto et al. 2001). The best ULX IMBH candidates may be those at the high-end of the High-Mass X-ray binary (HMXB) luminosity distribution (e.g. Mineo et al. 2012), reaching L X ≥ 10 41 erg/s in some cases. Some of these objects show transitions or temporal properties that seem to be consistent with the expectation of a scaled-up version of Galactic Black Hole Binaries (GBHBs) (e.g. Farrell et al. 2009;Godet et al. 2009;Pasham et al. 2014) and/or show evidence for cooler accretion disks (e.g. Feng & Kaaret 2010;Servillat et al. 2011;Godet et al. 2012;Lin et al. 2020) which suggest masses in the IMBH regime.
However, it was soon realised that the bulk of the ULX population (10 39 < L X < 10 41 erg/s) did not comply with the canonical states seen in GBHBs (e.g. Stobbart et al. 2006;Gladstone et al. 2009;Grisé et al. 2010). It was thus suggested that stellar-mass black holes fed at super-Eddington mass-transfer rates could power these sources (Shakura & Sunyaev 1973;King & Pounds 2003;Poutanen et al. 2007), with geometrical beaming possibly further enhancing the observed luminosity (King et al. 2001). In this scenario, the intense radiation pressure blows-off some of the excess gas in the form of a wind or outflow, which can become optically thick to the hard radiation emitted in the inner parts of the disk (Poutanen et al. 2007). The wind is expected to form a conic structure around the rotational axis of the compact object due to angular momentum conservation, producing highly anisotropic emission. It was thus proposed that the shortterm variability and the differences in spectral shape in a broad ULX sample could be understood in terms of different viewing angles and mass-accretion rates (Sutton et al. 2013 et al. 2015a) with those ULXs with L x ∼ 10 39 erg/s showing a behaviour more consistent with stellar-mass black holes accreting close to or at their Eddington limit (e.g. Middleton et al. 2011). These studies were later extended to include ultraluminous supersoft sources (ULSs) (Urquhart & Soria 2016), which are interpreted as sources in which either the inclination of the system and/or the accretion rate are so high that the inner disk emission is completely reprocessed by the optically thick wind, with some sources possibly showing transitions between a ULXlike spectrum to a ULS state . Indeed, numerical simulations of super-Eddington accretion onto black holes have shown extensively that the viewing angle has a strong impact on the observed spectrum (e.g. Ohsuga & Mineshige 2007;Sądowski et al. 2015;Ogawa et al. 2017). For this reason, it has been suggested that overall ULXs belong to a homogeneous class of objects, including stellar-mass black holes or lowly magnetised neutron stars, being fed at super-Eddington mass-transfer rates (King & Lasota 2016;King et al. 2017;Pinto et al. 2020a). Observational evidence supporting winds associated with super-Eddington accretion comes from high-resolution X-ray spectroscopy studies, that revealed absorption and emission features suggesting the presence of strong ionised winds colliding with the circumstellar medium (Pinto et al. 2016(Pinto et al. , 2017(Pinto et al. , 2020b. However, while it was generally accepted that accreting stellar-mass black holes were the engines behind ULXs below 10 41 erg/s (e.g. Poutanen et al. 2007;Gladstone et al. 2009;Pintore et al. 2014), the discovery of X-ray pulsations in one ULX (Bachetti et al. 2014) showed that NS can also attain super-Eddington luminosities. Motivated by the discovery of (now) 5 more pulsating ULXs (PULXs) (Fürst et al. 2016;Israel et al. 2017b,a;Carpano et al. 2018;Sathyaprakash et al. 2019;Rodríguez-Castillo et al. 2020) and the possible confirmation of another NS-ULX through the identification of a cyclotron line (Brightman et al. 2018), several authors have highlighted the remarkable X-ray spectral similarity of the ULX sample with those with detected pulsations (Pintore et al. 2017;Koliopanos et al. 2017;Walton et al. 2018c), suggesting that NS-ULX may dominate the ULX population. Yet, there is still some debate as to what is the driving mechanism responsible for their extreme luminosities. Pintore et al. (2017) suggested that the accretion column could be responsible for the hard emission in ULXs, as their Xray emission could be described with a model commonly used to fit galactic X-ray pulsars. Instead, based on the theoretical calculations carried out by Mushtukov et al. (2017), Koliopanos et al. (2017) argued that the ULX spectra are consistent with accreting highly-magnetised NS (B>10 12 G), as the 0.3 -10 keV band can be described by two thermal components. In this scenario, the soft thermal component arises from a truncated accretion disk at the magnetospheric radius (Ghosh & Lamb 1978) while the hard emission is produced in an optically thick envelope as the accreting material is forced to follow the magnetic field lines, creating a closed envelope around the NS. The rotation of the envelope, due to the coupling with the NS through the magnetic field lines, and its latitudinal temperature gradient could explain the sinusoidal profiles seen in PULXs (e.g Fürst et al. 2018), provided that the NS rotational axis is misaligned with the magnetic field axis. Alternatively, Walton et al. (2018c) following the arguments from King & Lasota (2016); King et al. (2017) showed that the interplay between the mass-accretion rate and magnetic field strength could explain the lack/presence of pulsations in ULXs, suggesting that ULXs might instead be powered by weakly magnetised neutron stars. This in turn may explain the different spectral shape of those ULXs for which broadband spectroscopy data is available, compared to the PULXs.
However, the long-term evolution of PULXs is marked by high levels of variability of 2 orders of magnitude or more in flux (Israel et al. 2017a,b) while other ULXs show variations of a factor 5 or less throughout year-time scales (Grisé et al. 2013;Luangtip et al. 2016;Pinto et al. 2020b) with PULXs being also somewhat harder than the rest of the population (Pintore et al. 2017). Whether these represent differences in the nature of the accretor or are due to a disfavourable viewing angle (e.g. King & Lasota 2020) is still not fully understood (e.g. Walton et al. 2018c). If beaming is a natural consequence of super-Eddington mass transfer (King 2009), theoretical studies show that the fraction of observed BH-ULX could be higher than NS-ULX (Middleton & King 2017). Similarly, binary population synthesis studies suggest that BH-ULX could even dominate the observed ULX population (Wiktorowicz et al. 2019), as NSs need stronger beaming factors to reach L X > 10 39 erg/s. However, the evidence for supercritically accreting stellar-mass black holes remains elusive. The best observational evidence for such BH-ULX systems might be M101 ULX-1, which was estimated to host a ∼ 20 M BH, although its dynamical mass determination was challenged by Laycock et al. (2015). Constraining the BH to NS ratio in ULXs can lead to important clues about the formation path leading to such systems and their connection with HMXBs (e.g. Mineo et al. 2012) and the formation of BH-BH and BH-NS systems.
Thus, while the spectral resemblance of the ULX population is undeniable, tracing their long-term evolution can be the key in understanding the geometry of the accretion flow and discriminating between super-Eddington accretion models and the nature of the accretor. Therefore, in this paper, we perform a comprehensive study of the long-term spectral evolution of a representative sample of 17 ULXs (including 6 NS-ULXs) in the 10 39 < L x < 10 41 erg/s range, in an effort to gain insights into the accretion flow geometry as well as the nature of the accretor. More specifically, we focus on studying each source spectral evolution in an attempt to assess which scenario best describes the variability observed: super-Eddington accretion onto a highly magnetised NS, a weakly magnetised NS or a black hole. This in turn gives clues about the nature of the accretor and by comparing its evolution with the known PULXs, we can identify potential strong NS-ULX candidates. To do so, we build a phenomenological model taking into account the new insights gained on the ULX broadband emission with NuSTAR (Bachetti et al. 2014;Walton et al. 2015;Mukherjee et al. 2015;Walton et al. 2020) and track their spectral changes in a hardness-luminosity diagram.
We describe the sample of sources selected for this work and the data reduction in Section 2. In Section 3 we describe the data analysis and results. We discuss our results in Section 4 and present our conclusions in Section 5. stringent with the data constraints on the PULXs, since they are the only sources for which the nature of the accretor is known and thus they will be crucial for our study. From the current PULXs sample (NGC 7793 P13, NGC 5907 ULX1, NGC 300 ULX1, M51 ULX-7, NGC 1313 X-2 and M82 X-2), we discarded M82 X-2 as this source is only resolved by Chandra and its spectra are frequently affected by pile-up. Its emission is also often contaminated with nearby sources and diffuse emission of unknown origin (Brightman et al. 2016), precluding a clean detailed spectral analysis of the source. Another source of interest is M51 ULX-8, which was identified as harbouring an accreting NS through the detection of a cyclotron resonance feature (Brightman et al. 2018), although no pulsations have been reported to date. We therefore included this source in our sample to further investigate its spectral properties with respect to the PULXs.
For those ULXs for which the accretor is unknown, we included at least two sources from each of the different spectral regimes proposed by Sutton et al. (2013) in order to have a representative characterisation of the ULX population. We also made sure to include those showing evidence for super-Eddington outflows with 3σ detections (Pinto et al. 2016(Pinto et al. , 2017. Thus, we also included NGC 55 ULX1, even if it does not meet our criterion of having more than three epochs. The final sample selected for this study is presented in Table 1. Finally, note that the luminosities reported for NGC 5907 ULX1 in this work are subject to an additional source of uncertainty, as there is a large discrepancy in the distance measurements to the host galaxy. The distance measurements range from ∼17  to ∼ 12.9 Mpc (Crook et al. 2007) which can boost the inferred luminosity by a factor of ∼ 1.7. Here we adopted the most recent estimate of 17.06 by ).

Data reduction
XMM-Newton data reduction was carried out using SAS version 17.0.0. We produced calibrated event files from EPIC-pn (Strüder et al. 2001) and MOS (Turner et al. 2001) cameras with the latest calibration files as of March 2018 using the tasks epproc and emproc (version 2.24.1), respectively. We selected events from patterns 0 to 4 for pn and patterns ≤ 12 for the MOS cameras. The standard filters were used to remove pixels flagged as bad and those close to the CCD gaps. We created high-energy (10 − 12 keV) lightcurves from single pattern events from the full field of view to assess the presence of high-background particle flaring periods that could contaminate our spectra. We filtered these periods by removing times where the count-rate was above a certain threshold by visually inspecting the lightcurves. These thresholds varied for each observation, and ranged from ∼ 0.3 cts s −1 to 1.2 cts s −1 and from ∼ 0.2 cts s −1 to 0.5 cts s −1 for the pn and MOS respectively.
Generally, we extracted source events from circular regions with a radius of 40" and 30" for pn and MOS respectively owing to their different angular resolution, rejecting observations in which the source fell on a chip gap. We reduced the source regions to avoid contamination from nearby sources, chip-gaps, or in cases where the source was faint in order to increase the S/N, but always ensuring that at lest 50% of the PSF was enclosed. We used elliptical regions in cases where the source was placed at large off-axis angles, resulting in a distorted PSF. This was the case, for example, for the pn observations 0657801601 and 0657802001 of Holmberg IX X-1, observations 0112521001, 0112521101, 0657801801, 0657802001, 0657802201, 0693850801, 0693850901, 0693851001, 0693851101 of M81 X-6 and observation 0656580601 for Circinus ULX 5. The background region was selected from a larger circular source-free region and on the same chip as the source when possible. For the pn we also tried to select the background region from a distance to the readout node as close as possible as for the source region. Finally, for M51 ULX-7 we used regions of ∼20" and ∼ 25" for pn and MOS detectors respectively, to reduce contamination from the diffuse emission the source is immersed in (Rodríguez-Castillo et al. 2020). We discuss possible contamination by the diffuse emission and its treatment in Section 3.1.1.
We noted also that Holmberg IX X-1 and Holmberg II X-1 were bright enough in some occasions to cause pile-up in the XMM-Newton detectors. We assessed the importance of pile-up using the tool epatplot (version 1.22) when the source count rate was above the recommended values 1 . In order to mitigate its effects, we excised the inner core of the PSF by using an annular extraction region. The inner excised radius was never below 10.25" and 2.75" for the pn and the MOS cameras respectively, to avoid introducing inaccuracies in the flux estimation, as recommended 2 .
We finally used the tasks rmfgen (version 2.8.1) and arfgen (version 1.98.3) to generate redistribution matrices and auxiliary response files, respectively. We regrouped our spectra to have a minimum of 20 counts per bin to allow the use of χ 2 minimisation and also avoiding oversampling the instrumental resolution by setting a minimum channel width of 1/3 of the FWHM energy resolution.
We reprocessed the Chandra data using the script chan-dra_repro with calibration files from CALDB 4.8.2. We used extraction regions given by the tool wavdetect to extract source events. Background regions were selected from roughly 3 times larger, circular, nearby, source-free regions. The level of pileup was assessed by inspecting the images created using the pileup_map 3 tool. We rejected observations with a pile-up fraction 5%. We only considered observations that registered ≥ 1000 counts, as we found that below this threshold data were of too poor quality to robustly discriminate between different models. All data were also rebinned to a minimum of 20 counts per bin.
NuSTAR data was processed using the NuSTAR Data Analysis Software version 1.8.0 with CALDB version 1.0.2. We extracted source and background spectra using nuproducts with the standard filters. Source events were selected from a circular region of ∼ 60". The only exception was NGC 1313 X-1, for which we follow Walton et al. (2016b) and chose a region of 40 -50" to reduce contamination from a nearby source (see Bachetti et al. 2013). Background regions were selected from larger circular source-free regions and on the same chip as the source but as far away as possible to avoid contamination from the source itself. We regrouped the NuSTAR spectra to 40 counts per bin, owing to the lower energy resolution compared to the EPIC cameras. For this work, we only considered NuSTAR observations for which simultaneous soft X-ray coverage with XMM-Newton was available. A summary of all observations considered can be found in Table 1.

Data analysis and results
We used the X-ray spectral fitting package XSPEC (Arnaud 1996) version 12.10.1f for spectral fitting and quote uncertainties on spectral parameters at the 90% confidence level for a single parameter of interest, unless stated otherwise. All fluxes were estimated using the pseudo-model cflux in XSPEC.
In general, we fitted EPIC data and the ACIS data in the 0.3 -10 keV range. For the sources with the highest absorption columns (NGC 5907 ULX1, IC 342 X-1 and Circinus ULX5), we inspected the epatplot to choose the most suitable energy range to perform spectral fitting on the EPIC data. For IC 342 X-1 and Circinus ULX5, we noticed strong deviations from the observed pattern distributions with respect to the epatplot model below ∼ 0.4 keV. This is to be expected since the low-energy part of these absorbed spectra is likely dominated by the charge redistribution tail (and possibly noise). We therefore restricted the lower energy range to 0.4 keV for these two sources. For NuS-TAR, we typically considered the 3 -35 keV range, although to avoid including bins with negative number of counts in the χ 2 minimisation we restricted the high-end of the energy range to those bins where the net number of counts was positive.
When simultaneously fitting spectra from different instruments from the same epoch, we attempted to compensate for calibration uncertainties by introducing a multiplicative crossnormalisation factor that was allowed to vary between the different instruments. This factor was frozen to unity for the pn (or the two MOS detectors if no pn data was available). We used the same factor for the MOS detectors as we found them to be generally in good agreement, while FPMA and FPMB had each their own separate constant, as recommended 4 . The agreement between the pn and the MOS cameras was usually within errors, with a few cases in M81 X-6 where the disagreement reached up to 10%, due to the highly elliptical distorted PSF because of the off-axis position of the source on the detectors. The value of the cross-normalisation factor between EPIC data and NuSTAR detectors was typically in agreement within the errors, reaching in some cases a 5-20% disagreement, with the largest values found when the source was highly off-axis on the NuSTAR detectors.
Finally, we fitted together i.e. we assumed the same spectral model for two or more datasets if they were close in time (∼ few days) and, if after inspecting each observation separately, we found no significant variation in flux and spectral parameters. This is also noted in Table 1 where we quote the different datasets that have been fitted together based on the above. Throughout this work, we use the word epoch to refer to all datasets that have been fitted together assuming the same model.

Choice of model
Our first aim was to characterise the long-term spectral evolution of our sample in a simple and coherent manner, by studying variations of the spectral components (i.e. luminosity, radius, temperature, etc). As the latest studies have revealed that the 0.3 -10 keV band can be modelled by two thermal components (e.g. Mukherjee et al. 2015;Koliopanos et al. 2017Koliopanos et al. , 2019Walton et al. 2020) that can reproduce the curvature seen at highenergies, we first considered a phenomenological model based on two multi-colour blackbody disks (diskbb in XSPEC; Mitsuda et al. 1984) to fit the data in this band, taking into account interstellar absorption by neutral hydrogen with two absorbing components tbabs in XSPEC. One was frozen at the Galactic value along the source line of sight (see Table 1), and the other one was left free to vary to take into account possible absorption from the host galaxy and the system itself. We adopted abundances given by Wilms et al. (2000) and cross-sections given by Verner et al. (1996) as recommended.
While other models have been commonly adopted to reproduce the hard emission (∼ 2 keV -10 keV) in ULXs, like Comptonisation of soft disk photons in a warm optically thick corona or a more phenomenological power-law, we preferred the diskbb over these for various reasons. First of all, a warm optically thick corona up-scattering photons from the disk is merely a proxy to reproduce the curvature seen at high-energies, as its physical interpretation is subject to several caveats (see Koliopanos et al. 2017, and references therein). Additionally, broadband spectroscopic studies have shown how this model fails to reproduce the spectral shape of ULXs Mukherjee et al. 2015). Therefore, for the purpose of reproducing the spectral curvature in (at least) the XMM-Newton band, the diskbb has been shown to be equally valid with only two parameters. The power-law (or a power-law with a cutoff for the same matter) unphysically diverges towards low-energies, thus taking up flux from the soft component and causing the absorption column to be overestimated.
Another advantage of the dual thermal component is that thanks to its simplicity, it can be used as a proxy to represent more complex models. For instance, in the context of super-Eddington accretion, the soft black body has been frequently associated with an outflow, while the hard one has been associated with emission from the inner parts of the accretion flow (e.g. Walton et al. 2014). Alternatively, Koliopanos et al. (2017) associated the soft component with an accretion disk truncated at the magnetospheric radius, and the hard component to the emission from the magnetospheric envelope (Mushtukov et al. 2017). The diskbb also allows to test easily theoretical predictions by studying the evolution of its temperature with its luminosity as we show in Sections 3.4 and 3.5.
However, visual inspection of the fit residuals revealed strong residuals at high energies in the 0.3 -10 keV band in some epochs, indicating that our phenomenological model is not able to reproduce the emission at high energies. We show this for two high quality observations of Holmberg II X-1 and NGC 5408 X-1 in Fig 1. This is perhaps not surprising, as broadband studies using NuSTAR data have revealed the presence of a faint hard power-law like excess dominating above ∼ 10 keV (e.g. Mukherjee et al. 2015;Walton et al. 2015Walton et al. , 2017. We found that for those epochs for which we had broadband coverage with NuSTAR, this excess can be well modelled as Comptonisation of the hard/hot thermal component as noted by previous studies. Since the nature of this Comptonisation component is still poorly understood ) and given the lack of broadband coverage for most of the observations considered here, we decided to use the simpl model (Steiner et al. 2009) to reproduce it, as it does not assume any geometry but simply scatters a fraction of photons ( f scat ) of the seed component towards high energies emulating a power-law component with a certain Γ, with the advantage that it does not diverge towards low energies.
In order to identify those epochs for which the simpl model component is required, we would ideally rely on Monte-Carlo simulations. However, given the large number of datasets considered here this is not feasible. Instead, to have an estimate as to when the data quality allows to constrain this component we performed an F-test between the dual-thermal Unfolded pn (blue), MOS1(red) and MO2 (green) spectra fitted with an absorbed dual thermal component (tbabs⊗tbabs⊗(diskbb+ diskbb) on left observation 0200470101 of Holmberg II X-1 (χ 2 r ∼ 1.34) and right observations 0723130301 and 0723130401 of NGC 5408 X-1 (χ 2 r ∼ 1.5), that we fit together given the lack of variability (see Section 3). Data has been visually rebinned to have at least 3 σ significance and a minimum of 35 counts per bin. Some clear residuals are seen at high energies indicating that the model is inadequate. model described above and the model tbabs⊗tbabs⊗(diskbb + simpl⊗diskbb) and decided to include the simpl model only if the probability of rejecting (P rej ) the simpler model was ∼ 3 σ. We also included the simpl model for observations with P rej slightly below this value if we were able to constrain its parameters using near-in-time observations with P rej > 3 σ (see below). Similarly, we excluded the simpl component in some cases with P rej > 3 σ if its parameters are largely unconstrained. We are aware that this is not an appropriate use of the F-test, however, the presence of this component was already shown to be significant by Walton et al. (2018c) and indeed we found that this component was required to fit the highest quality datasets. We are not claiming whether this component is present or not, but instead we used these values to have a reference as to when the data quality allows to constrain this component. We present the result of this F-test in Table 7 where we also indicate for which observations we finally included this component.
For NGC 7793 P13 we found that this component is largely unconstrained even in the broadband datasets, resulting in no fit improvement when adding this component. The broadband datasets yield fits with χ 2 r ∼ 1.1 with the dual-thermal model alone, which are statistically acceptable, so we decided not to add this component.
Furthermore, as the simpl model is mostly responsible for the emission above ∼ 8 keV in our modelling, its parameters are not well constrained when Chandra or XMM-Newton data are considered alone, and only in long XMM-Newton exposures with > 40000 counts in pn we can constrain both f scat and Γ simultaneously. We also found good consistency between the Γ-values in several epochs and that most sources are found recurrently at similar fluxes and hardness ratios (see Section 4). We thus tied Γ across epochs where the source is found at similar flux and hardness ratio, while leaving f scat and the rest of the parameters free to vary. By doing so we were able to use the broadband information provided by NuSTAR to have some constraints on those epochs where this information is not available. For NGC 1313 X-1, after checking the consistency of Γ across epochs and bearing in mind that its variability above 10 keV has been shown to be small , we tied Γ for all observations where the simpl model is used. For NGC 5408 X-1 and NGC 6946 X-1, we also found very little variability at high-energies and thus we again tied Γ across all observations where we added the simpl model. Finally, for NGC 1313 X-2, we also found that the same Γ-value can be tied for the four epochs for which we included the simpl model, albeit in this case the source shows certain variability at high energies (see Section 3.3).
For M81 X-6, M51 ULX-7, M51 ULX-8 and M83 ULX1, given the lack of broadband coverage and that the dual-thermal model gave an acceptable fit to the data, we found no need to try to improve the fit by adding the simpl model and thus we did not explore this option. As stated in Section 2, the XMM-Newton observations of M51 ULX-7 might be contaminated by some diffuse emission. To quantify the possible contamination from this diffuse emission, we extracted two MOS1 and MOS2 spectra (from obs id 0824450901) from two circular (25" in radius) regions located 34" from the source in two different directions. We fitted both emission spectra with a powerlaw and a mekal model subject to Galactic absorption only. The spectral parameters of both spectra are consistent within errors, with Γ = 2.9 +0.4 −0.6 and 2.8±0.3 and plasma temperature = 0.28±0.03 keV and 0.27 +0.04 −0.03 keV, for region 1 and 2 respectively. We thus ruled out strong spatial variations and computed the luminosity of region 1. We found a total unabsorbed luminosity of (4.9 +0.7 −0.3 ) × 10 38 erg/s in the 0.3 -10 keV band, with the soft band (0.3 -1.5 keV) being ∼ 5 times brighter than the hard band. We thus estimated that the diffuse emission can make the source appear ∼ 25% softer in the XMM-Newton observations, given that M51 ULX-7 has a typical luminosity of 4 × 10 39 erg/s. We therefore added this diffuse emission model as a fixed additive component to the M51 ULX-7 continuum in all XMM-Newton observations and decided to discard obsids 0677980701, 0677980801 and 0830191401 when the source was at its lowest.
Finally, some sources showed strong residuals at soft energies that have been associated with unresolved emission and absorption lines produced in an outflow colliding with the circumstellar gas (Pinto et al. 2016(Pinto et al. , 2017(Pinto et al. , 2020b. In most cases, we ignored them, as we are interested in the continuum emission and this will not affect our results. There are two exceptions to this: for NGC 5408 X-1, including a Gaussian emission line at ∼ 1 keV gave a ∆χ 2 improvement that ranges from ∼ 50 to 122 (for 3 extra degrees of freedom) depending on the observation. We thus decided to also determine the parameters of this Gaussian in the joint fit of the high-quality datasets, by tying together all its parameter. We obtained E = 0.947±0.005 keV, σ = 78±6 eV and normalisation = 2.7 +0.3 −0.2 × 10 −5 photons/cm 2 /s. For Article number, page 5 of 44 A&A proofs: manuscript no. aanda NGC 5204 X-1, we also included a Gaussian emission line for the Chandra obsid 3933 following Roberts et al. (2006) as we again noted similar strong residuals. In this case we obtained E = 0.97±0.02 keV, σ = 56 +24 −21 eV and normalisation = 2±1 × 10 −5 photons/cm 2 /s, consistent with the values reported by Roberts et al. (2006).

Treatment of the absorption column
There is still no consensus as to whether the local absorption column in ULXs is variable (e.g. Kajava & Poutanen 2009) or not (e.g. Miller et al. 2013). The main argument for variability of the absorption column in ULXs is the contribution from outflows (e.g. Kajava & Poutanen 2009;Middleton et al. 2015b), which could imprint stochastic variability in n H due to wind clumps crossing our line of sight (Takeuchi et al. 2013;Middleton et al. 2015a) depending on their column density and ionisation state. Alternatively, Middleton et al. (2015b) argued that some expelled gas could cool down far from the source and contribute to the neutral absorption column. If this was the case, it is then not clear over which timescale n H would react to instantaneous changes in the mass-accretion rate.
To make matters more complicated, Miller et al. (2009) showed by studying absorption edges at high spectral resolution in X-rays that the absorption column remains stable throughout spectral changes in a set of X-ray binaries, and that changes in the soft component must come from changes in the source spectrum and not from the absorption column itself.
In view of these complications, we attempted to study variations of n H by fitting individually all epochs for each source using the dual-thermal component over the 0.3 -8 keV band, so as to use the same model for all epochs and avoid introducing artificial changes in n H due to the different energy ranges considered when using NuSTAR data. We found that in general, the values of the absorption column for a given source were consistent within 3σ errors throughout epochs, which we show in Figure 2 for four sources in our sample. Small discrepancies can be attributed to low data quality (e.g. short exposure Chandra observations and/or calibration uncertainties) rather than real physical-n H changes. For NGC 5408 X-1, when fitting the 0.3 -8 keV band with the dual-thermal model we still saw strong residuals at high-energies, which may indicate that the dualthermal model cannot adequately fit the 0.3 -8 keV band. We thus repeated the study of the n H variations replacing the hard diskbb with a power-law with a high-energy cutoff (cutoffpl in XSPEC) to ensure that the lack of mismodelling of the highenergy emission does not affect the n H variations (or lack of) found before. With this model we again found all n H values consistent at the 3 σ level.
Conversely, we did find evidence for variability in the absorption column of NGC 1313 X-1 and NGC 55 ULX1 when following the same approach, where variations above the 3σ confidence level were seen (see the arrows in Figure 3). However, given the phenomenological nature of our model, we cannot rule out that these discrepancies are due to a change of the underlying continuum, changing therefore the parameter degeneracies.
For the other sources not discussed, we did not find as strong evidence for variability in the absorption column as for the sources discussed above, although in some cases this could be due to poorly constrained model parameters. Thus, in view of the above considerations, we considered that the absorption column can be assumed to be constant for the most part, with the exception of epoch 2004-08-23 of NGC 1313 X-1 and epoch Fig. 2. Evolution of the local absorption column over time for NGC 5408 X-1 (green), NGC 7793 P13 (blue), Holmberg IX X-1 (red) and NGC 1313 X-2 (black) shifted by 0.2 10 22 cm −2 for clarity. All spectra have been fitted with a model consisting of an absorbed dual diskbb model in the 0.3 -8 keV band. Errors are shown at 3σ confidence level. Circles and squares correspond to XMM-Newton and Chandra data respectively. 2010-05-24 of NGC 55 ULX1. In these cases we allowed n H to vary together with the continuum parameters as this is preferred by the fit. We found a ∆χ 2 improvement of ∼ 25 and 15 for NGC 1313 X-1 and NGC 55 ULX1 respectively when leaving n H free, compared to the case where n H is frozen to the average value (see next Section). We discuss this in more detail in Section 4.4.

Spectral fitting approach
Ideally, we would jointly fit all datasets for each source, tying the n H and Γ across certain datasets as explained above. However, such a procedure would be computationally prohibitive. Since the joint fit will mostly be driven by those datasets with higher data quality, in order to reduce the computational burden of this approach we decided to do our spectral fitting in two steps: we first jointly fitted those datasets with better statistics, tying together Γ across those epochs where the source is found at similar flux/hardness ratio and n H across all epochs, while the rest of the parameters are free to vary. We typically consider 3 to 8 datasets for each source that include those observations with the longest XMM-Newton exposures when the three EPIC cameras are operational, those for which simultaneous NuSTAR coverage is available and in some rare cases, the longest Chandra observations. We next fitted individually the remaining observations of lower data quality with n H and Γ frozen at the values found in the joint fit. The results of the joint and the subsequent individual fits are reported in Table 2.
For Circinus ULX5, the joint fit approach resulted in largely unconstrained parameters for the low-energy components at soft energies. This is likely due to a combination of the calibration uncertainties at low energies (see Section 3) and the high absorption column along the line of sight (n HGal = 50.6× 10 20 cm −2 ; HI4PI Collaboration et al. 2016). We thus considered only epoch 2013-02-03 to constrain n H as it offers the best constrains on the broadband emission.

Hardness-luminosity diagram
As a first source classification and in order to highlight the differences and similarities between the sources in our sample, we started by building a hardness ratio luminosity diagram (HLD), similar to that often used for X-ray binaries (Done & Gierliński 2003) and also for ULXs (e.g. Sutton et al. 2013). We did this by computing unabsorbed fluxes rather than counts since given the different instruments employed for this work, relying on count rates is not feasible. Furthermore, fluxes have the advantage that can be corrected for absorption column. To do this, we retrieved the total unabsorbed luminosity in the 0.3 -10 keV band from the tbabs⊗tbabs⊗(diskbb + simpl⊗diskbb or diskbb), depending on the epoch. The hardness ratio is computed as the ratio of unabsorbed fluxes in a soft band (0.3 -1.5 keV) and a hard band (1.5 -10 keV). This is motivated by the fact that the pulsating component in PULX has been shown to dominate at high energies (e.g. Israel et al. 2017a;Walton et al. 2018b) and thus we may expect to highlight the differences between pulsating and non-pulsating sources, while the soft component in ULXs usually stops dominating above ∼ 1 keV.
Indeed, the results presented in Figure 4 and Table 3 show clearly that PULXs are harder than the rest of the sample, with some interesting exceptions. In general, PULXs harden in the 0.3-10 keV band with luminosity with high levels of flux variability. We found that most of the population reaches a maximum luminosity of ∼ 2 × 10 40 erg/s, with just Holmberg IX X-1 and NGC 5907 ULX1 above this value. This is also highlighted in Figure 5, where a drop in sources reaching a maximum luminosity ∼ 2 × 10 40 erg/s is clearly seen.
Lastly, we note that as we have rejected observations below ∼ 1000 counts, transitions to the off states below ∼ 10 39 erg/s like those seen for NGC 5907 ULX-1 (Israel et al.

Spectral transitions
We also present the temporal evolution of each individual source in the HLD in Figure 6, along with the unfolded spectra of some selected epochs. For this, we selected 2 to 4 clearly distinct spectral states based on the HLD, in order to highlight the possible range of spectral variability of each source. When possible, we selected observations for which NuSTAR data is available so the broadband variability can be observed. We caution that given the sparse monitoring offered by XMM-Newton and Chandra in some cases, care must be taken when looking at the source variations in the HLD. Arrows indicate the chronological order but in many cases we cannot guarantee that the source did not evolve differently between the epochs considered here.

Spectral evolution of the soft thermal component
Several authors have attempted to study the nature of the soft component in ULX spectra by investigating its evolution on the luminosity-temperature (L-T) plane (e.g Kajava & Poutanen 2009;Miller et al. 2013). However, these studies have frequently yielded contradictory results and thus there is still no consensus on its true nature. We thus investigated the correlation of the bolometric luminosity of the cool diskbb component with its temperature. We did this by retrieving unabsorbed luminosities of the diskbb component in the 0.01 -100 keV 5 . All fluxes are reported in Table 3. We did not attempt to derive any correlation for NGC 55 ULX1 and NGC 300 ULX1, due to the limited number of observations available for these sources.
We next assessed whether L-T are correlated by running a Spearman correlation test 6 . The results are reported in Table  4. Five of our sources show a strong positive L-T correlation (Spearman correlation ≥ 0.5 and a p-value of 0.05). For these, we fitted a power-law using the python routine odr 7 , that takes into account both errors on x and y variables (see Figure 7 and Table 4 for the results). In order to investigate whether these correlations were driven by the degeneracy between T soft and its normalisation, we derived 99% χ 2 confidence contours around the best-fit T soft and its normalisation for those sources showing a positive L-T correlation. Given the extensive computational time required by the steppar command in XSPEC, we did this only for some selected epochs, ensuring that at least one was from the joint fit. The results are shown in Figure 8 for NGC 1313 X-1, Holmberg IX X-1, Holmberg II X-1 and NGC 5204 X-1, where we also overlaid the best-fit T soft and normalisation from all epochs. In all cases where a positive L-T correlation is observed, including IC 342 X-1 that we did not show for brevity, a strong degeneracy between the T soft and its normalisation is observed, highly correlated with the datapoints from the individual observations, indicating that the L-T correlations are simply due to a degeneracy between these two parameters.
For M81 X-6, while the correlation test could indicate a positive correlation, visual inspection of the data clearly reveals that there is no apparent trend (see Figure 9). We recall that the Spearman's test coefficient does not take into account errors on the parameters and hence these values need to be treated with caution. In fact, the fit with the powerlaw yields a large error on the index α = 2.5 ± 2.0 (90% confidence level), while the normalisation is consistent with 0.

Spectral evolution of the hard thermal component
Similarly, we studied the hard thermal component evolution in the L-T plane. In cases where we have employed the simpl   Table 3 and the results from the L-T correlation in Table 5.
Most of the sources show no clear correlation in the hard thermal component. The only exception is M83 ULX1, which Table 4. Results of the L-T correlation for the soft thermal component. We quote the Spearman's rank correlation coefficient, the false alarm probability (p-value) and the index (α) of the L∝ T α relationship with its 90% confidence level uncertainty.

Source
Spearman p-value α NGC 7793 P13 -0.13 0.73 -a NGC 5907 ULX1 0.57 0.14 -a NGC 5408 X-1 -0.14 0.65 -a Circinus ULX5 0.32 0.5 -a HoIX X-1 0.87 0.0005 1.31 ± 0.14 M81 X-6 0.65 shows a strong positive correlation (Spearman correlation of 0.94 with a p-value = 0.005, see Figure 10). We again computed χ 2 contours around the best-fit temperature and normalisation for Right: Unfolded spectra for selected epochs in which the source has experienced strong changes, following the same colour-code as for the left panels. Selected epochs are indicated in the legend. For the EPIC data, only pn is shown (circles) or MOS1 if pn is not available (as triangles up). In cases where we have used NuSTAR data, FPMA is shown, represented as squares and with the same colour as pn. Chandra data is shown with triangles down. The soft and hard diskbb model components are shown with a dashed line and dotted line respectively, while the total model is shown with a solid line. For epochs where the simpl model has been used see Table 2. Data has been rebinned for clarity.
Article number, page 9 of 44 A&A proofs: manuscript no. aanda  Figure 10. In this case, it seems that the degeneracy between these two parameters is not driving the existing correlation. While NGC 6946 X-1 might seem positively correlated based on the Spearman test alone, examination of the data clearly revealed that there is an overall lack of strong variation of the hard component in most observations (see Figure 11). Furthermore, we noted a certain bias in those observations for which we have not included the simpl model, as the temperature of the hard diskbb appears to be systematically higher, as the diskbb is pushed towards high-energies due to the lack of an additional high-energy component.

Discussion
We have examined the long-term spectral evolution of a sample of sources containing known pulsating and ULXs for which the accretor is unknown, in order to gain insights on their nature and the accretion processes driving their extreme luminosities. While our spectral fitting approach is phenomenological, we have attempted to understand the nature of the soft and the hard components by means of L-T correlations. The positive L-T correlations for the soft component found in Section 3.4 are broadly consistent with previous results reported by Miller et al. (2013), who studied a similar sample of sources and also assumed a constant absorption column, albeit using a different model based on an accretion disk whose photons are Compton-up scattered by an optically thick corona of hot (T e > 2 keV) electrons. Similar correlations were also reported for NGC 5204 X-1 and Holmberg II X-1 by Feng & Kaaret (2009) using an absorbed diskbb and a powerlaw. These correlations have frequently been used to argue in favour of an accretion disk as the nature of the soft component in ULXs. However, our analysis reveals that these are driven by the existing degeneracy (see Figure 8) between the temperature of the soft component and its normalisation and thus these relationships cannot be used reliably to support this scenario. Indeed, it has been shown that depending on the assumption of the underlying model, the correlation can disappear entirely (see e.g. Luangtip et al. 2016;Walton et al. 2020) (see also Gonçalves & Soria (2006) for a discussion on the weaknesses of associating the soft component with an accretion disk). Nevertheless, these correlations might be useful to identify sources evolving in a similar fashion.
Conversely, the positive L-T correlation found for the hard thermal component of M83 ULX1 might have a physical origin as we argue in Section 4.6, given that this correlation is not driven by the existing degeneracy between the parameters (see Figure 10).  Overall, our study suggests that given the limitation from using phenomenological models to describe the ULX continuum, the observed changes in the best-fit spectral parameters may not be the most appropriate way to build a physical picture for our sources. For this reason, most of our discussion will be based on the variability observed in terms of luminosity and hardness ratios (see Figures 4 and 6).
Figures 4 and 5 show that most sources seem to have a maximum luminosity of roughly 2 × 10 40 erg/s. M51 ULX-7 and M82 X-2 (the other PULX not analysed in this work) also reach luminosities of a few × 10 40 erg/s (Brightman et al. 2016. While the sample presented here is rather small in a statistical sense, it has the advantage that we have taken into account the long-term variability of each source, it is also free of contaminants and the absorption column has been carefully estimated. On this basis, our work seems to support larger sample studies (Swartz et al. 2011), where a possible cutoff at around 2 × 10 40 erg/s was observed in the luminosity function of a sample of 107 ULX candidates. Mineo et al. (2012) showed that ULXs seem to extend the X-ray luminosity function of HMXBs up to a possible energy cutoff also at around 2 × 10 40 erg/s, supporting ULXs as being an evolutionary stage of HMXBs. Thus, ULXs might be related to systems where the companion star expands and starts to fill its Roche Lobe as suggested by King & Lasota (2020). As NS accretors are more numerous among HMXBs (e.g. Casares et al. 2017), it is possible that a substantial fraction of ULXs with L x ∼ 10 39−41 erg/s could host NS. However, we stress that given the limited sample studied here, we may be running into small numbers when distinguishing a cutoff distribution from a powerlaw one .
A luminosity cutoff around 2 × 10 40 erg/s would be consistent with the maximum accretion luminosity NSs can attain according to Mushtukov et al. (2015). The authors considered the accretion column model proposed by Basko & Sunyaev (1976) and the reduction of the electron scattering cross-section in the presence of high-magnetic fields, that allows to overcome the Eddington limit. Mushtukov et al. (2015) suggest that NSs could reach ∼ 10 40 erg/s for a reasonable parameter space of magnetic fields and spin periods. Indeed, the hard component (the diskbb or the more complex simpl⊗diskbb when applicable), that we would associate with the accretion column (e.g. Walton et al. 2018a), reaches a maximum luminosity of ∼ 10 40 erg/s in NGC 7793 P13, M51 ULX-7 and NGC 1313 X-2. However, the fact that we find a PULX (NGC 5907 ULX1) also above the break is problematic. The hard component is a factor ∼ 8 above this theoretical value. It is possible that by assuming isotropic emission in our luminosity calculations, we may be overestimating the luminosity of the accretion column. We discuss this in more detail in Section 4.1.
On the other hand, general relativistic radiation magnetohydrodynamic (GRRMHD) numerical simulations of super-Eddington accretion onto black holes also predict saturation of  the maximum luminosity with the mass-accretion rate (Narayan et al. 2017), as their models start to become radiatively inefficient above ∼ 10Ṁ Edd . Their simulations show that the observed luminosity saturates at 2 × 10 40 erg/s for a 10 M black hole viewed close to face-on, which might be supported by our observations. The mass-transfer rate (ṁ 0 ) in ULXs is generally accepted to be super-Eddington, even if the sustainability of such process at such extremes over long timescales (sometimes over several decades) has still to be understood. Models put forward to explain the emission of super-Eddington accretion onto black holes predict that as the mass-transfer rate increases beyond the classi-Article number, page 13 of 44 A&A proofs: manuscript no. aanda cal Eddington limit, a radiatively driven outflow will be launched from within the spherisation radius (R sph ), the radius at which the disk reaches the local Eddington limit (Shakura & Sunyaev 1973;Poutanen et al. 2007). The outflow leaves an optically thin funnel around the rotational axis of the compact object, so that at low inclinations we see a hard spectrum dominated by the inner parts of the disk (Hard ULXs: Sutton et al. 2013). At higher inclinations the outflow becomes optically thick to Thomson scattering (Poutanen et al. 2007) and thus an observer at such inclinations should see a softer and fainter spectrum (soft ULXs: Sutton et al. 2013) as most of the emission will be Compton down-scattered in the wind before reaching the observer (Mid-   Middleton et al. 2015a), but we may see differences in the long-term variability between these two scenarios. The increase in the mass-accretion is also expected to increase the Thomson optical thickness of the gas within the funnel, preventing high-energy photons from escaping to the observer without being scattered (Narayan et al. 2017;Kawashima et al. 2012). It is argued that, in extreme cases, either due to a high-mass accretion rate or/and higher inclination angle, a source may appear as a super-soft ultraluminous source (ULS) (Urquhart & Soria 2016), where most of the emission comes from the wind photosphere.
One key observational property of the presence of the funnellike structure created by strong outflows is highly anisotropic emission. While a relationship between anisotropy and super-Eddington mass-transfer rates may be a natural consequence of A&A proofs: manuscript no. aanda super-Eddington accretion onto black holes (King 2009), NS may have means to circumnavigate this relationship. Crucially, in the presence of a strong magnetic field, the disk might be truncated before radiation pressure starts to be significant to inflate the disk and drive strong outflows (e.g. Chashkina et al. 2019). This occurs when the magnetospheric radius, given by: where R NS is the radius of the neutron star, B is its magnetic field, M NS is the mass of the neutron star,Ṁ is the mass accretion rate at R m and ξ is a dimensionless parameter that takes into account the geometry of the accretion flow and is usually assumed to be 0.5 for an accretion disk (Ghosh et al. 1977), is larger than the spherisation radius (R sph ). R sph in turn has a linear dependence onṁ 0 . This offers means for a NS to be fed at high-mass transfer rates, while largely reducing anisotropy. On the other hand, we may expect that for weakly magnetised NSs, the disk will become super-critical given the dependency of R m on B. In this case the emission will be collimated by the outflow in a similar fashion as for super-critically accreting black holes (e.g. King et al. 2017;Takahashi & Ohsuga 2017). Additionally, we may expect a NS with strong outflows to appear softer, as outflows will Compton down-scatter the emission from the accretion column.
We therefore can make use of the long-term variability observed in the HLD to discuss which scenario best describes the variability observed in each source: super-Eddington accretion onto BHs, weakly magnetised NSs or highly magnetised NSs. At the same time, the wealth of data analysed in this work allows us to identify groups of sources showing common evolution and similar spectral states, while identifying new NS-ULX candidates based on their similarity with the PULXs. Given that our analysis focuses on discussing the spectral transitions observed in the HLD (see Figure 6), our study is less constraining for NGC 6946 X-1 and NGC 5408 X-1 8 , given the lack of marked spectral states. We therefore do not attempt to offer any interpretation for these two sources.
Finally, a common characteristic of the PULXs seems to be a hard spectrum accompanied with high-levels of variability at high-energies ( 6 keV) (see last four sources in Figure 6 and also Figure 1 from Pintore et al. (2017)). On this basis, we argue that M81 X-6 constitutes the best NS-ULX candidate from our sample, given its harder spectrum and its high spectral variability at high energies as we discuss in Section 4.3. This high-energy component is likely associated with emission from the accretion A. Gúrpide et al.: Long term evolution of ULXs   Figure 8 for the countours around the best-fit T hard and its normalisation for three selected epochs of M83 ULX1. The degeneracy between these two parameters can be ruled out as the cause of the positive L-T correlation. column in the case of PULXs (Walton et al. 2018b,c) and may offer means to distinguish NS-and BH-ULXs (which we discuss in Section 4.5). While IC 342 X-1 and Circinus ULX5 also show a hard spectra with high-levels of variability at high-energies, they both show a soft and dim state that we argue is akin to that seen in NGC 1313 X-1, a much softer source with relatively stable high-energy emission (see Figure 6) and thus we discuss them in Section 4.4.1. Nevertheless, we stress that the spectral similarity across the sample is undeniable as also noticed in previous studies (e.g. Pintore et al. 2017;Walton et al. 2018c).

PULXs
Remarkably, our analysis shows (Figure 4) that PULXs are among the hardest sources in our sample, something also noted by Pintore et al. (2017) employing a colour-colour diagram. Interestingly, PULXs with higher pulse-fractions, e.g. NGC 7793 P13 (PF∼20%; Israel et al. 2017b), M51 ULX-7 (PF∼12%; Rodríguez-Castillo et al. 2020) and NGC 300 ULX1 (PF∼50%; Carpano et al. 2018) tend to appear harder (see Figure 4) while the softest PULX in our sample, NGC 1313 X-2, has the low-est pulse-fraction (PF∼5% Sathyaprakash et al. 2019). Figure 6 shows that harder PULXs show little variability in the HLD (see the case of NGC 7793 P13 and M51 ULX-7 9 ), while in contrast NGC 1313 X-2 shows strong changes in hardness by a factor of ∼3, without undergoing to the propeller regime. Below we discuss whether these differences in the long-term evolution could be explained due to the different interplay between R m and R sph . We note that in the super-Eddington regime, the interaction between the magnetic field and the disk is likely to be more complex than as predicted by Equation 1 as radiation pressure can dominate over the gas ram pressure (Takahashi & Ohsuga 2017;Chashkina et al. 2019) but for the qualitative picture discussed here we neglect these facts. For NGC 300 ULX1, given the limited number of observations we cannot offer a detailed discussion as for the other sources and thus we will not consider this source further.
A&A proofs: manuscript no. aanda Fig. 11. Bolometric luminosity of the hard diskbb component as a function of its temperature for NGC 6946 X-1. Due to the uncertainties associated with the parameters, we do not consider these two quantities to be correlated (see text for details). Symbols as per Figure 6.
NGC 1313 X-2: R m < R sph -This source shows a strong bimodal behaviour in the HLD (see Figure 6 and also Feng & Kaaret 2006;Pintore & Zampieri 2012) that is likely associated with the 158 day quasi-periodicity reported by Weng & Feng (2018) using Swift-XRT data, although an association is not straightforward since several periodicities are found in the periodogram presented by the authors. Our analysis reveals that this bi-modal behaviour is driven by a highly variable hard component while the soft emission is rather stable (the soft diskbb varies by a factor 2 in luminosity while the hard component luminosity varies by factor 5, see Figure 9). This bi-modal behaviour is confirmed by Swift-XRT long-term monitoring (Weng & Feng 2018).
This variability is unlikely to be produced by the propeller effect, where the centrifugal barrier of the magnetosphere prevents further infalling of gas onto the NS (Illarionov & Sunyaev 1975), as for a period of ∼ 1.47 s (Sathyaprakash et al. 2019), we should expect a drop in luminosity by a factor ∼ 220 (see for instance equation 5 from Tsygankov et al. 2016) assuming a NS radius of 10 km and a mass of 1.4 M , whereas the observed drop in luminosity from brightest to dimmest is only about a factor ∼ 5 (using the luminosities in the 0.3 -10 keV band from epochs 2006-03-06 and 2013-06-08).
Instead, the absence of transitions to the propeller regime can be used to set constraints on the maximum magnetic field of the source, since we require that the magnetospheric radius is always smaller than the co-rotation radius, the radius at which the Keplerian disk velocity equals the rotation of the NS. We thus require that at the minimum luminosity observed in the source, R m < R C . We can rearrange equation (37) from Mushtukov et al. (2015) to derive an upper limit on the magnetic field strength: where L intr is the minimum intrinsic luminosity in units of 10 39 erg/s, B 12 is the magnetic field in units of 10 12 G, R 6 is the NS radius in units of 10 6 cm, m is the NS mass in M , P is the period in seconds and ξ is the same dimensionless parameter as for Equation 1 which we take again to be 0.5. We can express the intrinsic luminosity in terms of the observed luminosity taking into account a beaming factor (b) (e.g. King & Lasota 2020) so that L intr = b L observed with b ≤ 1. Setting L observed = 2.2±0.1 (from epoch 2013-06-08), R 6 = 1, m = 1.4 and P = 1.47 s we obtain an upper limit on the magnetic field of B = (b 1/2 33.0±0.7) × 10 12 G. Thus, this suggests that we can rule out extreme magnetic fields (B ≥10 13 G) for moderate values of b ( 0.1).
Therefore the hardening of the source with luminosity might instead support a scenario in which a conical outflow from a supercritical disk imprints highly anisotropic emission. Changes in the viewing angle due to the source precession and a varying degree of down-scattering in the wind could thus explain the source variability. For a discussion on possible mechanisms for precession in ULXs we refer the reader to Vasilopoulos et al. (2020). The stability of the soft component and the fact that the changes in HR and luminosity are likely driven by a super-orbital period support that changes in the mass-accretion rate are not the main source of variability. Additionally, the presence of strong outflows may be supported by the residuals observed at soft energies around 1 keV (see Figure 12) in the epoch with the longest exposure (2017-06-20), reminiscent of those seen in NGC 5408 X-1 and NGC 55 ULX 1 Pinto et al. 2016Pinto et al. , 2017.
Overall the variability of the source and its low pulse-fraction (∼ 5%) support a scenario in which NGC 1313 X-2 is a weakly magnetised NS, in which R sph > R m and therefore outflows and precession cause the emission to be highly anisotropic. This is also supported by the recent ray tracing Monte-Carlo simulations of Mushtukov et al. (2021), showing that larger scale-height flows lead to lower pulse-fraction due to the increased number of scatterings.
NGC 7793 P13 and M51 ULX-7: R m > R sph -Conversely, both the long-term evolution of NGC 7793 P13 and M51 ULX-7 show little variability in terms of hardness ratio, albeit also being associated with super-orbital periodicities, of 66.9 days (see Fürst et al. 2018, Figure 5) in the case of NGC 7793P13 and of 39 days in the case of M51 ULX-7 (Vasilopoulos et al. 2020). Indeed, long-term Swift-XRT monitoring shows no clear bi-modal behaviour (Weng & Feng 2018) in a hardness-intensity diagram in the case of NGC 7793 P13. Both sources also show very hot hard component (T hard ∼ 3 keV), albeit it is unclear whether the absence of a third, high-energy, component may have boosted the inferred temperature. Similarly as for NGC 1313 X-2, most of the luminosity variability is again seen in the hard component (the soft component varies by a factor of ∼ 2 and the hard component by a factor of ∼ 5). However, for NGC 7793 P13 and M51 ULX-7, the soft component does seem to brighten with source luminosity.
The lack of HR variability might indicate that anisotropic emission caused by the wind funnel is not important in these sources. As stated before, in the presence of a strong magnetic field, the disk might be truncated before radiation pressure starts to be significant to inflate the disk and drive the strong outflow. Assuming the geometry of the accretion flow outside R m is similar to that of supercritically accreting black holes, our soft diskbb could represent the emission from the outer regions of the accretion disk, with partial reprocessing by the wind if R m > R sph (Kitaki et al. 2017). If we consider that this model component can give us a rough estimate of the size of this emitting region, a larger emitting area compared to NGC 1313 X-2, for which we have argued that outflows are important, could indicate that the disk is being truncated further from the accretor in the case of NGC 7793 P13 and M51 ULX-7. To illustrate this, we compute the mean radius of the inner disk given by the soft diskbb normalisation (N) from all epochs. Using: where i is the inclination of the system, D 10 is its distance in units of 10 kpc and f col is the colour correction factor (Shimura & Takahara 1995) which we take as 1.8 (e.g. Gierliński & Done 2004) throughout this paper. This gives 1637 +106 −66 (cos i) −1/2 km, 1975 +197 −134 and 2257 +125 −104 (cos i) −1/2 km for NGC 1313 X-2, M51 ULX-7 and NGC 7793 P13, respectively. Naively assuming this radius gives a rough estimate of the size of the magnetospheric radius (R m ), it may indicate a higher-mass accretion rate for NGC 1313 X-2 (and/or a lower magnetic field strength) and thus a scenario in which the disk becomes thick and outflows cause anisotropic emission. Instead, the larger radius of NGC 7793 P13 and M51 ULX-7 may suggest that either the mass-accretion rate is lower or the magnetic field strength is higher, which can result in the disk remaining geometrically thin (see also Chashkina et al. 2017) and therefore in reduced anisotropy. We note that these would support previous studies by Koliopanos et al. (2017), where the same relationships for R m and R sph were found for NGC 7793 P13 and NGC 1313 X-2 (see their Table 2 and 4).
Assuming the mass-accretion rate varies within a similar range in the three sources, a stronger magnetic field in NGC 7793 P13 and M51 ULX-7 is also supported by the fact both sources undergo periods of inactivity to 10 38 erg/s, likely associated with the propeller regime (e.g. Fürst et al. 2016;Vasilopoulos et al. 2021), indicating that the condition R m > R C is more easily achieved. Furthermore, the lower pulse-fraction of NGC 1313 X-2 compared to that of NGC 7793 P13 and M51 ULX-7 is consistent with less material reaching the magnetosphere and thus the accretion column, as a result of the mass loss in the disk. Therefore, the emission at high-energies is not only intrinsically diminished but also down-scattered in the outflow (Mushtukov et al. 2021), which might explain why NGC 1313 X-2 is generally softer than NGC 7793 P13 and M51 ULX-7.
We thus suggest that the magnetic field in NGC 7793 P13 and M51 ULX-7 is likely to be higher than that of NGC 1313 X-2 so that R m > R sph . The low degree of beaming and highmagnetic field implied by this solution is in agreement with previous magnetic field estimates (Vasilopoulos et al. 2020;Rodríguez-Castillo et al. 2020), that suggested a magnetic field of ∼ 10 13 G in M51 ULX-7.
NGC 5907 ULX1: R m ∼ R sph -Contrary to the rest of PULXs in our sample, the luminosity of NGC 5907 ULX1 clearly exceeds 10 40 erg/s. The variability between the observations clustered at L X ∼ (6-8) × 10 40 erg/s (epochs 2003 and 2014) and epoch 2012 was shown to be associated with different phases of the 78-day super-orbital period (Walton et al. 2016a) of the source by Fürst et al. (2017). This suggests that these changes are not due to a change in the mass-accretion rate and suggests instead changes in the viewing angle as the sources precesses. It has been speculated that the extreme luminosity of the source could be due to a high-degree of beaming (e.g. King et al. 2017;King & Lasota 2019), in which the super-orbital modulation was due to a conical outflow beaming the emission into and out of our line of sight (e.g. Dauser et al. 2017). However, our Figure  6 shows that the source is harder when it becomes dimmer in epoch 2012-02-09 (see also Figure 3 from Sutton et al. 2013) compared to epochs 2003/2014, and thus these changes associated with the super-orbital period may be hard to reconcile with changes imprinted by the precession of an outflowing cone, as we would expect a softer emission at low luminosities. This might imply that R sph ≤ R m is likely in the case of NGC 5907 ULX1.
We note that other mechanisms could also cause the luminosity to be overestimated under the assumption of isotropic emission. In fact, the emission from the accretion column is not expected to be emitted isotropically. Instead, radiationhydrodynamic simulations of super-Eddington accretion onto magnetised NSs by Kawashima et al. (2016) show that the accretion column is expected to have a flat emission profile along its sides, as the emission is only able to escape through the lateral sides of the confined material in it. This naturally creates highly anisotropic emission and can cause the observed emission to be greatly in excess of the Eddington limit.
A highṁ 0 is still likely required to produce the observed luminosities. This may require the presence of a strong magnetic field so that the disk is truncated roughly at the point where it becomes supercritically. Thus, as suggested previously (Walton et al. 2018c; King & Lasota 2019) R sph ∼ R m seems a plausible condition to explain the high-luminosity of NGC 5907 ULX1.

The non-pulsating NS: M51 ULX8
This source was identified as a NS through the identification of a possible cyclotron resonance feature (Brightman et al. 2018) 10 . We note that its position in the HLD (see Figure 4) may be consistent with a lack of pulsations (Brightman et al. 2018), as this source is markedly dimmer and softer than the overall PULX sample, and sources with higher pulse fractions tend to sit in the harder end of the diagram, as stated before. However, given the apparent lack of variability, it is hard to give a comparison between this source and other PULXs in our sample. Strong variability is only observed in epoch 2018-05-25, where the hard component increased by a factor of 2 in luminosity. Its behaviour is somewhat similar to NGC 1313 X-2 and M81 X-6, that we argue is a good PULX candidate (see Section 4.3), although we lack enough observations of the source at higher luminosities to confirm this similarity. The soft component shows also little variability and no clear correlation, suggesting again a link between these three sources, which would favour a weakly magnetised NS in M51 ULX8 in agreement with the study pre-sented by Middleton et al. (2019). Should this be the case, then long-term monitoring of the source will be crucial to attempt to identify any quasi-periodicity similar to those seen in NGC 1313 X-2 and M81 X-6.

PULX candidates
M81 X-6 -As stated above, M81 X-6 constitutes our best NS-ULX candidate. The spectral evolution and the track in the HLD for this source is strikingly similar to that of NGC 1313 X-2 ( Figure 6). Both sources transit back and forth from a soft (HR 1.5) and low luminous state ( 4 × 10 39 erg/s) to a hard (HR ∼ 3) and brighter state. The similar temperature, luminosity range and variability (or lack of it) of the soft component also suggests a link between these two sources. Instead, the variability is driven mostly by the hard component (the soft diskbb changes by a factor 2 in luminosity while the hard component luminosity varies by a factor 5 in both sources). Furthermore, the variability of M81 X-6 is also likely associated with a 115 days quasiperiodicity (Weng & Feng 2018), compatible with the one seen in NGC 1313 X-2. Given the similarity of these two sources, we suggest that it also harbours a weakly magnetised NS, so that the presence of strong outflows along with source precession may account for the source spectral variability.
Prompted by this similarity, we searched for coherent pulsations in M81 X-6 using the code HENDRICS (Bachetti 2018) which is based on the publicly available PYTHON library Stingray (Huppenkothen et al. 2019). Unfortunately, all of the observations except one have less than 5000 counts in pn, and typically 10000 counts seem to be required in order to detect pulsations (Rodríguez-Castillo et al. 2020). We thus searched in the observation with the longest exposure (∼ 73 ks in pn, see Table 1 and Figure 6 epoch 2001-04-23, HR ∼ 2, L ∼ 4 × 10 39 erg/s) suitable for pulse searches, assuming M81 X-6 has a period and period derivative similar to the other PULXs. We used the Chandra coordinates given by Swartz et al. (2003) to extract barycentred corrected events in the 0.2 -12 keV band. We ran HENDRICS on the unbinned event file searching for coherent pulsations in the 0.2 -8 Hz range, based on previous PULX detections, using the Z 2 statistic (Buccheri et al. 1983) suitable for sinusoidal pulses. The search was performed using the option fast, that optimises the search in the f -ḟ space and reduces the computational time 10-fold compared to a classical search. We found no detection above the 3σ level. We ran the same search in the 4 longest GTI intervals, ranging from 20 ks to 40 ks, as pulsations in PULXs have been shown to vary during the course of an observation (e.g. Bachetti et al. 2020), but again we did not find any significant detections. Overall there is no peak that can be robustly identified as several peaks with similar Z 2 power are found, well below the 3σ level. We also looked in the pn data of the individual observations with shorter exposures, but found no significant detections. We performed a last search looking into the 0.02 -0.2 Hz range to look for longer periods as those seen in NGC 300 ULX1 (see e.g. Vasilopoulos et al. 2018), with similar results. This could indicate that pulsations in this source are as elusive and faint as those found in NGC 1313 X-2 (Sathyaprakash et al. 2019) and that deep exposures with the source on-axis or deeper searches correcting for the orbital parameters will be required to detect pulsations.
Alternatively, the source spectral state may play a role in the detectability of the pulsations. Considering the case of NGC 1313 X-2, the epochs when pulsations were detected by Sathyaprakash et al. (2019) are 2017-09-02 and 2017-12-09 (i.e. the last two epochs in Figure 6). The authors also found that the pulse fraction (and hardness, as shown in this work, Figure 4) decreases with the source luminosity. As argued before, we understand the hardening of the source as a decrease in the viewing angle as the system precesses. Considering the pulse-fraction calculations for super-critical accretion columns proposed by Inoue et al. (2020), this might imply that the angle between the rotational axis and the magnetic field axis (Θ B ) is greater than the angle between the observer's line of sight and the rotational axis (Θ obs ) (e.g. Θ B > 30 • and Θ obs < 30 • , see Figure 5 from Inoue et al. (2020)). Assuming the same applies to M81-X6, this could imply that pulsations are more likely to be found in softer and dimmer states (HR ∼ 1.5, L ∼ 2.5 × 10 39 erg/s) where we expect the pulse fraction to be higher.
If instead, the dilution of the pulsed emission is mainly due to a stochastic process such as multiple scatterings through the wind, then it may be possible to find a PULX in similar spectral states, with and without pulsations. Nevertheless, studying the dependence of the appearance of pulsations on the source spectral states seems a promising tool to put constraints on the accretion flow geometry in PULXs.

Geometrical effects of a supercritical funnel
Several of the softer sources for which the accretor is unknown show a common pattern in their long-term evolution: three distinct spectral states, two of them at similar low luminosities but distinct hardness and a third one at a higher luminosity (see for instance NGC 1313 X-1, Circinus ULX5 and IC342 X-1 in Figure 6). Two other sources that show also three marked spectral states are Holmberg II X-1 and NGC 5204 X-1, albeit the luminosity of the dimmer states differ in this case by a factor of ∼ 2-4. The difference in luminosity between one of the two dim states and the bright state might be naturally explained by changes iṅ m 0 . However, the presence of an additional dim state requires another explanation. A super-critical funnel, as we discuss below, may offer an explanation to these three states either through obscuration asṁ 0 increases (sources in Section 4.4.1) or due to changes in the inclination of the system (sources in Section 4.4.2). Given some of the common transitions and other spectral properties as we show below, we discuss together NGC 1313 X-1, Holmberg IX X-1, NGC 55 ULX1, Circinus ULX5 and IC 342 X-1 in Section 4.4.1 and Holmberg II X-1 and NGC 5204 X-1 in Section 4.4.2.
For the discussion, we will use NGC 1313 X-1 as our benchmark to discuss some of the transitions observed to the soft and dim states. In some cases, the timescale between these transitions and the duration of each state are poorly constrained due to the sparsity of our data. However, in a few cases, like for NGC 5204 X-1 and NGC 1313 X-1, the sampling rate is high enough so that we do observe the source switching from one state to another and thus we refer to these changes as transitions.

Optically thick funnels
NGC 1313 X-1 and Holmberg IX X-1 -As we show later, NGC 1313 X-1 undergoes a transition similar to that observed in the super-soft ULXs in NGC 247  and M101 (Soria & Kong 2016), which are seen to transit from ULSs to a soft ULXs spectra. A similar transition is also seen in NGC 55 ULX1 (Pinto et al. 2017) (and in this work as we argue later), although the source would still classify as a ULX when in this dim-state. These transitions are all marked by an increase in the size of the emitting region of the soft component and a decrease in tem-perature, interpreted as an expansion of the wind photosphere as the spherisation radius increases with the corresponding decrease in temperature (Poutanen et al. 2007). While the ULXs in M101 and NGC 247 are frequently thought to be viewed at high inclinations (e.g. Ogawa et al. 2017), NGC 1313 X-1 is likely viewed down the optically thin funnel (e.g. Poutanen et al. 2007;Narayan et al. 2017), so that the hard component dominates the emission (epoch 2012-12-16 in Figure 6).
The spectral transitions of NGC 1313 X-1 between the low state (L∼ 8 × 10 39 erg/s, epoch 2012-12-06) and the high-state (L∼ 18 × 10 39 erg/s, epoch 2004-06-05) have been frequently interpreted as the wind entering our line due to a narrowing of the funnel as the mass-accretion rate increases (e.g. Sutton et al. 2013). However, the fact that the high-energy emission ( 10 keV) remains relatively stable may be at odds with this interpretation, as we should expect the wind to down-scatter The physical processes at play to produce these high-energy photons are still poorly understood (e.g. Walton et al. 2020) but it is generally accepted that this emission is produced in the vicinity of the accretor (e.g. Kawashima et al. 2012;Takahashi et al. 2016;Walton et al. 2020). For the remainder of this part, we assume that this high-energy component is indeed produced in the inner regions of the accretion flow and focus on the influence of the wind/funnel structure on the spectra, rather than on the origin of this emission, which will be discussed in Section 4.5.
The fact that the high-energy component remains stable, could therefore imply that the gas within the funnel has remained optically thin over a certain range of mass-transfer rate, and that the inclination of the system (i) remains well below the halfopening angle of the funnel (θ f ). The second condition is required so that the higher degree of beaming caused by the reduction of θ f , will only result in an increase in the amount of photons that are down-scattered off the wind walls into the observer's line of sight, with the wind remaining out of the line of sight. Since the optical depth of the wind is lowest near the rotational axis of the compact object (Poutanen et al. 2007), the emission from the innermost regions are more likely to reach the observer without suffering severe energy loses (Kawashima et al. 2012). This may support previous works suggesting that NGC 1313 X-1 is seen at low viewing angles (e.g. Middleton et al. 2015a).
The lack of obscuration of the high-energy emission suggests that the mass-accretion rate has to be moderate (Ṁ 10Ṁ Edd ) 11 as for higher mass-transfer rates the gas within the funnel is expected to become optically thick (Narayan et al. 2017). Therefore, regardless of the exact nature of this high-energy powerlaw tail, we argue that albeit the increase in mass-accretion rate in NGC 1313 X-1 up to epoch 2004-06-05, the physical conditions within the funnel have remained stable. The similar persistent high-energy emission seen in Holmberg IX X-1 and the similar L-T positive correlation, suggests a similar evolution in both sources, albeit NGC 1313 X-1 is seen in an obscured state (see below) not seen in Holmberg IX X-1. We discuss in more detail the possible differences between these two sources focussing on the nature of the high-energy tail in Section 4.5.
Interestingly, after NGC 1313 X-1 reaches its maximum luminosity (epoch 2004-06-05), it becomes extremely soft and its luminosity decreases (epoch 2004-08-23obscured state). This spectral transition (note that these two observations are just two months apart) can be understood if a further increase in the mass-accretion rate leads to a narrowing of the opening angle of the funnel as the wind becomes more mass-loaded, to the point where the gas within the funnel becomes optically thick to the high-energy radiation. This implies that now θ f < i and thus the wind effectively enters the line of sight and starts obscuring the inner accretion flow. The optical depth of the wind in the direction parallel to the disk is also expected to be an order of magnitude higher than in the perpendicular direction (Poutanen et al. 2007). High-energy photons are now heavily down-scattered or absorbed and therefore we mostly observe the soft emission from the expanded wind photosphere, as supported by the increase in the normalisation of the soft component (from ∼ 5 to ∼ 50 before and after the obscuration respectively).
This transition is also in good agreement with the GR-RMHD simulations presented by Narayan et al. (2017) of super-Eddington accretion onto black holes. The authors observe a transition from a hard spectrum to a very soft one, as the mass accretion rate increases (Ṁ ∼ 23Ṁ Edd from their simulations) and the gas within the funnel becomes optically thick (see their Figure 9). Furthermore, their simulations also show that the luminosity for an observer with a line of sight close to the rotational axis of the accretor (i ∼ 10 • ) is capped at around 2 × 10 40 erg/s, which is in very good agreement with the maximum luminosity observed in NGC 1313 X-1. Their simulations also predict an increase in the spectral emission at low energies ( 0.6 keV), which seem at odds with our observations, where the luminosity of the soft component has remained stable compared to the brightest state. We note however, that Kawashima et al. (2012), who found qualitatively the same results as Narayan et al. (2017) for high-mass transfer rates ( 23Ṁ Edd ), do not observe an increase in luminosity at low energies. It is also possible that due to the expansion of the photosphere, the soft component peaks now in the extreme UV and thus given our limited bandpass, we cannot reliable assess whether the luminosity of the soft component has increased.
Numerical simulations by Ogawa et al. (2017) also predict a steep decline of the high-energy emission as the outflow photosphere enters the line of sight. We find that the temperature of the hard diskbb diminishes from 1.6±0.1 keV in epoch 2004-06-05 to 0.92 +0.10 −0.08 keV in epoch 2004-08-23 and its unabsorbed bolometric luminosity decays from (11.2±0.6) × 10 39 erg/s to (2.9±0.4) × 10 39 erg/s, which seems to support this interpretation.
NGC 55 ULX1 -Similarly, we argue that the soft and dim spectral state observed in NGC 55 ULX1 is analogue to the obscured state observed in NGC 1313 X-1. In both cases, we observe a possible increase in the neutral absorption column (see Section 3.1.2). Albeit this might be model dependent (this is discussed further below), it suggests that their spectrum has evolved in a similar manner. The ratio of unabsorbed bolometric fluxes in the obscured state are F harddiskbb /F softdiskbb = 0.34 +0.12 −0.08 and 0.29 +0.01 −0.02 in NGC 1313 X-1 (epoch 2004-08-23) and NGC 55 ULX1 (epoch 2010-05-24), respectively. This is a factor of ∼ 3 times lower in both cases compared to when the sources were at their brightest and indicates that the hard component is responsible for the drop in luminosity. We see again an increase in the normalisation of the soft diskbb (see also Pintore et al. 2015), akin to the transitions seen in the ULXs in M101 and NGC 247 Soria & Kong 2016). The increase in the neutral absorption column may indicate that now we see parts of the wind less exposed to the central source (Pinto et al. 2020a), where self-absorption could start to be important, albeit more physically motivated models are needed to address this. The presence of outflows is supported by studies using highresolution spectroscopy (Pinto et al. 2017(Pinto et al. , 2020b which might have revealed the presence of soft residuals associated with outflowing winds in NGC 1313 X-1 and NGC 55 ULX1. Their similar transitions highlighted here support therefore the unification scenario proposed by (Middleton et al. 2015a;Pinto et al. 2020a).
IC 342 X-1 and Circinus ULX5 -As shown in Figure 6, IC 342 X-1 and Circinus ULX5, not only share a very similar evolution in the HLD, but are also found in a soft and dim state (epochs 2012-10-29 and 2016-08-23 for IC 342 X-1 and Circinus ULX5 respectively), reminiscent again of the obscured state seen in NGC 1313 X-1. When both sources are hard and dim (e.g. epochs 2012-08-07 and 2001-08-06 for IC 342 X-1 and Circinus ULX5 respectively) the mass-transfer rate is likely to be low, similar to NGC 1313 X-1 in epoch 2012-12-16. The brighter and harder states (epochs 2005-02-10 and 2013-02-30 for IC 342 X-1 and Circinus ULX5 respectively) might correspond to an increase in the mass-transfer rate while the softest and dimmest states are likely again due to the central source being obscured by the funnel becoming optically thick at hightransfer rates. This is supported by the diminishing of the hard component in both temperature and luminosity. In this case, we do not see an increase in the local n H -value as for NGC 1313 X-1 or NGC 55 ULX1, that could strengthen the similarities of these obscured states, but we note that these are the two sources with the largest n HGal -values ( 30 × 10 20 cm −2 ) in our sample and that we noted some calibration uncertainties at low energies (see Section 2.2). Nevertheless, these sources show that both archetypal soft ULXs (e.g. NGC 55 ULX1) and hard ULXs (e.g. IC 342 X-1) undergo similar type of transitions.

Inclination effects
Holmberg II X-1 and NGC 5204 X-1 -The similarities between Holmberg II X-1 and NGC 5204 X-1 are clear when looking at their long-term evolution in the HLD (see also the similarity between the three spectra shown in Figure 6) and are further supported by the similar L-T correlations found for the soft diskbb (α NGC5204X-1 = 3.6±0.4, α HolmbergIIX-1 = 3.3±0.4). Therefore, regardless of the physical processes powering these two sources, these similarities strongly suggest that we are witnessing the same type of source and/or accretion flow (see also Gúrpide et al. in prep).
In epochs 2004-04-15 and 2006-11-16 of Holmberg II X-1 and NGC 5204 X-1 respectively, both sources are found with a hard spectrum and an intermediate luminosityhard/intermediate state. Again, we favour a low viewing angle as for NGC 1313 X-1 as the hard component dominates the emission, although the inclination in this case may be higher than for NGC 1313 X-1, given their softer spectra. As both these sources move in the HLD (see 2003 XMM-Newton epochs for NGC 5204 X-1 for the transition) from these epochs to softer spectra and brightest luminositiesbright/soft state (e.g. epochs 2004-04-15 and 2006-11-16 for Holmberg II X-1 and NGC 5204 X-1 respectively), the temperature of the hard component decreases as seen in NGC 1313 X-1 12 , while the soft component increases in temperature and luminosity (see Table 2). Similarly, most of the variability between these two epochs is seen at mid to soft energies (∼ 0.3 -5 keV), whereas the high-energy emission ( 5 keV) shows little variability. We argue that these transitions are due once again to an increase in the mass-accretion rate as the slight softening of both sources at high-energy and the increase in the soft component seems to match the evolution presented by Kawashima et al. (2012) (see their Figure 2). In this case, the small dimming seen at high-energies ( 10 keV, see especially spectra for Holmberg II X-1) may be due to the fact that our line of sight may be now grazing the optically thick walls of the wind, and thus some of the high-energy photons from the inner parts of the accretion flow are now being Compton down-scattered by the wind. The transitions from hard/intermediate to soft/bright in NGC 5204 X-1 were also reported by Sutton et al. (2013) in terms of hard and soft ULX transitions.
Our study shows that these sources are also seen to transit from bright/soft to another state that we term dim/soft (epochs 2002-09-18 and 2001-05-02 for Holmberg II X-1 and NGC 5204 X-1 respectively) and vice-versa. For NGC 5204 X-1, this is also seen in the Chandra observations of 2003 (see also the full set of Chandra observations presented by Roberts et al. (2006) 13 ). For Holmberg II X-1, this is also seen in 2010 (see also the 2009/2010 Swift-XRT monitoring of Holmberg II X-1 in Grisé et al. (2010)).
While there are certain similarities between these transitions and that seen in NGC 1313 X-1 in the obscured state, namely a softening and dimming of the source, we found also certain differences that may indicate that these transitions are not due to the same phenomenon as for NGC 1313 X-1. From the bright/soft state to the dim/soft state, the entire spectrum seems to have diminished in luminosity in the case of Holmberg II X-1 and NGC 5204 X-1 (see the spectra from Figure 6), while for NGC 1313 X-1 we showed that it was the hard component that was mostly responsible for the dimming. Indeed, the soft diskbb is the faintest in the dim/soft state, while the temperature of the hard diskbb remains consistent within errors with respect to the bright/soft state (for both Holmberg II X-1 and NGC 5204 X-1). If the funnel has become optically thick due to an increase in the mass-transfer rate and is now obscuring the hard emission, we should expect the soft component to be relatively stable as seen in NGC 1313 X-1. Therefore, the dimming of this component may be at odds with this interpretation.
It is worth noting that the spectral evolution of NGC 5204 X-1 and Holmberg II X-1 from bright/soft to dim/soft bears some resemblance with how the inclination affects the spectral shape (see for example Kawashima et al. 2012;Kitaki et al. 2017;Narayan et al. 2017), which could suggest that changes in the inclination are responsible for these spectral changes. Indeed, numerical simulations by Narayan et al. (2017) predict a decrease in about one order of magnitude in luminosity between a source with a face-on aspect and a source viewed at high inclinations (i > 30 • ), which is consistent with our observations of the transitions from bright/soft to dim/soft. However, if the inclination of the system is indeed changing due to precession of the supercritical funnel, we should expect these changes to be associated 12 For NGC 5204 X-1, see those epochs of higher quality as the epochs when simpl was not included tend to appear with artificially hotter temperatures for the hard diskbb. 13 Albeit these transitions were well sampled by Chandra at the end of 2003, we were not able to use the short exposure (∼ 5ks) observations when the source was caught repeatedly in the low state, given the limited number of counts registered. with some periodicity. Thus, the fact that these transitions do not seem to be periodic (Grisé et al. 2010) may be at odds with the effects of a precessing funnel (albeit see Gúrpide et al. in prep).
These transitions were also shown to occur rapidly, in some cases in timescales shorter than half a day (Grisé et al. 2010). This short-term variability may be expected if dense clumps of the wind are intersecting our line of sight (Takeuchi et al. 2013;Middleton et al. 2015a) or if our line of sight is rapidly changing between seeing down the funnel and seeing through the wind walls, expected if our line of sight grazes the wind as argued before. More information and monitoring is needed about the timescale of these transitions to address this issue and thus this will be further studied in a forthcoming publication.

Origin of the high-energy tail
As stated above, NGC 1313 X-1 and Holmberg IX X-1 show a remarkably stable emission above ∼8 keV, albeit the sources are clearly varying at lower energies. Several physical processes have been proposed in order to explain the high-energy emission in ULXs. Here we attempt to discuss its nature based on this observed stability.
Crucially, the absence of a hard surface in black holes naturally offers an explanation for the presence of a stable emission component: advection and photon trapping effects (Abramowicz et al. 1988;Ohsuga & Mineshige 2007). In the super-Eddington regime, the diffusion photon time in the vicinity of the black hole is expected to be greater than the accretion timescale, while part of the excess energy will go into powering the outflow, so that the luminosity only increases logarithmically with the mass-transfer rate (Poutanen et al. 2007): where x = 1 if only advection is considered or x = 0.6 if all the energy goes into powering the wind. Theoretical works show that this can lead to saturation of the continuum in the hard Xray band for stellar-mass black holes (Feng et al. 2019) and thus this stability at high-energies may represent the smoking gun in differentiating BH-from NS-ULXs, as NS have no means of swallowing any excess energy. Therefore, if we assume the high-energy emission is rather insensitive toṁ 0 as suggested by the stability of the high-energy component in Holmberg IX X-1 and NGC 1313 X-1, we suggest that the former might harbour a heavier black hole than the latter given its brighter high-energy component (around a factor ∼ 1.6). Alternatively, Kawashima et al. (2012) and Kitaki et al. (2017) based on numerical radiation hydrodynamic (RHD) simulations of super-Eddington accretion onto black holes, showed that an overheated region (T ∼ 8 keV) is formed in the vicinity of the black hole, where a shock is produced in the region where the outflow collides with the inflow. Photons from the disk entering this overheated region will gain energy through multiple Compton up-scatterings prior to escaping through the funnel or the wind itself (where they will undergo Compton down-scattering). Photons escaping mostly through the funnel will be less affected by Compton down-scattering and will be observed as a high-energy powerlaw tail in the spectrum. Kitaki et al. (2017) argued that the temperature of this region is independent on the black hole mass, resulting in a similar spectral shape for the high-energy tail regardless of the black hole mass. Therefore, provided that the mass-transfer rate between two given sources is similar, the luminosity of this high-energy component may provide means to estimate the black hole mass ratio between two given sources. The two sources for which this stability is best observed, NGC 1313 X-1 and Holmberg IX X-1, have indeed similar slope of the hard tail Γ NGC1313X-1 =2.90 +0.05 −0.04 and Γ 1HolmbergIXX-1 =2.9 +0.2 −0.3 and Γ 2HolmbergIXX-1 =3.4 +0.2 −0.3 where 1 and 2 indicate the two different Γ values we have used for the low and high flux epochs respectively. If we instead refit all the high data quality sets of Holmberg IX X-1 assuming one single value for Γ tied between all epochs, we obtain Γ=3.1±0.2 with χ 2 r = 1.06 for 3722 degrees of freedom, consistent with the slope found in NGC 1313 X-1. Therefore, the brighter luminosity of the high-energy component in Holmberg IX X-1, would again suggest that Holmberg IX X-1 harbours a heavier black hole compared to NGC 1313 X-1.
It is however not clear yet how the physical properties of this overheated region change with the mass-transfer rate. Assuming the overheated region is indeed responsible for the stable highenergy tail, then its physical properties should also be rather insensitive to the mass-transfer rate. Given that the overheated region is formed due to the shock of the outflows colliding with the inflow, we may also expect it to form around weakly magnetised NSs too. However, it is to be seen if the physical conditions within this overheated region and the funnel are expected to lead to the formation of a similar high-energy powerlaw tail as predicted for super-critically accreting BHs.
Lastly, in the case of accretion onto a NS magnetic poles, the accretion column is expected to be responsible for the emission at high-energies (Walton et al. 2018c). Unfortunately, simulated spectra from super-critically accreting NS from numerical simulations are still under study (e.g. Takahashi & Ohsuga 2017;Takahashi et al. 2018) and it is currently hard to know how the emission from the accretion column reacts to changes in the mass-transfer rate and magnetic field strength. Nevertheless, if the accretion column is responsible for the stable emission observed in these sources, it is still unclear why we observe this difference in terms of stability between PULXs and sources like NGC 1313 X-1 and Holmberg IX X-1.

M83 ULX1: a stellar-mass black hole?
This source sits at the lower end of the ULX luminosity distribution. We found the maximum unabsorbed luminosity of M83 ULX1 to be ∼ 3.5 × 10 39 erg/s, close to the maximum observed luminosity of ∼ 4.5 × 10 39 erg/s by Soria et al. (2015), although this was computed using a powerlaw which could have boosted the unabsorbed luminosity. This maximum luminosity, using Eddington mass scaling, suggests an accretor of ∼ 25 M . We also found that the hard diskbb component follows L hard ∝ T 4.4±0.7 hard , which as suggested by Figure 10, does not seem to be spuriously created by existing degeneracies. This may suggest that this component arises from an accretion disk with constant inner radius. This fact together with its low maximum luminosity, may suggest that we are witnessing a black hole accreting close to the Eddington-limit, as binary synthesis population studies predict that BH-ULXs tend to emit isotropically, since super-Eddington mass-transfer rates (a factor ∼ 8 above the classical Eddington limit) are harder to obtain from binary population evolution in the case of BHs (Wiktorowicz et al. 2019).
We thus consider the possibility that the source could be accreting close to the Eddington-limit. If so, then we could expect the accretion disk to deviate from the standard thin accretion disk, as radiation pressure inflates the disk making it geometrically slim or thick. This may lead to a departure of the radial temperature index (p) of -0.75 for a standard thin accre-Article number, page 23 of 44 A&A proofs: manuscript no. aanda Fig. 13. Dependency of the p radial index of the diskpbb with its temperature for M83 ULX1 (see text for details). This dependency is similar to that seen in stellar-mass black holes in the standard regime (Kubota & Makishima 2004). Symbols are as per Figure 7. tion disk (Shakura & Sunyaev 1973), implicitly assumed in the diskbb model. In order to explore such deviations, we refitted all our data with an absorbed broadened disk model (diskpbb in XSPEC). As in Section 3, we assumed again constant absorption column and we jointly fitted our data tying n H across all datasets with the diskpbb model. We obtained an excellent fit with χ 2 ∼ 1.02 for 1493 degrees of freedom (parameters are listed in Table  6). The overall evolution of M83 ULX1 is very similar to that seen in the ∼ 10 M black hole XTE J1550-564 (Kubota & Makishima 2004). Our hard diskbb when using the dual-thermal component shows constant inner-disk radius, closely following the L∝ T 4 relationship as XTE J1550-564 in the standard regime (i.e. when L disk ∝ T 4 ) (e.g. period 3 in Kubota & Makishima 2004) 14 . Additionally, when using the diskpbb model, the temperature increases with the radial index p of the diskpbb (Figure 13) as seen in XTE J1550-564 and LMC X-3 when fitted with the same model (see their Figure 9). Kubota & Makishima (2004) argued that this increase in p with temperature is an artefact caused by the limited bandpass and the fact that the radial dependency is flatter near the innermost disk radius than the -0.75 given by the diskbb approximation. Our diskpbb L-T re-14 Note that Kubota & Makishima (2004) uses RXTE that covers the 3-20 keV range and thus their soft component corresponds roughly to our hard component. lationship follows α = 3.0 ± 0.6 (90% confidence level) with Spearman's test coefficient of 0.94, again with roughly constant normalisation, suggesting we are witnessing the inner-most stable orbit as in XTE J1550-564 and LMC X-3 in the standard regime. A typical value of the normalisation N ∼ 0.002, corresponds to R in ∼ 95 (cos i) −1/2 km, assuming f col = 1.8 as previously. If we assume that the constant radius we observe corresponds to the innermost stable orbit for a non-spinning black hole, then R in = R isco = 3R S , and we obtain a BH mass estimate of ∼ 10 (cos i) −1/2 M . This mass estimate would be a factor 6 larger if we consider instead a maximally spinning Kerr black hole.
A further constraint on the mass of the black hole comes from the fact that Kubota & Makishima (2004) argued that a source enters the anomalous regime when L diskb / L Edd ∼ 0.4. Therefore, given that L disk ∼ 4 × 10 39 erg/s, then the mass of the M83 ULX1 could be of the order of 60 M , which could easily be accounted for with reasonable values of inclination and spin of the black hole. We note that it is unlikely that the source is in the anomalous state as we should expect p to decrease with temperature, as this parameter starts to deviate from standard value for a thin disk of -0.75 (Kubota & Makishima 2004;Shakura & Sunyaev 1973), at odds with our observations (Figure 13).
Albeit the exact mass estimate is rather uncertain, we conclude that the behaviour of this source is consistent with a massive stellar-mass black hole accreting close to the Eddington limit, in the high/soft state, given its similar evolution with accreting black holes in the standard regime. This conclusion is in agreement with previous studies by Soria et al. (2015) and other studies suggesting that sources below 3 × 10 39 erg/s could be consistent with massive black holes accreting close to the Eddington limit Sutton et al. 2013). Such massive black holes might not be rare given the black hole masses estimated from gravitational wave events (Abbott et al. 2019).

Conclusions
We have presented a thorough study of the long-term spectral evolution of a representative sample of ULXs and PULXs using data from XMM-Newton, Chandra, and NuSTAR. By studying their spectral states and transitions, we have been able to explain the main sources of variability in these sources which can be summarised as: changes in the mass-transfer rate, changes in the degree of beaming, precession and obscuration by the optically thick parts of the wind as this becomes more mass-loaded.
We have shown that PULXs are among the hardest sources in the sample and discussed their evolution in terms of the interplay between the magnetospheric radius and the spherisation radius. We favour a scenario in which the softest PULX, NGC 1313 X-2, is consistent with being a weakly magnetised NS so that R sph > R m and the wind/funnel structure is responsible for imprinting highly anisotropic emission as the source precesses, given the wide HR variability the source spans. This interpretation can explain the significantly softer spectra of NGC 1313 X-2 and its lower pulsed-fraction, as the primary emission from the accretion column is expected to be downscattered in the cool electrons of the outflow. Additionally, the lack of transitions to the propeller regime in NGC 1313 X-2 supports this interpretation, as the weak magnetic field (or high-mass transfer rate) implied by the condition R sph > R m , will naturally lead to smaller magnetospheric radii and thus transitions to the propeller regime are less likely to occur. Instead, the hardest PULXs, NGC 7793 P13 and M51-ULX7, are consistent with being strongly magne-tised, so that R m > R sph given the lack of HR variability which we interpret as lack of strong anisotropy. In this scenario, the accretion disk is being truncated before it becomes supercritical, suppressing the anisotropy that the funnel/wind structure would otherwise imprint. For NGC 5907 ULX1, we have shown that in those epochs associated with the super-orbital variability, the source appears harder when dimmer. This is hard to reconcile with the anisotropy expected from the funnel/wind structure in which we expect the source to become harder when brighter (as for NGC 1313 X-2). Still, a high-mass transfer rate is required to explain its high luminosity and therefore we conclude that R sph ∼ R m is a plausible condition to explain the source variability.
By comparing the evolution of PULXs with the sources in our sample, we have been able to identify a strong NS candidate with very similar evolution to that seen in NGC 1313 X-2: M81 X-6. Albeit we were not able to detect pulsations in the source, it is possible that longer exposures sampling the source in different spectral states may be needed to detect pulsations. Additionally, deeper pulse searches taking into account orbital parameters corrections may be needed.
Most of the softer sources for which the accretor is unknown, show three markedly different spectral states: one at highest luminosity and two at similar low luminosities but different hardness ratio. A super-critical funnel can offer an explanation of such degeneracy between luminosity and hardness, because a source is expected to be dim at both low mass-transfer rates and when the gas within the funnel becomes optically thick at highmass transfer rates, so that the hard radiation from the inner regions of the accretion flow becomes abruptly obscured. This could explain the evolution seen in NGC 1313 X-1, NGC 55 ULX1, IC 342 X-1 and Circinus ULX5. For Holmberg II X-1 and NGC 5204 X-1, these transitions may be better explained if our line of sight is grazing the half-opening angle of the funnel, so that our view of the accretion flow rapidly transits between seeing down the funnel and seeing through the optically thick wind walls as the source precesses. Future higher cadence monitoring of these transitions will be key in order to determine the exact nature of these transitions, by studying both their timescale and the source evolution prior and after them. Nevertheless, these transitions are suggestive of strong winds in these sources, which together with their softer appearance compared to most PULXs supports a scenario in which the sources considered here are powered by weakly magnetised NSs or BHs.
Finally we have reported on the stability of the high-energy emission ( 10 keV) in some of the sources in our sample. Notably, none of the PULXs show such stability, albeit further high-quality NuSTAR observations are needed to probe the different spectral states of both PULXs and those sources for which the accretor is unknown. Black holes are favoured candidates to explain this stability, as they naturally offer means to swallow any excess radiation, stabilising the output radiation even as the mass-transfer rate increases. Should this be the case, this highenergy emission may be the smoking gun to identify BH-ULXs.
On the other hand, should some of this sources host NS, then this stability may offer interesting clues about the accretion flow geometry around NSs in the super-Eddington regime. Nevertheless, we stress the importance of obtaining future NuSTAR observations as we may expect to see most of the observational differences between BH-and NS-ULXs at high-energies, where the mechanism responsible for the emission is expected to differ (i.e. an accretion column compared to the case of the inner regions of the accretion disk around a black hole). Holmberg II X−1 (D = 3.27 Mpc a , n H = 3.7 × 10 20 cm −2 ) 1X XMM       Table 2. Best fit spectral fit parameters. Errors are quoted at a 90% confidence level. The first epochs in each source correspond to the high-quality datasets, that were jointly fitted and from which we determine Γ and n H . The rest of the data were fitted individually by freezing n H and/or Γ to the best fit value found in the joint fit of the high-quality data.     Table 3. Estimated unabsorbed luminosities in units of 10 39 erg/s for each epoch using the distances from Table 1 and models from Table 2. For jointly fit data, the error estimation takes into account the errors on the joint parameters. Uncertainties are quoted at 90% confidence level. First column indicates the epoch given in Table 1 sorted chronologically. The simpl flux luminosity corresponds to the entire diskbb⊗simpl component.