Fragmentation and filaments at the onset of star and cluster formation

Context. The structure formation of the dense interstellar material and the fragmentation of clumps into cores is a fundamental step to understand how stars and stellar clusters form. Aims. We aim to establish a statistical view of clump fragmentation at sub-parsec scales based on a large sample of massive clumps selected from the ATLASGAL survey. Methods. We used the APEX/SABOCA camera at 350 μm to image clumps at a resolution of 8.′′5, corresponding to physical scales of <0.2 pc at a distance <5 kpc. The majority of the sample consists of massive clumps that are weak or in absorption at 24 μm. We resolve spherical and filamentary structures and identify the population of compact sources. Complemented with archival Herschel data, we derive the physical properties, such as dust temperature, mass and bolometric luminosity of clumps and cores. We use association with mid-infrared 22-24 μm and 70 μm point sources to pin down the star formation activity of the cores. We then statistically assess their physical properties, and the fragmentation characteristics of massive clumps. Results. We detect emission at 350 μm towards all targets and find that it typically exhibits a filamentary(-like) morphology and hosts a population of compact sources. Using Gaussclumps we identify 1120 compact sources and derive the physical parameters and star formation activity for 971 of these, 874 of which are associated with 444 clumps. We find a moderate correlation between the clump fragmentation levels with the clump gas density and the predicted number of fragments with pure Jeans fragmentation scenario. We find a strong correlation between the mass of the most massive fragment and the total clump mass, suggesting that the self-gravity may play an important role in the clumps’ small scale structure formation. Finally, due to the improved angular resolution compared to ATLASGAL, we are able to identify 27 massive quiescent cores with Mcore > 100 M within 5 kpc; these are massive enough to be self-gravitating but do not yet show any sign of star-formation. This sample comprises, therefore, promising candidates of massive pre-stellar cores, or deeply embedded high-mass protostars. Conclusions. The submillimeter observations of the massive clumps that are weak or completely dark at 24 μm reveal rich filamentary structures and an embedded population of compact cores. The maximum core mass is likely determined by the self-gravity of the clump. The rarity of massive pre-stellar core candidates implies short collapse time-scales for dense structures.


Introduction
Despite the significant influence of massive stars on their natal environment and the Galaxy as a whole, their formation process remains poorly constrained. Massive stars (M 10 M ) are considerably rarer than solar-mass stars. They constitute less than ∼10% of the stellar initial mass function (IMF) in mass. The duration of their formation time is short, hence the deeply embedded early stages are difficult to identify. Adding to the complexity of the high-mass star-forming scenario is that massive stars form typically in clusters (Stahler et al. 2000;Lada & Lada 2003), yet, the mass assembly and the fragmentation of massive clumps into cores is still not well understood (for recent reviews see Tan et al. 2014, Motte et al. 2018).
In the last decade, Galactic plane surveys in the far-infrared and submillimeter regime such as the APEX Telescope Large Area Survey of the Galaxy (ATLASGAL, Schuller et al. 2009;Csengeri et al. 2014), the Bolocam Galactic Plane Survey (BGPS, Aguirre et al. 2011), the Herschel infrared Galactic Plane Survey (Hi-Gal, Molinari et al. 2010) and the JCMT Plane Survey (JPS, Moore et al. 2015;Eden et al. 2017) have proven to be excellent finding charts for identifying large samples of massive clumps that exhibit typical physical characteristics of > 10 3 −10 4 M , an extent up to ∼1 pc and a temperature of ∼20 K typically associated with giant molecular cloud complexes (e.g. Csengeri et al. 2016a;Svoboda et al. 2016;König et al. 2017;Elia et al. 2017;Urquhart et al. 2018). Massive clumps fragment into cores where star formation takes place (Motte et al. 2018). Investigating how the dense gas is structured within massive clumps, and how they fragment and form filaments and cores on a 0.1-0.3 pc scale (e.g. Motte et al. 2007;André et al. 2014) Article number, page 1 of 53 arXiv:1909.01011v1 [astro-ph.GA] 3 Sep 2019 A&A proofs: manuscript no. aanda_saboca_yuxin_v3_md_subm_rev2_subm is, therefore, an important step to understand the mass assembly process to form stars and clusters.
Due to the typically large, several kilo-parsec distance of high-mass star-forming regions, investigating their structure and fragmentation requires high-angular resolution interferometric studies (e.g. Bontemps et al. 2010;Palau et al. 2013Palau et al. , 2015Csengeri et al. 2017b;Beuther et al. 2018). Reaching physical scales of a few thousands to tens of thousands of au is necessary to resolve the typical Jeans length in these objects, which is 0.1-0.3 pc, assuming a temperature of 20 K and a volume density of 10 4 -10 5 cm −3 , characteristic of massive cores or clumps (e.g. Williams et al. 2000;Motte et al. 2007;Wienen et al. 2012;Urquhart et al. 2018). Mapping of large areas is not feasible with current interferometers in the (sub)millimetre wavelength range. However, ground-based telescopes with sensitive, large field-of-view bolometer arrays provide an efficient way to map large regions at a moderate resolution at short wavelengths (350 and 450 µm, i.e. ∼10 corresponding to a physical scale of 0.1 pc at d ∼2 kpc). This allows us to conduct a statistical study of fragmentation from clump to core scales (e. g. Motte et al. 2003;Minier et al. 2009;Ragan et al. 2013;Merello et al. 2015;André et al. 2016;Rayner et al. 2017, Heyer et al. 2018. We have, therefore, carried out observations towards more than 200 ATLASGAL selected sources at 350 µm with the Submillimetre APEX Bolometer array camera (SABOCA, Siringo et al. 2010) at the Atacama Pathfinder Experiment 12-meter telescope (APEX, Güsten et al. 2006). These observations achieve an angular resolution of 8. 5, a factor of > 2 improvement in terms of angular resolution compared to the ATLASGAL survey, allowing us to identify a population of cores with sizes of 0.1-0.2 pc embedded in clumps with distances d 5 kpc. The angular resolution achieved corresponds to an intermediate physical scale to dissect structures of massive star-forming clumps, which is a missing gap to link with typical interferometric studies. The sample is twice that of Merello et al. (2015) who targeted a sample of BGPS sources with the SHARC-II camera on the Caltech Submillimeter Observatory (CSO). Combining our results with ancillary submillimeter and infrared data, we estimate the cores' physical parameters and compare them with the properties of clumps in order to constrain their fragmentation and structure formation. The higher angular resolution compared to ATLASGAL allows us to distinguish between quiescent and star-forming cores and deliver a sample of starless or pre-stellar massive cores. This paper is organised as follows: in Section 2 we present the observations, data reduction, and the complementary data used. In Section 3 we present our results and the data analysis, including source extraction, and determination of physical properties of clumps and SABOCA compact sources. In Section 4 we discuss the statistics of the sample in terms of the fragmentation properties and the relation between clump and core properties. Our main results and conclusions are summarised in Section 5.

Source selection
Over 10 000 compact sources have been identified in the AT-LASGAL survey Urquhart et al. 2014), and the majority of them (8007) have distance estimates allowing their physical parameters to be determined . In this study, we use 350 µm SABOCA observations to investigate small scale structure of the clumps. In comparison with the extensive follow-up observations of the brightest submil-  Fig. 1. Distribution of the integrated flux density and bolometric luminosity-to-mass ratios (L/M) of all ATLASGAL sources (in blue filled contours, the darker region corresponds to higher density) , and the target sources observed with SABOCA (gray pluses and dots) at different distance ranges. The histograms in the right side and top panel show the 1-dimensional distributions of integrated flux density and L/M of all ATLASGAL sources (blue) and all target sources (gray), respectively. limetre clumps selected from the ATLASGAL survey using both single dish (Giannetti et al. 2014(Giannetti et al. , 2017Csengeri et al. 2016a;Kim et al. 2017Kim et al. , 2018, and interferometric observations (Csengeri et al. 2017b this study focuses more on a population of massive, but lower surface density clumps (with typically lower submillimeter flux density). Our targets have been selected to be weak or in absorption at 24 µm based on the Spitzer MIPS GALactic plane survey (MIPSGAL; Carey et al. 2009). This initial sample of 95 clumps has been complemented with sources at different evolutionary stages based on their mid-infrared color selection at 24 and 70 µm mostly having a bolometric luminosity to mass ratio (L bol /M) less than 10. In addition, we include the data for the brightest 25 sources from ATLASGAL survey.
Altogether we targeted 204 ATLASGAL clumps, which is the largest number of clumps targeted with SABOCA. Since we obtained maps of the targets, there are 371 other ATLASGAL sources covered in the fields, on average we have 2-3 ATLAS-GAL clumps per field. The total number of observed clumps from the GaussClump compact source catalog of Csengeri et al. (2014) is, therefore, 575. In Fig. 1, we compare the integrated flux density and L bol /M of our sample with the same quantities of the whole ATLASGAL sample where available . This shows that our sample is representative of the entire population of sub-millimetre sources, having a L bol /M range of 0.1 to 100 L /M . Many of these sources are good candidates to host objects in early stages of high-mass star and cluster formation (see for example Csengeri et al. 2017b). The sample also comprises, however, massive clumps in more advanced stages of high-mass star formation that are associated with bright midinfrared sources and ultra-compact Hii regions exhibiting L/M > 100 L /M . In the following, we perform all the analysis on the entire sample homogeneously, and clarify in the discussion when only focusing on the mid-infrared dark sources. We show the distance distribution of all sources in Fig. 2 for which a distance has been determined by Urquhart et al. (2018); these are primarily kinematic distances. Altogether, this provides us with a distance to 507 clumps, which corresponds to 88% of our sample.
The entire sample covers a distance range from 1 to 14.5 kpc (Fig. 2), corresponding to a physical resolution between 0.04 and 0.6 pc. The large range of linear scales (∼ 10) has an impact the observed morphologies and derived physical properties. To mitigate any possible observational biases we restrict our statistical analysis to a distance limited sample (d 2-4 kpc); this corresponds to physical sizes of 0.07 and 0.15 pc, and thus, reducing the range of linear scales to a factor of 2. The majority of the clumps are located within 5 kpc (314 out of the 507 clumps corresponding to 62% of the sample), 205 of which are located in 2-4 kpc. The more distant sources are mostly luminous OB clusters that tend to be associated with very active star forming regions, which belong to the 25 strongest submillimeter sources with ATLASGAL survey, such as the W51, W43 complexes (e.g. Ginsburg et al. 2015;Motte et al. 2014).

SABOCA observations and data reduction
We targeted the selected ATLASGAL sources using the SABOCA instrument installed at the APEX telescope which offers ∼8 angular resolution at 350 µm. We used a raster spiral mapping strategy for each source that mapped a 4 region centered on the source position. The observations were conducted in 2010 and 2012 with a median precipitable water vapor (PWV) of 0.3 mm and all less than 0.5 mm. The typical on-source integration time is ∼5 minutes, which gives a better than ∼160 mJy/beam noise level for most of the sources (Fig. 3). This corresponds to a mass sensitivity of 1.84 M at 3 kpc, assuming a dust temperature of 20 K, κ 350µm = 0.076 cm 2 g −1 , and a gas-to-dust ratio of 100 (Eqs. 1 and 2).
We used the BoA software (Schuller 2012) with standard procedures and iterative masking for the data reduction. The atmospheric zenith opacity determined from the radiometer in each science scan was used to correct for atmospheric opacity. These values were compared to the results of the skydip mea-surement and found to be consistent. Mars and standard calibrators, such as IRAS 13134-6264, were regularly observed and used as primary flux calibrators. Typical calibration uncertainties are within 20%. Pointing and focus were updated every 1-2 hours. The telescope pointing accuracy is within 2-3 .
We used the standard, iterative source-masking reduction procedures optimised for the faint and the bright sources, respectively. In the first step of the iterative processing, an initial map is obtained after opacity correction, and then the correlated noise and bad channels are removed. The resulting map is then smoothed by 2/3 of the beam size (yielding a resolution of 9.5 ) to achieve a better signal-to-noise ratio. This is then used as an input model for the following iterations to represent the real source structure. The residual flux is modeled by subtracting the model image from the initial map, with a cut-off level at a signal-to-noise ratio of 3. After adding the model back to the image, the process is repeated until the peak flux level remains unchanged. For the final maps, we applied 3 smoothing which gives a resolution of 8. 5 (original beam size ∼8 ). We checked the convergence of the data reduction process by examining the relative change of the peak and integrated flux density around the brightest pixels during the iterative process. The pixel size is 1. 85 in the final map.
In order to have a uniform and consistent noise level estimation for all fields, we use a wavelet-based estimator implemented in the skimage python package; this assumes a Gaussian noise distribution and calculates the standard deviation in each map (Donoho & Johnstone 1994). Before this estimation, we trimmed the edge of the maps, because of the increased noise level due to the non-uniform sampling at the map edges caused by the scanning strategy. We selected several sources to check the derived values and found them to be consistent with measurements made of an emission-free region of the map. The achieved rms noise distribution of all the sources is shown in Fig. 3. The mean noise level for the relatively weak sources (from projects M-0085.F-0046-2010, M-0090.F-0026-2012) is 0.12 Jy/beam with a standard deviation of 0.07 Jy/beam, and for the brightest sources (M-0085.F-0055-2010) it is 0.21 Jy/beam with a standard deviation of 0.10 Jy/beam. We find that the brightest sources, above 100 Jy/beam peak flux density, have up to a factor of 5 higher rms noise compared to the weaker sources (Fig. 3). Altogether the sources exhibit a large range of peak flux density from as low as 1.25 Jy/beam up to 600.4 Jy/beam, with an average of 52.0 Jy/beam and a median of 9.0 Jy/beam at 350 µm. In agreement with our initial selection from ATLASGAL, this shows that we have a large number of relatively faint and a few extremely bright clumps.

Ancillary data
To be able to determine the physical properties of the clumps, we rely on the far-infrared emission of the dust. For this, we use level 2.5/3 archival data from Herschel PACS and SPIRE at 70/160 µm, 250/350 µm, respectively, from the Hi-Gal survey (Molinari et al. 2010(Molinari et al. , 2016. The zero point offsets of the SPIRE data have been corrected by Planck-HFI data via crosscalibration 1 ; this accounts for the thermal background produced by the instrument itself. There are 6 fields not covered in Hi-Gal or at the edges of the maps that were excluded from the following analysis. We complement the HiGAL data with the combined ATLAS-GAL and Planck-HFI 870 µm data (Csengeri et al. 2016b). This data product has a considerably larger spatial dynamic range than the ATLASGAL data alone allowing the recovery of largescale structures, which can be directly compared with the emission probed by Herschel. To associate the sub-millimetre emission with mid-infrared sources, we also make use of the MIPS-GAL survey and its point source catalog (Gutermuth & Heyer 2015).

filaments and cores
We show examples of massive clumps at 350 µm observed by APEX/SABOCA in Fig. 4, where we also directly compare them with the 24 µm emission from the Spitzer/MIPSGAL survey probing their star formation activity at a comparable angular resolution of 6 . We detect emission at 350 µm towards all fields, typically tracing emission from the ambient cloud material, as well as from compact sources. The images for all the targeted sources are shown in App. A.
The observed fields have a range of peak flux densities spanning two orders of magnitude at 350 µm with a typical dynamic range of ∼100, however, this is noticeably lower for fields containing brighter sources. The high dynamic range and improved resolution allows us to conduct a detailed analysis of the overall morphology of the sample in a homogeneous way for the majority of the sources; this is particularly true for the mid-infrared weak sources due to the better sensitivity and dynamic range.
As revealed in Fig. 4, the morphology of the 350 µm emission exhibits a large variety including isolated compact spherical structures (Fig. 4a) and prominent elongated structures, which are often referred to as filaments (Fig. 4b,c,d). The structure of the source presented in Fig. 4d reveals a relatively straight fila-mentary morphology, however, we also find a network of connecting filamentary-like and branches, which are illustrated in Fig. 4b, c, for example. These structures are reminiscent of the large-scale filamentary structure of the ISM (e.g. . Visual inspection of all the maps in App. A indicates that the majority of the fields exhibit elongated, filamentary(-like) emission (67%). This either corresponds to low-intensity uniformly distributed material or consists of brighter emission with several compact sources associated with it. These sources are marked in Table 1 together with the properties of the identified substructures (see more detail in Sec. 4.3). In general, fields dominated by a single, bright source at 350 µm represent only a minor fraction of the sample.

Extraction of compact sources
We use the Gaussian decomposition algorithm Gaussclumps to identify compact structures embedded within the ATLAS-GAL clumps. 2 . Originally developed for identifying structures in 3D, position-position-velocity data cubes, Gaussclumps assumes that the resolved structures are Gaussian-shaped and iteratively subtracts the fitted structures from the map (Stutzki & Guesten 1990). Gaussclumps is the same tool used to extract the compact sources from the ATLAGSAL survey .
We set the initial guesses of aperture cutoff and aperture FWHM to 8. 5 and 1.1 times the angular resolution respectively. The initial guess for the source FWHM is also set to 1.1 times the angular resolution, and we require the identified substructure by Gaussclumps to have a peak flux larger than 3 times the rms. There are several other control parameters in the leastsquare fitting algorithm. The stiffness parameters s 0 , s c and s a adjust the tolerance between the observed intensity, maximum intensity and its position, respectively, with the corresponding fitted parameters. We performed a series of tests to see how the choices for the stiffness parameters affect the identified sources. We found that using 1, 1, 10 for s 0 , s c and s a results in a more stringent constraint on the fitted peak flux position, and seems to more robustly recover the prominent structures while avoiding the detection of spurious sources at relatively low signal-to-noise levels.
In total, we identify 1120 structures with Gaussclumps within our SABOCA maps, yielding on average 4-5 compact sources within the 4 maps. We list the parameters of the extracted compact sources in Table 1. In the following, we use the nomenclature commonly adopted in the literature (e.g., Williams et al. 2000, Motte et al. 2007, Liu et al. 2012a, where clumps refer to structures with sizes of ∼0.5-1 pc, and cores refer to ∼0.1 pc structures embedded within a clump. Since most of our analysis and discussions are based on the distance limited sample, in the following we use the term cores to refer to the compact sources extracted from the SABOCA 350 µm observations, since the majority of them have a FWHM size 0.2 pc. When discussing the properties of SABOCA 350 µm compact sources together with clumps, we use the term fragments instead, to indicate their relation. Y. Lin et al.: Fragmentation and filaments at the onset of star and cluster formation

Physical properties of clumps and cores
4.1. Dust temperature and H 2 column density maps on the clump-scale We performed a pixel-by-pixel fitting of the far-infrared spectral energy distribution (SED) to compute maps of the dust temperature, T d , and the H 2 column density, N(H 2 ). For this, we used PACS 160 µm, SPIRE 250 µm, and 350 µm maps and the combined APEX/LABOCA-Planck-HFI 870 µm data. We first convolved the maps to a common resolution of ∼25 and then gridded the data on to the same pixel grid. We performed SED fit of each pixel, using a modified-blackbody model: where in τ ν = µm H N(H 2 )κ ν , m H is the mean molecular weight and we adopt a value of 2.8, N(H 2 ) is the gas column density. We adopted a dust opacity law of κ ν = κ 300 µm ( ν 1000 GHz ) β , where κ 300 µm = 0.1 cm 2 g −1 is the dust opacity per unit mass (gas and dust) at a reference wavelength 300 µm and β is fixed to 1.8, which is the average value found towards the Galactic plane (Planck Collaboration et al. 2011). These parameters have been used for the Herschel Gould Belt Survey (André et al. 2010), the HOBYS  and Hi-Gal (e.g. Elia et al. 2017) key programs. This dust emissivity relation yields κ 870 µm = 0.0147 cm 2 g −1 , while the previous ATLASGAL studies, e.g. Csengeri et al. (2014), König et al. (2017) use a κ 870 µm of 0.0185 cm 2 g −1 , which corresponds to a factor of 1.25 difference in the mass estimates. In addition, for the clumps which have a fitted temperature larger than 45 K we additionally include the PACS 70 µm measurements in the final fits since in this temperature regime inclusion of the shorter wavelength better constrains the fit. We note that there is emerging evidence that β and T d are anti-correlated (e.g. Juvela et al. 2013), which means fixing β to a certain value will cause an uncertainty of T d which propagates into the mass estimate. More sophisticated modelling methods are required to accurately recover this anti-correlation (Juvela et al. 2013;Galliano 2018) by properly taking into account observational uncertainties, line-of-sight mixing of different components, etc. In addition, the dust opacity (reference value) can also increase in dense and cold cloud, due to grain growth (dust coagulation, e.g. Ossenkopf & Henning 1994, Planck Collaboration et al. 2011. In general, the mass estimates may have an uncertainty up to a factor of 2-3 due to possible variations of the dust opacity and emissivity index β. Using ATLASGAL and Herschel data, König et al. (2017) determined the properties of 109 ATLASGAL selected massive clumps from SEDs of mid-to far-infrared data. That study is based on the clump averaged integrated properties, while here we map the spatial distribution of the physical properties (e.g. dust temperature and gas column density). Examples of the resulting dust temperature and column density maps on the 25 grid are shown in the top panels of Fig. 5.

Dust temperature and H 2 column density maps on the core-scale
To obtain the same parameters at the scale of cores, we make use of the high-resolution SABOCA observations for the SED fitting. Since ground-based submillimeter observations are not sensitive to extended emission due to sky noise subtraction, the SABOCA maps do not recover emission from scales larger than ∼0.9-1.0 (∼ 60-70% instrument's field-of-view of SABOCA) (Siringo et al. 2009). Since the Herschel SPIRE 350 µm covers a similar frequency range ( Fig. 6) we can use these data to recover the extended emission. The sampled angular scales of the two data sets have an overlap between the ∼25 resolution Herschel 350 µm and the largest angular scales recovered by SABOCA. For the combination of the SABOCA and the SPIRE 350 µm data, we first calculate the color corrections due to the different shape of the transmission curves and nominal frequencies of SABOCA and SPIRE instruments. Following a similar procedure described in Csengeri et al. (2016a), we determine the central wavelength correction factor from SPIRE to SABOCA (F = 0.97) and color correction factors for both instruments (C SPIRE = 0.93, C SABOCA = 0.86). These factors were applied to maps made with each instrument before the combination procedure. The transmission curves and the relation between the assumed spectral index and the correction factors are shown in Fig. 6.
An additional complication is that 10 of the brightest sources are saturated in the SPIRE 350 µm images. We correct for this by interpolating between the saturated pixels assuming a 2dimensional Gaussian flux distribution around the center. We find that the interpolated images are, in general, consistent with the SABOCA images after they are smoothed to the SPIRE 350 µm resolution of 25 . Finally, for all sources, we linearly combine the SABOCA and SPIRE 350 µm images in the Fourier domain using the immerge task of the Miriad software package (Sault et al. 1995). Before the combination, we calculate a weighting factor from the overlapping spatial scales between the two dataset, which is then applied to scale the SABOCA flux to match that of the SPIRE 350 µm measurements. For more details on the procedure of the combination, we refer to Lin et al. (2016).
We use these combined maps together with the PACS 70 µm band, at a similarly high resolution of ∼10 after smoothing, to estimate the physical properties of the SABOCA sources, such as H 2 column density N H 2 , mass, and average dust temperatures T d . We do this by fitting SEDs to each pixel in the maps to trace the local heating sources and estimate the core-scale temperature and column density. The formulations used are the same as when fitting the 4 bands at a coarser resolution (Sec. 4.1). Assuming that the 4-band SED fits, in general, describe the bulk of the cold component in the coarser resolution maps, we scaled the 70 µm flux densities in order to remove the contribution of a second, hotter gas component often seen towards deeply embedded protostars and massive young stellar objects (MYSO). The scaling factor is defined as the flux ratio of the observed 70 µm emission map and the extrapolated 70 µm flux based on the SED fit with the 4 bands. This step is necessary because the majority of our sources are mid-infrared weak, thus dominated by cold dust which mainly emits at wavelengths longer than ∼100 µm. Hence our method of using only the 70 µm and the 350 µm data would lead to an underestimate of core masses, with flux at 70 µm predominantly originating from the deeply embedded heating sources.
We find that the scaling factors are generally less than 1.0 for most of our mid-infrared dark and relatively bright evolved sources. However, for the extremely bright OB cluster forming regions such as G10.6-0.4, G12.8-0.2 (ATLASGAL names: G010.6237-0.3833, G012.7914-0.1958) we find a ratio larger than one for the predicted 70 µm flux compared to the observed flux density. These sources are at the edge of prominent compact 70 µm emission features, and due to their more evolved nature, including the 70 µm flux in the SED fits without any scaling provides a better constraint on their temperature fit (e.g. Lin et al. 2016). Therefore, for those cases where the scaling factor exceeds 1, we use the original 70 µm fluxes. In the App. B we present two examples including one of these extremely bright OB clusters and the statistics of the scaling factors with the derived properties. The difference between the derived values using the original and scaled 70 µm flux density in the SED fits are generally within 20% for temperature and 50% for column density depending on the flux scaling factor, which are discussed in detail in Appendix B. The uncertainties of the fitted column density and dust temperature are within 20%. Examples of the 10 temperature and column density maps are shown in bottom panels of Fig. 5. Altogether we could perform a pixel-by-pixel SED fits for 198 of our maps, which includes 971 cores identified in the SABOCA maps.

Statistics of the physical properties of ATLASGAL clumps and SABOCA cores
We derive the physical properties, such as average dust temperature, mass, bolometric luminosity of the identified SABOCA compact sources, and their ATLAGSAL host clumps, from the pixel-by-pixel SED fitting described in Sec the FWHMs reported in Csengeri et al. (2014). Similarly, we calculate the clump masses by summing up the column densities (N H 2 ) of each pixel over the source area at distance, d (see Sec. 2.1), where A stands for the clump area defined by πR 2 . The radius is defined as R = σ, where σ = FWHM 2 √ 2ln2 . We estimate the total bolometric luminosity by integrating the fitted SEDs, where S ν is derived by the fitted SED curve described in Sec. 3.3.1, and is summed up within R = σ, as well.
To estimate the physical properties of the 350 µm SABOCA compact sources we follow the same procedure as described above using the corresponding parameters fitted by Gaussclumps and using the high-resolution column density and dust temperature maps (10 ). The physical parameters of the individual AT-LASGAL clumps and the SABOCA sources are listed in Table 2 and Table 3, respectively. We list the general statistics of the clump and core properties in Table 4 for all the sources and the distance limited sample.
The distribution of the physical properties of the clumps are shown in Fig. 7. The average mass of all of the clumps is 600 M . In total, 65% of the sources have > 100 M and the most massive clump has a mass of 1 × 10 4 M , which is one of the more distant sources in our sample. In addition, there are 130 clumps above 650 M , which is a mass limit used by Csengeri et al. (2014) as a threshold for a massive clump to potentially form high-mass stars. The average mass of all the SABOCA sources is 198 M , and for the distance limited sample it is 50 M , which is ∼3 times lower than the values of the clumps ( Table 2).
The mean clump FWHM size of all the clumps and those in 2-4 kpc is 0.68 ± 0.45 pc and 0.42 ± 0.11 pc, corresponding to the typically observed scales for Galactic clumps. The mean FWHM size is 0.31 ± 0.21 pc for all the SABOCA sources, and for the sources in 2-4 kpc the value is 0.19 ± 0.05 pc, corresponding to a typical core-scale.
We also calculate the surface density based on the average column density derived from SED fits, where Σ = µm H N(H 2 ) (g cm −2 ). In Table 4, we can see that SABOCA sources have typically larger surface densities on average than the clumps, indicating a concentration of dense gas at smaller scales. The majority of the clumps (65%) and cores (78%) have a mass surface density larger than 0.2 g cm −2 which is used to identify promising massive star progenitors by Butler & Tan (2012).
We find dust temperatures between ∼14 K and ∼50 K for the clumps, and on average we find similar values for clumps and cores. This is expected from our SED fitting method which is mainly sensitive to the cold component. The most active starforming clumps have an average dust temperature of > 40 K and correspond to the most massive OB cluster-forming regions in the Galaxy.
Due to our source selection strategy, we have a large fraction of clumps with low bolometric luminosity and a luminosity-tomass ratio of less than 10 (74%). Since a low bolometric luminosity together with a large mass is indicative of an early evo-Article number, page 7 of 53 A&A proofs: manuscript no. aanda_saboca_yuxin_v3_md_subm_rev2_subm   lutionary stage (e.g. Motte et al. 2007, Molinari et al. 2016), a large number of our targeted clumps are likely to be at early evolutionary stages. We indicate the L bol /M = 1 and L bol /M = 10 vertical lines in the luminosity-to-mass ratio distribution plot (Fig. 7); which show empirical limits from Molinari et al. (2016) for massive clumps in a starless or pre-stellar stage, and clumps hosting high-mass ZAMS stars, respectively. The L bol /M > 10 ratio can also correspond to a qualitatively different heating source reflecting the presence of deeply embedded UC-Hii regions (see Sec. 4.4; see also Kim et al. 2018). The most evolved clumps, with luminosity-to-mass ratios above 100, e.g. G34.3+0.2, G10.6−0.4, G333.6+0.2 (Campbell et al. 2004;Liu et al. 2010;Lo et al. 2015, respectively), are clumps residing in well-known extreme Galactic massive star-forming regions.

SABOCA sources
We use the MIPSGAL point source catalog at 24 µm (∼6 ; Gutermuth & Heyer 2015) and the Herschel/Hi-GAL point source catalog at 70 µm (∼8. 4; Molinari et al. (2016)) to assess the star formation activity of the SABOCA 350 µm compact sources. These catalogs have a comparable angular resolution to the SABOCA observations. We also make use of the WISE point source catalog (11 at 22 µm; Cutri et al. 2012) to complement the mid-infrared surveys as MIPSGAL is saturated at >2 Jy, while the WISE point source photometry is reliable to ∼330.0 Jy flux density at 22 µm. We also used the existing RMS (Lumsden et al. 2013;Urquhart et al. 2013) and the CORNISH (UC-)Hii region catalogs (Kalcheva et al. 2018) to identify the sources associated with Hii regions that are likely to be saturated in mid-infrared images. These catalogs have a coordinate  range that covers in total ∼30% of our SABOCA (295 out of 971) cores. The higher angular resolution of the SABOCA data allows a better matching with these higher angular resolution ancillary catalogs compared to ATLASGAL, which is necessary to assess the star-forming activities in the cores.
We performed a cross-match between the position of the SABOCA sources and these catalogs with an angular separation limit of 8. 5 corresponding to the angular resolution of SABOCA maps. The distribution of the angular offsets between the SABOCA compact sources and their mid-infrared emission (Fig. 8, left panel) suggests a good positional correlation between the mid-infrared sources and the 350 µm SABOCA peak positions. The dispersion of offsets in the range of 2-4 , which is consistent with the pointing accuracy of the telescope.
We use the association with UC-Hii regions, 24/22 µm or 70 µm point sources to distinguish between star-forming and quiescent cores and assess the statistics of their physical properties. Due to the varying diffuse mid-infrared background emission, the completeness of these catalogs is, however, not homogeneous. The sensitivity limit of the MIPSGAL 24 µm survey (5σ, ∼10 mJy) can probe down to 18 L protostars within 15 kpc following the calculation of Motte et al. (2007) based on IRAS colors from Wood & Churchwell (1989). This suggests that even intermediate-mass protostars could be revealed throughout our sample. However, as discussed in Gutermuth & Heyer (2015), the MIPSGAL catalog completeness is strongly dependent on the spatial variation of the local emission, therefore the sensitivity limit does not reflect the level of the completeness towards all sources. The sensitivity limit of the PACS 70 µm band of 0.2 Jy (Molinari et al. 2016) corresponds to a luminosity of ∼25 L at 5 kpc (Dunham et al. 2008). This similarly suggests that at least intermediate-mass protostars should be detected at 70 µm, if neglecting the spatial variation of the completeness level and source confusion.
In all the 1120 compact sources identified from the SABOCA maps, of which 971 have physical parameters based on their SED fitting, 13 (1.3%) have UCHii region counterparts. In the remaining sources, 184 (19.0%) are directly associated with 24/22 µm compact sources. In addition, we also have 87 (8.9%) sources that spatially coincide with regions of saturated 24 µm emission, but without 22 µm or 70 µm counterparts which are ignored in the following analysis. Of the 184 sources that have 24/22 µm compact sources, 18 do not have a 70 µm compact source counterpart. Visual inspection of these 18 sources shows that they are all associated with faint or extended 70 µm emission, and hence they are likely to be below the completeness limit of the PACS 70 µm catalog. Among the remaining sources, 142 (14.6%) have a compact source counterpart only at 70 µm. The remaining 545 (56.1%) sources are either dark at 24/22 µm or are associated with extended or diffuse 70 µm emission which is not included in the compact source catalog. To summarise, we have used the various surveys described above to classify 884 cores; 545 cores are found to be quiescent (64%) and 339 cores to be as star-forming (36%).
Dust extinction may prohibit detection of deeply embedded 24/70 µm compact sources. We calculate the dust opacities at 70 µm based on the derived column densities for quiescent cores with our assumed opacity law (see in Sec. 4.1). We find that 90% of the calculated opacities are lower than 1 with a 95% quartile level of opacity ∼2.4, indicating that dust extinction does not significantly affect our classification. However, the effect of Article number, page 9 of 53 A&A proofs: manuscript no. aanda_saboca_yuxin_v3_md_subm_rev2_subm source geometry and the uncertainty of dust emissivity may also influence the actual extinction.
To check whether the non-uniform distances of our sources affect the classification, we show the distance distribution of the star-forming and quiescent cores in Fig. 8 (right panel). The two distributions are similar with a KS-test yielding p-value = 0.31 and D = 0.08 (p-value > 0.05 and D < D α at a significance level of α = 0.05), which means that there are not more quiescent cores appearing at nearer distances than star-forming cores, suggesting that the different distances do not bias our classification. However, we caution that this similarities in the distributions could be due to the presence of a mutual cancellation of two opposite biases: at larger distances, the objects identified will tend to be more massive and hence more likely to be star-forming already. This effect will be counteracting the decreasing likelihood of detecting mid-infrared compact sources. In the following, we restrict our analysis to distance-limited sample, or samples in certain distance bins, so that the results are less affected from the classification bias due to distance non-uniformity.

Discussion
In this section we compare the physical properties of the starforming and quiescent SABOCA cores (Section 5.1), and investigate their capability of forming high-mass stars (Section 5.2). We study the fragmentation properties and the clump structure in Section 5.3 based on a distance limited sample of 2-4 kpc. In Section 5.4 we identify a sample of massive quiescent SABOCA cores at < 5 kpc that are potential high-mass pre-stellar core candidates.

Physical properties of star-forming and quiescent cores
In Sec. 4.4 we classify the SABOCA cores into two categories according to their radio, mid-and far-infrared compact source associations: star-forming and quiescent. As discussed in the following paragraphs, our approach is more sensitive to the deeply embedded low-to intermediate mass protostars than using a single wavelength. A 24 µm point source associated with a compact 350 µm source suggests the presence of a local heating source, corresponding to one or more protostars. However, the absence of a 24 µm point-source does not necessarily mean that there is no star-forming activity, a high optical depth or mid-infrared confusion may inhibit the detection of compact 24 µm sources. In that case, being more sensitive to the outer regions of internally heated cores, the PACS 70 µm compact source association (e.g. Dunham et al. 2008;Ragan et al. 2013) is used to further pinpoint the star-formation activity. While it is clear that the absence of a 24-70 µm point source appears to exclude the presence of a massive YSO, early stage (even up to high-mass) protostars have been detected towards infrared quiet massive clumps Feng et al. 2016;Csengeri et al. 2018), and dense cores with luminosities as low as 400 L (Duarte-Cabral et al. 2013).
In Fig. 9 we show the distribution of the physical parameters of cores located at a distance of 2-4 kpc. This sample consists of 233 quiescent and 129 star-forming cores. We find that the star-forming cores have, in general, a modestly higher temperature (24.0 K±0.75 K) than the quiescent cores (18.9 K±0.19 K). We obtain a p-value < 0.001 for a KS-test confirming that the temperatures of the two populations are statistically different. However, we also note that there is a large fraction (43%) of star-forming cores have a relatively low dust temperature (< 20 K). There is also a small number of quiescent cores with temperatures larger than 25 K (5%). Visual inspection of these cores reveals that they are located close to extended nebulosities at 70 µm; these cores may, therefore, be star-forming cores that remain undetected due to confusion from the nearby bright source or genuinely quiescent cores with the warmer temperature caused by external heating.
The luminosity distribution of the SABOCA cores is presented in the bottom left panel of Fig. 9. This plot shows a similar picture as the dust temperature with the star-forming cores having higher luminosities. The KS test results in p-value < 0.001, again indicating a distinct distribution. However, we find a large number of quiescent sources with high luminosities of >10 3 L . The number of quiescent sources decreases to zero above L bol ∼10 3.5 L , which is equivalent to the ZAMS luminosity of a B2-B4 star. As discussed in the previous paragraph, these large values of the bolometric luminosities towards quiescent cores are either due to external heating or to a higher level of confusion and blending with nearby sources. So it is possible that some of the quiescent cores might be protostellar, however, the lack of a corresponding bright 24 µm source suggests that the star formation is still at a very early stage.
The mass distribution of the quiescent and star-forming sources is remarkably similar. Together with their difference in bolometric luminosities, this may suggest that the cores classified as quiescent are in an earlier stage of star formation, and would follow the same evolutionary path as the star-forming cores (Csengeri et al. 2017a, see also discussion in Sec. 5.2). The range of typical core masses is between 10 and 320 M , with an average mass of star-forming and quiescent cores of 53.1±4.2 M and 45.5±3.1 M , respectively. The KS test yields p-value = 0.26, indicating a similar mass distribution. The typical core masses for star-forming and quiescent cores are not significantly different. The star-forming cores have on average smaller size (0.15±0.002 pc in FWHM) compared to quiescent cores (0.16±0.003 pc in FWHM). The mean mass surface density of star-forming cores (0.50±0.03 g cm −2 ) is higher than that of the quiescent cores (0.40±0.02 g cm −2 ). The KS-test yields pvalue ∼ 0.012 (∼1%), indicating that the two distributions are different, with star-forming cores having higher surface density.
The similar mass distribution and relatively different surface density distribution of quiescent and star-forming cores indicate that, at the typical scale of ∼0.2 pc the quiescent cores have enough mass assembled to evolve into protostellar cores when contracting. Cores can be gravitationally contracting ever since they first appear, allowed by the presence of an unstable background (Naranjo-Romero et al. 2015). In the framework of dynamical evolution cores are clump-fed (see also Wang et al. 2010), which is supported by growing observational evidence for gas replenishment beyond the core-scale (e.g. Galván-Madrid et al. 2009, Csengeri et al. 2011, Peretto et al. 2013, Liu et al. 2015, Chen et al. 2019. In this case cores prior to star formation may be expected to have a lower mass compared to star-forming cores, however, the fact that quiescent cores are relatively more extended, and their aspect ratio (from Gaussclump) showing them to be less roundish compared to star-forming cores, may indicate the similar mass distributions do not contradict with such dynamical evolution scenario. In addition, some of the starforming cores may be in a state of gas dispersal (Fig. 11). Among the quiescent cores, the most luminous ones (L > 10 2.5 L ) have larger surface densities of 0.69±0.13 g cm −2 compared to their less luminous counterparts with 0.33±0.01 g cm −2 . The number of quiescent cores decreases significantly above ∼0.5 g cm 2 . This may suggest that when reaching a higher mass surface density regime the cores necessarily form high-mass stars, for which the formation time-scales are shorter than for low-and intermediate mass star-formation .

Fraction of massive star-forming cores
To estimate what fraction of the SABOCA cores are potential precursors or in the early stage of forming massive stars, we show the mass-radius plot and mass-luminosity diagram of all the samples in Fig. 10 and Fig. 11. Both of these plots show the distribution for the star-forming and quiescent cores in red and blue, respectively. For this analysis we only use sources located at 1-2 kpc, and 2-4 kpc distance ranges, which allows us to compare sources with similar size-scales. We briefly discuss sources more distant for comparison purposes.
The mass-radius plot (Fig. 10) shows no clear difference between the star-forming and the quiescent cores in mass scale. A substantial number of the cores are found above the empirical threshold for massive star-formation introduced by Kauffmann & Pillai (2010) suggesting that a large fraction of them are capable of, or are already in the process of forming high-mass stars. In particular, we cover a large fraction of quiescent cores that have a radius of ∼0.2 pc with masses of ∼100 M corresponding to a mass surface density of ∼0.2 g cm −2 , thus representing potentially young regions where massive star formation could occur (e.g. Butler & Tan 2012).
The derived power-law relation of M ∝ r α gives α=2.67±0.25 and α=2.60±0.16 for star-forming and quiescent cores in a distance range of 2-4 kpc, respectively. For sources in 1-2 kpc, the indices are somewhat larger, with α=3.00±0.62 and α=3.07±0.32 for the star-forming and quiescent sample, respectively. Moving to sources at 4-8, 8-12 kpc, more shallower slopes of ∼2.0 are found. These slopes are generally steeper than observed on the even larger clump-scale where a relation of M∝R 1.4−1.7 is derived (e.g. Urquhart et al. 2018). The trend of a steepening mass-radius relation at small scale (or smaller distance range) is also seen in, e.g., Ragan et al. (2013) towards cores in infrared dark clouds, and in Beuther et al. (2018) towards young massive star-forming regions at thousands of au scale (as reflected in their Fig. 11).
Observationally, the Larson's density-size relation (M∝R 2 ) could be due to a selection effect of a certain column density threshold, as frequently pointed out by previous works (e.g. Larson 1981, Ballesteros-Paredes & Mac Low 2002, Camacho et al. 2016). In our case, the more distant objects having smaller slope of power-law dependence on radius could be simply due to the fact that they correspond to larger physical sizes, and thus a larger contribution from their more diffuse envelope. The steepening of the mass-radius relation from slope∼2 to ∼3 correspond to a change from constant column density to constant vol-0.100 2 × 10 −2 Radius (1σ) (pc)  ume density. Considering that cores are usually embedded in a gaseous environment, the slopes of α=3, 2, 1 can result from different environments such as (spherically) uniform medium, a two-dimensional layer or a centrally concentrated filament, respectively (Myers 2009). This again indicates that at larger distances the amount of envelope gas might be influencing the slope, i.e. cores appear more "concentrated" if a diffuse layer is involved. On the other hand, the different power-law forms of M∝R 2 or M∝R 3 might indicate different modes of accretion (Hennebelle 2012), either dominated by turbulence from larger scale or the proto-cluster's self-gravity. Hence the steepening of the mass-radius relation may be reflective of the successively prevailing role of self-gravity at smaller scale (see also Camacho et al. 2016).
The luminosity and mass distribution for the star-forming versus quiescent cores is shown in Fig. 11. The evolutionary tracks for different final stellar masses and initial envelope masses (André et al. 2008;Molinari et al. 2008) are also included in this plot. These tracks have two main components, a vertical track that represents the main accretion phase and a horizontal track that indicates the clump gas dispersal phase. Where these lines join forms an apex that effectively defines the start of a star's zero-age main sequence (ZAMS) lifetime. The majority of the quiescent cores are below this apex and are, therefore, in the early stage of the evolution. Some of the star-forming cores, on the other hand, may be in a phase of gas dispersal. The brightest star-forming cores are clusters around or above the apex, indicating that many have reached the ZAMS stage and the envelope clean-up phase has probably begun (Molinari et al. 2008). Although scattered, the two distributions for quiescent and star-forming cores are clearly distinct for core masses larger than a few tens of M , with star-forming cores having larger luminosities. A linear regression fit between the log-log mass and luminosity give L∝M 0.97±0.06 and L∝M 1.58±0.16 for quiescent and star-forming cores in 2-4 kpc, respectively. Similarly drastic differences in power-law slopes for the sources in 1-2 kpc distance are found, with L∝M 0.76±0.16 and L∝M 1.37±0.23 for quiescent and star-forming cores. Moving further to sources at 4-8 kpc, the difference in slopes gets smaller, with quiescent cores having L∝M 1.20±0.08 and star-forming cores L∝M 1.40±0.13 . Within the uncertainties, for clumps a consistent relationship is found between luminosity and mass with L∝M 1.3 (e.g. Molinari et al. 2008;Urquhart et al. 2018). The less pronounced difference in the mass-luminosity scaling relation of these two populations with larger distances can be explained if the mid-infrared source association with the submillimeter compact source becomes less robust with increasing distance to distinguish genuine quiescent cores, but also because more distant sources have more ambient gas, as mentioned before, which contributes to the mass but not much to the luminosity. The latter case is consistent with the fact that for sources more distant than 8 kpc, shallower slopes are found for quiescent and star-forming cores as L∝M 0.81±0.13 and L∝M 0.91±0.20 . In line with our previous results, i.e. the mass distribution and the mass-radius relation of the quiescent cores and star-forming cores, the luminosity and mass diagram with evolutionary tracks also suggests that a significant fraction of these cores can form a high-mass star, should they eventually collapse on a single object.  In Fig. 12 we show the mass distribution of the cores (∆N/∆log(M), André et al. 2014), of the SABOCA cores at < 5 kpc. The completeness level of our sample is not straightforward to assess due to the varying sensitivity, and the different dust temperatures in each field. We estimate the mass completeness use the largest noise level (0.18 Jy/beam) for sources at d < 5 kpc, and assume a dust temperature of 15 K, which is the lowest value we have for the core population. This allows us to obtain a conservative estimate of the mass completeness limit of cores in the sample. With a flux density of 3σ and a source size close to the beam size, the mass completeness limit is approximately 41.4 M . We fit a power-law to the cores at d < 5 kpc above 41.4 M , and find a power-law index of -1.89±0.39. Although the fit has large uncertainty, the slope is larger than the Salpeter IMF (−1.35), and is also steeper than recent findings of mass distribution of cores in higher angular-resolution studies of the early-stages of high-mass proto-clusters (e.g. Csengeri et al. 2017b;Motte et al. 2018). This could be explained by the fact that our study is more limited by confusion due to the poorer angular resolution; the slope is fitted in a higher mass range than these works suggesting that there could be intrinsically less massive cores at this larger scale (0.1-0.2 pc). On the other hand, the derived core mass distribution may be described better by a lognormal than a power-law distribution, as shown by previous studies (e.g. Peretto & Fuller 2010). Indeed we also find that moving the cutoff mass to a higher value would result in a steeper fitted slope. We also note that different source extraction methods may result in different core mass distributions (e.g. Cheng et al. 2018) and such results should be interpreted with caution.

Association of cores to clumps: fragmentation level
Here, we investigate the hierarchical structure of the clumps by assigning the SABOCA cores (child fragments) to their parent clumps in which they are embedded. We focus on the 971 SABOCA sources that have physical properties derived in the analysis here. Because the Jeans length is typically resolved up to 4 kpc, we again consider sample in 2-4 kpc distance range, while the sample located in 1-2 kpc are added in some plots for comparison purpose. For each parent and child structure, we generate a mask based on their FWHM size from Gaussclumps and assign the child fragments to their parental clumps based on their overlapping area as illustrated in Fig. 13. Clumps located at the edge of the SABOCA maps are omitted from this analysis since we may not cover the full extent of the clump. We calculate the fragmentation level in two ways: exact N mm corresponds to the exact sum of the probability of overlapping areas of child fragments within the extent of the parent clump; the other is rounding N mm , which assigns each fragment to the clump with which it has the largest overlapping area. In short, the exact N mm considers fragments overlapping with multiple clumps.For example, fragment 3 in Fig. 13, which overlaps with two clumps giving 0.20 and 0.33 to the two clumps' exact N mm while it is counted to the central clump as 1 in rounding N mm . In this approach, the sum of exact N mm equals that of rounding N mm while for some clumps, rounding N mm may be zero if a fragment overlapping with it has a larger proportion of area inside another clump.
We allocate 874 fragments to 444 clumps for the whole sample, with the remaining ∼100 SABOCA sources lack of parental clump structure, and 353 fragments to 161 clumps in the 2-4 kpc distance bin, 77 fragments to 37 clumps in the 1-2 kpc dis-tance bin; consequently, each clump average ∼2 fragments. This is similar to what Merello et al. (2015) find going from 33 to 8.5 resolution. The thermal Jeans mass and Jeans length for a clump with a temperature of 20 K and a volume density of 10 4 -10 5 cm −3 , is 2-6 M and 0.1-0.3 pc, respectively Palau et al. 2015). Even at a later stage with a typical temperature of 30 K, the Jeans mass is only a factor of two higher, and the Jeans length is only 20% larger. Therefore, we resolve the hierarchical fragmentation of clumps to cores, on the scales of 0.1-1.0 pc. Considering the typical clump properties of our sample, our results suggest that the number of fragments is broadly consistent with the Jeans length that we resolve here: the number of fragments is consistent with the number of predicted Jeans lengths within the clump's size. But the SABOCA cores have considerably larger masses compared to that of the thermal Jeans mass by a factor of 10-100. This has been found in similar studies of 0.1 pc cores or condensations (e.g. Motte et al. 2007, Wang et al. 2011, Csengeri et al. 2017b).
In Fig. 14 we investigate in detail the fragmentation level of the clumps as a function of their physical properties for sources in 2-4 kpc distance. For the analysis, we only take into consideration parental clumps that have at least one fragment lying entirely within the clump. The coloured points mark the fragment numbers according to the rounding N mm criterion and gray points the exact N mm with each pair linked with dotted lines. We find that although the different fragment allocation methods may result in quantitatively different fragmentation levels, the general trend of fragmentation is similar. Therefore, both methods give a qualitatively similar picture.
Comparing the level of fragmentation, i.e. the number of fragments with clump mass, L/M and the predicted number of fragments with Jeans scenario, we find a generally significant scatter in the fragmentation levels, while the left panel of Fig. 14 suggests that the number of fragments increases slightly with clump mass. The Spearman correlation coefficient is 0.39 (exact N mm vs. M clump , 0.38 of rounding N mm vs. M clump , p-value<0.001), indicating a moderate correlation. We hereby define the specific fragmentation level as a number of fragments per unit clump size, as N mm /R clump . The normalisation by clump size is necessary given our limited angular resolution improvement in identifying fragments, such that one tends to pick up more fragments when the clump size is larger. The middle panel shows the specific fragmentation level as a function of the clumps' luminosity-to-mass ratio, a proxy of the evolutionary stage (e.g. Molinari et al. 2008;Csengeri et al. 2016a;Urquhart et al. 2018). There is a trend of subtle increase in the specific fragmentation as a function of L/M, with a Spearman correlation coefficient of 0.24 (exact N mm /R clump vs. L clump , 0.29 of rounding N mm /R clump vs. L clump , p-value∼0.015). We caution that a higher L bol /M ratio could also be due to that the most massive (proto)stars are forming within the clump (Ma et al. 2013) rather than an evolved stage characterised by gas dispersal. Finally, comparing the number of fragments with the thermal Jeans fragmentation prediction, we examine the correlation between specific fragment level with clump gas density (middle right panel of Fig. 14), and find a Spearman correlation coefficient of 0.41 (exact N mm /R clump vs. ρ clump , 0.45 of rounding N mm /R clump vs. ρ clump , p-value<0.001). Based on the clump's temperature and gas density, we also find a moderate correlation between the fragment level with the predicted Jeans number (N Jeans = M clump /M Jeans ), with a Spearman correlation coefficient of 0.31 (exact N mm vs. N Jeans , 0.32 of rounding N mm vs. N Jeans , p-value∼0.001). We also notice that the clumps considered here have a positive correlation of their gas volume density  and temperature (T ∝ ρ 0.22 ), corresponding to the gas equation of state (EOS) having a polytropic exponent of γ∼1.22, close to the upper limit relevant for Galactic molecular clouds (Spaans & Silk 2000). The increase of temperature with density means that the Jeans mass decreases slowly with density than isothermal gas, favouring less fragmentation. If compared with an assumed initial temperature of 20 K, i.e. the clump temperature when fragmentation happened (e.g. Palau et al. 2015), a stronger correlation between observed fragmentation level with predicted Jeans number is found, with a Spearman correlation coefficient of 0.40 (exact N mm vs. N Jeans , 0.42 of rounding N mm vs. N Jeans , p-value<0.001). On smaller scales towards more evolved massive clumps, it seems Jeans fragmentation is at work (e.g. Palau et al. 2015, Beuther et al. 2018. It is possible that turbulence dominates the fragmentation process at larger scale (e.g. Zhang et al. 2009, Wang et al. 2014 to induce a larger critical mass. On the other hand, the time delay of fragments having Jeans mass may be explained within the global hierarchical collapse scenario (Vázquez-Semadeni et al. 2019), since the density of the fragments must grow to sufficiently high contrast, and while this happens, the Jeans mass in the contracting clump will have decreased. This implies that the fragment masses, especially in early stage clumps, will in general be larger than the average Jeans mass measured for the clump.
Since Gaussclump is a non-hierarchical source extraction method, the association of child structures to parental structures suffer from ambiguity, which introduces a source of uncertainty in the fragmentation level and the correlations with properties of the parental clumps. Here we provide two set of fragmentation levels as a first attempt to deal with the association uncertainties. As a further benchmark, we adopt a completely independent method of source extraction (dendrogram, Rosolowsky et al. 2008) which is not hampered by the association issue, and present the same correlations between fragmentation level with clump properties in Appendix C. We find that the moderate correlations we derive here are also present in the dendrogram analysis, with some of them showing slightly tighter correlations, i.e. The relation between cluster mass and maximum stellar mass in indicated in red dashed-dotted line, which is an analytical relation from Weidner et al. (2013), and is scaled up by 3.3 assuming a 30% star-formation efficiency. Left and middle plots share the same color bar, with clumps color coded according to their sizes. Right: Surface density of the most massive fragment as a function of clump surface density. Vertical line marks the 0.2 g cm −2 empirical limit from Butler & Tan (2012). The blue dashed line shows the result of a linear fit to sources in 2-4 kpc distance range of logΣ fragments = 1.03logΣ clumps +0.43.
Studies of smaller, but more homogeneous samples of massive cores and high-mass protostars based on higher angular resolution observations, e.g. thousands of au, did not find a clear correlation between the core properties and fragmentation level, except that a larger central volume density, a larger clump mass may lead to a larger number of fragments (Palau et al. 2015). Although we discuss here the core scale and a considerably larger sample, we can only conclude that there is moderate indication of Jeans fragmentation at work, and there is no trend of fragmentation level with clump evolution. It is, however, possible, that the overall sample covers a too wide range of Galactic clumps to reveal which clump properties could influence the fragmentation level. In addition, our source selection may not be homogeneous for massive clumps with luminosity-to-mass ratio range larger than 10 L /M . In Fig. 15 we illustrate the relationship between the most massive fragment and clump properties for sources in 1-2 kpc (triangles) and 2-4 kpc (circles) distance bins. For these plots, we again only include cases in which the parental clump has at least one fragment lying entirely within the clump. Comparing the mass of the most massive fragment with the clump L bol /M ratio, it seems that more evolved clumps with a larger L bol /M ratio (Fig. 15, left panel) only host core masses above ∼20 M . We hereby derive the correlations and regressions only for sources in 2-4 kpc distance range, and sources in 1-2 kpc distance range are shown in the figure for comparison purpose. We find a strong correlation between the clump mass and the mass of the most massive core. This correlation is statistically significant and has a Spearman coefficient of 0.92 and p-value less than 0.001. This clearly indicates more massive cores tend to form in more massive clumps. A linear fit in the logarithmic scale yields a relation of logM mmf = 0.96log M clump -1.18, which corresponds to M mmf = 0.31 M 0.96 clump . This relation is close to a fraction of 30% of the clump mass is assembled in the most massive fragment. In App. C we demonstrate this strong correlation is present when dendrogram source extraction method is applied, regardless of the distinct concepts behind these algorithms in recognising cloud structures. As a comparison, Merello et al. (2015) find that 19% of the clump mass resides within the total dense substructures seen at 350 µm, when going from 30 to 8. 5 resolution maps. We note that the exact values of this fraction depend primarily on the relative resolution differences in defining the parental structure and substructures. We also find that the surface density of the most massive fragment correlates well with the clump surface density, and has a fitted relation of logΣ mmf = 1.02log Σ clump + 0.42, close to Σ mmf ∼1.5Σ clump . This indicates that the gas in the cores is denser compared to their parental clumps.
With the approximate relations between clump and its most massive fragment Σ mmf ∼1.5Σ clump and M mmf ∼0.31M clump , assuming the clump density profile follows a power-law form as a function of radius n(r) ∝ r −k ρ , a density structure of n(r) ∝ r −1.5 is obtained, equivalent to that the clump enclosed mass profile is M(r) ∝ r 1.5 . The self-similar density profile of n(r) ∝ r −k ρ with k ρ in range of 1.6-1.9 is obtained in gas spheres of high density from simulations of self-gravitating supersonic turbulence (Collins et al. 2012, Murray & Chang 2015. In particular, Murray & Chang (2015) suggests that the n(r) ∝ r −1.5 is an attractor solution at clump inner radii that is reached over clumps' selfsimilar gravitational collapse, when the density is then independent of time, i.e. n(r, t) → n(r). Furthermore, the clump density power-law forms can be directly linked with the power-law tails seen in molecular cloud column density probability distribution functions (N-PDF), lnN ∼ N −s , with s = 2/(k ρ -1) (Federrath & Klessen 2013). k ρ ∼1.5 thus corresponds to an N-PDF slope of -4. Lin et al. (2017) proposed a tentative critical value of the N-PDF power-law slope of -4 to separate massive star-forming cloud intermediate evolutionary stages, which have a shallower slope, with early or late stages, by analysis of 10 massive starforming clouds from infrared dark stage to evolved stage with embedded luminous OB clusters. Since most of the clumps in consideration here are of L/M < 10, which are in relatively early evolutionary stage (Molinari et al. 2008), this small scale clump density distribution is in accord with the evolution trend of cloud N-PDF, which are compatible indirect observational evidence of cloud multi-scale collapse, as in global hierarchical collapse scenario (Vázquez-Semadeni et al. 2019).
In middle panel of Fig. 15 we also plot the mass of the most massive star in a cluster (M max ) as a function of the cluster mass (M ecl ) (Weidner et al. 2013) scaled up by 3.3 times, together with the relation of masses between the most massive fragments and parental clumps. Here we make the simple assumption that the mass of most massive stellar member formed in the most massive fragment is proportional to the fragment mass with a constant SFE of 30% regardless of the mass-scale. The relation between m max and M ecl indicates that the mass of the most massive star systematically changes with the cluster mass in stars, and is incompatible with a scale-free IMF (Weidner et al. 2013). The sampling procedure that introduces such a relation also suggests star formation is highly regulated, with low-mass stars forming until feedback from the massive stars is able to prevail against the gravity dominated formation process (Weidner & Kroupa 2006). The fact that the formation of low-mass stars starts early and continues until massive stars are formed is seen in the numerical simulations of cloud formation and collapse by converging flows, over a time span of a few Myr (Vázquez-Semadeni et al. 2017). The relation of the most massive fragment and clump mass is compatible with the m max − M ecl relation, with the most massive core likely to be the progenitor of the most massive stellar member and perhaps fragmenting into somewhat less massive cores which may also remain unresolved at present, indicating the reservoir clump mass to be essential in regulating the mass distribution at smaller scales. The larger deviation of our observed most massive core-clump mass relation to m max − M ecl in larger mass range is understandable since we do find more massive objects tend to have more fragments as discussed before. With higher angular resolution observations we might resolve the bending of the most massive core-clump mass relation. Numerical simulations of rotating massive cloud cores with radiative feedback have indicated that fragmentation-induced starvation could be a possible scenario that set the limit of the mass of the most massive stars (Peters et al. 2010).
The higher resolution 350 µm observations with SABOCA compared to the 870 µm LABOCA maps of ATLASGAL offer the possibility to better distinguish between star-forming and quiescent cores. We take this opportunity to investigate in more detail the brightest SABOCA sources that are quiescent and lack embedded compact sources at 24 and 70 µm. To select the most massive cores, we use our mass estimates measured in a radius of 1σ, corresponding typically to a mass measurement within an FWHM diameter of 0.17 pc for a 0.2 pc core. We select cores that are located at d<5 kpc and have M 1σ >100 M , assuming that they are capable of forming at least one early B or O-type star. Among all SABOCA cores, we find 87 sources that fulfil this distance and mass criteria, of which 34 (37%) are not associated with any embedded compact sources at 24 and 70 µm. The fraction of star-forming versus quiescent cores is higher in this high-mass regime than towards all sources within 5 kpc where 58% are found to be quiescent. Considering only the most massive cores the relative fraction of star-forming versus quiescent cores is significantly higher, which is consistent with other studies (e.g. Csengeri et al. 2014;Urquhart et al. 2018) suggesting shorter formation timescales for the most massive clumps. A quiescent massive core with > 100 M within a typical radius of 0.085 pc has a volume density > 7.5 × 10 5 cm −3 , and a mass surface density of 0.92 g cm −2 assuming a uniform density distribution. However, most of these sources typically have non-flat flux distribution, suggesting a steeper density profile, as it is the case for their embedding clump. The bulk of the gas is cold with dust temperatures between 14 and 25 K, and on average of 18 K. Like the overall population of quiescent cores, the massive quiescent cores also exhibit lower bolometric luminosities between 60 and 2300 L .
To investigate whether these massive quiescent cores could qualify as candidates for extremely elusive and rare high-mass pre-stellar cores (Csengeri et al. 2017a;Motte et al. 2018), we visually inspected these candidates, and removed 7 cores from the final list that are too close to and blended with a bright source at 24 µm. This left us with a final list of 27 sources (Table 5). Being embedded in a larger scale clump, these sources are typically surrounded by other cores, some of which are lower mass and/or star-forming. We do not find any isolated or single candidate high-mass pre-stellar core above the > 100 M mass limit. An example of the massive quiescent core embedded in clump structure with lower-mass counterparts is shown in Fig. 16 We searched for available ancillary higher angular resolution archival data, that would help us to investigate the nature of these cores further, and found that G333.1298-0.5602 and G333.4659−0.1641 have been observed at high angular resolution as part of the SPARKS project using ALMA (Csengeri et al. 2017b. These studies suggest, however, that the brightest fragments on a 2000 au scale in these sources are high-mass protostellar objects rather than pre-stellar cores. To further probe the star formation activity of these cores we examined ancillary data on molecular tracers obtained with the IRAM 30m telescope described in Csengeri et al. (2016a), and the MALT90 survey observed with the MOPRA telescope (Foster et al. 2013) which are available for all but one of the sources. These observations, however, have a coarser angular resolution than the SABOCA maps are, therefore, only indicative as an independent probe of their star formation activity.
Similarly to Motte et al. (2007), we first investigate here the velocity profiles of the SiO (2-1) line. SiO is a widely used tracer for shocks, in particular, it becomes abundant in fast shocks associated with jet activity (e.g. Schilke et al. 1997;Codella et al. 1999). Therefore, high-velocity line wings could be indicative of the presence of jets and outflow activity related to embedded protostars, however, it might also originate from a nearby more evolved core if the resolution is lower than our SABOCA maps. We use here the pointed observations of Csengeri et al. (2016a) and the MALT90 maps (Foster et al. 2013), where we extracted and averaged the spectra towards the core position within the MOPRA beam (∼38 ). For the MOPRA data, these figures are shown in App. C. Except for the cores GS351.4379+0.6537 and G351.4416+0.6534 we find no clear evidence of a broad velocity component in this line towards our candidate high-mass prestellar cores, however, the lack of SiO line-wings could also be due to sensitivity limitations. Where available (6 sources), we checked the IRAM 30m observations for the CO (1-0) line covered by our spectral survey described in Csengeri et al. (2016a) that could be a more sensitive probe for high-velocity outflowing gas. Although the line shape is strongly affected by absorption from the reference position, there is no clear evidence for highvelocity emission that could be related to outflowing gas in this tracer either. The observed line wings of the SiO (2-1) transitions towards GS351.4379+0.6537 and G351.4416+0.6534 are likely due to blending with a star-forming core within the beam of the MOPRA observations. From the remaining 24 sources, 13 (54%) have detections in the SiO (2-1) line showing a narrow SiO component and altogether 11 sources show no detection in the SiO (2-1) line at all. To conclude, the SiO (2-1) line does A&A proofs: manuscript no. aanda_saboca_yuxin_v3_md_subm_rev2_subm  Csengeri et al. (2016a), c : possible contamination by a nearby bright source, d : within the same beam as the core above. In the last two columns, sources without observations are left blank; "y" stands for confirmation and "-" otherwise.
not provide an indication for deeply embedded ongoing star formation activity.
To make use of the available spectroscopic data, we also investigated whether infall signatures on the clump scale could be revealed towards our sample. While Csengeri et al. (2016b) suggests that most of the massive clumps are gravitationally unstable, spectroscopic signatures of infall (e.g. Myers et al. 1996;Mardones et al. 1997;Wu & Evans 2003) could provide an indication that the clump itself is collapsing (see, however, Smith et al. 2013). To search for an infall signature, i.e. red-shifted self-absorption of optically thick lines, we use the HCO + and H 13 CO + (J=1-0) lines, and where detected, we used a single Gaussian fit to the H 13 CO + line to extract the v lsr of the clump. Although a proper interpretation of these line profiles would require dedicated modelling, visual inspection of the spectra suggests possible infall motions towards 16 sources (60%). The information obtained from molecular line data is listed in Table 5.
Considering a typical star formation efficiency of 20-40% (Tanaka et al. 2017), the SABOCA cores with >100 M are good candidates to form at least one massive star. The absence of 24-70 µm emission, together with the indications from SiO/HCO + /H 13 CO + tracers suggest that except for two cores, there is no further indication for star formation activity related to jets and outflows in the sample based on the SiO (2-1) line, while more than half of the clumps hosting half of the cores shows indications of the presence of global infall motions. We identify 7 sources where the SiO (2-1) line is not detected and there is no indication for infall. On the clump scale, these sources could be considered as starless massive sources. However, we cannot exclude on-going low-mass star-formation when the core would fragment and form only a low-mass cluster. What physical processes determine the possibility for a massive core to form highmass stars is not yet clear.
We conclude that considering all of the available low angular-resolution data, our sample of candidate high-mass prestellar cores is robust, although deeply embedded high-mass protostars have been detected towards mid-infrared quiet massive cores and clumps Wang et al. 2011;Ohashi et al. 2016;Csengeri et al. 2017b). High angular resolution observations are needed to confirm their nature.

Conclusions
We present the largest sample of Galactic clumps studied at a moderate resolution of 8. 5 at 350 µm observed with APEX/SABOCA. By combining with available Herschel and LABOCA observations from ATLASGAL, we analyze the core properties derived from SED fits and investigate the fragmentation as a function of the physical properties of clumps. Our major findings are as follows: of morphology from compact, relatively isolated sources to complex filamentary structures with branches. We find that the majority of the targeted sources exhibit a filamentary structure emission. Fields dominated by a single, bright source at 350 µm represent only a minor fraction of the sample.
2. We identify 1120 compact sources at 350 µm using Gaussclumps. We estimate the physical properties towards 971 SABOCA sources from SED fits. The average FWHM size is 0.32 pc, dust temperature is 21.6 K, average mass is 198 M and average surface density is 0.40 g cm −2 . Among them, 405 cores are located in a distance range of 2-4 kpc; these have an average FWHM size of 0.19 pc, an average dust temperature of 21.8 K, average mass of 52 M and average surface density of 0.40 g cm −2 . We find a systematic shallowing of slopes of the cores' mass-radius relation as a function of distance, from ∼3 to ∼2.
3. Using the presence of 22-24 µm and 70 µm point sources, we distinguish between star-forming and quiescent SABOCA sources. The majority of the sources lack mid-and farinfrared counterparts (56.1%) and are classified as quiescent cores. For the distance-limited sample (2-4 kpc), we find that the quiescent cores have a similar mass range as the star-forming cores but are somewhat more extended and hence less dense, and of lower dust temperature (0.16 pc, 0.40 g cm −2 ,18.9 K) than the star-forming cores (0.15 pc, 0.50 g cm −2 , 24.0 K).
4. We find on average each clump hosts two cores, which is broadly consistent with the length scale of thermal Jeans fragmentation. The core masses are, however, a factor of 10-100 times larger than what the thermal fragmentation would predict. The observed number of fragments is moderately correlated with the clump density and the number of fragments predicted by pure thermal Jeans fragmentation inferred from the clump properties (correlation coefficient of 0.4, p-value∼0.001). We do not find strong indication that a larger number of cores would form as the clumps evolve as traced by their increasing L bol /M ratio.
5. We find a strong correlation between the clump mass and the mass of the most massive core, close to that ∼30% of the clump mass is assembled in the most massive core. The surface density of the most massive fragment is increased compared to the parental clumps.
6. We identify 27 massive (> 100 M ) quiescent cores at d < 5 kpc that represent promising candidates of massive star-forming progenitors at earliest stages. Further highresolution observations are necessary to confirm the nature of these cores and settle whether they could host genuine high-mass pre-stellar cores.   In this paper we use Gaussclumps to identify the compact sources in the SABOCA emission maps. Gaussclumps is a non-hierarchical source extraction method, which assumes that cloud structures are composed of multiple Gaussian sources which can overlap. In contrast, another widely-used source extraction method, dendrogram, is a hierarchical source identification method, which is discreet in the sense that a given emission area (or pixel) can only belong to a single structure, no overlap is allowed among the structures (of same level). We explore here the dendrogram representation of the cloud structure seen in the SABOCA maps, and compare these results with Gaussclumps.
There are several parameters to control the dendrogram source extraction, min_value, min_delta and min_npix, representing the minimum threshold, the minimum significance of a structure compared to its merging level, and the minimum size of a structure. Similarly as we did with Gaussclump, we set the min_value to 3 times the rms of the SABOCA map, and min_npix as the number of pixels corresponding to beam size. For min_delta, we set this to zero, as an attempt to pick up the maximum number of sources as well as to be consistent with Gaussclump, which does not have a similar control parameter.
We run dendrogram on the SABOCA emission maps and apply the extracted structures to the 10 column density maps in order to derive the physical properties of these structures. We define the size scale of the structures extracted by dendrogram as the geometrical mean of the area, similarly, assuming this size scale is representative of the scale of the line-of-sight dimension, the average gas density (ρ clump ) is estimated. For the analysis, we only include structures that have closed contours within the map. We find that the relatively small FOV of the SABOCA maps provide an incomplete view of the structures at the lowest contours, especially for those showing a filamentary structure. Therefore, requiring a closed contour for the first structure is a condition that strongly limits our statistics.
With dendrogram, we define the fragmentation level as the number of leaves residing within the lowest parental structure (branch or trunk) on the map. Isolated leaves without any parental structures are omitted. For example in Fig. C.1, source G19.8835−0.5363 appears to be an isolated structure, while source G334.2000−0.2000 has 3 leaves (1 blue, 2 red) residing in the lowest branch (outer green contour) which gives one statistic. In general, compared to Gaussclump, dendrogram picks up in addition numerous structures with irregular morphology, i.e. the magenta leaves at the edge of G333.7700−0.2500; the number of leaves is smaller than number of compact sources identified by Gaussclump with the average size appears larger. The differences are intrinsically linked with their different ways of defining structures. Although a systematic comparison between these two methods is beyond the scope of this paper, here we benchmark the analysis in Sec. 5.3.1 to see whether the choice of source extraction method causes systematic differences in the results.
In Fig. C.2 the fragmentation level is compared with the clump properties, similarly as in Fig. 14. We again find a correlation between the fragmentation level with the parental clump mass, with a Spearman correlation coefficient of 0.42 (p-value<0.001), although we have smaller statistics here. We again define the specific fragmentation level as number of fragments normalised by clump size, as N mm /R clump . There is no correlation between specific fragmentation level with clump L/M ratio, with Spearman correlation coefficient of 0.018. Finally, comparing the fragmentation level with the Jeans fragmentation prediction, we find a Spearman correlation coefficient of 0.30 of specific fragmentation level with clump gas density (p-value∼0.05). As for the comparison of fragmentation level with number of Jeans mass predicted by clump dust temperature and gas density, a Spearman correlation coefficient of 0.48 is found  Mass of the most massive fragment as a function of clump luminosity-to-mass ratio. Vertical lines mark the luminosity-to-mass ratio of 1 and 10. Middle: Mass of the most massive fragment mass as a function of clump mass. Gray lines show the 30% and 100% proportion of clump mass; blue line shows the result of a linear fit on logarithmic scale to sources in 2-4 kpc distance range of M fragments = 0.87logM clumps − 0.21. Left and middle plots share the same color bar, with clumps color coded according to their sizes. Right: Surface density of the most massive fragment as a function of clump surface density. Gray line marks the line of equality; the blue line shows the result of a linear fit to sources in 2-4 kpc distance range of logΣ fragments = 1.51logΣ clumps +1.13. The different sizes of markers correspond to the size ratios between the parental clump size and most massive fragment size, as indicated in the legend, i.e. the smaller the size of a marker, the larger size difference between fragment and its parental clump, up to a factor of 5.
Article number, page 49 of 53 A&A proofs: manuscript no. aanda_saboca_yuxin_v3_md_subm_rev2_subm Article number, page 53 of 53