Open Access
Issue
A&A
Volume 560, December 2013
Article Number A76
Number of page(s) 38
Section Galactic structure, stellar clusters and populations
DOI https://doi.org/10.1051/0004-6361/201321626
Published online 09 December 2013

© ESO, 2013

1. Introduction

Stars form by gravitational collapse of high-density fluctuations in the interstellar molecular gas, which are generated by supersonic turbulent motions (e.g., Klessen 2011). Following the nomenclature of Williams et al. (2000), star formation takes place in dense (n ≳ 10⁴ cm⁻³) clumps, which are in turn fragmented into denser (n ≳ 10⁵ cm⁻³) cores, in which individual stars or small multiple systems are born. Given this nature of the star formation process, stars are born correlated in space and time, with typical scales of 1 pc and 1 Myr, respectively (see Kroupa 2011), constituting young stellar agglomerates known as embedded clusters (ECs). Bressert et al. (2010) studied the spatial distribution of star formation within 500 pc from the Sun and found that, in fact, most of the young stellar objects (YSOs) in their sample are found in regions with number densities greater than ~2 pc⁻³, which is more than an order of magnitude higher than the density of field stars in the Galactic disk, 0.13 pc⁻³ (Chabrier 2001).

Many of the ECs defined in this way, however, are not gravitationally bound and will not become classical open clusters (OCs), i.e., bound stellar agglomerates that are free of gas and have lifetimes on the order of 100 Myr. It is very important to make the distinction from the start because there is often some confusion about this in the literature. In the definition used throughout this work (see Sect. 1.2), ECs are not necessarily the direct progenitors of bound OCs, but just the natural outcome of the star formation process, which is “clustered” with respect to the field stars.

The dynamical evolution of an EC is quite complex and can progress in several possible ways, depending on both the characteristics of the recently born stellar population and the physical properties of the parent molecular cloud. A gravitationally unbound molecular cloud or an unbound region of a molecular complex might still be able to form stars in subregions that are locally bound (e.g., Bonnell et al. 2011), but the resulting EC born there is globally unbound and quickly disperses into the field. On the other hand, within a molecular complex, especially in bound regions, many ECs might merge and form a few large entities (Maschberger et al. 2010). If a certain EC (once born or after merging) manages to remain gravitationally bound in the gas potential, at some point the effect of stellar feedback starts to influence the parent molecular material in the vicinity. These feedback mechanisms include protostellar outflows, evaporation driven by non-ionizing ultraviolet radiation, photoionization and subsequent H ii region expansion, stellar winds, radiation pressure and, eventually, supernovae. Again, the relative importance of a certain dissipation process is determined by the physical conditions of the system and the environment (Fall et al. 2010).

The energy and momentum introduced by stellar feedback eventually disrupt the clump and sweep the residual gas out of the cluster volume. The stars of this emerging cluster are now held together only by the stellar gravitational potential, which might not be sufficient to keep them bound, so that the cluster dissolves. This is the classical “infant mortality” paradigm established by Lada & Lada (2003). However, Kruijssen et al. (2012) argue that this effect is only important in low-density regions, and by analyzing the dynamical state of the ECs arising from hydrodynamic simulations of star formation, they find that in dense regions the formed clusters are actually bound and even close to virial equilibrium. They propose that those clusters are instead destroyed via tidal shocks from the surrounding dense gas. An alternative disruption mechanism for small-N systems or larger clusters with a hierarchical substructure has recently been studied by Moeckel et al. (2012), who find through N-body simulations that those clusters undergo a quick expansion owing to fast internal relaxation. Bound exposed clusters are therefore the few survivors of all these processes and represent the remnants of originally more massive ECs.

The observational study of ECs is fundamental to account for most of the newly formed stellar population in the Galaxy and to investigate the interaction with its parent molecular material through stellar feedback. In the past decade, thanks to the development of all-sky infrared imaging surveys, such as 2MASS and GLIMPSE (see Sect. 1.1), many new ECs have been discovered in the Galaxy (e.g., Dutra et al. 2003a; Bica et al. 2003b; Mercer et al. 2005; Borissova et al. 2011), significantly increasing the number of known systems. However, so far there have only been a few systematic studies of the whole current sample of ECs and OCs in a significant fraction of the Galactic plane (e.g., Bonatto & Bica 2011; Kharchenko et al. 2012), and none of these studies has distinguished clearly the embedded population from the OC sample (see below). The main goal of this paper is to fill this gap.

Here, we statistically study all OCs and ECs known so far in the inner Galaxy from different cluster catalogs in the literature, after compiling a considerable number of physical and observational properties of these objects, particularly their degree of correlation with the surrounding molecular environment, if present. We take advantage of the recently completed ATLASGAL submm continuum survey (see Sect. 1.1), which provides a spatially unbiased view of the distribution of the dense molecular material in the Milky Way. While the distinction of ECs from OCs in these catalogs has primarily been made via correlations with known H ii regions or nebulae seen in the infrared, the ATLASGAL survey allows us to objectively tell whether or not these objects are associated with dense molecular gas, as well as to possibly detect the presence of stellar feedback via simple morphological criteria.

This paper is organized as follows. In the remainder of this introduction, we briefly present the main observational data and the nomenclature used throughout this work (Sects. 1.1 and 1.2, respectively). In Sect. 2, we describe the literature compilation of a merged list of Galactic OCs and ECs, including a new search for ECs we conducted on the GLIMPSE survey; more details about the literature cluster lists used here are given in Appendix A. Section 3 summarizes the construction of an extensive catalog for the cluster sample within the Galactic range covered by ATLASGAL, which includes: characteristics of the submm and mid-infrared emission, correlation with known objects, distances (kinematic and/or stellar), ages, and membership in large molecular complexes. A more detailed description of all the assumptions and procedures adopted when organizing this information in the catalog is given in Appendix B. In Sect. 4, we report the results of a statistical analysis performed on this catalog, in which we delineate a morphological evolutionary sequence with decreasing correlation with ATLASGAL emission, classify the sample into ECs and OCs, and separately study their distance distribution, completeness, and age distribution. Finally, Sect. 5 summarizes the main conclusions of this paper.

1.1. Observations: Galactic surveys

The APEX Telescope Large Area Survey of the Galaxy (ATLASGAL, Schuller et al. 2009) is the first unbiased submm continuum survey of the whole inner Galactic disk, covering a total of 360 square degrees of the sky with Galactic coordinates in the range |ℓ| ≤ 60° and |b| ≤ 1.5°. The observations were carried out at 870 μm using the Large APEX Bolometer Camera (LABOCA; Siringo et al. 2009) of the APEX Telescope (Güsten et al. 2006), located on Llano de Chajnantor, Chile, at an altitude of 5100 m. With an antenna diameter of 12 m, the observations reach an angular resolution of 19.2′′ at this wavelength. The submm continuum emission mainly represents thermal radiation from cool dust, which is generally optically thin and, therefore, an excellent tracer of the amount of interstellar material along the line of sight. The ATLASGAL survey reaches an average rms noise level of ~50 mJy/beam, which translates into a 3σ detection limit of ~4 M⊙ of total molecular mass (for a nominal distance of 2 kpc and a dust temperature of Td = 20 K).
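As a rough cross-check of the quoted mass limit, the 3σ flux density can be converted into a mass with the standard optically thin dust relation M = S d² R / (B(Td) κ). The sketch below is only illustrative: the dust opacity (1.85 cm² g⁻¹ at 870 μm) and the gas-to-dust ratio (100) are assumptions that are not stated in this section, and the 3σ flux is treated as the total flux of an unresolved source.

```python
import math

# Physical constants in CGS units
H = 6.62607e-27       # Planck constant [erg s]
K_B = 1.380649e-16    # Boltzmann constant [erg/K]
C = 2.99792458e10     # speed of light [cm/s]
PC = 3.0857e18        # parsec [cm]
M_SUN = 1.989e33      # solar mass [g]
JY = 1.0e-23          # jansky [erg s^-1 cm^-2 Hz^-1]


def planck(nu_hz, temp_k):
    """Planck function B_nu(T) in erg s^-1 cm^-2 Hz^-1 sr^-1."""
    x = H * nu_hz / (K_B * temp_k)
    return 2.0 * H * nu_hz**3 / C**2 / math.expm1(x)


def dust_mass_msun(flux_jy, dist_kpc, temp_k,
                   wavelength_um=870.0,
                   kappa=1.85,          # ASSUMED dust opacity [cm^2/g]
                   gas_to_dust=100.0):  # ASSUMED gas-to-dust mass ratio
    """Total (gas + dust) mass for optically thin dust emission."""
    nu = C / (wavelength_um * 1.0e-4)   # wavelength in cm -> frequency in Hz
    d_cm = dist_kpc * 1.0e3 * PC
    mass_g = flux_jy * JY * d_cm**2 * gas_to_dust / (planck(nu, temp_k) * kappa)
    return mass_g / M_SUN


# 3 sigma of a 50 mJy/beam rms, at 2 kpc and 20 K
mass_limit = dust_mass_msun(3 * 0.050, 2.0, 20.0)
print(f"3-sigma mass limit: {mass_limit:.1f} Msun")
```

Under these assumptions the result is a few solar masses, the same order as the limit quoted above; a different opacity or gas-to-dust ratio rescales it linearly.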

In the infrared, we primarily use two large-scale surveys that cover the inner Galactic plane: the Two Micron All Sky Survey (2MASS, Skrutskie et al. 2006), which provides near-infrared (NIR) images of the whole sky in the J (1.25 μm), H (1.65 μm), and Ks (2.16 μm) filters, with an angular resolution of ~2.5′′; and the Galactic Legacy Infrared Mid-Plane Survey Extraordinaire (GLIMPSE, Benjamin et al. 2003; Churchwell et al. 2009), which is a set of various mid-infrared (MIR) surveys of the Galactic plane carried out with the InfraRed Array Camera (IRAC, Fazio et al. 2004) on board the Spitzer Space Telescope (Werner et al. 2004). Here we use the GLIMPSE I and II surveys, which cover the (ℓ,b) ranges: 5° < |ℓ| ≤ 65° and |b| ≤ 1°; 2° < |ℓ| ≤ 5° and |b| ≤ 1.5°; |ℓ| ≤ 2° and |b| ≤ 2°, comprising a total of 274 square degrees. The IRAC camera provides images in four filters centered on wavelengths of 3.6, 4.5, 5.8, and 8.0 μm, with an angular resolution of ~2′′.
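The quoted survey areas follow directly from the coordinate ranges. A minimal sketch, assuming a flat-sky approximation (valid this close to the Galactic plane, where cos b ≈ 1); the helper name `strip_area` is ours:

```python
def strip_area(l_min, l_max, b_max):
    """Area in deg^2 of the region l_min < |l| <= l_max, |b| <= b_max.

    Flat-sky approximation: both signs of l and b are included,
    and the cos(b) factor is neglected.
    """
    return 2 * (l_max - l_min) * 2 * b_max


# GLIMPSE I and II ranges as listed in the text
glimpse_area = (strip_area(5, 65, 1.0)
                + strip_area(2, 5, 1.5)
                + strip_area(0, 2, 2.0))

# ATLASGAL range: |l| <= 60 deg, |b| <= 1.5 deg
atlasgal_area = strip_area(0, 60, 1.5)

print(glimpse_area, atlasgal_area)
```

The three GLIMPSE strips contribute 240, 18, and 16 square degrees, respectively.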

The GLIMPSE surveys have revealed very peculiar structures in star-forming regions (a summary is provided in Sect. 2 of Churchwell et al. 2009). The 8.0 μm filter is particularly useful to detect the presence of bright fluorescent emission from polycyclic aromatic hydrocarbons (PAHs), which are excited by the stellar far ultraviolet (UV) field, but are destroyed by the harder UV radiation present within ionized gas regions. Thus, PAH emission is often observed from IR bubbles, which appear projected as ring-like structures and in many cases are tracing molecular material swept up by the expansion of H ii regions created by the ionizing radiation from massive stars (Deharveng et al. 2010). On the other hand, infrared dark clouds (IRDCs), already found in previous MIR surveys, are seen as extinction features against the bright and diffuse mid-infrared Galactic background. They represent the densest and coldest condensations within giant molecular clouds and are the most likely sites of future star formation.

For a few regions within the ATLASGAL Galactic range not covered by the GLIMPSE survey, we use data from the Wide-field Infrared Survey Explorer (WISE, Wright et al. 2010), which mapped the entire sky in four infrared bands centered on 3.4, 4.6, 12, and 22 μm, with an angular resolution of ~6′′ in the first three bands. Despite the lower sensitivity and coarser resolution as compared with GLIMPSE, bright PAH emission and prominent IRDCs can still be identified in the WISE images, especially at 12 μm (see Sect. B.3).

1.2. “Stellar cluster” definitions

Table 1

Number of clusters for every catalog used in this work.

In this paper, we define:

  • an embedded cluster (EC) as any stellar group recently born and still containing an important fraction of residual gas within and surrounding its volume, keeping in mind that it may never become a bound open cluster on its own. Since star formation takes place in molecular clouds, this definition is equivalent to the concept of a correlated star formation event introduced by Kroupa (2011); we keep the term “cluster” in order to match older designations in the literature;

  • an open cluster (OC) as any agglomerate of spatially correlated stars that is relatively free of the remaining gas. We use this observational definition of OC (see also Sect. 4.3) in order to account for those objects that observationally appear like classical OCs, but whose dynamical state is unknown; in some cases they can actually be gravitationally unbound;

  • a physical OC as a gravitationally bound OC (i.e., a classical OC);

  • an association as an unbound OC.

In this work, we sometimes use the term “star clusters” generically for all the classes defined above, especially when concerning observations. Bound, exposed star clusters, however, will always be referred to explicitly as “physical OCs”.

2. Compilation of cluster lists

Although the number of known OCs and ECs in the Galaxy has considerably increased in recent years, the current cluster sample is still far from complete. As we discuss in Sect. 4.5, the detection of a stellar cluster in the inner Galactic plane is particularly difficult, due to the high extinction and the crowded stellar background, making the cluster sample severely incomplete for distances larger than a few kpc from the Sun. If we are able to quantify this incompleteness, however, all the statistical results can be properly corrected, as we do in this work. Of course, the more complete the cluster sample, the smaller the corresponding uncertainties.

We thus performed an extensive compilation of all Galactic star cluster catalogs from the literature. For completeness, this compilation was initially not restricted to the ATLASGAL Galactic range; we only did it afterwards for the comparison with ATLASGAL emission and all the subsequent analysis. The catalogs are listed in the first three columns of Table 1, where we give, respectively, an ID used throughout this work, the corresponding reference, and its category according to the wavelength at which the clusters are detected: optical, NIR or MIR. Optical clusters are taken mostly from the current version (3.1, from November, 2010) of the catalog by Dias et al. (2002). NIR cluster catalogs are compilations, or lists from visual and automated searches mainly performed on the 2MASS survey. MIR clusters represent the objects detected by Mercer et al. (2005) in the GLIMPSE data, and the new clusters discovered by us using a different search method on the same survey, which were missing in the Mercer et al. (2005) list (see Sect. 2.1). In our total sample, we also included individual star clusters from the literature not listed in the previous catalogs (referred to as “Not catalogued clusters” in Table 1). A more detailed description of the diverse catalogs and references used to construct our cluster sample is given in Appendix A. This literature compilation has been updated till August 2011.

Since we are dealing with different cluster catalogs which were constructed independently, a specific object can be present in more than one list. We therefore implemented a simple merging procedure to obtain a unique sample of stellar clusters. The first condition to identify one repetition, i.e., the same object in two different catalogs, was that the angular distance between the two given center positions was less than both listed angular diameters. We checked all objects merged under this criterion, looking for the corresponding cluster names, when available, and confirmed a repetition when the names coincided. Otherwise (names not available or different), two clusters were considered the same object when the angular distance was less than both angular radii, which were also required to agree within a factor of 5. The last condition was imposed to account for the case when a compact infrared cluster shares the same field of view as a (different) optical cluster with a large angular size. This cross-identification process was not intended to be perfect, but good enough not to affect the statistical results of the whole cluster sample. Within the ATLASGAL Galactic range, a much more thorough revision was done (see Sect. 3), further refining the cross-identifications, and even recognizing a few duplications and spurious clusters which were excluded from the final sample (see Sect. A.4).
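The merging criteria described above can be sketched as a single decision function. This is an illustration rather than the procedure actually used to build the catalog: the function names, the name normalization (case and whitespace insensitive), and the reading of “within a factor of 5” as a radius ratio ≤ 5 are assumptions.

```python
def normalize(name):
    """ASSUMED name normalization: ignore case and whitespace."""
    return "".join(name.lower().split())


def same_object(sep, diam_a, diam_b, name_a=None, name_b=None, max_ratio=5.0):
    """Decide whether two catalog entries are the same cluster.

    sep, diam_a and diam_b are angles in the same unit (e.g. arcsec);
    diameters are assumed to be positive.
    """
    # First condition: separation smaller than both angular diameters
    if not (sep < diam_a and sep < diam_b):
        return False
    # Repetition confirmed when both names are available and coincide
    if name_a and name_b and normalize(name_a) == normalize(name_b):
        return True
    # Otherwise: separation smaller than both radii, and radii
    # agreeing within a factor of max_ratio
    rad_a, rad_b = diam_a / 2.0, diam_b / 2.0
    ratio = max(rad_a, rad_b) / min(rad_a, rad_b)
    return sep < rad_a and sep < rad_b and ratio <= max_ratio


# Illustrative values only (not taken from any catalog)
print(same_object(10, 60, 40, "Example 1", "example1"))  # names coincide
print(same_object(25, 60, 40))    # separation exceeds the smaller radius
print(same_object(5, 600, 40))    # compact cluster inside a large one
```

The last call reproduces the case that motivates the size condition: a compact object lying within the field of a much larger cluster is kept as a separate entry.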

In Table 1, for a given reference, we represent as Ncl the absolute (original) number of clusters in the catalog, whereas the after-merging number is the number of entries that are new with respect to all catalogs listed before it. The optical catalogs were put first, so that any cluster visible in the optical is considered an optical cluster. The infrared lists (including the NIR and MIR clusters) were positioned afterwards in chronological order, and therefore roughly follow the time of discovery. Absolute and after-merging numbers are presented for the total sky range of every list, the ATLASGAL Galactic range (|ℓ| ≤ 60° and |b| ≤ 1.5°), and finally for only those associated with ATLASGAL emission according to the criterion explained in Sect. 4.1. We warn that the numbers of clusters given there are after removing a few spurious objects and globular clusters (listed in Table A.1).

After cross-identifications, we ended up with a final sample of 3904 stellar clusters, of which 2247 are optical, 1493 NIR, and 164 MIR clusters. Taking the repetitions within each category into account, but not between them, the numbers of objects are 2247 for optical, 1950 for NIR, and 197 for MIR. Note that the low number of MIR clusters is due to the confined Galactic range of the GLIMPSE survey; actually, when only considering the ATLASGAL range, which is similar to the GLIMPSE range, the numbers of objects are of the same order for the different categories: 227 optical, 315 NIR, and 153 MIR clusters, after merging.

As argued in Sect. A.4, for ECs (as defined in this work) we expect a minimal contamination by spurious detections, whereas for OCs that have not been confirmed by follow-up studies, we estimate a spurious contamination rate of ~50%, following Froebrich et al. (2007b).

Fig. 1

Spitzer-IRAC three-color images made with the 3.6 (blue), 4.5 (green) and 8.0 μm (red) bands, of six (out of 75) new ECs discovered in this work, using the GLIMPSE survey. The dashed circles represent the estimated angular sizes. The images are in Galactic coordinates and the given offsets are with respect to the cluster center, indicated at the bottom of each panel.


2.1. New search for ECs in GLIMPSE

The GLIMPSE on-line viewer from the Space Science Institute is a very useful tool to quickly examine color images of the whole survey, constructed from data collected in the four IRAC filters at 3.6, 4.5, 5.8 and 8.0 μm. By inspecting some specific regions with this viewer, we noticed that some heavily embedded clusters are still missing in the Mercer et al. (2005) list. An EC consists mostly of YSOs, which are intrinsically redder than field stars due to thermal emission from circumstellar dust, so that they are distinguished from background/foreground stars mainly by their red colors. Such a cluster would therefore produce a clearer spatial overdensity of stars in a point source catalog previously filtered by a red-color criterion, and would be more likely missed in a search for overdensities considering the totality of point sources, due to the high number of field stars. We believe that this is the principal reason for the incompleteness of the Mercer et al. (2005) catalog.

We then implemented a very simple automated algorithm using the GLIMPSE point source catalog to find the locations of EC candidates. First, we selected all point sources satisfying a red-color criterion: [4.5]−[8.0] ≥ 1, following Robitaille et al. (2008), who applied this condition to create their catalog of GLIMPSE intrinsically red sources. As already explained in that work, the use of these specific IRAC bands is supported by the fact that the interstellar extinction law is approximately flat between 4.5 and 8.0 μm, and therefore the contamination by extinguished field stars in this selection is reduced compared to other red-color criteria. By applying this condition to the entire GLIMPSE catalog, 268 513 sources were selected. We did not impose the additional brightness and quality restrictions used by Robitaille et al. (2008) because we favor a larger number of sources (and therefore higher sensitivity to possible YSO overdensities) over strict completeness and photometric reliability, which are not needed just to detect the locations of potential ECs. With the 268 513 selected sources, a stellar surface density map was constructed by counting the number of sources within boxes of 0.01° (=36″), in steps of 0.002° (=7.2″). This significant oversampling was adopted in order to detect density enhancements that would have fallen into two or more boxes if we had used non-overlapping bins. The bin size corresponds to the typical angular dimension of some ECs serendipitously found using the on-line GLIMPSE viewer. To account for larger overdensities, a second stellar density map was produced with a bin size of 0.018° (=64.8″), using the same step size of 0.002°.
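The selection and oversampled counting just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the search: sources are plain tuples (ℓ, b, [4.5], [8.0]) in degrees and magnitudes, the sky is treated as flat (no cos b factor, no wrap in longitude), and the function names and sample values are hypothetical.

```python
import math
from collections import Counter


def select_red(sources, min_color=1.0):
    """Keep positions of sources with [4.5] - [8.0] >= min_color."""
    return [(l, b) for (l, b, m45, m80) in sources if m45 - m80 >= min_color]


def density_map(positions, box_deg=0.01, step_deg=0.002):
    """Count sources in overlapping square boxes.

    Boxes have side box_deg and are offset from each other by step_deg.
    Each box is labeled by the fine-grid index (i, j) of its lower-left
    corner, so a source in fine cell (ci, cj) belongs to every box with
    ci - n_sub < i <= ci and cj - n_sub < j <= cj.
    """
    n_sub = round(box_deg / step_deg)   # 5 for the small bin, 9 for the large
    counts = Counter()
    for l, b in positions:
        ci = math.floor(l / step_deg)
        cj = math.floor(b / step_deg)
        for i in range(ci - n_sub + 1, ci + 1):
            for j in range(cj - n_sub + 1, cj + 1):
                counts[(i, j)] += 1
    return counts


# Hypothetical sources: (l, b, [4.5], [8.0])
sources = [
    (30.1011, 0.0511, 12.0, 10.5),
    (30.1023, 0.0523, 11.8, 10.2),
    (30.1031, 0.0507, 13.0, 11.9),
    (30.1017, 0.0535, 12.5, 11.0),
    (30.1029, 0.0529, 12.2, 11.1),
    (30.1019, 0.0515, 11.0, 10.8),   # not red enough, rejected
    (30.5003, -0.2003, 12.0, 10.0),  # isolated red source
]

red = select_red(sources)
counts = density_map(red)                                 # 36" boxes
candidates = [box for box, n in counts.items() if n >= 5]
print(len(red), max(counts.values()), len(candidates) > 0)
```

Because neighboring boxes overlap, one physical overdensity appears in several adjacent boxes; grouping those into independent positions is a separate step not shown here.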

The red-source density maps were checked in a test field, and we found that thresholds of 5 sources for the small bin, and 7 sources for the large bin, are needed to detect the positions of all clusters which can be identified by eye using the GLIMPSE on-line viewer within that area, although at the same time these low thresholds yield the detection of many spurious red-source overdensities that do not contain clusters. We decided to keep these thresholds in order not to miss any real cluster that might have a low number of members listed in the point source catalog, and to perform a visual inspection of the images after the automated search to filter out all spurious detections. It was also noticed that using the GLIMPSE point source archive instead of the catalog is roughly equivalent to utilizing the catalog with a lower threshold, so as long as we choose a correct threshold, the use of the more reliable GLIMPSE catalog (with respect to the archive) is justified. Within the whole GLIMPSE area, we detected 702 independent positions of overdensities (bins containing non-intersecting subsets of red sources), corresponding to 172 bins of 36″ with densities ≥5 sources/bin, 195 bins of 64.8″ with densities ≥7 sources/bin, and 335 locations satisfying the thresholds for both bin sizes. It should be noted that, since the red-color criterion produced density maps with low crowding and the local background density is therefore always close to zero, a more sophisticated algorithm is not needed. In fact, the red-source density maps have a mean and a standard deviation of 0.039 and 0.21 sources/bin for the small bin, and 0.13 and 0.43 sources/bin for the large bin, which means that the adopted thresholds are above the 15σ level. Again, we emphasize that the automated search was only used to find possible locations of ECs; we did not intend to catch the complete YSO population for a given cluster in this process.
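The significance of the adopted thresholds follows from the quoted map statistics. A short check, where “σ level” is simply taken as (threshold − mean) / standard deviation, without assuming that the counts are Gaussian:

```python
def sigma_level(threshold, mean, std):
    """Threshold expressed in standard deviations above the map mean."""
    return (threshold - mean) / std


small_bin = sigma_level(5, 0.039, 0.21)   # 36" boxes
large_bin = sigma_level(7, 0.13, 0.43)    # 64.8" boxes

# Independent overdensity positions: small only, large only, both
total_positions = 172 + 195 + 335

print(f"{small_bin:.1f} sigma, {large_bin:.1f} sigma, {total_positions} positions")
```

Both thresholds come out above 15σ, with the large bin being the more marginal of the two.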

As pointed out above, a subsequent visual selection was performed by examining the GLIMPSE images, based on a series of criteria which are explained in the following. Because the GLIMPSE on-line viewer has limited pixel resolution and is not efficient for inspecting a high number of specific locations, we downloaded original GLIMPSE cutouts around these 702 positions and constructed three-color images using the 3.6 (blue), 4.5 (green) and 8.0 μm (red) IRAC bands. This by-eye inspection led us to finally select 88 overdensities as locations of clusters, 17 of which are identified as known clusters from our literature compilation presented before. The remaining 71 new objects are listed in Table 2. The adopted identification is a record number (Col. 1) preceded by the acronym “G3CC” (GLIMPSE 3-Color Cluster). The final coordinates and the angular diameter (Col. 6) were estimated by eye on the GLIMPSE three-color images by fitting circles interactively with the display software SAOImage DS9. The visual criteria applied to select the 88 overdensities are identified for each new object as flags in the last column of Table 2. Figure 1 shows GLIMPSE three-color images of 6 clusters, illustrating these different criteria. An almost ubiquitous characteristic of the selected clusters (present in 82 cases) is their association with typical mid-infrared star formation signposts (see Sect. 1.1), namely: extended 8.0 μm emission in the immediate surroundings (flag E8, see Figs. 1a–d,f), likely representing radiation from UV-excited PAHs or warm dust; more localized extended 4.5 μm emission within the cluster area (flag E4, Fig. 1a), which might trace gas shocked by outflow activity from protostars (see Cyganowski et al. 2008, and references therein); and presence of an infrared dark cloud in which the cluster is embedded (flag DC, Figs. 1a,e).
We also indicate whether a cluster appears to have more stellar members than those identified by the red-color criterion, including the following situations: cluster composed of red sources and additional bright normal (not reddened) stars (flag BR, Fig. 1d), suggesting that the cluster is in a more evolved phase, probably emerging from the molecular cloud; cluster exclusively composed of bright normal stars (flag B, but only two cases, in conjunction with flag V2, see below); and presence of additional probable YSOs within the cluster, identified as sources uniquely detected at 8.0 μm (flag U8, representing extreme cases of red color), or compact 8.0 μm objects not listed in the point source catalog or archive (flag C8, Figs. 1b–d,f), due to the bright and variable extended emission at this wavelength, saturation for bright sources, or localized diffuse emission around a particular source which makes its apparent size larger than a point-source. The other flags indicate when the cluster shows up as a sparse, not centrally condensed set of sources (flag S, Fig. 1b), or if the cluster was noticed by-eye on the GLIMPSE images in a nearby location of an automatically detected overdensity, but not exactly at the same position (flag V2).

The remaining positions were rejected as clusters, and typically correspond to background stars extinguished by dark clouds or seen behind foreground 8.0 μm diffuse emission, producing a red-source density enhancement by chance, sometimes together in the same line of sight with a couple of intrinsically red sources (YSOs) which, however, do not represent a cluster on their own. Quantitatively, we found that, in general, most of the rejected positions are overdensities with fewer elements than the ones selected as clusters. In fact, if we choose stricter thresholds of 8 sources for the small bin, and 10 sources for the large bin, instead of the 5 and 7 originally used, respectively, the total set of overdensities decreases from 702 to just 87 independent positions, 37 of which represent our clusters. This would mean an improved “success” rate of 37/87 = 43% for the automated method rather than the original 88/702 = 13%. Furthermore, if we consider the effective number of elements in the 88 bins originally selected as being locations of clusters, i.e., summing possible additional stellar members (flags BR, C8, U8) within the bins, we find that 61 of our clusters satisfy the new threshold. We emphasize, however, that the additional stellar members of each cluster were recognized after detailed inspection of the GLIMPSE images, so that the use of low density thresholds in the automated method was necessary to identify the initial cluster locations, despite the consequent detection of many spurious red-source overdensities. If we had used the stricter thresholds from the beginning, we would have missed 88−37 = 51 clusters. Column 7 of Table 2 lists for every cluster the estimated number of stellar members within the assumed radius, Ncirc, counting the YSOs selected by the red-color criterion and the additional members identified in the images (flags BR, C8, U8).
Note that this number represents a lower limit, especially in distant clusters, since lower mass members could still be undetected due to the limited angular resolution and sensitivity of the GLIMPSE data.
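The trade-off between the two sets of thresholds discussed above reduces to a few ratios, reproduced here from the numbers quoted in the text (variable names are ours):

```python
# Original thresholds (5 and 7 sources per bin)
positions_low, clusters_low = 702, 88
# Stricter thresholds (8 and 10 sources per bin)
positions_high, clusters_high = 87, 37

rate_low = clusters_low / positions_low      # fraction of positions that are clusters
rate_high = clusters_high / positions_high
missed = clusters_low - clusters_high        # clusters lost with stricter thresholds

print(f"{rate_low:.0%} -> {rate_high:.0%}, {missed} clusters missed")
```

The stricter thresholds roughly triple the success rate, but at the cost of losing more than half of the clusters found with the lower ones.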

We note that, because our simple automated method to find YSO overdensities is based on the GLIMPSE point source catalog, it is unavoidably biased towards young ECs that are not yet associated with very bright extended emission, which would hide many of the cluster members from the point source detection algorithm. Fortunately, it is quite likely that those bright nebulae were already searched for clusters in previous by-eye searches (see Sect. 4.5), so probably only a few of them are actually missing in our total compiled sample. We nevertheless tried to complete our list of new clusters by performing a systematic visual inspection with the on-line viewer over the entire area surveyed by GLIMPSE, also including fully exposed clusters that appear bright at 3.6 μm (equivalent to flag “B”). This process yielded 23 additional clusters, of which, however, only 4 are new discoveries with respect to our literature compilation. They are marked in Col. 8 of Table 2 with a “V”, while the ones detected by the automated method are indicated with an “A”. We remark that, of the 17 known clusters we rediscovered from the red-source overdensities, only 3 are from the Mercer et al. (2005) list. This practically null overlap between the two detection methods demonstrates that our search is fully complementary and particularly useful to detect ECs, confirming the ideas we presented at the beginning of this section.

Although our literature compilation of clusters is up to date until August 2011, it is interesting to cross-check our list of new GLIMPSE clusters with the ECs recently discovered by Majaess (2013), who applied a combination of color and spectral index criteria to find YSO candidates using the WISE and 2MASS catalogs, and then looked for clusters by visually inspecting the spatial distribution of the YSOs. We found that only 5 new GLIMPSE clusters (indicated in Table 2) are associated with objects from the published list by Majaess (2013); in particular, these 5 clusters are contained within the corresponding objects identified by Majaess (2013), which cover a much larger area. Due to the coarser angular resolution of the WISE data with respect to the GLIMPSE data, the typical stellar densities in our ECs are probably too high to make all the individual members detectable at the WISE resolution, and consequently they are hidden in the Majaess (2013) YSO selection.

3. Properties of the cluster sample

The next step of this work was to characterize the ATLASGAL emission, if present, at the positions of the star clusters compiled in Sect. 2, and to compare this emission with NIR and MIR images. Hereafter, our study is naturally restricted to the ATLASGAL Galactic range (|ℓ| ≤ 60° and |b| ≤ 1.5°), and we refer to the list of the 695 stellar clusters within that range as the “whole cluster sample” (or simply as the “cluster sample”), unless otherwise noted. Together with this process, we performed a critical literature revision in order to add and update distances and ages for an important fraction of the sample, as well as to look for connections with known H ii regions, IRDCs, and IR bubbles. We organize all this information in a unique catalog, whose construction is summarized in the following, and described in more detail in Appendix B. The catalog is only available in electronic form at the CDS, together with a companion list of all the references with the corresponding identification numbers used throughout the table. For illustration, an excerpt of the catalog is given in Appendix C.

3.1. ATLASGAL and MIR emission

In order to search for submm dust continuum emission tracing molecular gas likely associated with the clusters, we examined the ATLASGAL emission around the cluster positions. The column Morph is a text flag that gives information about the morphology of the detected ATLASGAL emission versus the IR emission. It is composed of two parts separated by a period. The first part describes how the ATLASGAL emission is distributed throughout the immediate star cluster area, including the following cases:

  • emb: cluster fully embedded, with its center matching the submm clump peak (Fig. 2, top).

  • p-emb: cluster partially embedded, whose area is not completely covered, or the submm clump peak is significantly shifted from the (proto-)stars locations (Fig. 2, bottom).

  • surr: possibly associated submm emission surrounding the cluster or close to its boundaries (Fig. 3, top).

  • few: one or a few ATLASGAL clumps within the cluster area (mostly for optical clusters having a large angular size), not necessarily physically related with the cluster.

  • few*: the same morphology as before, but now the clump(s) is (are) likely associated with the star cluster according to previous studies in the literature, or because the kinematic distance derived from molecular lines agrees with the stellar distance. See Sect. 3.3 for a brief description of the distance determinations.

  • exp: exposed cluster, without ATLASGAL emission in immediate surroundings (Fig. 3, middle and bottom).

  • exp*: cluster that is physically exposed, but presents submm emission within the cluster area which appears in the same line of sight, but with a kinematic distance discrepant from the stellar distance (the cluster would be categorized as few or surr if no distance information were available).

We indicate in the second part of the column Morph (after the period) details about the mid-infrared morphology of each cluster, after visually inspecting GLIMPSE three-color images made with the 3.6 (blue), 4.5 (green) and 8.0 μm (red) bands. For a few clusters with no coverage in the GLIMPSE survey (7% of the cluster sample), we instead examined WISE three-color images using the 3.4, 4.6 and 12 μm filters. This flag includes the following cases:

  • bub-cen: presence of an IR bubble which seems to be produced by the cluster through stellar feedback, and appears in the images centered near the cluster position (Fig. 3, top).

  • bub-cen-trig: the same situation as before, together with the presence of possible YSOs at the periphery of the bubble identified by their reddened appearance in the images, suggesting triggered star formation generated by the cluster (see also Fig. 3, top).

  • bub-edge: in this case, the cluster itself appears at the edge of an IR bubble, suggesting that it was probably formed by triggering from an independent cluster or massive star.

  • pah: presence of bright and irregular emission at 8.0 μm (12 μm for WISE) which seems to be produced by the cluster through stellar radiative feedback (Fig. 2, bottom); it is attributed to radiation from UV excited PAHs or warm dust, but is not clearly identified as an IR bubble (though it sometimes shows bubble-like borders)6.

Fig. 2

Examples of the two morphological types defined for ECs (see Sect. 4.1): cluster G3CC 38 of type EC1 (top panels), and the cluster [DBS2003] 113 of type EC2 (bottom panels). The left panels show Spitzer-IRAC three-color images made with the 3.6 (blue), 4.5 (green) and 8.0 μm (red) bands. The right panels present 2MASS three-color images of the same field of view, constructed with the J (blue), H (green), and Ks (red) bands. The overlaid contours on the 2MASS images represent ATLASGAL emission (870 μm); the contour levels are { 5,8.8,15,25,46,88,170 }  × σ, where σ is the local rms noise level (σ = 45 mJy/beam for G3CC 38, and σ = 42 mJy/beam for [DBS2003] 113). The images are in Galactic coordinates and the given offsets are with respect to the cluster center, indicated in the left panels below the cluster name. The dashed circles represent the estimated angular sizes from the original cluster catalogs (see Sect. B.1). The 1 pc scale-bar was estimated using the corresponding distance adopted in our catalog.


Fig. 3

Examples of the three morphological types defined for OCs (see Sect. 4.1): cluster [DBS2003] 176 of type OC0 (top panels), cluster NGC 6823 of type OC1 (middle panels), and cluster BH 222 of type OC2 (bottom panels). The local rms noise level of the ATLASGAL emission is, respectively, 36, 46, and 29 mJy/beam. See caption of Fig. 2 for more details of the images.


3.2. Correlation with known objects

Associated IR bubbles that are listed in the catalogs by Churchwell et al. (2006, 2007) are identified in the table column Bub. On the GLIMPSE three-color images and on the 8.0 μm images (WISE three-color and 12 μm images when GLIMPSE data were not available), we also identified the presence of an infrared dark cloud in which the cluster appears to be embedded (column IRDC; see Fig. 2, top), and we give the designation from the catalogs by Simon et al. (2006) or Peretto & Fuller (2009) when the object is listed there. Finally, we searched in the literature for associated H ii regions (column HII_reg), and we flagged the sources that have been classified in the literature as ultra compact (UC) H ii regions.

3.3. Distance and age

An important part of this work was to assign distances to as many clusters as possible. In this regard, we took advantage of the fact that many of the ATLASGAL clumps at the locations or in the vicinity of the stellar clusters have measurements of molecular line LSR velocities (e.g., Wienen et al. 2012; Bronfman et al. 1996; Urquhart et al. 2008). Using these velocities and a combined rotation curve based on the models by Brand & Blitz (1993) and Levine et al. (2008), we computed kinematic distances for the clumps (column KDist) and, therefore, for the corresponding clusters when they were assumed to be physically associated. The kinematic distance ambiguity (KDA) was disentangled mainly by searching for previous resolutions in the literature (e.g. Caswell & Haynes 1987; Faúndez et al. 2004; Anderson & Bania 2009; Roman-Duval et al. 2009), for the clumps themselves or nearby H ii regions in the phase space. A total of 424 clusters have kinematic distance estimates for the ATLASGAL clumps, 92% of which have available KDA solutions. The uncertainties (column e_KDist) have been determined by shifting the LSR velocities by ±7 km s-1 to account for random motions, following Reid et al. (2009), who suggest this value as the typical virial velocity dispersion of a massive star-forming region.
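The ±7 km s-1 procedure can be illustrated with a short Python sketch. It is only a simplified stand-in: it assumes a flat rotation curve with a hypothetical circular speed of 220 km s-1 (a value not taken from this work) rather than the combined Brand & Blitz (1993) and Levine et al. (2008) curve used for the catalog, it neglects the latitude dependence, and the function names are illustrative.

```python
import math

R0_KPC = 8.23        # Galactocentric distance of the Sun adopted in this work
THETA0_KMS = 220.0   # ASSUMPTION: flat rotation speed, not taken from this work


def kinematic_distances(glon_deg, v_lsr):
    """Return (near, far) kinematic distances in kpc, or None if undefined.

    Flat rotation curve, b ~ 0: v_lsr = THETA0 * sin(l) * (R0 / R - 1).
    """
    sin_l = math.sin(math.radians(glon_deg))
    cos_l = math.cos(math.radians(glon_deg))
    denom = v_lsr + THETA0_KMS * sin_l
    if denom == 0.0:
        return None
    r_gal = R0_KPC * THETA0_KMS * sin_l / denom   # Galactocentric radius
    if r_gal <= 0.0:
        return None
    disc = r_gal ** 2 - (R0_KPC * sin_l) ** 2
    if disc < 0.0:
        return None                               # beyond the tangent-point velocity
    root = math.sqrt(disc)
    return R0_KPC * cos_l - root, R0_KPC * cos_l + root


def kinematic_distance_with_error(glon_deg, v_lsr, near=True, dv=7.0):
    """Distance and uncertainty obtained by shifting v_lsr by +/- dv (km/s)."""
    idx = 0 if near else 1
    centre = kinematic_distances(glon_deg, v_lsr)
    low = kinematic_distances(glon_deg, v_lsr - dv)
    high = kinematic_distances(glon_deg, v_lsr + dv)
    if centre is None or low is None or high is None:
        return None
    return centre[idx], 0.5 * abs(high[idx] - low[idx])
```

The `near` flag stands in for the KDA resolution, which in this work comes mainly from the literature and not from the velocity itself.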

We also compiled values for the stellar distance (column SDist) and age (column Age), estimated from studies of the stellar population of the clusters. These data were obtained from the original cluster catalogs or from new references found in SIMBAD. To prevent underestimation of the uncertainties (provided in columns e_SDist and e_Age), we imposed minimum errors depending on the computation method for the stellar distance, and on the range for the age (the latter following Bonatto & Bica 2011). Stellar distances are available for 222 clusters (32% of the sample), and ages for 209 clusters (30% of the sample). The most common method for stellar distance and age determination is isochrone fitting (e.g., Loktin et al. 2001), which implies that these parameters are available mainly for exposed clusters (see Sect. 4.7).

The final adopted distance for each cluster (column Dist) was chosen to be the available distance estimate with the lowest uncertainty. In some cases, we adopted independent distance estimates from the literature if they were more accurate than SDist and KDist (e.g., from maser parallax measurements; see Reid et al. 2009, and references therein). Clusters within a particular complex (identified in the column Complex) were assumed to be all located at the same distance, determined from the literature, or kinematically from an average position and velocity.

In total, there are distance determinations (Dist) for 538 clusters, i.e., for 77% of our sample. Naturally, there is a dichotomy in the distance estimation method depending on whether or not the cluster is associated with an ATLASGAL source with available velocity, so that most exposed clusters uniquely have stellar distances, whereas the distances for ECs are mainly kinematic or from associations with complexes. However, it is still possible to compare stellar and kinematic determinations for a subsample of 38 clusters (mostly embedded) which have distances available from both methods. This comparison is shown in Fig. 4, which reveals that in our cluster sample both methods are quite consistent with each other, with an 84% agreement (32 out of 38 objects). We note that among the discrepant cases, there are two ECs (points (2.16,4.30) kpc and (5.05,1.30) kpc in the plot) whose method for age and (stellar) distance estimation was found to be particularly inaccurate (see Sect. 4.7.1).

The rms between the stellar and kinematic distances compared in Fig. 4 is 1.28 kpc, which represents the combined error, for this particular subsample, of both stellar and kinematic distances added in quadrature. If we compute this error from the estimated uncertainties e_KDist and e_SDist averaged over the subsample, we obtain a value of 1.59 kpc, which means that we slightly overestimated some of the uncertainties, probably because we were quite conservative in determining the minimum errors for the stellar distances (see Sect. B.5). The average uncertainties are ⟨ e_KDist ⟩  = 0.67 kpc and ⟨ e_SDist ⟩  = 1.45 kpc for the subsample of the 38 clusters used for comparison, and ⟨ e_KDist ⟩  = 0.68 kpc and ⟨ e_SDist ⟩  = 0.58 kpc for the whole sample. The high average error for the stellar distance in the subsample with respect to the whole sample is due to the fact that most of these clusters have stellar distances estimated from the spectrophotometric method, which is more inaccurate than, e.g., main sequence or isochrone fitting (see Sect. B.5). The average estimated uncertainty in the adopted distance is ⟨ e_Dist ⟩  = 0.51 kpc for the whole sample (and 0.52 kpc for the subsample).
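The comparison above can be checked with a few lines of Python. The sketch below is only illustrative (the function names are hypothetical): it computes the rms difference between two distance lists and the quadrature sum of two uncertainties. Applied to the rounded subsample averages quoted above, the quadrature sum is about 1.60 kpc, which agrees with the quoted 1.59 kpc to within the rounding of the averages.

```python
import math


def rms_difference(dist_a, dist_b):
    """Root mean square of the pairwise differences between two distance lists."""
    if len(dist_a) != len(dist_b) or not dist_a:
        raise ValueError("need two non-empty lists of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dist_a, dist_b)) / len(dist_a))


def combined_error(err_a, err_b):
    """Two independent uncertainties added in quadrature."""
    return math.hypot(err_a, err_b)


# Subsample averages quoted in the text (kpc): <e_KDist> = 0.67, <e_SDist> = 1.45
expected = combined_error(0.67, 1.45)
```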

4. Analysis

4.1. Morphological evolutionary sequence

Here, we use the characterization of the ATLASGAL emission found throughout each cluster’s area and/or environment (described in Sect. 3.1) to define main morphological types and delineate an evolutionary sequence. First, in order to test our visual ATLASGAL morphological flags specified above (corresponding to the first part of the column Morph, and represented hereafter by m0), we compared them against the more quantitative parameter s ≡Clump_sep of our catalog, which is the projected distance of the nearest ATLASGAL emission pixel, normalized to the cluster angular radius. We found a reasonable correlation: s = 0 for all deeply ECs (m0 = emb), s < 0.42 for partially ECs (m0 = p-emb), 0.40 < s < 1.97 for clusters surrounded by submm emission (m0 = surr), and s > 0.94 for exposed clusters (m0 = exp). Exposed clusters with s < 1 only comprise a few cases with a large angular size and very faint emission close to their borders. The remaining morphological flags are very specific and we do not expect any correlation with the quantity Clump_sep.

Fig. 4

Comparison of kinematic and stellar distances for the 38 clusters of our sample with both estimations available. Plus signs (+) indicate agreement within the errors, and circles mark the discrepant cases. Colors indicate which distance estimate was finally adopted in our catalog: stellar (red), kinematic (blue), and other (black). The dashed line is the identity.


Denoting by Cf0 the first digit of the flag Clump_flag from our catalog (a value >0 means that the nearest ATLASGAL clump is likely associated with the cluster), and using the logical operators ∧, ∨ and ¬ (“and”, “or”, and “not”, respectively), we define five morphological types as follows:

  • EC1: m0 = emb;

  • EC2: m0 = p-emb;

  • OC0: m0 = surr ∨ m0 = few* ∨ (m0 = few ∧ Cf0 > 0);

  • OC1: m0 = exp ∧ (Cf0 > 0 ∨ KDist ≃ SDist);

  • OC2: (m0 = exp ∨ m0 = exp* ∨ m0 = few) ∧ ¬(OC0 ∨ OC1).
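The five definitions can be written as a small decision function. The following Python sketch is illustrative only: the function and argument names are hypothetical, the agreement KDist ≃ SDist is passed in as a precomputed boolean because the tolerance is not specified here, and OC2 is read as the complement of OC0 and OC1 among the exp, exp* and few flags.

```python
def morph_type(m0, cf0=0, dist_match=False):
    """Map the catalog flags to a morphological type.

    m0         -- first part of the Morph column (e.g. "emb", "few*")
    cf0        -- first digit of Clump_flag (> 0: nearest clump likely associated)
    dist_match -- True if KDist and SDist agree (tolerance decided by the caller)
    """
    if m0 == "emb":
        return "EC1"
    if m0 == "p-emb":
        return "EC2"
    if m0 in ("surr", "few*") or (m0 == "few" and cf0 > 0):
        return "OC0"
    if m0 == "exp" and (cf0 > 0 or dist_match):
        return "OC1"
    if m0 in ("exp", "exp*", "few"):
        # ASSUMPTION: everything exposed that is neither OC0 nor OC1
        return "OC2"
    raise ValueError("unknown morphology flag: %r" % (m0,))
```

Testing the types in evolutionary order means that each branch only sees clusters not already claimed by an earlier type, which is what the negation in the OC2 definition expresses.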

The morphological type for each cluster is given in the column Morph_type of our catalog. Figures 2 and 3 present one example cluster for each morphological type, shown in GLIMPSE three-color images, and 2MASS three-color images overlaid with ATLASGAL contours. In simpler words, given that star clusters are expected to be less and less associated with molecular gas as time evolves, due to gas dispersal driven by stellar feedback, we have defined above a morphological evolutionary sequence, with decreasing correlation with ATLASGAL emission. EC1 are deeply ECs (Fig. 2, top), EC2 are partially ECs (Fig. 2, bottom), OC0 are emerging exposed clusters (Fig. 3, top), and finally there are two kinds of totally exposed clusters: OC1 are still physically associated with molecular gas in their surrounding neighborhood (an ATLASGAL clump at a projected distance of Clump_sep times the cluster radius, see Fig. 3, middle), whereas OC2 are all the remaining exposed clusters, which present no correlation with ATLASGAL emission (Fig. 3, bottom).

Table 3

Number of clusters in each morphological type.

Note, however, that this classification is not perfect. For example, although the gas velocity and stellar distance data are quite extensive, they are not complete enough to identify all the m0 = few*, m0 = exp* and KDist ≃ SDist cases, so that some misclassification might occur in the type OC2. Similarly, the physical link between the submm emission and the ECs was based on the morphology seen in the images, and some chance alignments might still be present in a few cases (estimated to be about 5%, see Sect. 4.2). Therefore, the defined morphological types should primarily be considered in a statistical way, and for individual objects they must be treated with caution. Column 2 of Table 3 lists how many objects fall in each morphological type for the whole cluster sample. Note that the low number of OC1 clusters could be partially due to the observational difficulty in identifying an exposed cluster physically associated with molecular gas in its surroundings, as remarked before. Column 3 gives the number of clusters with available distances, and the remaining columns will be described in Sect. 4.6.

With this morphological classification, it is easy to determine (again, statistically) which clusters are associated with ATLASGAL emission: simply as those with types EC1, EC2, OC0 or OC1. These clusters are counted for every catalog in the last two columns of Table 1, as absolute and after-merging numbers of objects (Ncl and , respectively). As expected, optical clusters are rarely associated with ATLASGAL emission (only ~15% of them, most of which are of type OC0 or OC1), since otherwise they would be barely visible at optical wavelengths due to dust extinction. On the other hand, the majority of the NIR and MIR clusters are physically related with submm dust radiation (~79% and 74% of them, respectively). Although this is also expected because infrared emission is much less affected by dust extinction than visible light, these high percentages might partially be a consequence of the detection method of the infrared cluster catalogs, which in most cases tried to intentionally highlight the EC population. For example, the 2MASS by-eye searches by Dutra et al. (2003a) and Bica et al. (2003b) were done towards known radio/optical nebulae, and our new GLIMPSE cluster candidates were detected after applying a red-color criterion (see Sect. 2.1). In these particular catalogs, almost the totality of objects are associated with ATLASGAL emission.

4.2. Chance alignments

We computed the probability of chance alignments of our stellar clusters with ATLASGAL clumps, and with the different known objects for which we looked for spatial correlations in our catalog (see Sect. 3.2), in order to test the validity of the assumption of a physical relation when it is based only on the positions of the objects on the sky. For a given sample of objects, this probability was estimated semi-analytically by assuming that the objects within |b| ≤ 1° (where most sources are located for all samples used) and the longitude range originally covered are uniformly distributed over that area, and that their angular sizes are distributed according to the observed sizes. We first calculated the probability of overlap of each cluster with one or more objects from this hypothetical sample, and then we averaged these probabilities over two different sets of clusters: morphological types EC1 and EC2 together (hereafter EC-); and types OC0, OC1 and OC2 together (hereafter OC-).
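One way to carry out such a semi-analytic estimate is sketched below in Python. This is an assumed formulation, not necessarily the exact one used here: footprints are taken as circles on a flat sky, an overlap occurs when the center separation is smaller than the sum of the two radii, objects are placed independently and uniformly, and edge effects are ignored. All names are illustrative.

```python
import math


def overlap_probability(cluster_radius, object_radii, area):
    """Probability that a cluster overlaps one or more randomly placed objects.

    cluster_radius, object_radii -- angular radii (same unit, e.g. degrees)
    area                         -- survey area in that unit squared
    """
    p_none = 1.0
    for r_obj in object_radii:
        # ASSUMPTION: overlap if centers are closer than the sum of the radii
        p_hit = min(1.0, math.pi * (cluster_radius + r_obj) ** 2 / area)
        p_none *= 1.0 - p_hit
    return 1.0 - p_none


def mean_overlap_probability(cluster_radii, object_radii, area):
    """Average of the per-cluster probabilities over a set of clusters."""
    probs = [overlap_probability(r, object_radii, area) for r in cluster_radii]
    return sum(probs) / len(probs)
```

Because the per-object probability grows with the square of the summed radii, clusters with large angular sizes are much more prone to chance alignments than compact ones.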

For ATLASGAL clumps, we adopted a total number of 6451 objects within 330° ≤ ℓ ≤ 21° and |b| ≤ 1°, from the compact source catalog by Contreras et al. (2013), which, together with their estimated effective radii, gives an average chance alignment probability of 8.8% for clusters with types EC-, and 32% for clusters with types OC-. Considering that the submm and infrared morphologies of deeply ECs (type EC1) usually support the real physical relation with molecular gas (e.g., matching peaks of submm emission and stellar density), and that partially ECs (type EC2) are generally associated with more than one ATLASGAL clump, in practice the fraction of chance alignments of EC- clusters with ATLASGAL compact sources is likely below 5%, which is low enough not to affect the statistics of this work. Due to their larger angular sizes, clusters of types OC- are more prone to be aligned with ATLASGAL clumps by chance, and therefore our additional requirements to assume that an exposed cluster is associated with ATLASGAL emission are justified (morphological criteria or matching distances for types OC0 and OC1).

For the known objects considered in our catalog, we assume that there are 4936 IR bubbles in the range |ℓ| ≤ 60° and |b| ≤ 1° (Simpson et al. 2012)7, 17 364 IRDCs within 10° ≤ |ℓ| ≤ 60° and |b| ≤ 1° (from the catalogs by Simon et al. 2006; Peretto & Fuller 2009), and 944 H ii regions in the range 343° ≤ ℓ ≤ 60° and |b| ≤ 1° (from the recently discovered and previously known H ii regions listed in Anderson et al. 2011). In this case, to compute the chance alignment probability of each cluster with the objects of a given sample, we also required that the objects were larger than half the size of the cluster and that the distance between the object’s position and the cluster center were less than the sum of both radii divided by two, so that the alignment really mimics a physical relation misidentified by eye. The averaged probabilities are quite similar for clusters with types EC- and OC-, and they are all low: ~2% for IR bubbles, ~3.5% for IRDCs, and ~0.3% for H ii regions.

4.3. Observational classification of OCs and ECs

We can also use the morphological evolutionary sequence established in Sect. 4.1 to observationally define in our sample the concepts of EC and OC. Since any stellar agglomerate that appears deeply or partially embedded in ATLASGAL emission would satisfy our physical definition of EC presented in Sect. 1.2, we simply use as observational definition the embedded morphological types: EC = EC1 ∨ EC2. We consider the remaining morphological types as OCs, but excluding those objects that have not been confirmed by follow-up studies, since we expect for them a high contamination rate by spurious candidates (see Sect. A.4): OC = (OC0 ∨ OC1 ∨ OC2) ∧ (ref_Conf not empty), where ref_Conf is the column in the catalog indicating the reference for cluster confirmation (see Sect. B.5).

However, this observational definition of OC does not necessarily mean that the cluster is bound by its own gravity, and therefore, is not fully equivalent to the concept of physical OC defined in Sect. 1.2. To investigate under which conditions both definitions agree, we can apply the empirical criterion proposed by Gieles & Portegies Zwart (2011) which distinguishes between physical OCs and associations by comparing the age of the object with its crossing time, tcross, computed as if it were in virial equilibrium. In useful physical units, Eq. (1) of Gieles & Portegies Zwart (2011) becomes8 (1), where M and Reff are, respectively, the mass and the observed 2D projected half-light radius of the cluster. Unfortunately, mass estimates and accurate structural parameters are usually not directly available in the OC catalogs; in particular, there are no mass data in the Dias et al. (2002) catalog, and the given sizes come from individual studies compiled there and are mostly derived from visual inspection. We therefore used the masses and radii determined by Piskunov et al. (2007), who fitted a three-parameter King’s profile (King 1962) to the observed stellar surface density distribution of 236 objects taken from a homogeneous sample of 650 optical clusters in the solar neighborhood (Kharchenko et al. 2005b,a), which is a subset of the current version of the Dias et al. (2002) catalog. Piskunov et al. (2007) estimated the masses from the tidal radii, and the effective radius Reff entering in Eq. (1) can be derived from both the core and tidal radius (we used Eq. (B1) of Wolf et al. 2010). Because only 14 of the clusters analyzed by Piskunov et al. (2007) are within the ATLASGAL sky coverage, in order to improve the statistics we applied the Gieles & Portegies Zwart (2011) criterion to the 236 studied objects, under the assumption that they are all OCs as observationally defined by us. This supposition is quite acceptable since they are optically-detected clusters and indeed within the ATLASGAL range almost all of them (13 out of 14) are classified as OCs.
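The crossing-time criterion can be sketched in Python as follows. The displayed form of Eq. (1) is not reproduced in this text, so the sketch assumes the definition tcross = 10 (Reff³/GM)^1/2 from Gieles & Portegies Zwart (2011) and converts units with standard constants; the function names are illustrative.

```python
import math

G_PC_KMS2_PER_MSUN = 4.30091e-3   # gravitational constant in pc (km/s)^2 / Msun
MYR_PER_PC_PER_KMS = 0.977792     # time to travel 1 pc at 1 km/s, in Myr


def crossing_time_myr(mass_msun, r_eff_pc):
    """Crossing time in Myr, ASSUMING t_cross = 10 * sqrt(R_eff^3 / (G M))."""
    t_internal = math.sqrt(r_eff_pc ** 3 / (G_PC_KMS2_PER_MSUN * mass_msun))
    return 10.0 * t_internal * MYR_PER_PC_PER_KMS


def is_physical_cluster(age_myr, mass_msun, r_eff_pc):
    """True for a physical OC (t_cross <= Age), False for an association."""
    return crossing_time_myr(mass_msun, r_eff_pc) <= age_myr
```

Under this assumed definition, a cluster with M = 10³ M⊙ and Reff = 1 pc has a crossing time of roughly 4.7 Myr, and the crossing time scales as Reff^3/2 M^−1/2.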

We computed the crossing times using Eq. (1), and in Fig. 5 they are plotted versus the corresponding ages available from the Kharchenko et al. (2005b,a) catalogs. According to the Gieles & Portegies Zwart (2011) criterion, the identity tcross = Age divides the physical OCs (tcross ≤ Age) from associations (tcross > Age). It can be seen in the plot that, because the resulting crossing times are relatively short (log (tcross/yr) ≲ 7.6), the majority of the objects studied by Piskunov et al. (2007) are physical OCs for ages in excess of 10 Myr. In fact, for log (Age/yr) > 7.2, which is the threshold above which the age distribution can uniquely be explained through classical cluster disruption mechanisms (see Sect. 4.7.2), only 2.6% of the objects are formally associations. We thus conclude that our observational definition of OC agrees with the physical one provided by Gieles & Portegies Zwart (2011, what we call a physical OC) for ages greater than ~16 Myr, which corresponds to 74% of our OC sample within the ATLASGAL range. Younger OCs can be either associations, as a result of early dissolution, or already physical OCs.

4.4. Spatial distribution

Fig. 5

Crossing time vs. age for an all-sky sample of 236 clusters (Piskunov et al. 2007) taken from a homogeneous catalog of 650 optical clusters in the solar neighborhood (Kharchenko et al. 2005b,a). The dashed line is the identity tcross = Age, which divides the physical OCs (tcross ≤ Age) from associations (tcross > Age) according to the criterion proposed by Gieles & Portegies Zwart (2011).


Fig. 6

Galactic locations of a) OCs and b) ECs within the ATLASGAL range, superimposed over an artist’s conception of the Milky Way (R. Hurt from the Spitzer Science Center, in consultation with R. Benjamin), which was based on data obtained from the literature at radio, infrared, and visible wavelengths, and attempts to synthesize many of the key elements of the Galactic structure. The coordinate system is centered at the Sun position, indicated by the “⊙” symbol, and we have scaled the image such that R0 = 8.23 kpc (Genzel et al. 2010). The two diagonal lines represent the ATLASGAL range in Galactic longitude (|ℓ| ≤ 60°). In panel a), we indicate the names of the spiral arms.


In this section, for the clusters in our sample with available distance estimates we study their spatial distribution in the Galaxy, and with respect to the Sun. Figure 6 shows the Galactic distribution of the clusters separated in the (a) OC and (b) EC categories defined in the previous section, on top of an artist’s conception of the Milky Way viewed from the north Galactic pole (R. Hurt from the Spitzer Science Center, in consultation with R. Benjamin). The image was constructed based on multiwavelength data obtained from the literature, and we have scaled it to R0 = 8.23 kpc (Genzel et al. 2010, see Sect. B.4). It is clear from the image that ECs probe deeper into the inner Galaxy than the OC sample, which is concentrated within a few kpc from the Sun (≲2 kpc). This, of course, is an observational effect mainly produced by the difficulty in detecting exposed clusters against the Galactic background, compared to ECs (see Sect. 4.5), and enhanced by the fact that some genuine OCs have no distance estimates and therefore cannot be included in the spatial distribution analysis (e.g., there are 123 clusters of type OC2 without available distance, half of which might be real). ECs are spread over larger distances from the Sun (≲6 kpc) and, although few of them can be detected beyond the Galactic center, a paucity of ECs is hinted within the Galactic bar, augmented by some apparent crowding close to both ends of the bar. The Galactic distribution of ECs is consistent with the spiral structure delineated on the background image; however, the large distance uncertainties (~0.5 kpc on average, see Sect. 3.3), and the limited distance coverage, prevent the ECs from clearly defining the spiral arms on their own.

Fig. 7

Histogram of heights from the Galactic plane, as measured from the Sun (Z = z − z0), for a) OCs and b) ECs, using a bin width of ΔZ = 10 pc and Poisson uncertainties. The overplotted solid curve in each panel represents: a) the fitted Z-distribution ΦZ(Z) from Eq. (10) with best-fit parameters z0 = 14.7 ± 3.7 pc and zh = 42.5 ± 9.9 pc; b) the predicted Z-distribution from Eq. (10), using the parameters fitted for the OC sample. In panel b), the darker shaded region is the Z-histogram for ECs with distances D < 4 kpc, whereas the dashed curve indicates the corresponding distribution as predicted from Eq. (10) and the same parameters z0 and zh.


Fig. 8

Histogram of heliocentric distances, D, for a) OCs and b) ECs, using a bin width of ΔD = 0.4 kpc and Poisson uncertainties. In each panel, the solid curve represents the fitted D-distribution ΦD(D) from Eq. (8), with the completeness distance Dc as free parameter (see Eq. (9)); the dashed curve shows the fit with fixed Dc = 0 (see text for details). The best-fit parameters are given in Table 4.


To really quantify how deep our OC and EC samples reach into the inner Galaxy, and to estimate the completeness fraction at a given distance, we need to study the observed heliocentric distance distribution of the clusters, and compare it to what is expected from making some basic assumptions. In the following, we denote by D the distance of the cluster from the Sun, projected on the Galactic plane9, and by z the height of the cluster above the Galactic plane. For simplicity, we also define Z ≡ z − z0, where z0 is the displacement of the Sun above the plane; this is actually what we obtain directly10 from the cluster distance d and its Galactic latitude b, Z = d sin b. The observed Z- and D-distributions are shown, respectively, in Figs. 7 and 8, for our cluster sample separated in OC and EC categories. In the construction of the histograms, we used fixed bins of ΔZ = 10 pc and ΔD = 0.4 kpc, but since the distance uncertainties are quite nonuniform, we have fractionally spread the ranges determined by the central values and their uncertainties over the covered bins. In other words, for a cluster with distance and uncertainty D ± σD, we considered all the bins overlapping with the range [D − σD, D + σD] and in each bin we added the fraction (with respect to the total width of the range, 2σD) comprised by the corresponding overlap. The total OC and EC distance distributions were obtained by repeating this procedure for all the clusters. The Z-distributions were constructed using the same method, and the fitted curves plotted in Figs. 7 and 8 are explained in the following.
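The fractional spreading described above can be sketched in Python as follows. The function name, the fixed histogram limits and the handling of zero uncertainties are illustrative choices, not details taken from this work.

```python
def spread_histogram(values, sigmas, bin_width, lower, upper):
    """Histogram in which each object is spread over [value - sigma, value + sigma].

    Each object contributes, to every bin it overlaps, the fraction of its
    total range (2 * sigma) that falls inside that bin. Any part of a range
    lying outside [lower, upper] is dropped.
    """
    n_bins = int(round((upper - lower) / bin_width))
    counts = [0.0] * n_bins
    for value, sigma in zip(values, sigmas):
        if sigma <= 0.0:
            # No uncertainty: fall back to ordinary binning (illustrative choice)
            k = int((value - lower) // bin_width)
            if 0 <= k < n_bins:
                counts[k] += 1.0
            continue
        left, right = value - sigma, value + sigma
        for k in range(n_bins):
            bin_lo = lower + k * bin_width
            bin_hi = bin_lo + bin_width
            overlap = min(right, bin_hi) - max(left, bin_lo)
            if overlap > 0.0:
                counts[k] += overlap / (2.0 * sigma)
    return counts
```

For example, a cluster at D = 1.0 ± 0.4 kpc with ΔD = 0.4 kpc contributes 0.25, 0.5 and 0.25 to the bins starting at 0.4, 0.8 and 1.2 kpc, so that its total weight is still one.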

4.4.1. Assumed model for the spatial distribution

In general, we can assume that the spatial number-density of OCs or ECs in the Galaxy is described by a combination of two independent exponential-decay laws for the cylindrical coordinates z and R, centered in the Galactic center: ρ(R,z) = ρ0 ϕR(R) ϕz(z), with ϕR(R) = e−R/RD and ϕz(z) = e−|z|/zh. This is a common functional form used to characterize the Galactic distribution of stars (see Sect. 1.1.2 of Binney & Tremaine 2008), and has already been applied in previous OC studies (Bonatto et al. 2006; Piskunov et al. 2006). One might want to consider the imprint of spiral arm structure in the azimuthal distribution of ECs, since they are still embedded in molecular clouds, but here we are interested in the distance and height longitude-averaged distributions, for which azimuthal substructure is less important. Furthermore, as noted above, our EC distances are not accurate enough to constrain the location of the spiral arms. If we transform the density ρ(R,z) to a coordinate system centered on the Sun, and assume that we are observing the totality of the clusters in the Galaxy within the ATLASGAL range (|b| ≤ b1 and |ℓ| ≤ ℓ1, with b1 ≡ 1.5° and ℓ1 ≡ 60°), the resulting density (not averaged in longitude yet) can be written as (2), where (3). Now we can derive an analytical expression for the D-distribution of an ideally complete sample: (4)-(5), where Σ0 ≡ 2zhρ0 is the surface number-density on the Galactic disk for R = 0, and we have defined the function fb1(D) as (6), which arises from the fact that the limited latitude coverage restricts the integration in Z at each distance.
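The displayed equations of this subsection are not reproduced in this text. One reading that is consistent with the definitions given above is sketched below; it is a reconstruction under stated assumptions, not necessarily the original notation. In particular, the symbol Φ_D^tot for the complete D-distribution is introduced here only for illustration, and the integration limits assume Z = D tan b with |b| ≤ b1.

```latex
\begin{align}
\rho(D,\ell,Z) &= \rho_0\, e^{-R(D,\ell)/R_D}\, e^{-|Z+z_0|/z_h}, \\
R(D,\ell) &= \sqrt{R_0^2 + D^2 - 2 R_0 D \cos\ell}, \\
\Phi_D^{\rm tot}(D) &= \int_{-\ell_1}^{\ell_1} \int_{-D\tan b_1}^{D\tan b_1}
      \rho(D,\ell,Z)\, D \,\mathrm{d}Z \,\mathrm{d}\ell
   = \Sigma_0\, D\, f_{b_1}(D) \int_{-\ell_1}^{\ell_1} e^{-R(D,\ell)/R_D}\,\mathrm{d}\ell, \\
f_{b_1}(D) &= \frac{1}{2 z_h} \int_{-D\tan b_1}^{D\tan b_1} e^{-|Z+z_0|/z_h}\,\mathrm{d}Z .
\end{align}
```

With Σ0 ≡ 2zhρ0, the factor fb1(D) in this reading is the fraction of the vertical exponential profile that falls inside the latitude range at distance D; it tends to 0 for D → 0 and to 1 at large distances.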

4.4.2. Completeness fraction

In practice, however, as already mentioned before and discussed in Sect. 4.5, we are unable to detect the totality of the clusters within the ATLASGAL range, due to the difficulty in star cluster identification towards the inner Galaxy. Indeed, the D-distributions that we really observe for OCs and ECs (see Fig. 8) do not increase with distance up to the Galactic center (D = R0), as we would expect from Eq. (5); instead, they reach a maximum at a nearby distance and then decay considerably, especially for optical clusters. The observed D-distributions are dominated by the high incompleteness at increasingly larger distances from the Sun, and therefore, are insensitive to large scale structure on the Galactic disk such as the scale length RD. Attempts to include RD in the parametric fit to the distance distributions described below resulted in heavily degenerate output parameters and practically no constraint on their values. We then eliminated the dependence of the model on RD by making the rough approximation that the underlying radial distribution of clusters is uniform, i.e., ϕR(R) = 1. This is supported by the fact that, due to the incompleteness, most clusters in our sample are within a few kpc from the Sun, where the variations in ϕR(R) can be considered small relative to the completeness decay. The constants ρ0 and Σ0 must now be interpreted as solar neighborhood values, and from Eq. (5) the complete D-distribution becomes (7). On the other hand, defining a fractional factor fc(D) that quantifies the completeness of the cluster sample as a function of distance11, we can express the observed D-distribution ΦD(D) as (8). In order to assign a particular parametric shape to the completeness fraction, we chose an ansatz for fc(D) based on previous statistical works of OCs in the whole sky. Bonatto et al. (2006) studied the WEBDA database12 at that time and found, by completeness simulations, that their analyzed OC sample is highly incomplete in the inner Galaxy, even within what they called the “restricted zone”, defined as an annulus segment with Galactocentric distances R in the range [R0 − 1.3 kpc, R0 + 1.3 kpc]. The completeness fraction they determined decays almost immediately from R = R0 to R < R0 (see their Fig. 11; note that R0 = 8.0 kpc in that work). However, Piskunov et al. (2006) claim that the Kharchenko et al. (2005b,a) OC catalogs constitute a complete sample up to about 0.85 kpc from the Sun. This is nicely illustrated in their Fig. 1, where a flat distribution of surface number-density of clusters is exhibited up to that distance, after which the distribution starts to decrease considerably. If the completeness fraction of their sample in the inner Galaxy were similar to that obtained by Bonatto et al. (2006), the surface density distribution would be a decreasing function immediately from D = 0 kpc rather than from D = 0.85 kpc13. We think that this discrepancy is mainly caused by two effects: 1) the cluster sample studied by Bonatto et al. (2006) (654 objects with known distances) is less complete than, e.g., the current version of the Dias et al. (2002) catalog used in this work (1309 clusters with available distances), which is equivalent to the Kharchenko et al. (2005b,a) sample within 0.85 kpc; and 2) the “restricted zone” considered by Bonatto et al. (2006) covers a larger area than the circle defined by the completeness limit of Piskunov et al. (2006) (radius of 0.85 kpc centered on the Sun), and thus includes regions where the OC sample is indeed incomplete. In fact, we performed a quick test on the current Dias et al. (2002) catalog by constructing the Galactocentric radii distribution of clusters within 1 kpc from the Sun, and we obtained a shape that is not incompatible with an exponential law in the whole range, as opposed to the distribution derived by Bonatto et al. (2006, their Fig. 9).

Based on the above discussion, the completeness fraction for our OC sample is likely ~1 up to a close distance from the Sun, Dc, and then starts to decay significantly. We assume that the decay is exponential, with scale length s0:

$$f_c(D) = \begin{cases} 1 & \text{if } D \le D_c, \\ \exp\left[-(D - D_c)/s_0\right] & \text{if } D > D_c. \end{cases} \qquad (9)$$

This parametrization allows us to investigate the possibility that the sample is always incomplete, as for Bonatto et al. (2006), by just imposing Dc = 0. We employ the same functional form for the completeness fraction of ECs but, of course, with different values of the parameters Dc and s0.
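The completeness ansatz described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the fits: the function names are ours, the complete distribution is passed in as an arbitrary callable, and the numerical values of d_c and s_0 are placeholders rather than the fitted values of Table 4.

```python
import math

def completeness_fraction(d_kpc, d_c, s_0):
    """Completeness f_c(D): unity up to d_c, exponential decay beyond it."""
    if d_kpc <= d_c:
        return 1.0
    return math.exp(-(d_kpc - d_c) / s_0)

def observed_distribution(d_kpc, complete_distribution, d_c, s_0):
    """Observed D-distribution: f_c(D) times the complete distribution."""
    return completeness_fraction(d_kpc, d_c, s_0) * complete_distribution(d_kpc)

# Placeholder parameters (assumption: illustrative only, not Table 4 values).
d_c, s_0 = 1.0, 1.5
for d in (0.5, 1.0, 2.5, 5.0):
    print(d, round(completeness_fraction(d, d_c, s_0), 3))
```

Setting d_c = 0 in this sketch reproduces the "always incomplete" case, in which the completeness starts to decay immediately from the position of the Sun.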

4.4.3. Fit for the height distribution

Before proceeding to fit Eq. (8) to the observed D-distributions, we first need some estimates for zh and z0, which are used to compute the factor fb1(D). We obtain those estimates from the Z-distribution, which can be written analytically in terms of ΦD(D) (Eq. (10)). The advantage of writing this equation explicitly in terms of ΦD(D) is that we can directly use the observed D-distribution instead of its analytical expression (and compute the integral numerically), so that it is possible to fit the Z-distribution with only two free parameters, z0 and zh, independently of the fit for the distance distribution. All the fits were performed using the Levenberg-Marquardt least-squares minimization package mpfit (Markwardt 2009), implemented in IDL, and we have assumed Poisson uncertainties. The best fit of Eq. (10) to the observed Z-distribution of OCs is shown in Fig. 7a as a solid curve, and the corresponding fitted parameters are z0 = 14.7 ± 3.7 pc and zh = 42.5 ± 9.9 pc. These values are in excellent agreement with the ones derived by Bonatto et al. (2006), if we consider their scale height zh within the solar circle (which is the case for almost the totality of our OC sample).

The observed Z-distribution of ECs (Fig. 7b) is much more irregular than that of OCs, and therefore a proper fit is not possible. This is likely due to the fact that ECs are spread over a larger area than OCs, and therefore, present lower statistics in the solar neighborhood and larger average errors in Z (Z ∝ D). In addition, ECs are usually grouped in complexes, as we will see in Sect. 4.8 and can already be noted in Fig. 6b, where some particular locations appear crowded with many close objects, enhancing the non-uniformity of their spatial distribution. However, if we adopt the same parameters z0 and zh derived from the OC sample and compute the predicted distribution from Eq. (10) (naturally, using now the observed ΦD(D) of ECs), the resulting curve is roughly consistent with the observed Z-distribution, as shown in Fig. 7b. The most systematic discrepancy can be identified for Z < −40 pc, where there is a significant deficit of observed clusters with respect to the predicted distribution, probably due to the difficulty in detecting ECs below the Galactic disk for large distances. Indeed, Fig. 7b also shows the observed Z-distribution for ECs with D < 4 kpc and the corresponding prediction, and we can see that in this case the deficit of observed clusters below the Galactic plane is only marginal. Another explanation might be the fact that we have assumed that the b = 0 plane is parallel to the Galactic disk, while in reality the combined effect of the offset of the Sun above the “true” Galactic plane, and of the Galactic center below the b = 0 plane, slightly tilts the b = 0 plane towards the south of the Galaxy (see Goodman et al., in prep.), so that clusters at large distances from the Sun and below the Galactic plane would appear at more negative values in the true Z-distribution. 
This could help to populate the bins in the range of the deficit of observed clusters, and would also explain why the deficit is less important for the distribution of clusters with D < 4 kpc.

4.4.4. Fit for the distance distribution

Using now values for z0 and zh obtained from the OC sample, which are also consistent with the EC height distribution, to compute the factor fb1(D) defined in Eq. (6), we fitted the analytical distribution ΦD(D) from Eq. (8) to the observed D-distributions of OCs and ECs, with free parameters Σ0, Dc, s0. The last two parameters are implicit in the completeness factor fc(D) defined in Eq. (9). The best fits are overplotted as solid curves on the corresponding histograms of Fig. 8, and the fitted parameters are given in Table 4. As can be already noted in the plots and confirmed by the reduced χ2 values (0.90 for OCs, and 1.48 for ECs), the assumed form of the completeness fraction (Eq. (9)) is a good representation of the overall detectability of star clusters in the inner Galaxy. The few outliers in the observed distribution with respect to the fitted analytical function for OCs with distances D ≳ 6 kpc mainly correspond to exposed clusters recently discovered at infrared wavelengths. A similar tendency is hinted for ECs with D ≳ 11 kpc, although in this case these outliers are also consistent with the irregular nature of the distribution in general, which slightly deviates (at one-sigma level) from the fitted curve at other distance bins. However, some problems with the resolution of the KDA, resulting in ECs incorrectly assigned to the far distance, cannot be ruled out.
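For reference, the reduced χ2 quoted above can be computed from a binned distribution as in the following sketch. It assumes Poisson uncertainties of the form σ = √N on the observed counts and simply skips empty bins; the latter choice is our assumption, since the treatment of empty bins in the actual fits is not stated. The function name and the example counts are hypothetical.

```python
def reduced_chi2(observed_counts, model_counts, n_free_params):
    """Reduced chi-square with Poisson errors sigma_i = sqrt(N_i)."""
    chi2 = 0.0
    n_used = 0
    for n_obs, n_mod in zip(observed_counts, model_counts):
        if n_obs <= 0:
            continue  # assumption: empty bins carry no weight
        chi2 += (n_obs - n_mod) ** 2 / n_obs  # sigma_i**2 = N_i
        n_used += 1
    dof = n_used - n_free_params
    if dof <= 0:
        raise ValueError("not enough populated bins for this many parameters")
    return chi2 / dof

# Hypothetical histogram and model values, for illustration only.
print(reduced_chi2([10, 20, 15, 5, 0], [12, 18, 15, 6, 1], 3))
```

Values close to unity, as obtained here for OCs and ECs, indicate that the residuals are comparable to the assumed Poisson errors.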

It is remarkable that, despite the lower statistics caused by restricting to the ATLASGAL range, the fitted completeness limit of our OC sample, Dc = 1.01 ± 0.16 kpc, is consistent with that derived by Piskunov et al. (2006) for their all-sky sample in the solar neighborhood¹⁴. For ECs, both the completeness limit Dc and the completeness scale length s0 are larger than the corresponding values of the OC distribution (see Table 4), quantitatively confirming that, from an observational point of view, the EC sample traces larger distances from the Sun than the ones traced by our OC sample.

Table 4

Best-fit parameters from the Z- and D-distributions of OCs and ECs.

The fitted completeness limits for OCs and ECs are significantly above zero, practically discarding the possibility that the cluster samples are always incomplete in the inner Galaxy, as suggested by Bonatto et al. (2006) for OCs. To further test this option, we performed an alternative fit of Eq. (8) to the observed D-distributions, now fixing Dc = 0. For each distribution in Fig. 8, the resulting best fit is shown, and we immediately notice that this alternative fit is poorer than the one with Dc as free parameter, especially for OCs. Indeed, we applied a Kolmogorov-Smirnov test to all the fitted distribution functions in a distance range free of far-distance outliers (D ≤ 6 kpc for OCs, D ≤ 9 kpc for ECs), and we found that the Dc = 0 fit can be rejected with a significance level of 5% for OCs, and 6.5% for ECs. We thus conclude that the OC and EC samples in the inner Galaxy are roughly complete up to a distance of ~1 kpc and ~1.8 kpc, respectively, as derived from the free-Dc fits.
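A one-sample Kolmogorov-Smirnov comparison between a set of distances and a fitted cumulative distribution can be sketched as follows. This is a generic textbook version, not necessarily the implementation used here: the function names are ours, the p-value uses the asymptotic Kolmogorov series, and it does not correct for the fact that the model parameters were themselves fitted to the data.

```python
import math

def ks_statistic(sample, model_cdf):
    """Maximum distance between the empirical CDF and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs):
        cdf = model_cdf(x)
        d_plus = (i + 1) / n - cdf   # empirical CDF just after the step
        d_minus = cdf - i / n        # empirical CDF just before the step
        d_max = max(d_max, d_plus, d_minus)
    return d_max

def ks_pvalue_asymptotic(d_max, n, terms=100):
    """Asymptotic Kolmogorov p-value, a rough guide for small samples."""
    lam = math.sqrt(n) * d_max
    if lam < 0.2:
        return 1.0  # the series converges slowly here; p is essentially 1
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return max(0.0, min(1.0, 2.0 * total))

# Toy example: three values compared with a uniform CDF on [0, 1].
d = ks_statistic([0.1, 0.4, 0.7], lambda x: x)
print(d, ks_pvalue_asymptotic(d, 3))
```

In the present context, model_cdf would be the normalized cumulative integral of the fitted ΦD(D) over the distance range free of far-distance outliers.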

4.5. Discussion on the completeness

In general, the existence of a stellar cluster is observationally established by an excess surface density of stars over the background, so that its detectability depends on its richness, its angular size, the number of resolved individual members and their apparent brightness (which is directly related to the distance), the surface density of field stars, and the amount of extinction along the line of sight (Lada & Lada 2003). Consequently, it is particularly difficult to identify a star cluster in the inner Galactic plane, where both the stellar background and the extinction are relatively high, or a very distant cluster, whose members appear faint and could be mistaken for a few single stars due to the limited angular resolution of the observations. In fact, we have shown in the previous section that the current samples of OCs and ECs in the inner Galaxy are complete only up to a close distance from the Sun, beyond which the completeness decreases steeply with increasing distance.

We have also seen that incompleteness affects the OC sample more severely than the EC sample, i.e., the latter has a higher completeness limit and a less drastic decay in the completeness fraction. At first glance, this might seem contradictory since ECs are, by definition, embedded in molecular clouds and thus subject to a high degree of in situ dust extinction. However, at infrared wavelengths, ECs become easier to detect than exposed clusters because it is easier to distinguish them from the field population. Since ECs are usually associated with illuminated interstellar material, they can be identified by eye towards the locations of known nebulae or star-forming regions (e.g., Dutra et al. 2003a; Bica et al. 2003b; Borissova et al. 2011), even if the clusters are partially resolved or highly contaminated by extended emission. In other words, although bright nebular emission can prevent young stars from being found by point source detection algorithms and therefore hide the host EC from automated searches, it can at the same time help to identify such a cluster when searched for by eye against a high stellar background. For clusters with fainter or less irregular extended emission, automated searches can also take advantage of some distinctive characteristic of ECs (like the red-color criterion of our GLIMPSE search, see Sect. 2.1) to separate them from the background, which is in general not feasible for an evolved OC because its member stars present observational properties similar to those of the field population.

It is interesting to compare our distance distribution of ECs (Fig. 8b) with that of individual Spitzer-detected YSOs (Robitaille et al. 2008), as simulated by Robitaille & Whitney (2010) using a population synthesis model. They show that the synthetic YSOs that would have been detected by Spitzer and included in the Robitaille et al. (2008) catalog correspond to massive objects with a mass distribution that peaks at ~8 M⊙. The corresponding distance distribution of this model is presented in Fig. 1 of Beuther et al. (2012) for the 10° ≤ ℓ ≤ 20° range. The plot reveals a high number of far YSOs up to distances of ~14 kpc, showing that, despite the high extinction, individual (massive) YSOs can be detected deep into the Galactic plane, as opposed to ECs. We therefore think that the low detectability of a far EC is mainly due to the faint apparent brightness of its low-mass population and confusion of its members, so that the whole cluster might be misidentified as an individual massive young star. At near-infrared wavelengths, however, extinction could still play an important role in hiding a far EC.

4.6. Definition of a representative sample

We can quantify how many OCs and ECs we are missing within a certain distance from the Sun, using the analytical expressions for the observed distance distribution, ΦD(D) (Eq. (8)), and for the distance distribution that would be observed if we detected the totality of the clusters in the inner Galaxy (Eq. (7)), together with the fitted parameters given in Table 4. We define the cumulative completeness fraction, Fc(D), as the ratio of the number of observed clusters with distances ≤D to the number that would represent a complete sample within D (Eq. (11)). Now we can define a representative cluster sample as all objects with distances D ≤ Drep for which the fraction Fc(Drep) is above a certain threshold in both the OC and EC samples (in practice, the restriction is set by the OC sample alone, since it is more incomplete). We chose a threshold of 0.25, for which the distance has to be D ≤ 3.15 kpc. For simplicity, we just adopt Drep = 3.0 kpc, where Fc(Drep) = 0.28 and Fc(Drep) = 0.79 for the OC and EC samples, respectively. Note that although the selection of the threshold is somewhat arbitrary, if we keep in mind the above fractions, we only need a certain distance limit Drep where the samples are not too incomplete and at the same time have a reasonable absolute number of objects to perform a statistical analysis.
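The cumulative completeness fraction defined above is a ratio of two integrals and can be evaluated numerically, as in the sketch below. Several ingredients are assumptions made only for illustration: the complete distribution is replaced by a toy function growing linearly with D (a stand-in for Eq. (7) that ignores the latitude-coverage factor), the parameter values are placeholders rather than the fits of Table 4, and all names are ours.

```python
import math

def completeness_fraction(d, d_c, s_0):
    """f_c(D): unity up to d_c, exponential decay with scale s_0 beyond."""
    return 1.0 if d <= d_c else math.exp(-(d - d_c) / s_0)

def trapezoid(func, a, b, n_steps=2000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n_steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n_steps):
        total += func(a + k * h)
    return total * h

def cumulative_completeness(d_max, complete_distribution, d_c, s_0):
    """F_c(D): observed over complete cluster counts within d_max."""
    def observed(d):
        return completeness_fraction(d, d_c, s_0) * complete_distribution(d)
    return trapezoid(observed, 0.0, d_max) / trapezoid(complete_distribution, 0.0, d_max)

def toy_complete(d):
    """Hypothetical stand-in for Eq. (7): counts proportional to D."""
    return d

# Placeholder parameters, for illustration only.
print(round(cumulative_completeness(3.0, toy_complete, 1.0, 1.5), 3))
```

With the actual Eq. (7) and the fitted parameters in place of these placeholders, scanning d_max would locate the distance at which Fc crosses the adopted threshold.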

In Col. 4 of Table 3, we list the number of clusters with D ≤ 3.0 kpc for each morphological type; the total number of ECs in the representative sample is 98. To count the number of OCs, our definition requires that the clusters also be confirmed (ref_Conf not empty). The number of confirmed clusters with D ≤ 3.0 kpc is given in Col. 5 for each morphological type, from which we obtain a total number of 146 OCs in the representative sample. With the fractions Fc(Drep) computed before, it is also possible to estimate the number of clusters that we would observe within 3 kpc if we had complete samples of OCs and ECs. The corresponding estimates are listed in Col. 6, and were simply derived as Ncl(≤ Drep)/0.79 for EC types, and Ncl(≤ Drep)/0.28 for OC types. Note that the large number of OC2 clusters in this ideally complete sample is due to the fact that they cover a wide age range. The age distribution of our sample is analyzed in the next section.
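As a worked example of this correction, the sketch below applies the two cumulative completeness fractions to the total numbers of ECs and OCs in the representative sample. The function name is ours, and the per-type values of Table 3 (not reproduced here) may add up slightly differently because of rounding.

```python
def ideally_complete_count(n_observed, cumulative_completeness):
    """Number of clusters expected if the sample were complete."""
    if not 0.0 < cumulative_completeness <= 1.0:
        raise ValueError("completeness fraction must lie in (0, 1]")
    return n_observed / cumulative_completeness

n_ec = ideally_complete_count(98, 0.79)    # ECs with D <= 3 kpc
n_oc = ideally_complete_count(146, 0.28)   # confirmed OCs with D <= 3 kpc
print(round(n_ec), round(n_oc))  # 124 521
```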

4.7. Ages

We would expect that the ages of the stellar clusters increase along the morphological evolutionary sequence defined in Sect. 4.1. By dividing the cluster sample into such morphological types, we indeed obtained an increasing tendency in the corresponding age distributions. However, we were unable to estimate an average age or age ranges for each individual type, given the low number of clusters with available ages that fall within each category, except for OC2. In the whole sample, for types EC1, EC2, OC0 and OC1 there are, respectively, only 9, 16, 15 and 9 objects with age estimates, whereas for OC2 clusters there are 160. Note that for types OC0 and OC1, the total number of objects is also low (see Table 3), so that the main reason for the small number of age estimates is the low absolute statistics. On the other hand, for the much more numerous EC1 and EC2 morphological types (and possibly also part of the OC0 type), the lack of age estimates may simply be caused by the difficulties involved in obtaining these values.

It is still possible, however, to derive an upper limit for the ages of the ECs (EC1 and EC2 together), and also to study the age distribution of the whole OC population (OC0, OC1 and OC2 together), as described below.

4.7.1. Upper limit age of ECs

The EC ages compiled from the literature were estimated using a variety of methods, including: comparison with theoretical isochrones on a Hertzsprung-Russell diagram constructed after spectroscopic classification in the near infrared (e.g., Furness et al. 2010), use of the relation between the circumstellar disk fraction in the cluster and its age (following Haisch et al. 2001), and comparison with synthetic clusters constructed by Monte Carlo simulations (Stead & Hoare 2011), among others. We notice that, of the 25 ECs with available age estimates, two objects seem to be artificial outliers, with ages that are too old to be embedded, namely 7.5 ± 2.6 Myr and 25 ± 7.5 Myr (respectively, clusters VVV CL100 and VVV CL059 from Borissova et al. 2011)¹⁵. These two objects are precisely the only ECs in our sample whose age was determined together with the distance via isochrone fitting, and the high uncertainty of this method for very young clusters is indeed acknowledged by the authors (Borissova et al. 2011). In a few other cases where isochrone fitting was used to derive the age of an EC, an independent measure of the distance was used as input in order to reduce the uncertainty (e.g., Ojha et al. 2010).

Excluding these two outliers from our sample, we found that 90% (21 out of 23) of the ECs with available age estimates are younger than 3 Myr. Furthermore, given the high errors in this age range, even the remaining two clusters are consistent with being younger than 3 Myr, within the uncertainties: age of 3.3 ± 2.1 Myr for [BDS2003] 139 (Stead & Hoare 2011), and 4.2 ± 1.5 Myr for [DBS2003] 118 (Roman-Lopes 2007). We therefore adopt an upper limit of 3 Myr for the embedded phase, which represents a better constraint than the 5 Myr limit often quoted in the literature (from Leisawitz et al. 1989). Since practically all available EC ages in our sample are ≲3 Myr, the same result is obtained if we consider the representative sample (D ≤ Drep = 3 kpc), despite the low statistics (10 out of 11 ECs are formally younger than 3 Myr, after removing one outlier).

4.7.2. Age distribution of OCs

Fig. 9

Age distribution of OCs within the representative sample (D ≤ 3 kpc), using a logarithmic bin width of Δlog(Age/yr) = 0.25 and Poisson uncertainties. The solid curve corresponds to the fitted age distribution from Eq. (12), following Lamers & Gieles (2006), with best-fit parameters CFR = 0.93 ± 0.09 Myr⁻¹ and Mmax = (4.46 ± 0.85) × 10⁴ M⊙.

The much higher number of OCs with available age estimates allowed us to study their age distribution, which is shown in Fig. 9 for the representative sample (a total of 143 OCs). Assuming a constant cluster formation rate (CFR), the decreasing number of OCs as time evolves is due to the effect of different disruption processes. Lamers & Gieles (2006) provide a theoretical parameterization of the survival time of initially bound OCs in the solar neighborhood, taking four main mechanisms into account: stellar evolution, tidal stripping by the Galactic gravitational field, shocking by spiral arms, and encounters with giant molecular clouds. They show that the observed age distribution Φa(a) for a constant CFR and a power-law cluster initial mass function with a slope of −2 can be written as in Eq. (12), where a is the age, C is a constant, Mlim(a) is the initial mass of a cluster that, at an age a, reaches a mass equal to the detection limit (assumed to be 100 M⊙), and Mmax is the maximum initial mass of clusters that are formed. It can be shown that the cluster formation rate within the initial mass range [100 M⊙, Mmax] is related to the factor C by Eq. (13). We fitted Φa(a) from Eq. (12) to the observed age distribution of OCs in the representative sample, with free parameters C and Mmax; the input function Mlim(a) was obtained by digitizing the dashed curve in Fig. 2 of Lamers & Gieles (2006). We plot the resulting best fit as a solid curve in Fig. 9, corresponding to the parameters CFR = 0.93 ± 0.09 Myr⁻¹ and Mmax = (4.46 ± 0.85) × 10⁴ M⊙. It is clear from the figure that there is an excess of observed young OCs with respect to the fitted theoretical distribution, whereas for older ages the fit is a good representation of the data. The observed excess of young OCs could be the result of two effects. First, young OCs dominate at larger distances because they contain more luminous stars, so that within an incomplete sample the proportion of young OCs is relatively higher than that of older clusters (Piskunov et al. 2006). Second, since the parameterization of Lamers & Gieles (2006) considers the dissolution of initially bound OCs due to classical mechanisms, the observed over-population of young clusters might consist of associations, i.e., clusters which are already unbound due to disruption processes that are not accounted for by Lamers & Gieles (2006). These associations will quickly dissolve into the field and, therefore, will not be able to populate the older age bins of the distribution in the future.
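The structure of this model can be illustrated with a short sketch. For a cluster initial mass function proportional to M⁻², the number of clusters formed per unit time with initial masses between a lower limit and Mmax is C (1/M_low − 1/Mmax), with masses in solar units. We assume here that the age distribution and the formation rate take this form with M_low = Mlim(a) and M_low = 100 M⊙, respectively; this is our reading of the text, not a transcription of Eqs. (12) and (13). The function toy_m_lim is a purely hypothetical placeholder for the digitized Mlim(a) curve.

```python
def cimf_integral(m_low, m_high):
    """Integral of M**-2 between m_low and m_high (solar masses)."""
    return 1.0 / m_low - 1.0 / m_high

def age_distribution(age_yr, c_norm, m_max, m_lim):
    """Clusters per unit age still above the detection limit at this age."""
    m_low = m_lim(age_yr)
    if m_low >= m_max:
        return 0.0  # no cluster formed massive enough to survive this long
    return c_norm * cimf_integral(m_low, m_max)

def formation_rate(c_norm, m_max, m_detect=100.0):
    """Formation rate of clusters with initial mass in [m_detect, m_max]."""
    return c_norm * cimf_integral(m_detect, m_max)

def toy_m_lim(age_yr):
    """Hypothetical placeholder, NOT the Lamers & Gieles (2006) curve."""
    return 100.0 * max(1.0, age_yr / 1.0e7) ** 0.6

# Normalization that reproduces the fitted CFR under the assumed relation.
m_max = 4.46e4
c_norm = 0.93 / cimf_integral(100.0, m_max)
print(formation_rate(c_norm, m_max))
```

In this picture, the decline of the distribution with age comes entirely from the growth of Mlim(a): older clusters must have been born more massive to remain detectable.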

While the age-dependent incompleteness is likely playing a role within our Drep = 3 kpc limit, it is interesting to investigate whether or not there is also a contribution from the presence of associations, for which we need to restrict the sample to smaller distances, where the incompleteness is not important. We found that the excess of observed young OCs still holds if we perform the fit for samples restricted to successively smaller distances, down to D ≤ 1.4 kpc; nevertheless, the low statistics in the solar neighborhood within the ATLASGAL range prevents us from performing this test on an even more restricted subsample of our catalog. We therefore fitted the model to all-sky samples of OCs, namely, the Dias et al. (2002) catalog and the Kharchenko et al. (2005b,a) sample, restricted to a certain limit in projected distance, D. For clusters with D ≤ 0.6 kpc, in both samples, we recovered the results from Lamers & Gieles (2006)¹⁶, whose observed age distribution shows practically no excess of young OCs with respect to the fitted curve (see their Fig. 3). If we restrict the samples to D ≤ 1.4 kpc, however, the age distribution for the Dias et al. (2002) catalog presents a statistically significant over-population of young OCs, whereas for the Kharchenko et al. (2005b,a) sample the excess is only marginal.

Given that the Kharchenko et al. (2005b,a) sample is a subset of the Dias et al. (2002) catalog, this behavior means that the young excess in the sample with D ≤ 1.4 kpc cannot be due purely to the age-dependent incompleteness, since otherwise we would obtain a more noticeable effect in the less complete sample. There must therefore be a contribution from the presence of associations. The excess is less significant for the Kharchenko et al. catalog, and not noticeable for clusters in both samples with D ≤ 0.6 kpc, probably because there is an observational limitation in detecting associations at very close distances, due to their larger sizes. In summary, we think that the excess of young clusters in our representative OC sample (D ≤ 3.0 kpc) with respect to the theoretical description of Lamers & Gieles (2006) is caused by a combination of age-dependent incompleteness and the presence of associations.

The age distribution shown in Fig. 9 was constructed using a bin width large enough to ensure good statistics over the whole age range, but we can refine the grid to better constrain a given feature, as long as the representation remains statistically significant. By constructing the age distribution with smaller bin widths and repeating the fit, we found that the transition after which the theoretical description fits the data well occurs at an age of log(a/yr) ≃ 7.2, i.e., ~16 Myr. Consistently, we have seen in Sect. 4.3 that the ~16 Myr limit is roughly the age before which an observed OC might be either an association or a physical OC, whereas observed OCs older than that are practically always bound and are therefore disrupted through “classical” mechanisms over a longer timescale.
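The binning in logarithmic age and the conversion of the transition age can be sketched as follows. The function names, the age range of the histogram, and the example ages are our own illustrative choices; only the bin width of 0.25 dex and the value log(a/yr) = 7.2 come from the text.

```python
import math

def log_age_to_myr(log_age_yr):
    """Convert log10(age/yr) into an age in Myr."""
    return 10.0 ** log_age_yr / 1.0e6

def log_age_histogram(ages_yr, log_min, log_max, width):
    """Counts per bin of log10(age/yr); ages outside the range are dropped."""
    n_bins = int(round((log_max - log_min) / width))
    counts = [0] * n_bins
    for age in ages_yr:
        index = int(math.floor((math.log10(age) - log_min) / width))
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

print(round(log_age_to_myr(7.2), 1))  # 15.8

# Hypothetical ages in yr, binned with a 0.25 dex width between 6 and 10.
print(log_age_histogram([1.0e7, 1.6e7, 1.0e8], 6.0, 10.0, 0.25))
```

Reducing width refines the grid in the way described above, at the cost of fewer clusters per bin and therefore larger Poisson uncertainties.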

4.7.3. Young cluster dissolution

Similarly to the estimation of the cumulative completeness fraction (see Sect. 4.6), we can use the analytical expressions for the distance distributions from Sect. 4.4 to transform the absolute CFR in the representative sample to an incompleteness-corrected cluster formation rate per unit area, , representative of the inner Galaxy close to the Sun. It can be easily shown that the conversion is given by Eq. (14), where Deff is defined in Eq. (15). For the OC sample, Deff(Drep) = 1.28 kpc, which implies that the fitted cluster formation rate per unit area is Myr⁻¹ kpc⁻². This value can now be compared with the analogous parameter in the Lamers & Gieles (2006) fit for a complete all-sky sample within 0.6 kpc from the Sun, Myr⁻¹ kpc⁻². Together with the maximum mass of Mmax = 3 × 10⁴ M⊙ they obtain, we can see that both fits are consistent within the uncertainties, assuming that their errors are similar to ours (theirs are not provided). On the other hand, from the observed number of OCs in our representative sample with ages log(a/yr) < 7.2, we derive Myr⁻¹ kpc⁻² (using Poisson errors), which sets an upper limit of ~0.5 to the fraction of observed young OCs that are actually associations. The observed cluster formation rate corrected by age-dependent incompleteness is some value between and that can be parametrized as , where fadi is a factor in the range [0,1] (fadi = 0 for no age-dependent incompleteness, and fadi = 1 for no intrinsic young excess).

To obtain a realistic estimate of the fraction of young clusters that will dissolve or merge with other agglomerate(s), and therefore will not become physical OCs on their own, we also need an equivalent estimate for the formation rate of ECs. For that, we can simply take the local surface density Σ0 obtained from fitting the distance distribution of ECs (Table 4), and divide it by their upper limit age of 3 Myr, resulting in Myr⁻¹ kpc⁻². This EC formation rate, however, is not directly comparable to that of OCs, since within 3 kpc from the Sun we are likely detecting ECs with masses below the detection limit of 100 M⊙ adopted by Lamers & Gieles (2006) for OCs, as shown, e.g., by Lada & Lada (2003), whose EC catalog includes objects with masses down to 20 M⊙, with a large number of clusters with masses in the range [50,100] M⊙. Fortunately, we found that the uncertainty in the fraction of ECs with masses above 100 M⊙, f>100 M⊙, is not dominant and does not prevent us from computing a good estimate of the young dissolution fraction.

If we assume that f>100 M⊙ is in the range [0.1,1], the fraction of ECs and young exposed clusters, fdiss, that will not become physical OCs is given by Eq. (16), where the uncertainty has been numerically computed assuming Gaussian random variables, except for f>100 M⊙ and fadi, which were drawn from uniform probability distributions in the corresponding domains ([0,1] range for fadi, see above). The value is in excellent agreement with that obtained by Lada & Lada (2003). However, the explanation proposed by these authors, that this high fraction is produced by the dissolution of ECs after fast gas expulsion, has been modified (or extended) considerably in recent years. As we have reviewed in the Introduction, depending on the physical conditions of each individual system and its environment, several other phenomena can contribute to the high observed number of ECs relative to physical OCs, namely: dissolving associations from birth, merging of young subclusters, and young cluster dispersion due to tidal shocks from the environment or due to fast relaxation for small-N systems.
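The kind of numerical error propagation described above, mixing Gaussian and uniform random variables, can be sketched with a simple Monte Carlo experiment. Everything specific in this example is hypothetical: the functional form of the dissolution fraction is a simplified ratio that omits fadi and is not Eq. (16), and the input rates and their errors are placeholder numbers chosen only to make the script run.

```python
import random
import statistics

def hypothetical_dissolution_fraction(rate_oc, rate_ec, frac_massive):
    """Toy estimator: 1 minus the ratio of OC to comparable EC formation rates."""
    return 1.0 - rate_oc / (frac_massive * rate_ec)

def monte_carlo_fraction(n_draws=20000, seed=1):
    """Mean and standard deviation of the toy fraction over random draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        rate_oc = rng.gauss(0.5, 0.05)       # placeholder value and error
        rate_ec = rng.gauss(8.0, 0.8)        # placeholder value and error
        frac_massive = rng.uniform(0.1, 1.0)  # uniform in [0.1, 1], as in the text
        draws.append(hypothetical_dissolution_fraction(rate_oc, rate_ec, frac_massive))
    return statistics.mean(draws), statistics.stdev(draws)

mean_fraction, sigma_fraction = monte_carlo_fraction()
print(round(mean_fraction, 2), round(sigma_fraction, 2))
```

The design choice illustrated here is that poorly constrained factors are sampled uniformly over their allowed range rather than assigned a Gaussian error, so that the quoted uncertainty reflects the full range of plausible values.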

4.8. Correlations

Table 5

Statistics for each morphological type (in percentages).

In this section, we look for correlations between the morphological types defined in Sect. 4.1 and other information compiled in our cluster catalog, such as the MIR morphology and association with known objects. The percentages of clusters that satisfy the studied criteria within each morphological type are presented in Table 5. Column 2 gives the percentage of clusters that appear to be exciting PAH emission through UV radiation from their stars, as traced by bright diffuse 8 μm emission (12 μm for WISE) or the presence of IR bubbles (MIR morphology bub-cen, bub-cen-trig, or pah, see Sect. 3.1). Column 3 lists the fraction of clusters that seem to be triggering further star formation at the edge of the associated IR bubble (MIR morphology bub-cen-trig alone), whereas Col. 4 indicates the fraction of clusters that are located at the edge of an IR bubble (MIR morphology bub-cen-edge). Columns 5–7 give, respectively, the percentage of objects that are associated with IRDCs, H ii regions of any type, and UCH ii regions alone. Finally, Col. 8 lists the fraction of clusters that are part of a complex of several clusters (see Sect. B.6). In this table we present the statistics calculated for the whole cluster sample, because we obtained the same results for the representative sample, within the uncertainties (assumed to be Poisson errors). The only exception is the association with infrared dark clouds, for which we give the fractions within the representative sample. This is expected since an IRDC can only be identified at a relatively near distance because, to be detectable, it has to manifest itself as a dark extinction feature in front of the diffuse Galactic background. We also computed the statistics restricted to clusters with GLIMPSE data available, in order to minimize possible systematic errors arising from the lower resolution and sensitivity of the WISE images (see Sect. 
B.3), but since only 7% of the clusters have no GLIMPSE data, we obtained results identical to those presented in Table 5.

We note from the table that the presence of stellar feedback as traced by PAH emission and H ii regions is very important in the first four stages of the evolutionary sequence. When excluding UCH ii regions, we found that both indicators of feedback are roughly equivalent, i.e., the same clusters present both tracers. That a few clusters have PAH emission but no H ii region is probably due to the incompleteness of the current sample of H ii regions. Alternatively, in some cases we might be dealing with lower mass clusters whose UV radiation is strong enough to excite the PAH molecules, but not to produce a detectable region of ionized gas (Allen et al. 2007). On the other hand, the few H ii regions without PAH emission are probably more evolved, or UCH ii regions not identified as such. However, it is remarkable that, although the identification of an ultracompact region was based only on the literature, such objects are much more frequently associated with the first morphological type, which presumably covers the youngest clusters. The almost null correlation of OC2 clusters with indicators of stellar feedback is consistent with the fact that these clusters are mostly classical OCs and already gas-free.

Concerning triggered star formation, we see that only EC2, OC0, and OC1 clusters are able to produce it, in roughly 10% of the cases. EC1 clusters are not able to do so because they are too embedded and have not yet started to sweep up the surrounding material; in turn, their own formation might be triggered by another cluster or massive star, but only in a very small fraction of cases (see Col. 4). We warn, however, that our diagnoses of triggered star formation are based purely on morphology, so that its actual existence in these cases is by no means conclusive.

Infrared dark clouds are mostly associated with the first morphological type, confirming that they trace the earliest phases of star cluster formation. Interestingly, we found that the presence of IRDCs and PAH emission are almost mutually exclusive: within the representative sample, both tracers combined practically account for the totality of EC1 clusters, with almost null intersection. In other words, IRDCs and PAH emission trace, respectively, an earlier and later stage within the deeply embedded phase (type EC1). A simple interpretation for this behavior is that at some point IRDCs are “illuminated” by the radiation of the recently formed ECs, before their actual disruption, so that they become undetectable as extinction features in the mid-infrared but still prominent in the submm dust continuum emission traced by ATLASGAL.

Although we have not identified the totality of complexes of physically related clusters in our sample, Table 5 shows a clear tendency for ECs to be grouped in complexes. In contrast, OCs are much more isolated (the type OC2 dominates the OC population). Only those OCs that are still associated with some molecular gas (types OC0, OC1) present a similar degree of grouping with other clusters as ECs. This is consistent with the fact that star formation occurs in giant molecular cloud complexes with a hierarchical structure, in which star-forming regions with a relatively higher stellar density would be observationally identified as ECs. Many of them will dissolve, while others, if close enough, will undergo a merging process as a result of dynamical evolution, all on a timescale shorter than ~15 Myr (see Sect. 4.7). The final outcome, after the parent molecular cloud is destroyed, might therefore be very few or even a single physical OC, which will appear relatively isolated.

5. Conclusions

We have statistically studied all ECs and OCs known so far in the inner Galactic plane and their correlation with dense molecular gas, taking particular advantage of the improved cluster sample over the past decade and the ATLASGAL submm continuum survey, which traces cold dust and dense molecular gas. The main results and conclusions presented in this paper are summarized as follows:

  • 1.

    We compiled a merged full-sky list of 3904 ECs and OCs in the Galaxy, collected from several optical and infrared cluster catalogs in the literature, dealing properly with cross-identifications.

  • 2.

    As part of the above compilation, we performed our own search for ECs on the mid-infrared GLIMPSE survey, complementing the catalog of 92 exposed and less-embedded clusters detected by Mercer et al. (2005) on the same data. Our method basically consisted of visual inspection of three-color images around positions previously selected as potential YSO overdensities, which correspond to enhancements on a stellar density map of the GLIMPSE point source catalog filtered by a red color criterion. With this technique, we found 75 new clusters.

  • 3.

    The sample of 695 ECs and OCs within the ATLASGAL Galactic range (|ℓ| ≤ 60° and |b| ≤ 1.5°) was studied in more detail, particularly regarding the correlation with submm emission. We constructed an extensive catalog (available in electronic form at the CDS) with all the relevant information on these objects, including: the characteristics of the submm and mid-infrared emission; correlation with IRDCs, IR bubbles, and H ii regions; distances (kinematic and/or stellar) and ages; and membership in big molecular complexes.

  • 4.

    Based on the morphology of the submm emission and, for exposed clusters, on the agreement of the clump kinematic distances and cluster stellar distances, we defined an evolutionary sequence with decreasing correlation with ATLASGAL emission: deeply embedded clusters (EC1), partially embedded clusters (EC2), emerging exposed clusters (OC0), totally exposed clusters still physically associated with molecular gas in their surrounding neighborhood (OC1), and all the remaining exposed clusters, with no correlation with ATLASGAL emission (OC2).

  • 5.

    The morphological evolutionary sequence correlates well with other observational indicators of evolution. In particular, we found that IR bubbles/PAH emission and H ii regions are both equivalently important in the first four stages of the evolutionary sequence, suggesting that ionization is one of the main feedback mechanisms in our cluster sample. IRDCs are significant mostly in the first type (EC1), tracing a very early phase prior to the stage in which the EC starts to “illuminate” the host molecular clump while still embedded (EC1 clusters with PAH emission). The presence of big complexes containing several clusters is, again, relevant in the first four morphological types, which is consistent with the fact that star formation occurs in giant molecular clouds and that older OCs (OC2) are just the bound survivors of a very complex process of merging and dissolution of young agglomerates.

  • 6.

    We observationally defined an EC as any cluster with morphological types EC1 or EC2; OCs were defined as all the remaining types, OC0, OC1, and OC2, but were required to be confirmed by follow-up studies, in order to minimize the contamination by spurious candidates.

  • 7.

    We found that our observational definition of OC agrees with the physical one (a bound exposed cluster, referred to in this work as a physical OC) for ages greater than ~16 Myr. In our sample, some OCs younger than this limit can actually be associations.

  • 8.

    By fitting the observed heliocentric distance distribution for OCs and ECs within the ATLASGAL range, we found that our OC and EC samples are roughly complete up to a distance of ~1 kpc and ~1.8 kpc, respectively. Beyond these limits, the completeness of the OC and EC samples decays exponentially with scale lengths of ~0.7 kpc and ~1.8 kpc, respectively.

  • 9.

    We argued that ECs probe the inner Galactic plane more deeply than OCs because, at infrared wavelengths, ECs can be more easily distinguished from the field population than OCs. On the other hand, a very distant EC is hardly detected due to the combined effect of extinction, the faint apparent brightness of its low-mass population, and confusion of its members.

  • 10.

    From a subsample of 23 ECs with available age estimates, we derived an upper limit of 3 Myr for the duration of the embedded phase.

  • 11.

    We studied the OC age distribution within 3 kpc from the Sun, which was used to fit the theoretical parametrization of Lamers & Gieles (2006) of different disruption mechanisms for bound OCs. We found an excess of observed young OCs with respect to the fit, thought to be a combined effect of age-dependent incompleteness and the presence of associations for ages ≲16 Myr.

  • 12.

    We derived formation rates of 0.54, 1.18, and 6.50 Myr⁻¹ kpc⁻² for bound OCs, all observed young OCs, and ECs, respectively, which translates into an EC dissolution fraction of 88 ± 8%. This high fraction is thought to be produced by a combination of the following effects: dissolving associations from birth; merging of young subclusters; and young cluster dispersion due to fast gas expulsion, tidal shocks from environment, or fast relaxation for small-N systems.
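As an illustration of the completeness behavior summarized in item 8, the short Python sketch below evaluates a piecewise model that is unity out to the completeness limit and decays exponentially beyond it. The function name and the exact functional form (in particular the normalization to 1 inside the limit) are assumptions made here for illustration; only the approximate limits and scale lengths are taken from the text.

```python
import math

def completeness(d_kpc, d_complete_kpc, scale_kpc):
    """Illustrative completeness at heliocentric distance d_kpc.

    Assumed form: 1 inside the completeness limit, exponential decay
    with the given scale length beyond it.
    """
    if d_kpc <= d_complete_kpc:
        return 1.0
    return math.exp(-(d_kpc - d_complete_kpc) / scale_kpc)

# Approximate (completeness limit, scale length) in kpc quoted in item 8.
OC_PARAMS = (1.0, 0.7)
EC_PARAMS = (1.8, 1.8)

for d in (1.0, 2.0, 3.0, 5.0):
    oc = completeness(d, *OC_PARAMS)
    ec = completeness(d, *EC_PARAMS)
    print(f"d = {d:.1f} kpc: OC {oc:.2f}, EC {ec:.2f}")
```

Under these assumed parameters, the EC sample retains a higher completeness than the OC sample at every distance beyond ~1 kpc, in line with the statement that ECs probe farther into the inner Galactic plane.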

The new generation of all-sky near-infrared surveys, such as the UKIDSS Galactic Plane Survey (Lucas et al. 2008) and VISTA Variables in the Vía Láctea (VVV, Minniti et al. 2010), will constitute valuable tools to discover new OCs and ECs in the Galactic plane and to start filling in the highly incomplete parts of the plane beyond 1 or 2 kpc from the Sun (for OCs and ECs, respectively). In the future, we plan to update our cluster database for the inner Galaxy to include the new discoveries. Furthermore, the improved sensitivity and resolution of these surveys relative to 2MASS will allow studies of the stellar population of ECs which appear too crowded and/or faint in the 2MASS data. Very importantly, this will increase the number of young clusters with available estimates of their physical properties, such as ages and masses. In particular, stellar masses can be combined with estimates of gas masses (e.g., from ATLASGAL) to derive star formation efficiencies and investigate possible trends with the age and the presence of feedback, placing important constraints on star formation theories.

Appendix A: Cluster lists in the literature

In this appendix, we describe the diverse catalogs and references used for our cluster compilation, separated into three categories according to the wavelength at which the clusters are detected: optical, NIR and MIR clusters. Furthermore, we present a brief discussion of the contamination by false cluster candidates. Again, as for Table 1, the numbers of clusters quoted within the text represent values after removing these spurious objects and some globular clusters (listed in Table A.1), unless explicitly mentioned.

Appendix A.1: Optical clusters

Dias et al. (2002) provide the most complete catalog of optically visible OCs and candidates, containing revised data compiled from old catalogs and from isolated papers that were recently published. The list is regularly updated on a dedicated webpage17, with additional clusters seen in the optical and revised fundamental parameters from new references. We used version 3.1 (from November 2010), which contains 2117 objects, of which 99.7% have estimated angular diameters, and 59.4% have simultaneous reddening, distance and age determinations. Kinematic information is also given for a fraction of clusters: 22.9% of the list have both radial velocity and proper motion data. It should be noted that this catalog aims at collecting not only the OCs first detected in the optical, but also most of (ideally all) the clusters that were detected in the infrared and are visible in the optical. For example, 293 objects from the 998 2MASS-detected clusters of Froebrich et al. (2007b) were included in the last version of the catalog, based on by-eye inspection of the Digitized Sky Survey (DSS) images.

We also included in our compilation the list of new Galactic OC candidates by Kronberger et al. (2006), who visually inspected DSS and 2MASS images towards selected regions and subsequently analyzed the 2MASS color–magnitude diagrams of the candidates. The clusters were divided into different lists, some of them with fundamental parameters determined, and are all included in the Dias et al. (2002, ver. 3.1) catalog, except most of the stellar fields classified as suspected OC candidates (their Table 2e), which adds 130 objects to the optical cluster sample.

Appendix A.2: NIR clusters

Stellar clusters detected by NIR imaging, mainly from surveys of individual star-forming regions, are compiled from the literature by Porras et al. (2003), Lada & Lada (2003), and Bica et al. (2003a). The first two catalogs are exclusively limited to nearby regions (distances less than 1 kpc and ≃2 kpc, respectively); Bica et al. (2003a) did not use that restriction, but their list is only representative for nearby distances too (≲2 kpc). It is not surprising that the three compilations overlap considerably, as is shown in Table 1. All together, these catalogs contribute 297 additional objects with respect to the optical cluster sample.

However, most of the NIR clusters correspond to recent discoveries using the 2MASS survey. More than 300 new clusters were found by visual inspection of a huge number of 2MASS J, H, and especially Ks images (Dutra & Bica 2000, 2001; Bica et al. 2003b; Dutra et al. 2003a). In the pioneering work of Dutra & Bica (2000), 58 star clusters and candidates were originally detected by doing a systematic visual search on a field of 5° × 5° centered close to the Galactic center, and towards the directions of H ii regions and dark clouds for |ℓ| ≤ 4°, though most of them were observed later at higher angular resolution, and 36 turned out to be spurious detections mainly due to the high contamination from field stars in this area (see Sect. A.4). An additional 42 objects were discovered by Dutra & Bica (2001), who searched for ECs around the central positions of optical and radio nebulae in the Cygnus X region and other specific regions of the sky (they are included in the literature compilation by Bica et al. 2003a). They extended the method to the whole Milky Way (Dutra et al. 2003a; Bica et al. 2003b, southern and equatorial/northern Galaxy, respectively), inspecting a sample of 4450 nebulae collected from the literature, and they found a total of 337 new clusters.

In addition to the visual inspection technique, a large number of 2MASS star clusters have been discovered by automated searches, which are based on the selection of enhancements on stellar surface density maps constructed with the point source catalog. The early works of Ivanov et al. (2002) and Borissova et al. (2003) led to 14 detections (the ones not present in any of the catalogs mentioned above are counted in the “Not catalogued (NIR)” row of Table 1); similarly, Kumar et al. (2006) found 54 ECs of which 20 are new detections, focusing the search on the positions of massive protostellar candidates. More recently, Froebrich et al. (2007b) searched for 2MASS clusters along the entire Galactic plane with |b| ≤ 20°, automatically looking for star density enhancements, and manually selecting all remaining objects possessing the same visual appearance in the star density maps as known star clusters. They identified a total of 1788 star cluster candidates, 1021 of which turned out to be new discoveries and were presented as a catalog; an estimate of the contamination suggested that about half of these new candidates are real star clusters. A considerable number of objects from the Froebrich et al. (2007b) list have been analyzed in more detail by a variety of authors, and they were compiled by Froebrich et al. (2008). For these objects and the ones recently studied by Froebrich et al. (2010) (comprising a total of 68 clusters), we use the refined coordinates and diameters instead of the original ones. The follow-up studies compiled by Froebrich et al. (2008) also unveil 22 spurious clusters and one globular cluster (see Table A.1). A similar automatic 2MASS search done by Glushkova et al. (2010) in the |b| < 24° range, which includes the verification of the obtained star density enhancements by the analysis of color–magnitude diagrams and radial density distributions, produced a list of ~100 new clusters (most of them included in the last version of the catalog by Dias et al. 2002), providing physical parameters for a total of 168 new and previously discovered objects.

Expectations for the near future are that the new generation of all-sky NIR surveys, such as the UKIRT Infrared Deep Sky Survey (UKIDSS) and VISTA Variables in the Vía Láctea (VVV), will give rise to the discovery of many more stellar clusters, thanks to their improved limiting magnitude and angular resolution compared to 2MASS. A cluster search using these data has already been performed by Borissova et al. (2011), who found 96 previously unknown stellar clusters by visually inspecting multiwavelength NIR images of the VVV survey in the covered disk area (295° ≤ ℓ ≤ 350° and |b| ≤ 2°), towards directions of star formation signposts (masers, radio, and infrared sources). The objects listed in their catalog were required to present distinguishable sequences on the color–color and color–magnitude diagrams, after applying a field-star decontamination algorithm, in order to minimize the presence of false detections. Automated cluster searches in the UKIDSS and VVV surveys are being done by the corresponding teams18.

In our star cluster compilation, we also included recent NIR studies towards specific star-forming regions, or individual star clusters, which are not listed in the previous catalogs. In their NIR survey of 26 high-mass star-forming regions, Faustini et al. (2009) identified the presence of 23 clusters, 16 of which are new discoveries. Additional individual new objects are counted as “Not catalogued clusters (NIR)” in Table 1.

Appendix A.3: MIR clusters

As a result of the high sensitivity of the GLIMPSE mid-infrared survey, Mercer et al. (2005) managed to find 92 new star clusters (2 of which are globular clusters) using an automated algorithm applied to the GLIMPSE point source catalog and archive, and a visual inspection of the image mosaics to search for ECs (the GLIMPSE Galactic range at that time was 10° ≤ |ℓ| ≤ 65° and |b| ≤ 1°, excluding the inner part of the GLIMPSE II survey). The automatic detection method consisted of the construction of a renormalized star density map, which accounts for the varying background, the estimation of the clusters’ spatial parameters by fitting 2D Gaussians to the point sources with an expectation-maximization algorithm, and finally the removal of false detections by using a Bayesian criterion. This technique yielded 91 cluster candidates, 59 of which were new discoveries. Most of the clusters were detected applying a bright magnitude cut at 3.6 μm before the construction of the stellar density map. An additional 33 new ECs, which were missed by the automated method, were identified by the visual inspection.

However, simple by-eye examination of some GLIMPSE color images led us to conclude that there are still some ECs missing in the Mercer et al. (2005) list. Because of this (and also to cover the GLIMPSE II area) we performed a new semi-automatic search of the whole GLIMPSE data, focused on ECs, which increased the number of MIR clusters to a total of 164 objects19. The search is described in Sect. 2.1.

Appendix A.4: Spurious cluster candidates

The majority of the new IR star cluster catalogs compiled here are based on algorithmic or by-eye detections of stellar density enhancements on images of IR Galactic surveys, and do not provide any information about whether the identified objects are really composed of physically related stars or are instead produced by chance alignments on the line of sight. Owing to the patchy interstellar extinction, an apparent stellar overdensity can simply correspond to a low extinction region with high extinction surroundings. In addition, bright spatially extended emission might be incorrectly classified as unresolved star clusters embedded in nebulae. Confirmation of a real cluster can be achieved through deeper, high-resolution IR photometry or through spectroscopic observations of the candidate stellar members (e.g., Dutra et al. 2003b; Borissova et al. 2005, 2006; Messineo et al. 2009; Hanson et al. 2010; Davies et al. 2012), which in some cases enables the estimation of physical parameters. Though an important number of such studies have been carried out during the past decade, they still cover a small fraction of the total sample of cluster candidates to be confirmed, mainly because these objects represent relatively new discoveries and the observations needed for a more detailed analysis are very time-consuming.

Nevertheless, we can roughly estimate the contamination by spurious detections in our sample of cluster candidates in a statistical way. For example, by comparison of the basic characteristics (Galactic distribution, detection method, and morphology) of the cluster candidates with those of known clusters rediscovered by their method, Froebrich et al. (2007b) find that about 50% of their catalog entries correspond to false clusters. Detailed follow-up studies of unbiased subsets of objects from this catalog, only restricted to certain areas, have determined similar contamination fractions (Froebrich et al. 2008, and references therein). Another example is the Dutra & Bica (2000) catalog, where 52 (out of 58) candidates have been observed using higher resolution NIR imaging (Dutra et al. 2003b; Borissova et al. 2005), resulting in 36 previously unresolved alignments of a few bright stars (probably in most cases unrelated) which resemble compact clusters at the 2MASS resolution. This would imply a ~70% contamination by spurious detections, but we note that, since this catalog is based on a systematic search for sources projected close to the Galactic center, it is particularly affected by a higher number of background/foreground stars and more intervening dust, which all help to mimic (or hide) star clusters.

The subsequent 2MASS by-eye searches performed by this team (Dutra & Bica 2001; Dutra et al. 2003a; Bica et al. 2003b) cover the whole Galactic plane and, furthermore, they are focused on radio/optical nebulae which generally correspond to H ii regions, increasing the chance to find real stellar clusters. Typical spurious clusters associated with radio/optical nebulae represent one or a couple of bright stars plus extended emission (e.g., Borissova et al. 2005). We caution, however, that as the number of stars in these embedded multiple systems becomes larger, and under the assumption that the stars are physically related, the classification of a particular candidate as a spurious or possible cluster depends more strongly on how we define an EC. Under the definition used throughout this work (see Sect. 1.2), since we do not impose any constraint on the number of members, we expect a minimal contamination by false detections for clusters associated with molecular gas20. For exposed clusters, in contrast, the probability that a cluster candidate consists of only unrelated stars on the same line of sight is much higher. Based on the above discussion, we estimate an overall spurious contamination rate of ~50% for exposed clusters that have not been confirmed by follow-up studies.

In Table A.1 we list the spurious candidates within the compiled cluster catalogs that were not included in our final sample. This table comprises the false detections found by Dutra et al. (2003b) and Borissova et al. (2005), and the candidates from the Froebrich et al. (2007b) catalog listed as “not a cluster” by the literature compilation of follow-up studies by Froebrich et al. (2008). The other objects are a few globular clusters and false clusters or duplications found in this work, primarily from the literature revision of the cluster sample in the ATLASGAL range (see Appendix B).

Table A.1

List of spurious clusters, duplicated entries, and globular clusters within the catalogs used in this work.

Appendix B: Construction of the cluster catalog

Here, we report in detail the construction of our cluster catalog within the ATLASGAL Galactic range (|ℓ| ≤ 60° and |b| ≤ 1.5°), including explanations for all the assumptions and procedures made when compiling the used information, as well as descriptions for all the columns provided in the catalog. The catalog and a list of cited references are electronically available at the CDS, and an excerpt is given in Appendix C.

Appendix B.1: Designations, position and angular size

The basic information for each cluster is directly obtained from the original cluster catalogs compiled (see Sect. 2). The column ID is a record number from 1 to 695 with the clusters sorted by Galactic longitude. The cluster designation, based on the original catalog, is listed in the column Name, which was chosen, in general, to be consistent with the SIMBAD database identifier. Other common names, or designations from other catalog(s) (for clusters originally present in more than one catalog), are given in the column OName. In the column Cat, we provide the original cluster catalog(s) from which each object was extracted, using the reference ID defined in Table 1.

The position of each object is based on the equatorial coordinates listed in the original catalog(s). For multiple catalogs, we averaged the listed positions and angular sizes to obtain the final values given here, ignoring in some cases certain references that were considered less accurate or redundant (which are listed between parentheses in the column Cat). The Galactic coordinates are given in GLON and GLAT, whereas the equatorial coordinates (J2000.0) are listed in RAJ2000 and DECJ2000. The column Diam is the angular diameter in arcseconds.

Appendix B.2: ATLASGAL emission

From the ATLASGAL survey images, we extracted submaps centered on the cluster locations and with a field of view of max{30′, 2 × Diam} to search for submm dust continuum emission tracing molecular gas likely associated with the clusters, and to then characterize its morphology. The first computation needed to determine the presence of real emission in those fields is a proper estimation of the local rms noise level, σ, for which we used an iterative sigma-clipping procedure21 with a threshold of 2σ and a convergence criterion of 1% (iteration stops when the non-sky pixels are a fraction lower than 1% of the total of sky pixels of the previous iteration). With these chosen parameters, the computed values of σ agree well with quick estimates of the noise over emission-free regions identified by eye in some test fields. The average noise level is σ = 45 mJy/beam, and 95% of the total of fields have σ in the range [30,60] mJy/beam.
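The noise estimation described above can be sketched in Python as follows. This is only one possible reading of the procedure, not the routine actually used: we assume that pixels deviating from the median by more than 2σ are rejected as non-sky, and that the loop stops once the newly rejected pixels amount to less than 1% of the sky pixels of the previous iteration. The function name and the synthetic test field are hypothetical.

```python
import random
import statistics

def clipped_rms(pixels, threshold=2.0, convergence=0.01, max_iter=100):
    """Estimate the sky rms by iteratively rejecting outlying pixels."""
    sky = list(pixels)
    for _ in range(max_iter):
        center = statistics.median(sky)
        sigma = statistics.pstdev(sky)
        # Pixels farther than threshold*sigma from the median are non-sky.
        kept = [p for p in sky if abs(p - center) <= threshold * sigma]
        rejected = len(sky) - len(kept)
        previous_size = len(sky)
        sky = kept
        # Assumed convergence test: newly rejected pixels < 1% of previous sky.
        if rejected < convergence * previous_size:
            break
    return statistics.pstdev(sky)

# Synthetic field (mJy/beam): Gaussian noise plus a few bright "emission" pixels.
random.seed(1)
noise = [random.gauss(0.0, 45.0) for _ in range(5000)]
emission = [random.uniform(300.0, 2000.0) for _ in range(250)]
field = noise + emission

raw_rms = statistics.pstdev(field)
clipped = clipped_rms(field)
print(f"raw rms = {raw_rms:.0f} mJy/beam, clipped rms = {clipped:.0f} mJy/beam")
```

In this sketch the clipping removes the bright pixels that dominate the raw rms. A tight 2σ clip tends to bias the estimate somewhat low for purely Gaussian noise, so the sketch should not be expected to reproduce the values quoted in the text.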

Using the computed rms noise level of each field, we identified clumps of emission by applying the decomposition algorithm Clumpfind (Williams et al. 1994) in its IDL implementation for 2D data, clfind2d. This routine requires only two input parameters: 1) the intensity threshold, which determines the minimum emission to be included in the decomposition; and 2) the stepsize, which sets the contrast needed between two contiguous features to be identified as different clumps. We chose threshold = stepsize = 3σ, after visualizing the decomposition on some test fields and requiring that the obtained clumps were roughly similar to those that would be identified by the human eye. We slightly modified the IDL code of clfind2d to improve the clump decomposition and to avoid false detections. Originally, the code developed by Williams et al. (1994) deals with blended emission by splitting it into its corresponding clumps using a simple friends-of-friends method, but instead the current implementation breaks up the emission by assigning the blended pixels to the clump with the nearest peak, which produces some disconnected clumps, i.e., pixels of the same clump not connected by a continuous path. We thus replaced the peak-distance criterion with the minimum distance to a clump when assigning blended emission to the existing clumps, which noticeably reduces the number of disconnected clumps and resembles the friends-of-friends method. A second modification to the code was to require that the clumps have angular sizes larger than the beam in both image directions, in order to reject “snake”-shaped clumps marginally above the threshold, which correspond to minor image artifacts rather than real astronomical emission.
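The difference between the two assignment criteria, and the size requirement, can be illustrated with a toy Python example. This is a minimal sketch with hypothetical function names and pixel coordinates; it does not reproduce clfind2d or the modified IDL code, only the two rules for attaching a blended pixel to an existing clump and a simple extent check.

```python
import math

def assign_by_nearest_peak(pixel, peaks):
    """Attach a blended pixel to the clump whose peak is closest."""
    return min(peaks, key=lambda name: math.dist(pixel, peaks[name]))

def assign_by_nearest_clump(pixel, clumps):
    """Attach a blended pixel to the clump with the closest member pixel."""
    return min(
        clumps,
        key=lambda name: min(math.dist(pixel, q) for q in clumps[name]),
    )

def larger_than_beam(pixels, beam_pix):
    """True if the clump extent exceeds the beam (in pixels) along x and y."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return (max(xs) - min(xs) + 1) > beam_pix and (max(ys) - min(ys) + 1) > beam_pix

# Toy configuration: clump A is elongated, clump B is compact.
peaks = {"A": (0, 0), "B": (0, 9)}
clumps = {"A": [(0, y) for y in range(6)], "B": [(0, 8), (0, 9)]}
blended = (0, 6)

print(assign_by_nearest_peak(blended, peaks))    # nearest peak belongs to B
print(assign_by_nearest_clump(blended, clumps))  # adjacent pixel belongs to A
print(larger_than_beam(clumps["A"], beam_pix=3)) # one pixel wide: rejected
```

In this toy case the pixel touches clump A but lies closer to the peak of clump B, so the nearest-peak rule would attach it to B without a connecting path, whereas the nearest-clump rule keeps it with the contiguous clump.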

The employed algorithm assigns into clumps all the emission above the given threshold and with an extent larger than the beam. We computed the angular distance from the cluster center of the nearest detected ATLASGAL emission pixel to have a quick first impression of the presence of molecular gas. Such values are listed in the column Clump_sep, normalized to the cluster angular radius (when no emission is detected in the whole ATLASGAL submap, a lower limit is given).
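A minimal Python sketch of how a quantity like Clump_sep could be computed is given below. It assumes that the emission pixels are supplied as small-angle offsets (in arcseconds) from the cluster center, and that the lower limit for empty fields is half the submap field of view; the latter is an assumption made here, since the text does not specify how the lower limit is set. Function names are hypothetical.

```python
import math

def submap_fov_arcsec(diam_arcsec):
    """Field of view of the submap: max{30 arcmin, 2 x Diam}, in arcsec."""
    return max(30.0 * 60.0, 2.0 * diam_arcsec)

def clump_sep(diam_arcsec, emission_offsets_arcsec):
    """Return (separation in units of cluster radius, is_lower_limit)."""
    radius = diam_arcsec / 2.0
    if not emission_offsets_arcsec:
        # Assumed lower limit: no emission within half the field of view.
        return (submap_fov_arcsec(diam_arcsec) / 2.0) / radius, True
    nearest = min(math.hypot(dx, dy) for dx, dy in emission_offsets_arcsec)
    return nearest / radius, False

# Cluster of 120 arcsec diameter with emission pixels at two offsets.
print(clump_sep(120.0, [(30.0, 40.0), (200.0, 0.0)]))
# Same cluster with no emission detected anywhere in the submap.
print(clump_sep(120.0, []))
```

Values below 1 then indicate emission projected within the cluster radius, while large values (or lower limits) point to clusters without nearby submm emission.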

We also performed a careful visual inspection of every ATLASGAL submap, using an IDL script to overplot the positions of all star clusters of our sample within the field, and the submm clumps detected before, as well as any interesting object, such as the positions of measured molecular line velocities (see Sect. B.4.1). In another window, the script displays a smaller field of view (~10′) with the cluster itself as seen in the whole set of IR images (2MASS and GLIMPSE, including three-color images) overlaid with ATLASGAL contours, in order to morphologically compare the IR and the submm emissions. The column Clump_flag is a two-digit flag which indicates whether or not the cluster appears physically related to the nearest submm clump detected by Clumpfind, as seen by the inspection of these images. The first digit of Clump_flag can take the values: 0, when the nearest ATLASGAL clump does not seem to be associated with the cluster; 1, when it does seem to be clearly associated, especially for the cases of star clusters deeply embedded within centrally condensed ATLASGAL clumps; and 2, when the physical connection is less clear but still likely, in most cases when the clump appears to belong to the same star-forming region as the stellar cluster, connected by some diffuse MIR emission. The second digit of Clump_flag provides information about the line velocity available for each object and will be described in Sect. B.4.1.

The column Morph is a text flag composed of two parts separated by a period. The first part gives further information about the morphology of the detected ATLASGAL emission throughout the immediate star cluster area, including the cases: emb, p-emb, surr, few, few*, exp, and exp*, which are explained in Sect. 3.1. The second part indicates the MIR morphology and will be described in the next section.

Appendix B.3: MIR morphology and association with known objects

The mid-infrared morphology of a stellar cluster can also provide some clues about its evolutionary stage and presence of feedback, in particular the intensity and distribution of the 8.0 μm emission. We indicate in the second part of the column Morph (after the period) details about the 8.0 μm morphology of each cluster, after visually inspecting GLIMPSE three-color images made with the 3.6 (blue), 4.5 (green) and 8.0 μm (red) bands, as part of the process described in the previous section. This flag includes the cases: bub-cen, bub-cen-trig, bub-edge, and pah, which are explained in Sect. 3.1.

All IR bubbles associated with star clusters and recognized in this work are identified in the table column Bub. We give the bubble names from the catalogs by Churchwell et al. (2006, 2007) when the objects are listed there, otherwise an identifier based on the cluster ID is provided. We also list in this column IR bubbles that are located in the neighborhood of the clusters but that do not appear clearly associated with them or do not represent any of the scenarios defined above (e.g., bubble in the same star-forming region but not directly interacting with the cluster). Similarly, on the GLIMPSE three-color images and on the 8.0 μm images we identified the presence of an infrared dark cloud in which the cluster appears to be embedded (see Fig. 2, top). These objects are listed in the column IRDC using a name based on the cluster ID when the IRDC has not been catalogued so far, or the designations from the catalogs by Simon et al. (2006) and Peretto & Fuller (2009) if it was identified there before. Unlike the IR bubbles, since we do not provide information of the IRDCs within the Morph flag, we only list in the column IRDC those objects that exhibit possible physical connection with the cluster. Many of the IRDCs reported by Peretto & Fuller (2009) are only small dark fluctuations over a bright background and do not constitute cluster-forming clumps.

We note that, since the ATLASGAL Galactic range is wider than the GLIMPSE coverage, 7% of the cluster sample have no GLIMPSE data available, and this is indicated in the column no_GL (no_GL = 1 when there is no GLIMPSE data, otherwise no_GL = 0). In those cases, we used WISE three-color images made with the 3.4 (blue), 4.6 (green) and 12 μm (red) filters, to identify all the features described above. Prominent PAH bands are covered by the 12 μm filter; indeed, by comparing both sets of three-color images for clusters with GLIMPSE data available, we found that bright PAH 8.0 μm emission illuminated by the clusters is unambiguously detected at 12 μm. Similarly, most of the extended IRDCs identified at 8.0 μm can also be seen at 12 μm. However, because of saturation and the relatively low resolution, more detailed structures such as the presence of IR bubbles, smaller IRDCs, or possible triggered star formation are much harder to identify than in the GLIMPSE images.

In addition, we searched in the literature for the presence of H ii regions associated with the clusters, and they are listed in the column HII_reg with designations compatible with SIMBAD or common names used in the literature for large molecular complexes (see the references for complexes, ref_Complex, explained in Sect. B.6). Particular designations used here which do not exist in SIMBAD and do not belong to complexes are those starting with: “HRDS”, indicating the H ii regions discovered recently by Anderson et al. (2011) using radio recombination line (RRL) observations; and “RMS”, which represent possible H ii regions corresponding to radio continuum sources found by the RMS survey (see Sect. B.4.1 for a description of the on-line search we performed in that database; the objects listed here were taken from the “Radio Catalogue Search Results” section of the webpage of each individual RMS source investigated). It is worth noting that, for the H ii regions primarily found using SIMBAD, we carefully checked their nature in the literature by requiring the presence of radio continuum emission or RRLs, since some sources are misclassified as H ii regions in SIMBAD. Two important consulted references of RRL observations were Caswell & Haynes (1987) (sources with prefix [CH87]) and Lockman (1989) (sources with prefix [L89b]). We also specified two flags at the end of some names to indicate two particular situations: the flag “(UC)”, when the source is classified as an ultra compact H ii region in the literature; and the flag “(bub)”, when the H ii region appears associated with the listed IR bubble, but not directly with the star cluster. However, we note that classification as a UC H ii region may not be accurate, considering that detailed interferometric and large-scale observations are needed to really unveil the spatial distribution and evolutionary status of a particular H ii region.

Appendix B.4: Kinematic distance

As stated in Sect. 3.3, many of the ATLASGAL clumps at the locations or in the vicinity of the stellar clusters have measurements of molecular line LSR velocities. By assuming a Galactic rotation model, we can transform these velocities into kinematic distance estimates for the clumps and, therefore, for the corresponding clusters when they are assumed to be physically associated.
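
For reference, the standard relation behind this transformation can be sketched as follows, assuming purely circular orbits. Here Θ(R) is the rotation curve, ℓ and b are the Galactic coordinates of the source, and (R0, Θ0) are the Galactocentric radius and circular velocity of the Sun; the notation is chosen for this illustration.

```latex
V_{\rm lsr} = R_0 \, \sin\ell \, \cos b \,
  \left[ \frac{\Theta(R)}{R} - \frac{\Theta_0}{R_0} \right]
```

Solving this equation for R at the observed Vlsr is the first step; converting R into a heliocentric distance is then a purely geometrical second step.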

Appendix B.4.1: Line velocities

We used four main references of line velocities, which were systematically searched on the ATLASGAL submaps (positions overlaid there), in the following priority order: 1) follow-up NH3 (1,1) observations towards bright ATLASGAL sources (Wienen et al. 2012, for northern sources; and Wienen et al., in prep., for southern ones); 2) similar targets observed in the N2H+ (1−0) line (Wyrowski et al., in prep.); 3) the CS (2−1) Galactic survey by Bronfman et al. (1996) towards IRAS sources with colors typical of compact H ii regions; and 4) velocities of massive YSO candidates from the Red MSX Source (RMS) survey (Urquhart et al. 2008) available on-line22, corresponding mainly to targeted observations in the (1−0) and (2−1) transitions of 13CO, or literature velocities compiled there. The priority sequence was primarily based on the number of ATLASGAL clumps available in each of the lists, in order to make the velocity assignments more uniform; the RMS survey was put at the end because 13CO traces less dense gas than the other three molecules, which are unambiguously linked to the ATLASGAL emission. We note, however, that when the same clump is found in more than one list, the velocity differences are negligible compared to the error assumed for the computation of the kinematic distance (7 km s-1, see below). The adopted LSR velocity is listed in the column Vlsr (in km s-1) of the catalog. We give the corresponding reference in the column ref_Vlsr, and the source name in name_Vlsr (SIMBAD compatible or the one used in the original paper). If no velocity was available from any of the four main lists mentioned above, additional velocity references were found by doing a coordinate query in SIMBAD.

In some cases, we did not find any velocity for the closest detected ATLASGAL clump, but we did for another possibly associated clump or for the H ii region. This information is indicated in the second digit of the flag Clump_flag, which can take the values: 0, when no velocity is available; 1, when the listed velocity is from the nearest ATLASGAL clump or from a clump directly adjacent to it; 2, when the clump with the velocity is not the nearest but is within the cluster area (used in cases of optical clusters with large angular size); 3, when the velocity is from an ATLASGAL clump which is apparently associated with the cluster as seen in the images, but is independent of the nearest one; and 4, when we list the RRL velocity of the related H ii region. Considering the value of Clump_flag as a single integer number, i.e., combining the first digit, which gives information about the closest ATLASGAL clump (see Sect. B.2), with the second digit explained here, the kinematic distance computed from Vlsr can be assigned to the star cluster if Clump_flag ≥ 03.

Appendix B.4.2: Rotation curve

Once all the available LSR velocities had been collected, kinematic distances were calculated using a Galactic rotation curve. The widely employed rotation curve fitted by Brand & Blitz (1993) was based on a sample of H ii regions and reflection nebulae with known stellar distances, and their associated molecular clouds, which provide the velocity information. Most of these sources are located in the outer Galaxy, out to a Galactocentric radius R of about 17 kpc. They added to the sample the H i tangent point velocities available at that time to cover the inner Galaxy (i.e., R < R0, where R0 ~ 8 kpc is the distance from the Sun to the Galactic center). However, since they used a global functional form to simultaneously fit the inner and the outer Galaxy, this curve does not properly match the data for R < R0, as is shown, e.g., in Figs. 6 and 7 of Levine et al. (2008). These authors constructed an updated rotation curve for the inner Galaxy using recent high-resolution H i tangent point data. The linear function they fitted to R ≤ 8 kpc turned out to be steeper than the Brand & Blitz (1993) curve in that range, and better reproduces the increase of the rotation velocity with increasing R. Given that most of our studied sources are within the solar circle (R < R0), we decided to adopt the Levine et al. (2008)23 rotation curve for R/R0 ≤ 0.78, which is the point where it intersects the Brand & Blitz (1993) curve. For R/R0 > 0.78, we adopted the Brand & Blitz (1993) curve to cover large Galactocentric radii. We used this intersection point instead of the whole range available in Levine et al. (2008) to ensure continuity of the overall assumed rotation curve.
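
Schematically, the composite curve described above has the following form, with x = R/R0. The coefficients m and c (linear fit of Levine et al. 2008) and a1, a2 and a3 (power-law fit of Brand & Blitz 1993) are left symbolic in this sketch and must be taken from the original papers.

```latex
\frac{\Theta(x)}{\Theta_0} =
\begin{cases}
  m\,x + c,           & x \le 0.78 \quad \text{(Levine et al. 2008)} \\[4pt]
  a_1\,x^{a_2} + a_3, & x > 0.78   \quad \text{(Brand \& Blitz 1993)}
\end{cases}
\qquad \text{with} \qquad
m\,(0.78) + c \simeq a_1\,(0.78)^{a_2} + a_3 .
```

The last condition expresses the continuity at the junction point, which is the reason for switching curves at x = 0.78 rather than at the edge of the range fitted by Levine et al. (2008).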

It is worth mentioning that the fourth quadrant part of the same H i data used by Levine et al. (2008) was previously analyzed by McClure-Griffiths & Dickey (2007), who fitted their own rotation curve. As already suspected by Levine et al. (2008), the systematic shift of ~7 km s-1 between the two curves (see their Fig. 7) is due to the differences in determining the terminal velocities from the data. We note that the erfc fitting method (used by McClure-Griffiths & Dickey 2007) is conceptually equivalent to considering the half-power point of the tangent velocity profile. When the theoretical function derived by Celnik et al. (1979), which is a better approximation of the tangent velocity profile, is fitted instead, the half-power point is found to be shifted by ~0.7σv from the real terminal velocity (where σv is the typical velocity dispersion; see the proof in that paper). We thus favor the rotation curve by Levine et al. (2008), since they fitted Celnik et al. (1979) profiles to derive the tangent point velocities.

We did not use the more recent rotation curve by Reid et al. (2009) mainly because it is based on maser parallax distances of only 18 star-forming regions, which cover just the first and second quadrants, so that the obtained rotation curve is not fully representative of our Galactic range and, as the authors acknowledge, cannot conclusively be distinguished from a flat curve (which is the form they finally assumed). In addition, their recommended fit assumes that the massive star-forming gas orbits the Galaxy slower than expected for circular rotation, which has been questioned by some subsequent studies (Baba et al. 2009; McMillan & Binney 2010).

Appendix B.4.3: Derivation of the kinematic distances

Both rotation curves used here (Brand & Blitz 1993; Levine et al. 2008) were originally constructed assuming the standard IAU values for the Galactocentric radius and the orbital velocity of the Sun, R0 = 8.5 kpc and Θ0 = 220 km s-1, respectively. Nevertheless, it can be easily shown that the solution for x = R/R0 derived by applying these curves and a particular LSR velocity is practically independent of the choice of (R0, Θ0) (fully independent for the case of a linear rotation curve constructed from tangent point velocities, as for Levine et al. 2008), and that any scaling of the curve parameters to match updated values of (R0, Θ0) is equivalent to adopting the original parameters in all parts of the equations. The only thing we need afterwards is an accurate value for R0, to transform from the dimensionless solution x to the physical Galactocentric radius R. Moreover, it can also be shown that the solution does not depend on the exact definition of the LSR, provided that the rotation curves and the input data use the same solar motion (generally standard in radio telescopes), and that any possible correction is only important in the direction of the Galactic rotation, V (which is also true; see Table 5 of Reid et al. 2009; and Schönrich et al. 2010), so that if applied it would cancel out in the equations.
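
The full independence in the tangent-point case can be seen from a short sketch, assuming purely circular rotation, b ≈ 0, and the first quadrant (the fourth follows by symmetry). The symbol v_t(x) is introduced only for this illustration and denotes the terminal velocity measured along the line of sight whose tangent point has sin ℓ_t = x.

```latex
\begin{align}
  v_t(x) &= \Theta(x) - \Theta_0\, x
    \quad \Longrightarrow \quad \Theta(x) = v_t(x) + \Theta_0\, x , \\
  V_{\rm lsr} &= R_0 \sin\ell \left[ \frac{\Theta(x)}{x R_0} - \frac{\Theta_0}{R_0} \right]
               = \sin\ell \left[ \frac{\Theta(x)}{x} - \Theta_0 \right]
               = \sin\ell \; \frac{v_t(x)}{x} .
\end{align}
```

The last expression links the observed Vlsr to x through the measured terminal velocities alone, so that neither R0 nor Θ0 enters the solution.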

We then applied the original rotation curves and the velocities Vlsr with no correction, to solve for x = R/R0. To finally obtain R, we adopted R0 = 8.23 (± 0.20) kpc from Genzel et al. (2010), who computed the weighted mean of all recent direct estimations of the Galactic center distance from the Sun. We exclude from the kinematic distance estimation those sources with R < 2.4 kpc (only 2% of the cases), which is the point where the approaching and receding parts of the rotation curve constructed by Marasco & Fraternali (2012, using coarser resolution H i data, but covering smaller R) start to show significant differences, likely due to non-circular motions in the region of the Galactic bar. The Levine et al. (2008) curve covers radii R ≥ 3 kpc, which means that we implicitly extrapolated it to R = 2.4 kpc when we solved the equation for x.

There is a simple geometrical relation between the obtained Galactocentric radius R and the kinematic distance, but within the solar circle (in our sample, 99% of all kinematic distance estimations) a unique value of R results in two possible distances equally spaced on either side of the tangent point, which are referred to as the near and far distances. This is known as the kinematic distance ambiguity (KDA) problem. Fortunately, as discussed in Sect. B.4.4, there exist a number of methods that have been applied in the literature to an important fraction of the sample to solve the KDA, which allowed us to assign a unique kinematic distance in 92% of the cases. We list the 424 derived kinematic distances in the table column KDist (in kpc); when the KDA is not solved, both near and far distances are given separated by “/”. Uncertainties in these distances, provided in the column e_KDist, have been determined by shifting the LSR velocities by ±7 km s-1 to account for random motions, following Reid et al. (2009), who suggest this value as the typical virial velocity dispersion of a massive star-forming region. We acknowledge, however, that the error in the kinematic distance can be larger due to randomly oriented peculiar motions of up to 20 or 30 km s-1 with respect to Galactic rotation, as shown, e.g., by the hydrodynamical simulations by Baba et al. (2009). Similarly, such large systematic velocities have been found from maser parallax observations, leading to kinematic distances that are wrong by up to a factor of 2 (e.g., Xu et al. 2006; Kurayama et al. 2011). However, in some such cases it has also been found that the star-forming region does follow circular rotation (e.g., Sato et al. 2010).
With the assumed velocity dispersion of σv = 7 km s-1, there are some critical cases where we can only assign an upper limit for the near distance (|Vlsr| < σv), or a lower limit for the far distance (Vlsr within σv from the forbidden velocity); these cases are indicated in the table column KDist.
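
The procedure of this section can be illustrated with a minimal Python sketch. It assumes purely circular rotation and cos b ≈ 1, and it uses a flat rotation curve only as a placeholder; the composite curve actually adopted in this work is not reproduced here. All function names are hypothetical and chosen for this example.

```python
import math

R0_KPC = 8.23    # Galactic center distance adopted in the text (Genzel et al. 2010)
THETA0 = 220.0   # km/s, IAU value with which the original curves were built
SIGMA_V = 7.0    # km/s, assumed velocity dispersion


def flat_curve(x):
    """Placeholder rotation curve, Theta(x)/Theta0 = 1 (illustration only)."""
    return 1.0


def solve_x(vlsr, glon_deg, curve=flat_curve, x_lo=0.05, x_hi=5.0):
    """Solve Vlsr = Theta0 sin(l) [curve(x)/x - 1] for x = R/R0 by bisection."""
    sin_l = math.sin(math.radians(glon_deg))

    def f(x):
        return THETA0 * sin_l * (curve(x) / x - 1.0) - vlsr

    lo, hi = x_lo, x_hi
    if f(lo) * f(hi) > 0.0:
        raise ValueError("velocity not reachable within the bracket")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def near_far(x, glon_deg, r0=R0_KPC):
    """Return (near, far) in kpc; near is None outside the solar circle."""
    l = math.radians(glon_deg)
    # A negative discriminant means a velocity beyond the terminal one:
    # clamp the solution to the tangent point.
    root = math.sqrt(max((x * r0) ** 2 - (r0 * math.sin(l)) ** 2, 0.0))
    far = r0 * math.cos(l) + root
    near = r0 * math.cos(l) - root if x < 1.0 else None
    return near, far


def kinematic_distance(vlsr, glon_deg, curve=flat_curve):
    """Near/far distances for Vlsr and for Vlsr shifted by -/+ SIGMA_V."""
    return {dv: near_far(solve_x(vlsr + dv, glon_deg, curve), glon_deg)
            for dv in (-SIGMA_V, 0.0, SIGMA_V)}


result = kinematic_distance(50.0, 30.0)   # hypothetical source: Vlsr = 50 km/s, l = 30 deg
near, far = result[0.0]
print(f"near = {near:.2f} kpc, far = {far:.2f} kpc")
```

The entries for the shifted velocities bracket the distance and can be turned into an uncertainty in the spirit of e_KDist; any other curve can be passed through the curve argument as a function of x = R/R0.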

Appendix B.4.4: Resolution of the kinematic distance ambiguity

The solutions for the distance ambiguity found in the literature are given in the table column KDA, which indicates whether the source with available velocity (listed in name_Vlsr) is located on the near (KDA = N) or far side (KDA = F), or just at the tangent point (KDA = T). A companion question mark indicates a doubtful assignment, e.g., from low-quality flags in the original reference, but this happens for only 2% of the solutions. The most common methods for resolution of the distance ambiguity are (examples of references are given below): 1) radio recombination lines in conjunction with H i absorption toward H ii regions, called the H i Emission/Absorption method (H i E/A); and 2) H i self-absorption (H i SA) and molecular line emission towards molecular clouds and massive YSOs. We considered any source with Vlsr within σv = 7 km s-1 of the terminal velocity as consistent with being at the tangent point, and in general we assigned KDA = T. However, for some of these sources, there still exist reliable24 KDA solutions that can further constrain the kinematic distance to either the near (for which KDA = NT) or the far distance (KDA = FT).
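
The tangent-point rule described above can be sketched in a few lines of Python. The function name and the input v_terminal (the terminal velocity along the line of sight, which in practice follows from the adopted rotation curve) are hypothetical and introduced only for this illustration.

```python
def kda_flag(vlsr, v_terminal, literature=None, sigma_v=7.0):
    """Return the KDA flag for a source inside the solar circle.

    literature: "N", "F" or None (no published KDA solution).
    A source within sigma_v of the terminal velocity is treated as being
    at the tangent point; a reliable literature solution refines T to NT or FT.
    """
    near_tangent = abs(vlsr - v_terminal) <= sigma_v
    if near_tangent:
        return literature + "T" if literature in ("N", "F") else "T"
    return literature  # None means that the KDA remains unresolved


# Hypothetical velocities in km/s
print(kda_flag(95.0, 100.0))        # within 7 km/s of the terminal velocity
print(kda_flag(95.0, 100.0, "N"))   # same, with a literature near solution
print(kda_flag(50.0, 100.0, "F"))   # far from the tangent point
```

This sketch does not cover the doubtful assignments marked with a question mark.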

The following references for resolved KDAs were checked systematically (positions overplotted on the ATLASGAL submaps): Caswell & Haynes (1987, presence/absence of optical counterparts + H i absorption), Faúndez et al. (2004, application of a spiral arms model of the IV quadrant), Anderson & Bania (2009, H i E/A + H i SA), Roman-Duval et al. (2009, H i SA), and the RMS survey (Urquhart et al. 2008). For the RMS survey, which is an ongoing project, we took the KDA solutions from an on-line search we performed for every possibly associated source on “The RMS Database Server”25; these solutions arise from dedicated application of H i absorption methods (Urquhart et al. 2011, 2012), from the literature, or from grouping of sources close in phase space where there is at least one with resolved KDA. Additional KDA solutions were found through the SIMBAD coordinate query of each source, or from the reference from which the final cluster distance was adopted (e.g., a more accurate method such as maser parallax, see Sect. B.6). All used references are listed as integer numbers in the table column ref_KDA. An “*” following the number means that the source in the corresponding reference with resolved KDA is not located at the same position as the source from which we took the velocity, but is nearby in phase space (close position and similar velocity), indicating that it is likely connected. A reference between parentheses means that it contradicts the KDA solution adopted in this work (see below). Non-numeric flags in the column ref_KDA indicate complementary criteria used here to solve the distance ambiguity:

  • C: we adopt the KDA solution for the whole associated complex (see Sect. B.6), or from a particular source in the complex.

  • D: source associated with an IRDC, favoring the near distance (see the arguments given by Jackson et al. 2008).

  • O: out of the solar circle, i.e., no ambiguity in the kinematic distance.

  • S: adopted KDA solution consistent with the stellar distance (see Sect. B.5).

  • z: near distance adopted, since if located at the far distance the source would be too high above the Galactic plane. We adopted a height value of |z| = 200 pc to exclude the far distance, following Blitz (1991).
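
The height criterion of flag z can be illustrated with a short Python sketch. It measures the height from the b = 0 plane and ignores the height of the Sun above the Galactic plane, which is a simplification made for this example; the function name is hypothetical.

```python
import math


def far_distance_too_high(d_far_kpc, glat_deg, z_max_pc=200.0):
    """True if a source placed at the far distance would lie at |z| > z_max_pc."""
    z_pc = 1000.0 * d_far_kpc * math.sin(math.radians(glat_deg))
    return abs(z_pc) > z_max_pc


# Hypothetical sources: (far distance in kpc, Galactic latitude in deg)
print(far_distance_too_high(12.0, 1.2))   # about 251 pc above the plane
print(far_distance_too_high(5.0, 1.0))    # about 87 pc above the plane
```

When the function returns True, the far distance is excluded and the near distance is adopted.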

If two or more references or criteria delivered contradictory solutions for the KDA, in general we adopted the more recent one, or the one using a more accurate method. Although this decision is somewhat arbitrary, there are some reasonable guidelines that can be applied, e.g., we favor the consistency with the stellar distance or with the complex (flags S and C), and we adopted the solution from the H i E/A method when conflicting with the H i SA method, since the former has been found to be more robust (Anderson & Bania 2009). In any case, the KDA solutions from different references usually agree; discrepant ones represent only 12% of all resolutions and should not affect the statistical results of this work.

Appendix B.5: Stellar distance and age

A direct estimation of the distance to a cluster, i.e., from the member stars, is particularly useful when the accuracy is better than that of the kinematic distance from the gas (e.g., when a large sample of stars is used), or when the cluster is fully exposed and there is no nebula that can be associated with it. Using data from the original cluster catalogs and new references found in SIMBAD for each object, we compiled values for the stellar distance (in kpc; table column SDist) and its uncertainty (column e_SDist), as well as the age and its error (in Myr; columns Age and e_Age, respectively) computed by studies of the cluster stellar population. The corresponding references of the adopted parameters are listed in the columns ref_SDist and ref_Age. For the optical clusters in the Dias et al. (2002, see Sect. A.1) catalog, we generally used the original parameters given there, unless new estimates based on a better method (or data) provided a real improvement in accuracy. A more rigorous approach for multiple references of the same cluster would be similar to that taken in Paunzen & Netopil (2006), and is beyond the scope of this work. However, these authors concluded that their literature-averaged parameters have the same statistical significance as the data from the Dias et al. (2002) catalog, so that for the purposes of our work a correct estimation of the uncertainties (see below) is much more important than careful averaging. Out of the 216 clusters from the Dias et al. (2002) catalog present in our sample, 131 objects come with determinations of both age and distance (+4 clusters with only the distance). We adopted these parameters for most of the clusters (110 with original values, and 21 with new ones), and added parameters for 25 more. To keep track of all these changes, the original references used in the Dias et al. (2002) catalog are listed in the column ref_Dias.

The uncertainties in the cluster fundamental parameters are often ignored or underestimated in the literature; in particular, they are not provided in the Dias et al. (2002) catalog. We therefore collected all available errors from the corresponding references and, to prevent underestimation, we imposed uniform minimum uncertainties in the derived parameters. We also assumed these values as errors when they were not given in the literature. For the stellar distance, the minimum uncertainty was carefully chosen depending on the method used to calculate it, in order to correctly compare it with the kinematic distance (e.g., to decide which of the two distances is finally adopted, see Sect. B.6). All the most common methods for cluster distance determination use stellar photometry, so that the corresponding uncertainty is dominated by the errors from the absolute magnitude calibration and from the extinction estimation (e.g., Pinheiro et al. 2010). For the extinction, in addition to the statistical error intrinsic to the method, there is a systematic error produced by possible variations in the extinction law (e.g., Fritz et al. 2011; Moisés et al. 2011), which is often not considered in the literature and might be particularly relevant in the NIR regime. In the optical, we can consider that the typical extinction law assumed (RV ≃ 3.1, appropriate for diffuse local gas) is not subject to important variations, since the observed stars are relatively close to the Sun and not heavily embedded in the associated molecular clouds (if any), otherwise they would not be visible at these wavelengths. In the NIR, the extinction law can be described by a power law, Aλ ∝ λ^β, and the variations can be accounted for with different values for the exponent β. Using the typical spread in β obtained by Fritz et al. (2011) in their compilation, we found that the corresponding uncertainty in the K-band extinction is σ(AK) ≃ 0.2 AK.

In the following, we list the main methods for stellar distance determinations of the used references, and the corresponding minimum uncertainties adopted in this work:

  • Optical main-sequence (MS) or isochrone fitting (e.g., Kharchenko et al. 2005b; Loktin et al. 2001): in this case, we follow Phelps & Janes (1994) who estimated an uncertainty in distance modulus of σ(m − M) ~ 0.32, from a detailed analysis of the typical error in fitting a template main sequence to the optical color–magnitude diagram. This is equivalent to an error of ~15% in distance. Due to the fact that, from the point of view of the distance uncertainty, fitting a MS is analogous to fitting an isochrone, we also adopted a minimum error of ~15% for the isochrone method. Furthermore, this is consistent with the spread in distance modulus found by Grocholski & Sarajedini (2003, see their Table 2) in their comparison of different isochrone models.

  • NIR isochrone fitting (e.g., Tadross 2008; Glushkova et al. 2010): we adopted the same minimum distance error as for optical isochrone fitting, 15%. Extinction law variations might be present, but since the type of clusters for which isochrone fitting is possible are not severely extinguished (they are generally not young), the corresponding uncertainty in AK due to these variations is also low (recall σ(AK) ≃ 0.2   AK).

  • Optical spectrophotometric distance (e.g., Herbst 1975): here, we assumed an absolute magnitude calibration uncertainty of σ(MV) ≃ 0.5, consistent with the typical spread of massive OB star calibration scales (e.g., Martins et al. 2005), and an error in spectral type determination of 1 subtype, equivalent to ±0.3 mag in MV for the Martins et al. (2005) calibration. Adding both contributions in quadrature gives an overall uncertainty of ~0.58 mag in distance modulus, or ~27% in distance.

  • NIR spectrophotometric distance (e.g., Moisés et al. 2011): for calibration and spectral type errors, we adopted the same overall uncertainty of ~0.58 mag in distance modulus as for the optical method (absolute magnitudes are usually converted from the optical to the NIR using tabulated intrinsic colors with little error). We added in quadrature an uncertainty to account for possible extinction law variations: assuming a typical extinction of AK ≃ 1.5, σ(AK) ≃ 0.2   AK ≃ 0.3. The final error in distance modulus is ~0.66 mag, equivalent to ~30% in distance.

  • Average of spectrophotometric distances from many stars (e.g., Moisés et al. 2011; Pinheiro et al. 2010): redefining the errors here would mean a complete re-computation of the average distance, since the minimum errors should be imposed in every individual star. Fortunately, in general the uncertainty of the average is dominated by the variance of the sample rather than by the individual errors. We thus kept the original quoted uncertainty in this case.

  • Kinematic distance from average stellar radial velocity (e.g., Davies et al. 2008): for consistency with gas kinematic distances, here we recomputed the stellar kinematic distance using the cluster LSR velocity, a velocity dispersion of 7 km s-1 (in all cases higher than the quoted error in the cluster velocity) and the rotation curve as described in Sect. B.4. This special case is indicated with the flag “(K)” after the reference number in the column ref_SDist.

  • 10th brightest star method (Dutra et al. 2003b; Borissova et al. 2005): we do not use the stellar distances derived by applying this technique, because they are very uncertain. The errors can easily reach a factor of 10 or more in distance (Borissova et al. 2005), which places no useful constraint on the cluster location at Galactic scales.
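
The percentages quoted in the list above can be reproduced with a short Python sketch. It uses the first-order conversion σ(d)/d = (ln 10 / 5) σ(m − M), which is an assumption of this example (the text does not state which conversion was applied), and adds independent contributions in quadrature.

```python
import math


def frac_distance_error(sigma_dm):
    """First-order fractional distance error from a distance-modulus error (mag)."""
    return math.log(10.0) / 5.0 * sigma_dm


optical_fit = 0.32                                # MS or isochrone fitting
optical_spec = math.hypot(0.5, 0.3)               # calibration + 1 spectral subtype
nir_spec = math.hypot(optical_spec, 0.2 * 1.5)    # + extinction law term for A_K = 1.5

for label, sigma in [("MS/isochrone fitting", optical_fit),
                     ("optical spectrophotometric", optical_spec),
                     ("NIR spectrophotometric", nir_spec)]:
    print(f"{label}: sigma(m-M) = {sigma:.2f} mag, "
          f"distance error = {100 * frac_distance_error(sigma):.0f}%")
```

The three cases give distance-modulus errors of 0.32, 0.58 and 0.66 mag, which correspond to the minimum distance errors of ~15%, ~27% and ~30% adopted above.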

For the cluster ages, we simply adopted uniform minimum errors based on the corresponding age range, following Bonatto & Bica (2011): 35% for Age < 20 Myr, 30% for 20 Myr ≤ Age < 100 Myr, 20% for 100 Myr ≤ Age < 2 Gyr, and 50% for Age ≥ 2 Gyr. The most common method for age determination is isochrone fitting (e.g., Loktin et al. 2001). For a few clusters with stars studied spectroscopically, the age can be estimated using the evolutionary types of the identified stars and knowledge about their typical ages and lifetimes (e.g., Messineo et al. 2009). For a total of 209 clusters age estimates can be found in the literature (30% of our sample).
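
As an illustration, the minimum age errors can be written as a small Python function. The function names are hypothetical, and the handling of quoted errors follows our reading of the text: the minimum error acts as a floor, and is used directly when no error is quoted.

```python
def min_age_error_fraction(age_myr):
    """Minimum fractional age error for the age ranges quoted in the text."""
    if age_myr < 20.0:
        return 0.35
    if age_myr < 100.0:
        return 0.30
    if age_myr < 2000.0:
        return 0.20
    return 0.50


def adopted_age_error(age_myr, quoted_error_myr=None):
    """Error in Myr: the quoted value, but never below the minimum error."""
    floor = min_age_error_fraction(age_myr) * age_myr
    if quoted_error_myr is None:
        return floor
    return max(quoted_error_myr, floor)


# Hypothetical clusters (ages and quoted errors in Myr)
print(adopted_age_error(10.0, 1.0))   # quoted error below the 35% floor
print(adopted_age_error(10.0, 5.0))   # quoted error above the floor is kept
print(adopted_age_error(200.0))       # no quoted error: 20% of the age
```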

For some clusters of our sample for which no fundamental parameters are available, there are still some studies in the literature that present what can be considered as confirmations of the star cluster nature of the objects, i.e., the possibility of an erroneous identification as a cluster can be practically discarded. These references are given in the column ref_Conf of the catalog, and usually report NIR imaging observations with higher resolution and/or sensitivity, in which the star cluster is unequivocally revealed (e.g., Dutra et al. 2003b; Borissova et al. 2005; Kumar et al. 2004). They also comprise detailed studies towards star-forming regions which are too young to really constrain the cluster physical parameters by isochrone fitting, but where it is still possible to recognize YSO candidates within the cluster as color excess sources in color–color and color–magnitude diagrams (e.g., Roman-Lopes & Abraham 2006). The objects with both determined age and stellar distance can also be considered as confirmed stellar clusters, because the derivation of parameters usually requires the identification of the cluster sequence or stellar spectroscopy. We thus listed again the references for age and distance in the column ref_Conf, including in some cases additional references presenting further cluster analysis.

Appendix B.6: Complexes, subclusters, and adopted distance

Young star clusters are normally not found in isolation but within bigger complexes of gas, stars and other clusters, as a result of the fact that star formation occurs in giant molecular clouds with a hierarchical structure. If a group of stellar clusters in our sample was found to form a physically associated complex according to their positions and radial velocities, we identified it in the column Complex of the catalog. When the complex was identified in the literature, we list its name here (e.g., the giant molecular cloud W51; Kang et al. 2010). References for complex identification and analysis are provided in the column ref_Complex. Small complexes of clusters not previously established in the literature but whose morphology in the IR images (field of view of ~10′) suggests that they belong to the same star-forming region are indicated by Complex = MC-i, where i is a record number. Bigger complexes of stellar clusters not found in the literature and visually identified within the ATLASGAL fields (of ~30′) through the proximity of their members in phase space are marked by Complex = KC-j, where j is another record number. We caution, however, that since the complexes were recognized as part of the visual inspection of the maps, or were found in the literature, not all possible physical groupings of star clusters are provided here. For that, a subsequent statistical analysis is needed, which will be presented in a forthcoming paper. We also identified in the IR images a few cases where there is a pair of star clusters even closer, usually sharing part of their population, which can be considered as subclusters of a unique merging (or merged) entity. Those subclusters are indicated in the table column SubCl with an identical record number.

For all the clusters of our sample, the final adopted distances with their corresponding errors are listed in the table columns Dist and e_Dist (in kpc), respectively, and were chosen to be the available distance estimate with the lowest uncertainty, corresponding in some cases to a determination from the literature which was more accurate than SDist and KDist. Clusters within a particular complex were assumed to be all located at the same distance. The origin of the adopted distance is properly indicated in the column ref_Dist, and can be one of the following:

  • K: kinematic distance adopted, Dist=KDist.

  • S: stellar distance adopted, Dist=SDist.

  • Ref:n: adopted distance from literature reference with identification number n.

  • KC: complex distance computed kinematically from an average position and velocity, using the values compiled here for all the clusters within the complex with available (and not repeated) Vlsr, and the rotation curve used in Sect. B.4.

  • SC: complex distance computed by averaging the stellar distances (SDist) of the member clusters.

  • C(Ref:n): distance for the whole complex adopted from literature reference with identification number n.

  • CV(Ref:n): complex distance computed kinematically from an average position and velocity given by the reference with identification number n, and the rotation curve used in this work.

  • C(ID:m): adopted for the whole complex the distance given for the cluster with ID= m (used when a particular cluster within a complex has a very accurate distance estimation).
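
The basic selection rule for the adopted distance (the available estimate with the lowest uncertainty) can be sketched in Python as follows. The function name and the example values are hypothetical, and the additional rule that all clusters in a complex share the same distance is not implemented in this sketch.

```python
def adopt_distance(candidates):
    """Pick the estimate with the lowest uncertainty.

    candidates: list of (ref_Dist code, distance in kpc, error in kpc);
    entries with a missing distance or error (None) are ignored.
    Returns the selected tuple, or None if no estimate is available.
    """
    available = [c for c in candidates if c[1] is not None and c[2] is not None]
    if not available:
        return None
    return min(available, key=lambda c: c[2])


# Hypothetical cluster with a kinematic and a stellar distance
estimates = [("K", 3.2, 0.6), ("S", 2.9, 0.4)]
code, dist, e_dist = adopt_distance(estimates)
print(f"ref_Dist = {code}, Dist = {dist} kpc, e_Dist = {e_dist} kpc")
```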

Appendix B.7: Additional comments

Specific comments about the stellar cluster itself, or its compiled fundamental parameters (stellar distance and age) are provided in column Comments1. We give additional remarks about the ATLASGAL emission, the associated complex or other objects, or about the finally adopted distance in column Comments2. For comments, the quoted literature is indicated by the code Ref:n, where n is the identification number of the used reference.

Appendix C: Excerpt of the cluster catalog

This appendix gives an excerpt of the cluster catalog whose construction is explained in Appendix B. The totality of the catalog, together with a list of cited references, is electronically available at the CDS. Here, we present all the catalog columns (except columns Comments1 and Comments2 which are sometimes too wide for the paper version) for 50 (out of 695) stellar clusters. Only for presentation, here the columns are distributed in five tables (Tables C.1 to C.5), but the on-line version of the catalog is a single table. The names of the columns are the same as defined in Appendix B, and they are briefly described in the following (the corresponding sections of the paper in which they are explained in more detail are given in parentheses):

  • ID: identification number (Sect. B.1)

  • Name: main name (Sect. B.1)

  • OName: other designation (Sect. B.1)

  • Cat: catalogs from which each cluster was extracted (Sect. B.1)

  • GLON: Galactic longitude (Sect. B.1)

  • GLAT: Galactic latitude (Sect. B.1)

  • RAJ2000: right ascension (Sect. B.1)

  • DEC2000: declination (Sect. B.1)

  • Diam: angular size (Sect. B.1)

  • Dist: adopted distance (Sect. B.6)

  • e_Dist: distance error (Sect. B.6)

  • ref_Dist: distance reference (Sect. B.6)

  • Age: age (Sect. B.5)

  • e_Age: age error (Sect. B.5)

  • ref_Age: age reference (Sect. B.5)

  • Morph_type: morphological type (Sect. 4.1)

  • Morph: morphological flag (Sect. 3.1)

  • Clump_sep: projected distance to the nearest ATLASGAL emission pixel (Sect. B.2)

  • Clump_flag: gives information about the correlation with ATLASGAL and line velocity available (Sects. B.2 and B.4.1)

  • name_Vlsr: source name for line velocity (Sect. B.4.1)

  • Vlsr: gas line velocity (Sect. B.4.1)

  • ref_Vlsr: reference for line velocity (Sect. B.4.1)

  • KDist: kinematic distance (Sect. B.4.3)

  • e_KDist: error in the kinematic distance (Sect. B.4.3)

  • KDA: solution of the kinematic distance ambiguity (Sect. B.4.4)

  • ref_KDA: reference for the KDA solution (Sect. B.4.4)

  • SDist: stellar distance (Sect. B.5)

  • e_SDist: error in the stellar distance (Sect. B.5)

  • ref_SDist: reference for the stellar distance (Sect. B.5)

  • ref_Dias: reference for stellar parameters adopted in the Dias et al. (2002) catalog (Sect. B.5)

  • ref_Conf: reference for cluster confirmation (as real cluster) or further studies (Sect. B.5)

  • HII_reg: associated H ii region (Sect. B.3)

  • Bub: associated infrared bubble (Sect. B.3)

  • IRDC: associated infrared dark cloud (Sect. B.3)

  • no_GL: indicates when there is no GLIMPSE data available (Sect. B.3)

  • SubCl: groups subclusters (Sect. B.6)

  • Complex: groups spatially associated clusters (Sect. B.6)

  • ref_Complex: reference for complex identification (Sect. B.6)

Table C.1

Excerpt of the cluster catalog (Cols. 1–8).

Table C.2

Excerpt of the cluster catalog (Cols. 9–17).

Table C.3

Excerpt of the cluster catalog (Cols. 18–25).

Table C.4

Excerpt of the cluster catalog (Cols. 26–32).

Table C.5

Excerpt of the cluster catalog (Cols. 33–38).


1

In combination with distance information for cases of ambiguous physical relation.

2

Throughout this paper, we use the term angular resolution to refer to the full width at half-maximum of the point-spread function (or telescope beam).

4

Referring to the fact that the clusters were finally selected on the GLIMPSE three-color images.

6

This situation is conceptually different from the one indicated by the flag E8 for G3CC objects (see Sect. 2.1), where any extended 8.0 μm emission in the vicinity of the cluster is flagged. Here, the emission has to be located throughout most of the cluster area and appear to be produced by the cluster as a whole.

7

This is a recent catalog of IR bubbles which is much more complete than the Churchwell et al. (2006, 2007) catalogs, but was not used in this work because it was published after our cluster catalog was constructed. In any case, we searched for IR bubbles by eye at every cluster position to describe the MIR morphology (see Sect. 3.1).

8

Before converting to physical units, we corrected a mistake in the original equation by Gieles & Portegies Zwart (2011): the transformation from virial radius to projected half-light radius is just 16/(3π) for a Plummer model, so that the constant in their equation is [32/(3π)]^(3/2) = 6.26 instead of 10.
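
The arithmetic behind the corrected constant can be checked with a minimal Python sketch. The variable names are hypothetical and chosen only for illustration; the factor 16/(3π), the numerator 32 = 2 × 16, and the exponent 3/2 are taken from this footnote as given.

```python
import math

# Factor relating the virial radius and the projected half-light radius
# for a Plummer model, as quoted in the footnote: 16 / (3 * pi).
plummer_factor = 16.0 / (3.0 * math.pi)

# Corrected constant quoted in the footnote: [32 / (3 * pi)]^(3/2),
# written here as (2 * plummer_factor)^(3/2).
corrected_constant = (2.0 * plummer_factor) ** 1.5

print(f"{corrected_constant:.2f}")
```

The printed value rounds to 6.26, in agreement with the number quoted above.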

9

In practice, we did not distinguish between the distance d and the projected distance D = d cos b. Since the maximum latitude within the ATLASGAL range is |b| = 1.5°, the difference is at most about 0.03%, far below the distance uncertainties.
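
As a rough check of this estimate, the relative difference between d and D = d cos b is (d − D)/d = 1 − cos b. The short Python sketch below evaluates it at the maximum latitude quoted in the footnote; the variable names are hypothetical.

```python
import math

# Maximum |b| within the ATLASGAL range, as stated in the footnote (degrees).
b_max_deg = 1.5

# Relative difference between d and D = d * cos(b): (d - D) / d = 1 - cos(b).
rel_diff = 1.0 - math.cos(math.radians(b_max_deg))

print(f"{100.0 * rel_diff:.3f} %")
```

The result is a few hundredths of a percent, which is negligible compared with typical distance uncertainties.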

10

In this paper, for simplicity we have assumed that the b = 0 plane is parallel to the “true” Galactic plane, although in reality this is not the case (Goodman et al., in prep.). While this has a negligible effect on the distance distribution and the completeness, it may distort the derived height distribution when considering clusters at large distances from the Sun (see Sect. 4.4.3).

11

Ideally, one should consider a completeness fraction that also depends on Galactic longitude, fc(D,ℓ), as we expect lower cluster detectability for low |ℓ|, where the stellar background is higher. However, since we made the approximation ϕ(D,ℓ) = 1, the integration in longitude would only affect the term fc(D,ℓ), and therefore the factor fc(D) we used can be thought of as a longitude-averaged completeness fraction.

12

WEBDA is an on-line OC database originally developed by Mermilliod (1996), and available at http://www.univie.ac.at/webda/; the clusters of this database are included in the Dias et al. (2002) catalog.

13

We checked by numerical integration that the rise of the surface density distribution toward the inner Galaxy due to an exponential Galactic disk is practically imperceptible for D < 1 kpc; therefore, a flat distribution cannot be the combined result of incompleteness and exponential disk structure.

14

Very recently, a significant effort in obtaining distances and other parameters of most of the known OCs and ECs has been published by Kharchenko et al. (2013), who claim an overall completeness limit of 1.8 kpc. Since ECs are not dominant within a complete sample, the new limit represents an intrinsic improvement in the OC completeness.

15

Note that the quoted uncertainties are from our catalog, which might be larger than the values given in the original paper because we adopted minimum errors for the age estimates (see Sect. 3.3).

16

This is fully expected for the Kharchenko et al. sample, since Lamers & Gieles (2006) used basically the same clusters. The only difference is that they did not include the objects newly detected by Kharchenko et al. (2005a). On the other hand, the fact that we obtain the same result for the Dias et al. (2002) sample implies that there are no systematic effects arising from differences between the two samples, in particular regarding the age estimates.

18

According to unpublished data, there seem to be more than 300 new clusters detected so far by the UKIDSS team. An independent automated search on UKIDSS, leading to the discovery of 167 additional clusters and multiple star forming regions, has already been published by Solin et al. (2012), after the last update of our cluster compilation was done.

19

Including 3 additional GLIMPSE clusters from the literature counted as “Not catalogued clusters (MIR)” in Table 1.

20

For consistency with earlier studies, however, we nevertheless excluded from our sample a few EC candidates that have been considered spurious in the literature.

21

We use the routine meanclip from the IDL Astronomy User’s Library.
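
For readers unfamiliar with this kind of routine, the following Python sketch illustrates generic iterative sigma clipping. It is not a port of meanclip: the function name, the default clipping threshold, the iteration limit, and the convergence rule are all assumptions made for illustration.

```python
import statistics


def clipped_mean(values, clip_sigma=3.0, max_iter=5):
    """Iteratively sigma-clipped mean (illustrative sketch, not meanclip itself)."""
    data = list(values)
    for _ in range(max_iter):
        if len(data) < 2:
            break  # a standard deviation needs at least two points
        center = statistics.median(data)  # clip about the median
        spread = statistics.stdev(data)   # sample standard deviation
        kept = [v for v in data if abs(v - center) <= clip_sigma * spread]
        if len(kept) == len(data):
            break  # nothing was rejected, so the selection has converged
        data = kept
    return statistics.mean(data)


# Hypothetical data: ten values near 10 plus one strong outlier.
sample = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 9.9, 50.0]
print(clipped_mean(sample))
```

In this example the outlier is rejected in the first iteration, and the returned mean is that of the remaining ten values.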

23

Levine et al. (2008) provide a rotation curve as a function of both Galactocentric radius, R, and height off the Galactic plane, z. Here we z-averaged their rotation curve, so that it only depends on R.

24

Considering that the source is near the tangent point and some method/solution combinations are no longer valid. Examples of reliable solutions are: an associated stellar distance, a far solution from the H i E/A method, or a near solution from the H i SA method.

Acknowledgments

We thank the referee for making useful suggestions that improved the clarity of the paper, and Thomas Robitaille for reading the manuscript and providing helpful comments. We acknowledge the useful discussions with Pavel Kroupa, Maria Messineo (about the GLIMPSE search for ECs), and Marion Wienen (about kinematic distances). We also benefited from email discussions with D. Froebrich (about his catalog of clusters), A. Moisés (about NIR spectrophotometric distances), and M. Gieles (about Eq. (1)). This research is based on: data from the ATLASGAL project, which is a collaboration between the Max-Planck-Gesellschaft (MPIfR and MPIA), the European Southern Observatory and the Universidad de Chile; observations made with the Spitzer Space Telescope, which is operated by the Jet Propulsion Laboratory, California Institute of Technology under a contract with NASA; data products from the 2MASS, which is a joint project of the University of Massachusetts and the Infrared Processing and Analysis Center/California Institute of Technology, funded by the National Aeronautics and Space Administration and the National Science Foundation; and data products from the WISE, which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration. This work has made use of the SIMBAD database, operated at the CDS, Strasbourg, France, NASA’s Astrophysics Data System, and the VizieR database of astronomical catalogs (Ochsenbein et al. 2000). This paper has made use of information from the Red MSX Source survey database at http://www.ast.leeds.ac.uk/RMS which was constructed with support from the Science and Technology Facilities Council of the UK. E.F.E.M. was supported for part of this research through a stipend from the International Max Planck Research School (IMPRS) for Astronomy and Astrophysics at the Universities of Bonn and Cologne.
This work was partially carried out in the Max Planck Research Group “Star formation throughout the Milky Way Galaxy” at the Max Planck Institute for Astronomy (MPIA).

Online material

Table 2

New GLIMPSE stellar clusters identified in this work.

All Tables

Table 1

Number of clusters for every catalog used in this work.

Table 3

Number of clusters in each morphological type.

Table 4

Best-fit parameters from the Z- and D-distributions of OCs and ECs.

Table 5

Statistics for each morphological type (in percentages).

Table A.1

List of spurious clusters, duplicated entries, and globular clusters within the catalogs used in this work.

All Figures

Fig. 1

Spitzer-IRAC three-color images made with the 3.6 (blue), 4.5 (green) and 8.0 μm (red) bands, of six (out of 75) new ECs discovered in this work, using the GLIMPSE survey. The dashed circles represent the estimated angular sizes. The images are in Galactic coordinates and the given offsets are with respect to the cluster center, indicated at the bottom of each panel.

Fig. 2

Examples of the two morphological types defined for ECs (see Sect. 4.1): cluster G3CC 38 of type EC1 (top panels), and the cluster [DBS2003] 113 of type EC2 (bottom panels). The left panels show Spitzer-IRAC three-color images made with the 3.6 (blue), 4.5 (green) and 8.0 μm (red) bands. The right panels present 2MASS three-color images of the same field of view, constructed with the J (blue), H (green), and Ks (red) bands. The overlaid contours on the 2MASS images represent ATLASGAL emission (870 μm); the contour levels are {5, 8.8, 15, 25, 46, 88, 170} × σ, where σ is the local rms noise level (σ = 45 mJy/beam for G3CC 38, and σ = 42 mJy/beam for [DBS2003] 113). The images are in Galactic coordinates and the given offsets are with respect to the cluster center, indicated in the left panels below the cluster name. The dashed circles represent the estimated angular sizes from the original cluster catalogs (see Sect. B.1). The 1 pc scale-bar was estimated using the corresponding distance adopted in our catalog.

Fig. 3

Examples of the three morphological types defined for OCs (see Sect. 4.1): cluster [DBS2003] 176 of type OC0 (top panels), cluster NGC 6823 of type OC1 (middle panels), and cluster BH 222 of type OC2 (bottom panels). The local rms noise level of the ATLASGAL emission is, respectively, 36, 46, and 29 mJy/beam. See caption of Fig. 2 for more details of the images.

Fig. 4

Comparison of kinematic and stellar distances for the 38 clusters of our sample with both estimates available. Plus signs (+) indicate agreement within the errors, and circles mark the discrepant cases. Colors indicate which distance estimate was finally adopted in our catalog: stellar (red), kinematic (blue), and other (black). The dashed line is the identity.

Fig. 5

Crossing time vs. age for an all-sky sample of 236 clusters (Piskunov et al. 2006) taken from a homogeneous catalog of 650 optical clusters in the solar neighborhood (Kharchenko et al. 2005b,a). The dashed line is the identity tcross = Age, which divides the physical OCs (tcross ≤ Age) from associations (tcross > Age) according to the criterion proposed by Gieles & Portegies Zwart (2011).

Fig. 6

Galactic locations of a) OCs and b) ECs within the ATLASGAL range, superimposed over an artist’s conception of the Milky Way (by Hurt of the Spitzer Science Center, in consultation with Benjamin), which was based on data obtained from the literature at radio, infrared, and visible wavelengths, and attempts to synthesize many of the key elements of the Galactic structure. The coordinate system is centered on the position of the Sun, indicated by the “⊙” symbol, and we have scaled the image such that R0 = 8.23 kpc (Genzel et al. 2010). The two diagonal lines represent the ATLASGAL range in Galactic longitude (|ℓ| ≤ 60°). In panel a), we indicate the names of the spiral arms.

Fig. 7

Histogram of heights from the Galactic plane, as measured from the Sun (Z = z − z0), for a) OCs and b) ECs, using a bin width of ΔZ = 10 pc and Poisson uncertainties. The overplotted solid curve in each panel represents: a) the fitted Z-distribution ΦZ(Z) from Eq. (10) with best-fit parameters z0 = 14.7 ± 3.7 pc and zh = 42.5 ± 9.9 pc; b) the predicted Z-distribution from Eq. (10), using the parameters fitted for the OC sample. In panel b), the darker shaded region is the Z-histogram for ECs with distances D < 4 kpc, whereas the dashed curve indicates the corresponding distribution as predicted from Eq. (10) and the same parameters z0 and zh.

Fig. 8

Histogram of heliocentric distances, D, for a) OCs and b) ECs, using a bin width of ΔD = 0.4 kpc and Poisson uncertainties. In each panel, the solid curve represents the fitted D-distribution ΦD(D) from Eq. (8), with the completeness distance Dc as free parameter (see Eq. (9)); the dashed curve shows the fit with fixed Dc = 0 (see text for details). The best-fit parameters are given in Table 4.

Fig. 9

Age distribution of OCs within the representative sample (D ≤ 3 kpc), using a logarithmic bin width of Δlog(Age/yr) = 0.25 and Poisson uncertainties. The solid curve corresponds to the fitted age distribution from Eq. (12), following Lamers & Gieles (2006), with best-fit parameters CFR = 0.93 ± 0.09 Myr⁻¹ and Mmax = (4.46 ± 0.85) × 10⁴ M⊙.
