Painting a portrait of the Galactic disc with its stellar clusters

T. Cantat-Gaudin; F. Anders; A. Castro-Ginard; C. Jordi; M. Romero-Gómez; C. Soubiran; L. Casamiquela; Y. Tarricq; A. Moitinho; A. Vallenari; A. Bragaglia; A. Krone-Martins; M. Kounkel

doi:10.1051/0004-6361/202038192

Volume 640 (August 2020)

A&A, 640 (2020) A1

Issue		A&A Volume 640, August 2020


Article Number		A1
Number of page(s)		17
Section		Galactic structure, stellar clusters and populations
DOI		https://doi.org/10.1051/0004-6361/202038192
Published online		28 July 2020

Painting a portrait of the Galactic disc with its stellar clusters⋆

T. Cantat-Gaudin¹, F. Anders¹, A. Castro-Ginard¹, C. Jordi¹, M. Romero-Gómez¹, C. Soubiran², L. Casamiquela², Y. Tarricq², A. Moitinho³, A. Vallenari⁴, A. Bragaglia⁵, A. Krone-Martins³,⁶ and M. Kounkel⁷

¹ Institut de Ciències del Cosmos, Universitat de Barcelona (IEEC-UB), Martí i Franquès 1, 08028 Barcelona, Spain
e-mail: tcantat@fqa.ub.edu
² Laboratoire d’Astrophysique de Bordeaux, Univ. Bordeaux, CNRS, UMR 5804, 33615 Pessac, France
³ CENTRA, Faculdade de Ciências, Universidade de Lisboa, Ed. C8, Campo Grande, 1749-016 Lisboa, Portugal
⁴ INAF-Osservatorio Astronomico di Padova, Vicolo Osservatorio 5, 35122 Padova, Italy
⁵ INAF-Osservatorio di Astrofisica e Scienza dello Spazio, Via Gobetti 93/3, 40129 Bologna, Italy
⁶ Donald Bren School of Information and Computer Sciences, University of California, Irvine, CA 92697, USA
⁷ Department of Physics and Astronomy, Western Washington University, 516 High St, Bellingham, WA 98225, USA

Received: 17 April 2020
Accepted: 6 May 2020

Abstract

Context. The large astrometric and photometric survey performed by the Gaia mission allows for a panoptic view of the Galactic disc and its stellar cluster population. Hundreds of stellar clusters were only discovered after the latest Gaia data release (DR2) and have yet to be characterised.

Aims. Here we make use of the deep and homogeneous Gaia photometry down to G = 18 to estimate the distance, age, and interstellar reddening for about 2000 stellar clusters identified with Gaia DR2 astrometry. We use these objects to study the structure and evolution of the Galactic disc.

Methods. We relied on a set of objects with well-determined parameters in the literature to train an artificial neural network to estimate parameters from the Gaia photometry of cluster members and their mean parallax.

Results. We obtain reliable parameters for 1867 clusters. Our catalogue confirms the relative lack of old stellar clusters in the inner disc (with a few notable exceptions). We also quantify and discuss the variation of scale height with cluster age, and we detect the Galactic warp in the distribution of old clusters.

Conclusions. This work results in a large and homogeneous cluster catalogue, allowing one to trace the structure of the disc out to distances of ∼4 kpc. However, the present sample is still unable to trace the outer spiral arm of the Milky Way, which indicates that the outer disc cluster census might still be incomplete.

Key words: open clusters and associations: general / Galaxy: disk

⋆ The list of cluster parameters and the complete list of their members are only available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/cat/J/A+A/640/A1

© ESO 2020

1. Introduction

The shape and dimensions of our galaxy, which we commonly refer to as the Milky Way, are difficult to appreciate from our vantage point. From the pioneering work of early modern astronomers (Herschel 1785; Shapley 1918; Trumpler 1930) to recent studies (Reid et al. 2019; Gravity Collaboration 2019; Anders et al. 2019), the distance to individual objects is one of the most valuable pieces of information we rely on to reconstruct the overall structure of the Milky Way.

Among the variety of astronomical objects to which we can derive distances, stellar clusters present the advantage of spanning a wide range of ages, from a few million years (tracing episodes of recent star formation) to several gigayears (as old as the Galactic disc itself), which can be estimated with a greater precision than for individual stars. Samples of clusters with known ages have long been used to trace various properties of the Galactic disc, such as the path of its spiral arms (Becker & Fenkart 1970) or the evolution of its scale height (van den Bergh 1958). Although the precision and accuracy of age estimates are tied to the quality of the observational data and the correctness of theoretical models, distinguishing a young cluster from an old one is often relatively straightforward in a colour-magnitude diagram¹. While the first catalogues of cluster parameters only reported sky coordinates (e.g. Melotte 1915) and sometimes distances (Trumpler 1930; Collinder 1931), modern catalogues also provide associated ages. The widely-used catalogue of Dias et al. (2002) is a curated compilation of parameters from a large number of studies, which was obtained with a variety of methods and photometric systems. Another widely-cited study by Kharchenko et al. (2013) presents an automated characterisation of the cluster population (known at the time), which was performed with all-sky 2MASS photometry (Skrutskie et al. 2006). It represents a homogeneous set of parameters, but to a lesser precision than dedicated studies of individual objects.

The second data release of the European Space Agency (ESA) Gaia mission (DR2: Gaia Collaboration 2018a) represents the deepest all-sky astrometric and photometric survey ever conducted. The Gaia astrometry (proper motions and parallaxes) allows us to identify the members of clusters, and it has enabled the discovery of several hundred new objects. Combining parallaxes with the deep Gaia photometry allows us to estimate cluster distances, ages, and extinctions on a large scale with unprecedented precision. Thus far, the largest study on this particular topic was conducted by Bossini et al. (2019), who derived parameters for 269 clusters (mostly nearby and well-populated). Despite the high precision of their results, this sample constitutes less than 15% of the clusters for which members can be identified with Gaia.

The aim of the present work is to study the structure of the Galactic disc revealed by clusters of various ages. To this effect, we derived cluster parameters in a homogeneous and automatic fashion for ∼2000 Galactic clusters with members identified in the Gaia data. In Sect. 2 we present the input data and our list of reference clusters. Section 3 describes the artificial neural network that we built and trained in order to estimate parameters. Section 4 introduces our catalogue of cluster parameters. In Sect. 5 we use this cluster sample to trace the structure of the Galactic disc. Section 6 contains a discussion of the results, and Sect. 7 closes with concluding remarks.

2. Data

2.1. Cluster members from Gaia DR2

We retained the probable members (probability > 70%) of 1481 clusters whose membership list was published by Cantat-Gaudin & Anders (2020), who estimated the membership probabilities for stars brighter than G = 18 using the unsupervised classification scheme UPMASK (Krone-Martins & Moitinho 2014; Cantat-Gaudin et al. 2018a). We collected the list of members provided by the authors for the recently discovered UBC clusters (Castro-Ginard et al. 2018, 2019, 2020).

We also applied UPMASK to the 56 cluster candidates proposed by Liu & Pang (2019). We were able to find secure members for 35 of them. These objects are listed in our catalogue as “LP”, followed by the entry number given in Liu & Pang (2019). Since UPMASK is not suited for very extended clusters, we took the list of members for the nearby clusters Melotte 25 (the Hyades) and Melotte 111 (Coma Berenices), which were derived from Gaia DR2 astrometry by Gaia Collaboration (2018b). In total, this compiled sample comprises ∼230 000 stars that are brighter than G = 18, which belong to 2017 clusters.

2.2. Reference clusters

We compiled a list of 347 clusters with parameters (age, reddening, and distance modulus) that are known to a sufficient precision to be used as points of reference. Their ages and distances are shown in Fig. 1. We strove to use a small number of reference studies to maximise homogeneity, while also covering the entire parameter space and privileging studies that employed Gaia data for their membership selection.

Fig. 1.

Age and distance modulus of our reference clusters (described in Sect. 2.2).

The 269 clusters of Bossini et al. (2019) represent the bulk of this reference set, and they constitute the largest homogeneous sample of cluster ages obtained from Gaia data. Their parameters were determined by fitting PARSEC isochrones (Bressan et al. 2012) with the Bayesian code BASE9 (von Hippel et al. 2006) to Gaia DR2 photometry of the cluster members identified in Cantat-Gaudin et al. (2018b).

This sample contains few clusters that are older than 1 Gyr and few clusters that are more distant than 4 kpc. We therefore supplemented the sample with 36 clusters from the BOCCE survey, which focuses mainly on old clusters, of which many are distant and characterised with a combination of deep photometry and high-resolution spectroscopy (Bragaglia & Tosi 2006; Bragaglia et al. 2006; Tosi et al. 2007; Andreuzzi et al. 2011; Cignoni et al. 2011; Donati et al. 2012, 2014a, 2015; Ahumada et al. 2013).

Since these two samples contain very few clusters that are younger than log t ∼ 7.5, we supplemented the training set with 21 young clusters closer than 1.5 kpc that have visually well-defined colour-magnitude diagrams, and whose parameters were taken from the catalogue of Kharchenko et al. (2013). We also included seven clusters that have been the subject of dedicated papers by the Gaia-ESO Survey: NGC 3293 (Delgado et al. 2016); NGC 4815 (Friel et al. 2014); NGC 6705 (Cantat-Gaudin et al. 2014); NGC 6802 (Tang et al. 2017); Pismis 18 (Hatzidimitriou et al. 2019); Trumpler 20 (Donati et al. 2014b); and Trumpler 23 (Overbeek et al. 2017). We consider their parameters to be especially reliable due to the large number of radial velocities collected for these studies (allowing for good membership selections) and precise metallicities.

The Swift UVOT Stars Survey provides cluster parameters for 49 clusters, which were studied with Gaia DR2 astrometry and isochrone fitting to near-ultraviolet photometry (Siegel et al. 2019). Eighteen of them are not present in the previously mentioned references, so we added them to our reference sample.

3. Cluster parameters and machine learning

Estimating the main parameters (age, distance modulus, extinction, and sometimes metallicity) of a star cluster is often done via isochrone fitting: A theoretical model of the sequence traced by a coeval group of stars in a two-dimensional colour-magnitude diagram (CMD) is compared to the observed distribution of stars. Perhaps surprisingly, designing a robust and efficient automatic procedure for isochrone fitting is far from trivial. Observed CMDs of clusters do not simply follow a single sequence, but they feature unresolved binaries (a problem addressed by the τ² statistics of Naylor & Jeffries 2006), blue stragglers, broadened turnoffs (Marino et al. 2018; Bastian et al. 2018; Sun et al. 2019; Li et al. 2019; de Juan Ovelar et al. 2020), and almost always contamination by field stars, which can also be taken into account with ad-hoc statistics (as in e.g. Monteiro et al. 2010). The stellar phases that provide the most clues about the age and distance of a cluster (its turnoff, red clump, and red giants) also happen to be the least populated parts of a CMD², and they must be given a higher weight subjectively. The Bayesian code BASE9 (von Hippel et al. 2006; Jeffery et al. 2016) relies on robust statistical principles and it allows for the use of prior knowledge (most importantly, a distance constraint provided by e.g. Gaia parallaxes). However, its runs can be very time-consuming, it generally requires a large number of cluster members (it was in fact originally designed for globular clusters), and it is currently unable to deal with CMDs affected by differential extinction. The ASteCa package (Perren et al. 2015) uses a sophisticated approach with a modelling of a synthetic cluster from theoretical isochrones, but it is also relatively time-consuming and unable to deal with differential extinction and blue stragglers at present.

Isochrone fitting is therefore often performed by hand, which when done properly provides satisfactory results, but it is impractical to perform it on the samples of hundreds to thousands of clusters available from modern sky surveys. To address this problem and avoid direct comparisons with theoretical isochrones, we built a data-driven procedure to estimate the parameters of an unknown cluster based on its similarities with objects of known parameters. Although the age accuracy is ultimately tied to the reference values, which are derived from stellar evolution models, our approach has the advantage of putting all clusters on the same age scale and providing reliable relative ages. Learning from labelled CMDs can be thought of as a generalisation of the empirically calibrated morphological age index, which allows for a quick estimate of a cluster age by measuring the magnitude difference between its turn off and red clump (used by e.g. Lynga 1982; Janes & Adler 1982; Janes et al. 1988; Phelps et al. 1994; Carraro & Chiosi 1994; Janes & Phelps 1994; Friel 1995; Salaris et al. 2004), or the morphological age ratio (Anthony-Twarog & Twarog 1985; Twarog & Anthony-Twarog 1989).

3.1. Artificial neural network

The increasing size and dimensionality of astronomical datasets have made machine learning increasingly popular in the field (see e.g. the reviews by Fluke & Jacobs 2020; Baron 2019). Artificial neural networks (ANNs) are particularly popular due to their flexibility and performance at both classification (e.g. Ting et al. 2018; Castro-Ginard et al. 2018) and regression tasks (e.g. Leung & Bovy 2019; Kounkel & Covey 2019; Boucaud et al. 2020).

An ANN is a system that maps the input observables to the target output quantities through a series of nodes. Here, the three targets are the cluster age, extinction, and distance modulus. Nodes are organised in layers, where every node receives input from the previous layer and passes a non-linear function of that input on to the next layer. For this work, the non-linear function is a rectified linear unit (ReLU). Formally, ANNs are universal approximators, which means that any continuous function can be approximated by an ANN with at least one hidden layer. Approximating a complex function might require a large number of nodes in the hidden layer, making the network slower to train and more prone to overfitting. An equivalent or better approximation can often be obtained with a smaller number of nodes if they are organised into several hidden layers in which each one contains an increasingly abstract representation of the data structure. For this study we experimented with various architectures and settled on an ANN with three hidden layers, as is shown in Fig. 2.
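The layer-by-layer mapping described above can be illustrated with a minimal sketch in plain Python. The layer widths, the random weights, and the helper names (`dense`, `forward`, `random_layers`) are assumptions made for illustration only; they are neither the architecture of Fig. 2 nor the scikit-learn implementation used in this work.

```python
import random

def relu(x):
    """Rectified linear unit: passes positive values, clips negatives to zero."""
    return x if x > 0.0 else 0.0

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights[j][i] links input i to node j."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

def forward(inputs, layers):
    """Propagate inputs through (weights, biases) pairs.

    ReLU is applied on hidden layers; the last layer is left linear,
    as is usual for regression.
    """
    h = inputs
    for k, (w, b) in enumerate(layers):
        is_last = (k == len(layers) - 1)
        h = dense(h, w, b, activation=None if is_last else relu)
    return h

def random_layers(widths, seed=0):
    """Untrained layers with small random weights (illustration only)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        w = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((w, [0.0] * n_out))
    return layers

# Hypothetical widths: 5 inputs, three hidden layers, 3 outputs
# (age, extinction, distance modulus).
layers = random_layers([5, 8, 8, 4, 3])
prediction = forward([0.2, 0.0, 1.0, 0.5, -10.0], layers)
```

Because the weights here are random, `prediction` carries no physical meaning; the point is only the structure of the computation.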

Fig. 2.

Architecture of our artificial neural network, indicating the width (number of nodes) of each layer. The example cluster is Haffner 22. The input quantities are described in Sect. 3.1.

The main input observable that we provided to our ANN was a 2D histogram of the Gaia colour-magnitude diagram of each cluster, with a bin width of 0.2 mag in colour and 0.5 mag in magnitude. The histogram was pre-processed before being fed to the ANN. We took the logarithm of the counts and scaled it so the most populated bin always had a value of 1. The entire histogram contains 700 bins. Applying a principal component analysis to the flattened histograms of our training set (described in Sect. 3.2) shows that 99.9% of the variance can be expressed with only 410 components. We therefore applied the transformation computed on the training set, which reduced the number of input quantities by nearly half, with a negligible loss of information.
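The histogram pre-processing described above could be sketched as follows. The bin widths (0.2 mag and 0.5 mag) come from the text; the grid limits (20 colour bins by 35 magnitude bins, giving 700 bins), the use of apparent G and BP−RP, and the use of log10(1 + count) to avoid the logarithm of empty bins are assumptions made for this illustration.

```python
import math

# Assumed grid, chosen so that 20 x 35 = 700 bins (limits are not given in the text).
COLOUR_MIN, COLOUR_STEP, N_COLOUR = -0.5, 0.2, 20   # colour from -0.5 to 3.5
MAG_MIN, MAG_STEP, N_MAG = 0.5, 0.5, 35              # magnitude from 0.5 to 18.0

def cmd_histogram(colours, mags):
    """Flattened, log-scaled CMD histogram with the fullest bin set to 1."""
    counts = [0] * (N_COLOUR * N_MAG)
    for c, m in zip(colours, mags):
        i = math.floor((c - COLOUR_MIN) / COLOUR_STEP)
        j = math.floor((m - MAG_MIN) / MAG_STEP)
        if 0 <= i < N_COLOUR and 0 <= j < N_MAG:   # stars off the grid are ignored
            counts[j * N_COLOUR + i] += 1
    # Assumption: log10(1 + n) keeps empty bins at exactly 0.
    logged = [math.log10(1 + n) for n in counts]
    peak = max(logged)
    return [v / peak for v in logged] if peak > 0 else logged
```

The subsequent dimensionality reduction is not shown. One possible route would be to fit a principal component analysis (for instance `sklearn.decomposition.PCA`) on the training histograms only and apply the same transformation to every other cluster.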

We also provided the median parallax ⟨ϖ⟩ to the ANN, which is a strong predictor of distance, especially for the most nearby clusters. For each cluster, we provided two additional quantities estimated from the CMD (illustrated in Fig. 2). The quantity s_bright is the slope in the relation between colour and magnitude for the stars whose distance-corrected magnitude³ is brighter than 4. This quantity strongly correlates with the cluster age. Finally, we denote by MS_4,5 the mean colour of stars whose distance-corrected magnitude is between 4 and 5. In this magnitude range, stars are always expected to be on the main sequence even in the oldest clusters, and their colour is a strong predictor of reddening.

If fewer than ten stars were available to estimate s_bright, we set it to an edge value of −10. If no stars were available for MS_4,5, which happens for distant and reddened clusters, we also set its value to −10. Thanks to their hidden layers, ANNs are able to approximate logical functions, which implicitly allows them to handle missing values.
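A minimal sketch of the two summary quantities and their edge values is given below. It assumes that the distance-corrected magnitude is G + 5 log10(ϖ) − 10 with ϖ in mas, and that the slope is a least-squares fit of magnitude against colour; neither choice is spelled out in the text above, and the function names are hypothetical.

```python
import math

MISSING = -10.0  # edge value used when a quantity cannot be estimated

def distance_corrected(g_mag, parallax_mas):
    """G minus the parallax-based distance modulus (parallax must be positive)."""
    return g_mag + 5.0 * math.log10(parallax_mas) - 10.0

def summary_features(colours, g_mags, parallax_mas, min_stars=10):
    """Return (s_bright, ms_45) for one cluster."""
    abs_mags = [distance_corrected(g, parallax_mas) for g in g_mags]

    # s_bright: slope of magnitude versus colour for stars brighter than 4 (assumed fit).
    bright = [(c, m) for c, m in zip(colours, abs_mags) if m < 4.0]
    s_bright = MISSING
    if len(bright) >= min_stars:
        mean_c = sum(c for c, _ in bright) / len(bright)
        mean_m = sum(m for _, m in bright) / len(bright)
        var_c = sum((c - mean_c) ** 2 for c, _ in bright)
        if var_c > 0.0:  # guard against identical colours
            s_bright = sum((c - mean_c) * (m - mean_m) for c, m in bright) / var_c

    # ms_45: mean colour of stars with distance-corrected magnitude between 4 and 5.
    ms = [c for c, m in zip(colours, abs_mags) if 4.0 <= m <= 5.0]
    ms_45 = sum(ms) / len(ms) if ms else MISSING
    return s_bright, ms_45
```

Using a sentinel far outside the physical range, rather than dropping the cluster, keeps the input vector at a fixed length for every object.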

3.2. Training set

Our first attempts to estimate cluster parameters involved ANNs trained with mock CMDs. Such systems were extremely good at recovering the input parameters of other mock CMDs, but overall they returned disappointing results when applied to real, observed, Gaia CMDs. We therefore chose a data-driven approach that would not require us to generate mock clusters from theoretical models. Training machine learning procedures on labelled observed data is an increasingly common practice in various sub-fields of astronomy. For instance, Ting et al. (2018) trained an ANN to distinguish red giant branch stars from red clump stars, Leung & Bovy (2019) determined elemental abundances with an ANN trained on high signal-to-noise ratio spectra, and Arnason et al. (2020) identified new X-ray binary candidates in M 31.

The basis of our training set is the set of clusters presented in Sect. 2.2. A good training set must not only cover a wide range of parameters, but also be dense enough that the ANN cannot memorise it and must instead learn how the relevant features relate to the output. We performed data augmentation by creating variations of the reference clusters: we artificially increased their distance modulus and their extinction, sub-sampled them, and added differential extinction.

The simulated distance modulus was randomly picked between 0.5 mag smaller than the reference value and 16 mag (∼15.85 kpc). We adjusted the simulated parallax accordingly and removed stars whose simulated G magnitude was fainter than 18. To account for the uncertainties in the mean parallax, the local parallax zero-point variation, and to simulate the known zero-point offset in parallaxes (Lindegren et al. 2018; Arenou et al. 2018), we then subtracted 0.029 mas and added a random offset that was uniformly picked between −0.05 and +0.05 mas. Adding noise to the simulated parallaxes is important so the ANN learns that for distant clusters, the distance modulus is mostly constrained by the CMD morphology and not by the parallax.
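The distance step of the augmentation can be sketched as below. The numerical values (upper limit of 16 mag, 0.5 mag margin, G = 18 cut, 0.029 mas offset, ±0.05 mas noise) are taken from the text; the function names and the return layout are hypothetical.

```python
import random

def dm_to_distance_pc(dm):
    """Distance in parsec for a distance modulus dm."""
    return 10.0 ** (dm / 5.0 + 1.0)

def simulate_distance(g_mags, ref_dm, rng, dm_max=16.0, g_limit=18.0):
    """Move a reference cluster to a random distance modulus.

    Returns the new distance modulus, the shifted magnitudes of the stars
    that survive the magnitude cut, their indices, and a noisy parallax in mas.
    """
    new_dm = rng.uniform(ref_dm - 0.5, dm_max)
    shifted = [g + (new_dm - ref_dm) for g in g_mags]
    keep = [k for k, g in enumerate(shifted) if g <= g_limit]

    true_plx = 1000.0 / dm_to_distance_pc(new_dm)          # mas
    # Zero-point offset plus uniform noise, as described in the text.
    observed_plx = true_plx - 0.029 + rng.uniform(-0.05, 0.05)
    return new_dm, [shifted[k] for k in keep], keep, observed_plx
```

As a check on the quoted limit, `dm_to_distance_pc(16.0)` evaluates to about 15 849 pc, consistent with the ∼15.85 kpc given above.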

In order to cover a wide range in extinction, additional extinction was added up to A₀ = 5, using the polynomial relation presented by Danielski et al. (2018) and Gaia Collaboration (2018b). Differential reddening was added to half of the variations by first picking a random value between 0 and 1, setting the intensity of the differential reddening for this variation, then adding a random extinction picked between 0 and this maximum intensity.

Finally, we sub-sampled every reference cluster by picking a random number of stars, which went as low as ten, for every variation. In total we created 1500 versions of each reference cluster and 3000 for the clusters with log t < 7.4 and log t > 9.4 since there are few of them in our reference sample.
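The differential reddening and sub-sampling steps could look like the following sketch. It works in units of A₀ only and leaves out the conversion to the Gaia passbands; the function names are hypothetical, and treating the "random value between 0 and 1" as a maximum extra A₀ in magnitudes is an assumption.

```python
import random

def differential_extinction(n_stars, rng, max_intensity=1.0):
    """Draw one intensity for the variation, then one extra extinction per star below it."""
    intensity = rng.uniform(0.0, max_intensity)
    return [rng.uniform(0.0, intensity) for _ in range(n_stars)]

def maybe_differential_extinction(n_stars, rng):
    """Apply differential extinction to roughly half of the variations."""
    if rng.random() < 0.5:
        return differential_extinction(n_stars, rng)
    return [0.0] * n_stars

def subsample(n_stars, rng, min_stars=10):
    """Indices of a random subset containing between min_stars and n_stars stars."""
    if n_stars <= min_stars:
        return list(range(n_stars))
    size = rng.randint(min_stars, n_stars)
    return sorted(rng.sample(range(n_stars), size))

def n_variations(log_age, base=1500, boosted=3000):
    """Number of variations per reference cluster; rare ages are doubled."""
    return boosted if (log_age < 7.4 or log_age > 9.4) else base
```

Doubling the number of variations at the extremes of the age range is a simple way of rebalancing a training set in which those ages are under-represented.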

Since the cluster members were selected based on their astrometry only, many clusters (especially the distant ones) include a fraction of field star contaminants. They were not removed from the training set, which means some training examples contain field contamination. The trained ANN is therefore able to deal with contamination in the non-reference clusters that we characterise in this study.

3.3. Implementation and training

We implemented the ANN on a desktop computer as a multi-layer perceptron regressor from the scikit-learn Python library (Pedregosa et al. 2011). The training was performed with the built-in ADAM solver (Kingma & Ba 2014). The scikit-learn implementation optimises the R-squared score defined as $R^{2} = 1 - \frac{u}{v}$, where $u = \sum (y_{\mathrm{true}} - y_{\mathrm{pred}})^{2}$ is the residual sum of squares and $v = \sum (y_{\mathrm{true}} - \overline{y_{\mathrm{true}}})^{2}$ is the total sum of squares. A score of 1 would indicate a perfect prediction for all of the labels.
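For concreteness, the score defined above can be written out for a single label in a few lines of plain Python. This is an illustrative re-implementation; scikit-learn exposes the same quantity as `sklearn.metrics.r2_score`.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - u / v for one target quantity."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)           # total sum of squares
    return 1.0 - u / v
```

A predictor that always returns the mean of the true values gives u = v and therefore a score of 0, while a score below 0 means the prediction is worse than that constant guess.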

To make each training iteration faster and to alleviate the risk of the optimisation staying stuck in a local optimum, each iteration only used a random 20% of the training set. We built a validation set, which was created exactly like the training set but contains other random variations of the reference clusters. We trained the ANN for 1000 iterations. At each iteration, we also verified the prediction of the ANN on the validation sample. We show in Fig. 3 that although the training score steadily increases, the validation score reaches a maximum at around 200 iterations and then slowly decreases, which is a sign that the ANN starts overfitting. For the rest of this study, the ANN that we use is the one that was trained for 200 iterations.
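The selection of the stopping point from the validation curve can be sketched generically. The callables `train_step` and `validation_score` are hypothetical placeholders for whatever performs one training iteration and evaluates the validation set; the toy curve below is synthetic and only mimics a score that peaks and then declines.

```python
def validation_history(train_step, validation_score, n_iterations):
    """Run n_iterations training steps, recording the validation score after each."""
    history = []
    for _ in range(n_iterations):
        train_step()
        history.append(validation_score())
    return history

def best_iteration(history):
    """1-based iteration at which the validation score is highest, and that score."""
    best = max(range(len(history)), key=history.__getitem__)
    return best + 1, history[best]

# Synthetic example: a validation score that peaks at iteration 200.
state = {"n": 0}

def toy_step():
    state["n"] += 1

def toy_score():
    return 1.0 - ((state["n"] - 200) / 1000.0) ** 2

history = validation_history(toy_step, toy_score, 1000)
stop_at, score = best_iteration(history)
```

With scikit-learn, such a loop could for example be driven by repeated `partial_fit` calls followed by `score` on the validation set, although the text does not state how the per-iteration scores were obtained.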

Fig. 3.

Evolution of the training and validation scores with training iterations. The network used in this study is the result of 200 iterations.

3.4. Performance on the validation set

To assess the ability of the ANN to recover ages, extinctions, and distances, we investigated its performance on the validation set. Figure 4 shows the difference between the age estimated by the ANN and the reference value as a function of the number of stars. We see from this figure that young clusters with very few stars tend to have their ages slightly overestimated because the sparsely populated turn off appears fainter, whereas for old clusters, the absence of red giants makes them appear younger. This is not specific to our machine learning approach, but rather a general limitation of using CMDs to estimate cluster ages. In practice, less than 10% of our observed clusters have fewer than 20 members. In a successive step (Sect. 4), we also flag the clusters whose CMDs are too sparse and/or too blurry to show a meaningful pattern.

Fig. 4.

Difference between the age estimate and the reference value for ∼120 000 validation samples, split in two age groups. The full line is a running mean. The dashed lines represent the upper and lower standard deviation.

Overall, the uncertainty on the determination of log t ranges from 0.15 to 0.25 for young clusters and from 0.1 to 0.2 for old clusters. For the extinction and distance modulus, the precision of the ANN also depends on the number of stars, but only marginally on the age of the cluster. The typical uncertainty of A₀ ranges from 0.1 to 0.2 mag, and the typical distance modulus uncertainty ranges from 0.1 to 0.2 (∼5% to 10% distance uncertainty).
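The conversion between distance modulus uncertainty and fractional distance uncertainty quoted above follows from first-order error propagation. Writing the distance modulus as $\mu$ (a symbol introduced here for convenience):

$$
d = 10^{\mu/5 + 1}\ \mathrm{pc}
\quad\Longrightarrow\quad
\frac{\sigma_d}{d} = \frac{\ln 10}{5}\,\sigma_\mu \approx 0.461\,\sigma_\mu .
$$

For $\sigma_\mu = 0.1$ this gives $\sigma_d/d \approx 4.6\%$, and for $\sigma_\mu = 0.2$ it gives $\approx 9.2\%$, in line with the ∼5% to 10% range stated in the text.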

If we assume that the reference values represent the ground truth, then these mean differences indicate the precision of our procedure. However, the scatter encompasses both the uncertainties due to our methodology and the uncertainties of the reference parameters.

At the beginning of training, the weights of the ANN are initialised to random values. Every training run therefore converges to a slightly different final state. We have verified that the difference between several networks trained for 200 iterations with the same training set is negligible.

4. The catalogue of cluster parameters

We applied the trained ANN to estimate the parameters of all 2017 clusters mentioned in Sect. 2.1. We visually inspected the CMD of every cluster, with theoretical isochrones corresponding to the estimated parameters. For the large majority of them, the result looked satisfactory and closely matched the result that would have been obtained by a human expert. In 61 cases, the parameters had to be adjusted manually in order to better match the aspect of the CMD with a PARSEC isochrone (Bressan et al. 2012) of solar metallicity. The reason why the ANN performed poorly on these objects is not clear – they do not correspond to a specific age or distance range – and might be due to field contaminants. The parameters proposed by the ANN were still close enough to make this manual correction faster than having to pick an isochrone without a suggested starting point. We flagged these 61 objects in our catalogue.

We also flagged 81 clusters whose CMD is too blurred and reddened. They are mostly located close to the Galactic plane in the direction of the Galactic centre, and most of them are known embedded clusters. These objects include NGC 1579, which is associated with the Northern Trifid HII region, and the young massive clusters Westerlund 1 and Westerlund 2.

We further flagged 69 objects for which the CMD is too sparse to estimate meaningful parameters from photometry. Finally, we used literature values for three objects with a clear enough CMD but where the ANN failed to recover good parameters. Two of them are the very nearby Hyades (Melotte 25) and Coma Ber (Melotte 111), whose distance modulus is out of the range covered by our training set. We set their parameters to the values quoted by Gaia Collaboration (2018b). The third cluster is Gaia 2 for which our only members are red giant branch stars. We took its parameters from Koposov et al. (2017).

We end up with 1867 clusters with reliable parameters. We provide the list of all investigated clusters with their mean parameters and corresponding flags as an electronic table available at the CDS.

4.1. Comparisons with the literature

In the top row of Fig. 5, we show comparisons between our recovered parameters and the values listed by Kharchenko et al. (2013; hereafter K13), which were obtained by isochrone fitting to 2MASS photometry (Skrutskie et al. 2006). Many of the clusters for which K13 lists old ages while we find young ages are very reddened objects, where the bright turnoff stars have been mistaken by K13 for a red branch (e.g. FSR 1335, whose CMD is shown in Fig. 6). Conversely, the cleaner membership and the distance constraint provided by the Gaia astrometry show that objects such as FSR 1402 (also shown in Fig. 6) are evolved clusters. Since FSR 1335 is young, sparse, and distant, any estimate of its age from just Gaia photometry of its brightest members is affected by large uncertainties. It is, however, evident that it is not an old cluster. Our procedure generally returns lower extinctions than K13. This could be due to our choice of defining A₀ as the extinction corresponding to the blue edge of the sequence in a CMD, before the effect of differential reddening, rather than determining the value for which the isochrone passes through the middle of the sequence.

Fig. 5.

Top row: comparison of the parameters for the clusters in common with Kharchenko et al. (2013). Middle row: comparison of the parameters for the clusters in common with Monteiro & Dias (2019). The CMDs and isochrones for the labelled clusters are shown in Fig. 6. Bottom row: comparison between our ANN parameters and the literature references presented in Sect. 2.2. All panels display the root mean square (rms) difference.

Fig. 6.

Colour-magnitude diagram, colour-coded by spectral type from the effective temperatures of StarHorse (Queiroz et al. 2018; Anders et al. 2019) for the four clusters labelled in Fig. 5. The lines are PARSEC isochrones of solar metallicity.

A comparison with the parameters recently published by Monteiro & Dias (2019) is shown in the middle row of Fig. 5. The authors relied on Gaia DR2 to select cluster members and constrain their distance, which explains the better agreement with our results. Several clusters still have discrepant age estimates, almost all of which are due to the presence of red stars that we consider to be cluster members. Two of them are labelled in Fig. 5, and their CMDs are shown in Fig. 6.

The bottom row of Fig. 5 shows comparisons with the reference values for the clusters we used to build the training set (presented in Sect. 2.2). The fact that we do not exactly recover the reference parameters is a good sign because it shows that the ANN did build an approximation of the relation between observables and cluster parameters, rather than memorising the aspect of reference clusters. The largest age discrepancies affect a handful of clusters for which Bossini et al. (2019) list ages log t ∼ 7.6, while our ANN estimates log t ∼ 7.9. These objects are too distant for their pre-main sequence stars to be visible, so the main age constraint is the ill-defined location of their turn off.

4.2. Composite Hertzsprung-Russell diagram

Having an estimate of A₀ for each cluster, we corrected the observed colours and magnitudes for interstellar extinction by inverting the relations given in Danielski et al. (2018) and Gaia Collaboration (2018b). We then corrected G for distance modulus. The comprehensive Hertzsprung-Russell diagram (HRD), which is made up of 1867 clusters, is shown in Fig. 7.
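One way to picture the inversion is as a fixed-point iteration, sketched below under the assumption that the extinction coefficient in each band depends on the intrinsic colour and on A₀. The callables `k_g`, `k_bp`, and `k_rp` are hypothetical stand-ins for the published polynomial relations, whose coefficients are not reproduced here; the constant values in the example are toy numbers, not the published ones.

```python
def deredden(g_obs, colour_obs, a0, dist_mod, k_g, k_bp, k_rp, n_iter=20):
    """Return (absolute G magnitude, intrinsic BP-RP colour) for one star.

    k_g, k_bp, k_rp: callables (intrinsic_colour, a0) -> A_band / A0.
    """
    colour0 = colour_obs  # first guess: no reddening
    for _ in range(n_iter):
        # Colour excess implied by the current guess of the intrinsic colour.
        excess = (k_bp(colour0, a0) - k_rp(colour0, a0)) * a0
        colour0 = colour_obs - excess
    abs_g = g_obs - k_g(colour0, a0) * a0 - dist_mod
    return abs_g, colour0

# Toy constant coefficients (NOT the published relations), for illustration only.
abs_g, colour0 = deredden(
    15.0, 1.4, a0=1.0, dist_mod=10.0,
    k_g=lambda c, a: 0.8,
    k_bp=lambda c, a: 1.0,
    k_rp=lambda c, a: 0.6,
)
```

With constant coefficients the loop converges after a single pass; the iteration only matters when the coefficients vary with colour, as polynomial relations do.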

Fig. 7.

Comprehensive Hertzsprung-Russell diagram including 1867 clusters, colour-coded by cluster age.

Since a single value of extinction was used for each cluster, this HRD is still affected by differential extinction, which is especially apparent in the elongation of the red clump. A few white dwarfs can be seen. They belong to the very nearby Hyades (Melotte 25), Coma Ber (Melotte 111), and Praesepe (NGC 2632). In the lower right part of the diagram, the presence of pre-main sequence stars is clearly visible in clusters younger than log t ∼ 8.

All of the cluster members used in this study have an apparent G magnitude that is brighter than 18. Since most of the old and very populated clusters are distant objects (e.g. Berkeley 32 or Collinder 261), few old stars with M_G > 5 are visible in the HRD.

4.3. Limitations and potential improvements

Although age, distance, and extinction are the parameters that contribute most to the aspect of a cluster in a CMD, metallicity also plays a role, especially for the coolest stars. Some studies leave it as a free parameter when performing isochrone fitting, but it is common to keep it fixed to an assumed value, as a wrong value mostly affects the reddening and only has a small impact on ages⁴. In this study we did not train the ANN to estimate metallicities, but the training set spans a large range in metallicity. Given that we fed the ANN a coarsely binned representation of the CMD, and given the strong degeneracy between metallicity and extinction, it is unlikely that our ANN could be used to make meaningful estimations of this quantity. An additional issue is that only a relatively small fraction of clusters have precise abundance determinations from high-resolution spectroscopy, and these often come from inhomogeneous sources (a problem discussed by Heiter et al. 2014), meaning that such a machine learning procedure would have to rely on a training set built with mock data.

Since we binned the millimag-precision Gaia DR2 photometry (Evans et al. 2018) into a grid with a resolution of 0.2 mag in colour and 0.5 mag in G magnitude, our approach is obviously not able to take advantage of the finest features observed in some Gaia CMDs. For the best-defined clusters, isochrone fitting procedures (e.g. Naylor & Jeffries 2006; von Hippel et al. 2006; Monteiro et al. 2010) are able to extract more information from the CMDs. We experimented with a finer binning of the CMD, but the size of the training set and the exponential increase in training time made this impractical. In the future, procedures employing an adaptive kernel density estimation might help to overcome this issue.

The use of ground-based photometry, especially at non-optical wavelengths, and value-added catalogues containing astrophysical parameters for individual stars (Andrae et al. 2018; Anders et al. 2019) could help to provide better constraints on the cluster parameters. Colour-magnitude diagrams are not an optimal approach for young clusters, especially when their pre-main sequence stars are not visible and the only age constraint is the colour of the bluest, most massive, identified member. They can also be affected by significant inhomogeneous extinction or feature small age spreads. When spectroscopic measurements are available, the lithium depletion boundary method (LDB) can provide a better constraint than photometry (e.g. Barrado y Navascués et al. 2004; Jeffries & Oliveira 2005), but it can return older ages than CMD fitting (e.g. 21 Myr versus 7.5 Myr in Jeffries et al. 2017). Lyra et al. (2006) have reported and discussed systematic differences between the nuclear ages, for main sequence stars, and contraction ages for pre-main sequence stars. Randich et al. (2018) performed a homogeneous analysis of seven clusters younger than ∼100 Myr, making use of three different sets of stellar evolution models, (J, H, K_s, V) photometry, and LDB models. They confirm that much of the scatter found in the literature for the age of these objects can be attributed to the use of different models or the choice of photometric passbands included in the isochrone fitting. An additional issue affecting young and embedded clusters is that star-forming regions are sometimes known to present anomalous reddening laws that differ from the general interstellar medium (e.g. Feinstein et al. 1973; Vazquez et al. 1996; Hur et al. 2012; Kumar et al. 2014), while the present study employs the same fixed reddening law for all clusters. However, Jordi et al. (2010) remark that varying the extinction law within the range reported by Fitzpatrick & Massa (2007) has a negligible effect on the Gaia photometry.

Another promising approach to deriving cluster ages is the analysis of stellar rotation (so-called gyrochronology, Barnes 2007), which presents the advantage of allowing age estimates for main sequence stars, and up to several billions of years (e.g. Meibom et al. 2015; Douglas et al. 2019). A spectacular application of this method is the characterisation of the recently discovered Pisces-Eridanus stream (Meingast et al. 2019). While it had been previously claimed (based on a single red giant with an uncertain membership status) that the structure could be 1 Gyr old, Curtis et al. (2019) show that 154 main sequence stars with available rotation periods exhibit a similar rotation pattern to the Pleiades (∼120 Myr). Curtis et al. (2019) also point out that although theoretical models have so far been unable to perfectly fit the observed loss of stellar angular momentum with age, empirical comparisons with benchmark clusters of a known age can already provide robust constraints. The Transiting Exoplanet Survey Satellite (TESS, Ricker et al. 2015) provides an all-sky survey from which light curves can be obtained, and many of its targets are cluster members (Bouma et al. 2019). In our sample, several clusters⁵ are located at high Galactic latitudes and only contain late-type stars, but their ill-defined turnoff and the absence of red clump stars make it impossible to constrain their age. The increase in available training data (from e.g. TESS) and the flexibility of machine learning procedures, allowing for missing values and the empirical combination of measurements of a different nature, will make it possible to constrain the ages of such difficult objects.

5. Galactic structure

Using the derived distance modulus, we computed the (X, Y, Z) cartesian coordinates⁶ of all clusters with available parameters. We show the projection of the cluster distribution on the Galactic plane in Fig. 8. We also computed the Galactocentric radius R_GC, assuming a Solar Galactocentric distance of 8340 pc⁷, which is the value adopted by the spiral arm model of Reid et al. (2014). The R_GC versus Z distribution is shown in Fig. 9.
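The conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names are hypothetical, the distance modulus is taken to be free of extinction, the Sun is placed in the plane (Z = 0), and R_GC is computed as a distance projected on the plane. The axis orientation and the Solar Galactocentric distance of 8340 pc are those given in the text.

```python
import math

R_SUN_PC = 8340.0  # Solar Galactocentric distance adopted in the text (Reid et al. 2014)


def distance_pc(distance_modulus):
    """Distance in parsecs from a true (extinction-free) distance modulus."""
    return 10.0 ** (distance_modulus / 5.0 + 1.0)


def heliocentric_xyz(l_deg, b_deg, distance_modulus):
    """Cartesian (X, Y, Z) in pc with the Sun at the origin.

    X points to the Galactic centre, Y along Galactic rotation,
    and Z to the Galactic north pole.
    """
    d = distance_pc(distance_modulus)
    l, b = math.radians(l_deg), math.radians(b_deg)
    x = d * math.cos(b) * math.cos(l)
    y = d * math.cos(b) * math.sin(l)
    z = d * math.sin(b)
    return x, y, z


def galactocentric_radius(x, y, r_sun=R_SUN_PC):
    """In-plane distance to the Galactic centre, located at (r_sun, 0) in this frame."""
    return math.hypot(r_sun - x, y)
```

For instance, a cluster with a distance modulus of 10 lies at 1000 pc; placed in the anticentre direction (ℓ = 180°, b = 0°), it would have X = −1000 pc and R_GC = 9340 pc under these assumptions.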

Fig. 8.

Projection on the Galactic plane of the locations of clusters with derived parameters, colour-coded by age. Top panel: all ages. The shaded area shows the spiral arm model of Reid et al. (2014). The dashed arm is the revised path of the Cygnus arm in Reid et al. (2019). Bottom row: splits the sample into three age groups. The Sun is at (0,0) and the Galactic centre is to the right. The most distant objects were left out of the plot.

Fig. 9.

Top: galactocentric distribution for three age groups. Bottom: distance from the Galactic plane against Galactocentric distance, colour-coded by age, for the clusters with derived parameters. The vertical dotted line shows the assumed Solar value of R_GC = 8340 pc (Reid et al. 2014). Our catalogue lacks Saurer 1 members, so we took its parameters from Carraro & Baume (2003).

We show the distribution of extinction in Fig. 10. The sample of known clusters reaches much larger distances in the direction of the outer disc, especially for objects located far above the plane, but it is still limited by interstellar reddening at low Galactic latitudes.

Fig. 10.

Distribution of clusters in Galactic XY coordinates (left) and altitude versus Galactocentric radius (right), colour-coded by extinction A₀. In both panels, a few distant outliers were left out of the plotting window.

5.1. Spiral structure

The spatial distribution of young clusters is known to correlate with the location of the spiral arms in the Milky Way (Morgan et al. 1953; Becker & Fenkart 1970; Dias & Lépine 2005). The projection of the cluster distribution is shown in Fig. 8. Its general aspect is similar to Fig. 11 in Cantat-Gaudin et al. (2018b), where groups of young clusters distribute preferentially along the locations of spiral arms delineated by Reid et al. (2014), but with important gaps and discontinuities.

In the region covered by the present study, the updated spiral arm model of Reid et al. (2019) is virtually identical. Most differences affect the first Galactic quadrant at distances that our sample of clusters does not reach, with the notable exception of the outer, Cygnus, arm. For this arm, Reid et al. (2019) fitted a significantly different location with a pitch angle of 3° and R_GC ∼ 11 to 13 kpc in the anticentre direction (compared to 13.8° and 13 to 15 kpc in Reid et al. 2014). We show the revised arm as a dashed line in Fig. 8.

Our sample of Gaia-confirmed clusters only contains very few objects with R_GC > 12 kpc. The top panel of Fig. 9 exhibits two clearly visible peaks in the young cluster distribution, corresponding to the local arm and the Perseus arm. The Cygnus arm is not visible due to the lack of available tracers. Camargo et al. (2015) estimated the distance to several embedded clusters that were identified in WISE infrared images (Wright et al. 2010), and they propose that they trace the Cygnus arm at a Galactocentric distance of 13.5 to 15.5 kpc, which agrees with the more distant Reid et al. (2014) model.

It has been noted (e.g. Vázquez et al. 2008) that the Perseus arm traced by clusters appears to be interrupted in the Galactic longitude range of ∼140°–160°. Many clusters have been discovered in the Perseus arm region in the past decade, including two dedicated searches in Gaia DR2 (Cantat-Gaudin et al. 2019; Castro-Ginard et al. 2019), but all of them were found around the gap, rather than inside of it. This region of low density is visible around (X, Y) = (−2000 pc, +1000 pc) in the maps displayed in Fig. 8.

A natural explanation for the lack of detected objects in this direction could be that our view is obscured by interstellar dust, but this range of Galactic longitude does not correspond to a known region of high extinction (e.g. Lallement et al. 2019). The strongest argument against extinction being responsible for this gap is illustrated in Fig. 11. While the number of clusters located at the distance of the Perseus arm drops for ℓ ∼ 140° to 160°, the number of known clusters located behind the arm increases. It can be seen in Fig. 10 that the clusters located beyond the gap are only moderately reddened, with values of A₀ ∼ 1.5 mag. This in fact suggests that the Perseus gap is a window of relatively lower extinction.

Fig. 11.

Top left: distribution of Perseus arm clusters (arbitrarily selected as 10 kpc < R_GC < 11 kpc) in cyan, and more distant clusters in black. Bottom left: galactic coordinates of the same clusters. Right: locations of the same clusters projected on the Galactic plane. The dashed circle has a diameter of 1200 pc.

This gap is visible in the distribution of other young tracers, which are traditionally associated with spiral arms, and is in fact present in the HII map of Becker & Fenkart (1970) as well as in the HI map of Spicker & Feitzinger (1986), although the authors do not comment on it. The distribution of HII regions used by Hou & Han (2014) to trace the spiral structure is interrupted in the same region, and the gap can be seen (tentatively) in the Cepheid distribution of Skowron et al. (2019) as well as in the OB stars shown by Romero-Gómez et al. (2019), Poggio et al. (2018), and Jardine et al. (2019), and the high-mass star-forming regions of Reid et al. (2014, 2019).

A possibly similar, although less clear, gap can be observed in the Sagittarius arm (Fig. 8), with an under-density of young clusters around (X, Y) = (+1000 pc, −1000 pc). Studying clusters in kinematical space could indicate that these arms are fragmenting, which is a phenomenon routinely seen in N-body simulations (e.g. Roca-Fàbrega et al. 2013; Grand et al. 2014; Hunt et al. 2015), and this would show that the Milky Way is not a grand design spiral galaxy, but rather a flocculent one.

We also see that the interarm region between the local arm and the Perseus arm is not as clear in the third quadrant as in the second quadrant, which is in agreement with Moitinho et al. (2006) and Vázquez et al. (2008), who propose that the local arm extends towards the Perseus arm along the ℓ = 245° line. The presence of young clusters in the region between the Perseus and outer arms can also be interpreted as the trace of interarm spurs, as reported by Molina Lera et al. (2019) and suggested by the HII maps of Hou & Han (2014). Such features are visible in external spiral galaxies (e.g. Corder et al. 2008; Elmegreen et al. 2018) and naturally occur in numerical simulations (e.g. Shetty & Ostriker 2006; Dobbs & Bonnell 2006; Pettitt et al. 2016).

5.2. Scale height

The fact that old clusters tend to be found at higher Galactic altitudes (further away from the plane) than young clusters has been noted by numerous observers (van den Bergh 1958; van den Bergh & McClure 1980; Janes et al. 1988; Janes & Phelps 1994; Phelps et al. 1994; Friel 1995), and this is visually obvious from our Fig. 9. The main cause for the thickening of the Galactic disc is the gradual velocity scatter, which is introduced by gravitational interactions with giant molecular clouds (first theorised by Spitzer & Schwarzschild 1951, 1953), although it is now understood that the effects of the spiral structure, Galactic bar, warp, and even minor mergers have contributed to the vertical heating of the disc (see e.g. the recent study of Mackereth et al. 2019, and references therein).

Various analytical parametrisations of the vertical density distribution are used in the literature (van der Kruit 1988; Dobbie & Warren 2020). A simple form often used for the cluster distribution is the exponential profile:

$\begin{aligned} N(Z) = \frac{1}{h_z} \exp \left( - \frac{|Z-\langle z \rangle|}{h_z} \right), \end{aligned}$ (1)

where ⟨z⟩ is the mean offset of the Galactic plane with respect to the Sun and the h_z parameter is called the scale height. Many authors perform a fitting of the scale height in bins of age or Galactocentric radius. Rather than binning, we modelled it with a power-law dependence on age (t) and a linear dependence on the Galactocentric radius:

$\begin{aligned} h_z = k + a \times \left( \frac{t}{100\,\mathrm{Myr}} \right)^{\alpha} + \rho \times (R_{\mathrm{GC}} - R_{\mathrm{GC},\odot}). \end{aligned}$ (2)

We sampled the parameter space using the Markov chain Monte Carlo sampler emcee (Foreman-Mackey et al. 2013), with flat priors on all parameters. The resulting posterior distribution is shown in Fig. 12.
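The quantity being sampled can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names and the parameter ordering are hypothetical, the likelihood is taken to be the product of Eq. (1) over all clusters with h_z given by Eq. (2), and the flat priors are unbounded except for the requirement that the scale height be positive. The normalisation of Eq. (1) is used as written; a different constant factor would not change the shape of the posterior.

```python
import math

R_SUN_PC = 8340.0  # assumed Solar Galactocentric distance, as in the text


def scale_height(age_myr, r_gc_pc, k, a, alpha, rho):
    """Eq. (2): scale height in pc for a given age and Galactocentric radius."""
    return k + a * (age_myr / 100.0) ** alpha + rho * (r_gc_pc - R_SUN_PC)


def log_posterior(params, z_pc, age_myr, r_gc_pc):
    """Log-posterior (up to a constant) for flat priors and the profile of Eq. (1)."""
    z_mean, k, a, alpha, rho = params
    total = 0.0
    for z, t, r in zip(z_pc, age_myr, r_gc_pc):
        h = scale_height(t, r, k, a, alpha, rho)
        if h <= 0.0:
            return -math.inf  # unphysical scale height: zero probability
        # log of (1 / h) * exp(-|z - z_mean| / h)
        total += -math.log(h) - abs(z - z_mean) / h
    return total
```

A function of this form could be handed to an ensemble sampler such as emcee, with the cluster altitudes, ages, and Galactocentric radii passed as extra arguments; the initialisation of the walkers is left out of this sketch.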

Fig. 12.

Markov chain Monte Carlo sampling of the posterior distribution for the scale height model presented in Sect. 5.2, showing the last 2000 iterations of 32 walkers (64 000 points).

The first free parameter in our model is ⟨z⟩, that is, the mean altitude of the entire sample considering that the Sun sits at altitude 0. The best fit value is ⟨z⟩ = −23 ± 3 pc, corresponding to a solar displacement of z₀ = 23 ± 3 pc. This value is in line with estimates from star counts from Jurić et al. (2008; 25 ± 5 pc), Chen et al. (1999; 28 ± 6 pc), Chen et al. (2001; 27 ± 4 pc), or Maíz-Apellániz (2001; 24 ± 2 pc), for instance. We remark that studies making use of young tracers tend to report a slightly smaller solar displacement, which can be seen in Karim & Mamajek (2017; 17 ± 2 pc) or Reed (2006; 19.6 ± 2.1 pc), for instance, and previous estimates based on clusters such as in Buckner & Froebrich (2014; 18.5 ± 1.2 pc) and Joshi (2007; 13 to 20 pc) reported smaller values. The altitude of the Galactic mid-plane is known to vary with Galactocentric radius (sometimes called corrugation, see e.g. Gum et al. 1960; Lockman 1977; Spicker & Feitzinger 1986; Cantat-Gaudin et al. 2018b), which might be an additional reason why different samples yield slightly different values⁸.

In the Solar neighbourhood, where the typical cluster age is ∼100 Myr, the cluster scale height of the best-fit model is 74 ± 5 pc, which is marginally compatible with the 64 ± 2 pc of Joshi et al. (2016). Our best-fit value of ρ = 0.016 ± 0.003 (16 pc per kpc) is in good agreement with the value of 0.02 reported by Buckner & Froebrich (2014).

We also find that the scale height increases to several hundreds of parsecs for old clusters (also reported by Janes & Phelps 1994; Froebrich et al. 2010; Buckner & Froebrich 2014), with a power-law index of α = 1.3 ± 0.2. The mechanism often invoked to explain the steeper increase at higher ages is that clusters whose orbits do not reach high Z are destroyed at higher rates, which is due to crossing paths more often with giant molecular clouds (Moitinho 2010; Buckner & Froebrich 2014). Friel (1995) remarked that some old clusters reach such high altitudes that the encounter responsible for perturbing their orbit would likely disrupt the cluster in the process. Although Gustafsson et al. (2016) have shown that some clusters might survive such strong perturbations, there are no quantitative arguments to support that this mechanism is the only reason for the increase in scale height.

The phenomenon of heating has been studied more thoroughly for field stars than for clusters, but almost all studies have been performed in velocity space rather than positional space, making direct comparisons difficult. The time dependence of the vertical velocity dispersion in the Solar neighbourhood is often modelled as a power law (σ_v ∝ t^α). Theoretical models predict values of α < 0.3 (Hänninen & Flynn 2002), while observations of field stars suggest an age exponent of α ∼ 0.5 (e.g. Wielen 1977; Holmberg et al. 2009; Aumer et al. 2016; Sharma et al. 2020), showing that other mechanisms have contributed to vertical heating such as mergers (Martig et al. 2014) or more efficient scattering by giant molecular clouds in the young Milky Way (Ting & Rix 2019). We refer the interested reader to Sect. 5.3.2 of Bland-Hawthorn & Gerhard (2016), who discuss recent estimates of the age-velocity dispersion relation.

The age-scale height relation we derive in this study cannot be directly compared to the age-velocity relation. It is not clear how a power-law increase of index 0.5 in velocity dispersion translates in positional space. The details of the relation between maximum velocity and maximum excursion from the Galactic plane (Z_max) depend on the assumed Galactic potential. For the MWPotential2014 potential shipped with galpy (Bovy 2015), the relation is close to Z_max ∝ v^1.3, implying a steeper time dependence than a power law of index 0.5.
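As a rough illustration of this last statement, and assuming that the two power laws can simply be composed (a simplification made here for illustration, not a result of this study), a velocity dispersion growing as t^0.5 would give:

$\begin{aligned} Z_{\max} \propto v^{1.3} \propto \left( t^{0.5} \right)^{1.3} = t^{0.65}, \end{aligned}$

that is, an index of 0.65 rather than 0.5 in positional space, which remains shallower than the index of 1.3 ± 0.2 fitted to the cluster scale height.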

Radial migration and heating can also cause clusters to reach higher altitudes: due to the shallower potential of the outer disc, a given vertical velocity allows particles to reach larger excursions from the plane when their guiding radius is shifted outwards. If inward-migrating clusters are destroyed at higher rates than outward-migrating clusters (as suggested by e.g. Anders et al. 2017), then the mean Galactocentric radius and mean altitude of surviving clusters are expected to increase with age. Radial heating also contributes because particles on elliptical orbits reach higher altitudes near their apocentre.

The scale height of very young clusters appears to be rather large in the outer disc, with several of our clusters younger than 200 Myr reaching altitudes of 300 pc. Although the distances of these distant objects are less precise than for more nearby clusters, these results are compatible with the infrared findings of Camargo et al. (2015), who report seven embedded, and therefore very young, clusters that are further than 500 pc from the Galactic plane at R_GC ∼ 14 kpc. Our simple model assumes a linear increase in the scale height with Galactocentric radius, but Kalberla et al. (2007) and Kalberla & Dedes (2008), who could trace atomic hydrogen out to much larger distances than our cluster sample, show that the flaring of HI gas outside the Solar circle is better reproduced with an exponential function, and Wang et al. (2018) used a quadratic function.

Finally, if cluster disruption rates are lower in the outer disc, one would also expect scattering rates to be lower. Mathematically, this could be modelled by modifying Eq. (2) to allow the index α to vary with R_GC. Including radial migration, heating, and disruption rates varying with R_GC and Z would make the model overly complicated and poorly constrained, with highly degenerate parameters.

Characterising the velocity distribution of clusters is out of the scope of this paper, but it would provide further insight on the processes of migration, heating, and disruption. Detailed chemical studies through high-resolution spectroscopy can also shed light on the origin of clusters. The old, metal-rich object NGC 6791 is a well-known case of a cluster migrating from the inner disc (Jílková et al. 2012; Carraro 2014; Martinez-Medina et al. 2018), but lesser-known or newly discovered clusters with discrepant altitudes (such as BH 144 or UBC 648, labelled in Fig. 9) might also show evidence for radial migration.

5.3. Galactic warp

The Galactic mid-plane is known to deviate from the geometrical b = 0° plane in the outer disc, which is a phenomenon called warp. The warping of the Galactic plane is particularly visible in the HI gas distribution (Burke 1957; Kerr 1957; Westerhout 1957; Levine et al. 2006; Kalberla et al. 2007) and is now known to be a common feature in disc galaxies (e.g. Sancisi 1976; Briggs 1990; Sánchez-Saavedra et al. 2003). The warp is also visible in the distribution of molecular clouds (Wouterloot et al. 1990), dust (Marshall et al. 2006), stars (López-Corredoira et al. 2002; Moitinho et al. 2006; Vázquez et al. 2008; Reylé et al. 2009; Amôres et al. 2017; Chrobakova et al. 2020), and stellar kinematics (Poggio et al. 2018; Schönrich & Dehnen 2018); additionally, it was recently investigated by tracing the distribution of classical Cepheids (Skowron et al. 2019; Chen et al. 2019). These young (∼20 to 120 Myr: Efremov 1978; Bono et al. 2005; Senchyna et al. 2015) and bright stars are visible at large distances and allow for precise distance determinations.

In Fig. 13 we compare the location of known clusters with classical Cepheids. The lower panels only include tracers in two bins of Galactocentric angular coordinates Φ, where Φ = 0° is the line passing through the Galactic centre and the Solar location, and Φ increases in the opposite direction to Galactic rotation (convention used in e.g. Ripepi et al. 2019; Skowron et al. 2019). The distant clusters in the third Galactic quadrant are on average older than 1 Gyr, and they follow the same southward trend as the young Cepheids. The number of known distant clusters is unfortunately too small to allow us to verify whether the Cepheid warp and the old cluster warp still coincide for Φ > 40°. In particular, no known clusters are located in the region of the northern warp.
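The angular coordinate Φ can be sketched as follows. This is a minimal illustration, not the code used in this study: the function name is hypothetical, and the sign convention is our reading of the definition above, combined with the heliocentric axes used elsewhere in this paper (X towards the Galactic centre, Y along Galactic rotation) and a Solar Galactocentric distance of 8340 pc.

```python
import math


def galactocentric_azimuth_deg(x_pc, y_pc, r_sun_pc=8340.0):
    """Angle Phi in degrees, measured at the Galactic centre.

    Phi = 0 on the line joining the Galactic centre and the Sun, and Phi
    grows in the direction opposite to Galactic rotation (towards negative Y).
    """
    return math.degrees(math.atan2(-y_pc, r_sun_pc - x_pc))
```

With this convention, objects in the third Galactic quadrant (X < 0, Y < 0) have positive values of Φ.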

Fig. 13.

Top: Y versus Z coordinates of our cluster sample and the Cepheids from Ripepi et al. (2019) and Chen et al. (2019). Bottom: galactocentric distance versus altitude Z in two ranges of Galactocentric angular coordinates, both in the third Galactic quadrant.

6. Discussion

Among the clusters for which we can derive parameters, the closest to the Galactic centre is Ruprecht 126 (log t = 8.11; R_GC = 5230 pc). Several known clusters might be located even deeper in the disc, according to their small parallax and apparent location, but their CMDs are too sparse and blurry to allow us to derive meaningful parameters and to constrain their distance with photometry. The deepest known clusters would be BH 222 (also studied by Piatti & Clariá 2002) and Gulliver 41, both of which are at R_GC < 3 kpc and lack estimated parameters in our catalogue.

We label in Fig. 9 several old clusters that stand out as outliers. One of them is the well-studied NGC 6791, an old metal-rich cluster whose likely origin is the bulge or the inner disc. Berkeley 20, Berkeley 29, and Saurer 1 are also well-known distant objects, which are currently located far from the Galactic plane. The object UBC 648 is a recently discovered cluster (Castro-Ginard et al. 2020), and it is also located far from the Galactic plane.

The cluster LP 861 was only recently discovered (Liu & Pang 2019) and is one of the innermost old clusters known. Other intermediate-age or old clusters were recently identified in the Gaia DR2 data, such as UBC 307, UBC 310, UBC 339, LP 866, and UFMG 2, which are all located at R_GC < 6.5 kpc and at very low altitudes. The only such objects known before Gaia were NGC 6005 (Piatti et al. 1998), NGC 6583 (Carraro et al. 2005), Ruprecht 134 (Carraro et al. 2006), and Teutsch 84 (Kronberger et al. 2006). These objects deserve further investigation in order to understand how they can survive to reach old ages in such a dense environment. They might be on very elliptical orbits, have recently migrated inwards, or their initial mass may have been sufficient for them to remain gravitationally bound.

We cannot presently probe the structure of the outer disc (e.g. the trace of the Cygnus arm or the geometry of the warp) with the sample of clusters identified in Gaia (with G < 18). As is visible in Fig. 9, very few clusters are known at R_GC > 14 kpc and no clusters are known beyond 16.5 kpc, with the exception of Berkeley 29 and Saurer 1, which are near R_GC ∼ 20 kpc. This lack of available tracers is due, at least in part, to an obscured line of sight preventing us from identifying distant objects near the Galactic plane. A near-infrared Gaia-like mission (Hobbs et al. 2016, 2019) would allow us to see through dust clouds and reveal obscured structures and embedded clusters. The upcoming ground-based LSST (LSST Science Collaboration 2009; Ivezić et al. 2019) will reach stars seven magnitudes fainter than Gaia, and it is expected to provide proper motions better than 1 mas yr⁻¹ down to G ∼ 24, allowing one to push the boundaries of cluster detection further than presently possible.

We note, however, that the distant outer disc clusters, especially in the third quadrant, are not strongly affected by extinction (Fig. 10). This suggests that the drop in density is not just an observational bias, but also a sign that few clusters populate the distant outer disc. Stellar population studies typically locate the disc truncation radius near 14 kpc (Robin et al. 1992), 15 kpc (Ruphy et al. 1996), or 16 kpc (Amôres et al. 2017). Due to the uncertainty on the completeness of our sample in the outer disc, we did not attempt to fit a radial density profile or try to identify a cut-off Galactocentric radius, but the observed cluster distribution visually agrees with a cut-off point near 14 kpc. The objects Berkeley 29 and Saurer 1, which are on the far edge of the disc, would therefore be outliers on very perturbed orbits, rather than representatives of a cluster population forming at extreme Galactocentric distances. On the other hand, several distant disc clusters were recently discovered with a combination of Gaia data and deep ground-based photometry by authors searching for satellite systems (Koposov et al. 2017; Torrealba et al. 2019). The lack of clusters beyond R_GC ∼ 16 kpc could therefore be an observational bias that future studies will be able to overcome.

This study focuses on the present-day location of clusters. The Gaia DR2 catalogue also allows us to determine proper motions for all of them and, therefore, estimate tangential velocities. Soubiran et al. (2018) have obtained mean radial velocities for several hundreds of clusters using the Gaia Radial Velocity Spectrometer (Cropper et al. 2018) and shown a smooth increase in vertical velocity dispersion with age. Further insight can be gathered by supplementing the scarce Gaia radial velocities with observations from other surveys (e.g. Carrera et al. 2019, with APOGEE and GALAH data). Although Gaia DR3 will contain significantly more radial velocities than DR2 (Brown 2019), the Gaia spacecraft only has limited spectroscopic capabilities. Ground-based spectroscopic surveys such as APOGEE (Majewski et al. 2017), Gaia-ESO (Gilmore et al. 2012; Randich et al. 2013), GALAH (De Silva et al. 2015), LAMOST (Cui et al. 2012), or the upcoming WEAVE (Dalton et al. 2012) and 4MOST (de Jong et al. 2012; Guiglion et al. 2019) will provide additional observations allowing for the full characterisation of the 3D velocities of many more objects, and they will shed light on the dynamical processes that drive the evolution of the spiral structure and the heating of the Galactic disc.

7. Summary and conclusion

This study relies almost exclusively on Gaia DR2 data. We characterise clusters whose members were identified with Gaia astrometry. We use an artificial neural network to estimate the age, distance modulus, and interstellar extinction of each cluster from the Gaia photometry of its members and their mean Gaia parallax. The training set was built using observed clusters with reliable parameters.

After visually inspecting the colour-magnitude diagrams and verifying the consistency of the parameter estimates, we end up with 1867 clusters with reliable parameters. The 3D distribution of clusters traces the structure of the Galactic disc, with warping and flaring in the outer disc. We clearly observe the known increase in cluster scale height with age. Various mechanisms contribute to this increase, and the current cluster locations are not sufficient to disentangle the effects of heating, migration, and location-dependent disruption rates. Establishing the 3D velocity vector and characterising the orbital parameters of clusters and their dependence on age will provide further insight on the evolutionary history of the Milky Way.

Projected on the Galactic plane, the locations of young clusters roughly align along the expected spiral pattern, and especially the local and Perseus arms. We argue that the apparent interruption in the Perseus arm is physical, and it is not due to an observational bias introduced by interstellar extinction. More kinematical data is needed in order to determine whether the Perseus arm is in the process of fragmenting. Our present sample does not contain a sufficient number of distant clusters to trace the path of the outer arm or constrain the geometry of the warp in the outer disc.

The catalogue presented in this paper is the largest homogeneous analysis of cluster parameters performed with Gaia data so far, with almost two thousand objects. The continuous discovery of new clusters and the development of data-driven methods that are capable of including other photometric passbands, astrophysical parameters from value-added catalogues, or rotation periods will allow for more precise and accurate cluster parameter estimates as well as a consistent account of observational errors.

¹

Trumpler (1925) was the first to group clusters by age according to their magnitude-spectral class diagrams, but his evolutionary sequence was wrong. It was then believed that stars formed as giants and contracted into main-sequence dwarfs (see Sandage 1988, for a discussion).

²

The pre-main sequence of young clusters is also a good age indicator, but these low-mass stars are too faint to be observed in most objects.

³

The distance-corrected magnitude of a star is based on the cluster mean parallax, G + 5 × log₁₀(⟨ϖ⟩/1000) + 5, and does not include correction for reddening.

⁴

The morphological age index of Salaris et al. (2004) includes a log t correction of 0.07 per dex of metallicity.

⁵

The “Class C” clusters UBC 605, 610, 625, 632, 642, and 649 from Castro-Ginard et al. (2020) are compact in astrometric space but their CMDs are sparse and blurry.

⁶

The Sun is at the origin. We note that X increases towards the Galactic centre, Y is in the direction of Galactic rotation, and Z is in the direction of the Galactic north pole.

⁷

The most precise and recent estimate (Gravity Collaboration 2019) proposes a slightly smaller radius of ∼8180 pc.

⁸

We refer the interested reader to Karim & Mamajek (2017), who compiled a list of over 60 estimates published since 1918.

Acknowledgments

We thank the referee for useful suggestions that helped clarify this paper. This work has made use of data from the European Space Agency (ESA) mission Gaia (www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. This work was supported by the MINECO (Spanish Ministry of Economy) through grant ESP2016-80079-C2-1-R and RTI2018-095076-B-C21 (MINECO/FEDER, UE), and MDM-2014-0369 of ICCUB (Unidad de Excelencia “María de Maeztu”). TCG acknowledges support from Juan de la Cierva – Formación 2015 grant, MINECO (FEDER/UE). FA is grateful for funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 800502. AM acknowledges the support from the Portuguese Strategic Programme UID/FIS/00099/2019 for CENTRA. AV and AB acknowledge PREMIALE 2015 MITiC. DB is supported in the form of work contract FCT/MCTES through national funds and by FEDER through COMPETE2020 in connection to these grants: UID/FIS/04434/2019; PTDC/FIS-AST/30389/2017 & POCI-01-0145-FEDER-030389. The preparation of this work has made extensive use of Topcat (Taylor 2005), and of NASA’s Astrophysics Data System Bibliographic Services, as well as the open-source Python packages Astropy (Astropy Collaboration 2013), NumPy (Van Der Walt et al. 2011), and scikit-learn (Pedregosa et al. 2011). The figures in this paper were produced with Matplotlib (Hunter 2007). Figure 12 was produced with corner (Foreman-Mackey 2016).

References

Ahumada, A. V., Cignoni, M., Bragaglia, A., et al. 2013, MNRAS, 430, 221
Amôres, E. B., Robin, A. C., & Reylé, C. 2017, A&A, 602, A67
Anders, F., Chiappini, C., Minchev, I., et al. 2017, A&A, 600, A70
Anders, F., Khalatyan, A., Chiappini, C., et al. 2019, A&A, 628, A94
Andrae, R., Fouesneau, M., Creevey, O., et al. 2018, A&A, 616, A8
Andreuzzi, G., Bragaglia, A., Tosi, M., & Marconi, G. 2011, MNRAS, 412, 1265
Anthony-Twarog, B. J., & Twarog, B. A. 1985, ApJ, 291, 595
Arenou, F., Luri, X., Babusiaux, C., et al. 2018, A&A, 616, A17
Arnason, R. M., Barmby, P., & Vulic, N. 2020, MNRAS, 492, 5075
Astropy Collaboration (Robitaille, T. P., et al.) 2013, A&A, 558, A33
Aumer, M., Binney, J., & Schönrich, R. 2016, MNRAS, 462, 1697
Barnes, S. A. 2007, ApJ, 669, 1167
Baron, D. 2019, ArXiv e-prints [arXiv:1904.07248]
Barrado y Navascués, D., Stauffer, J. R., & Jayawardhana, R. 2004, ApJ, 614, 386
Bastian, N., Kamann, S., Cabrera-Ziri, I., et al. 2018, MNRAS, 480, 3739
Becker, W., & Fenkart, R. B. 1970, in The Spiral Structure of our Galaxy, eds. W. Becker, & G. I. Kontopoulos, IAU Symp., 38, 205
Bland-Hawthorn, J., & Gerhard, O. 2016, ARA&A, 54, 529
Bono, G., Marconi, M., Cassisi, S., et al. 2005, ApJ, 621, 966
Bossini, D., Vallenari, A., Bragaglia, A., et al. 2019, A&A, 623, A108
Boucaud, A., Huertas-Company, M., Heneka, C., et al. 2020, MNRAS, 491, 2481
Bouma, L. G., Hartman, J. D., Bhatti, W., Winn, J. N., & Bakos, G. Á. 2019, ApJS, 245, 13
Bovy, J. 2015, ApJS, 216, 29
Bragaglia, A., & Tosi, M. 2006, AJ, 131, 1544
Bragaglia, A., Tosi, M., Andreuzzi, G., & Marconi, G. 2006, MNRAS, 368, 1971
Bressan, A., Marigo, P., Girardi, L., et al. 2012, MNRAS, 427, 127
Briggs, F. H. 1990, ApJ, 352, 15
Brown, A. G. A. 2019, https://doi.org/10.5281/zenodo.2637972
Buckner, A. S. M., & Froebrich, D. 2014, MNRAS, 444, 290
Burke, B. F. 1957, AJ, 62, 90
Camargo, D., Bonatto, C., & Bica, E. 2015, MNRAS, 450, 4150
Cantat-Gaudin, T., & Anders, F. 2020, A&A, 633, A99
Cantat-Gaudin, T., Vallenari, A., Zaggia, S., et al. 2014, A&A, 569, A17
Cantat-Gaudin, T., Vallenari, A., Sordo, R., et al. 2018a, A&A, 615, A49
Cantat-Gaudin, T., Jordi, C., Vallenari, A., et al. 2018b, A&A, 618, A93
Cantat-Gaudin, T., Krone-Martins, A., Sedaghat, N., et al. 2019, A&A, 624, A126
Carraro, G. 2014, in Properties and Origin of the Old, Metal Rich, Star Cluster, NGC 6791, eds. H. W. Lee, Y. W. Kang, & K. C. Leung, ASP Conf. Ser., 482, 245
Carraro, G., & Baume, G. 2003, MNRAS, 346, 18
Carraro, G., & Chiosi, C. 1994, A&A, 287, 761
Carraro, G., Méndez, R. A., & Costa, E. 2005, MNRAS, 356, 647
Carraro, G., Janes, K. A., Costa, E., & Méndez, R. A. 2006, MNRAS, 368, 1078
Carrera, R., Bragaglia, A., Cantat-Gaudin, T., et al. 2019, A&A, 623, A80
Castro-Ginard, A., Jordi, C., Luri, X., et al. 2018, A&A, 618, A59
Castro-Ginard, A., Jordi, C., Luri, X., Cantat-Gaudin, T., & Balaguer-Núñez, L. 2019, A&A, 627, A35
Castro-Ginard, A., Jordi, C., Luri, X., et al. 2020, A&A, 635, A45
Chen, B., Figueras, F., Torra, J., et al. 1999, A&A, 352, 459
Chen, B., Stoughton, C., Smith, J. A., et al. 2001, ApJ, 553, 184
Chen, X., Wang, S., Deng, L., et al. 2019, Nat. Astron., 3, 320
Chrobakova, Z., Nagy, R., & Lopez-Corredoira, M. 2020, A&A, 637, A96
Cignoni, M., Beccari, G., Bragaglia, A., & Tosi, M. 2011, MNRAS, 416, 1077
Collinder, P. 1931, Ann. Obs. Lund, 2, B1
Corder, S., Sheth, K., Scoville, N. Z., et al. 2008, ApJ, 689, 148
Cropper, M., Katz, D., Sartoretti, P., et al. 2018, A&A, 616, A5
Cui, X.-Q., Zhao, Y.-H., Chu, Y.-Q., et al. 2012, Res. Astron. Astrophys., 12, 1197
Curtis, J. L., Agüeros, M. A., Mamajek, E. E., Wright, J. T., & Cummings, J. D. 2019, AJ, 158, 77
Dalton, G., Trager, S. C., Abrams, D. C., et al. 2012, in WEAVE: The Next Generation Wide-field Spectroscopy Facility for the William Herschel Telescope, SPIE Conf. Ser., 8446, 84460P
Danielski, C., Babusiaux, C., Ruiz-Dern, L., Sartoretti, P., & Arenou, F. 2018, A&A, 614, A19
de Jong, R. S., Bellido-Tirado, O., Chiappini, C., et al. 2012, in 4MOST: 4-metre Multi-object Spectroscopic Telescope, SPIE Conf. Ser., 8446, 84460T
de Juan Ovelar, M., Gossage, S., Kamann, S., et al. 2020, MNRAS, 491, 2129
De Silva, G. M., Freeman, K. C., Bland-Hawthorn, J., et al. 2015, MNRAS, 449, 2604
Delgado, A. J., Sampedro, L., Alfaro, E. J., et al. 2016, MNRAS, 460, 3305
Dias, W. S., & Lépine, J. R. D. 2005, ApJ, 629, 825
Dias, W. S., Alessi, B. S., Moitinho, A., & Lépine, J. R. D. 2002, A&A, 389, 871
Dobbie, P., & Warren, S. J. 2020, ArXiv e-prints [arXiv:2003.05757]
Dobbs, C. L., & Bonnell, I. A. 2006, MNRAS, 367, 873
Donati, P., Bragaglia, A., Cignoni, M., Cocozza, G., & Tosi, M. 2012, MNRAS, 424, 1132
Donati, P., Beccari, G., Bragaglia, A., Cignoni, M., & Tosi, M. 2014a, MNRAS, 437, 1241
Donati, P., Cantat Gaudin, T., Bragaglia, A., et al. 2014b, A&A, 561, A94
Donati, P., Bragaglia, A., Carretta, E., et al. 2015, MNRAS, 453, 4185
Douglas, S. T., Curtis, J. L., Agüeros, M. A., et al. 2019, ApJ, 879, 100
Efremov, I. N. 1978, Sov. Astron., 22, 161
Elmegreen, B. G., Elmegreen, D. M., & Efremov, Y. N. 2018, ApJ, 863, 59
Evans, D. W., Riello, M., De Angeli, F., et al. 2018, A&A, 616, A4
Feinstein, A., Marraco, H. G., & Muzzio, J. C. 1973, A&AS, 12, 331
Fitzpatrick, E. L., & Massa, D. 2007, ApJ, 663, 320
Fluke, C. J., & Jacobs, C. 2020, WIREs Data Mining and Knowledge Discovery, 10, e1349
Foreman-Mackey, D. 2016, J. Open Sour. Softw., 1, 24
Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306
Friel, E. D. 1995, ARA&A, 33, 381
Friel, E. D., Donati, P., Bragaglia, A., et al. 2014, A&A, 563, A117
Froebrich, D., Schmeja, S., Samuel, D., & Lucas, P. W. 2010, MNRAS, 409, 1281
Gaia Collaboration (Brown, A. G. A., et al.) 2018a, A&A, 616, A1
Gaia Collaboration (Babusiaux, C., et al.) 2018b, A&A, 616, A10
Gilmore, G., Randich, S., Asplund, M., et al. 2012, The Messenger, 147, 25
Grand, R. J. J., Kawata, D., & Cropper, M. 2014, MNRAS, 439, 623
Gravity Collaboration (Abuter, R., et al.) 2019, A&A, 625, L10
Guiglion, G., Battistini, C., Bell, C. P. M., et al. 2019, The Messenger, 175, 17
Gum, C. S., Kerr, F. J., & Westerhout, G. 1960, MNRAS, 121, 132
Gustafsson, B., Church, R. P., Davies, M. B., & Rickman, H. 2016, A&A, 593, A85
Hänninen, J., & Flynn, C. 2002, MNRAS, 337, 731
Hatzidimitriou, D., Held, E. V., Tognelli, E., et al. 2019, A&A, 626, A90
Heiter, U., Soubiran, C., Netopil, M., & Paunzen, E. 2014, A&A, 561, A93
Herschel, W. 1785, Philos. Trans. R. Soc. London Ser. I, 75, 213
Hobbs, D., Høg, E., Mora, A., et al. 2016, ArXiv e-prints [arXiv:1609.07325]
Hobbs, D., Brown, A., Høg, E., et al. 2019, ArXiv e-prints [arXiv:1907.12535]
Holmberg, J., Nordström, B., & Andersen, J. 2009, A&A, 501, 941
Hou, L. G., & Han, J. L. 2014, A&A, 569, A125
Hunt, J. A. S., Kawata, D., Grand, R. J. J., et al. 2015, MNRAS, 450, 2132
Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90
Hur, H., Sung, H., & Bessell, M. S. 2012, AJ, 143, 41
Ivezić, Ž., Kahn, S. M., Tyson, J. A., et al. 2019, ApJ, 873, 111
Janes, K., & Adler, D. 1982, ApJS, 49, 425
Janes, K. A., & Phelps, R. L. 1994, AJ, 108, 1773
Janes, K. A., Tilley, C., & Lynga, G. 1988, AJ, 95, 771
Jardine, K., Poggio, E., & Drimmel, R. 2019, https://doi.org/10.5281/zenodo.3235352
Jeffery, E. J., von Hippel, T., van Dyk, D. A., et al. 2016, ApJ, 828, 79
Jeffries, R. D., & Oliveira, J. M. 2005, MNRAS, 358, 13
Jeffries, R. D., Jackson, R. J., Franciosini, E., et al. 2017, MNRAS, 464, 1456
Jílková, L., Carraro, G., Jungwiert, B., & Minchev, I. 2012, A&A, 541, A64
Jordi, C., Gebran, M., Carrasco, J. M., et al. 2010, A&A, 523, A48
Joshi, Y. C. 2007, MNRAS, 378, 768
Joshi, Y. C., Dambis, A. K., Pandey, A. K., & Joshi, S. 2016, A&A, 593, A116
Jurić, M., Ivezić, Ž., Brooks, A., et al. 2008, ApJ, 673, 864
Kalberla, P. M. W., & Dedes, L. 2008, A&A, 487, 951
Kalberla, P. M. W., Dedes, L., Kerp, J., & Haud, U. 2007, A&A, 469, 511
Karim, M. T., & Mamajek, E. E. 2017, MNRAS, 465, 472
Kerr, F. J. 1957, AJ, 62, 93
Kharchenko, N. V., Piskunov, A. E., Schilbach, E., Röser, S., & Scholz, R. D. 2013, A&A, 558, A53
Kingma, D. P., & Ba, J. 2014, 3rd International Conference for learning representations, San Diego
Koposov, S. E., Belokurov, V., & Torrealba, G. 2017, MNRAS, 470, 2702
Kounkel, M., & Covey, K. 2019, AJ, 158, 122
Kronberger, M., Teutsch, P., Alessi, B., et al. 2006, A&A, 447, 921
Krone-Martins, A., & Moitinho, A. 2014, A&A, 561, A57
Kumar, B., Sharma, S., Manfroid, J., et al. 2014, A&A, 567, A109
Lallement, R., Babusiaux, C., Vergely, J. L., et al. 2019, A&A, 625, A135
Leung, H. W., & Bovy, J. 2019, MNRAS, 483, 3255
Levine, E. S., Blitz, L., & Heiles, C. 2006, ApJ, 643, 881
Li, C., Sun, W., de Grijs, R., et al. 2019, ApJ, 876, 65
Lindegren, L., Hernández, J., Bombrun, A., et al. 2018, A&A, 616, A2
Liu, L., & Pang, X. 2019, ApJS, 245, 32
Lockman, F. J. 1977, AJ, 82, 408
López-Corredoira, M., Cabrera-Lavers, A., Garzón, F., & Hammersley, P. L. 2002, A&A, 394, 883
LSST Science Collaboration (Abell, P. A., et al.) 2009, ArXiv e-prints [arXiv:0912.0201]
Lynga, G. 1982, A&A, 109, 213
Lyra, W., Moitinho, A., van der Bliek, N. S., & Alves, J. 2006, A&A, 453, 101
Mackereth, J. T., Bovy, J., Leung, H. W., et al. 2019, MNRAS, 489, 176
Maíz-Apellániz, J. 2001, AJ, 121, 2737
Majewski, S. R., Schiavon, R. P., Frinchaboy, P. M., et al. 2017, AJ, 154, 94
Marino, A. F., Milone, A. P., Casagrande, L., et al. 2018, ApJ, 863, L33
Marshall, D. J., Robin, A. C., Reylé, C., Schultheis, M., & Picaud, S. 2006, A&A, 453, 635
Martig, M., Minchev, I., & Flynn, C. 2014, MNRAS, 443, 2452
Martinez-Medina, L. A., Gieles, M., Pichardo, B., & Peimbert, A. 2018, MNRAS, 474, 32
Meibom, S., Barnes, S. A., Platais, I., et al. 2015, Nature, 517, 589
Meingast, S., Alves, J., & Fürnkranz, V. 2019, A&A, 622, L13
Melotte, P. J. 1915, Mem. R. Astron. Soc., 60, 175
Moitinho, A. 2010, in Star Clusters: Basic Galactic Building Blocks Throughout Time and Space, eds. R. de Grijs, & J. R. D. Lépine, IAU Symp., 266, 106
Moitinho, A., Vázquez, R. A., Carraro, G., et al. 2006, MNRAS, 368, L77
Molina Lera, J. A., Baume, G., & Gamen, R. 2019, MNRAS, 488, 2158
Monteiro, H., & Dias, W. S. 2019, MNRAS, 487, 2385
Monteiro, H., Dias, W. S., & Caetano, T. C. 2010, A&A, 516, A2
Morgan, W. W., Whitford, A. E., & Code, A. D. 1953, ApJ, 118, 318
Naylor, T., & Jeffries, R. D. 2006, MNRAS, 373, 1251
Overbeek, J. C., Friel, E. D., Donati, P., et al. 2017, A&A, 598, A68
Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, J. Mach. Learn. Res., 12, 2825
Perren, G. I., Vázquez, R. A., & Piatti, A. E. 2015, A&A, 576, A6
Pettitt, A. R., Tasker, E. J., & Wadsley, J. W. 2016, MNRAS, 458, 3990
Phelps, R. L., Janes, K. A., & Montgomery, K. A. 1994, AJ, 107, 1079
Piatti, A. E., & Clariá, J. J. 2002, A&A, 388, 179
Piatti, A. E., Clariá, J. J., Bica, E., Geisler, D., & Minniti, D. 1998, AJ, 116, 801
Poggio, E., Drimmel, R., Lattanzi, M. G., et al. 2018, MNRAS, 481, L21
Queiroz, A. B. A., Anders, F., Santiago, B. X., et al. 2018, MNRAS, 476, 2556
Randich, S., Gilmore, G., & Gaia-ESO Consortium 2013, The Messenger, 154, 47
Randich, S., Tognelli, E., Jackson, R., et al. 2018, A&A, 612, A99
Reed, B. C. 2006, JRASC, 100, 146
Reid, M. J., Menten, K. M., Brunthaler, A., et al. 2014, ApJ, 783, 130
Reid, M. J., Menten, K. M., Brunthaler, A., et al. 2019, ApJ, 885, 131
Reylé, C., Marshall, D. J., Robin, A. C., & Schultheis, M. 2009, A&A, 495, 819
Ricker, G. R., Winn, J. N., Vanderspek, R., et al. 2015, J. Astron. Telesc. Instrum. Syst., 1, 014003
Ripepi, V., Molinaro, R., Musella, I., et al. 2019, A&A, 625, A14
Robin, A. C., Creze, M., & Mohan, V. 1992, ApJ, 400, L25
Roca-Fàbrega, S., Valenzuela, O., Figueras, F., et al. 2013, MNRAS, 432, 2878
Romero-Gómez, M., Mateu, C., Aguilar, L., Figueras, F., & Castro-Ginard, A. 2019, A&A, 627, A150
Ruphy, S., Robin, A. C., Epchtein, N., et al. 1996, A&A, 313, L21
Salaris, M., Weiss, A., & Percival, S. M. 2004, A&A, 414, 163
Sánchez-Saavedra, M. L., Battaner, E., Guijarro, A., López-Corredoira, M., & Castro-Rodríguez, N. 2003, A&A, 399, 457
Sancisi, R. 1976, A&A, 53, 159
Sandage, A. 1988, PASP, 100, 293
Schönrich, R., & Dehnen, W. 2018, MNRAS, 478, 3809
Senchyna, P., Johnson, L. C., Dalcanton, J. J., et al. 2015, ApJ, 813, 31
Shapley, H. 1918, ApJ, 48, 154
Sharma, S., Hayden, M. R., Bland-Hawthorn, J., et al. 2020, MNRAS, submitted [arXiv:2004.06556]
Shetty, R., & Ostriker, E. C. 2006, ApJ, 647, 997
Siegel, M. H., LaPorte, S. J., Porterfield, B. L., Hagen, L. M. Z., & Gronwall, C. A. 2019, AJ, 158, 35
Skowron, D. M., Skowron, J., Mróz, P., et al. 2019, Science, 365, 478
Skrutskie, M. F., Cutri, R. M., Stiening, R., et al. 2006, AJ, 131, 1163
Soubiran, C., Cantat-Gaudin, T., Romero-Gómez, M., et al. 2018, A&A, 619, A155
Spicker, J., & Feitzinger, J. V. 1986, A&A, 163, 43
Spitzer, L., Jr., & Schwarzschild, M. 1951, ApJ, 114, 385
Spitzer, L., Jr., & Schwarzschild, M. 1953, ApJ, 118, 106
Sun, W., de Grijs, R., Deng, L., & Albrow, M. D. 2019, ApJ, 876, 113
Tang, B., Geisler, D., Friel, E., et al. 2017, A&A, 601, A56
Taylor, M. B. 2005, in Astronomical Data Analysis Software and Systems XIV, eds. P. Shopbell, M. Britton, & R. Ebert, ASP Conf. Ser., 347, 29
Ting, Y.-S., & Rix, H.-W. 2019, ApJ, 878, 21
Ting, Y.-S., Hawkins, K., & Rix, H.-W. 2018, ApJ, 858, L7
Torrealba, G., Belokurov, V., & Koposov, S. E. 2019, MNRAS, 484, 2181
Tosi, M., Bragaglia, A., & Cignoni, M. 2007, MNRAS, 378, 730
Trumpler, R. J. 1925, PASP, 37, 307
Trumpler, R. J. 1930, Lick Obs. Bull., 420, 154
Twarog, B. A., & Anthony-Twarog, B. J. 1989, AJ, 97, 759
van den Bergh, S. 1958, ZAp, 46, 176
van den Bergh, S., & McClure, R. D. 1980, A&A, 88, 360
van der Kruit, P. C. 1988, A&A, 192, 117
Van Der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, Comput. Sci. Eng., 13, 22
Vazquez, R. A., Baume, G., Feinstein, A., & Prado, P. 1996, A&AS, 116, 75
Vázquez, R. A., May, J., Carraro, G., et al. 2008, ApJ, 672, 930
von Hippel, T., Jefferys, W. H., Scott, J., et al. 2006, ApJ, 645, 1436
Wang, H.-F., Liu, C., Xu, Y., Wan, J.-C., & Deng, L. 2018, MNRAS, 478, 3367
Westerhout, G. 1957, Bull. Astron. Inst. Neth., 13, 201
Wielen, R. 1977, A&A, 60, 263
Wouterloot, J. G. A., Brand, J., Burton, W. B., & Kwee, K. K. 1990, A&A, 230, 21
Wright, E. L., Eisenhardt, P. R. M., Mainzer, A. K., et al. 2010, AJ, 140, 1868

All Figures

Fig. 1. Age and distance modulus of our reference clusters (described in Sect. 2.2).

Fig. 2. Architecture of our artificial neural network, indicating the width (number of nodes) of each layer. The example cluster is Haffner 22. The input quantities are described in Sect. 3.1.

Fig. 3. Evolution of the training and validation scores with training iterations. The network used in this study is the result of 200 iterations.

Fig. 4. Difference between the age estimate and the reference value for ∼120 000 validation samples, split in two age groups. The full line is a running mean. The dashed lines represent the upper and lower standard deviation.