The young stellar content of the giant H II regions M 8, G333.6-0.2, and NGC 6357 with VLT/KMOS

Context. The identiﬁcation and characterisation of populations of young massive stars in (giant) H ii regions provides important constraints on (i) the formation process of massive stars and their early feedback on the environment, and (ii) the initial conditions for population synthesis models predicting the evolution of ensembles of stars. Aims. We identify and characterise the stellar populations of the following young giant H ii regions: M8, G333.6 − 0.2, and NGC6357. Methods. We have acquired H - and K -band spectra of around 200 stars using the K -band Multi Object Spectrograph on the ESO Very Large Telescope. The targets for M8 and NGC6357 were selected from the Massive Young Star-Forming Complex Study in Infrared and X-ray (MYStIX), which combines X-ray observations with near-infrared (NIR) and mid-infrared data. For G333.6 − 0.2, the sample selection is based on the NIR colours combined with X-ray data. We introduce an automatic spectral classiﬁcation method in order to obtain temperatures and luminosities for the observed stars. We analysed the stellar populations using their photometric, astrometric, and spectroscopic properties and compared the position of the stars in the Hertzprung-Russell diagram with stellar evolution models to constrain their ages and mass ranges. Results. We conﬁrm the presence of candidate ionising sources in the three regions and report new ones, including the ﬁrst spec-troscopically identiﬁed O stars in G333.6 − 0.2. In M8 and NGC6357, two populations are identiﬁed: (i) OB main-sequence stars ( M > 5 M (cid:12) ) and (ii) pre-main sequence stars ( M ≈ 0 . 5 − 5 M (cid:12) ). The ages of the clusters are ∼ 1 − 3Myr, < 3Myr, and ∼ 0.5 − 3Myr for M8, G333.6 − 0.2, and NGC6357, respectively. We show that MYStIX selected targets have > 90% probability of being members of the H ii region, whereas a selection based on NIR colours leads to a membership probability of only ∼ 70%


Introduction
The assembly of massive stars is a key problem in modern astronomy. From an observational point of view, two lines of research are being pursued to better understand their formation process (Zinnecker & Yorke 2007;Motte et al. 2018). One approach is aimed at the very early stages of star formation, observing the cold cores in which massive stars are being formed. The central objects are surrounded by extended disks Full Table 4 and normalized spectra are only available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/cat/J/A+A/633/A155 Based on observations collected at the European Southern Observatory at Paranal, Chile (ESO program 095.C-0048). and produce molecular outflows that can be studied at sub-mm and radio wavelengths (Beltrán & de Wit 2016;Cesaroni et al. 2017;Beuther et al. 2017;Ilee et al. 2016). A second strategy is to study the final phase and outcome of the formation process, characterising the recently formed stars and looking for remnant signatures of their assembly.
Here, we adopt the second strategy. Finding and characterising young massive stars is important for two reasons. First, the properties of the newly formed massive stars, such as mass, rotational velocity, magnetic field, and multiplicity provide us with information about the formation process. Second, the statistical distributions of these properties -once a large enough sample is scrutinised -define the initial conditions for population synthesis models in which the evolution of entire populations of massive stars is traced.
As formation takes place in opaque natal clouds dispersed throughout the Galactic disk, severe extinction constitutes an observational challenge when studying these young massive stars at optical or shorter wavelengths. Therefore, the youngest stars are studied at near-infrared (NIR) wavelengths, where the extinction is much lower. The technical development of instrumentation and detectors in this wavelength regime is also an advantage when studying the youngest populations of stars. For example, by studying stellar populations via NIR spectroscopy, Bik et al. (2012Bik et al. ( , 2014 present evidence of an age spread (2−3 Myr) in the giant H ii region W3 Main, where the most massive star is 35 M (O5.6−O7.5V). They conclude that the high UV flux from this and other massive stars affects the disk fraction of the lowest mass stars. Wu et al. (2016) analysed 14 O-type stars with masses between 20 and 120 M in the giant H ii region W49, and concluded that the cluster age is around 1.5 Myr. Ramírez-Tannus et al. (2017) performed a study of eleven massive young stellar object (mYSO) candidates in the giant H ii region M17, with masses ranging from 6 to 25 M and a likely age <1 Myr, and confirmed the pre-main-sequence (PMS) nature of six objects. This constitutes a small but unique sample of massive stars observed just after their formation in a very young cluster. Interestingly, they observed a lack of doublelined spectroscopic binaries and measured a narrow range of measured radial velocities (−10 < v rad < 20 km s −1 ) which is in stark contrast to the overall properties of small and large samples of fully formed, main-sequence massive stars (e.g. Mason et al. 2009;Sana et al. 2014;Kiminki & Kobulnicky 2012;Peter et al. 2012;Kobulnicky et al. 2014;Dunstall et al. 2015). The samples studied in these works are characterised by large radial-velocity amplitudes (∆v rad > 100 km s −1 ), from which one can conclude that 40% to 50% of OB-type binaries have periods of one month or less.
Spectroscopic studies aiming to classify (massive) stars in star-forming regions have the problem that cluster membership determination using NIR photometry suffer from pollution by fore-and background stars (typically ∼50% Bik et al. 2012;Wu et al. 2016). The Massive Young Star-Forming Complex Study in Infrared and X-ray (MYStIX, Feigelson et al. 2013) applied a novel approach in which X-ray, NIR, and mid-IR (MIR) observations are combined. Since X-rays are tracers of PMS low-and intermediate mass stars (through emission produced by coronal activity), and of massive stars (through emission produced by shocks in their stellar winds or colliding winds), this approach allows for a more efficient removal of foreand background stars and hence more robust constraints on cluster membership. NIR multiplexed instruments like the K-band Multi Object Spectrograph (KMOS) on the ESO Very Large Telescope allow for the simultaneous acquisition of tens of spectra in complex environments. The combination of this observing strategy with input catalogues containing MYStIX probable complex members (MPCM; Broos et al. 2013) provides an efficient way of obtaining spectra of young massive stars in regions with high extinction. Gaia DR2 (Gaia Collaboration 2016, 2018) information on distances and proper motions, as well as spectroscopic followup, provides a reliable way to assess cluster membership. We follow this approach here as well, and cross correlate the MYStIX and NIR photometry methods to assess their reliability. The latter methodologies are the main ones to use at large distances, where Gaia becomes unreliable or unfeasible.
In this paper we present H and K band spectroscopic observations performed with VLT/KMOS surveying three giant H ii regions (M 8, G333.6−0.2, and NGC 6357) in order to characterise their stellar populations. The three star forming regions References. (a) Kuhn et al. (2019). (b) This work.
observed, presented in Sect. 2, were selected to sample a range in cluster morphologies. In Sect. 3 we provide an overview of the target selection procedure. In Sect. 3.3 we describe the KMOS observations and in Sect. 3.4 the data reduction procedure. In Sect. 4 we present the stellar populations of the three giant H ii regions by placing them in NIR and MIR colour-magnitude (CMD) and colour-colour diagrams (CCD). In Sect. 5 we match our sources with the Gaia DR2 catalogues in order to use their astrometric properties to further assess cluster membership. In Sect. 6 we describe the mYSO detected in our sample. We present a detailed NIR spectral classification of the massive stars and the PMS population in Sect. 7. With these results we confirm or reject the massive star nature of the candidates presented by Povich et al. (2017). We combine all the results to place the stars in Hertzsprung-Russell diagrams (HRDs) and obtain an estimate of the age of the regions and the mass range of the observed stars in Sect. 8. We discuss our results in Sect. 9 and summarise our findings in Sect. 10.

Sample of giant H II regions
We observed three H ii regions with different mass ranges and physical sizes: G333.6−0.2, M 8, and NGC 6357 (Table 1). The location of the selected targets is shown in Figs. 1-3. We have colour-coded the targets according to their temperature derived from our spectral classification (see Sects. 7 and 8). The candidate mYSOs (Sect. 6) are labelled as magenta diamonds. Below, we describe each region in some detail.

M 8
Also known as the Lagoon Nebula ( Fig. 1  . The principal source of ionising flux in the H ii region is the O4 star 9 Sgr (Tothill et al. 2008).
The western part of the nebula is concentrated into a compact H ii region with much denser ionised gas than the main body of the Lagoon Nebula. This region is called the Hourglass Nebula and it is powered by the O7 star Herschel 36. Since M 8 is close to the Galactic plane and projected on top of the Galactic bulge, it is difficult to distinguish between the cluster members and foreground and background stars. Many molecular clumps are scattered through the nebula, suggesting that star formation is triggered by ionisation fronts produced by 9 Sgr and Herschel 36 (Tothill et al. 2002). The MPCM catalogue for M 8 contains 2056 sources  for which we obtained KMOS spectra; their colour represents the effective temperature according to their spectral classification. Arias et al. (2007) performed intermediate resolution spectroscopy of PMS stars in M 8. They identified 27 classical T Tauri stars, seven weak-lined T Tauri stars and three PMS emission objects with spectral type G. They identified the stars to be younger than 3 Myr and between 0.8 and 2.5 M . Prisinzano et al. (2007Prisinzano et al. ( , 2019 identified 237 PMS stars as members including 53 binaries. Kumar & Anandarao (2010) identified 64 class 0/I and 168 class II objects using Spitzer NIR photometry.
2.2. G333.6−0.2 G333.6−0.2 (Fig. 2) is an ultra-compact H ii (UCHII) region embedded in the giant H ii region RCW106. It is the most prominent massive star forming region in the G333 complex. Located at a distance of ∼2.6 kpc (Figuerêdo et al. 2005), it is compact (11 in diameter) and dusty, and hosts powerful ionising stars (Kumar 2013). G333.6−0.2 is embedded in a massive molecular clump of 15 000 M ( Mookerjea et al. 2004). Townsley et al. (2014Townsley et al. ( , 2018 detected more than 100 X-ray sources clustered in the central region, and over 500 sources in the field of view. The central sources would be responsible for powering the H ii region. Townsley et al. (2018) also detected diffuse X-ray emission that they associated with plasma flowing out of the embedded cluster. To our knowledge, there are no age estimates for G333.6−0.2 reported in the literature.

NGC 6357
NGC 6357 is a major star-forming region in the Sagittarius-Carina spiral arm (Fig. 3). Its brightest H ii region is G353.1+0.9 located near the northern rim of the cloud and ionised by the massive stellar cluster Pis 24 ). Maíz Apellániz et al. (2007 determined that Pis 24 contains two very massive stars of about 100 M each. The Hα image of NGC 6357 shows a shell containing a Wolf-Rayet star; this could be a super-bubble produced by an earlier generation of massive stars. This hypothesis is supported by the discovery of an older X-ray population around Pis 24 (Wang et al. 2007). Massi et al. (2015) found that Pis 24 is very young (∼1−3 Myr) and that it probably consists of a few sub-clusters. They found evidence that the most massive stars triggered the current star formation episode. The MPCM catalogue for this region contains 2235 sources

Observations and data reduction
The observations presented here were obtained between April and August 2015 using the KMOS (Sharples et al. 2013) mounted at the ESO VLT. For M 8 and NGC 6357 the target selection was based on the results of the MYStIX team Broos et al. 2013;Povich et al. 2017). G333.6−0.2 is not covered by MYStIX, therefore we adopted a different method in order to select the targets.

Target selection in M 8 and NGC 6357
M 8 and NGC 6357 are included in the MYStIX project Broos et al. 2013) which provides a selection of MPCM based on the cross-matching of three samples: (i) X-ray sources with IR counterparts, variability, and/or spatial distribution consistent with being PMS stars; (ii) MIR excess sources consistent with having circumstellar material (i.e. possible YSOs with or without X-ray counterparts); and (iii) known OB stars from Broos et al. (2013), which do not necessarily have X-ray counterparts.
As the MPCM catalogue contains many more sources than we can possibly observe with KMOS, we applied several selection criteria to obtain a prioritised list of targets. We limited our sample to objects with K s < 13.0 mag. Due to possible saturation of the KMOS detector we could not observe sources brighter than K s = 9.0 mag. The main aim of the project is to classify the stellar spectra in order to derive the overall characteristics of the host regions. Therefore, our selection focuses on stars without infrared excess, as these offer the best chance of a direct NIR detection of the stellar photosphere. We identify three priority classes as follows.
As Priority one targets we selected all stars listed as known OB stars in the MPCM catalogue. We added a sample of MYStIX candidate OB stars from Povich et al. (2017). These candidate X-ray emitting OB stars are identified via two methods: (1) By fitting Castelli & Kurucz (2003) models to the 1−8 µm spectral energy distribution and then selecting all objects with L > 10 4 L , which approximately corresponds to main sequence B1 stars or earlier; (2) By inspecting the NIR CCD and CMD for the regions and identifying sources with X-ray detection that are reddened main-sequence stars according to their position in the CCD and that have dereddened J-band magnitudes M J ≤ −2.4 mag. By performing SED fitting, Povich et al. (2017) reported five new candidate OB stars (of which we observed two) in addition to the 28 previously published ones (of which we observed 12) in M 8 (see Table 5). In NGC 6357 there were 16 previously published OB stars (of which we observed 4). Povich et al. (2017) increased this sample with 26 new candidate OB stars (of which we observed 17): 25 of those selected via SED fitting and one more selected via its NIR colours (see Table 7).
For the Priority two targets, we constructed colour-colour diagrams based on the MPCM JHK s photometry, and selected the sources without infrared excess. This is similar to method two described above, but without a J-band magnitude limit. The targets already included under priority one and those with 9 < K s < 13 mag are removed from this subsample. In M 8 we observed 55 priority two sources and in NGC 6357 82.
As low priority sources (Priority three) we added the infrared excess sources rejected in the selection of the priority two sources. We observed 12 sources in M 8 and 20 in NGC 6357.
3.2. Target selection in G333.6−0.2 In the case of G333.6−0.2 we selected 23 stars by matching the X-ray observations of this region with the VISTA DR2 catalogue. We carried out the target selection using the Chandra X-ray catalogue of Townsley et al. (2014), but in Table 4 we report matches to the more recent catalogue of Townsley et al. (2018). This does not change any of the pairings but it does update the X-ray catalogue names.
Many of our objects are saturated in VISTA DR2, and so we replaced any J, H, or K magnitudes brighter than 13 mag with those of the nearest 2MASS star within 3 . Both the VISTA A155, page 4 of 24 M. C. Ramírez-Tannus et al.: The young stellar content of M 8, and NGC 6357 and 2MASS magnitudes should be viewed with some caution as they are often for objects which are crowded or superimposed on nebulosity. We did not replace the positions of saturated stars since VISTA is less susceptible to crowding than 2MASS, and VISTA positions remain relatively precise to quite bright magnitudes (we found that in the range 9 < K < 10 mag 66% of 2MASS-VISTA matches are within 0.15 of each other in this region).
The matches were carried out using the Bayesian method described in Naylor et al. (2013), which assesses the likelihood of an X-ray/IR paring using the uncertainty in position assigned to each X-ray source by the source extraction. It also allows for the fact that brighter sources in the IR are more likely to be the counterparts of a given X-ray source, and reduces the probability of a match if there are other possible counterparts. The mean probability of an infrared star being the counterpart is 97%, with the lowest being 82%.
We complemented this sample by adding 18 stars from the 2MASS catalogue with H − K >1 mag in order to select some of the sources with high extinction. Also, these targets were split in three priority classes, where priority one contains 19 X-ray matched sources without NIR excess. Four X-ray matched sources show an infrared excess and are added as priority two sources. Finally, 18 2MASS sources without an X-ray match are the priority three sources.

VLT/KMOS observations
KMOS consists of 24 deployable integral field units (IFUs) and three spectrographs (Sharples et al. 2013). The IFUs are divided into three groups and the light is redirected into one of three spectrographs, each spectrograph disperses the light of eight IFUs. We divided the H ii regions in several fields and for each field we took a long (300 s) and a short (30 s) exposure in order to cover the faint (13 < K < 11 mag) and bright (K < 11 mag) targets, respectively. For each exposure we applied a 9-point dither pattern both for the H-and the K-band grating. The fields observed per H ii region are shown in Figs. 1-3. In the case of G333.6−0.2 and the upper right field in NGC 6357 we observed the same field twice, once for the bright and once for the faint stars. We used one IFU per spectrograph (three IFUs in total) to perform sky observations. A set of calibration images was taken in addition to each of the observations. This set consists of dark frames, flat fields and an arc-lamp exposure; the dark and arc-lamp frames are taken at six different rotator angles. A telluric standard star was observed in one IFU per spectrograph.

Data reduction
The KMOS observations were reduced using the KMOS pipeline version 3.12.3 (Davies et al. 2013) in combination with the ESO Recipe Execution Tool (esorex). These routines were used to calibrate and reconstruct the cubes of the science and standard-star observations as described by Davies et al. (2013). The sky subtraction was performed using the standard KMOS routines, and the telluric correction was done using the default template for telluric correction: one telluric star is observed in one IFU for each of the three spectrographs. This telluric spectrum is then used to perform the telluric correction for all stars in that spectrograph. The data reduction performed by the pipeline can be divided in three basic steps: (i) flat field correction, (ii) extraction of the spectra, and (iii) telluric correction, instrumental effects correction, and flux calibration.
The first and third steps use procedures that are standard for NIR spectroscopy. Given that we are doing integral field spectroscopy, the second step is different from the standard procedure for other instruments. In the case of KMOS, the reconstruction of the 3D data cubes is performed by interpolating from look-up tables which provide the (x, y, λ) location in the final cube for each pixel on the detector. The interpolation is performed based on weighted averages of neighbouring points; therefore the weighted average cannot have a higher value than the neighbour pixel with highest flux. This procedure is problematic for observations of spatially compact, but spectrally extended objects, like our targets. The spatial axes are in that case under-sampled, resulting in (at times quasi-periodic) ripples which appear in the spectra after interpolation and reconstruction. This problem is described in detail in Sect. 4.2 of Davies et al. (2013). The ripple pattern is prominent and different for each star and is severe in some cases, as can be seen in the spectra displayed in Fig. 14 and at the CDS. For these reasons, we were not able to remove the ripples from the spectra. In the cases where the ripples are severe, they hamper our ability to extract physical information from the spectral lines.
The spectra have a resolving power R between 3570 and 4880 over the 1.46−2.40 µm range, corresponding to ∼80−60 km s −1 respectively, per resolution element. Lapenna et al. (2015) performed a series of tests of the reliability of the KMOS pipeline wavelength calibration. They computed the radial velocity shifts from spaxel to spaxel in one IFU as well as the shifts for different IFUs. They conclude that, both within each IFU and among different IFUs, the wavelength calibration obtained from the KMOS pipeline is accurate at a level of a few km s −1 . They establish that the wavelength calibration by the KMOS pipeline is well suited for bulk kinematic studies, but if one would like to measure the radial velocity of an individual star, the wavelength calibration should be refined.
The normalisation procedure for the OBA stars was performed in two steps. We first fitted a second degree polynomial to all the spectra. Then we visually inspected all the spectra and selected the ones of sufficient quality (not too many ripples and detectable lines, which corresponds to signal-to-noise ratio, S /N > 10) to be classified spectroscopically. From this first step we could also distinguish the late-type objects and the YSOs from the OBA stars. We then re-normalised the OBA stars using the interactive spectrum normaliser specnorm.py 1 which allows to select continuum points of the spectrum interactively and fits a cubic spline through the selected points.

Photometric data
The photometric data were taken from the MYStIX catalogue King et al. 2013;Kuhn et al. 2013) which includes the JHK-band magnitudes from the United Kingdom Infra-red Telescope (UKIRT) wide field camera (Lawrence et al. 2007;Lucas et al. 2008) or from the 2MASS point source catalogue (Skrutskie et al. 2006), plus the four Spitzer IRAC bands (GLIMPSE; Benjamin et al. 2003). In the CDS tables described in Table 4 we list the photometry for each of the observed objects.
The near-and mid-IR CMDs and CCDs for each of the regions are shown in Figs. 4-6. In these diagrams too, we have colour coded the sources according to the spectral classification described in Sect. 7. The diagrams show the zero-age main Infrared excess sources sequence (solid line) and reddening lines (dashed and dotted lines). We adopted the extinction law by Cardelli et al. (1989) with the total-to-selective extinction value R V set to 3.1 (dashed lines) and 5 (dotted lines). In the NIR diagram one can distinguish a reddened main sequence (A V ∼ 2 mag in M 8 and A V ∼ 5 mag in NGC 6357) and a population of lower mass premain-sequence stars. For G333.6−0.2 we could not rely on the MYStIX catalogue. We used the JHK CMD and CCDs of this region in order to check for membership of our selected targets. The spread of the objects on the CMD and the CCD is sizeable and shows that in G333.6−0.2 the extinction varies strongly from sight line to sight line.
In the in the MYStIX IR Excess Source catalogue (MIRES) catalogue compiled by Povich et al. (2013) the authors identified probable YSO members of the MYStIX regions by combining IR SED fitting, IR colour cuts, and spatial clustering analysis. In the MIR CCDs of M 8 and NGC 6357 we plot the likely YSOs and IR excess sources from the MIRES catalogue (orange squares) as well as the sources that Povich et al. (2013) identify as having visible photospheres (grey dots) in order to identify the region in the CCDs that is dominated by YSOs. G333.6−0.2 is not covered by the MIRES catalogue, so we used the cut found in M 8 and NGC 6357 to show the YSO dominated region in the MIR CCD. In the bottom-right panels of Figs. 4-6 the cut used to separate YSOs and visible photospheres is shown with a dashed-grey line.
The CCD and CMD diagrams presented here are further discussed in Sect. 7.7, where they are also used to confirm or reject cluster membership.

Astrometric properties of our sample
Infrared excess sources separation no larger than 1 . In addition we limited ourselves to G ≤ 19 to reduce the risk of false matches with very faint stars. If more than one match was found within 1 , the brighter one was chosen. In addition, we extracted all Gaia sources with G ≤ 19 within a radius of 0.5 • from the central coordinates given in Table 1. The astrometric properties for the stars in the three clusters are shown in Fig. 7. The sky positions of the Gaia sources are shown with blue, orange, and red dots in the first column of this figure. The grey contours represent the density of Gaia sources in the local neighbourhood and the black dots show all MYStIX sources (in M 8 and NGC 6357) and X-ray sources (in G333.6−0.2) that have Gaia counterparts within 1 . We note that for G333.6−0.2 only 12 sources of our catalogue have Gaia counterparts that are brighter than G = 19. The parallaxes ( ) and the proper motions (µ α * , µ δ ) of the Gaia sources are shown in columns two to four of Fig. 7. In blue we show our catalogue stars, and the red and orange dots correspond to the stars among our catalogue stars that are probable outliers based on their astrometric properties ( , µ α * , µ δ ). We checked whether these properties are sufficiently reliable to reject their membership. We computed the renormalised unit weight error (RUWE) which quantifies the quality and the reliability of the astrometric fit (Lindegren 2018). Following the suggestion of the latter authors, we retain only those sources for which RUWE < 1.4 as astrometrically reliable.
The Gaia distances computed using the confirmed members (blue dots) are 1.34 ± 0.07 kpc for M 8 and 1.77 ± 0.12 kpc for NGC 6357, which are in agreement with the distances derived by Binder & Povich (2018) and Kuhn et al. (2019) using Gaia parallaxes of more complete samples. We derive a distance of 2.54 ± 0.71 kpc for G333.6−0.2 which is in agreement with the reported distance to this region by Figuerêdo et al. (2005).

Massive young stellar objects
In the sample selection we gave priority to objects with no IR excess. Nevertheless, our KMOS sample includes six objects that we classified as candidate mYSOs: object 32 in M 8, objects 33, F57, 114, and F68 in NGC 6357, and object 720 in G333.6−0.
Infrared excess sources consistent with a mYSO and/or class II nature. We classify these objects based on the presence of double-peaked emission lines and/or CO overtone emission, indicative of a rotating circumstellar disk. CO overtone emission is produced in high-density (10 10 -10 11 cm −3 ) and high-temperature (2500-5000 K) environments (Charbonneau 1995). CO is easily dissociated, and therefore must be shielded from the strong UV radiation coming from the star. These conditions occur in the inner regions of accretion disks, which makes the CO bandheads an important tool to trace the inner disk structure around mYSOs (e.g. Bik et al. 2006;Ilee et al. 2013;Ramírez-Tannus et al. 2017). The shape of the CO lines probes the Keplerian rotation profile of the circumstellar disk (Bik & Thi 2004;Blum et al. 2004;Davies et al. 2010;Wheelwright et al. 2010;Ilee et al. 2013). The blue shoulder in the bandheads is a measure of the inclination of the disk: an extended blue shoulder indicates a high inclination angle (i.e. near "edge-on" view).
In Fig. 8 we plot the Brackett series profiles and the CO first overtone bandheads for the six candidate YSOs in our KMOS sample. Here we provide a description of the individual mYSO candidates: NGC 6357-F68. We do not detect any CO bandheads in this object, but the Brackett (Br) lines are in emission. Brγ is very strong in emission on top of a photospheric absorption line. Br-12, Br-11, and Br-10 appear to be contaminated by residuals of the data reduction. This source is not part of the MIRES catalogue ).
G333.6−0.2-720. CO bandhead emission is not present. The Brackett lines are in emission superposed on an absorption line; the emission is very narrow and likely has a nebular origin. In order to confirm the presence of a circumstellar disk, more and better signal-to-noise observations are needed.
NGC 6357-114. This object does not show CO bandheads in emission, but all the Brackett lines show clearly double-peaked emission profiles. This is the only object in our sample where the double-peaked structure is resolved. This source is not in the MIRES catalogue.
NGC 6357-F57. The CO bandheads are clearly present in emission and they show a pronounced blue shoulder. The Br lines seem to be filled in by disk emission. Brγ shows a central emission, but the most pronounced part of this feature is very narrow and therefore is likely caused by residuals from the sky subtraction. This source is part of the MIRES catalogue and it is classified as a stage II YSO, dominated by optically thick circumstellar disks. . The blue, red and orange correspond to stars in our catalogues. The blue, red and orange dots correspond to the stars in our catalogue for which a reliable cross-match was found. The blue dots are confirmed cluster members, and the red and orange dots are candidate non-cluster members for which we verified the quality of the astrometric fit. The sources in orange have a reliable Gaia DR2 astrometric solution and therefore we confirm that they are not cluster members; for the sources in red we do not confirm nor reject cluster membership. The grey contours represent the density of Gaia DR2 sources brighter than G = 19 within a radius of 0.5 • from the central coordinates listed in Table 1 and the black dots show all the MYStIX sources with Gaia counterparts within 1 . The first column shows the equatorial Gaia sky positions. Columns two to four show the parallax vs the proper motion (µ α * , µ δ ).
and Brγ show emission. The Br lines are single-peaked and the CO bandheads lack a pronounced blue shoulder indicative of a high inclination angle of the disk. This source is part of the MIRES catalogue and it is classified as a stage II YSO, dominated by optically thick circumstellar disks.
NGC 6357-33. This object shows weak CO bandheads in emission. The Br lines seem absent, but this could be due to veiling. The mYSO nature of this object is unclear. This source is not in the MIRES catalogue.

Spectral classification
The spectral classification of the 243 observed stars was carried out on the basis of their KMOS H and K band spectra, spanning the wavelength range 1.5−2.4 µm. By a first visual inspection of the objects with no spectral signatures of them being mYSOs, we checked for the absence or presence of CO absorption bands; in this way all stars were categorised as either early (OBA) or late (FGKM) type, respectively (or both in case of doubt). For a more precise classification two spectral libraries were employed: (1) 56 normalised, O3 to mid-B type, H and K band spectra from Hanson et al. (1996Hanson et al. ( , 2005 and Bik et al. (2005)  After initial separation in early and late types the spectral classification was done by two independent methods. On the one hand all spectra were classified by visual inspection. On the other hand, a computer code was developed to automatically classify spectra in each category (early or late), making use of the same spectral libraries as for the visual inspection. We now discuss each method separately and subsequently compare their results.

An automatic method to classify stars in the NIR
In the presentation of their template library Rayner et al. (2009) demonstrate that for a few spectral features (Ca ii, Na i, Al i and Mg i; their Fig. 35) a relation exists between equivalent width (EW) and spectral type. They suggest that these relations can be used for spectral classification. Based on this idea and in order to have a reproducible and efficient method at hand, we wrote a Python code to do an automatic spectral classification using H and K band spectra 3 . To the set of features from Rayner et al. (2009) we add features from literature, adjusting the definitions where necessary and including a few of our own. Though the spectral features are often centred on a line, they can also involve multiple lines, a molecular bandhead or only a part of a line, depending on the wavelength range and spectral type considered. The features and continuum regions that were used are listed in Table B.1 for the late-type spectra and in Table B.2 for the earlytype spectra.
The automatic method was tested on the spectra of the template libraries themselves. In each case the spectrum under examination was excluded from the fits mentioned in step two below. Taking the root-mean-square of the errors on the results we determined the mean error of the method for the template spectra to be two spectral subtypes for the late, and one spectral subtype for the early types.
The method treats the early-and late-type stars separately, but the basic steps of the procedure are the same for both cases, as follows.
First, the EWs, together with their variance σ 2 EW , of the defined spectral features are calculated for all template spectra. 3 The Python scripts are available upon request.
Following Cushing et al. (2005) they are given by: where f and f c are the observed feature and estimated continuum flux, σ and σ c are the errors on f and f c , respectively, and the summations are over the number of wavelength intervals ∆λ across the feature. The ∆λ intervals were chosen such that the apparent Doppler shifts in the observed spectra do not affect the EW measurements. We follow the procedure as described in the appendix of Sembach & Savage (1992, Eqs. (A1)-(A20)).
Wavelengths are converted to velocities with the centre of the feature at 0 km s −1 , and further normalised to the highest velocity so that all (velocity) values fall within the interval [−1,1]. The latter in order to simplify the error calculation (for more details see Sembach & Savage 1992); conversion back to Å is done at the end. We estimate f c by fitting a first degree polynomial through the defined continuum regions and σ (the error on the flux f ) is determined as the standard deviation of the observed values around these continuum fits, in the fitted regions. σ c is calculated from the covariance matrix produced by the fit, as detailed in Sembach & Savage (1992). Second, a relation between EW value and spectral type is established per luminosity class. As an example, in Fig. 9 we visualise this relation by plotting the EW values of all templates in the library for four different features for the late types and two features for the early types. The relation is quantified by fitting a 5th (for the late types) or 3rd (for the early types) degree . Relation between EW-value and spectral type on the basis of late type (four top panels) and early type (bottom two panels) templates for four features. The symbols and colours represent different luminosity classes: red triangles for dwarfs (V-IV), green circles for giants (III) and blue squares for supergiants (II-I).
A155, page 11 of 24 A&A 633, A155 (2020) polynomial p fit (SpT) through the EWs for each given luminosity class. Thus, for a certain EW value of a particular feature there is one or a few likely spectral types per luminosity class.
Third, the EWs are determined for a science spectrum. Taking into account the errors, each EW is mapped to a likelihood for spectral types, using the polynomial fits obtained in the previous step. In order to quantify how far the determined EW lies from the polynomial, the fitted polynomial p fit (SpT) is mapped to a Gaussian G(x; µ, σ) with µ = EW and σ = σ EW : The resulting likelihood distribution L(SpT) EW (one for each feature) peaks at the most likely spectral type(s) according to the EW for that feature. To avoid very sharp peaks a minimum σ of 0.2 Å is applied. To combine the information contained in each distribution and obtain one final likelihood distribution for each luminosity class we tried several approaches among which: (1) taking the logarithm of the L(SpT) EW before adding all distributions and (2) adding one to L(SpT) EW to avoid numerical zero's before multiplying all distributions. In both cases all EWs were weighted equally. Testing the results on the template spectra as described before, the lowest errors were obtained when applying method (2): A detailed explanation is provided in Appendix A. An example of a final distribution is shown in Fig. 10.
Finally, the method returns the value of the highest peak of the final distributions as the spectral type of the object and we assign an uncertainty of two spectral sub-types to the obtained value. The luminosity class is determined by the distribution with the highest peak.
For the late types, both EWs and ratios of EWs are taken into account. Steps two and three of the procedure are applied in the same way to the EW ratios. The resulting likelihood distributions are simply multiplied together with those from the EWs, so that one final distribution is obtained for each object: one that includes the information both from the EWs and the ratios. Ratios of which at least one of the EWs has a negative value are excluded in order to avoid negative ratios. Since no real new information is added, one could view the inclusion of ratios as a way of giving extra weight to certain lines or features, specifically to those that do not have negative EWs. We list the ratios used for the classification in Table B.3.

Spectral classification by eye
As an additional test on the method all science spectra were classified by eye. Each spectrum was individually overplotted with all available templates of the early or late type library. For each star the best matching type or (in the majority of cases) a range of types was selected, within the limitations of the quality of the KMOS spectrum and the completeness of the template spectra. Some of our stars were not classified in this way due to poor data quality, which was a result either of the ripple patterns caused by the data reduction procedure (see Sect. 3.4) or of poor S/N due to extinction, or both. The S/N in the H band is about 20 on average for the regions G333 and M 8; for NGC 6357 this is close to 30. In the K band a S/N of around 30 was reached in all regions, being somewhat lower for G333 and somewhat higher for NGC 6357. The spectra that were not classified by eye due to poor S/N typically had values less than ten in the H band. The different coloured lines are for different luminosity classes: red for dwarfs (IV,V), green for giants (III) and blue supergiants (I,II). The method returns the spectral type corresponding to the highest peak, in this case F5 I-II. We assign an uncertainty of two spectral sub types to this value for all the classified stars.
For the early types, hydrogen (e.g. H i 2.16 µm) and helium (e.g. He i 2.11 µm and He ii 2.19 µm) lines were the only distinguishing features. For the late types both hydrogen and atomic metallic lines (the most prominent being Mg i, Si i, Ca i, Al i, and Fe i) were used, as well as CO ro-vibrational bandheads for the lower temperature types (mostly detected in the K band).
For F-type stars the spectrum is dominated by the hydrogen lines but some metallic lines start to appear. The hydrogen lines significantly weaken at early G spectral types so that by G0 the spectra are dominated by metals. The ratio of Mg i/Brγ is a good temperature indicator for F-type stars. In order to distinguish the stars in luminosity class, the width of the hydrogen lines was considered: giants and supergiants have narrower lines than dwarf stars.
In the case of G-K stars the K band is best suited to do the spectral classification. The ratios between Brγ and Na i at 2.26 µm, and Brγ and Ca i around 2.20 µm can be used, as these metallic lines are more sensitive to the decrease in temperature than others (Gray & Corbally 2009). For later types, the CO bandheads both in the H and in the K-band were mostly used as temperature and luminosity class indicators.

Comparison of the automatic method with the classification by eye
Before we use the spectral classification to determine the age of the studied regions, we first compare the results of the two methods. Stars for which the two spectral types did not match within error bars we carefully reinspected and a final decision was made (see columns three and four of Table 4). For stars that could not be classified by eye due to poor data quality, we kept the results of the automated classification, unless we judged any  Table 4). The red triangles represent dwarfs (IV-V), and the green circles (super)giants (I-III) according to the automatic classification. Full symbols signify agreement in luminosity class with the corrected classification. The outliers in the bottom-right of the plot correspond to the early B-type stars that are misclassified as F-type by the automatic method.
result to be untrustworthy based on the spectral quality. We label our final decision as "corrected classification".
Figures 11-13 present the comparison of the corrected classification with the automatic method for each region. The comparison is shown in effective temperature rather than spectral type (see Sect. 8), to better reflect effects on the final results. There is an overall good agreement except for some B stars (that are misclassified as early F types by the automatic method) and some late M stars (that are misclassified as K types by the automatic method). These are outliers for different reasons: in case of the early types the miss-classification is due to a lack of late-B and A-type templates, requiring an interpolation from B3 to F0-F3 with only a few A0-A1 templates in between.
This interpolation helps to classify the late-A and early-F type stars correctly, but causes some B-types to be mistaken for F-types. In case of the late types it is the (relative) importance of the 13 CO feature that causes the confusion for M types: detecting this feature by eye assures one that the star must be M type, however, no special weight is assigned to this feature by the automatic method, which takes into account 23 features for the late types.
To quantify the accuracy of the automatic method further, we compare the number of stars it misclassifies with the number of stars misclassified by initial visual inspection (Sect. 7.2) taking the final corrected classification as reference. The results are shown in Table 2. We see that over the sample that we test, 74% is correctly classified by the automatic method and 73% by initial visual inspection.
We present the summary of the spectral classification for all the studied sources in Table 3. The tabulated spectral types can be  found in the CDS Table 4 and the classified spectra are presented at the CDS. Some examples of spectra are shown in Fig. 14.

Spectral classification in M 8
We observed four fields in M 8 which resulted in spectra for 81 stars. From the K-band spectral classification we confirm that 21 are OBA stars (16 OB and 5 A). We have 1 candidate YSO (see Sect. 6 for details about the mYSO classification) and 55 A155, page 13 of 24 A&A 633, A155 (2020)  late-type stars, identified by the presence of CO band-heads in absorption. We were not able to classify 4 stars due to severe ripples in their spectra or to insufficient signal to noise. The earliest type star in our sample is the source O10, which we classified as a O9-B1 type star. The full H and K-band spectra are displayed at the CDS. At the left side of the plots the name of the star is shown, at the right side we list the NIR-based spectral classification.
In the top section of Table 5 we present the spectral classification from this work and compare it with the literature classification (columns four and 5); our K-band classification is consistent with the previous classification by Wegner (2000) and Skiff (2009) for the overlapping sources. We also list the two candidate OB stars presented by Povich et al. (2017) and observed by us (second panel); we confirm that source 24 is effectively an OB star, while source 48 presents weak CO bandheads in absorption and no IR excess, and therefore it could be either a foreground low-mass PMS star or a background field giant. In the third section of Table 5 we list four new OB and four A stars classified by us that are not listed as massive stars in the literature. The last three columns of this table list the Kband extinction (A K ), effective temperature (T eff ), and luminosity (log L/L )); see Sect. 8. 7.5. Spectral classification in G333.6−0.2 7.5.1. Spectral classification We observed two fields in G333.6−0.2 and obtained spectra of 39 stars. From the K-band spectral classification we confirm that 8 are OBA stars (4 OB and 4 A), 1 candidate YSO and 27 late-type stars, identified by the presence of CO band-heads in absorption. We were not able to classify 3 spectra. The most massive star in our sample is 6829462 which is an O6 star. The candidate YSO (source 720) does not present CO band-head emission, Brγ seems to be in emission with a central nebular component, and the other lines of the Brackett series show a photospheric absorption line with a central emission component. The H and K-band spectra are available at the CDS. We present four new OB stars and six A stars in our observed sample. In Table 6 we list these stars together with their derived K-band extinction, effective temperature, and luminosity (see Sect. 8).
7.5.2. X-ray counterparts for sources in G333.6−0.2 Recently, Townsley et al. (2018) presented the second instalment of the Massive star-forming regions Omnibus X-ray Catalogue Notes. This table is available in its entirety at the CDS.
(MOXC2) where they list X-ray point sources in several star forming regions including G333.6−0.2. By cross-matching their catalogue with the coordinates of our sources we find that 22 out of our initial 39 sources have X-ray counterparts. Those sources are included with Chandra IDs in the second column of Table 4. Source 378 is a late-O, early-B type star, but it does not have an X-ray counterpart. We assume that this massive star is part of G333.6−0.2. Given that we are not certain of the cluster membership of the other (low-mass) stars with no X-ray counterparts, in the remainder of this paper we treat those sources, except 378, as non members of G333.6−0.2 and do not take them into account when estimating the age of the region (Sect. 8).

Spectral classification in NGC 6357
We observed six fields in NGC 6357 resulting in 123 spectra. From the IR spectral classification we confirm that 31 are OBA stars (22 OB and 9 A), we have 4 candidate YSOs and 81 late-type stars, identified by the presence of CO band-heads in absorption. We were not able to classify 7 stars from the obtained spectra. The H and K-band spectra are available at the CDS, in the left side of the plots, the name of the stars is shown, and in the right part we show the NIR-based spectral classification. In Table 7 we show the previously classified OB stars in NGC 6357 that overlap with our sample. The spectral classification obtained for these sources agrees with the previous studies. Out of the 26 new candidate OB stars presented by Povich et al. (2017) we observed 17 with KMOS. We confirm that most of these stars are OB stars with the exception of B3 which presents CO bandheads in absorption and is therefore classified as a K4-K5V type star, and B14 where the spectrum is more consistent with it being an F star. We were not able to classify B12 due to the very strong ripples affecting its spectrum nor B16 because the signal-to-noise is too low to recognise any spectral features. Finally, five newly classified OB stars and nine A stars are also listed in Table 7. As in Tables 5 and 6, the last three columns show the physical properties derived in Sect. 8.  7.7. Relation between photometry, astrometry, and spectral classification Here we discuss some individual objects whose position in the CMDs and CCDs  or their astrometry from Gaia DR2 data (Sect. 5) suggest that they have peculiar properties.
Source 46 in M 8 seems to be too bright for its spectral type, one possibility is that the X-ray an IR sources were incorrectly matched. If the sources are correctly matched this source could be a cool giant foreground star where the X-ray emission is originated by binary interaction. Sources 24 and 48 seem to be highly reddened in comparison to the rest of the population in the NIR CMD, but a strong excess is not apparent in the NIR nor MIR CMDs. Source 24 is also amongst the candidate non-members according to its Gaia DR2 astrometric properties, but the quality of the astrometry is insufficient to firmly exclude membership of M 8. From the photometric properties of sources F13 and F24 it does not appear obvious that these stars are not members of M 8, but we discard them as cluster members based on our results from the Gaia DR2 data. We consider these four sources not to belong to M 8 and exclude them from the extinction calculation and age determination.
In the NIR CCD of NGC 6357 source 33 is below the reddening line for R V = 3.1, and shows evidence for the presence of a circumstellar disk which indicates that it might be a massive YSO with a NIR excess (see Sect. 6). Sources 91 and 8 seem to be too bright for their spectral types, which is consistent with the astrometric solution asserting these stars to be non cluster members. We exclude these stars from the extinction and age determination.
In G333.6−0.2, the stars classified as M sources correspond to some of those which do not have X-ray counterparts and were selected only on the basis of their NIR colours. From the NIR diagrams we conclude that these stars are cool giants and are not part of the cluster. This shows the importance of the MYStIX project in order to identify cluster members. Stars 515756828027 and 515757115402 have X-ray counterparts but their position in the CMD and CCD seems inconsistent with their spectral type, this in agreement with the astrometric solution showing that they are not cluster members.
Our spectral classification agrees with the SED classification by Povich et al. (2013) in the MIR CCDs: the objects classified as OB type stars are located where the authors find the visible photospheres, and the sources that we identify as premain-sequence stars are likely YSOs. Objects 22, 88, and 112 in NGC 6357 are an exception because they are in the region dominated by YSOs, but do not show any gaseous disk signatures in the NIR diagram, nor in the H-and K-band spectra. This could be explained by a relatively large inner hole.
From the photometry, the spectral classification, and the X-ray counterparts (see also Sect. 4), it becomes clear that there are some stars that are not cluster members. These have been marked with grey symbols in the HRDs and have not been included in the extinction histograms nor taken into account for the age determination. Unfortunately, the quality of the astrometric fit for these stars is poor, and therefore we are not able to evaluate their membership with Gaia DR2 data. We do not discard the stars classified as giants because the fact that they have low surface gravities could mean that they are PMS swollen stars contracting towards the Zero age main-sequence (ZAMS).

Age of the giant H ii regions
After the spectroscopic classification, we can place our KMOS sources in the HRD and match them with MESA isochrones and evolutionary tracks obtained from the MIST project (Paxton et al. 2011(Paxton et al. , 2013(Paxton et al. , 2015Dotter 2016;Choi et al. 2016). The isochrones and tracks used have solar metallicity and initial rotational velocity v ini = 0.4v crit , where v crit is the critical velocity, reached when the gravitational acceleration is exactly compensated by the centrifugal force.
The effective temperature, intrinsic colours (V − K s) 0 , (J − H) 0 , and (H − K) 0 , and bolometric correction of the luminosity class V stars are taken from Pecaut & Mamajek (2013); for the luminosity class III stars we used the calibrations from Alonso et al. (1999) for spectral types F0-F9, and for G4-M5 stars we used the calibration of Cox (2000). With the effective temperature and bolometric correction in the V-band, BC V , we derive the absolute magnitude in the V-band. In order A155, page 15 of 24 A&A 633, A155 (2020)  References. to calculate the luminosity of the observed targets we used the absolute K-band magnitude assuming the Indebetouw et al. (2005) extinction law to derive A K , and calculated the absolute K-band magnitude by scaling it to the region's distance and correcting it by the extinction A K . We obtained the bolometric correction in the K-band, BC K , from BC V and the absolute V and K-band magnitudes (BC K = BC V + M V − M K ). The K-band extinction, temperature, and luminosity derived for the OB and A stars in M 8, G333.6−0.2, and NGC 6357 are listed in the last three columns of Tables 5-7

Extinction A K
To asses the sensitivity of our results to the choice of the extinction law, we also used the Nishiyama et al. (2009) (Fig. 15). Given this small difference, adopting the Indebetouw et al. (2005) extinction law seems secure. The extinction in M 8 peaks at around A K = 0.1 mag and has a tail extending to values of A K = 1.5 mag. The object with the highest extinction is F5; its position in the CMD is consistent with having such a high extinction. With an extinction A K = 2.5 mag the star F9 may be a non-member, thus we have excluded it from the histogram and the age classification. Although the trend is not very strong, the extinction seems to be higher in the western part of this region, close to the location of the massive multiple system Herschel 36.
In the case of G333.6−0.2 the A K values have an almost flat distribution ranging from 0.2 to 3.5; this is consistent with what is observed in the CMD and CCD. Most of the OB stars have an extinction ranging from 1 to 2 mag. The objects with the highest extinction are found at the centre of the cluster while the lowest extinction is measured for the four objects to the north-east of the cluster centre.
The mean value of A K for NGC 6357 is higher than for M 8, which is also evident from the CMDs. In this region we can distinguish two peaks, one around A K = 0.7 and a smaller one at about A K = 1.7. The highest extinction corresponds to F67 which is also one of the reddest objects in the CMD and CCDs. The stars in the north-west region in NGC 6357 are on average less extincted in the K-band than those in the south-east region, but other than that there is no obvious trend of extinction with location. To get a more complete picture of the studied regions, we add the previously known OB stars from the MPCM catalogue listed in Povich et al. (2017). The derived properties of these objects are listed in the last section of Tables 5 and 7 and shown as solid green and red dots corresponding to the giant and dwarf stars, respectively. We also plot 0.2 dex error circles to represent the scatter reported by Povich et al. (2017) around the ZAMS. The infrared spectral classification of the most massive stars (spectral types O3-O4) is normally degenerate (see e.g. Wu et al. 2014Wu et al. , 2016Bik et al. 2019) so it could affect the lower age limit. To avoid this, we determined the age based on the whole population (including the low-mass PMS-stars) instead of focusing only on the massive stars.

Age determination
In addition to the sources mentioned in Sect. 7.7 that were discarded as members of the regions, we marked the K-band excess sources (below the reddening line in the NIR CCDs) with empty ellipses filled with crosses. Their NIR excess emission could cause an overestimation of L bol and affect the age estimation, therefore we did not take them into account when assessing the age of the regions.
The main sequence stars in M 8 have masses between ∼5 and ∼70 M . The PMS stars will become ZAMS stars of ∼0.5−3 M . From comparing the observed population with MIST isochrones we conclude that the age of M 8 is between 1 and 3 Myr.
For G333.6−0.2 the main sequence population ranges in mass between ∼5 and ∼35 M . As in this region the membership of the stars is more uncertain, we had to focus on the massive stars only to estimate the age. For this reason, we only provide an upper limit. By comparing the few confirmed OB stars with X-ray counterparts with MIST isochrones we estimate an age < 3 Myr for this region.
The massive stars in NGC 6357 have main sequence masses between ∼10 and ∼100 M . From comparing the position of the stars with MESA isochrones, we conclude that the mean age of the PMS and main-sequence populations of NGC 6357 is probably somewhat younger than that of M 8. We determine the age of NGC 6357 to be 0.5−3 Myr.
We also tested the effect of the different extinction laws on the age determination. We conclude that the Nishiyama et al. (2009) (Povich et al. 2017). The luminosity class IV and V stars are shown in red, the stars with luminosity class III are shown in green. In grey we show the late-K and M-type stars. The empty ellipses with crosses show the objects located below the reddening line in the NIR CCD, which have been excluded from the age determination. The age of M 8 is estimated to be between 1 and 3 Myr.  Fig. 16 but for the classified stars in G333.6−0.2. We estimate an age < 3 Myr.

Discussion
In this section we discuss the relation between the results obtained from photometry and spectroscopy. We consider some of the uncertainties that could affect the age determination and the impact that would have on our results.

The effectiveness of the MYStIX selection
When selecting cluster members based on their NIR colours, typically half of the sources are discarded after spectral classification because they are background or foreground stars. For example, Wu et al. (2016) selected 44 candidate OB stars belonging to the cluster W49 to perform a spectroscopic follow up. They found that 22 of the observed spectra (50%) are dominated by CO in absorption, which led them to conclude that they are foreground or background stars. Gennaro et al. (2012) studied the CN15/16/17 molecular complex; they aimed at observing the massive stars in the complex and by classifying them spectroscopically, they constrained the distance and extinction towards the stars. The authors selected the spectroscopic sample based on the NIR colours (brightest stars in K and H − K s > 0.3). Their sample consists of 14 stars out of which five (35%) are field giants polluting the sample of massive stars. Similarly, Maaskant et al. (2011)  spectroscopically and after looking for highly reddened sources, IR excess in the IRAC bands, evidence of jets or outflows, and the presence of a radio counterpart, they identified 28 objects as cluster members (so 28% as foreground and background stars).
The importance of the MYStIX selection becomes very clear in this paper where two regions are included in the MPCM list (M 8 and NGC 6357) and one is not (G333.6−0.2). The MYStIX selection is especially important in the case of lowmass PMS stars. Given that they are more abundant than massive stars, when these stars are selected on the basis of their NIR colours only, the probability that these are foreground or background sources is high. Nevertheless, adding the X-ray information points towards these sources being PMS stars. In the case of our two MYStIX regions -M 8 and NGC 6357 -the fraction of stars whose membership is not confirmed based on their spectral type and astrometric properties is very low (<10%), while in G333.6−0.2 the fraction of stars with uncertain membership is around 30%.
In all HRDs a group of stars (grey symbols) is located at the right side of the 0.1 Myr isochrone near the birth line. These objects correspond to late-K and M-type stars and the most likely explanation for them to be at that location is that they are not members of the cluster, and so contaminate the MYStIX MPCM list.

Accuracy of the age determination
We constrained the age of each region in Sect. 8.2. Here we discuss some of the issues that might affect the age determination and compare our age estimates with those published in the literature. These issues are: (i) Binarity, which could affect the luminosity determination by making stars look brighter than they actually are. No information about the multiplicity properties of our sources is however available. (ii) The use of different evolutionary tracks, which could lead to different age estimates, e.g, Martins & Palacios (2013) show that the main-sequence age for different models could vary by ∼0.7 Myr. In our case, this uncertainty is likely smaller than that caused by the uncertainty on the spectral type. (iii) The adopted extinction law, which may impact the age determination. We tested the effect of two different extinction laws on the age determination. By using the Nishiyama et al. (2009) extinction law, we obtain lower A K values, and therefore slightly fainter objects. However, the implied difference in luminosity does not significantly affect the age determination. Prisinzano et al. (2005Prisinzano et al. ( , 2019 estimated the age of the PMS stars in M 8 between ∼0.8 and 2.5 Myr, in agreement with our estimate -including the massive stars -of 1−3 Myr. Massi et al. (2015) determined the age of NGC 6357 at 1−3 Myr based on the visual and NIR CMD and CCDs. By comparing the whole population of stars with MIST isochrones, we estimate an age of 0.5 to 3 Myr for this cluster. For G333.6−0.2 we could not find a previous age determination.
The age obtained for the studied regions by considering only the massive stars is consistent with the age obtained only from the low-mass PMS stars. If the late-K and M-type stars near the Hayashi track were cluster members, we would obtain a younger age (∼0.1−1 Myr younger) for the PMS stars compared to that of the massive stars only.

Ionisation rates
Binder & Povich (2018) did a multi-wavelength study of Galactic massive star-forming regions where they studied their energy budgets and star formation rates. They obtained the total infrared luminosity processed by dust from the radio free-free continuum and the global infrared SEDs (3.6−500 µm). In the cases of M 8 and NGC 6357 they used a similar sample of ionising stars as we do in this paper, but in the case of G333 they had to rely on an incomplete stellar population. They concluded that in the first two cases there are more than enough stars to explain both the observed ionisation and the bolometric luminosity. For the entire G333 complex (of which G333.6−0.2 is the brightest component), they calculated that around a dozen O5V stars are necessary to explain the total luminosity of the complex, and therefore concluded that the principal ionising stars in the complex remain undiscovered. Due to the ambiguity in the stellar population of G333 Binder & Povich (2018) assumed the presence of five O5V stars in G333.6−0.2 based on a photometric study by Fujiyoshi et al. (2005). We only classified three new OB stars in G333.6−0.2, all of which have spectral types later than O5V (hence produce less ionising photons). Therefore, we A155, page 20 of 24 M. C. Ramírez-Tannus et al.: The young stellar content of M 8, and NGC 6357 conclude that there are probably more OB stars to be discovered in this region.

Conclusions
We have presented a photometric and spectroscopic study of massive stars in three H ii regions and characterised their stellar populations in terms of age and mass ranges.
-M 8, G333.6−0.2 and NGC 6357 contain young stellar populations: we detect O and B stars, several mYSOs as well as a low-mass PMS stellar population. -We develop an automatic spectral classification method in the H-and K-band, that is accurate to within two spectral subtypes. We conclude that this method is at least as accurate as a classification by eye, while having the obvious merits of being fast and reproducible. database in order to further constrain the cluster membership of our sources based on their astrometric properties. For each of the clusters we found two sources that are not cluster members. F13 and F24 in M 8; 515756828027 and 515757115402 in G333.6−0.2; 91 and 8 in NGC 6357. -The sample selection for G333.6−0.2 was made based on the NIR-colours only. By classifying the stars spectroscopically and by using the information provided by Gaia (when available), we conclude that ∼30% of the stars observed in G333.6−0.2 are not cluster members, but luminous background stars. For the other regions we used the MYStIX selection ) in order to select cluster members; in these case the fraction of non-members is less than 10%. The latter demonstrates the importance and effectiveness of the MYStIX analysis when identifying cluster members. -Based on the presence of CO bandheads in emission and/or (double-peaked) emission lines in the spectra, we find one mYSO candidate in G333.6−0.2 (720), one in M 8 (32), and four in NGC 6357 (33, F57, 114, and F68). -We determined the K-band extinction of the observed stars by comparing their intrinsic NIR colours with the observed ones. In all regions A K varies from ∼0 to ∼3 mag. The mean extinction of NGC 6357 is higher than that of M 8. We do not find strong evidence for a correlation between the extinction value and the location of the stars in the clusters suggesting that the extinction varies on small spatial scales, that is, is patchy. -The observed stellar population of M 8 has an age of ∼1−3 Myr, and that of NGC 6357 is 0.5−3 Myr old. From the OB stars in G333.6−0.2 we estimate an age < 3 Myr. Given that this region did not profit from the MYStIX analysis, we only use the main sequence stars to determine the age. A more accurate determination is at present not possible given the uncertainty of the spectral classification. With present-day 8−10 m class ground-based telescopes we may resolve stellar populations in the Magellanic Clouds. With future extremely large telescopes, like the ELT and TMT, we are going to resolve young stellar populations in more distant galaxies. With the techniques used and introduced here we may charac-terise these regions in terms of content, mass, age, and multiplicity properties even in highly obscured regions. This will allow us to probe massive star formation in galaxies spanning a wide range of types, masses, and metallicities. A&A 633, A155 (2020) Notes. All wavelengths are in µm. Lines taken from Hanson et al. (2005).