EDP Sciences
Issue: A&A, Volume 598, February 2017
Article Number: A136
Number of page(s): 20
Section: Stellar structure and evolution
DOI: https://doi.org/10.1051/0004-6361/201628450
Published online: 15 February 2017

© ESO, 2017

1. Introduction

The new generation of Galactic plane surveys carried out in the last decade, from near-infrared (NIR) to millimeter wavelengths, has started to revolutionize our knowledge of star formation in the Milky Way. In the past, we have been limited to inferring properties of the star formation process from observations of good template regions or a selected sample of individual objects, but these surveys now allow us to explore the whole variety of star-forming environments in the Galactic plane. They are unbiased in spatial coverage within a certain range of coordinates, although still subject to observational limitations such as sensitivity, angular resolution, and (for short wavelengths) interstellar extinction.

Using the mid-infrared (MIR) data from the Galactic Legacy Infrared Mid-Plane Survey Extraordinaire (GLIMPSE), Robitaille et al. (2008) compiled a sample of almost 19 000 intrinsically red sources in the inner Galactic plane, thought to be mostly high- and intermediate-mass young stellar objects (YSOs) and asymptotic giant branch (AGB) stars. This population of YSOs was modeled by Robitaille & Whitney (2010) to estimate the star formation rate (SFR) of the Milky Way, directly using individual YSO counts in conjunction with a population synthesis model for the first time. On the other hand, recent (sub)mm continuum surveys (Bolocam Galactic Plane Survey, Aguirre et al. 2011; ATLASGAL, Schuller et al. 2009) have revealed thousands of potential star-forming cold dust clumps throughout the Galactic plane. By matching the detected (sub)mm sources with samples of ongoing star formation indicators, namely (massive) YSOs identified in the GLIMPSE or MSX MIR surveys, 24 μm point sources from the MIPSGAL survey, H II regions, and methanol masers, several studies have characterized the population of these cold clumps, proposing tentative evolutionary stages, identifying “starless” clump candidates, and investigating the physical properties that give rise to different modes of star formation in our Galaxy (e.g., Dunham et al. 2011; Tackenberg et al. 2012; Urquhart et al. 2014). More recently, the Herschel Hi-GAL survey (Molinari et al. 2010), covering wavelengths from 70 μm to 500 μm, which is the range where the cold dust emission peaks, has provided a rich dataset to study the properties of both the embedded YSOs (through the protostellar radiation reprocessed by the surrounding dust) and their envelopes, as well as younger prestellar cores (e.g., Veneziani et al. 2013; Elia et al. 2013).

None of these previous studies, however, have considered the potential multiplicity of a single object at a given wavelength when observed at a shorter wavelength providing higher angular resolution. The standard approach to multi-wavelength data is either a plain nearest-source matching to associate sources across different wavelengths or, in the case of (sub)mm clumps, a simple classification of the source based on the type (or absence) of higher resolution objects that fall within the covered area in the sky, but without explicitly taking the multiplicity of these objects into account. In order to make use of the full high-resolution information of multi-wavelength data, we are working on the development of a hierarchical YSO catalog that will associate multiple sources from a higher resolution survey with the corresponding (typically fewer) sources in a lower resolution survey. As a long-term plan, our idea is to apply this formalism to the whole set of Galactic plane surveys, from the NIR to the millimeter.

In this paper, we present an exploratory study that constitutes the first step toward the construction of the hierarchical YSO catalog. Instead of constructing a catalog from scratch and using all the Galactic surveys available, we make two simplifications here: first, we only use data from the GLIMPSE survey and the higher resolution NIR United Kingdom Infrared Deep Sky Survey (UKIDSS); and second, we start from the GLIMPSE individual objects already selected by Robitaille et al. (2008) and check whether these objects split into multiple sources when seen in UKIDSS. In particular, for each GLIMPSE object we investigate whether there are multiple UKIDSS sources that might all contribute to the GLIMPSE flux, or whether there is only one dominant UKIDSS counterpart. This approach is similar to the previous studies mentioned above in the sense that it is based on a sample of known objects selected from a given survey to investigate how they look in a higher resolution survey. To our knowledge, however, the present work is the first study that tries to associate single GLIMPSE objects with multiple UKIDSS sources, instead of just associating the nearest source, in a large portion of the Galactic plane. We have chosen the Robitaille et al. (2008) census as our starting point not only for simplicity and good coverage of the Galactic plane, but also because this sample is highly reliable. As a direct scientific application of this work, we can study the validity of the physical properties derived when assuming that the GLIMPSE YSOs in the Galactic plane are single objects. In particular, in this paper we investigate how clustering and unresolved binaries could affect the Robitaille et al. (2008) sample, and therefore the SFR estimate of Robitaille & Whitney (2010).

In Sect. 2, we briefly describe the GLIMPSE observations and the corresponding YSO catalog, as well as the data from the UKIDSS Galactic Plane Survey we used in this work. Section 3 gives details about the point spread function (PSF) fitting photometry we performed on the UKIDSS data and the quality control of this photometry. In Sect. 4 we report the main results of this study, in particular the implementation of a method to automatically identify dominant UKIDSS sources in the spectral energy distribution (SED), and the statistics for the GLIMPSE YSO sample regarding these UKIDSS sources; Appendices A and B describe more specific aspects of our method, while in Appendices C and D we give, respectively, further technical details about our statistics and extended results for special cases. In Sect. 5, we discuss the nature of possibly multiple UKIDSS sources, the sensitivity of our method to flux changes, and the presence of variable sources; we also present simple simulated YSO population models (based on the work by Robitaille & Whitney 2010) to assess the importance of clustering and unresolved binaries in the GLIMPSE YSO sample and to compare the simulations with the properties of the UKIDSS observations analyzed here; finally, we discuss the implications for SFR estimates. Sect. 6 summarizes the main conclusions of this paper.

2. Data sets

2.1. GLIMPSE YSO candidates

Fig. 1. Positions of UKIDSS GPS DR8 frame sets covering the GLIMPSE I/II observed area.

The GLIMPSE project (Benjamin et al. 2003; Churchwell et al. 2009) consists of a set of various MIR surveys of the Galactic plane carried out with the InfraRed Array Camera (IRAC, Fazio et al. 2004), at 3.6, 4.5, 5.8, and 8.0 μm and with an angular resolution (FWHM) of ~2′′, on board the Spitzer Space Telescope (Werner et al. 2004). Here we use the census of intrinsically red MIR sources from Robitaille et al. (2008, hereafter R08), which is specifically based on the GLIMPSE I and II surveys. These two surveys cover the (ℓ,b) ranges 5° < |ℓ| ≤ 65° and |b| ≤ 1°; 2° < |ℓ| ≤ 5° and |b| ≤ 1.5°; and |ℓ| ≤ 2° and |b| ≤ 2°, comprising a total of 274 square degrees.

The GLIMPSE I and II point source catalogs were used by R08 to compile a highly reliable census of 18 949 sources selected by their MIR red color, namely [4.5] − [8.0] ≥ 1, and additional brightness criteria that were carefully chosen to minimize the effects of position-dependent sensitivity and saturation: 13.89 ≥ [4.5] ≥ 6.50 and 9.52 ≥ [8.0] ≥ 4.01. R08 further filtered their selection by imposing a set of quality criteria (their Eq. (4)) to reduce the contamination by bad photometry and spurious detections. The R08 catalog provides improved coordinates and photometry at 4.5 and 8.0 μm (each source was visually examined to ensure that the improved photometry could be trusted), and magnitudes at 24 μm obtained from photometry performed on images of the MIPSGAL survey (Carey et al. 2009), together with the original GLIMPSE photometry at 3.6 and 5.8 μm, and JHKs magnitudes from the 2MASS survey (Skrutskie et al. 2006, included in the GLIMPSE point source catalog).
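The color and brightness cuts quoted above can be written down compactly. The following is only a minimal sketch: the class and field names are illustrative assumptions, and the additional quality criteria of R08 (their Eq. (4)) are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class IracSource:
    """IRAC magnitudes of one point source (field names are illustrative)."""
    mag_4p5: float  # [4.5] magnitude
    mag_8p0: float  # [8.0] magnitude


def is_intrinsically_red(src: IracSource) -> bool:
    """Apply the color and brightness cuts quoted in the text."""
    red_color = (src.mag_4p5 - src.mag_8p0) >= 1.0
    bright_4p5 = 6.50 <= src.mag_4p5 <= 13.89
    bright_8p0 = 4.01 <= src.mag_8p0 <= 9.52
    return red_color and bright_4p5 and bright_8p0


candidates = [
    IracSource(10.0, 8.5),  # red and within both brightness ranges
    IracSource(10.0, 9.4),  # color of 0.6 mag: not red enough
    IracSource(14.2, 9.0),  # too faint at 4.5 um
    IracSource(6.0, 4.5),   # too bright at 4.5 um
]
selected = [s for s in candidates if is_intrinsically_red(s)]
```

Only the first illustrative source passes all three conditions.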

The sources from the R08 sample consist mostly of YSOs (with an estimated percentage of 50–70%) and AGB stars (estimated percentage of 30–50%). The separation of the sources into YSOs and AGB stars was based on simple color and magnitude selection criteria (Eqs. (7)–(9) of R08), and was only approximate. However, supported by the angular distribution of both samples in the Galactic plane (uniform for AGB stars, and more clustered for YSOs), R08 claimed that this separation should be sufficient to estimate the relative number of AGB stars and YSOs in their sample.

One aspect of the GLIMPSE point source catalog and the photometry performed by R08, which is relevant for the analysis presented here, is the processing of close and/or blended sources. Although the GLIMPSE pipeline is based on the PSF fitting photometry program DAOPHOT, which in principle is able to extract fluxes of overlapping sources (see Sect. 3), the fact that the IRAC PSFs are undersampled (the camera has 1.2′′ pixels) makes the splitting of the flux between close sources very inaccurate. Therefore, all the detections within 2′′ in an individual IRAC frame were “lumped” into one source before the in-band merge (merging sources in the same band from different frames) and the cross-band merge (B. Babler, priv. comm.; see also the GLIMPSE I v2.0 data release document). Owing to the in-band merge and the migration of some positions during the cross-band merge, a few cases of sources closer than 2′′ are still present in the more complete GLIMPSE point source “archive” (these cases are filtered out in the “catalog”), but in general objects within 2′′ are not separated and are considered a unique source by the GLIMPSE photometry. R08 did remove some cases of blended sources from their sample, which produced unreliable measurements during their manual photometry; however, we noticed by visual inspection that most of those blended sources are separated by more than 2′′. In this paper, we thus assume that multiple objects within 2′′ are in general included and identified as only one source in the R08 sample.

Related to the above, one of the quality criteria imposed by R08 on the GLIMPSE catalog to select their sample was a close source flag equal to zero, which means that each source has to be free of neighbors between 2′′ and 3′′ from its position. Although this criterion could have removed some YSOs that were in very dense environments, eliminating this restriction would only increase the number of selected sources by ~10% with respect to the number of sources in the R08 sample, so that the effect of this particular selection is not significant.

2.2. The UKIDSS Galactic Plane Survey

In this work, we use data from UKIDSS (Lawrence et al. 2007), in particular the UKIDSS Galactic Plane Survey (GPS; Lucas et al. 2008). This survey was conducted with the Wide Field Camera (WFCAM; Casali et al. 2007) mounted on the United Kingdom Infrared Telescope (UKIRT), on Mauna Kea, Hawaii, providing images in the J (1.25 μm), H (1.63 μm), and K (2.20 μm) filters, with an angular resolution (FWHM) of ~0.9′′. Here we use the UKIDSS GPS products from Data Release 8 (DR8), which covers 975.5 square degrees (in all filters) of the northern and equatorial Galactic plane, including most of the area covered by the GLIMPSE I and II surveys for ℓ ≥ −2°. DR8 was the latest data release available when we carried out the work presented here. The GPS data from DR10 were released in January 2015 and will be used in future papers together with data from the VISTA Variables in the Vía Láctea survey (VVV; Minniti et al. 2010), which covers the southern Galactic plane.

Fig. 2. Comparison between the point source catalog generated by the WFCAM pipeline (based on aperture photometry) and the sources detected and measured in this work by performing PSF fitting photometry, for the R08 object SSTGLMC G048.873100.5091. Left: UKIDSS GPS H-band image, overlaid with the positions of the point sources from the UKIDSS catalog. Middle: the same image overlaid with the positions of the objects detected by our PSF fitting photometry. Right: residual image of the PSF fitting photometry, i.e., after subtraction of scaled PSFs at the positions of the detected sources.

The UKIDSS observations are reduced and calibrated at the Cambridge Astronomical Survey Unit (CASU), including the generation of a point source catalog for each individual co-added “Multiframe” (see below). These data are then organized in the WFCAM Science Archive into a SQL database (WSA; Hambly et al. 2008), where the catalog detections in the different filters are merged. A multiframe is the fundamental unit of the UKIDSS database, and consists of four ~13.65′ × 13.65′ spatially separated patches of the sky observed in one filter, inherited from the four array detectors of WFCAM. Each patch is called “MultiframeDetector” in the WSA nomenclature. In this paper, we directly use the final co-added images known as leavstack multiframes (see Appendix A.1 in Lucas et al. 2008, for more details), but for a given position we access them through the identification numbers assigned by the band-merging process in the WSA, which groups individual well-aligned frames observed in different filters into a frame set, i.e., a set of MultiframeDetectors (see Hambly et al. 2008). By doing this, we ensure access to the best set of UKIDSS images for a given position, and in cases of bright diffuse background emission, to the images that were already processed with a nebulosity-filtering algorithm (originally designed to improve the extracted catalogs; Irwin 2010).

Figure 1 shows the positions and coverage of all UKIDSS GPS DR8 frame sets overlapping the GLIMPSE I/II area, after excluding a few frames with poor astrometry. We downloaded all the multiframe images associated with these frame sets, and extracted a point source catalog per frame set by automatically querying the online UKIDSS catalog available at the WSA; these catalogs were only used for PSF star selection, as explained in Sect. 3.1. We then searched for the best frame set available for each object from the R08 sample and stored the associated identification number (frameSetID). Again, this was carried out by automatically querying the WSA UKIDSS catalog at each position within a 12″ radius, and by selecting the frameSetID associated with the nearest of the sources satisfying some quality criteria (following Appendix A3 of Lucas et al. 2008), namely: select only primary detections for duplicates in overlapping frames, exclude detections classified as noise, and select only sources without serious quality issues (ppErrBits<256 in all filters).
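The frame set selection described above amounts to a filtered nearest-neighbor search. A minimal sketch is given below; it operates on rows already retrieved from a catalog, and the attribute names (`is_primary`, `is_noise`, `pp_err_bits`, `frame_set_id`) are illustrative assumptions rather than actual WSA column names.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class CatalogRow:
    """One catalog detection; attribute names are illustrative only."""
    ra_deg: float
    dec_deg: float
    frame_set_id: int
    is_primary: bool                    # primary detection among duplicates
    is_noise: bool                      # classified as noise
    pp_err_bits: Tuple[int, int, int]   # error bits in (J, H, K)


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation between two sky positions, in arcsec."""
    dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    return 3600.0 * math.hypot(dra, dec1 - dec2)


def best_frame_set(ra, dec, rows: Sequence[CatalogRow],
                   radius_arcsec=12.0) -> Optional[int]:
    """Frame set of the nearest good-quality detection within the radius."""
    good = []
    for r in rows:
        if not r.is_primary or r.is_noise:
            continue
        if any(bits >= 256 for bits in r.pp_err_bits):
            continue
        sep = separation_arcsec(ra, dec, r.ra_deg, r.dec_deg)
        if sep <= radius_arcsec:
            good.append((sep, r.frame_set_id))
    return min(good)[1] if good else None
```

Returning `None` when no detection qualifies makes objects without usable UKIDSS coverage explicit in this sketch.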

In total, 8325 objects from the R08 sample, which represent 86% of all R08 objects with ℓ ≥ −2°, have available UKIDSS GPS DR8 images in the three JHK filters.

3. UKIDSS photometry

The UKIDSS point source catalog produced by the WFCAM pipeline is based on aperture photometry alone. Lucas et al. (2008) claim that overlapping apertures are properly treated in the pipeline, and that no significant differences in precision are found in test results comparing this type of aperture photometry with PSF fitting photometry. However, we do expect the latter to give an improved accuracy and completeness in the presence of blended stars or in very crowded regions, as has been already shown in previous studies using UKIDSS data of star clusters (e.g., Stead & Hoare 2011; Longmore et al. 2011). Since we are interested in the possible multiplicity of YSOs seen as individual sources in GLIMPSE, we performed PSF fitting photometry on the UKIDSS images to detect and properly separate the emission of multiple overlapping sources at the position of each R08 object. As an example, this higher completeness can be clearly noted in Fig. 2, which compares the detections found in the UKIDSS point source catalog with those in our PSF fitting photometry for a small field centered on one R08 source.

We used the standard PSF fitting routines of the DAOPHOT package (including the programs ALLSTAR and ALLFRAME; Stetson 1987, 1994), which were run in a completely automatic way through a custom Python wrapper. Here, we briefly describe some points that are particularly related to the UKIDSS photometry; more details regarding the general functioning of DAOPHOT are provided in the original papers or the DAOPHOT user’s manual. Before running these routines, the UKIDSS images were processed with the dribbling method, i.e., convolved with a simple matrix in order to redistribute the original 0.4′′ pixel flux into the new 0.2′′ pixel grid after the interleaving process.

3.1. Construction of the PSFs

We constructed a spatially varying PSF for each individual detector of every multiframe in our data set, using a selection of bright stars from the corresponding point source catalog. The selection was based on the quality criteria applied to generate the most reliable, low-completeness sample in Appendix A3 of Lucas et al. (2008), together with the stricter condition ppErrBits ≤ 16 in all filters (only the deblended error flag is allowed). We generated slightly different lists of PSF stars per filter by imposing individual magnitude cuts J > 13.25, H > 12.75, and K > 12, which are the conservative saturation limits recommended by Lucas et al. (2008). The PSF stars were also required to be free of brighter or saturated neighbors within a radius equal to fitting radius + psf radius (parameters in DAOPHOT; we use values of 4 and 20 pixels, respectively) to emulate the selection of PSF stars by the routine pick of DAOPHOT.

From this restricted very high-quality sample, we selected the 100 brightest stars (in each frame and each filter) to construct the PSF, which was allowed to vary quadratically across the field. In a very few cases (19 out of 2125 frame sets used, representing fields with severe source crowding close to the Galactic center), there are fewer than 100 stars satisfying all the quality criteria in the K filter; however, only two frame sets have fewer than 50 selected stars in K (namely, 29 and 36 stars), for which we construct a linearly varying PSF. The PSF was derived iteratively following a standard sequence of steps: construct a preliminary PSF, perform a PSF fitting photometry for all neighbor sources of the PSF stars, subtract the neighbors from the image using properly scaled PSFs, and derive an improved PSF from this clean image. Our script ran for five iterations, and the complexity of the PSF was gradually increased from a purely analytic constant function to an analytic function plus quadratically varying empirical corrections.
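The selection of PSF stars described in this subsection can be sketched as follows. This is an illustration under stated assumptions: the dictionary layout of each star is hypothetical, the catalog-level criteria of Lucas et al. (2008) are not reproduced, and a linear PSF is assumed whenever fewer than 50 stars survive.

```python
import math

SATURATION_LIMIT = {"J": 13.25, "H": 12.75, "K": 12.0}
ISOLATION_RADIUS_PIX = 4 + 20  # fitting radius + PSF radius, in pixels


def select_psf_stars(stars, band, n_max=100):
    """Return (chosen stars, PSF variation) for one detector and filter.

    Each star is a dict with keys "x", "y" (pixels), "mag" and
    "pp_err_bits" (both dicts keyed by filter); this layout is illustrative.
    """
    limit = SATURATION_LIMIT[band]

    def is_clean(s):
        unsaturated = s["mag"][band] > limit
        good_flags = all(bits <= 16 for bits in s["pp_err_bits"].values())
        return unsaturated and good_flags

    def has_brighter_neighbor(s):
        # Saturated stars are brighter than any star passing the magnitude
        # cut, so a single "brighter" test covers both cases.
        for other in stars:
            if other is s:
                continue
            dist = math.hypot(other["x"] - s["x"], other["y"] - s["y"])
            if dist <= ISOLATION_RADIUS_PIX and other["mag"][band] < s["mag"][band]:
                return True
        return False

    pool = [s for s in stars if is_clean(s) and not has_brighter_neighbor(s)]
    pool.sort(key=lambda s: s["mag"][band])  # brightest first
    chosen = pool[:n_max]
    variation = "quadratic" if len(chosen) >= 50 else "linear"
    return chosen, variation
```

The neighbor search above is quadratic in the number of catalog stars, which is adequate for a sketch; a spatial index would be a natural replacement for full detectors.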

We measured the size (FWHM) of the PSF evaluated at nine uniformly distributed positions in every individual detector. The sizes are very similar for all the UKIDSS bands with a mean of ~0.9″ and a dispersion of ~0.1″. For 96% of all the measurements in our data set, the sizes are within the range [0.7″, 1.1″].

3.2. Point spread function fitting photometry

We then proceeded to generate a list of point sources with measured JHK magnitudes for every R08 object by carrying out PSF photometry in a small 30″ × 30″ field centered on the object position. The UKIDSS images in the different bands were accessed through the previously stored frameSetID. Because we needed the full extent of the frame to properly apply the quadratic variations of the PSF, pixels outside the selected 30″ × 30″ field were simply masked, rather than cut out from the image. For every filter, a preliminary catalog was produced by running three iterations of a sequence of the DAOPHOT routines find, photometry, and ALLSTAR. In the first iteration, we used find to search the image for star-like objects exceeding the local background by at least 4σ, where σ is the noise of the image (see below); rough aperture magnitudes for all the detected objects were estimated with photometry; and we then calculated more precise positions and magnitudes through the simultaneous multi-PSF fitting performed by ALLSTAR. In the subsequent iterations, we ran find in the residual image produced by ALLSTAR (this time with a threshold of 5σ to account for the increased noise after the subtraction) to identify previously undetected stars that were blended with brighter companions; we estimated aperture magnitudes of the newly detected objects and appended the new star list to the output list of the previous ALLSTAR run; and we repeated the PSF fitting photometry of ALLSTAR with this extended list as input.
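The control flow of this iterative detection can be summarized in a few lines. The sketch below is schematic only: `find`, `aperture_phot`, and `allstar` are caller-supplied stand-ins for the DAOPHOT routines (they are not real Python bindings), and measuring the aperture magnitudes of new detections on the image in which they were found is an assumption.

```python
def iterative_detection(image, sigma, find, aperture_phot, allstar, n_iter=3):
    """Schematic find / photometry / ALLSTAR cycle.

    find(img, threshold)      -> list of new detections in img
    aperture_phot(img, stars) -> the same stars with rough magnitudes
    allstar(img, stars)       -> (fitted star list, residual image)
    All three callables are hypothetical wrappers supplied by the caller.
    """
    star_list = []
    search_image = image  # the first pass searches the original image
    for iteration in range(n_iter):
        # 4 sigma on the original image, 5 sigma on the residual images.
        threshold = (4.0 if iteration == 0 else 5.0) * sigma
        new_stars = aperture_phot(search_image, find(search_image, threshold))
        # Refit old and new stars together on the original image.
        star_list, residual = allstar(image, star_list + new_stars)
        search_image = residual  # later passes search the residual
    return star_list
```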

This typical usage of the DAOPHOT routines was slightly modified in our pipeline by the incorporation of two custom adjustments. First, the total noise originally used in DAOPHOT, estimated as Poisson noise plus readout noise, could be significantly underestimated in very crowded regions, where the source confusion noise is important; we verified that this was indeed the case in fields close to the Galactic center. Thus, we empirically derived the noise σ and mode (sky level) of the image via a two-step algorithm: iterative σ clipping for preliminary estimates, and iterative calculation of the mode through a kernel density estimator, and of the standard deviation by reflecting the pixel values below the mode. We then conservatively used the greatest value of σ among the theoretical and empirical estimates.
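The two-step noise estimate can be illustrated with a short, dependency-free sketch. Several details are assumptions not given in the text: the 3σ clipping level, the rule-of-thumb kernel bandwidth, the grid on which the density is evaluated, and the number of iterations.

```python
import math
import statistics


def sigma_clip(values, n_sigma=3.0, max_iter=20):
    """Iteratively clip outliers around the median; returns (median, std)."""
    data = list(values)
    for _ in range(max_iter):
        med = statistics.median(data)
        std = statistics.pstdev(data)
        kept = [v for v in data if abs(v - med) <= n_sigma * std]
        if len(kept) == len(data):
            break
        data = kept
    return statistics.median(data), statistics.pstdev(data)


def kde_mode(values, center, sigma, n_grid=81):
    """Mode of a Gaussian kernel density estimate on center +/- 2 sigma."""
    bandwidth = 1.06 * sigma * len(values) ** (-0.2)  # assumed rule of thumb
    grid = [center + sigma * (-2.0 + 4.0 * i / (n_grid - 1))
            for i in range(n_grid)]

    def density(x):
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return max(grid, key=density)


def empirical_noise(pixels, n_iter=3):
    """Return (mode, sigma) of the sky, robust against bright sources."""
    mode, sigma = sigma_clip(pixels)
    for _ in range(n_iter):
        window = [v for v in pixels if abs(v - mode) <= 5.0 * sigma]
        mode = kde_mode(window, mode, sigma)
        below = [v - mode for v in pixels if v <= mode]
        # Reflecting the values below the mode about the mode gives a
        # symmetric sample with zero mean, so sigma = sqrt(mean(d^2)).
        sigma = math.sqrt(sum(d * d for d in below) / len(below))
    return mode, sigma


def adopted_noise(pixels, theoretical_sigma):
    """Conservatively adopt the larger of the two noise estimates."""
    return max(theoretical_sigma, empirical_noise(pixels)[1])
```

Using only the pixels below the mode keeps the estimate insensitive to the bright tail produced by stars, which is the motivation given in the text for crowded fields.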

The second modification in the find-ALLSTAR cycle was filtering spurious sources after every ALLSTAR run, which are unavoidably detected in some cases as a result of very localized extended emission (which, because of its local nature, was not completely corrected by the nebulosity filter of the WFCAM pipeline) and/or imperfect PSF subtraction from the previous iteration. There are two quality parameters given as part of the ALLSTAR output, sharp (hereafter, represented by r0) and χ, which represent a first-order estimate of the intrinsic radius of the object (in pixels), and a dimensionless measure of the goodness of the PSF fitting, respectively. More details on this can be found in the DAOPHOT user’s manual and in Stetson et al. (2003). After doing visual checks in some critical fields, we found that a good filtering is achieved, without compromising the detection of blended objects in the new iterations, if we apply thresholds on r0 and on the flux error ΔF relative to the flux F. We used the criteria | r0 | ≤ 1.5 and ΔF/F ≤ 50% for all detections and ΔF/F ≤ 15% for the new detections found in the current iteration. We did not impose any limit in the other quality index χ because some sources can have a high χ because of the initially bad residual produced by the non-detection of nearby blended sources that are detected in subsequent iterations.
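The filtering criteria quoted above translate into a simple predicate. In this sketch the class and field names are illustrative, and the rejection of non-positive fluxes is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One ALLSTAR detection (field names are illustrative)."""
    flux: float
    flux_err: float
    sharp: float   # the index r0 in the text
    is_new: bool   # first detected in the current iteration


def keep_detection(det: Detection, max_sharp=1.5,
                   max_rel_err=0.50, max_rel_err_new=0.15) -> bool:
    """True if the detection passes the sharpness and flux-error cuts."""
    if det.flux <= 0.0:
        return False  # assumption: a non-positive flux is treated as spurious
    rel_err = det.flux_err / det.flux
    # New detections must satisfy the stricter 15% limit.
    limit = max_rel_err_new if det.is_new else max_rel_err
    return abs(det.sharp) <= max_sharp and rel_err <= limit
```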

After deriving an individual list of point sources for each UKIDSS filter, a preliminary band-merged list was obtained by cross matching the detections with daomaster. At the same time, this routine provides geometrical transformations relating the coordinate systems of the different frames, although in this particular data set the corresponding frames are reasonably well aligned and therefore these transformations only represent slight corrections. We calculated initial coordinate transformations as input for daomaster using the World Coordinate System (WCS) information from the image headers. A final band-merged list of detected UKIDSS point sources for every R08 object was produced by running the ALLFRAME program, which simultaneously performs multi-PSF fitting photometry on the three JHK images. We converted the data count fluxes to calibrated magnitudes using Eq. (C1) of Hodgkin et al. (2009) and the calibration parameters from the image headers.

3.3. Quality filtering and flags

Although we are only interested in the UKIDSS sources in the close vicinity of every R08 object, the PSF fitting photometry was performed in larger fields (30″ × 30″) to properly cross-match a significant number of detections in the different bands and to derive more precise coordinate transformations between the frames. We can use the final coordinates of the sources measured by ALLFRAME in the different filters and the WCS information in the headers to test the relative precision of the UKIDSS astrometry. We found that the positions in the different UKIDSS bands had relative differences larger than 1 pixel (0.2′′) for only 116 R08 objects, which were discarded from further analysis.

Hereafter, we consider all the detected UKIDSS sources within a radius of 2′′ from each R08 object, so that we only take sources that might contribute to the total flux detected in GLIMPSE into account (see Sect. 2.1). First, we checked for the presence of saturation in the images possibly affecting these sources. Saturated pixels were defined as having >min { 35 000,SATURATE } counts, where SATURATE is the average saturation level provided in the image header, and 35 000 is a conservative reference value given by Lucas et al. (2008). Objects from the R08 sample that have saturated pixels in at least one of the three UKIDSS JHK images within the 2′′-radius circle were removed from the final sample. On the other hand, cases of saturation occurring outside that circle but within one PSF radius from the perimeter were simply flagged as objects with peripheral saturation, in which the UKIDSS photometry of the potentially associated sources could have been affected by the subtraction of poorly fitted PSF wings of nearby saturated stars. We discarded these objects from the sample, except for very specific parts of the analysis in which we explicitly mention their inclusion. We found a total of 2218 R08 objects saturated in UKIDSS within 2′′, 2012 of which (91%) consistently have 2MASS magnitudes brighter than at least one of the conservative UKIDSS saturation limits recommended by Lucas et al. (2008) of J< 13.25, H< 12.75, or Ks< 12. Most of the remaining 9% of the objects have no 2MASS magnitudes available, so either they were undetected in 2MASS (because of variability or the much lower sensitivity of 2MASS with respect to UKIDSS, especially toward the Galactic center), or they have unreliable 2MASS fluxes which were then rejected by R08. The useful sample is then reduced to 5991 targets, which are divided into 636 objects with peripheral saturation and 5355 objects not affected by saturation at all.
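The saturation classification for a single image can be sketched as below, using the 0.2′′ pixel scale and the PSF radius of 20 pixels quoted in Sect. 3. The function name and the nested-list image layout are illustrative assumptions; in practice the test would be repeated for each of the three JHK images, and an object would be removed if any of them returned "saturated".

```python
import math

PIXEL_SCALE = 0.2                      # arcsec per pixel after interleaving
MATCH_RADIUS_PIX = 2.0 / PIXEL_SCALE   # the 2 arcsec circle, in pixels
PSF_RADIUS_PIX = 20.0                  # DAOPHOT PSF radius used in Sect. 3.1


def saturation_flag(image, x0, y0, saturate_header):
    """Classify one image as 'saturated', 'peripheral', or 'clean'.

    image is a 2D list of counts indexed as image[y][x]; (x0, y0) is the
    pixel position of the R08 object.
    """
    threshold = min(35000.0, saturate_header)
    flag = "clean"
    for y, row in enumerate(image):
        for x, counts in enumerate(row):
            if counts <= threshold:
                continue
            r = math.hypot(x - x0, y - y0)
            if r <= MATCH_RADIUS_PIX:
                return "saturated"
            if r <= MATCH_RADIUS_PIX + PSF_RADIUS_PIX:
                flag = "peripheral"
    return flag
```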

Fig. 3. DAOPHOT quality indices r0 and χ as a function of the magnitude for all sources detected within 2′′ of non-saturated R08 objects, in each UKIDSS band. The different lines show the thresholds defining the bad-quality flag in a particular filter: | r0 | = 2 in the top panels, and the curve χlim(m) in the bottom panels (derived empirically as described in the text). A source was considered as having bad-quality photometry in a given band if | r0 | > 2 or χ > χlim(m).

For the 14 291 detected UKIDSS sources within 2′′ of these 5991 R08 objects, we computed individual 3σ detection limits. Output fluxes below these limits can exist in ALLFRAME owing to the simultaneous PSF fitting in all bands, but in those cases we used the computed detection limits instead (i.e., lower limit magnitudes). For a given source, we converted a 3σ peak in counts to a calibrated magnitude by using the PSF at that position and the same calibration process applied to the source. The UKIDSS sensitivity varies considerably in our data set because of the highly irregular confusion level throughout the inner Galactic plane (see also Sect. 3.1 of Lucas et al. 2008), reaching in some uncrowded regions 3σ detection limits of ~22, 21, and 20 in the J, H, and K bands, respectively. However, conservative upper limit sensitivities (i.e., lower limit 3σ magnitudes) can be set at J ≃ 19, H ≃ 18, and K ≃ 17, for most of the sources in our sample.
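One way to picture the conversion from a 3σ peak to a limiting magnitude is sketched below. It rests on simplifying assumptions not stated in the text: the PSF is normalized to unit total flux, so that `psf_peak` is the fraction of the flux falling in the central pixel, and the whole calibration is lumped into a single hypothetical `zero_point`.

```python
import math


def limiting_magnitude(sigma_counts, psf_peak, zero_point, n_sigma=3.0):
    """Magnitude of a point source whose PSF peak equals n_sigma * sigma."""
    flux_limit = n_sigma * sigma_counts / psf_peak
    return zero_point - 2.5 * math.log10(flux_limit)


def reported_magnitude(flux_counts, sigma_counts, psf_peak, zero_point):
    """Return (magnitude, is_lower_limit) for one source and band."""
    m_lim = limiting_magnitude(sigma_counts, psf_peak, zero_point)
    if flux_counts <= 0.0:
        return m_lim, True
    mag = zero_point - 2.5 * math.log10(flux_counts)
    if mag > m_lim:
        # Fainter than the detection limit: report the limit instead.
        return m_lim, True
    return mag, False
```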

We also defined a bad-quality flag for this whole sample of UKIDSS sources, using the DAOPHOT quality indices r0 and χ introduced in Sect. 3.2, which are also part of the ALLFRAME output for the different bands. This step is independent of the quality filtering already carried out after every ALLSTAR run, which was intended to generate a reliable star list as input for ALLFRAME; the final photometry and quality indices were then recalculated by ALLFRAME. In Fig. 3, we plot these quality parameters as a function of the magnitude for all sources with magnitudes brighter than the individual 3σ detection limits previously computed, corresponding to a total of 10 148, 13 061, and 13 835 sources in the J, H, and K bands, respectively. The values of χ do not converge to unity for faint magnitudes as expected, but to a value closer to ~3/8, representing approximately the noise reduction after the dribbling process (see Sect. 3). This occurs because χ is scaled with respect to the theoretical noise estimated by DAOPHOT, which is equivalent (for regions without source confusion) to the noise of the images with the original 0.4′′ pixel flux.
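The factor of ~3/8 can be understood with a short calculation. The text does not give the dribbling matrix, so the 3×3 kernel below is an assumption, chosen as one flux-conserving kernel that reproduces the quoted value. For uncorrelated pixel noise, convolution with weights w scales the standard deviation by the square root of the sum of w².

```python
import math

# Assumed 3x3 dribbling kernel (not given in the text); weights sum to 1.
KERNEL = [
    [1 / 16, 1 / 8, 1 / 16],
    [1 / 8, 1 / 4, 1 / 8],
    [1 / 16, 1 / 8, 1 / 16],
]

# Flux conservation: the weights add up to unity.
total_weight = sum(w for row in KERNEL for w in row)

# Noise scaling for uncorrelated input noise: sqrt(sum of squared weights).
noise_factor = math.sqrt(sum(w * w for row in KERNEL for w in row))
```

With these weights the sum of squares is 9/64, so the noise factor is exactly 3/8.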

A source was considered as having bad-quality photometry in a particular filter if | r0 | > 2 or χ > χlim(m), where χlim(m) is an empirically derived threshold that depends on the magnitude m, following Stetson et al. (2003). We note that all the remaining sources satisfy ΔF/F ≤ 50%, so that it was not necessary to explicitly impose this condition in this particular sample. We slightly increased the tolerance in | r0 | with respect to the ALLSTAR quality filtering because this time we were only defining flags for sources already validated as non-spurious. To derive the threshold χlim(m), we did not use an analytical expression adjusted by eye on the m-χ plot as in Stetson et al. (2003), given that our plot does not exhibit a trivial dependency, especially in the K band. We instead computed χlim(m) as the limit below which we retain the best 90% of the sources in every filter. Specifically, we implemented the following steps: construction of a surface density m-χ plot using a kernel density estimator; for a grid of 100 values of m, calculation of the value χ0.9 at which the cumulative distribution function is 0.9; and finally a fit of a fourth-degree polynomial to the resulting (m,χ0.9) pairs. The bad-quality thresholds in r0 and χ are also shown in Fig. 3 for each band.
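The three steps used to derive χlim(m) can be sketched with NumPy as follows. The product of Gaussian kernels, the rule-of-thumb bandwidths, and the size of the χ grid are assumptions, since the text does not specify the kernel density estimator in detail.

```python
import numpy as np


def chi_limit_polynomial(mag, chi, quantile=0.9, n_grid=100, degree=4):
    """Fit chi_lim(m) as the conditional `quantile` of chi at magnitude m."""
    mag = np.asarray(mag, dtype=float)
    chi = np.asarray(chi, dtype=float)
    n = mag.size
    # Rule-of-thumb bandwidths (an assumption; not quoted in the text).
    h_m = 1.06 * mag.std() * n ** (-0.2)
    h_c = 1.06 * chi.std() * n ** (-0.2)

    m_grid = np.linspace(mag.min(), mag.max(), n_grid)
    c_grid = np.linspace(chi.min() - 3 * h_c, chi.max() + 3 * h_c, 400)

    # Step 1: product-kernel density estimate on the (m, chi) grid.
    w_m = np.exp(-0.5 * ((m_grid[:, None] - mag[None, :]) / h_m) ** 2)
    w_c = np.exp(-0.5 * ((chi[:, None] - c_grid[None, :]) / h_c) ** 2)
    density = w_m @ w_c  # shape (n_grid, 400)

    # Step 2: cumulative distribution in chi at each grid magnitude.
    cdf = np.cumsum(density, axis=1)
    cdf /= cdf[:, -1:]
    chi_q = np.array([np.interp(quantile, row, c_grid) for row in cdf])

    # Step 3: polynomial fit to the (m, chi_q) pairs.
    return np.polynomial.Polynomial.fit(m_grid, chi_q, degree)


def bad_quality(mag, chi, sharp, chi_lim):
    """Boolean flag per source: |r0| > 2 or chi above chi_lim(m)."""
    mag = np.asarray(mag, dtype=float)
    return (np.abs(sharp) > 2.0) | (np.asarray(chi) > chi_lim(mag))
```

`Polynomial.fit` rescales the magnitudes internally, which keeps the fourth-degree fit well conditioned over a range of several magnitudes.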

Applying the above criteria, we flagged 1296, 1537, and 1666 UKIDSS sources in the J, H, and K bands, respectively. These typically correspond to different sources in each filter, and only 321 sources were flagged in all three bands.

4. Results

4.1. SED exploration

Fig. 4. Illustration of the usefulness of the SEDs comparison to identify dominant UKIDSS sources. This example shows the R08 object SSTGLMC G048.961000.3963 and the three detected UKIDSS sources within a radius of 2′′, which is indicated by the dashed-line circle. Left: UKIDSS JHK three-color image overlaid with the positions of the UKIDSS sources as open circles (green for the nearest in angular separation, and magenta for the others). Middle: GLIMPSE 3.6, 4.5, and 8.0 μm three-color image of the same field. Right: SED of the R08 object at 3.6, 4.5, 5.8, 8.0, and 24 μm wavelengths (blue points), plotted together with the SEDs of the three UKIDSS sources in the J, H, and K bands (green points for the nearest source and red points for the others).

Detecting the UKIDSS sources within 2′′ of each of the R08 objects is not enough to address the possible multiplicity of these objects, since some of the identified sources could be unrelated foreground or background stars along the same line of sight. Ideally, one would have to determine whether these UKIDSS sources are YSO candidates by themselves. Because YSOs are hard to distinguish from field stars via NIR colors alone (see, e.g., Sect. 4.1 of Lucas et al. 2008), an optimal method would be to perform PSF fitting photometry on the GLIMPSE images using the positions of the UKIDSS sources as input. In this way, we would obtain the contribution of these sources to the flux at the different GLIMPSE wavelengths, so that we would have access to their MIR colors (or colors combining NIR and MIR bands). This is beyond the scope of this particular paper, but analogous approaches will be addressed in the context of the hierarchical YSO catalog we plan to develop in the future (see Sect. 1).

In this work, we focus on a simpler, but still useful, approach, which consists of identifying only those UKIDSS sources that make a significant or dominant contribution to the observed flux in the GLIMPSE bands. We noticed that comparing the NIR SEDs of the UKIDSS sources with the MIR SED of the corresponding R08 object was remarkably helpful to find the dominant sources. As an example, Fig. 4 shows UKIDSS and GLIMPSE three-color images of one particular R08 object, together with its MIR SED (3.6–24 μm) and the NIR SEDs (JHK) of the three detected UKIDSS sources. By examining the UKIDSS image and the SED comparison plot, it is clear that the nearest (in projected angular distance) UKIDSS source matches the MIR SED nicely and is probably the main contributor to the GLIMPSE fluxes, whereas the other two are a much fainter reddened source that does not match the MIR SED well, and an evident foreground star with a flat NIR slope. In Sect. 4.2, we describe the methodology we implemented to automatically recognize the UKIDSS sources that have a good match with the MIR SED and are therefore the dominant counterparts of the objects from the R08 sample.

4.2. Quantifying the SED match

By inspecting several SED comparison plots similar to the one shown in the right panel of Fig. 4, we realized that the dominant UKIDSS counterparts are typically characterized by a smooth transition between their NIR SED and the MIR SED of the corresponding R08 objects, which does not occur at all for much fainter reddened sources or foreground/background stars. To evaluate whether or not we can trace a smooth curve crossing the NIR and MIR points of the combined SED, particularly through the NIR-MIR interface, we computed the cubic spline representation of the SED constructed for each UKIDSS source (and the associated R08 object). We then quantified how similar this spline was to a simple quadratic function fitted over the four middle points of the SED defining the NIR-MIR transition (typically the H, K, 3.6 μm, and 4.5 μm filters) by computing the mean ratio of both curves. Given the high dynamic range of the SEDs, all calculations were carried out in (log10(λ), log10(Fν)) space, where λ is the wavelength and Fν is the flux density; we adopted the zero-point fluxes given by Hewett et al. (2006) for the UKIDSS bands, and by R08 for the Spitzer bands. The mean spline to quadratic function ratio ⟨R⟩ was then defined as

\langle R \rangle = \frac{1}{N} \sum_{i=1}^{N} 10^{\,|\mathrm{spline}_i - \mathrm{quad}_i|}, \qquad (1)

where spline_i and quad_i are the spline representation and quadratic function, respectively, evaluated on a grid of N = 1000 values covering the wavelength range defined by the four middle points, all in (log10(λ), log10(Fν)) space. By using the absolute difference, we make sure that a perfect agreement translates into ⟨R⟩ = 1, and that any deviation of one curve with respect to the other in any direction would produce ⟨R⟩ > 1.
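As a minimal sketch of how such a mean ratio can be evaluated, the function below averages 10^|spline − quad| over a grid in log10(λ). This functional form is an assumption inferred from the description (curves in logarithmic space, absolute difference, value 1 for perfect agreement). The spline and the quadratic fit are passed in as callables; the function name, the toy curves, and the grid limits are hypothetical.

```python
def mean_ratio(spline, quad, x_lo, x_hi, n=1000):
    """Mean of 10**|spline(x) - quad(x)| on n equally spaced points.

    x is log10(wavelength); both callables return log10(flux density).
    """
    step = (x_hi - x_lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = x_lo + i * step
        total += 10.0 ** abs(spline(x) - quad(x))
    return total / n

# Toy curves in log-log space: a parabola and copies shifted by 0.1 dex.
def toy_quad(x):
    return -2.0 * (x - 0.8) ** 2

def toy_spline_above(x):
    return toy_quad(x) + 0.1

def toy_spline_below(x):
    return toy_quad(x) - 0.1

# Approximate log10 of 1.6 and 4.5 micron (H band to 4.5 micron band).
X_LO, X_HI = 0.21, 0.65
r_equal = mean_ratio(toy_quad, toy_quad, X_LO, X_HI)
r_above = mean_ratio(toy_spline_above, toy_quad, X_LO, X_HI)
r_below = mean_ratio(toy_spline_below, toy_quad, X_LO, X_HI)
```

Identical curves give exactly 1, and a constant offset of 0.1 dex gives 10^0.1 ≈ 1.26 regardless of its sign, which is the symmetry that the absolute difference is meant to provide.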

Figure 5 illustrates this method for the same R08 object previously shown in Fig. 4. Now we plot the combined NIR-MIR SED for each detected UKIDSS source individually together with its spline representation and the quadratic fit described above; we also indicate the value of ⟨R⟩ for each source. It is clear from this example that the agreement or disagreement of both curves represents a good indicator of the match or mismatch between the NIR and MIR parts of the SED, and that this is well quantified by the mean ratio ⟨R⟩ defined in Eq. (1).

Fig. 5

Example of the method implemented in this work to evaluate the smoothness of the NIR-MIR transition of the SED constructed for every UKIDSS source and the corresponding R08 object. For each one of the three UKIDSS sources detected for the R08 object SSTGLMC G048.9610−00.3963 (the same shown in Fig. 4), we plot the combined NIR-MIR SED overlaid with its cubic spline representation (yellow line) and the quadratic function fitted over the four middle points (H, K, 3.6 μm, and 4.5 μm filters; gray line). The value for the mean ratio of both curves as defined in Eq. (1) is indicated at the lower left corner of each panel. Colors for the SED points are as in the right panel of Fig. 4. In this particular example, the J upper limits have been simply ignored by the method (see the text for details on how different cases like this are treated).

When the flux at a given NIR wavelength is flagged as a bad-quality measurement or represents an upper limit, but the other two fluxes unambiguously define the SED (mis)match, the missing flux can be simply ignored by the method, as is the case for the J upper limits in the example shown in Fig. 5. Detailed decision rules for all the possible combinations of flagged fluxes and upper limits are described in Appendix A. In summary, we did not run the SED matching method for UKIDSS sources satisfying one or more of the following conditions: flagged K-band flux, number of flagged fluxes >1, or the total number of flagged fluxes and upper limits = 3. However, in some cases the associated R08 object was still included in the statistics presented in Sects. 4.3 and 4.4, especially if the rejected UKIDSS source had a sufficiently low K-band flux. Out of the 14 291 detected UKIDSS sources within 2′′ of non-saturated R08 objects (recall Sect. 3.3), 12 257 were considered for the SED matching procedure (the remaining 2034 were rejected), of which 10 696 are unambiguous sources (category G as defined in Appendix A), i.e., they are detected in all the UKIDSS bands or present an upper limit or bad-quality flux that does not compromise the unambiguity of the SED (mis)match.
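The three rejection conditions summarized above can be written as a short predicate. This is a minimal sketch of the summary only, not of the full decision rules in Appendix A; the function name and the representation of flags as sets of band names are hypothetical.

```python
def can_run_sed_match(flagged, upper_limits):
    """Return True if the SED matching method is run for a UKIDSS source.

    flagged: set of bands ("J", "H", "K") with bad-quality photometry.
    upper_limits: set of bands in which the flux is only an upper limit.
    """
    if "K" in flagged:                            # flagged K-band flux
        return False
    if len(flagged) > 1:                          # more than one flagged flux
        return False
    if len(flagged) + len(upper_limits) == 3:     # no usable NIR flux left
        return False
    return True
```

For example, a source with only a J-band upper limit (the case shown in Fig. 5) is kept, whereas a source with a flagged K-band flux is rejected.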

From all the R08 objects associated with this list of unambiguous UKIDSS sources, we randomly selected a control sample of 75 objects, which uniformly cover the GLIMPSE observed area for Galactic longitudes ℓ ≥ −2°, and are associated with 157 UKIDSS sources. We inspected the combined SEDs of all the objects in this control sample (similar plots as in Fig. 5), and visually decided whether or not the NIR and MIR parts of the SED are well matched. We found that a very useful parameter combination in which the sources well matched by eye were clearly separated from the poorly matched sources was the 2D space defined by ⟨R⟩ and the angular separation between the UKIDSS source and its corresponding R08 object, hereafter denoted by Δθ. Out of the 73 UKIDSS sources that were visually evaluated as having a good match with the MIR SED, 70 are concentrated in a confined area defined by the conditions

⟨R⟩ ≤ 1.3 and Δθ ≤ 0.57″, (2)

which represent, for the rest of this paper, our quantitative criteria to distinguish the good SED matches from the bad ones. To estimate the possible contamination of using these criteria, we selected an independent random sample of 220 uniformly distributed R08 objects, associated with a total of 471 UKIDSS sources that define what we call the validation sample. This sample was examined through the same visual inspection process we carried out for the control sample to compare the visual criterion with the quantitative criteria defined by Eq. (2). The left panel of Fig. 6 shows the location of all unambiguous UKIDSS sources in the ⟨R⟩ versus Δθ plot, while the right panel shows the same diagram for the sources from the validation sample, which are color-coded according to their evaluation after the visual inspection. We found that 199 sources from the validation sample fell within the area delineated by Eq. (2), of which only 6 were visually considered bad matches, implying a misclassification ratio of 3%. On the other hand, 11 sources visually considered good matches are outside that area, but notably they are all in the ⟨R⟩ ≤ 1.3 region, where a total of 47 sources from the validation sample are found (with Δθ > 0.57″). This translates into no contamination for bad matches defined by ⟨R⟩ > 1.3, and 23% contamination for bad matches defined by Δθ > 0.57″ but having ⟨R⟩ ≤ 1.3. These contamination ratios are used in Sects. 4.3 and 4.4 to properly estimate the uncertainties on the statistics presented there.
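The matching criteria and the contamination ratios quoted above reduce to a few lines of arithmetic. The sketch below uses the thresholds implied by the text (mean ratio of at most 1.3 and Δθ of at most 0.57″) and the validation-sample counts; the function and variable names are hypothetical.

```python
R_MAX = 1.3          # upper limit on the mean spline-to-quadratic ratio
DTHETA_MAX = 0.57    # upper limit on the angular separation, in arcsec

def is_good_match(mean_ratio, dtheta_arcsec):
    """Quantitative criteria for a good SED match."""
    return mean_ratio <= R_MAX and dtheta_arcsec <= DTHETA_MAX

# Validation-sample counts quoted in the text.
false_good_rate = 6 / 199    # bad matches by eye inside the selected area
false_bad_rate = 11 / 47     # good matches by eye with small ratio but large separation
```

The two rates evaluate to about 3.0% and 23.4%, consistent with the rounded values of 3% and 23%.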

Fig. 6

Mean spline to quadratic function ratio ⟨R⟩ (as defined in Eq. (1)) vs. the angular distance Δθ from the associated R08 object, for all unambiguous UKIDSS sources detected (left panel), and for the sources from the validation sample (right panel), which are color-coded according to their visual evaluation of good (pale sky blue) or bad (red) match with the MIR SED. The limits on ⟨R⟩ and Δθ defining the quantitative criteria of Eq. (2) are indicated as dashed lines on both panels.

There is another very specific cause of contamination for sources satisfying Eq. (2) that was not revealed by the validation sample, and was treated independently. This consists of sources for which the derived spline and the fitted quadratic function are in good agreement, but the combined SED does not follow the expected concave shape of a reddened source in the NIR, and therefore the quadratic function is convex. With the exception of very peculiar cases, these sources were considered to have an uncertain match with the MIR SED. In Appendix B, we give more details on how we processed these sources.

4.3. Statistics of SED match

Using the SED matching criteria defined in Eq. (2), we can now classify the R08 objects in a way that helps us to address the potential multiplicity of these objects when seen in UKIDSS, in particular the question of whether there is only one dominant contributor to the GLIMPSE fluxes (Sect. 4.4). According to the number of detected UKIDSS sources within 2′′, how many of these sources match the MIR SED, and whether or not one of them dominates the flux in the K band (in cases when the SED (mis)match is not enough), each R08 object can be in one of the following categories:

  • U0: No UKIDSS source detected within 2′′.

  • U1_S1: Only one UKIDSS source detected, which matches the MIR SED.

  • U1_S0: Only one UKIDSS source detected, which does not match the MIR SED.

  • UM_S1: Multiple (more than one) UKIDSS sources detected, but only one matching the MIR SED.

  • UM_SM: Multiple UKIDSS sources detected, more than one matching the MIR SED, and with K-band fluxes within a factor 10.

  • UM_SM_K10: Multiple UKIDSS sources detected, more than one matching the MIR SED, but among these sources there is one source that is brighter at K than the others by a factor >10.

  • UM_S0: Multiple UKIDSS sources detected, none matching the MIR SED, and all having K-band fluxes within a factor 10.

  • UM_S0_K10: Multiple UKIDSS sources detected, none matching the MIR SED, but one source is the brightest at K by a factor >10.
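The categories above can be expressed as a small decision function. This is a sketch of the direct classification only (no treatment of flagged fluxes, upper limits, or low S/N); the function names and the input format, a list of (matches SED, K-band flux) pairs for the UKIDSS sources within 2′′, are hypothetical.

```python
def dominates(fluxes, factor=10.0):
    """True if the brightest flux exceeds every other flux by more than `factor`."""
    ordered = sorted(fluxes, reverse=True)
    return len(ordered) > 1 and ordered[0] > factor * ordered[1]

def classify(sources):
    """Assign an R08 object to one of the eight categories.

    sources: list of (matches_sed, k_flux) tuples, one per UKIDSS source.
    """
    if not sources:
        return "U0"
    matched = [k for matches, k in sources if matches]
    if len(sources) == 1:
        return "U1_S1" if matched else "U1_S0"
    if len(matched) == 1:
        return "UM_S1"
    if len(matched) > 1:
        return "UM_SM_K10" if dominates(matched) else "UM_SM"
    all_fluxes = [k for _, k in sources]
    return "UM_S0_K10" if dominates(all_fluxes) else "UM_S0"
```

For instance, two matching sources with K-band fluxes of 50 and 20 (in arbitrary units) fall in UM_SM, whereas fluxes of 500 and 20 fall in UM_SM_K10.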

Instead of directly counting the number of R08 objects that fall in each category, we ran 10^5 Monte Carlo (MC) simulations to account for the presence of contaminants in the SED matching criteria of Eq. (2), which allowed us to estimate the uncertainties on this classification. In Appendix C.1, we explain the details of this procedure.
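One possible way to propagate the contamination ratios into the classification is sketched below. It is an assumption about the general idea, not the procedure of Appendix C.1: in every simulation, the match status of each source is flipped with the probability measured in the validation sample for its region of the plane (3% inside the selected area, 23% for a small mean ratio with Δθ > 0.57″, and 0% for a mean ratio above 1.3). All names are hypothetical.

```python
import random

P_FALSE_GOOD = 0.03   # selected as good match, but actually bad
P_FALSE_BAD = 0.23    # small mean ratio but large separation, actually good

def perturbed_match(mean_ratio, dtheta, rng):
    """Return a randomized match status for one UKIDSS source."""
    if mean_ratio > 1.3:
        return False                           # no contamination measured here
    if dtheta <= 0.57:
        return rng.random() >= P_FALSE_GOOD    # stays a good match 97% of the time
    return rng.random() < P_FALSE_BAD          # becomes a good match 23% of the time

def simulate(sources, n_sim, seed=0):
    """Run n_sim realizations for a list of (mean_ratio, dtheta) pairs."""
    rng = random.Random(seed)
    return [[perturbed_match(r, d, rng) for r, d in sources] for _ in range(n_sim)]
```

Each realization can then be classified into the categories of this section, and the mean and standard deviation of the counts over all realizations give the numbers and uncertainties per category.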

Table 1

Statistics for the classification of R08 objects.

In Table 1, we list the mean number of R08 objects that fall in each category and the corresponding standard deviation (Columns N and σ(N), respectively) for the whole set of MC simulations. Here, we used the sample of 5355 R08 objects not affected by any kind of saturation. It is already clear from the table that the categories indicating R08 objects with only one dominant UKIDSS counterpart (U1_S1 and UM_S1) largely outnumber the remaining classes. This will be further investigated in Sect. 4.4. The last row of Table 1 counts the objects for which the UKIDSS sources do not have a signal-to-noise ratio (S/N) high enough to produce an unequivocal classification. We used a threshold of S/N = 30 in the K band, so that any undetected UKIDSS source at K, i.e., below 3σ = 3N (where N here denotes the noise level), would still be fainter by a factor >10 than the relevant UKIDSS source for each category. The relevant UKIDSS source for which we impose the threshold in S/N depends on the specific category. For U1_S1 and U1_S0, there is a unique UKIDSS source to evaluate; for UM_S1, we consider the source that matches the SED; and for UM_S0_K10 and UM_S0, we simply evaluate the brightest K-band source. The UM_SM objects were not considered because the eventual presence of undetected UKIDSS objects within a factor 10 at K would not affect their classification. The UM_SM_K10 objects are treated independently in Appendix D.

We also derived the proportion of objects, within each category of Table 1, that were classified as YSO or AGB star candidates by R08 (columns %YSOs and %AGB, respectively). The uncertainty on each proportion (column σ(%YSOs) or σ(%AGB)) was estimated by adding in quadrature the standard deviation of the proportion computed for all the MC simulations and the typical error derived from subsampling the whole set of objects in each category (using the hypergeometric distribution; see Appendix C.2 for details). Out of the 5355 R08 objects not affected by any kind of saturation in UKIDSS, 1530 and 3825 are classified as AGB star and YSO candidates, respectively. Although the AGB stars-YSOs distinction made by R08 is not perfect for individual objects, it provides a reasonable separation in a statistical sense (see Sect. 2.1). In principle, the misclassification from the AGB stars-YSOs separation could also be modeled within the MC simulations as for our SED matching criteria, but unfortunately there is no robust determination of the contamination from this separation that could be applied to the whole inner Galactic plane. The proportions of YSO and AGB star candidates in some categories of Table 1 indeed show significant (and expected) differences (at the 2σ level) with respect to the whole sample of R08 objects considered (first row). Because AGB stars are generally brighter than YSOs, there is a relatively lower proportion of AGB star candidates within the U0 category and within the set of objects with low S/N. Similarly, given that AGB stars are more isolated than YSOs, the proportion of AGB star candidates is higher for UM_S1 objects, and lower for UM_SM objects (see Sect. 4.4).

4.4. Proportion of GLIMPSE YSO candidates with one dominant UKIDSS counterpart

In this section, we use the classification of R08 objects based on the SED match or mismatch with the detected UKIDSS sources (see Sect. 4.3) to estimate the proportion of YSOs that can be described by only one dominant UKIDSS counterpart. For simplicity, we refer to those objects as UKIDSS-single, though there might be other non-dominant UKIDSS sources physically associated with the same object.

We are particularly interested in the sample of YSOs, whereas AGB stars are relatively more isolated and most of them are thus expected to be UKIDSS-single; we therefore adopt here the AGB stars-YSOs separation made by R08, as in Sect. 4.3. For each MC simulation previously performed, we counted in each sample (candidate YSOs or AGB stars) the objects that fall in the SED matching categories that are consistent with being UKIDSS-single objects versus the objects with a classification consistent with not being UKIDSS-single objects. We then computed the mean proportion of UKIDSS-single objects (and the standard deviation) over the whole set of MC simulations. Here, we only used the good-quality categories, in which the UKIDSS-single nature is reliably defined: U1_S1 and UM_S1 objects for UKIDSS-single, and U1_S0, UM_SM, UM_SM_K10, UM_S0, and UM_S0_K10 for non-UKIDSS-single objects, comprising a total of ~1260 AGB star candidates and ~2650 YSO candidates (exact numbers vary for each MC simulation).
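The proportion described above can be computed per simulation and then summarized, as in the following minimal sketch. The category sets follow the text; the function names and the input format (one list of category labels per MC realization) are hypothetical.

```python
from statistics import mean, stdev

SINGLE = {"U1_S1", "UM_S1"}
NOT_SINGLE = {"U1_S0", "UM_SM", "UM_SM_K10", "UM_S0", "UM_S0_K10"}

def single_fraction(labels):
    """Fraction of UKIDSS-single objects among the good-quality categories.

    Labels outside both sets (e.g., "U0" or low-S/N objects) are ignored.
    """
    n_single = sum(1 for lab in labels if lab in SINGLE)
    n_other = sum(1 for lab in labels if lab in NOT_SINGLE)
    return n_single / (n_single + n_other)

def summarize(simulations):
    """Mean and standard deviation of the fraction over all realizations."""
    fractions = [single_fraction(labels) for labels in simulations]
    return mean(fractions), stdev(fractions)
```

Applying `summarize` separately to the YSO and AGB star candidates would give the two percentages and uncertainties compared in this section.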

The percentage of UKIDSS-single objects turned out to be 92.1 ± 1.2% for candidate AGB stars and 87.0 ± 1.6% for candidate YSOs. The higher proportion of UKIDSS-single objects for AGB stars with respect to YSOs, as stated above, is expected. However, the absolute percentage of UKIDSS-single YSOs is, at first sight, surprisingly high because one would expect a larger number of GLIMPSE YSOs with potentially multiple dominant UKIDSS counterparts, given the typically clustered nature of their environment. Possible explanations of this result are discussed in Sect. 5.4. If we include in the statistics R08 objects affected by peripheral saturation and some special cases that could be considered UKIDSS-single, the computed proportions are identical within the uncertainties (see Appendix D).

We compared the angular separation Δθ between the UKIDSS sources and the respective R08 objects that were classified as UKIDSS-single but enclosed more than one detected source in total (i.e., all UM_S1 objects). We found that almost all the dominant UKIDSS sources (99% of the cases) were also the nearest in angular projection to the corresponding R08 objects, which is consistent with the fact that these sources are the unique main contributors to the GLIMPSE fluxes. This means that, at least for the R08 sample, and given that the majority of their objects have only one dominant UKIDSS counterpart, a simple nearest-source matching between UKIDSS and GLIMPSE would be a statistically good approximation. However, this kind of matching would fail for the few objects that show possibly genuine multiplicity in UKIDSS (see Sect. 5.1), or for different samples of Spitzer-selected YSO candidates that include fainter objects and would probably have a lower proportion of UKIDSS-single objects (e.g., for star-forming regions in the solar neighborhood).

5. Discussion

5.1. Nature of GLIMPSE objects not dominated by one UKIDSS source

Even though most of the R08 objects were found to be dominated by only one UKIDSS counterpart, which matches the MIR SED and thus represents what we called UKIDSS-single objects (see Sect. 4.4), it is still interesting to investigate the nature of the less frequent remaining cases. We then performed a quick visual inspection of the combined SEDs and UKIDSS images of the objects that were not labeled as UKIDSS-single, using a direct classification of the R08 objects into the categories defined in Sect. 4.3 (i.e., without MC simulations).

We found that U0 objects seem to be truly undetected in the UKIDSS images, except for a few cases in which there is a UKIDSS source that is too extended to be detected by the PSF fitting photometry (see left panel of Fig. 7 for an example). For most of the U1_S0 objects, the detected UKIDSS source is an unrelated star for which the SED matching criteria are correctly not satisfied (angular separation larger than the threshold, Δθ > 0.57″, and convex quadratic fits; see Appendix B). However, a few objects in this category are variable (see Sect. 5.3) or have a good SED match (⟨R⟩ ≤ 1.3, and concave quadratic fit) but their UKIDSS position is slightly shifted from the GLIMPSE position (angular separation criterion too strict in these cases).

Fig. 7

UKIDSS JHK three-color images for three examples of R08 objects not dominated by only one UKIDSS source. The green circles show the positions of the detected UKIDSS sources within a radius of 2′′, which is indicated by the dashed-line circle. Left: SSTGLMC G016.7954+00.1216, an example of an object for which the UKIDSS counterpart is too extended to be detected by the PSF fitting photometry. Middle: SSTGLMC G049.1319+00.9327, an example of an object with two UKIDSS sources that match the MIR SED. Right: SSTGLMC G007.2266−00.2728, an example of an apparent small cluster of UKIDSS sources.

This previous scenario can also occur for some specific UKIDSS source(s) associated with UM_S0 and UM_S0_K10 objects, especially when there seem to be genuine multiple counterparts with a noticeable angular separation that can even produce small variations in the position of the R08 object in the different GLIMPSE bands (and the unique listed GLIMPSE position is separated from all the UKIDSS counterparts by more than 0.57″). Another apparent situation for these categories is when the true UKIDSS counterpart of the R08 object is below the detection limit, and the detected UKIDSS sources are unrelated objects on the line of sight. However, much of the ambiguity of all these special cases was already taken into account in our statistics of Sects. 4.3 and 4.4 by the MC simulations of contamination on the SED matching criteria.

According to our visual inspection, most of the cases of apparently true multiple UKIDSS counterparts are found for R08 objects classified in the UM_SM category. Most of the UM_SM objects have two UKIDSS sources with K-band fluxes within a factor 10 and that match the MIR SED (Fig. 7, middle), but there are also a few cases showing three UKIDSS counterparts or even small clusters of YSOs (Fig. 7, right). The factor 10 imposed here is just a nominal threshold used to separate the case of multiple UKIDSS counterparts (UM_SM) from the situation of multiple UKIDSS sources matching the SED but one dominating the flux (UM_SM_K10), which is very uncommon. Within the UM_SM category, the UKIDSS counterparts typically have more similar K-band fluxes, where 87% of the objects have multiple UKIDSS sources with fluxes within a factor 5.

We also found a few apparent small clusters for some UM_S0 objects, but the total number of clusters found (i.e., including those within the UM_SM category) is not more than ~15; we do not give an exact number here because of the subjectivity of the visual inspection. This is consistent with the calculations presented in Sect. 5.4, in which we found that it is hard to find multiple UKIDSS sources or clusters within the GLIMPSE resolution and with K-band fluxes within a factor ~5 in the R08 catalog; it is probably also difficult to find clusters in any sample of bright single GLIMPSE YSO candidates. As an independent example, Alexander & Kobulnicky (2012) found only 6 UKIDSS clusters at the position of 391 bright (<5 mag) GLIMPSE objects selected by a red color criterion (Ks − [3.6] > 2). They found an additional 12 UKIDSS clusters at the positions of massive YSO candidates from the literature, which did not have a 2MASS/GLIMPSE catalog entry because of blending or saturation. Their clusters could be partially resolved at 3.6 or 4.5 μm and have angular radii in the range 5′′–11′′; they are therefore more extended than the UKIDSS clusters we found toward R08 objects. However, these kinds of GLIMPSE objects, which are close to saturation, highly blended in the GLIMPSE images but missing or identified as a single entry in the catalogs, are not present in the R08 catalog.

5.2. Sensitivity of the SED match to flux changes

Here, we study the sensitivity of the SED match to changes in the NIR fluxes to have an idea of the behavior of our method for variable sources (Sect. 5.3) and of the typical flux ratios between a dominant UKIDSS counterpart and fainter associated red sources for a given R08 object (useful for Sect. 5.4). We iterated over all UKIDSS sources matching the MIR SED of non-saturated R08 objects classified as U1_S1, UM_S1, UM_SM, and UM_SM_K10, and gradually scaled down the UKIDSS fluxes (by a uniform ratio in all bands) to the point where the SED was not matched anymore, i.e., where ⟨R⟩ > 1.3. We then repeated the same exercise, but now scaling up the UKIDSS fluxes.
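The scaling experiment can be sketched as a simple search for the first scale factor at which the match is lost. In this sketch the evaluation of the mean ratio for a given flux scaling is abstracted as a user-supplied callable; the step size, the search range, the function names, and the toy ratio function are all hypothetical.

```python
import math

def mismatch_log_scale(ratio_at, direction=-1, step=0.01, max_dex=3.0, r_max=1.3):
    """Return log10(scale) at which ratio_at(scale) first exceeds r_max.

    direction=-1 scales the fluxes down, direction=+1 scales them up.
    Returns None if the SED is still matched at the end of the range.
    """
    n_steps = int(round(max_dex / step))
    for k in range(1, n_steps + 1):
        log_scale = direction * k * step
        if ratio_at(10.0 ** log_scale) > r_max:
            return log_scale
    return None

# Toy stand-in for the real computation: the mean ratio grows with |log10(scale)|.
def toy_ratio(scale):
    return 1.0 + 0.5 * abs(math.log10(scale))

log_down = mismatch_log_scale(toy_ratio, direction=-1)
log_up = mismatch_log_scale(toy_ratio, direction=+1)
```

Collecting these logarithmic scale factors for every matched source yields distributions of the kind summarized by their mean and standard deviation.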

The top panel of Fig. 8 shows the resulting distribution of the ratio (in logarithmic scale) between the scaled fluxes and the original fluxes at the point of the transition from SED match to mismatch. The logarithm of the ratio has a mean and standard deviation of −0.94 ± 0.28 for scaled-down fluxes and 0.81 ± 0.25 for scaled-up fluxes. These values translate into an average high to low flux ratio of 8.6 and 1σ limits of [4.6, 16.3] for scaled-down fluxes, and an average ratio of 6.5 and limits of [3.7, 11.6] for scaled-up fluxes. This result indicates that the SED matching criteria are not very sensitive to uniform flux variations and that fainter multiple UKIDSS counterparts could be detected (i.e., as matching the SED) if they are within a factor 8–9 of the main counterpart and have a similar NIR SED shape; however, this factor could probably be closer to ~5 for real (not scaled) sources (see Sects. 5.1 and 5.4).
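The conversion from the logarithmic statistics to flux ratios is a direct exponentiation, as the short sketch below shows (the function name is hypothetical). Because the inputs are the rounded values quoted in the text, the outputs agree with the quoted ratios only approximately.

```python
def log_stats_to_ratio(mean_log, std_log):
    """Convert mean and standard deviation of log10(ratio) to a high-to-low
    flux ratio and its 1-sigma limits."""
    centre = abs(mean_log)
    return 10 ** centre, 10 ** (centre - std_log), 10 ** (centre + std_log)

down = log_stats_to_ratio(-0.94, 0.28)   # scaled-down fluxes
up = log_stats_to_ratio(0.81, 0.25)      # scaled-up fluxes
```

With these rounded inputs, the scaled-down case gives a ratio of about 8.7 with limits of about [4.6, 16.6], and the scaled-up case gives about 6.5 with limits of about [3.6, 11.5].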

Nevertheless, this is not necessarily true for less reddened sources or background/foreground stars detected in UKIDSS. To test the performance of the SED matching method in those cases, we repeated the flux-scaling experiment described above, but this time we only scaled down the K-band flux, while the J- and H-band fluxes were always kept constant. In the bottom panel of Fig. 8 we show the distribution of the K-band scaling ratios needed for the sources to mismatch the SED. We did not scale up the fluxes in this case, since that would only increase the H−K color, which is not typical of unrelated field stars. Because the NIR SED shape of each UKIDSS source changes in this test, the resulting ratios are now much lower, with an average high to low flux ratio of 2.4. The SED matching method is therefore reasonably good at rejecting the UKIDSS sources that are unlikely to be physically associated with an R08 object.

Fig. 8

Histograms of ratios (in logarithmic scale) between the scaled fluxes and the original fluxes at the point of the transition from SED match to mismatch. The top panel shows the results for fluxes that were scaled by the same amount in the three JHK bands; these ratios were computed independently for scaled-up and scaled-down fluxes, and are presented here as two separate histograms. The bottom panel shows the distribution of ratios for scaled-down K-band fluxes; in this case, the J- and H-band fluxes were kept constant.

5.3. Variable sources

Given the difference in the epochs between the GLIMPSE and UKIDSS observations, one would expect a significantly higher proportion of intrinsically variable R08 objects in cases where the UKIDSS counterpart(s) do not match the MIR SED, corresponding to categories U1_S0, UM_S0, and UM_S0_K10 in the classification of Sect. 4.3. However, according to the results obtained in Sect. 5.2, the SED matching method is probably not very sensitive to variability. Indeed, by checking the subsample of objects with available GLIMPSE II second epoch data that were found to be variable by R08 by at least 0.3 mag at 4.5 μm or 8.0 μm, we did not find significant differences in the categories mentioned above. In our sample of 5355 non-saturated R08 objects, 1043 have GLIMPSE II second epoch observations, of which 151 (14.5%) are variable. We found that only the category UM_SM had a discrepant (at the 2σ level) proportion of variable objects with respect to the whole sample, namely only 3.9%. Although this might seem consistent, the fact that we still detected variable objects within the other categories indicating good SED match (U1_S1, UM_S1, and UM_SM_K10), and with a comparable proportion to the whole sample, implies that the variability is not really discriminated by our SED matching method.

We can also compare the UKIDSS photometry of R08 objects that are dominated by one source with the corresponding 2MASS magnitudes provided in the original catalog to find additional variable objects and check their classification. A total of 2083 non-saturated objects from our sample are dominated by one UKIDSS source (brightest K-band flux by a factor >10) and have 2MASS Ks magnitudes available, which were compared with the UKIDSS K magnitudes. We did not convert the 2MASS Ks magnitudes to UKIDSS K magnitudes using the transformation given by Hodgkin et al. (2009), as we do later for the synthetic YSOs experiments, because for that we would need the 2MASS J − Ks color, which would considerably reduce the comparison sample (as fewer objects have 2MASS J magnitudes available). However, the conversion is almost the identity and depends weakly on the J − Ks color, which we found to be at most ~5. The error of directly using the 2MASS Ks magnitude is then at most ~0.05 mag.

From this subsample, we found that 516 objects had 2MASS Ks-UKIDSS K absolute differences of more than 0.3 mag, representing 24.8%. As expected, the proportion of K-variable objects is significantly higher for the candidate AGB stars (30.8 ± 1.2%) than for the YSO candidates (21.0 ± 0.7%) of this subsample. However, we did not obtain significant differences in the proportion of K-variable objects in the relevant categories of the SED matching classification when compared with the overall proportion; this is consistent with the results from Sect. 5.2 and for GLIMPSE-variable objects. The only possible trend was found when we examined the direct classification (without MC simulations) of this subsample, in which there are five K-variable objects (out of 8) within the U1_S0 category. Although the statistical significance of this trend should be confirmed by a larger sample, these variable objects might represent a small subset for which their SED shape makes the SED matching method sensitive to relatively smaller changes in the NIR fluxes; this would be a very special situation because there are many more K-variable objects in the categories that indicate good SED match.
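The K-band variability test and the size of the neglected color term can be checked with a few lines. In this sketch the color coefficient of 0.010 is an assumption chosen to be consistent with the numbers quoted in the text (a color of at most ~5 giving an error of ~0.05 mag); the function names are hypothetical.

```python
def is_k_variable(ks_2mass, k_ukidss, threshold=0.3):
    """True if the 2MASS Ks and UKIDSS K magnitudes differ by more than threshold."""
    return abs(ks_2mass - k_ukidss) > threshold

def neglected_color_term(j_minus_ks, coeff=0.010):
    """Magnitude offset ignored when using Ks directly as K (assumed linear term)."""
    return coeff * j_minus_ks

variable_fraction = 516 / 2083   # counts quoted in the text
```

The fraction evaluates to 24.8%, and the neglected term for the reddest objects (J − Ks ~ 5) is about 0.05 mag, well below the 0.3 mag variability threshold.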

5.4. Expected clustering within the GLIMPSE resolution

In order to interpret the intuitively surprising result of Sect. 4.4 regarding the high percentage of GLIMPSE YSO candidates that are dominated by only one UKIDSS counterpart, we investigated how synthetic clustered YSOs would be observed within the GLIMPSE resolution. To accomplish this, we used the population synthesis model of Galactic YSOs developed by Robitaille & Whitney (2010). The specific model used in the present work consists of a total of 2.68 million synthetic YSOs in the Galaxy, of which 9055 would be detected in GLIMPSE and satisfy the conditions required for the R08 catalog and the YSO selection criteria adopted by R08 to separate them from AGB stars. These numbers differ slightly from the comparison model used by Robitaille & Whitney (2010; total of 2.73 million objects, and 11 919), since we again ran a random realization of the model for a given SFR, and also because we now include the criteria of R08 based on 24 μm to separate the YSOs from AGB stars. We converted the intrinsic 2MASS JHKs magnitudes (i.e., before interstellar extinction is applied) to UKIDSS JHK magnitudes for all the YSOs of the model using the transformation given in Eqs. (6)–(8) of Hodgkin et al. (2009).

From the 9055 GLIMPSE-detected synthetic YSOs, we selected a subsample of 3391 objects that would be covered by UKIDSS GPS DR8 images (see Sect. 2.2 and Fig. 1) and would not be saturated in UKIDSS (magnitude cuts J> 13.25, H> 12.75 and K> 12, following Lucas et al. 2008). To simulate the effect of clustering (which had not been considered by the original model by Robitaille & Whitney 2010), instead of rearranging the full population of synthetic YSOs into groups and clusters, we adopted the simpler approach of placing the already selected subsample of 3391 UKIDSS-detected YSOs in clustered environments and examining how their observational properties change. For each one of these UKIDSS-detected synthetic YSOs, we randomly assigned a certain number of neighbors from the full list of 2.68 million synthetic YSOs of the model, with the only condition that they were not brighter in the K band after scaling these neighbors to the same distance and interstellar extinction.

The number of neighbors for each YSO was obtained by first drawing a value for the surface number density from a lognormal distribution, and then, using the distance, we converted this value to the number of objects within 2′′ (to compare with the properties of the UKIDSS detections in our observed sample). This assumption was motivated by the work of Bressert et al. (2010), who found that the combined YSO surface density distribution of several Spitzer-observed star-forming regions within 500 pc from the Sun is well described by a lognormal function. However, the Bressert et al. (2010) distribution is probably not representative of the whole range of star-forming environments in the Galaxy, in particular of the more distant and massive regions that probably harbor many of the YSO candidates from the R08 sample. In fact, the Orion Nebula Cluster was excluded from the analysis by Bressert et al. (2010) owing to the likely incompleteness of the Spitzer YSO sample there. In addition, the recent results by Kuhn et al. (2015) using YSO samples selected by combining X-ray and infrared observations toward 17 massive star-forming regions have shown that an important part of the YSO population in these regions resides in much denser environments than those studied by Bressert et al. (2010).

Unfortunately, there is no estimate of the average distribution of YSO surface densities in the Galaxy, which would be more appropriate for the R08 sample. Here, we simply adopt a shifted and broadened lognormal distribution with respect to that proposed by Bressert et al. (2010): mean surface density $\mu_{\log_{10}\Sigma} = 2$ (where $\Sigma$ is in units of pc$^{-2}$) and dispersion $\sigma_{\log_{10}\Sigma} = 1$, as compared with the original $\mu_{\log_{10}\Sigma} = 1.34$ and $\sigma_{\log_{10}\Sigma} = 0.85$ of Bressert et al. (2010). Figure 9 compares these two lognormal distributions, together with the combined surface density distribution of the 17 massive star-forming regions by Kuhn et al. (2015); we obtained this combined distribution by digitizing their Fig. 7 and summing all the individual histograms. The assumed distribution properly covers the higher densities found by Kuhn et al. (2015), but does not abruptly drop for the lower densities covered by Bressert et al. (2010). Still, we believe that our adopted distribution represents an upper limit for the average clustering of YSOs in the Galaxy, which is probably somewhere in between our distribution and that by Bressert et al. (2010). Therefore, for comparison we also ran the clustering experiments using the Bressert et al. (2010) surface density distribution as input.
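The conversion from a drawn surface density to a number of objects within the 2″ radius can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code used for the experiments: the function names, the seed, and the 5 kpc example distance are hypothetical, and the sketch returns the expected (non-integer) count πr²Σ because the text does not specify how the count was discretized.

```python
import math
import random

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)

def draw_log10_sigma(rng, mean=2.0, dispersion=1.0):
    """Draw log10 of the surface density (in pc^-2) from the adopted lognormal."""
    return rng.gauss(mean, dispersion)

def expected_neighbors(sigma_pc2, distance_pc, radius_arcsec=2.0):
    """Expected number of objects inside a projected circle of the given angular radius."""
    radius_pc = distance_pc * radius_arcsec * ARCSEC_TO_RAD  # small-angle approximation
    return sigma_pc2 * math.pi * radius_pc ** 2

rng = random.Random(42)  # arbitrary seed, for reproducibility only
log_sigma = draw_log10_sigma(rng)
n_expected = expected_neighbors(10.0 ** log_sigma, distance_pc=5000.0)  # assumed distance
print(f"log10(Sigma) = {log_sigma:.2f}, expected objects within 2 arcsec: {n_expected:.2f}")
```

For reference, a surface density of 100 pc$^{-2}$ (the mean of the adopted distribution in logarithmic space) at an assumed distance of 5 kpc corresponds to about 0.74 objects within 2″.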

Fig. 9. Probability distributions of YSO surface densities in units of pc$^{-2}$ and logarithmic scale. The solid curve represents our assumed distribution for the clustering experiments (lognormal with mean $\mu_{\log_{10}\Sigma} = 2$ and dispersion $\sigma_{\log_{10}\Sigma} = 1$), whereas the dashed curve indicates the distribution by Bressert et al. (2010) for several Spitzer-observed star-forming regions within 500 pc from the Sun (lognormal with mean $\mu_{\log_{10}\Sigma} = 1.34$ and dispersion $\sigma_{\log_{10}\Sigma} = 0.85$). The normalized histogram is the combined surface density distribution of the 17 massive star-forming regions studied by Kuhn et al. (2015).

We repeated the random selection of neighbors for the sample of UKIDSS-detected synthetic YSOs in a series of 1000 MC simulations. In each simulation, we randomly assigned K-band detection limits (taken from the observed sample of R08 objects) and counted the number of synthetic YSOs that: (1) had a signal-to-noise S/N > 30 in the K band; (2) satisfied the S/N threshold and had no neighbors (if any) satisfying the SED matching criteria when applied to the NIR fluxes of the neighbors and the MIR fluxes of the main object (as expected, the SED matching criteria are always met when applied to the same synthetic object); or (3) satisfied the S/N threshold and were brighter at K than all their corresponding neighbors by at least a factor fK, which was varied in the range [4,20]. We found that 2535 ± 13 UKIDSS-detected synthetic YSOs had S/N > 30 at K, of which 91.0 ± 0.6% satisfied condition (2), and 92% to 83% satisfied condition (3) for the different minimum K-flux ratios used. These percentages are even higher for the clustering experiments using the Bressert et al. (2010) surface density distribution: ~98% for condition (2), and 98% to 95% for condition (3).
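Condition (3) amounts to a simple flux-ratio test against the brightest neighbor. The sketch below illustrates it on toy data; the function names and flux values are hypothetical and are not model output.

```python
def is_dominant(main_flux, neighbor_fluxes, f_k):
    """True if the main object is at least f_k times brighter at K than every neighbor."""
    return all(main_flux >= f_k * flux for flux in neighbor_fluxes)

def dominant_fraction(systems, f_k):
    """Fraction of (main_flux, neighbor_fluxes) pairs that satisfy the flux-ratio condition."""
    if not systems:
        return 0.0
    return sum(is_dominant(main, nbrs, f_k) for main, nbrs in systems) / len(systems)

# Toy K-band fluxes in arbitrary units (illustrative values only).
toy_systems = [
    (100.0, []),            # no neighbors: trivially dominant
    (100.0, [10.0, 3.0]),   # brightest neighbor is 10 times fainter
    (100.0, [30.0]),        # brightest neighbor is only ~3.3 times fainter
    (100.0, [6.0, 5.0]),    # brightest neighbor is ~16.7 times fainter
]
for f_k in (4, 10, 20):
    print(f_k, dominant_fraction(toy_systems, f_k))
```

As in the experiments, the fraction can only decrease as the minimum flux ratio fK increases, because a stricter ratio excludes more neighbors from being considered negligible.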

Given the simplicity of the clustering simulations, the percentages computed for our assumed surface density distribution are in good agreement with the 87.0 ± 1.6% of observed YSO candidates that are dominated by only one UKIDSS counterpart, referred to as UKIDSS-single objects (see Sect. 4.4). Although we cannot rule out small systematic errors arising from the assumptions of our clustering experiments, the slightly lower proportion of observed UKIDSS-single objects with respect to the proportion of synthetic YSOs satisfying condition (2) might be produced by a real underestimation of the observed number of UKIDSS-single objects, which can still be present outside the categories U1_S1 and UM_S1 of the classification (see Sect. 5.1). This hypothesis is supported by the fact that the assumed surface density distribution in our experiments probably represents an upper limit for the average YSO clustering in the Galaxy (see above).

Regarding the typical flux ratios, we found that the proportion of synthetic UKIDSS-single YSOs by SED match (condition 2) agrees with the proportion of objects satisfying condition (3) for a minimum K-flux ratio of fK = 5, which would mean that fainter associated UKIDSS sources could be detected by the SED matching method down to a factor 1 / 5 of the main counterpart. This is in agreement with the typical flux ratios found for multiple UKIDSS counterparts in the real sample (category UM_SM, see Sect. 5.1), but lower (although within the 1σ limits) than the average flux ratio of fK = 8.6 derived from the flux-scaling experiments of Sect. 5.2; this is probably because in those experiments we have analyzed sources that originally matched the SED and it is therefore harder to change their SED matching criteria by just scaling down their fluxes, as compared with independent fainter sources with their own SED shape.

Overall, we have found that the high observed percentage of GLIMPSE YSO candidates that are dominated by only one UKIDSS counterpart is well reproduced by the clustering experiments presented here using a synthetic population of Galactic YSOs. Therefore, we suggest that the main explanation for this high percentage is that within the mass range covered by the R08 YSO sample (~3 to 20 M⊙, see Fig. 1 of Robitaille & Whitney 2010), clustering with objects of comparable mass is unlikely at the GLIMPSE resolution. However, this does not mean that clustering with lower mass YSOs does not occur; it is simply undetected by our observational method. Indeed, if we do not impose any limit on the K-band flux ratio, about 60% of the UKIDSS-detected synthetic YSOs in our experiments have at least one neighbor within the GLIMPSE resolution.

5.5. Expected clustering within the UKIDSS resolution

By performing analogous calculations to those presented in Sect. 5.4, we estimated the properties of the YSO sample when observed by a hypothetical survey with an even higher angular resolution than that of UKIDSS. We ran another set of 1000 MC experiments of random assignment of neighbors for the 3391 UKIDSS-detected YSOs (a subsample from the 9055 synthetic GLIMPSE-detected YSOs), but this time we counted the number of objects within 0.9″, the UKIDSS resolution. We found that an important proportion (~63%) of objects are genuinely single sources (no other sources within the UKIDSS resolution), while 97% to 92% have only one dominant source within the UKIDSS resolution with minimum K-flux ratios fK in the range [4,20]. This indicates that, at least for the R08 sample of YSO candidates, a higher resolution than that of UKIDSS is not really needed to statistically assess their clustering properties.

We can also use the full population of 2.68 million synthetic YSOs of Robitaille & Whitney (2010) to generate an observed sample of YSOs that would be directly selected with UKIDSS data (without depending on GLIMPSE), and study their properties if observed by a higher resolution survey. We applied the UKIDSS magnitude cuts 13.25 < J < 19, 12.75 < H < 18, and 12 < K < 17, corresponding to the saturation limits already adopted before, and rough conservative detection limits that were estimated from the UKIDSS observations of the R08 objects used in this paper (see Sect. 3.3). A 99% flux radius < 0.9″ was additionally required, following Robitaille & Whitney (2010). A total of 70 478 synthetic YSOs satisfy these criteria within the GLIMPSE observed area; we used the full coverage to emulate the future inclusion of the VVV survey (see Sect. 2.2). The experiments of random assignment of neighbors for this sample resulted in about 89% of YSOs being genuinely single within the UKIDSS resolution, whereas 93% to 90% have only one dominant source within 0.9″ for the same range of minimum K-flux ratios as before, fK = 4 to fK = 20. Again, we might not need a better resolution than that of UKIDSS to statistically investigate the clustering properties of a hypothetical UKIDSS-selected (and VVV-selected) sample of YSOs. The objects of this sample are typically located at closer distances than the synthetic GLIMPSE-selected YSOs, as expected from the higher extinction at shorter wavelengths, so that most of the objects do not have any neighbor within 0.9″. However, since UKIDSS is more sensitive than GLIMPSE, UKIDSS-selected objects are more numerous and cover lower masses, and therefore the minimum flux ratio criterion in this case does not significantly increase the proportion of objects dominated by one source.

5.6. Unresolved binaries

Apart from clustering, there is another characteristic of star-forming regions that can also affect results derived when assuming that GLIMPSE YSO candidates are single objects: the ubiquitous presence of binary (and in lower proportion, multiple) systems (e.g., Duchêne & Kraus 2013; Reipurth et al. 2014), which are typically unresolved at the GLIMPSE resolution. To simulate this effect, we rearranged the 2.68 million synthetic YSOs of the model by Robitaille & Whitney (2010) in a certain proportion of binaries, and we then compared the number of objects (or unresolved pairs of objects) that would be selected by R08 with the 9055 selected synthetic YSOs from the original single-stars model.

There is currently a controversy about whether the binary fraction and period distribution in the Galactic field result from the dynamical processing (in dense regions or clusters) of binaries that form from a universal period distribution and a binary fraction of unity (e.g., Kroupa 1995; Marks & Kroupa 2011; Marks et al. 2015), or whether the binary properties in the field are indicative of the primordial binary population (Parker & Meyer 2014). Therefore, we paired the synthetic YSOs in binary systems using two sets of binary properties as follows:

  • The universal setup: we assumed that all YSOs are in binaries with periods that follow a universal distribution, except for massive stars. Following Oh et al. (2015), we used the period distributions given by their Eqs. (2) and (3) for primary masses m1 < 5 M⊙ and m1 ≥ 5 M⊙, respectively.

  • The field setup: we adopted primary-mass dependent binary fractions and period distributions representative of the Galactic field, summarized in Table 1 of Parker & Meyer (2014); we extended the G-dwarf range to primary masses between 0.45 M⊙ and 1.5 M⊙, and the A-dwarf range up to masses of 5 M⊙. For primary masses m1 ≥ 5 M⊙, we used the period distribution and the binary fraction of 0.69 derived by Sana et al. (2012).

In both setups, the secondary objects were selected such that the mass ratio q = m2/m1 followed a roughly flat distribution in the range q ∈ [0.1,1], as observed in the Galactic field (Duchêne & Kraus 2013; Reggiani & Meyer 2013). For simplicity, the 99% flux radii, r1 and r2, of the two synthetic YSOs were added geometrically to estimate the radius of the system, r = (r1 + r2 + a) / 2, in the case in which the smallest radius plus the binary separation a is larger than the largest radius; otherwise we adopted the largest radius. Similarly, the fluxes of the two objects at every wavelength were simply added to obtain the corresponding flux of the binary system. Even though this approach is probably enough to just have an idea of the effect of unresolved binaries in the observed population of GLIMPSE YSO candidates, a proper treatment of simulated binaries would involve the use of radiative transfer models of binary YSOs instead of single YSOs for the synthetic objects, which is beyond the scope of this paper. In each setup, we counted the number of synthetic YSO (single or binary) systems that would be detected in GLIMPSE and included in the R08 sample of YSO candidates (already separated from AGB stars). For binaries, we also imposed that the observed separation of the objects was outside the range [2″,3″] to account for the close source criterion imposed by R08 (see Sect. 2.1).
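The rules used to combine two synthetic YSOs into one unresolved system can be sketched as below. This is a minimal illustration with hypothetical function names and toy values; in particular, drawing q directly from a uniform distribution is a simplifying assumption, since in the experiments the secondaries were selected from the existing synthetic population so that q followed a roughly flat distribution.

```python
import random

def system_radius(r1, r2, separation):
    """Radius of the unresolved pair following the rule quoted in the text (same units throughout)."""
    r_small, r_large = sorted((r1, r2))
    if r_small + separation > r_large:
        return (r1 + r2 + separation) / 2.0
    return r_large

def system_flux(fluxes_1, fluxes_2):
    """Sum the fluxes of the two components wavelength by wavelength."""
    return [f1 + f2 for f1, f2 in zip(fluxes_1, fluxes_2)]

def draw_mass_ratio(rng, q_min=0.1, q_max=1.0):
    """Draw q = m2/m1 from a flat distribution (assumed uniform draw)."""
    return rng.uniform(q_min, q_max)

rng = random.Random(1)  # arbitrary seed
print(system_radius(100.0, 50.0, 200.0))     # wide pair: the separation enlarges the system
print(system_radius(100.0, 10.0, 20.0))      # companion inside the larger radius
print(system_flux([1.0, 2.0], [0.5, 0.25]))  # toy fluxes at two wavelengths
print(draw_mass_ratio(rng))
```

The first case illustrates a pair whose separation extends the emission beyond the larger component, while in the second case the companion lies well inside the larger 99% flux radius, so that radius is kept.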

We found that ~6750 YSO binaries from the universal setup and ~7680 YSO systems from the field setup would be selected by R08 as YSO candidates. Exact numbers vary slightly with each random realization. This time we did not run MC repetitions because pairing a significant proportion (the totality for the universal setup) of the whole population of synthetic YSOs into binaries was much more computationally expensive than the experiments of previous sections. Thus, if we consider the presence of binaries, the number of GLIMPSE-selected synthetic YSOs is reduced by a factor ~0.75–0.85 with respect to the number of objects selected from a binary-free model with the same initial number of individual YSOs. This reduction is less extreme than what one might expect from the pairing alone (e.g., ~0.5 for the case of all YSOs paired in twin binaries). Even for a flat distribution of mass ratios, primary YSOs are dominated by the more massive objects of the population, so that the number of missing massive synthetic YSOs that would have been selected by R08 as single objects but are now the companions of more massive YSOs is moderate. This can be illustrated by comparing the mass functions of the original sample and of the primary YSOs from the universal setup. The ratio of the number of primary YSOs from the universal setup with masses m1 > 3 M⊙ to the number of original single YSOs with masses >3 M⊙ (roughly all GLIMPSE-selected YSOs are above this limit) is ~0.68, which is significantly higher than 0.5, even though the total primary YSO sample is half the original.

We also used these samples of synthetic GLIMPSE-selected YSO systems from the two setups of binary properties to estimate the number of binaries that would be resolved by UKIDSS and identified by our SED matching method. The proportion of resolved or detected binaries was computed with respect to the total of GLIMPSE-selected systems (i.e., single or binary for the field setup) that were not saturated in UKIDSS and had a signal-to-noise S/N > 30. We assigned a K-band detection limit to each system, drawn from the distribution of UKIDSS sensitivities for the observed sample of R08 objects (analogous to the experiments presented in Sect. 5.4). We found a proportion of ~2% to 3.5% of binaries with observed separations larger than 0.375 × 0.9″ = 0.34″ for the field and universal setup, respectively (0.375 × FWHM is the limit above which the DAOPHOT routines are allowed to deblend two close sources). If we additionally required that the two objects in each binary had K-band fluxes within a factor of fK = 5 to be detected by the SED matching method (see Sect. 5.4), and separations smaller than 0.57′′ (one of the SED matching criteria), this proportion decreases to ~0.3–0.6%. Given that we have a good-quality observed sample of about 2600 GLIMPSE YSO candidates (from Sect. 4.4), this means that we are able to detect ~8–16 binary YSOs in UKIDSS through the SED matching method. We have indeed identified many (~50) double systems in the UM_SM category (see Sect. 5.1); however, with the present data we cannot distinguish the systems that represent genuine binaries from those that just happen to contain two YSOs within the GLIMPSE resolution due to clustering.
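The numbers quoted in this paragraph follow from simple arithmetic, reproduced here as a check. The variable names are ours; all input values are taken from the text.

```python
fwhm_arcsec = 0.9           # UKIDSS resolution quoted in the text
deblend_fraction = 0.375    # DAOPHOT deblending limit in units of the FWHM
min_separation = deblend_fraction * fwhm_arcsec

n_good_quality = 2600                  # approximate observed sample size (Sect. 4.4)
detectable_fractions = (0.003, 0.006)  # ~0.3% and ~0.6%
expected = [n_good_quality * f for f in detectable_fractions]

print(f"minimum resolvable separation: {min_separation:.4f} arcsec")
print(f"expected detectable binaries: {expected[0]:.1f} to {expected[1]:.1f}")
```

The separation limit rounds to the quoted 0.34″, and the expected counts round to the quoted range of ~8–16 detectable binaries.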

5.7. Implications for SFR estimates

By adjusting the total number of YSOs in their population synthesis model such that the numbers of synthetic and observed GLIMPSE-selected objects match, Robitaille & Whitney (2010) estimated an SFR of the Milky Way in the range 0.68–1.45 M⊙ yr⁻¹; this range of values accounts for the uncertainty on the completeness and YSO selection (to separate them from AGB stars) in the observed sample. Since they assumed that the GLIMPSE-selected YSOs are individual objects, a significant presence of unresolved clustering or binaries within the GLIMPSE resolution could in principle modify this estimate.

However, we have seen that even though clustering can be common within the GLIMPSE resolution, the YSO candidates of the R08 catalog typically have intermediate masses (≳3 M⊙) and, therefore, dominate the observed flux with respect to their neighbors if all the sources follow a canonical mass function. This answers the question of why most of the YSO candidates analyzed in this work are dominated by only one UKIDSS counterpart. In the clustering experiments of Sect. 5.4, we found that ~87% of the synthetic YSOs are brighter at K than all their corresponding neighbors (if any) by a factor >10, which means that if clustering were included in the Robitaille & Whitney (2010) model we would in any case need almost the totality of the mass of the main object to explain the observed fluxes. Consequently, the impact of clustering on the SFR estimate by Robitaille & Whitney (2010) is expected to be at most a few percent and, therefore, considering the uncertainties, no significant corrections are needed from this effect.

Regarding binaries, we found in Sect. 5.6 that, with respect to the original model, the number of GLIMPSE-selected YSOs is reduced by a factor ~0.75–0.85 in a population synthesis model taking unresolved binary YSOs into account. Since the model is just a random realization of the Galactic YSO population, the total number of individual YSOs scales linearly with the number of detected objects, so that the initial number of synthetic YSOs (and hence, the SFR) of the binaries model has to be increased by a factor ~1.18–1.33 to reproduce the observed number of GLIMPSE YSO candidates. Therefore, the correction from the presence of unresolved binaries to the SFR estimate of Robitaille & Whitney (2010) might not be negligible and could in part bring it closer to other estimates in the literature that are systematically higher (Chomiuk & Povich 2011).
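Because the number of detected objects scales linearly with the total number of YSOs, the SFR correction is simply the inverse of the reduction factor. The short check below uses the approximate counts of Sect. 5.6; the variable names are ours.

```python
n_single_model = 9055                                 # selected YSOs in the single-star model
n_binary_models = {"universal": 6750, "field": 7680}  # approximate counts from Sect. 5.6

for setup, n_selected in n_binary_models.items():
    reduction = n_selected / n_single_model
    correction = 1.0 / reduction
    print(f"{setup}: reduction ~{reduction:.2f}, SFR correction ~{correction:.2f}")

# The rounded reduction factors quoted in the text give the quoted correction range.
corrections = [round(1.0 / r, 2) for r in (0.85, 0.75)]
print(corrections)
```

The quoted range of ~1.18–1.33 corresponds to the rounded reduction factors 0.85 and 0.75; using the approximate counts directly gives values that differ only in the second decimal.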

6. Conclusions

We have analyzed near-infrared UKIDSS observations of a sample of 8325 objects taken from the catalog by Robitaille et al. (2008) of intrinsically red sources in the Galactic plane, which were selected using the mid-infrared GLIMPSE survey. Since UKIDSS has a better angular resolution than that of GLIMPSE by a factor >2, our primary aim was to investigate whether there are multiple UKIDSS sources that might all contribute to the GLIMPSE flux or if there is only one dominant UKIDSS counterpart. We did not use the published UKIDSS point source catalog, which is based on aperture photometry alone, and we instead performed PSF fitting photometry at the position of every GLIMPSE red source to detect and properly separate the emission of multiple overlapping sources. The main results and conclusions presented in this paper are summarized as follows:

  1. The dominant UKIDSS sources are typically characterized by a smooth transition between their NIR SED and the MIR SED of the corresponding R08 objects. We implemented a technique to automatically recognize these UKIDSS sources, which basically consisted of a comparison between two different interpolation methods at the NIR-MIR transition of the SED. This technique is very generic and could readily be applied to matching SEDs across gaps at other wavelengths.

  2. Most of the analyzed objects from the R08 sample present only one dominant UKIDSS counterpart, which matches the MIR SED (what we call UKIDSS-single objects). In particular, the percentage of UKIDSS-single objects is 92.1 ± 1.2% for candidate AGB stars, and 87.0 ± 1.6% for candidate YSOs, using the YSO-AGB star approximate separation of R08. While this was expected for AGB stars, it was not intuitive for YSOs given the typically clustered nature of their environment.

  3. Practically all of the dominant UKIDSS sources were also the nearest in angular projection to the corresponding R08 objects; therefore, a simple nearest-source matching between UKIDSS and GLIMPSE would be a statistically good approximation for the overall R08 sample, although it would be wrong for the few specific objects showing multiple dominant sources in UKIDSS or for different samples of Spitzer-selected YSOs with lower proportions of UKIDSS-single objects.

  4. For the few R08 objects that are not dominated by one UKIDSS source, we found by visual inspection some cases of apparently true multiple UKIDSS counterparts, typically two sources, but in exceptional cases three sources or even small clusters. However, given that the SED matching method was designed to identify the dominant UKIDSS sources, these multiple sources have K-band fluxes within a factor ~5, while fainter counterparts are, in general, not identified by our technique.

  5. We found that the SED matching method is not very sensitive to flux changes of up to a factor ~7–9, and therefore the dominant UKIDSS sources can still be identified in the SED if they are variable (considering the difference in the epochs between the GLIMPSE and UKIDSS observations).

  6. We performed simple clustering experiments using the population synthesis model by Robitaille & Whitney (2010), in which we randomly assigned neighbors within the GLIMPSE resolution to the “detected” synthetic YSOs. We were able to reproduce the high percentage of GLIMPSE YSOs that are dominated by only one UKIDSS counterpart with a K-flux that is brighter than that of their neighbors by a factor of at least ~5. We argued that within the mass range covered by R08 YSO candidates (~3–20 M⊙), clustering with objects of comparable mass is unlikely at the GLIMPSE resolution.

  7. We also carried out similar experiments to study the effect of unresolved binaries in the GLIMPSE YSO sample, but this time we rearranged the full initial set of synthetic YSOs of Robitaille & Whitney (2010) in a certain proportion of binaries and investigated how the number of “detected” objects changes with respect to the original single-stars model. We found that this number is reduced by a factor ~0.75–0.85.

  8. We conclude, according to these results, that no significant corrections are needed to the SFR estimated by Robitaille & Whitney (2010) from the effect of YSO clustering within the GLIMPSE resolution. However, the correction derived from the presence of unresolved YSO binaries might not be negligible, and would increase the SFR estimate by a factor ~1.2–1.3.

The SED matching method implemented in this paper turned out to be very useful to characterize the UKIDSS observations of the GLIMPSE YSO candidates, especially for the detection of the dominant counterparts. Nevertheless, as shown by the clustering experiments of synthetic YSOs, a significant proportion of the GLIMPSE YSO candidates might contain at least two physically associated UKIDSS sources within the GLIMPSE resolution, even though only one dominates the flux. The challenge for the near future is to design procedures to identify these fainter multiple sources to really take advantage of the full high-resolution information of UKIDSS and progress toward the construction of the hierarchical YSO catalog in the long term.


4. For details on this issue, see http://surveys.roe.ac.uk/wsa/gpsAstrometryDR8.html

7. For convenience, we inverted the ratio for scaled-down fluxes. Also, since our SED matching procedure was always applied in logarithmic space (see Sect. 4.2), these more intuitive values are just given for reference, and the average ratio quoted here is obtained by simply inverting the mean of the logarithm computed before, so that it is not strictly the mean of the ratio.

Acknowledgments

We thank the referee for a thorough report that helped us to improve the clarity of the paper. This work was carried out in the Max Planck Research Group Star formation throughout the Milky Way Galaxy at the Max Planck Institute for Astronomy (MPIA). We have used observations from the Spitzer Space Telescope, which is operated by the Jet Propulsion Laboratory, California Institute of Technology under a contract with NASA; and from the 8th Data Release of the UKIDSS Galactic Plane Survey (Lucas et al. 2008). We thank Peter B. Stetson for providing us the source code of the DAOPHOT package. This research made use of Astropy, a community-developed core Python package for Astronomy (Astropy Collaboration et al. 2013); matplotlib, a Python library for publication quality graphics (Hunter 2007); and APLpy, an open-source plotting package for Python hosted at http://aplpy.github.com.

Appendix A: Decision rules for SED matching

Table A.1. Categories of JHK combinations of good-quality (X), bad-quality (×), and upper limit () fluxes.

Here, we describe in detail how we treated the three UKIDSS JHK bands for SED matching in the presence of bad-quality fluxes and/or upper limits, as defined in Sect. 3.3. If we consider that each UKIDSS band can be either a normal measurement (good-quality flux), a bad-quality flux, or an upper limit, there are then 27 possible cases of JHK combinations for each source, which were grouped into 7 different categories. A specific category is a set of combinations sharing some relevant property and for which we applied the same decision rule for SED matching. The JHK combinations and the defined categories are listed in Table A.1. With the exception of very specific situations that are described below, most of the combinations outside the G category represent cases in which the match or mismatch with the MIR SED is uncertain. These UKIDSS sources are referred to in this paper as ambiguous sources, and are divided into faint ambiguous (FA) and bright ambiguous (BA) sources depending on their K-band flux, FK, and the flux of the brightest UKIDSS source associated with the same R08 object and matching the MIR SED. In general (unless explicitly mentioned below when describing each category), we used the criteria, (A.1)

By definition, R08 objects classified as cases UM_S0_K10 and UM_S0 in Sect. 4.3 do not have any source matching the MIR SED, so if ambiguous sources are present, they are automatically labeled as BA (by setting ). Similarly, an ambiguous UKIDSS source that is the only detected source for a given R08 object is always classified as BA, except in the special situations stated below. In this way, all R08 objects with FA sources are still usable for the SED matching statistics.

An R08 object with a BA source is removed from the sample for further analysis, unless the object corresponds to the category UM_SM of SED matching and the BA source is not brighter than , in which case the UM_SM classification is not affected. If the category UM_SM_K10 is not considered as a case of one dominant UKIDSS counterpart (as in the statistics of Sect. 4.4), UM_SM_K10 objects with BA sources (as before, not brighter than ) are also allowed, owing to the following argument: if the BA source had, in reality, a good SED match, the category would only change to UM_SM, and the total number of R08 objects that do not have only one dominant UKIDSS counterpart would be the same.

Below, we explain every category of JHK combinations and the corresponding decision rule we applied. As before, we represent the K-band flux of the brightest UKIDSS source matching the MIR SED as , while the flux of the considered (potentially ambiguous) UKIDSS source is just denoted by FK.

  • NS: these are very few sources that were initially detected by the source finding algorithm of DAOPHOT, but they turned out to be below the more rigorous 3σ detection limits defined in Sect. 3.3, in all the UKIDSS bands. They can therefore be considered spurious sources.

  • BK: given that the K band is the closest in wavelength to the GLIMPSE filters and is therefore crucial to define the NIR-MIR transition of the SED, in this work we adopted the conservative approach of rejecting all UKIDSS sources with bad-quality K flux for the SED analysis. We applied for the associated R08 object the rule, In this particular case, even R08 objects with SED matching types UM_SM (and UM_SM_K10) were not considered because bright bad-quality K fluxes could also affect the photometry of nearby sources.

  • KU-B: in this case, there is no good-quality flux in any band and therefore the SED (mis)match is completely uncertain; however, since the K band is just upper limit and not a bad-quality measurement, the R08 object follows the standard decision rule of Eq. (A.1),

  • KU-G: these sources have also a K-band upper limit, but the presence of at least one good-quality flux in the other bands allows us to run the SED matching procedure using the upper limits (except the case of detection at H only, for which we ignored the J upper limit) and ignoring the bad-quality fluxes. With this limited information, if the source matches the SED, this would still be uncertain because the actual K-band flux could be lower; however, if the source does not match the SED, this is very likely the case since the lower actual fluxes at longer wavelengths (K band, and H band when it is also an upper limit) would produce an even clearer SED mismatch. Then, we applied the rule

  • KO-2: sources with good-quality flux only in the K band could still be important counterparts of the respective R08 objects, and therefore were not always considered ambiguous. In this particular case, if the source had a flux FK that was the brightest by a factor >10 among the detected sources, it was evaluated by a secondary SED matching criterion: if Δθ ≤ 0.57″ (as in Eq. (2)) and the linear extrapolation at the K band from the two shortest wavelength Spitzer fluxes of the corresponding R08 object, in (log 10(λ),log 10(Fν)) space, is within a factor 2 of FK, the source is assumed to match the MIR SED; we refer to this case as a linear match with the SED. If the source is not the brightest, it is treated as a standard ambiguous source. In summary,

  • KO-U: these sources have also good-quality flux in the K band only, but in this case there is at least one upper limit in the other bands, so that we can run the primary SED matching method using the K-band flux and the upper limit(s), and ignoring any bad-quality flux, if present. Given the expected shape of the SED, if the source matches the MIR SED, we think that it would likely match the SED with the actual (lower) flux(es) at H and/or J as well; if there is a SED mismatch, however, it remains uncertain, unless the source is the brightest at K by a factor >10 and linearly matches the MIR SED, in which case the mismatch with the primary method is probably only due to the use of upper limit(s) in the other bands instead of the actual fluxes. In other words, sources that do not match the SED with the primary method were treated as KO-2 sources:

  • G: this category groups all sources that always have a good-quality K-band flux, and at least one more good-quality flux in the H and/or J band. Consequently, the SED (mis)match is unambiguously defined and the primary SED matching algorithm can be applied by simply ignoring the bad-quality flux or upper limit, if present.
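
The secondary (linear match) criterion described for KO-2 sources can be illustrated with a minimal Python sketch. It is not the code used in this work: the function names, the default wavelengths (3.6 and 4.5 μm for the two shortest Spitzer bands, and roughly 2.2 μm for the K band), and the reading of "within a factor 2" as a flux ratio between 1/2 and 2 are assumptions made for illustration only.

```python
import math

def extrapolate_to_k(lam1, f1, lam2, f2, lam_k):
    """Extrapolate the straight line through two MIR points, defined in
    (log10 lambda, log10 F_nu) space, to the K-band wavelength."""
    x1, y1 = math.log10(lam1), math.log10(f1)
    x2, y2 = math.log10(lam2), math.log10(f2)
    slope = (y2 - y1) / (x2 - x1)
    y_k = y1 + slope * (math.log10(lam_k) - x1)
    return 10.0 ** y_k

def is_linear_match(delta_theta, f_k, f_36, f_45,
                    lam_36=3.6, lam_45=4.5, lam_k=2.2,
                    max_sep=0.57, factor=2.0):
    """Return True if the source is within max_sep (arcsec) of the R08
    object and its K-band flux agrees with the extrapolated flux to
    within the given factor (assumed here to mean 1/factor..factor)."""
    if delta_theta > max_sep:
        return False
    f_extrap = extrapolate_to_k(lam_36, f_36, lam_45, f_45, lam_k)
    ratio = f_k / f_extrap
    return 1.0 / factor <= ratio <= factor
```

In this sketch the brightness condition (FK larger by a factor >10 than that of any other detected source) is assumed to have been checked beforehand, since it depends on the full list of detections around each R08 object.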

Appendix B: Convex quadratic fits

Fig. B.1

Examples of UKIDSS sources producing a convex quadratic fit, but still satisfying the SED matching criteria of Eq. (2). Left: UKIDSS source with a flat NIR SED (and therefore most likely not being a counterpart of the corresponding R08 object) and high convexity. Right: reddened UKIDSS source that is probably the dominant counterpart of the R08 object; the slight convexity of its quadratic fit is probably produced by uncertainties or variability on the measured fluxes. The values of R and the quadratic coefficient a for each source are indicated at the lower right corner of each panel.

In this work, we have used the agreement, quantified as the parameter ⟨R⟩, between the spline representation of the combined NIR-MIR SED and the quadratic function fitted over the four middle points of the SED defining the NIR-MIR transition, as one of the main indicators to distinguish the dominant UKIDSS counterparts of the R08 objects. Nevertheless, in a few cases, UKIDSS sources with a flat or blue NIR SED, which are therefore most likely not counterparts of an R08 object, still produce a quadratic fit that is consistent with the derived spline curve and could fall within the good SED match criteria of Eq. (2). In such a case (see left panel of Fig. B.1 for an example), the overall SED does not follow the expected concave shape of a YSO or AGB star, and the fitted quadratic curve is convex, i.e., the coefficient a of the function f(x) = ax^2 + bx + c is greater than zero.
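
As a rough illustration of the convexity test, the following Python sketch fits f(x) = ax^2 + bx + c by least squares to a set of SED points and checks the sign of a. It is only a sketch under assumptions: the helper names are hypothetical, the fit is assumed to be done on (log10 λ, log10 Fν) values supplied by the caller, and the normal equations are solved with a small Gauss-Jordan elimination rather than with whatever fitting routine was actually used.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of f(x) = a*x**2 + b*x + c; returns (a, b, c).
    Requires at least three distinct x values."""
    s = [sum(x ** k for x in xs) for k in range(5)]          # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix of the normal equations for the unknowns (a, b, c).
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c

def is_convex(a):
    """A quadratic fit is convex when its leading coefficient is positive."""
    return a > 0
```

For the four middle points (H, K, 3.6 μm, and 4.5 μm), `xs` would hold the log10 wavelengths and `ys` the log10 fluxes; a source with `is_convex(a)` true would then be checked against the additional limits on ⟨R⟩ and a discussed in this appendix.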

Fig. B.2

Mean spline to quadratic function ratio R vs. the coefficient a of the quadratic function for all UKIDSS sources satisfying the SED matching criteria of Eq. (2) and with a > 0. The sources are color-coded according to their visual evaluation of good (pale sky blue) or bad (red) match with the MIR SED. The dashed lines indicate the limits on R and a defining the conditions of Eq. (B.1) to distinguish the unambiguous sources from the ambiguous sources.

We could have simply reassigned all sources satisfying the conditions of Eq. (2) and with a > 0 to the set of sources not having a good SED match. However, there are some UKIDSS sources matching the MIR SED that are reddened and still produce a slightly convex quadratic fit (as in the example in the right panel of Fig. B.1), which is probably due to uncertainties or variability in the measured fluxes. Since the number of sources satisfying the SED matching criteria and with convex quadratic fits is relatively low (246 sources out of a total of 4958 UKIDSS sources matching the MIR SED), the SEDs of these sources were visually inspected and evaluated as good or bad matches, which is similar to the procedure for the validation sample (Sect. 4.2). In Fig. B.2, we plot the parameter ⟨R⟩ against the quadratic coefficient a for this sample of 246 sources, color-coded according to their evaluation after the visual inspection. We found that sources within the region defined by the conditions of Eq. (B.1) can reliably be kept as having good SED matches, whereas sources outside that area can have either good or bad SED matches. During the visual inspection process, we also noticed that sources outside the region defined by Eq. (B.1) were harder to evaluate. We then decided to consider all sources with a > 0 and satisfying the conditions of Eq. (2) but not those of Eq. (B.1) as ambiguous sources, which follow the standard decision rule of Eq. (A.1).

We found that the convexity of the quadratic fit was useful to identify what kind of sources can be misclassified as bad SED matches by the SED matching criteria, and in this way we were able to improve the random reassignment of sources in the MC simulations (see Sect. C.1), carried out to estimate the uncertainties on the statistics. For simplicity, we call the sources with a > 0 and not satisfying Eq. (B.1) extremely convex sources, regardless of whether they satisfy the SED matching criteria or not. All 11 sources from the validation sample with Δθ > 0.57″ and ⟨R⟩ ≤ 1.3 that were visually evaluated as good SED matches have a < 0 (10 sources), or have a > 0 and satisfy the conditions of Eq. (B.1) (1 source). Therefore, the proportion of extremely convex sources within the whole sample of misclassified sources with Δθ > 0.57″ and ⟨R⟩ ≤ 1.3 is low, and probably comparable to the proportion fc of those sources within the set satisfying Eq. (2). We thus set the proportion of extremely convex sources within the reassigned sources with Δθ > 0.57″ and ⟨R⟩ ≤ 1.3 in each MC simulation to a value close to fc, following an analogous method (based on the binomial distribution) to the one used to set the number of reassigned sources (see Sect. C.1).

Appendix C: Details of SED matching statistics

Appendix C.1: Monte Carlo simulations of contamination

As described in Sect. 4.2, the SED matching criteria defined by Eq. (2) are affected by some contamination, which introduces uncertainties on the classification of R08 objects presented in Sect. 4.3. We estimated these uncertainties by running 10^5 MC simulations that included the presence of contaminants.

All UKIDSS sources analyzed by the SED matching method were initially divided into good or bad SED matches using the conditions of Eq. (2). Then, in each MC simulation, we randomly reassigned a certain proportion of sources satisfying Eq. (2) to the set of sources with a bad SED match; in the same way, a proportion of sources with Δθ > 0.57″ and ⟨R⟩ ≤ 1.3 were reassigned to the set of sources with a good SED match. We assumed no contamination for bad SED matches defined by ⟨R⟩ > 1.3, as we found for the validation sample in Sect. 4.2.

For simplicity, we estimated the probability that a source is a contaminant by assuming a Bernoulli trial. Hence, the specific number of reassigned sources for each case was drawn from a binomial distribution B(n,p), where the parameters n and p are the total number of UKIDSS sources that initially fall in each set and the corresponding contamination ratio, respectively. For the validation sample, we generically refer to the total number of sources in each set and the number of misclassified sources as n0 and k0, respectively, so that n0 = 199 and k0 = 6 for the set of initially good SED matches defined by the conditions of Eq. (2), and n0 = 47 and k0 = 11 for the set of initially bad SED matches defined by Δθ > 0.57″ and ⟨R⟩ ≤ 1.3. Given that the validation sample is just a randomly selected subsample, the corresponding number of misclassified sources does not determine the true underlying contamination ratio of the whole set, but defines a probability distribution from which we can draw the contamination ratio p used in each MC simulation. If we assume that the sample space of the validation sample in each set consists of subsets of exactly n0 sources, it can be shown, using Bayes' theorem, that the resulting distribution function for p given k0 is

$$P(p \mid k_0) = \frac{(n_0+1)!}{k_0!\,(n_0-k_0)!}\; p^{k_0}\,(1-p)^{n_0-k_0}, \tag{C.1}$$

which is equivalent to the beta distribution Beta(α,β) with parameters α = k0 + 1 and β = n0 − k0 + 1.

Table C.1

Statistics for special cases.

In summary, each MC simulation reassigned a certain number k of randomly chosen sources from each set, where k had been drawn from the binomial distribution B(n,p), and p had been previously drawn from the distribution of Eq. (C.1). The R08 objects were then classified into the categories defined in Sect. 4.3, using the resulting separation of the UKIDSS sources into good or bad SED matches in each MC simulation. For consistency, we excluded from the reassignment sources from the validation sample that were visually considered correctly classified and those with a convex quadratic fit that satisfy the criteria of reliable sources (see Appendix B), whereas sources from the validation sample found to be misclassified were always reassigned; however, given the relatively low number of sources in these samples, this probably has at most a minor effect on the statistics. In addition, most of the reassigned sources from the set of initially bad SED matches were forced to have concave quadratic fits to resemble the corresponding misclassified sources from the validation sample (see Appendix B for details).
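
The two-stage draw summarized above (p from the beta distribution, then k from the binomial distribution) can be sketched in Python as follows. This is only an illustrative sketch using the standard library: the function names and the list of source identifiers are hypothetical, and the special treatment of validation-sample and convex sources described above is deliberately left out.

```python
import random

def draw_contamination_ratio(n0, k0, rng):
    """Draw p from Beta(k0 + 1, n0 - k0 + 1), the distribution of the
    contamination ratio implied by k0 misclassified sources out of n0."""
    return rng.betavariate(k0 + 1, n0 - k0 + 1)

def draw_reassigned_count(n, p, rng):
    """Draw k from the binomial distribution B(n, p) as the number of
    successes in n independent Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def simulate_reassignment(source_ids, n0, k0, rng):
    """One MC realization for one set: returns the drawn ratio p and
    the randomly chosen sources to be moved to the other set."""
    p = draw_contamination_ratio(n0, k0, rng)
    k = draw_reassigned_count(len(source_ids), p, rng)
    return p, rng.sample(source_ids, k)

# Example with the validation-sample values for the initially good
# SED matches (n0 = 199, k0 = 6) and a hypothetical list of 1000 ids.
rng = random.Random(42)
p, moved = simulate_reassignment(list(range(1000)), 199, 6, rng)
```

Repeating `simulate_reassignment` for both sets in every simulation, and reclassifying the R08 objects each time, would give the dispersion of the statistics; passing an explicit `random.Random` instance keeps the sketch reproducible.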

Appendix C.2: Sampling error on the proportion of YSOs and AGB stars

In Sect. 4.3, we also divided the sample into YSOs and AGB stars using the criteria by R08 to identify whether one of the two populations is more important within a certain category with respect to the others in the classification defined in Table 1. There are two sources of uncertainty on the proportion of YSOs or AGB stars within each category. The first source of uncertainty is the dispersion derived from the MC simulations, produced by the variation of both the total number of objects in a category and its corresponding number of YSOs or AGB stars. The second error arises from the fact that each category is a subsample of the whole set, and therefore the proportions of YSOs and AGB stars can statistically be slightly different from the original ones, even if the category is unbiased regarding the YSO-AGB star separation (i.e., equivalent to a random subsample). The total sample consists of M = 5355 R08 objects, which are divided into J_YSOs = 3825 YSOs and J_AGB = 1530 AGB stars; then, any random subsample of m objects contains j_YSOs YSOs and j_AGB AGB stars with expected values of j_i = m J_i/M, where i = {YSOs, AGB}, and a certain dispersion. This is a direct application of the hypergeometric probability distribution for the random variable j_i. The variance of the ratio j_i/m is then given by

$$\sigma^2\!\left(\frac{j_i}{m}\right) = \frac{1}{m}\,\frac{J_i}{M}\left(1-\frac{J_i}{M}\right)\frac{M-m}{M-1}. \tag{C.2}$$

The total uncertainty on the proportion of YSOs or AGB stars for a given category was computed adding in quadrature the dispersion from the MC simulations and σ(j_i/m) from Eq. (C.2), where we used the mean number of objects m = ⟨N⟩ of the category (see Table 1).
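
A short Python sketch of this error budget is given below. It uses the standard variance of the hypergeometric distribution for the sampling term and a quadrature sum for the total; the function names are hypothetical, and the subsample size and MC dispersion in the example are placeholder values, not numbers from Table 1.

```python
import math

def sampling_sigma(M, J, m):
    """Standard deviation of the fraction j/m for a random subsample of
    m objects drawn without replacement from M objects, J of which
    belong to the population of interest (hypergeometric distribution)."""
    frac = J / M
    variance = frac * (1.0 - frac) * (M - m) / (m * (M - 1))
    return math.sqrt(variance)

def total_uncertainty(sigma_mc, sigma_sampling):
    """Combine the MC dispersion and the sampling error in quadrature."""
    return math.hypot(sigma_mc, sigma_sampling)

# Totals quoted in the text: M = 5355 objects, 3825 YSOs, 1530 AGB stars.
M, J_YSO = 5355, 3825
m_example = 200          # placeholder subsample size
sigma_mc_example = 0.01  # placeholder MC dispersion of the proportion
sigma_s = sampling_sigma(M, J_YSO, m_example)
sigma_tot = total_uncertainty(sigma_mc_example, sigma_s)
```

Because the expression depends on J/M only through the product (J/M)(1 − J/M), the sampling error is the same for the YSO and AGB star fractions of a given category, and it vanishes when the subsample is the whole set (m = M).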

Appendix D: Special and ambiguous cases of SED match: Extended categories

We can identify some special lower quality cases of SED match or mismatch that are different from the main classification defined in Sect. 4.3, which might still be useful for the statistics of UKIDSS-single objects presented in Sect. 4.4. In Table C.1, we distinguish three situations, based on the decision rules detailed in Appendix A, that were previously grouped in the single category ambiguous/special cases of Table 1. The first row of Table C.1 counts all objects that have a bright ambiguous UKIDSS source (BA source), and were thus excluded from the SED matching statistics. Objects denoted here as KO10 represent cases of KO-2 or KO-U sources that are the brightest at K by a factor >10 and were evaluated by the secondary SED matching criteria explained in Appendix A, referred to as linear match. R08 objects with KO10 sources with linear match can therefore be considered as UKIDSS-single. Columns of Table C.1 have the same meaning as those of Table 1. The relatively more clustered environment of YSOs could explain the higher percentage of candidate YSOs in objects affected by bad photometry (objects with a BA source).

We constructed an extended sample to compute a new estimate of the proportion of UKIDSS-single objects with slightly improved statistics. In addition to the separation into UKIDSS-single and non-UKIDSS-single objects explained in Sect. 4.4, here we included the classes UM_SM_K10 and KO10 with linear match as possible UKIDSS-single objects, and correspondingly the category KO10 with no linear match as possible non-UKIDSS-single. In this case, we should exclude the UM_SM_K10 objects that had a low S/N or a BA source; however, in our sample this did not occur. All these categories are very rare, but we also considered here the R08 objects affected by peripheral saturation (see Sect. 3.3), since the PSF wings of nearby saturated stars should not severely affect the photometry of UKIDSS sources with S/N > 30 at K, which was required for our SED matching statistics. This extended sample comprises a total of ~1440 AGB star candidates and ~2950 YSO candidates, representing an increase of over 10% in the sample size. The proportions of UKIDSS-single objects are 92.0 ± 1.2% for candidate AGB stars and 87.2 ± 1.5% for candidate YSOs, which are identical within the uncertainties to the percentages estimated for the good-quality sample of Sect. 4.4.
