Mining the UKIDSS Galactic Plane Survey: star formation and embedded clusters⋆
O. Solin1,2, E. Ukkonen1 and L. Haikala3,2
1 University of Helsinki, Department of Computer Science, PO Box 68, 00014 University of Helsinki, Finland
2 University of Helsinki, Department of Physics, Division of Geophysics and Astronomy, PO Box 64, University of Helsinki, 00014 Helsinki, Finland
3 Finnish Centre for Astronomy with ESO, University of Turku, Väisäläntie 20, 21500 Piikkiö, Finland
Received: 28 November 2011
Accepted: 13 March 2012
Context. Data mining techniques must be developed and applied to analyse the large public databases containing hundreds to thousands of millions of entries.
Aims. We develop methods for locating previously unknown stellar clusters from the UKIDSS Galactic Plane Survey (GPS) catalogue data.
Methods. The cluster candidates are searched for computationally in pre-filtered catalogue data using a method that fits a mixture model of Gaussian densities and background noise with the expectation maximization algorithm. The catalogue data contain a significant number of false sources clustered around bright stars. A large fraction of these artefacts were automatically filtered out before or during the cluster search. The UKIDSS data reduction pipeline tends to classify marginally resolved stellar pairs and objects seen against variable surface brightness as extended objects (or “galaxies” in the archive parlance). Of the sources in the UKIDSS GPS catalogue brighter than 17 mag in the K band, 10%, or 66 × 10⁶, are classified as “galaxies”. Young embedded clusters create variable NIR surface brightness because the gas/dust clouds in which they were formed scatter the light from the cluster members. Such clusters therefore appear as clusters of “galaxies” in the catalogue and can be found using only a subset of the catalogue data. The detected “galaxy clusters” were finally screened visually to eliminate the remaining false detections due to data artefacts. Besides the embedded clusters, the search also located sites of non-clustered embedded star formation.
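To illustrate the kind of model named in the Methods paragraph, the following is a minimal sketch of expectation maximization for a single two-dimensional Gaussian plus a uniform background over a rectangular field. It is not the authors' implementation: the abstract does not specify the number of components, the covariance form, the background model or the initialisation, so all of these are assumptions, and the names `gaussian_pdf` and `em_gaussian_plus_background` are hypothetical. The starting position is taken from the densest cell of a coarse histogram, which is also an assumption made only to keep the example stable.

```python
import numpy as np


def gaussian_pdf(points, mean, cov):
    """Density of a 2-D Gaussian evaluated at each row of `points`."""
    diff = points - mean
    inv = np.linalg.inv(cov)
    mahal = np.einsum("ni,ij,nj->n", diff, inv, diff)
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * mahal) / norm


def em_gaussian_plus_background(points, bounds, n_iter=100, bins=10):
    """Fit one Gaussian plus uniform background by EM (illustrative only).

    `bounds` is ((xmin, xmax), (ymin, ymax)) of the searched field.
    Returns the cluster weight, mean, covariance and the per-source
    probability of belonging to the Gaussian component.
    """
    (xmin, xmax), (ymin, ymax) = bounds
    n = len(points)
    # Uniform background density over the field (assumed noise model).
    bg_density = 1.0 / ((xmax - xmin) * (ymax - ymin))

    # Assumed initialisation: centre of the densest histogram cell.
    hist, xe, ye = np.histogram2d(points[:, 0], points[:, 1],
                                  bins=bins, range=bounds)
    ix, iy = np.unravel_index(np.argmax(hist), hist.shape)
    mean = np.array([0.5 * (xe[ix] + xe[ix + 1]),
                     0.5 * (ye[iy] + ye[iy + 1])])
    cov = np.diag([(xe[1] - xe[0]) ** 2, (ye[1] - ye[0]) ** 2])
    weight = 0.5

    for _ in range(n_iter):
        # E-step: probability that each source belongs to the cluster.
        p_cluster = weight * gaussian_pdf(points, mean, cov)
        p_background = (1.0 - weight) * bg_density
        resp = p_cluster / (p_cluster + p_background)

        # M-step: update weight, mean and covariance of the Gaussian.
        total = resp.sum()
        weight = total / n
        mean = resp @ points / total
        diff = points - mean
        cov = (resp[:, None] * diff).T @ diff / total
        cov += 1e-9 * np.eye(2)  # small regularisation against collapse

    return weight, mean, cov, resp


# Synthetic field: 300 uniformly scattered sources plus a compact
# group of 100 sources (made-up numbers, not survey data).
rng = np.random.default_rng(0)
background = rng.uniform(0.0, 1.0, size=(300, 2))
cluster = rng.normal([0.63, 0.37], 0.02, size=(100, 2))
sources = np.vstack([background, cluster])

weight, mean, cov, resp = em_gaussian_plus_background(
    sources, ((0.0, 1.0), (0.0, 1.0)))
print("cluster fraction:", weight)
print("cluster centre:", mean)
```

The per-source membership probabilities in `resp` could be thresholded to list candidate members. A search over a real catalogue would presumably need several components and a criterion for accepting or rejecting each fitted Gaussian, neither of which is covered by this sketch.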
Results. The search covered an area of 1302 deg², and 137 previously unknown cluster candidates and 30 previously unknown sites of star formation were found.
Key words: open clusters and associations: general / methods: statistical / catalogs / surveys / infrared: stars
Appendices A–C are available in electronic form at http://www.anda.org
© ESO, 2012