A robust morphological classification of high-redshift galaxies using support vector machines on seeing limited images*
I. Method description
1 LESIA-Paris Observatory, 5 place Jules Janssen, 92195 Meudon, France e-mail: firstname.lastname@example.org
2 IAA-C/ Camino Bajo de Huétor, 50, 18008 Granada, Spain
3 LAM, Marseille Observatory, Traverse du Siphon, Les trois Lucs, BP 8, 13376 Marseille Cedex 12, France
4 Laboratoire d'Astrophysique de Toulouse-Tarbes, CNRS-UMR 5572 and Université Paul Sabatier Toulouse III, 14 avenue Belin, 31400 Toulouse, France
Accepted: 7 November 2007
Context. Morphology is the most accessible tracer of the physical structure of galaxies, but its interpretation in the framework of galaxy evolution remains a problem. Its dependence on wavelength makes it difficult to compare local and high-redshift populations. Furthermore, because the quality of the measured morphology depends strongly on image resolution, comparing different surveys is also problematic.
Aims. We present a new non-parametric method to quantify the morphologies of galaxies, based on a particular family of learning machines called support vector machines. The method can be seen as a generalization of the classical C/A classification, but with an unlimited number of dimensions and non-linear boundaries between decision regions. It is fully automated and thus particularly well adapted to large cosmological surveys. The source code is available for download at http://www.lesia.obspm.fr/~huertas/galsvm.html
Methods. To test the method, we use a seeing-limited near-infrared (Ks band, 2.16 μm) sample observed with WIRCam at CFHT at a median redshift of z ~ 0.8. The machine is trained with a simulated sample built from a local, visually classified sample from the SDSS, chosen in the high-redshift sample's rest-frame (i band, 0.77 μm) and artificially redshifted to match the observing conditions. We use a 12-dimensional volume that includes 5 morphological parameters and other characteristics of galaxies such as luminosity and redshift. A fraction of the simulated sample is used to test the machine and assess its accuracy.
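The train-and-test protocol described above can be illustrated with a minimal sketch. It is not the galSVM code: the 12 features are synthetic Gaussian clouds standing in for the simulated catalogue, and the solver (a kernelized Pegasos stochastic sub-gradient method with an RBF kernel), the function names, and every parameter value (`lam`, `gamma`, `n_iter`, the 150/50 split) are assumptions chosen for illustration only.

```python
import math
import random

N_DIM = 12  # echoes the 12-dimensional volume; the features themselves are synthetic


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def make_toy_sample(n, rng):
    """Hypothetical stand-in for the simulated sample: two Gaussian clouds."""
    features, labels = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1  # +1 = early type, -1 = late type
        features.append([rng.gauss(float(label), 1.0) for _ in range(N_DIM)])
        labels.append(label)
    return features, labels


def train_kernel_svm(features, labels, lam, gamma, n_iter, rng):
    """Kernelized Pegasos: stochastic sub-gradient descent on the hinge-loss SVM objective.

    Returns one integer weight per training point (non-zero = support vector).
    """
    n = len(features)
    gram = [[rbf_kernel(features[i], features[j], gamma) for j in range(n)]
            for i in range(n)]
    alpha = [0] * n
    for t in range(1, n_iter + 1):
        i = rng.randrange(n)
        score = sum(alpha[j] * labels[j] * gram[i][j] for j in range(n)) / (lam * t)
        if labels[i] * score < 1.0:  # margin violated: strengthen this point
            alpha[i] += 1
    return alpha


def predict(x, features, labels, alpha, gamma):
    """Sign of the kernel expansion over the support vectors."""
    score = sum(a * y * rbf_kernel(sv, x, gamma)
                for a, y, sv in zip(alpha, labels, features) if a)
    return 1 if score >= 0 else -1


rng = random.Random(0)
all_x, all_y = make_toy_sample(200, rng)

# Hold out a fraction of the sample to test the machine.
order = list(range(len(all_x)))
rng.shuffle(order)
train_idx, test_idx = order[:150], order[150:]
train_x = [all_x[i] for i in train_idx]
train_y = [all_y[i] for i in train_idx]
test_x = [all_x[i] for i in test_idx]
test_y = [all_y[i] for i in test_idx]

alpha = train_kernel_svm(train_x, train_y, lam=0.01, gamma=1.0 / N_DIM,
                         n_iter=1000, rng=rng)
n_wrong = sum(1 for x, y in zip(test_x, test_y)
              if predict(x, train_x, train_y, alpha, 1.0 / N_DIM) != y)
test_error = n_wrong / len(test_y)
print(f"held-out error fraction: {test_error:.2f}")
```

In practice one would more likely rely on an established SVM library and tune the kernel parameters by cross-validation; the hand-written solver here only keeps the sketch free of external dependencies.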
Results. We show that a qualitative separation into two main morphological types (late type and early type) can be obtained with an error lower than 20% up to the completeness limit of the sample (K_AB ~ 22), which is more than 2 times better than what would be obtained with a classical C/A classification on the same sample, and indeed comparable to space data. The method is optimized to solve a specific problem, offering an objective and automated estimate of errors that enables a straightforward comparison with other surveys. Selecting the training sample in the high-redshift sample's rest-frame makes the results free from wavelength-dependent effects and hence makes their interpretation in terms of evolution easier.
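One way such an automated error estimate could be tabulated is as the misclassified fraction of the test sample per magnitude bin. The sketch below assumes this simple definition; the function name, the bin edges, and all catalogue values are made up for illustration and are not results from the paper.

```python
def error_by_magnitude(magnitudes, true_labels, predicted_labels, bin_edges):
    """Misclassified fraction in each [lo, hi) magnitude bin (None if the bin is empty)."""
    errors = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = [i for i, m in enumerate(magnitudes) if lo <= m < hi]
        if not in_bin:
            errors.append(None)
            continue
        wrong = sum(1 for i in in_bin if true_labels[i] != predicted_labels[i])
        errors.append(wrong / len(in_bin))
    return errors


# Toy test set: invented magnitudes and labels, for illustration only.
mags = [19.2, 19.8, 20.4, 20.9, 21.1, 21.3, 21.5, 21.7, 21.9]
truth = ["early", "late", "late", "early", "late", "late", "early", "early", "late"]
pred = ["early", "late", "late", "early", "late", "late", "early", "late", "late"]

bin_errors = error_by_magnitude(mags, truth, pred, [19.0, 20.0, 21.0, 22.0])
print(bin_errors)  # [0.0, 0.0, 0.2]
```

Reporting the error as a function of magnitude rather than as a single number makes it easier to see where, on the way to the completeness limit, the classification starts to degrade.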
Key words: galaxies: fundamental parameters / galaxies: high-redshift / methods: data analysis
© ESO, 2008