
1 Introduction

Sky surveys are designed to provide statistical samples of astronomical objects, aiming for spatial overview, completeness and homogeneous datasets. Mostly they serve as a database for rather general conclusions about abundant objects, but another attractive role is enabling searches for rare and unusual objects. For both purposes, it is very useful to predict rather precisely the appearance of the different known types of objects. The object types can then be discriminated successfully, which allows the information content of the survey to be extracted. Also, unusual objects can be found because they are inconsistent with all known sorts of objects, although they might just as well hide among the bulk of normal objects by mimicking their appearance.

In this picture, we of course want a survey to perform as reliably and as accurately as possible in measuring object characteristics like class, redshift or physical parameters. Since surveys typically aim for large samples upon which future detailed work is based, their results are often not extremely reliable and accurate for any given single object. But for a statistical analysis of large samples, we can usually do without perfect accuracy in the measurement of features, and we can also accept occasional misclassifications.

In astronomical surveys pointing off the galactic plane, the obvious classes to start out with are stars, galaxies, quasars and strange objects. These can be further differentiated into subclasses, based on physical characteristics derived from their morphology or spectral energy distribution (SED). Therefore, morphology and color or prominent spectral features are the typical observational criteria applied to survey data for classifying the objects they contain.

Presently, surveys concentrate mostly on either imaging or spectroscopy. While spectroscopic surveys deliver a potentially high spectral resolution, they are expensive in terms of telescope time. Imaging multi-color surveys can take exposures through a number of filters consecutively, and deliver morphological information and crude spectral information for all objects contained in the field of view.

Since the subject of this paper is the spectral information in multi-color surveys, we mention morphological information only briefly. Morphology is of limited use for classifying objects into stars, galaxies and quasars. Objects observed as clearly extended are certainly not single stars, but the smaller ones could be galaxies, low-luminosity quasars, or chance projections of more than one object. Objects consistent with point sources can be stars, compact galaxies or quasars. Also, the morphological differentiation depends on the seeing conditions and typically does not reach the survey limits set by the photometry.

The power of spectral classification in a multi-color survey depends both on the filter set used and on the depth of the imaging, where the optimum choices are determined by the goal of the survey. If a survey aims at identifying only one type of object with characteristic colors, a tailored filter set can be designed. For example, when looking exclusively for U-band dropouts (Steidel et al. 1995), the UGR filter set is certainly a very good choice. The performance of such a dropout survey depends mostly on the depth reached in the U-band, so the photon flux detection limit in U is the key figure. Likewise, number count studies are limited by the completeness limit in the filter concerned. Quasar searches are very often carried out with color excess rules (Hazard 1990), where the limit is given by the flux errors combined from two or three filters. For example, the evolution of quasars between redshift 0 and 2.2 was established using the UV excess method (Schmidt & Green 1983; Boyle et al. 1988). At higher redshift, quasars display rather star-like broad-band colors, motivating more advanced approaches like the selection of outliers in an n-dimensional color space (Warren et al. 1991).
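
As a rough illustration of how the flux errors from several filters combine in a color-based selection rule, assume independent Gaussian flux errors $\sigma_{F_i}$ on fluxes $F_i$ measured in two filters (an assumption made here for illustration, not a statement taken from the works cited). To first order in the relative flux error, the error of a color index $c = m_1 - m_2$ follows from standard error propagation:

$$\sigma_{m_i} = \frac{2.5}{\ln 10}\,\frac{\sigma_{F_i}}{F_i} \approx 1.086\,\frac{\sigma_{F_i}}{F_i}, \qquad \sigma_c = \sqrt{\sigma_{m_1}^2 + \sigma_{m_2}^2}.$$

Under these assumptions, two 10-$\sigma$ detections each carry $\sigma_m \approx 0.11$ mag, so the color is known to about $\sigma_c \approx 0.15$ mag. A color excess criterion can therefore only be applied reliably where the excess it looks for is clearly larger than this combined error.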

If we now intend to focus different survey programs on a common patch of sky to maximise synergy effects from the various efforts, then we might as well combine the individual surveys into one that identifies every object, and avoid duplicated work. We then have to ask for a filter set that allows virtually every object above some magnitude limit to be identified unambiguously. In this case, the key number for the performance is the magnitude limit for a successful classification as needed for various science applications. If the classification takes all available color data into account, as template fitting procedures do, then the flux limit of a single filter is not the only relevant number, since the performance will depend to a large extent on the filter choice. This also applies to the estimation of multi-color redshifts, an idea dating back to Baum (1962), who used nine-band photoelectric data to estimate the redshifts of galaxy clusters.
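
To make concrete what it means for a classification to take all available color data into account, the following is a minimal sketch of a generic chi-square template fit. It is not the scheme of this paper, which is presented in Sect. 2. The function name `best_template`, the array layout and the toy numbers are assumptions chosen for illustration only.

```python
import numpy as np

def best_template(fluxes, errors, templates):
    """Return (index, chi2) of the template that best fits the observed fluxes.

    fluxes, errors: 1-D sequences with one entry per filter.
    templates: 2-D array with one row of model fluxes per template.
    (Hypothetical helper for illustration; not the algorithm of Sect. 2.)
    """
    fluxes = np.asarray(fluxes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    templates = np.asarray(templates, dtype=float)

    # Inverse-variance weight of each filter.
    weights = 1.0 / errors**2

    # Each template has a free amplitude; for a linear model the
    # chi-square-minimising amplitude has a closed form.
    scale = (templates * fluxes * weights).sum(axis=1) / (
        templates**2 * weights
    ).sum(axis=1)

    # Chi-square of every scaled template, summed over all filters.
    residuals = fluxes - scale[:, None] * templates
    chi2 = (residuals**2 * weights).sum(axis=1)

    best = int(np.argmin(chi2))
    return best, float(chi2[best])


# Toy example: three templates in three filters; the "observation"
# is the second template scaled by a factor of two.
toy_templates = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [1.0, 1.0, 1.0]]
index, chi2 = best_template([6.0, 4.0, 2.0], [0.1, 0.1, 0.1], toy_templates)
```

Because every filter enters the sum with its own weight, the outcome depends on the whole filter set and on all flux errors together, not on the depth of any single band.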

Most multi-color surveys conducted to date obtained spectral information via broad-band photometry. They have been used, for example, to search for quasars or for high-redshift galaxies. However, they always needed follow-up spectroscopy to clarify the true nature of the candidates and to measure their redshift. The Sloan Digital Sky Survey (York et al. 2000) is now the most ambitious project to provide a broad-band color database, on which the astronomical community might perform a large number of "virtual surveys".

So far, only very few survey projects make extensive use of medium-band and narrow-band photometry, e.g. the Calar Alto Deep Imaging Survey (CADIS; Meisenheimer et al. 1998). Surveys like CADIS, with typically 10 to 20 filters, sample the visual spectrum with a resolution comparable to that of low-resolution imaging spectroscopy. CADIS fostered the development of a scheme for spectral classification that distinguishes stars, galaxies, quasars and strange objects. Simultaneously, it assigns multi-color redshifts to extragalactic objects.

Using 162 spectroscopic identifications, Wolf et al. (2001, henceforth Paper II) have shown that this scheme is reliable for virtually all objects above the 10-$\sigma$ limits of the CADIS survey. Also, the photometric redshifts are accurate enough ($\sigma_z \approx 0.03$ for galaxies and $\sigma_z \approx 0.1$ for quasars around the 10-$\sigma$ limit) that follow-up spectroscopy is not needed for a number of analyses, e.g. the derivation of galaxy luminosity functions (Fried et al. 2000).

Originally developed for CADIS, this algorithm is now used for classification in additional projects. It provides multi-color redshifts in lensing studies of the cluster Abell 1689 (Dye et al. 2000), which aim at determining the cluster mass after identifying cluster members and weakly lensed background objects. It is also employed in an ongoing wide-field survey to search for high-redshift quasars, to provide multi-color redshifts for galaxy-galaxy lensing studies, to search for high-redshift galaxy clusters and to perform a census of L* galaxies at $z \approx 1$ (Wolf et al. 2000).

The purpose of this paper is to present our classification scheme and to discuss how its use can be optimized when designing survey strategies. The statistical algorithm for the scheme is presented in Sect. 2 and our choice of template libraries is detailed in Sect. 3. In Sect. 4 we report on simulations of a few competing filter sets and their expected classification performance. We include an analytic discussion of the comparison of filter sets and conclude that medium-band surveys are altogether more powerful, even when limited by available telescope time. Section 5 outlines a few real datasets using this classification and draws conclusions about the expected performance. Paper II presents the real CADIS data on which we gained experience during the development of the scheme, and shows that the conclusions from the simulations compare well with the real dataset.
