A&A 384, 81-98 (2002)
E. Hatziminaoglou 1 - M. A. T. Groenewegen 1 - L. da Costa 1 - S. Arnouts 1 - C. Benoist 2 - R. Madejsky 1,3 - R. P. Mignani 1 - L. F. Olsen 4 - C. Rité 1,5 - M. Schirmer 6 - R. Slijkhuis 1 - B. Vandame 1
1 - European Southern Observatory, Karl-Schwarzschild-Str. 2, 85748 Garching b. München, Germany
2 - Observatoire de la Côte d'Azur, BP 229, 06304 Nice Cedex 4, France
3 - Universidade Estadual de Feira de Santana, Campus Universitário, Feira de Santana, BA, Brazil
4 - Astronomical Observatory, Juliane Maries Vej 30, 2100 Copenhagen, Denmark
5 - Observatório Nacional, Rua Gen. José Cristino 77, Rio de Janeiro R.J., Brazil
6 - Max-Planck-Institut für Astrophysik, Karl-Schwarzschild-Str. 1, 85748 Garching b. München, Germany
Received 16 August 2001 / Accepted 20 December 2001
This paper describes the methodology currently being implemented in the EIS pipeline for analysing optical/infrared multi-colour data. The aim is to identify different classes of objects as well as possible undesirable features associated with the construction of colour catalogues. The classification method used is based on the $\chi^2$-fitting of template spectra to the observed SEDs, as measured through broad-band filters. Its main advantage is the simultaneous use of all colours, properly weighted by the photometric errors. In addition, it provides basic information on the properties of the classified objects (e.g. redshift, effective temperature). These characteristics make the $\chi^2$-technique ideal for handling large multi-band datasets. The results are compared to the more traditional colour-colour selection and, whenever possible, to model predictions. In order to identify objects with odd colours, associated either with rare populations or with possible problems in the catalogue, outliers are searched for in the multi-dimensional colour space using a nearest-neighbour criterion. Outliers with large $\chi^2$-values are individually inspected to further investigate their nature. The tools developed are used for a preliminary analysis of the multi-colour point source catalogue constructed from the optical/infrared imaging data obtained for the Chandra Deep Field South (CDF-S). These data are publicly available, representing the first installment of the ongoing EIS Deep Public Survey.
Key words: surveys - quasars - stars: white dwarves - stars: low mass, brown dwarves
The ESO Imaging Survey (EIS) is an ongoing project to carry out public imaging surveys in support of VLT. Its primary goal is to provide multi-wavelength data sets from which samples comprising different types of extragalactic and galactic objects can be extracted for follow-up spectroscopic observations. So far the surveys conducted have used different instrument/telescope setups to carry out moderately deep observations of large areas, deep optical/infrared observations of high-galactic latitude fields, contiguous areas of the SMC and LMC and selected stellar fields including open clusters, and globular clusters (for more details see da Costa 2001). Altogether over 50 square degrees of the southern sky have already been surveyed, albeit using different filter combinations and reaching different magnitudes (see the EIS web page at http://www.hq.eso.org/science/eis/).
The ultimate success of these surveys will, to a large extent, depend on the ability to reliably identify different classes of objects and to extract well-defined samples for spectroscopic follow-up observations. While colour selection is nothing new and several methods have been devised and applied in the past, the demands of modern, wide-area surveys involving large numbers of objects and passbands are relatively new and must be properly addressed. Therefore, to fully achieve the scientific goals of EIS a detailed understanding of the distribution of objects in colour space is required. This is necessary not only for the selection of spectroscopic targets but also as a verification of the colour catalogues being routinely produced by the survey pipeline.
An ideal way of tackling this problem is to combine intrinsic (e.g. spectral properties) and statistical information (i.e. spatial distribution, luminosity and mass functions, evolution) regarding different classes of known objects. The nature of objects, as characterised by their spectral properties, can be assessed by comparing the measurements obtained from the multi-colour photometry with those estimated using template spectra describing different types of objects. To take into account the statistical properties of a given population requires detailed simulations of the stellar population of our Galaxy as well as the extragalactic populations. These simulations must also satisfy observational constraints, such as sky position, completeness, photometric errors, and morphological classification. In principle, combining these two independent methods should lead to a further improvement of the classification of objects extracted from colour catalogues. As a first step towards this goal, this paper discusses the classification of the objects based exclusively on their spectral properties as derived from multi-colour observations. As a practical illustration, this analysis is applied to the multi-colour point source catalogues extracted from the recently completed optical/infrared data of the Chandra Deep Field South (CDF-S) by the EIS Deep Public Survey (DPS; Vandame et al. 2001; Arnouts et al. 2001a).
The UBVRI optical data cover an area of 0.25 square degrees. These data are complemented by near-infrared observations over 0.1 square degrees located at the centre of the area covered by the optical data. While the angular coverage is relatively small, this is the first complete data set of this survey, which at the end will cover a total area of 3 square degrees, corresponding to 12 times the data presented here. Therefore the results presented in this paper provide a first assessment of the likely outcome of this survey once completed. Using the CDF-S data as a benchmark is particularly interesting considering the large number of imaging and spectroscopic observations planned for this region, in addition to the already publicly available deep X-ray observations of Chandra (Rosati et al. 2001). These observations will provide an unprecedented multi-wavelength data set that should certainly help refine the classification algorithms being developed.
In Sect. 2 the data as well as the method employed in the construction of the point source catalogue are briefly reviewed. Section 3 presents the methods used to classify objects based on their colour properties and to search for objects located in poorly populated regions of the colour space. These methods are applied to the CDF-S optical and optical/infrared data and the results are discussed and compared to other methods in Sect. 4. In this section tables listing different types of objects are also presented. In Sect. 5 an assessment of the results is carried out by visually inspecting image cutouts and examining the photometric measurements of individual objects to evaluate the performance of catalogue production and target selection procedures. General conclusions and a discussion of the main results are presented in Sect. 6. Finally, in Sect. 7 a brief summary is presented.
The first step in target selection from multi-colour data is the construction of multi-colour catalogues. Such catalogues can be produced either by association of objects listed in the single passband catalogues or by building a $\chi^2$ image (Szalay et al. 1999) combining the available images in the different passbands. The reference image is used to detect objects, while their photometric measurements are carried out in the individual single passband images. For the work presented here the first technique has been applied. The $\chi^2$ image will be used in future work where a detailed discussion of the pros and cons of the two methods will be presented.
The input to the catalogue association are the single passband catalogues (Vandame et al. 2001; Arnouts et al. 2001a) cut at S/N=2, slightly lower than the publicly released catalogues, which were cut at S/N=3. Furthermore, in the present work only objects inside a trimmed region and with appropriate SExtractor flags are considered. Finally, the U-band catalogue used here was extracted from an image produced by stacking all the available U/38 and U350/60 images, in order to reach a fainter limiting magnitude. A more detailed discussion of this catalogue and of its photometric calibration will be presented in Arnouts et al. (2001b).
After the association, only objects that are in the area common to all catalogues and outside all the masks placed around saturated objects and bright stars are kept. This ensures that all objects could have been detected in all passbands. An object is included in the final colour catalogue if it is detected with a in at least one passband. This was done as a compromise between completeness, to avoid pruning possibly interesting objects before looking at the data, and the number of spurious detections. If an object is not detected in a particular passband, its magnitude is set to the corresponding $3\sigma$ limit. Note that the $3\sigma$ limiting magnitudes (throughout this work all magnitudes are in the Vega system) of the colour catalogues are: U=25.7, B=27.4, V=26.0, R=25.9, I=24.7, J=23.3, and .
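The substitution of limiting magnitudes for non-detections can be sketched as follows. This is a minimal illustration, not the EIS pipeline code: the function name `apply_upper_limits`, the use of `None` to mark a non-detection and the returned `(magnitude, detected)` pairs are assumptions made here. Only the six limiting magnitudes quoted above are included; the near-infrared K-band limit is not reproduced in the text and is therefore left out.

```python
# 3-sigma limiting magnitudes (Vega system) as quoted in the text.
# The K-band limit is not given above, so it is deliberately omitted.
LIMITS_3SIGMA = {"U": 25.7, "B": 27.4, "V": 26.0, "R": 25.9, "I": 24.7, "J": 23.3}


def apply_upper_limits(measured, limits=LIMITS_3SIGMA):
    """Return {band: (magnitude, detected)} for every band in `limits`.

    `measured` maps band name to a magnitude, or to None (or is missing
    the band entirely) when the object is not detected in that band.
    Non-detections are assigned the limiting magnitude and flagged
    detected=False so that later steps can treat them as upper limits.
    """
    filled = {}
    for band, limit in limits.items():
        mag = measured.get(band)
        if mag is None:
            filled[band] = (limit, False)
        else:
            filled[band] = (mag, True)
    return filled


# Hypothetical object detected only in R and I.
example = apply_upper_limits({"R": 23.1, "I": 22.4, "V": None})
n_detected = sum(1 for _, detected in example.values() if detected)
```

Keeping the detection flag alongside the magnitude makes it straightforward to select, for instance, only the objects detected in at least three bands.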
In the single passband catalogue, point and extended sources are separated using the SExtractor CLASS_STAR flag, up to the morphological classification limit. The point/extended source classification in a colour catalogue is not trivial and the following scheme is adopted. As the point/extended source classification works best for bright objects, the passband utilised for the classification is the one where the object is brightest with respect to the classification limit in that filter (if any). If the object is classified as a point source in this filter, it is considered a point source in the colour catalogue. This procedure is valid as long as the seeing on the images obtained in different passbands is comparable, which is the case here.
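The scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the dictionary inputs and in particular the stellarity threshold of 0.9 are hypothetical and are not taken from the paper; the classification limits per band would have to be supplied from the single passband catalogues.

```python
def classify_point_source(mags, class_star, class_limits, star_threshold=0.9):
    """Decide whether an object is a point source in the colour catalogue.

    mags          -- {band: magnitude} for the bands where the object is detected
    class_star    -- {band: SExtractor CLASS_STAR value in that band}
    class_limits  -- {band: morphological classification limit (magnitude)}
    star_threshold -- hypothetical CLASS_STAR cut; an assumption of this sketch

    Returns True/False, or None when the object is fainter than the
    classification limit in every band (the "if any" case in the text).
    """
    best_band, best_margin = None, 0.0
    for band, mag in mags.items():
        # Positive margin: object is brighter than the classification limit.
        margin = class_limits[band] - mag
        if margin > best_margin:
            best_band, best_margin = band, margin
    if best_band is None:
        return None
    return class_star[best_band] >= star_threshold


# Hypothetical object: brightest relative to its limit in R (margin 2.5 mag).
is_point = classify_point_source(
    mags={"B": 22.0, "R": 21.0, "I": 20.5},
    class_star={"B": 0.40, "R": 0.95, "I": 0.70},
    class_limits={"B": 24.0, "R": 23.5, "I": 22.0},
)
```

Using the margin to the classification limit, rather than the raw magnitude, selects the band in which the morphological information is most reliable for that object.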
The final point-source catalogues considered comprise 1539 and 623 objects in five and seven passbands, respectively. Based on empirical number counts (Wolf et al. 2001a) the completeness of the point-source catalogue is estimated to be % at R=24.5. The present analysis only considers sub-samples consisting of objects detected in at least three bands and those consisting of objects detected only in the red-most passband available over the two regions covered in five and seven filters.
Over the years several techniques have been employed to classify objects based on their colour properties, the most common being the selection of objects within regions defined in two-dimensional projections of colour space. These regions are defined based on colour tracks computed from template spectra. While this is a reasonable approach for data sets encompassing a small number of passbands, it is cumbersome to handle photometric errors and as the number of bands increases the problem rapidly becomes intractable. Furthermore, use of a colour-colour diagram inevitably leads to severe contamination, requires the use of several projections in order to properly constrain a particular population and is not easily implemented for automatic classification of large datasets.
A more suitable approach to handle colour data from large imaging surveys is the $\chi^2$-technique (e.g. Wolf et al. 2001b, 2001c) originally developed for galaxy photometric redshift estimation (e.g. Bolzonella et al. 2000 and references therein) and quasar search (Hatziminaoglou et al. 2000, hereafter HMP00; Richards et al. 2001 and references therein). This method is an optimal way of simultaneously handling samples consisting of more than three passbands, reducing the dimensionality of the problem. It consists of fitting the observed spectral energy distribution (SED) of each object to a series of template spectra of different classes of objects by $\chi^2$ minimisation.
The SED of each object is compared to all available templates separately, and the object is assigned a given type depending on the value of the $\chi^2$. Classifications are considered robust (rank 1) if the $\chi^2$ is within the 95% confidence level; good (rank 2) if in the interval 95-99%; and poor (rank 3) if outside the 99% confidence level. Poorly classified objects can indicate an incomplete spectral library, errors in the (theoretical) spectra, inadequacies in the error model adopted and/or undesirable features in the colour catalogue. The reliability of the results, of course, critically depends on the completeness and the quality of the available spectral library. The classification can always be refined by continuously adding measured or improved theoretical spectra. It is important to point out that even if the number of filters is limited to three this approach, in principle equivalent to the traditional colour-colour selection, provides a more convenient way of handling the errors and uses additional constraints based on flux upper-limits.
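The classification step can be illustrated with a short sketch. It is not the pipeline implementation: it assumes the common error-weighted form $\chi^2 = \sum_i [(F_{{\rm obs},i} - A\,F_{{\rm tmpl},i})/\sigma_i]^2$ with a single free normalisation $A$ per template, and it assumes that the number of degrees of freedom is the number of bands minus one. The function names and the small table of standard $\chi^2$ critical values are introduced here for illustration only.

```python
# Standard chi^2 critical values (95% and 99%) for 1 to 6 degrees of freedom.
CHI2_CRIT_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}
CHI2_CRIT_99 = {1: 6.635, 2: 9.210, 3: 11.345, 4: 13.277, 5: 15.086, 6: 16.812}


def fit_template(f_obs, sigma, f_tmpl):
    """Best-fitting normalisation A and chi^2 of one template.

    A is obtained analytically by setting d(chi^2)/dA = 0.
    """
    num = sum(fo * ft / s ** 2 for fo, ft, s in zip(f_obs, f_tmpl, sigma))
    den = sum(ft ** 2 / s ** 2 for ft, s in zip(f_tmpl, sigma))
    scale = num / den
    chi2 = sum(((fo - scale * ft) / s) ** 2 for fo, ft, s in zip(f_obs, f_tmpl, sigma))
    return scale, chi2


def classify(f_obs, sigma, templates):
    """Return (best template label, chi^2, rank) with rank 1/2/3 as in the text.

    Assumes dof = number of bands - 1 (one fitted normalisation); supports
    2 to 7 bands with the tabulated critical values.
    """
    best_label, best_chi2 = None, float("inf")
    for label, f_tmpl in templates.items():
        _, chi2 = fit_template(f_obs, sigma, f_tmpl)
        if chi2 < best_chi2:
            best_label, best_chi2 = label, chi2
    dof = len(f_obs) - 1
    if best_chi2 <= CHI2_CRIT_95[dof]:
        rank = 1  # robust
    elif best_chi2 <= CHI2_CRIT_99[dof]:
        rank = 2  # good
    else:
        rank = 3  # poor
    return best_label, best_chi2, rank


# Hypothetical five-band fluxes and two toy templates.
templates = {"flat": [1.0, 1.0, 1.0, 1.0, 1.0], "red": [1.0, 2.0, 3.0, 4.0, 5.0]}
label, chi2, rank = classify([2.0, 4.0, 6.0, 8.0, 10.0], [0.1] * 5, templates)
```

In practice the template dictionary would hold one entry per object type and, for quasars, per redshift step, so that the best-fitting entry yields both a class and a photometric redshift.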
The spectral library currently in use consists of: series of model quasar, white dwarf and brown dwarf spectra; three empirical cool white dwarf spectra; a set of stellar templates; and a set of model galaxy spectra. The template quasar spectra are assumed to have three different power-law continua (spectral indexes of 0, 0.5 and 1 in the optical). The emission lines (Ly$\alpha$, Ly$\beta$, CIII, CIV, MgII, SiIV, H$\alpha$, H$\beta$, and H$\gamma$) are assumed to have Gaussian profiles and typical relative intensities and equivalent widths (e.g. Peterson 1997). The Ly$\alpha$ forest has been modeled according to Madau (1995). Template spectra were created for redshifts from z=0 to z=6, in steps of dz=0.1. A set of theoretical white dwarf spectra (provided by D. Köster; for the description of the models see Finley et al. 1997; Homeier et al. 1998), with effective temperatures and effective gravities ranging from 6000 K to 100000 K and -9, respectively, has been incorporated. Furthermore, three empirical spectra of very cool white dwarves ( K) have been included, two covering the optical wavelength range (Ibata et al. 2000), and one covering both the optical and near-infrared wavelength range (Oppenheimer et al. 2001). Also included is a series of theoretical low mass star and brown dwarf spectra (Chabrier et al. 2000) covering temperatures from 500 K to 2800 K, corresponding roughly to masses in the range 0.03 to 0.1 $M_\odot$, for an effective gravity of . The effective temperature range corresponds approximately to spectral types later than M6V (Kirkpatrick et al. 1991). We should point out that the transition between main sequence M-dwarves and low mass stars is arbitrary, since there is an overlap of templates of objects with temperatures around 2800 K. Stellar spectra are taken from the stellar library of Pickles (1998), which contains 131 spectra for a broad range of stellar types (main sequence, giant and sub-giant stars). Finally, the set of galaxy spectra used is the Coleman et al. (1980) model, with no intrinsic evolution. Since this paper only deals with point sources, this last set of templates is only used in cases of poor classification, as will be explained later on. The conversion from spectra to magnitudes is carried out using the same response functions as the observations (Vandame et al. 2001; Arnouts et al. 2001a).
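As an illustration of the size and layout of the quasar part of the library, the grid of continuum templates can be enumerated as below. This sketch covers only the power-law continuum; the emission lines and the Ly$\alpha$ forest absorption described above are not modelled. It assumes the convention $f_\nu \propto \nu^{-\alpha}$ for the spectral index, which the text does not state explicitly, and the function name and the reference wavelength of 5000 Å are arbitrary choices made here.

```python
# Spectral indices and redshift grid as described in the text:
# three power-law slopes, z = 0 to 6 in steps of 0.1.
SPECTRAL_INDICES = (0.0, 0.5, 1.0)
REDSHIFTS = [round(0.1 * i, 1) for i in range(61)]

# One continuum template per (alpha, z) pair: 3 x 61 = 183 templates.
TEMPLATE_GRID = [(alpha, z) for alpha in SPECTRAL_INDICES for z in REDSHIFTS]


def continuum_fnu(wavelength_obs, z, alpha, ref_wavelength=5000.0):
    """Relative continuum flux density f_nu at an observed wavelength.

    Assumes f_nu ~ nu**(-alpha), i.e. f_nu ~ lambda_rest**alpha, normalised
    to 1 at the rest-frame reference wavelength (same units as the input;
    the 5000 default is an arbitrary choice of this sketch).
    """
    wavelength_rest = wavelength_obs / (1.0 + z)
    return (wavelength_rest / ref_wavelength) ** alpha


# Example: continuum of an alpha = 0.5 quasar at z = 1 seen at 10000 (observed).
value = continuum_fnu(10000.0, 1.0, 0.5)
```

A full template would add the emission lines and intergalactic absorption to this continuum before it is integrated through the filter response functions.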
It is worth pointing out that a classification scheme based on the $\chi^2$-technique may lead to degeneracies, which are type-dependent (e.g. quasars - galaxies - stars). Multiple minima may occur in the parameter space (e.g. redshift - type) and such degeneracies can only be solved by including additional information in the classification procedure. This will be addressed in a forthcoming paper (Hatziminaoglou et al. 2002). In the present work only objects with real detections in at least three passbands are treated. One could also use upper limits a posteriori, in order to impose additional constraints and exclude highly improbable solutions. An alternative way is to construct realistic mock catalogues and to combine the colour-space distribution of the different populations with the individual properties (i.e. SED) of the objects (e.g. Hatziminaoglou et al. 2002).
The technique presented above assumes that all objects belong to classes for which the spectral properties are known. In order not to overlook the presence of unknown populations and to identify possible problems in the process of building the colour sample, it is of great interest to be able to pick up colour-space "outliers", even though these may simply belong to rare populations with known spectral properties.
For the systematic identification of outliers the nearest neighbour criterion is adopted using two dissimilarity measures (e.g. Warren et al. 1991). The first measure is the Euclidean distance in colour space.
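A nearest-neighbour search with the Euclidean distance can be sketched as follows. Only the first dissimilarity measure is illustrated; the function names and the threshold used to flag an object as an outlier are assumptions of this sketch, and the brute-force search shown here is adequate only for catalogues of a few thousand objects.

```python
import math


def nearest_neighbour_distances(colours):
    """Euclidean distance from each object to its nearest neighbour.

    `colours` is a list of equal-length tuples, one colour vector per object
    (e.g. (U-B, B-V, V-R, R-I)); at least two objects are required.
    """
    distances = []
    for i, ci in enumerate(colours):
        nearest = min(math.dist(ci, cj) for j, cj in enumerate(colours) if j != i)
        distances.append(nearest)
    return distances


def find_outliers(colours, threshold):
    """Indices of objects whose nearest neighbour is farther than `threshold`.

    The threshold (in magnitudes) is a free parameter of this sketch.
    """
    distances = nearest_neighbour_distances(colours)
    return [i for i, d in enumerate(distances) if d > threshold]


# Hypothetical two-colour example: three clustered objects and one isolated one.
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 4.0)]
outliers = find_outliers(sample, threshold=1.0)
```

Objects flagged in this way are candidates for individual inspection, since isolation in colour space can indicate either a rare population or a problem in the catalogue.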
From the original sample of 1539 objects, morphologically classified as point sources (see Sect. 2) within the 0.25 square degrees covered in five optical passbands, 1494, detected in at least three filters, were considered. In total there are 204 objects classified as quasars in the magnitude range . This number is in excellent agreement with that predicted by quasar models ($202^{+131}_{-78}$, e.g. Hatziminaoglou et al. 2002) at the limiting magnitude of B=25.0, which roughly corresponds to the completeness limit of the colour catalogue considered. It is also in good agreement with estimates based on observed number counts (e.g. Hartwick & Schade 1990; Glazebrook et al. 1995).
|Figure 1: (U-B)/(B-V) (panel a)) and (B-R)/(R-I) (panel b)) two-colour diagrams showing the quasar candidates selected on the basis of the $\chi^2$-technique. Open circles, plus signs and crosses denote candidates classified at 95%, 95-99% and outside 99% confidence level, respectively.|
To help evaluate the results of the $\chi^2$-method it is important to compare them with those obtained from a simple colour-colour selection. This is illustrated in Fig. 1 which shows the two colour-colour diagrams normally used to identify low- to intermediate-redshift ((U-B)/(B-V)) and high-redshift ((B-R)/(R-I)) quasars, respectively. In the plot the identified quasars are displayed with different symbols to indicate their associated confidence level as follows: robust (open circles); good (plus signs); and poor (crosses). The same notation applies to all colour-colour plots presented hereafter, unless noted otherwise.
From the careful inspection of these colour-colour diagrams several points can be made regarding the performance of the $\chi^2$-analysis. Note, for instance, that nearly all robust candidates are located in the region predicted by the models for low- to intermediate-redshift ( ) quasars, i.e. and , with a few cases extending into the region where higher redshift quasars are expected to lie. One important shortcoming of the standard colour selection is its inability to discriminate between objects with similar colours, leading inevitably to contamination problems. In the particular case of quasars, the main contaminants are white dwarves and early spectral type main-sequence stars. The results of the $\chi^2$-analysis suggest that this technique is capable of distinguishing among these different classes, as can be seen from quasar candidates lying very close to and sometimes overlapping objects of other types. Poor candidates are, in general, located at the outskirts of the region delineated by the robust candidates, except for a few cases where the two populations overlap. Similar conclusions can be drawn from the (B-R)/(R-I) diagram shown in panel b. In particular, given the B filter used, high-redshift quasars ( ) lie very close to the main sequence, making a colour selection problematic and susceptible to contamination by main-sequence stars. It is worth mentioning the object with (see panel a) located in a region unlikely to be occupied by quasars. However, on other colour diagrams this object lies close to quasar tracks. A more detailed discussion about its nature will be presented in Sect. 5.
In order to evaluate the results obtained applying the $\chi^2$-method, they can be compared with those based on the more traditional UVX and BRX selection. Adopting criteria similar to those used by Hall et al. (1996) one finds 298 UVX and 49 BRX independent quasar candidates. Out of these, 166 are in common with those classified using the $\chi^2$-method. Considering the $\chi^2$-classification as reference one estimates a contamination of 40% in a colour-colour quasar selection, demonstrating the potential of the technique for minimising the number of contaminants. Moreover, since it uses the colour information in a combined way, it should also lead to a higher completeness than selections based on distances from the stellar locus (e.g. Gaidos et al. 1993; Newberg & Yanny 1997). For example, colour-based selections will tend to miss quasars with redshifts in the range . This is a serious drawback because it is currently believed that the space density of optically-selected quasars starts decreasing in this redshift range. The possible benefits of the $\chi^2$-classification over a simple colour selection remain to be evaluated when spectroscopic data become available.
|Figure 2: Optical-infrared colour-colour diagram used in the selection of KX candidates, showing the quasar candidates selected on the basis of the $\chi^2$-technique. The symbols correspond to those defined in Fig. 1.|
The CDF-S field has also been observed in the near-infrared passbands over an area of 0.1 square degrees. For this seven passband sub-sample, the number of morphologically classified point sources is 623, out of which 605 are detected in at least three filters. In total there are 92 objects classified as quasar candidates, out of which 62 are in common with those found using only the optical data. The infrared information yields 30 new candidates. Five cases originally classified as quasars are now assigned different classifications: three as white dwarves, one as a G8I star and one as a K3I star.
For the optical/infrared case, all $\chi^2$-selected quasar candidates are shown in Fig. 2 on the (V-J)/(J-K) diagram introduced by Croom et al. (2001). This diagram is suitable for identifying quasar candidates for two reasons. First, it could partially solve the degeneracy between low- to intermediate-redshift quasars and white dwarves, occurring when the UVX criterion is applied. Quasars should exhibit a K-excess due to the broad bump created by dust, present in their spectra at wavelengths around 1 $\mu$m (in the rest frame). White dwarves do not have such near-infrared spectral features and should be separated from quasars when the infrared information is added. Second, as pointed out by Croom et al. (2001), KX-selection could identify reddened quasars possibly missed by the UVX selection. Choosing similar regions of colour space as these authors, one finds 65 quasar candidates, 52 of which are in common with those UVX selected. Of the KX-selected candidates 34 belong to the quasar candidate list defined using the $\chi^2$-technique. From Fig. 2 one sees that in order to ensure completeness of $\chi^2$-selected robust quasar candidates, one is forced to consider all objects with V-J<3.0, which in turn leads to a contamination comparable (50%) to the one introduced by the UVX selection. Another point worth mentioning is the fact that in Fig. 2 one finds red objects not predicted by models of the spectral properties of point-sources. As discussed below, this is probably due to the contamination of the sample by unresolved galaxies (see Sect. 5). This population is seen in all optical/infrared colour-diagrams presented below.
Another important feature of the $\chi^2$-method is that it can be used not only to classify the objects but also, in the case of extragalactic candidates, to estimate their redshifts. The photometric redshift distribution estimated for the quasar candidate sample selected from the five optical passbands by the $\chi^2$-technique is shown in Fig. 3, where the three rankings are plotted separately. The distribution covers a broad range of estimated redshifts extending all the way to . The present sample includes 16 candidates with estimated redshifts , among which 10 are robust classifications. Due to the degeneracy in the assignment of quasar redshifts based solely on broad-band optical filters (HMP00) objects with have a considerable dispersion in their estimated photometric redshifts. This accounts for the excess seen at and the dearth at . At there is also the increased probability of misclassifying Compact Emission Line Galaxies (CELGs) as quasar candidates, as shown by HMP00 using the DMS spectroscopic sample. The dearth of objects with redshifts in the interval is due to the colours of AGN at these redshifts, which are much like the colours of main sequence stars, so that these objects can very easily be misclassified as such. Note that in the present sample there are seven U-dropout and one B-dropout robust candidates with photometric redshifts .
|Figure 3: Photometric redshift distribution of quasar candidates selected by the minimum $\chi^2$-method for the UBVRI data set. The solid histogram denotes the distribution for candidates selected when no constraint is applied on the values of the $\chi^2$. The dotted histogram corresponds to objects selected at 99%; and the dashed histogram corresponds to those selected at 95%, as described in the text. Finally, the thick solid histogram is that predicted according to the model adopted for the evolution of the quasar LF.|
The redshift distribution of the 92 quasar candidates selected from their optical/infrared photometry is presented in Fig. 4. A comparison between Figs. 4 and 3 shows that by including the infrared data one can significantly improve the redshift estimates. As can be seen, the excess of low-redshift quasars is considerably smaller and the distribution resembles more closely the model prediction in the redshift interval . For the objects in common, Fig. 5 shows the comparison of photometric redshifts based on five and seven passband data. The final redshift distribution for all quasar candidates identified in the present work, including those with poor classification, is shown in Fig. 6, using the optical and, whenever possible, the optical/infrared data.
|Figure 4: Photometric redshift distribution of quasar candidates $\chi^2$-selected in the area covered by the optical and infrared data. The figure shows the model predicted redshift distribution (thick solid line) and the one measured for objects with no selection (solid line), at 99% (dotted line) and at 95% (dashed line).|
|Figure 5: Comparison of the photometric redshift distribution of the 63 common quasar candidates using the five and seven passbands.|
Using the above redshift distribution one finds a surface density of 55 quasar candidates with redshifts in the range per square degree, within the area covered by the optical observations. By contrast, using the optical/infrared data one finds a surface density of 100 per square degree, thereby improving the completeness of the sample.
|Figure 6: Final photometric redshift distribution for all 234 quasar candidates. The thick solid line shows the model predictions.|
Table 1 lists the first 40 entries of the final quasar candidate sample comprising 234 objects.
The $\chi^2$-technique, primarily used for classification of quasars, galaxies and stars, has been extended to consider stellar sub-classes such as white dwarves, low mass stars and brown dwarves. Even though confirmation of these classifications will depend on spectroscopic data, this method is applied here as a first attempt to define a robust procedure to select different sub-classes of galactic objects.
In order to search for white dwarf candidates, 66 theoretical spectra were provided by D. Köster, as well as three observed spectra of very cool white dwarves ( K) from Ibata et al. (2000; F351-50, F821-07) and Oppenheimer et al. (2001; WD0346). The template spectra were again compared to the broad-band photometry. A total of 97 objects were classified as white dwarf candidates in the magnitude range , out of which 86 are robust classifications. The number of 97 candidates is a factor of 1.5 higher than the 65 white dwarves with , brighter than B=25, expected to be found within the area of 0.25 square degrees, as predicted by current estimates of the white dwarf luminosity function (Girardi et al. 2001). From the total number of candidates, 89 have estimated temperatures in the range from 6000 K to 16000 K according to the theoretical spectra. The remaining nine were better matched by the observed very cool white dwarf spectra, with seven being robust classifications. The distribution of all candidates in the (U-B)/(B-V) plane is shown in Fig. 7. Two of the cool white dwarves are located at and in the interval 1.0<(U-B)<1.5, superposing the main-sequence locus. The location of these candidates is by and large consistent with the white dwarf cooling curve kindly provided by P. Bergeron. It is interesting to point out that one of the tracks, representing the cooling curve of hydrogen white dwarves, curves towards the main sequence. Note that five of the cool candidates are U-dropouts and do not appear in this colour diagram.
Comparison of Fig. 7 with Fig. 1 shows that for the characteristics of the present survey a simple UVX selection would lead to a large contamination of this sample by quasar candidates, since these populations have a significant overlap in this diagram. In fact, choosing typical values for the colours to delineate the region occupied by white dwarves in this diagram, the fraction of white dwarf candidates would correspond to about 30% of the total number of objects within this region. This is in contrast with the high success rate (72% spectroscopically confirmed) obtained by Christlieb et al. (2001) in their analysis of the Hamburg/ESO Survey (HES). These results illustrate the strong dependence of the efficiency of colour selection on the characteristics of the survey. A bright survey like HES would yield a low quasar and a high white dwarf surface density while exactly the opposite is true for the deep observations considered here.
|Figure 7: (U-B)/(B-V) colour-colour diagram showing the $\chi^2$-selected white dwarf candidates within the area covered by the optical data.|
|Figure 8: (V-J)/(J-K) colour-colour diagram showing the $\chi^2$-selected white dwarf candidates within the area covered by the optical/infrared data.|
The results obtained from the $\chi^2$-analysis of the optical/infrared data are as follows: a total of 21 candidates are selected, with 18 being robust detections. Figure 8 shows the distribution of the candidates in the same colour-colour diagram as Fig. 2. The locus of the white dwarves in this diagram is shifted redwards relative to that computed by P. Bergeron. The candidates have estimated effective temperatures in the range 6000 to 14000 K. The overlap between the sub-samples extracted from the five and seven passbands comprises 18 objects. The remaining three objects were originally classified as quasar candidates. Another 17 objects, selected as white dwarf candidates based on the optical data only, are now classified as quasar candidates. These results show how difficult it is to distinguish between quasars and white dwarves, and how useful the infrared data can be for that purpose.
It is worth pointing out that none of the cool white dwarf candidates identified using the optical colours are confirmed when the infrared colours are included in the analysis. This may be due to inadequacies in the near-infrared part of the model spectra, which could also explain the shift of the locus of white dwarf candidates mentioned above. This point will be further investigated when more infrared spectra become available.
Table 2 lists the first 40 entries of the white dwarf
The low mass and brown dwarf spectral library was provided by Chabrier and Baraffe and consists of 105 theoretical spectra. They correspond to three sets of models which attempt to account for differences in the formation and settling of dust in the atmospheres (Chabrier et al. 2000). In this paper 53 of these models are used, corresponding to objects with masses 0.1 ( K), to select low-mass stars and/or brown dwarf candidates. These template spectra were compared to our broad-band SEDs and, over the 0.25 square degree area covered in five passbands, a total of 18 candidates were identified with (all fainter than ), with 13 being robust detections. All of the candidates are matched to spectra with effective temperatures between 1700 K and 2800 K, corresponding to masses roughly between 0.05 and 0.1 , close to the hydrogen-burning limit. Their position on a (V-R)/(R-I) diagram is shown in Fig. 9. The objects with (R-I) > 2.5 seen in this figure that have not been selected as low mass or brown dwarf candidates were identified as M6V stars. Note that this class marks the transition between main sequence stars and low mass stars. Comparing the results of the $\chi^2$-analysis with those obtained by selecting objects with (R-I) > 2.3, roughly corresponding to the (V-I) > 3.5 criterion adopted by Zaggia et al. (1999), one finds a significant contamination (75%) by other types of objects.
Applying the $\chi^2$-test to the optical/infrared data, one finds a total of 35 candidates, out of which 14 are robust classifications. All five low-mass star candidates that have both optical and near-infrared data are confirmed when the J and information is included in the analysis. Furthermore, 11 stars originally classified as M5V and M6V stars using the UBVRI catalogues are classified as low-mass stars when the infrared data are used. All candidates lie in the range and have estimated effective temperatures in the range between 1700 K and 2800 K. The candidates identified are shown in Fig. 10. The figure shows two main concentrations of low mass star candidates. One, at and , with 19 candidates, roughly corresponds to the transition between main sequence and very low mass stars. The other covers the region defined by and (13 objects), consistent with the location of the L-dwarves (L3) as reported in the literature (e.g. Reid et al. 2001; Leggett et al. 2001; Schweitzer et al. 2001). In this optical/infrared colour diagram, as mentioned in the previous sections, one sees again a population of objects with colours which are not predicted by any model describing the spectral properties of point-sources. As discussed below, most of these objects are associated with unresolved galaxies which contaminate the point-source catalogue.
|Figure 9: Optical colour-colour diagram showing the brown dwarf candidates selected using the $\chi^2$-technique.|
|Figure 10: Optical/infrared colour-colour diagram showing the brown dwarf candidates selected using the $\chi^2$-technique.|
Based on the results of the $\chi^2$-selection one finds a surface density of very low mass stars of about 72 candidates per square degree using optical colours. When the near-infrared data are added, this value increases by a factor of about 5, yielding a surface density of 350 per square degree. The latter estimate is a factor of 3 higher than the expected value of 116 low-mass stars per square degree with K and brighter than B=25, predicted by models (e.g. Girardi et al. 2001).
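The surface densities quoted here follow from the candidate counts and the areas covered. The short sketch below redoes that arithmetic; the function name is illustrative, and the infrared area of 0.1 square degrees is the value quoted in the summary of this paper rather than in this paragraph.

```python
# Counts and areas are taken from the text; the helper name is illustrative.
def surface_density(n_candidates, area_sq_deg):
    """Return the number of candidates per square degree."""
    return n_candidates / area_sq_deg


optical = surface_density(18, 0.25)          # five-passband sample
optical_infrared = surface_density(35, 0.1)  # seven-passband sample
model_prediction = 116.0                     # low-mass stars per sq. deg. from models

print(f"optical:              {optical:.0f} per sq. deg.")
print(f"optical/infrared:     {optical_infrared:.0f} per sq. deg.")
print(f"infrared vs. optical: factor {optical_infrared / optical:.1f}")
print(f"infrared vs. model:   factor {optical_infrared / model_prediction:.1f}")
```

With these inputs the optical/infrared density is roughly five times the optical one and roughly three times the model prediction.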
The final list of individual low-mass star candidates in the CDF-S field is given in Table 3.
Figure 11 shows the colour-magnitude diagram for all point-sources within the area of 0.25 square degrees (left panel) and the diagram for the central area of the CDF-S covered by infrared data (right panel). The symbols are described in the figure caption. The extreme colour lower limits of the R- and J-dropouts shown in the figure make them likely to be brown dwarves (but see Sect. 5).
|Figure 11: Colour - magnitude diagrams for the very red objects: I versus R-I (left panel) and versus (right panel). The filled circles show objects detected in two bands. Arrows indicate lower limits in the colour of objects detected only in the red-most passband.|
As discussed in Sect. 3.2, there are several reasons why one would like to search for outliers. From a purely technical point of view, objects with odd colours have to be identified and visually inspected, as they may reveal problems in the construction of the colour catalogue, contamination by close neighbours, cosmic rays or other image artifacts. Alternatively, they may represent potentially interesting rare cases, either of known objects such as quasars at very high redshifts or of previously unknown populations. Therefore, classifying objects as outliers is an important step towards verifying the integrity of the colour catalogue and avoiding overlooking new discoveries.
|Figure 12: Illustration of the selection of outliers applied on the five passband sub-sample for m=2. For definitions of and see Sect. 3.|
As described in Sect. 3.2, outliers are identified in colour space based on their distances and from their nearest neighbour. An isolation criterion is then chosen, depending on the position of the objects on the versus diagram, as schematically shown in Fig. 12. This criterion divides the versus space in two regions, a densely populated one towards low values of the parameters and a much less dense region, where outliers lie. The parameters describing the separation line are chosen by fine-tuning them to include the most obvious cases of isolated objects in different colour-colour projections, separately for m=2 and m=3. Note, however, that objects isolated in one of the colour-colour projections are not necessarily isolated in all of them.
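As an illustration only, the selection can be sketched as follows. The exact distance definitions are those of Sect. 3.2 and are not reproduced here; this sketch assumes that the two quantities are the Euclidean distances in colour space to the nearest and to the m-th nearest neighbour, and that the separation is a straight line with hypothetical, hand-tuned parameters `intercept` and `slope`.

```python
import math


def neighbour_distances(colours, m):
    """For each object return (d1, dm): the Euclidean distances in colour
    space to its nearest and to its m-th nearest neighbour (assumed here)."""
    result = []
    for i, ci in enumerate(colours):
        dists = sorted(
            math.dist(ci, cj) for j, cj in enumerate(colours) if j != i
        )
        result.append((dists[0], dists[m - 1]))
    return result


def is_isolated(d1, dm, intercept, slope):
    """Hypothetical separation line in the (d1, dm) plane: objects above
    the line are flagged as outliers."""
    return dm > intercept + slope * d1


# Toy colour vectors (not survey data): a compact group plus one distant object.
colours = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
flags = [
    is_isolated(d1, dm, intercept=1.0, slope=-1.0)
    for d1, dm in neighbour_distances(colours, m=2)
]
print(flags)
```

In this toy case only the last object is flagged. In practice the line parameters would have to be tuned separately for m=2 and m=3, as described above.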
The results from the outlier analysis are summarised in Table 5, which lists the sub-samples, the number of objects in them and the number of outliers for m=2 and m=3, respectively. Note that in general the number of outliers increases with m. From the table one finds that the fraction of outliers is small, being typically 10% of the whole sample. For both m=2 and m=3, about 60% of the outliers are indeed poorly classified by the $\chi^2$-method, while 30% are robust candidates. These results indicate, as expected, that the outliers consist of a mixed population including known rare objects, objects possibly not well described by the available spectral library, or undesirable features on the images or the derived catalogues. The total number of outliers is 64 (80) for the sub-samples analysed using m=2 (m=3), out of which about 60% deserve a closer investigation as presented in the next section.
Figure 13 illustrates the location of outliers identified by applying the methodology described above to the five (seven) passband sub-sample. The figure shows two projections of the colour space, one for each of the sub-samples considered. The different symbols represent outliers selected using different values of m. Since nearly all of the outliers identified using m=2 are also identified when m=3 is used, in these plots only the additional objects identified with m=3 are represented by a different symbol. While a single projection is not sufficient to determine whether an object is truly an outlier in the multi-dimensional colour space, most objects far from the main concentration of points are successfully identified. In particular, in the left panel one finds the object with very particular colours, mentioned in Sect. 4.1, originally classified as a quasar. This case as well as others will be discussed in the next section.
Generally speaking, among the 30% of outliers which are also robust classifications, about half are identified as quasar candidates and half as galactic object candidates by the $\chi^2$-technique. In both cases, the candidates found to be outliers are associated with sparse populations, with nearly all quasars having and most of the stars being early spectral types (O-A), which are rare, especially at high galactic latitudes. These results apply equally well to all the sub-samples and isolation criteria adopted.
Note that the selection of the mth neighbour as well as the separation line must be empirically determined. Even though there is a correspondence between the outlier selection and the values of the $\chi^2$, the exact relation is not easy to establish.
|Figure 13: (U-B)/(B-V) (left panel) and (right panel) showing objects classified as outliers by the criteria given in Sect. 3 adopting m=2 (circles) and m=3 (rectangles). Open (filled) symbols denote objects with good (poor) classifications.|
In previous sections the results of the classification were tentatively assessed by comparing these classifications with those that would be obtained from selecting regions in two-colour diagrams. In this section this evaluation is complemented by an investigation of the nature of the outliers and extreme red objects detected in two or only one band, in several cases resorting to a visual inspection of the images and a careful examination of the photometric measurements obtained for these objects. The motivation is to find possible effects that may impact the fitting and lead to a poor classification. The primary goal is to investigate whether these cases are of interest or are, instead, associated with particularities of the images and/or derived source catalogues for which, perhaps, additional refinements in the selection criterion can be devised. The measured SEDs of the poorly classified objects are also compared to a set of galaxy models to evaluate the possible contamination of the point-source catalogue by unresolved galaxies. The information stemming from the discussion below is incorporated as notes to Tables 1-4. In these notes "o" denotes outliers, "d" objects that should be discarded from further consideration, and "g" objects which are better matched by galaxy spectra.
Out of a total of 81 outliers, representing less than 5% of the total sample of point-sources, image cutouts were produced for all 57 objects having poor classification based on the $\chi^2$-analysis. From the visual inspection of their images the following conclusions can be drawn. About 44% of the cases (25 objects) are found to be significant SExtractor detections, point-like sources and isolated. This set of objects consists of a mixed population which includes predominantly and high-redshift quasar candidates (2.1 < z < 3.94). Among them there are two interesting cases, both poorly classified as quasars, one of which has already been mentioned in Sect. 4.1 (Fig. 1). Close examination of their photometry shows that both objects have comparable magnitudes in all passbands except in B, in which they are more than one magnitude fainter. Inspection of the original images shows no reason to suspect problems in the magnitude estimates. The only possible explanation for the odd colours is that these objects may be variable. This seems to be a reasonable explanation considering that images taken in all passbands except B were obtained a few days apart, while the B-images were taken nine months earlier. This result points out the importance of including temporal information, whenever possible, in the type of analysis presented here.
About 25% of poorly classified outliers are located near other objects leading to problems of contamination, de-blending and erroneous associations and should be discarded. The remaining 30% of the cases have been identified with features of the source extraction algorithm and with the production of colour catalogues, which are now being addressed.
In summary, about 25% of the outliers are robust classifications. Another 25% are isolated objects with no apparent problems in their photometric measurements, indicating that their poor classification may be due to inadequacies in the spectral library. These objects are thus prime targets for spectroscopic observations. These cases, as well as outliers found to be potentially problematic, are indicated in the tables of different classification types presented in previous sections. Taking the above numbers at face value, one can conclude that only about 2% of the objects in the original point-source colour catalogue should be discarded because of problems, most of which are unavoidable, being caused by the presence of nearby objects affecting their photometric measurements.
In addition to the outliers one should also carefully examine extreme cases such as those presented in Sect. 4.3. From the total of 29 objects listed in Table 4, 17 showed no problems when visually inspected. Out of the remaining 12, seven are close (less than 4 arcsec) to brighter objects or located on the outer parts of a galaxy, and were not properly de-blended. Included are two K-only objects listed in Table 4. Another related case is one object found close to one of the masks automatically placed around a very bright star. As before, the photometric measurements are not reliable. Two additional problems were recognized. One is the break-up of a galaxy image by the source extraction algorithm (1 case); the other is the detection of residual cosmic rays in regions poorly sampled by the dithered images (2 cases, both I-only detections). Finally, one -only object, while visible in the I-band image, was not detected.
While the bright catalogue considered here should, in principle, contain only point sources, objects having poor classification (see Sect. 3) according to the $\chi^2$-method have also been compared to galaxy templates. Overall there are 216 and 221 objects with poor classification in the five and seven passband catalogues, respectively. From their comparison with galaxy spectra one finds that 25-35% have an improved fit, with 51 and 71 objects being classified as galaxies, respectively. The objects classified as galaxies have mixed morphological classifications and cover a broad redshift range. Based on these numbers one estimates that a lower limit for the contamination of the five and seven passband point-source catalogues by unresolved galaxies is of the order of 5-10%.
The methodology described in this paper has been developed to analyse in an objective and automatic way colour catalogues being routinely produced by the EIS pipeline in order to: assign objects to different classes of astronomical sources; allow for new discoveries; and understand the limitations of the data and the procedures adopted in the derivation of source catalogues and their colours from the association of data taken in different passbands. The ultimate goal is to define procedures to efficiently extract from imaging survey data well-defined samples, with minimal contamination, for spectroscopic follow-ups.
As a first step in this direction the method of fitting template spectra to the measured broad-band photometry, currently being used for estimating galaxy/quasar photometric redshifts, has been employed, extending it to include different types of galactic objects. The method is intended to replace standard classification schemes based on the analysis of one or more two-colour diagrams, which become unmanageable for large sets of multi-band data. As currently implemented, the classification scheme only considers the spectral properties of the objects, neglecting other important information such as the apparent magnitude of the objects and the expected density of objects of different types (see below). The results obtained from the automatic classification are not only consistent with those that would have been obtained from traditional methods based on two-colour diagrams but also consistent with model predictions, while minimising contamination by objects of other types. A point worth noting is that a significant number of poor classifications stem from the fact that the passbands used in the present analysis do not always provide independent information, so that the statistical analysis leads to artificially low significances of the resulting classifications. In order to deal with this problem, other techniques which take into account the proper dimensionality of the colour space for a specific class of objects (e.g. PCA) should also be considered. Finally, it is worth emphasising that the classifications are only as good as the available spectral library. The library currently being used has been assembled from publicly available models and data, and a number of classes are under-represented. Improvements in the classification method will depend on the continuous upgrade of the available spectral library. Currently, the library is being upgraded to include infrared spectra of white dwarves and low-mass stars kindly provided by S. Leggett.
Adding spectra for different types of objects from other ongoing spectroscopic surveys such as the SDSS will also be of great value in improving the current library.
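The template-fitting step can be illustrated with a minimal sketch, assuming the common formulation in which each template is scaled by a free normalisation and compared to the measured fluxes weighted by the photometric errors. This is not the EIS pipeline code; the function names, the two-entry library and all flux values below are hypothetical.

```python
def chi2_fit(flux, err, template):
    """Return (chi2, scale) for one template, with the scale chosen
    analytically to minimise chi2 for the given fluxes and errors."""
    weights = [1.0 / e ** 2 for e in err]
    num = sum(w * f * t for w, f, t in zip(weights, flux, template))
    den = sum(w * t * t for w, t in zip(weights, template))
    scale = num / den
    chi2 = sum(
        w * (f - scale * t) ** 2 for w, f, t in zip(weights, flux, template)
    )
    return chi2, scale


def classify(flux, err, library):
    """Return (name, chi2, scale) of the best-fitting template."""
    fits = {name: chi2_fit(flux, err, tmpl) for name, tmpl in library.items()}
    best = min(fits, key=lambda name: fits[name][0])
    return best, fits[best][0], fits[best][1]


# Hypothetical template fluxes in five passbands (arbitrary units).
library = {
    "flat_spectrum": [1.0, 1.0, 1.0, 1.0, 1.0],
    "red_object": [0.1, 0.3, 1.0, 2.0, 4.0],
}
observed = [0.2, 0.6, 2.0, 4.0, 8.0]   # twice the red template
errors = [0.05, 0.05, 0.1, 0.2, 0.4]

best_name, best_chi2, best_scale = classify(observed, errors, library)
print(best_name, round(best_chi2, 6), round(best_scale, 6))
```

Because all passbands enter the sum at once, each weighted by its error, a noisy band contributes little to the decision, which is the main advantage over cuts in individual two-colour diagrams.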
In order to detect potential problems and not to overlook possible new discoveries, the $\chi^2$-method has been complemented by a procedure for identifying outliers, using as dissimilarity measures Euclidean distances in the multi-dimensional colour space and adopting a nearest-neighbour isolation criterion. Despite its simplicity, the criterion adopted identified rare populations of objects as well as objects with odd colours which could be traced either to real physical effects such as variability or to problems with their measured colours. This demonstrates its usefulness in greatly reducing the number of cases that require a more detailed inspection, to about 5% of the entire sample. This number could be reduced even more by a further screening of the sample. As alluded to in the previous section, information about variability, if available, is of great use, as is a criterion based on the angular separation and magnitude difference between a source and its nearest neighbour. Together with the SExtractor de-blending flag and the distance to masks placed around very bright stars, these should be used to discard objects whose photometry is likely to be affected by light contamination, the most common problem identified.
Overall, the present analysis suggests that derived catalogues are mostly free of problems. Visual inspection of several odd colour or extremely red objects has revealed that the most frequent problems are associated with the limitations of the de-blending algorithm; contamination by close neighbours; and, in some cases, residual cosmic rays located in poorly sampled regions of the image mosaic, with an insufficient number of stacked images for a proper sigma-clipping.
In a follow-up paper the classification based on the spectral properties, as presented here, will be complemented with other statistics which further characterize the different populations of extragalactic/galactic objects using mock catalogues created from Monte Carlo simulations. This is particularly important in the analysis of point-sources, for which the stellar population makes an important contribution which varies according to the position on the sky observed. To account for this, as well as for reddening effects and the different filter sets used by the various surveys, a population synthesis model has been combined with galactic structure models to simulate different observations.
This paper describes the methodology developed to analyse multi-wavelength data from ongoing public surveys to objectively and automatically classify extracted objects based on their colours. The method is expected to yield samples with better completeness and less contamination than previous analyses based on two-colour diagrams. Moreover, the analysis can be carried out automatically and can thus better cope with the rapidly increasing data volume from imaging surveys and more efficiently produce improved samples (higher yield) to feed large-aperture telescopes. The method has been applied to a catalogue of point-sources extracted from the optical/infrared images taken of the CDF-S field by the EIS project. The CDF-S field is of particular interest in this context considering the number of spectroscopic and photometric programs either ongoing or planned for the near future. These data will be invaluable to directly assess the results of the present analysis, which may lead to refinements in the classification method adopted. The paper also provides a rough estimate of the by-products that should be expected at the end of the ongoing DPS.
The data used consist of UBVRI images taken with a wide-field imager covering an area of 0.25 square degrees, complemented by a mosaic of infrared images covering 0.1 square degrees. Combining these data one finds a total of 234 quasar candidates with estimated photometric redshifts up to , among which 16 have . In addition, 51 low-mass star/brown dwarf candidates and 100 white dwarf candidates were identified. Tables listing the classified objects are presented together with the properties inferred by the classification method. If the classifications presented here are confirmed, samples comprising over 200 high-redshift quasars (z>3.5), and over 1000 white dwarves, 100 cooler than 4000 K, will become available at the end of the survey, expected to cover 3 square degrees.
It is worth emphasising the contribution of the near-infrared data. They increase the accuracy of the determination of the photometric redshifts as well as the number of quasar candidates in the redshift interval . Infrared photometry is also essential for tracking very low-mass stars and brown dwarves. If only optical data are considered, one should expect the DPS to provide 200 low-mass stars, including only 20 L-dwarves, over the 3 square degree area. Since the surface density of L-dwarves increases to 120 per square degree when infrared data are considered, the number of L-dwarves would be much larger if the infrared observations were extended. As it currently stands, the infrared observations cover an area of 0.2 square degrees, within which one expects 25 L-dwarves.
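The projected sample sizes are simple scalings of the measured surface densities to the expected survey areas. The sketch below, with an illustrative function name, reproduces the rounded figures quoted above from numbers given in the text; it assumes a uniform surface density over the whole area, which is itself only a rough approximation.

```python
def projected_count(density_per_sq_deg, area_sq_deg):
    """Expected number of objects, assuming a uniform surface density."""
    return density_per_sq_deg * area_sq_deg


# White dwarves: 100 candidates in 0.25 sq. deg., scaled to 3 sq. deg.
white_dwarves = projected_count(100 / 0.25, 3.0)

# Low-mass stars from optical data: 72 per sq. deg. over 3 sq. deg.
low_mass_optical = projected_count(72.0, 3.0)

# L-dwarves with infrared data: 120 per sq. deg. over 0.2 sq. deg.
l_dwarves_infrared = projected_count(120.0, 0.2)

print(round(white_dwarves), round(low_mass_optical), round(l_dwarves_infrared))
```

These values (about 1200, 216 and 24) agree with the rounded expectations of over 1000 white dwarves, 200 low-mass stars and 25 L-dwarves.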
The results presented here will be used in a forthcoming paper to produce a pruned stellar catalogue (Groenewegen et al. 2001). The same techniques will also be applied to analyse the faint source catalogue of the CDF-S which should also yield interesting results and more insight regarding the source catalogues.
Finally, one should keep in mind that, unless follow-up spectroscopic studies of the current samples are carried out, the numbers and target list will remain tentative.
We would like to thank R. Ibata and B. Oppenheimer for providing us with observed, very cool white dwarf spectra. We thank G. Chabrier and I. Baraffe for providing their most recent model spectra of brown dwarves, as well as D. Köster for giving us his most recent white dwarf models. We also thank P. Bergeron for computing the expected colour tracks of white dwarves in our filter system. Finally, we thank D. Swayne from the AT&T Labs, for providing and helping us install the multi-dimension visualisation tool, Xgobi.