Up: The ROSAT-ESO flux limited sample

5 Correlation of the X-ray source list with the COSMOS data base

5.1 The COSMOS data base

Since the X-ray properties described above do not by themselves allow us to identify the X-ray sources associated with clusters, we have to include information from an optical data base in the further identification process. For this we use the most comprehensive optical data base covering the southern sky and the area of the REFLEX survey: the COSMOS scans of the UK-Schmidt survey plates (MacGillivray & Stobie 1984). There are also the complementary APM scans of the same photographic survey material, but the galaxy classification in the APM survey concentrates on the part of the sky south of the galactic plane (Maddox et al. 1990), which covers only about 2/3 of the REFLEX region.

The UK-Schmidt survey was performed using IIIa-J photographic plates at the 1.2 m UK-Schmidt telescope. The plates were scanned within a sky area of about $5.35^\circ \times 5.35^\circ$ per plate with the fast COSMOS scanning machine and subsequently analysed, yielding 32 parameters per object for the source characterization. These parameters describe the object position, intensity, and shape, and classify the type of object. Object images are recognized down to about $b_{\rm J} \sim 22$ mag. This allows a subsequent star/galaxy separation which has been estimated to be about 95% complete with about 5% contamination to $b_{\rm J} \sim 19.5$ mag and about 90% complete with about 10% contamination to $b_{\rm J} \sim 20.5$ mag (Heydon-Dumbleton et al. 1989; Yentis et al. 1992; MacGillivray et al. 1994, and MacGillivray, priv. communication). The galaxy magnitudes were intercalibrated between the different plates using the substantial plate overlaps and absolutely calibrated by CCD sequences (Heydon-Dumbleton et al. 1989; MacGillivray et al. 1994).

5.2 Correlation of the X-ray sources with the COSMOS galaxy distribution

In search of cluster candidates as counterparts to the RASS sources we correlate the X-ray source positions with the galaxy catalogue of the COSMOS data base. The correlation is based on counts of galaxies in circles around the X-ray source positions, which are used to search for galaxy density enhancements.
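The aperture count itself can be illustrated with a minimal sketch. It is not the code used for the survey; the function names and the toy coordinates are hypothetical, and we assume that the galaxy positions are available as arrays of right ascension and declination in degrees.

```python
import numpy as np

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    h = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0))))

def aperture_count(src_ra, src_dec, gal_ra, gal_dec, radius_arcmin):
    """Number of galaxies within radius_arcmin of one source position."""
    sep = angular_separation_deg(src_ra, src_dec,
                                 np.asarray(gal_ra), np.asarray(gal_dec))
    return int(np.count_nonzero(sep <= radius_arcmin / 60.0))

# Toy catalogue (hypothetical): three galaxies offset in declination only
gal_ra = [10.0, 10.0, 10.0]
gal_dec = [-30.0 + 1.0 / 60.0, -30.0 + 2.5 / 60.0, -30.0 + 4.0 / 60.0]
n_3arcmin = aperture_count(10.0, -30.0, gal_ra, gal_dec, 3.0)
```

For a full catalogue one would normally restrict the search to the galaxies of the relevant plate before computing separations; the brute-force form is kept here only for clarity.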

Here we should make some remarks about the strategy behind the choice of the present cluster search algorithm. As mentioned before it is difficult to devise a good algorithm to select the most massive clusters of galaxies from optical sky survey images. We use a comparatively simple algorithm (aperture counts as compared to e.g. matched filter techniques). This simple technique seems well adapted to our needs and the depth of the COSMOS data set: (i) the technique is used to only flag the candidates and there is no need to determine a cluster richness, since we use the X-ray emission for a quantitative measure of the clusters; (ii) while matched filter techniques may introduce a bias, since a priori assumptions are made about the shape of an idealized, azimuthally symmetric cluster, we are interested in introducing as little bias and as few presumptions as possible; (iii) the actual numbers in the galaxy counts are limited and therefore the shape matching is not precise and is affected by low number statistical noise. Therefore our technique is not seen as a perfect and objective cluster characterization algorithm. The cluster selection should primarily depend on the X-ray criteria. We have chosen a very low cut for the optical selection which results in a substantially larger candidate sample compared to the expected number of clusters, with an estimated contamination of as much as 30-40%. But it assures on the other hand that we have a highly complete candidate sample. This overabundance of candidates is thus a necessary condition to obtain an essentially X-ray selected sample for our survey.

The galaxy counts are performed for five different aperture radii: 1.5, 3, 5, 7.5, and 10 arcmin, with no magnitude limit for the galaxies selected. Since an aperture size of about $0.5\,h_{50}^{-1}$ Mpc in physical scale, corresponding to about two core radii of a rich cluster, provides a good sampling of the high signal-to-noise part of the galaxy overdensity in a cluster, the chosen set of apertures gives a good redshift coverage in the range from about z = 0.02 to 0.3, as shown by the values given in Table 3. With this choice and the depth limit of the COSMOS data set we are aiming at a high completeness in the cluster search out to a redshift of about z = 0.3. For this goal the chosen flux limit and the depth of the COSMOS data base are quite well matched, as the richest and most massive clusters are still detected in both data sets out to this redshift.
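The relation between aperture radius, redshift, and physical scale can be sketched as follows. The cosmology is an assumption made for this illustration only (an Einstein-de Sitter model with $H_0 = 50$ km s$^{-1}$ Mpc$^{-1}$, i.e. $h_{50} = 1$); the function names are hypothetical, and the numbers produced are not meant to reproduce the tabulated values exactly.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
H0 = 50.0            # km/s/Mpc, i.e. h50 = 1 (assumed for this sketch)

def angular_diameter_distance_mpc(z):
    """Angular diameter distance in an Einstein-de Sitter universe (assumed)."""
    return (2.0 * C_KM_S / H0) * (1.0 - 1.0 / math.sqrt(1.0 + z)) / (1.0 + z)

def physical_radius_mpc(radius_arcmin, z):
    """Physical scale subtended by an aperture radius at redshift z."""
    theta = math.radians(radius_arcmin / 60.0)
    return theta * angular_diameter_distance_mpc(z)

def redshift_for_scale(radius_arcmin, scale_mpc=0.5, z_lo=1e-4, z_hi=1.25, tol=1e-8):
    """Bisection for the redshift at which the aperture radius spans scale_mpc.

    The search stops at z = 1.25, where the Einstein-de Sitter angular
    diameter distance reaches its maximum; returns None if never reached.
    """
    if physical_radius_mpc(radius_arcmin, z_hi) < scale_mpc:
        return None
    while z_hi - z_lo > tol:
        z_mid = 0.5 * (z_lo + z_hi)
        if physical_radius_mpc(radius_arcmin, z_mid) < scale_mpc:
            z_lo = z_mid
        else:
            z_hi = z_mid
    return 0.5 * (z_lo + z_hi)

# Redshift at which each of the five apertures matches 0.5 Mpc
matching_redshifts = {r: redshift_for_scale(r) for r in (1.5, 3.0, 5.0, 7.5, 10.0)}
```

The smallest aperture matches the chosen physical scale at the highest redshift and the largest aperture at the lowest, which is why the set of five radii covers a broad redshift range.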

The galaxy counts around the given X-ray source positions are compared with the number count distributions for 1000 random positions for each photographic plate. With this comparison we also account for plate-to-plate variations in depth, as explained below. The number count histograms for the random positions have been generated at the Naval Research Laboratory in preparation for a COSMOS galaxy cluster catalogue, the SGP pilot study (Yentis et al. 1992; Cruddace et al. 2000), and for this ESO key program. The results of the random counts yield a differential probability density distribution, $\phi (N_{\rm gal})$, of finding a number of $N_{\rm gal}$ galaxies at random positions. An example of the distribution $\phi (N_{\rm gal})$ for an average of 5 randomly selected plates is shown in Fig. 9 for all five aperture sizes. (Note that $\phi (N_{\rm gal})$ is defined here as a normalized probability density distribution function, while in Fig. 9 we show histograms of the form $\phi (N_{\rm gal}) \times N_{\rm count}$.) The distribution functions resemble Poisson distributions (the possible theoretical description of the functions is not pursued further here, since we are only interested in the purely empirical application to the following statistical analysis). In Fig. 10 the random count histogram for aperture 2 (3 arcmin radius) is compared to the counting results for the 4206 X-ray source positions. We note the large number of sources with significant galaxy overdensities in the X-ray source sample compared to the random counts, and expect to find the X-ray clusters in this high count tail of the distribution.

Figure 9: Example of the distributions of galaxy number counts, $\phi (N_{\rm gal}) \times N_{\rm count}$, for the five apertures with radii of 1.5, 3, 5, 7.5, and 10 arcmin. The histograms are constructed from counts at 1000 random positions per photographic plate, and the results for each aperture size as shown here are obtained from an average of five plates. The second, third, fourth, and fifth histograms have been multiplied by factors of 2, 4, 5, and 6, respectively, for easier comparison.

These results for $\phi (N_{\rm gal})$ are then used in the form of cumulative probability distribution functions

$$P( < N_{\rm gal}) = \int_0^{N_{\rm gal}} \phi (N'_{\rm gal})\, {\rm d}N'_{\rm gal} \qquad (1)$$

to assign the probability value $P( < N_{\rm gal})$ to each counting result. For the counts around X-ray sources we expect a significant galaxy density enhancement for those sources which have cluster counterparts. Therefore the counting results for the X-ray source positions should yield a distribution function $\phi (N_{\rm gal})_{\rm X}$ which has a more pronounced tail at high values of $N_{\rm gal}$ (Fig. 10). Instead of characterizing the enhancement of the counts at high galaxy numbers in the tail of $\phi (N_{\rm gal})_{\rm X}$ we use another data representation as follows.
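For a single plate, the empirical version of the cumulative distribution can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually coded for the survey: the function name is hypothetical, and for integer counts we read $P( < N_{\rm gal})$ as the fraction of random positions with strictly fewer than $N_{\rm gal}$ galaxies.

```python
import numpy as np

def cumulative_probability(random_counts):
    """Return a function N -> P(<N) built from counts at random positions on one plate."""
    sorted_counts = np.sort(np.asarray(random_counts))
    n_positions = sorted_counts.size

    def p_less_than(n_gal):
        # Fraction of random positions with strictly fewer than n_gal galaxies
        return np.searchsorted(sorted_counts, n_gal, side="left") / n_positions

    return p_less_than

# Toy random counts for one plate (hypothetical numbers)
p_of_n = cumulative_probability([1, 2, 2, 3, 5])
p_for_source = p_of_n(3)   # a source with 3 galaxies in the aperture
```

Because the function is built separately for each plate, a given galaxy count is always judged against the random counts of its own plate.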

Going back to the random sample, taking each of the values of $P( < N_{\rm gal})$ assigned to each counting result, and plotting the distribution function $\phi (P( < N_{\rm gal})) \equiv \phi (P(N))$ we will find that this function is a constant. This follows simply from the chain rule of differentiation in the following way

$$\phi (P(N))\, {\rm d}P = \phi (N_{\rm gal}) \left\vert {{\rm d}N_{\rm gal} \over {\rm d}P} \right\vert {\rm d}P \qquad (2)$$

$$= \phi (N_{\rm gal}) \left( {{\rm d} \over {\rm d}N_{\rm gal}} \int_0^{N_{\rm gal}} \phi (N'_{\rm gal})\, {\rm d}N'_{\rm gal} \right)^{-1} {\rm d}P = {\rm const}. \qquad (3)$$

Thus for random counts we should expect to see a constant function (with noise if the counts are derived in an experiment independent from the random count experiment used to define $P( < N_{\rm gal})$). In the case of counts around X-ray sources involving clusters of galaxies the function $\phi (P_{\rm X}(N))$ is no longer a constant but should show an enhancement for large values of $P_{\rm X}(N)$. Figure 11 shows the resulting distribution function for the galaxy counts, $\phi (P_{\rm X}(N)) \times N_{\rm sources}$, in the circular aperture of 3 arcmin radius for the sample of 4206 X-ray sources. The extended tail in the distribution at high values of $N_{\rm gal}$ in Fig. 10 now translates into a very pronounced peak at large values of P in Fig. 11.
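The behaviour described here can be checked with a small simulation. Everything in it is an assumption made for illustration: the counts are drawn from Poisson distributions with arbitrary means, the sample sizes are arbitrary, and ties between integer counts are spread uniformly so that the discreteness of the counts does not produce empty histogram bins (a device not taken from the analysis above).

```python
import numpy as np

rng = np.random.default_rng(0)

def p_values(counts, reference_counts, rng):
    """P(<N) for each count relative to reference counts; ties spread uniformly."""
    ref = np.sort(np.asarray(reference_counts))
    n_less = np.searchsorted(ref, counts, side="left")
    n_less_equal = np.searchsorted(ref, counts, side="right")
    u = rng.random(len(counts))
    return (n_less + u * (n_less_equal - n_less)) / ref.size

reference = rng.poisson(8.0, size=1000)   # random positions defining P(<N)
field = rng.poisson(8.0, size=4000)       # independent random positions
clusters = rng.poisson(25.0, size=1000)   # toy overdense positions

p_field = p_values(field, reference, rng)
p_mixed = p_values(np.concatenate([field, clusters]), reference, rng)

# Roughly flat for the random positions, peaked at high P once clusters are added
hist_field, _ = np.histogram(p_field, bins=10, range=(0.0, 1.0))
hist_mixed, _ = np.histogram(p_mixed, bins=10, range=(0.0, 1.0))
```

In this toy setup the histogram of the purely random sample is flat within the noise, while the mixed sample piles up in the highest bin, which is the qualitative signature discussed in the text.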

For the further evaluation of this type of diagram we make the following simplifying assumptions: (i) the distribution function is composed of two types of counting results, results obtained for cluster X-ray sources and results obtained for other sources, and (ii) the non-cluster X-ray sources are not correlated with the galaxy distribution in the COSMOS data base and thus effectively constitute a set of random counts. This latter assumption is of course not strictly true for all the non-cluster X-ray sources. While it may be justified to treat stars and other galactic sources as well as distant quasars as independent of the nearby galaxy distribution, there is also a population of extragalactic sources, like low redshift AGN and starburst galaxies, that we know are correlated with the large-scale structure in the galaxy distribution. However, the practical assumption that this correlation is weak in comparison to the galaxy density enhancements in clusters of galaxies is generally well justified.

Figure 10: Example of the distributions of galaxy number counts in a circular aperture with 3 arcmin radius for an average of five UK Schmidt plates and 1000 random positions per plate (thin line). This distribution is compared to the results of the galaxy number counts for the 4206 X-ray sources of the sample for the same aperture radius. The histogram for the random position counts has been normalized to the histogram of the X-ray source counts so that the peaks have the same height.

With these assumptions we expect to find a distribution function $\phi (P_{\rm X}(N))$ composed of a constant function and a peak at high P-values. Subtracting the constant function leaves us with the cluster sources. This is schematically illustrated in Fig. 12. For the selection of the cluster candidates we can now either select the sources which feature a high value of $N_{\rm gal}$ or a high value of $P_{\rm X}(N)$. We choose to use $P_{\rm X}(N)$ for the sample selection (as justified further below) in such a way that most of the cluster peak is included in the extracted sample (that is, choosing $P_{\rm X}(N)$ such that the fraction C in Fig. 12 of clusters lost from the sample is small or negligible).

Figure 11: Histogram of the galaxy excess probabilities, $P_{\rm X}$, obtained from galaxy counts in circular apertures of radius 3 arcmin for the sample of 4206 sources of our count rate limited sample of southern RASS II sources (excluding the multiple detections of extended sources). There is a clear excess of counts at high values of the galaxy density (high values of $P_{\rm X}$), which is primarily due to the effect of galaxy cluster counterparts to the X-ray sources.

The clear distinction between the flat distribution for probability values between 0 and about 0.7 and the clear and prominent "cluster peak" as found in Fig. 11 indicates that we can quantify this result further. As illustrated in the sketch of Fig. 12, the cluster contribution is responsible for the dark shaded areas labeled A and C. Extracting a sample highly enriched in clusters by choosing a particular high value, $P^{\star }_{\rm X}$, leaves us with a formal completeness of the sample expressed by

$$F_{\rm comp} = {A \over A + C}. \qquad (4)$$

The formal contamination of this sample by non-cluster sources is likewise given by the expression

$$F_{\rm cont} = {B \over A + B}. \qquad (5)$$
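Equations (4) and (5) can be evaluated from a histogram of $P_{\rm X}$ values along the following lines. This is a minimal sketch, not the procedure used for the tabulated results: the function name, the bin number, and the use of the bins below 0.7 to estimate the flat level are assumptions, and the example values are synthetic.

```python
import numpy as np

def completeness_contamination(p_values, p_star, flat_max=0.7, n_bins=100):
    """Estimate F_comp = A/(A+C) and F_cont = B/(A+B) from a histogram of P_X."""
    hist, edges = np.histogram(p_values, bins=n_bins, range=(0.0, 1.0))
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Flat (non-cluster) level per bin, estimated where no cluster peak is seen
    flat_level = hist[centers < flat_max].mean()
    excess = hist - flat_level            # cluster contribution per bin
    selected = centers >= p_star          # bins retained by the cut
    a = excess[selected].sum()            # clusters kept in the sample
    c = excess[~selected].sum()           # clusters lost below the cut
    b = flat_level * selected.sum()       # non-cluster sources kept
    return a / (a + c), b / (a + b)

# Synthetic P_X values: a flat component plus a peak between 0.75 and 1
flat_part = (np.arange(4000) + 0.5) / 4000.0
peak_part = 0.75 + 0.25 * (np.arange(2000) + 0.5) / 2000.0
f_comp, f_cont = completeness_contamination(
    np.concatenate([flat_part, peak_part]), p_star=0.8)
```

Lowering `p_star` in this sketch raises the completeness at the price of a higher contamination, which is the trade-off governing the choice of the cut value.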

We choose the parameter $P_{\rm X}(N)$ for the selection of the cluster sample for the following reason. The distribution $\phi (N_{\rm gal})$ is computed for each plate. Since there are plate-to-plate variations in the average galaxy density, using just $N_{\rm gal}$ would introduce a bias in the sample extraction. The use of the parameter $P_{\rm X}(N)$ takes these variations into account. Possible variations in the background density of the galaxies within each plate are not accounted for in this approach, but in general these variations are very small and, owing to our strategy of oversampling (minimizing fraction C in Fig. 12), they have little effect on the sample. The analysis was carried out for all five circular apertures. The strategy for the selection of the cut value, $P^{\star }_{\rm X}$, was to obtain a sample with roughly 90% completeness for the single-aperture statistics and a contamination not much larger than about 20% to 30%. Comparing the results for different apertures, one notes that the peak is best defined for the counts with the two smallest apertures. With increasing aperture size the peaks become progressively broader, leading to an increasingly unfavorable trade-off between completeness and contamination. Therefore we have relaxed the completeness criterion for the three largest apertures to values lower than 90% so as not to increase the sample contamination dramatically. The resulting values for $P^{\star }_{\rm X}$, $F_{\rm comp}$, $F_{\rm cont}$, and the resulting sample sizes are given in Table 3 for each aperture counting result. We also indicate in the table the sample size, $N_{\rm sample}$, defined as A + B, and the expected number of clusters, $N_{\rm cl.est.}$, given by A + C (in Fig. 12). Note that the sample size is larger than half of the starting sample (4410 objects) and that the results indicate the presence of roughly 1800 galaxy clusters, a number much larger than expected.
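The combination of the single-aperture selections into one candidate list, with a source retained if it passes the cut in at least one aperture, can be sketched as follows. The function name is hypothetical and the cut values and $P_{\rm X}$ values in the example are placeholders chosen for illustration, not the values adopted for the survey.

```python
import numpy as np

def combined_candidates(p_x, p_star):
    """Boolean mask of sources with P_X >= P*_X in at least one aperture.

    p_x    : array of shape (n_sources, n_apertures) with the P_X values
    p_star : array of shape (n_apertures,) with the cut for each aperture
    """
    p_x = np.asarray(p_x, dtype=float)
    p_star = np.asarray(p_star, dtype=float)
    return (p_x >= p_star).any(axis=1)

# Placeholder cuts for the five apertures (illustrative only)
example_cuts = [0.85, 0.85, 0.90, 0.90, 0.90]
example_p_x = [
    [0.20, 0.40, 0.30, 0.50, 0.60],   # no aperture passes
    [0.95, 0.70, 0.50, 0.40, 0.30],   # passes in the smallest aperture only
    [0.10, 0.20, 0.30, 0.40, 0.99],   # passes in the largest aperture only
]
flagged = combined_candidates(example_p_x, example_cuts)
```

Since each $P_{\rm X}$ value is already computed relative to the random counts of the plate on which the source lies, the same set of cuts can be applied to sources from all plates.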

Figure 12: Sketch of the typical result of the distribution function $P_{\rm X}(N_{\rm gal})$ for an X-ray source sample containing galaxy cluster counterparts. The parameter $P^{\star }_{\rm X}$ indicates the minimal allowed value of $P_{\rm X}(N_{\rm gal})$ in the sample selection. A + C gives the "true number of clusters" and A + B the size of the extracted sample.

**Table 3:** Statistics of the results of the galaxy counts around the 4410 X-ray sources (above the first count rate cut). Columns 2 to 6 give the lower probability limit for the sample selection, the sample completeness, the contamination, the sample size, and the estimated total number of clusters (see text for more details). Column 7 gives the physical scale of the aperture radius at a redshift z = 0.08, close to the median redshift of the REFLEX sample. Column 8 gives the redshift at which the aperture radius corresponds to a physical size of $0.5\,h_{50}^{-1}$ Mpc. The combined sample is defined by all candidates flagged in at least one of the aperture count searches.
$\begin{table} \par$ \begin{array}{llrlrrrr} \hline {\rm aperture~ radius}&... ...90\% & - & 2640 & {\sim}{\rm 1750}\\ \hline \end{array} $\space \end{table}$
