The final SDSS-IV/SPIDERS X-ray point source spectroscopic catalogue

Aims. We look to provide a detailed description of the SPectroscopic IDentification of ERosita Sources (SPIDERS) survey, an SDSS-IV programme aimed at obtaining spectroscopic classification and redshift measurements for complete samples of sufficiently bright X-ray sources. Methods. We describe the SPIDERS X-ray Point Source Spectroscopic Catalogue, considering its store of 11 092 observed spectra drawn from a parent sample of 14 759 ROSAT and XMM sources over an area of 5129 deg$^2$ covered in SDSS-IV by the eBOSS survey. Results. This programme represents the largest systematic spectroscopic observation of an X-ray selected sample. A total of 10 970 (98.9%) of the observed objects are classified and 10 849 (97.8%) have secure redshifts. The majority of the spectra (10 070 objects) are active galactic nuclei (AGN), 522 are cluster galaxies, and 294 are stars. Conclusions. The observed AGN redshift distribution is in good agreement with simulations based on empirical models for AGN activation and duty cycle. Forming composite spectra of type 1 AGN as a function of the mass and accretion rate of their black holes reveals systematic differences in the H-beta emission line profiles. This study paves the way for systematic spectroscopic observations of the sources expected to be discovered in the upcoming eROSITA survey over a large section of the sky.


Introduction
Since the advent of powerful focusing X-ray telescopes, it has become clear that the high-energy emission provides an insightful view of the extra-galactic sky. Accreting super-massive black holes dominate the number of detected X-ray sources down to the limiting fluxes detectable in the deepest pencil beam surveys today; clusters of galaxies, on the other hand, also shine brightly in X-rays due to the presence of hot plasma reaching temperatures of millions of degrees in their potential wells. X-ray surveys can, therefore, be used to provide some of the most stringent constraints on the cosmological evolution of super-massive black holes (see e.g. Hickox et al. 2017) and of the large-scale structure itself (see e.g. Weinberg et al. 2013). However, optical spectroscopy is almost always needed in order to unambiguously classify X-ray sources as well as measure their distances accurately.
Catalogues are also available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/cat/J/A+A/636/A97
Over the last decade, spectroscopic observations in the optical of X-ray selected active galactic nuclei (AGN) have increased in number by about two orders of magnitude, from hundreds to tens of thousands, when combining deep and medium-deep surveys with wide area surveys (Murray et al. 2005; Salvato et al. 2009, 2011; Brusa et al. 2010; Fotopoulou et al. 2012; Kochanek et al. 2012; Hsu et al. 2014; Nandra et al. 2015; Marchesi et al. 2016; Menzel et al. 2016; Xue et al. 2016; Ananna et al. 2017; Georgakakis et al. 2017; Luo et al. 2017; Hasinger et al. 2018; LaMassa et al. 2019). A subset of existing samples of X-ray selected AGN with spectroscopic redshift is detailed in Table 1.
Compared to previous samples, SPIDERS covers a different parameter space in terms of area and depth, and it is also the largest X-ray point source spectroscopic catalogue to date. The spectroscopic data are made public in the 16th release of data from the SDSS (DR16, Ahumada et al. 2019; sdss.org), together with two "value added catalogues", which are also part of DR16, for ROSAT and XMM-Slew sources, respectively. Table 2 gives the links to the catalogues and a description of each column.
The SDSS-IV single fibre optical spectroscopic programme is shared between the extended Baryon Oscillation Spectroscopic Survey (eBOSS, main programme), the SPectroscopic IDentification of ERosita Sources survey (SPIDERS, sub-programme), and the Time-Domain Spectroscopic Survey (TDSS, sub-programme), which share the focal plane during observations. The complete SPIDERS survey programme provides homogeneous optical spectroscopic observations of X-ray sources, both point-like and extended, paving the way towards systematic spectroscopic observations of eROSITA detections over a large portion of the sky (Merloni et al. 2012; Predehl et al. 2016; Kollmeier et al. 2017). The programme started well before the beginning of SRG/eROSITA operations upon completing the observation of the currently existing wide area X-ray surveys. In particular, SPIDERS targeted sources from the ROSAT All-Sky Survey, XMM Slew, and XMM-XCLASS catalogues (Voges et al. 1999, 2000; Saxton et al. 2008; Clerc et al. 2012) within the SDSS-IV footprint (Dawson et al. 2016; Blanton et al. 2017).
Clusters of galaxies were selected by cross-correlating faint ROSAT and XCLASS extended sources with red-galaxy excess found in SDSS imaging in the range 0.1 < z < 0.6 (Clerc et al. 2016; Finoguenov et al. 2019). These are the most massive and largest clusters in the X-ray sky, representing a well-defined sample that can be used as a first stepping stone for cluster cosmology experiments via a measurement of the growth of structure (Ider Chitham et al., in prep.). Two companion papers (Clerc et al., in prep.; Kirkpatrick et al., in prep.) describe the observation of clusters in SPIDERS.
Active galactic nuclei were selected by cross-correlating ROSAT and XMM Slew catalogues with optical and near-infrared data (Dwelly et al. 2017; Salvato et al. 2018). In this paper, we describe the results of the observation of point-like sources. More specifically, we detail the case of the active galactic nuclei detected by ROSAT.
The structure of the paper is as follows. We explain the data and the procedure used to construct the catalogue in Sect. 2. We describe the redshifts measured in Sect. 3. We discuss the specific case of stars in Sect. 4. Finally, we show spectral stacks of type 1 AGN in Sect. 5. Throughout the paper, we assume the flat Λ cold dark matter (ΛCDM) cosmology from Planck Collaboration XVI (2014). Magnitudes are given in the AB system (Oke & Gunn 1983).

Data
The original SPIDERS targeting, as documented in Dwelly et al. (2017), was based on earlier versions of the X-ray catalogues than the ones that were used to build the SPIDERS-DR16 catalogues, as the X-ray-optical cataloguing methods have evolved and improved since the time of target selection.
Here we first (in Sect. 2.1) summarise the original target selection for the SPIDERS-AGN samples (based on the 1RXS and XMMSL1 catalogues) and the observational completeness of these samples by the end of the SDSS-IV/eBOSS survey. Then, in Sect. 2.2, we describe in detail the steps that were carried out to build the catalogues released here, based on updated X-ray catalogues (2RXS, XMMSL2). These sections are very technical in nature.
Dwelly et al. (2017) document how the target selection was carried out on the ROSAT (1RXS) and XMM Slew v1.6 (XMMSL1) catalogues (Voges et al. 1999, 2000; Saxton et al. 2008). The area considered for target selection was the subset of the SDSS DR13 photometry footprint (Fukugita et al. 1996; Albareti et al. 2017) that was considered suitable for extragalactic survey work by the BOSS team. It consists of ∼10 800 deg$^2$ of extra-galactic sky and contains 32 408 (1RXS) + 4325 (XMMSL1) X-ray sources. For 28 515 (1RXS) and 3142 (XMMSL1) of these X-ray sources, a counterpart was found in the AllWISE catalogue (Wright et al. 2010; Cutri et al. 2013), together with an SDSS-DR13 photometric counterpart. Of these optical counterparts, 11 643 (1RXS) and 1411 (XMMSL1) had previously been spectroscopically observed in earlier phases of the SDSS project. Out of the 16 872 (1RXS) + 1731 (XMMSL1) potential targets remaining, 9028 (1RXS) + 873 (XMMSL1) passed suitability filters and were put forward for spectroscopic observation within the main SDSS-IV/eBOSS programme. For more details on the procedure to select the targets, please refer to Dwelly et al. (2017), particularly their Figs. 8 and 13. The target catalogues are available here 4.
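The bookkeeping of the target selection can be checked with a few lines of arithmetic. The sketch below only uses the counts quoted in this section; the dictionary and function names are illustrative and are not part of any SPIDERS software.

```python
# Counts quoted in the text for the ~10 800 deg^2 BOSS imaging footprint.
with_counterpart = {"1RXS": 28515, "XMMSL1": 3142}     # AllWISE + SDSS-DR13 counterpart
previously_observed = {"1RXS": 11643, "XMMSL1": 1411}  # spectra from earlier SDSS phases
put_forward = {"1RXS": 9028, "XMMSL1": 873}            # passed the suitability filters


def remaining_targets(matched, observed):
    """Potential targets: sources with a counterpart minus those already observed."""
    return {cat: matched[cat] - observed[cat] for cat in matched}


remaining = remaining_targets(with_counterpart, previously_observed)
for cat, n_remaining in remaining.items():
    fraction = put_forward[cat] / n_remaining
    print(f"{cat}: {n_remaining} potential targets, "
          f"{put_forward[cat]} put forward ({fraction:.1%})")
```

The differences reproduce the 16 872 (1RXS) and 1731 (XMMSL1) potential targets quoted above.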

Target selection summary
The sky area observed by the combination of the SDSS-IV/eBOSS main spectroscopic programme, plus the SDSS-III/SEQUELS pilot area, covers approximately half of the wider 10 800 deg$^2$ BOSS imaging footprint considered for the SPIDERS-AGN target selection (Dawson et al. 2016). For the purposes of this paper, we define the following "SPIDERS-DR16" footprint. First, we consider the sky area covered by the union of 1006 SDSS-IV/eBOSS and SDSS-III/SEQUELS plates (each plate covers a 1.49 deg radius circle). In order to maximise the contiguity of the footprint, we included 15 plates that do not meet the nominal eBOSS minimum signal-to-noise ratio (S/N). We then reject any sky areas that lie outside the BOSS imaging footprint or those that are overlapped by any plates that were planned but not observed by the conclusion of SDSS-IV/eBOSS (217.8 deg$^2$ is rejected). The total remaining unique sky area in the SPIDERS-DR16 footprint is 5128.9 deg$^2$. Figure 1 illustrates the SPIDERS-DR16 footprint. Within the SPIDERS-DR16 area, there are 4713 (1RXS) + 457 (XMMSL1) potential targets available. We note that during the SDSS-IV observations, the focal plane was shared between three programmes: eBOSS, TDSS, and SPIDERS (Dawson et al. 2016; Blanton et al. 2017), and so there was competition for fibre resources. A total of 4406 (1RXS, 93%) + 430 (XMMSL1, 94%) of the targets were eventually observed during the SDSS-III/SEQUELS and SDSS-IV/eBOSS campaigns.
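As an order-of-magnitude check of the footprint numbers, the summed area of the plates can be compared to the unique area, and the observed fractions can be recomputed. This is a minimal sketch under a flat-sky approximation for the plate area; the constant and variable names are illustrative.

```python
import math

PLATE_RADIUS_DEG = 1.49    # plate radius quoted in the text
N_PLATES = 1006            # eBOSS + SEQUELS plates in the union
UNIQUE_AREA_DEG2 = 5128.9  # unique SPIDERS-DR16 area quoted in the text


def plate_area(radius_deg):
    """Flat-sky area of a circular plate (adequate for a 1.49 deg radius)."""
    return math.pi * radius_deg ** 2


# The summed plate area exceeds the unique area because neighbouring plates
# overlap and because some sky regions are rejected from the footprint.
summed_area = N_PLATES * plate_area(PLATE_RADIUS_DEG)
overlap_factor = summed_area / UNIQUE_AREA_DEG2

available = {"1RXS": 4713, "XMMSL1": 457}
observed = {"1RXS": 4406, "XMMSL1": 430}
completeness = {cat: observed[cat] / available[cat] for cat in available}

print(f"single plate: {plate_area(PLATE_RADIUS_DEG):.2f} deg^2")
print(f"summed / unique area: {overlap_factor:.2f}")
for cat, frac in completeness.items():
    print(f"{cat}: {frac:.1%} of the available targets observed")
```

The ratio of summed to unique area is only indicative, since it mixes plate overlaps with the rejected regions.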

The SPIDERS 2RXS sample
The DR16 SPIDERS 2RXS catalogue is constructed as follows. We consider the updated ROSAT point-source catalogue (2RXS; Boller et al. 2016) and its counterparts found via the nway software. This parent catalogue does not correspond exactly to the parent catalogue (1RXS) used at the moment of targeting by Dwelly et al. (2017). At the bright end (higher detection likelihood), the catalogues are the same. At the faint end (lower detection likelihood), there are differences. For a quantitative comparison between 1RXS and 2RXS, please refer to Boller et al. (2016). The 2RXS catalogue contains 132 254 sources over the entire sky, of which 21 288 lie in the SPIDERS-DR16 footprint.
We filter the complete source list with the eBOSS footprint mask (and with a galactic latitude cut $|g_{\rm lat}| > 15^\circ$). We match AllWISE positions (column names in the SPIDERS catalogue: ALLW_RA, ALLW_Dec) to the SDSS-DR13-photo optical catalogues, choosing the brightest counterpart (in modelMag_r) lying within a 3 arcsec radius (larger than the 1.5 arcsec radius used for targeting). In the catalogue, we select only the most likely counterpart detected in SDSS photometry. After this selection, only one catalogue entry per X-ray source remains; we note, however, that in some rare cases, this is the incorrect counterpart (for example, if the uncertainty on the X-ray position is underestimated, we may miss the true counterpart if it is located beyond the search radius). We discuss these few cases later in the article. In the SPIDERS-DR16 footprint, we obtain 19 821 (10 039) X-ray sources with existence likelihoods greater than 6.5 (10). We refer to these as "All" the sources of interest (labelled "A" in figures and tables, Eq. (1)).
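The positional match described above (brightest optical source in modelMag_r within 3 arcsec of the AllWISE position) can be illustrated with the following sketch. It is not the code used to build the catalogue: the function names, the dictionary layout, and the toy coordinates are assumptions made for illustration only.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


def brightest_within(allw_ra, allw_dec, optical_sources, radius_arcsec=3.0):
    """Return the optical source with the smallest modelMag_r inside the radius, or None."""
    inside = [
        src for src in optical_sources
        if angular_separation_arcsec(allw_ra, allw_dec, src["ra"], src["dec"]) <= radius_arcsec
    ]
    return min(inside, key=lambda src: src["modelMag_r"]) if inside else None


# Toy optical catalogue (made-up values, for illustration only).
optical = [
    {"id": "a", "ra": 150.0000, "dec": 2.0000, "modelMag_r": 19.2},
    {"id": "b", "ra": 150.0005, "dec": 2.0003, "modelMag_r": 18.1},
    {"id": "c", "ra": 150.0020, "dec": 2.0000, "modelMag_r": 16.5},  # outside 3 arcsec
]
best = brightest_within(150.0, 2.0, optical)
print(best["id"] if best else "no counterpart")
```

In this toy case, source "c" is the brightest overall but lies beyond the search radius, so the brighter of the two sources inside the radius is retained. For catalogues of realistic size, a spatial index would be preferable to this brute-force loop.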
Among "A", 13 986 (6853) are in the magnitude range to be observed by the SDSS-IV programme. We refer to these as candidate "targets" for spectroscopic observation with SDSS ("T", Eq. (2)).
Then the SDSS spectroscopic information is added based on the optical position, using a 1.5 arcsec matching radius between the optical source position (SDSS_RA, SDSS_DEC) and the fibre position on the sky (PLUG_RA, PLUG_DEC). Among "T", 10 590 (6145) were spectroscopically observed during one of the SDSS editions (for these, in the catalogue, the "DR16_MEMBER" flag is set to True). We refer to these as "observed" ("O", Eq. (3)). Among "O", 10 474 (6096) were identified or classified. We refer to these as "identified sources" ("I", Eq. (4)). Among "I", we measured 10 366 (6007) reliable redshifts, confirmed by visual inspection. We refer to these as "good redshifts" ("Z", Eq. (12)). The difference between I and Z consists of a set of 108 (89) featureless high signal-to-noise BLAZAR spectra, whose redshift could not be determined (classification "blazars_noZ" below, Eq. (11)). The existence likelihood, denoted exiML, is the detection likelihood that was measured by Boller et al. (2016) for the 2RXS sample (RXS_ExiML). Table 3 gives the number of objects in each category A, T, O, I, blazar_noZ, Z for the two existence likelihood thresholds (6.5 and 10). The redshifts are described in detail in the following section.
Notes. "exiML" refers to the existence likelihood threshold applied in the X-ray. "Any" refers to all the sources in the catalogue; for a single X-ray source, a set of counterparts may be listed (not unique). "A" refers to all sources matched to their potential best optical counterpart; each X-ray source is listed only once (Eq. (1)). "T" refers to sources that are candidate targets for optical spectroscopic observation (Eq. (2)). "O" refers to observed sources (Eq. (3)). "I" refers to identified sources (Eq. (4)). "Blazar no Z" refers to sources identified as BLAZAR for which we could not measure the redshift (Eq. (11)). "Z" refers to sources with good redshift measurements (Eq. (12)). The last column gives the targets that are uniquely present in the XMMSL2, i.e. not in the 2RXS catalogue.
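The nested samples A, T, O, I, Z can be summarised by a few ratios. The sketch below recomputes them from the 2RXS counts quoted in the text for the two existence likelihood thresholds; the names `SAMPLES` and `funnel_summary` are illustrative.

```python
# 2RXS sample sizes quoted in the text (SPIDERS-DR16 footprint),
# keyed by the exiML threshold.
SAMPLES = {
    6.5: {"A": 19821, "T": 13986, "O": 10590, "I": 10474, "Z": 10366},
    10.0: {"A": 10039, "T": 6853, "O": 6145, "I": 6096, "Z": 6007},
}


def funnel_summary(counts):
    """Ratios between the nested samples and the number of blazars without redshift (I - Z)."""
    return {
        "T/A": counts["T"] / counts["A"],
        "O/T": counts["O"] / counts["T"],
        "I/O": counts["I"] / counts["O"],
        "Z/O": counts["Z"] / counts["O"],
        "I-Z": counts["I"] - counts["Z"],
    }


for threshold, counts in SAMPLES.items():
    summary = funnel_summary(counts)
    print(f"exiML > {threshold}: O/T = {summary['O/T']:.1%}, "
          f"Z/O = {summary['Z/O']:.1%}, blazars without z = {summary['I-Z']}")
```

The I - Z differences reproduce the 108 (89) blazars without redshift mentioned above.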
We investigate the distribution of the A, T, O, I, Z samples (with exiML > 6.5) as a function of the X-ray flux (RXS_SRC_FLUX) and optical i-band 2 arcsec fibre magnitude (SDSS_FIBER2MAG_i), see Fig. 2. The X-ray flux is corrected for absorption by the Milky Way assuming a power-law emission, which is correct for AGN but not for stars or clusters. The distribution of soft band X-ray flux for each sample is shown in the top panel. Most of the sources have a flux $-13 < \log_{10}(F_{\rm X}\,[{\rm erg\,cm^{-2}\,s^{-1}}]) < -11.5$. Few are brighter. The fraction of targets relative to all sources decreases towards brighter fluxes, see the curve labelled "T" (in orange). This is due to the bright fibre magnitude and model magnitude cuts, i.e. the bright X-ray sources are also bright in the optical. The bottom panel of the figure clearly shows the impact of the optical cuts. The panels showing the ratio between the observed sample and the targets as a function of X-ray flux or fibre magnitude demonstrate that the observed sample is biased with respect to the targets: the faintest and brightest objects are under-represented. For the high existence likelihood sample (exiML > 10), the effect is smaller but still present (third panel of Fig. 2). Although we have observed 6145/6853 = 89% of the exiML > 10 targets, there remain small biases as a function of fibre magnitude and X-ray flux at the bright end.
Fig. 2. Samples of Table 3: A, T, O, I, Z. The histogram of X-ray flux shows how bright the targets are (top panel). Second and third panels: fraction of observed targets, identified objects, and good redshifts with respect to the target sample. The second panel is for exiML > 6.5 and the third panel for exiML > 10. They show that the exiML > 10 Z sample is close to being a random sub-sample of the targeted sample, with a completeness slightly below 90%. The histogram of the i-band 2 arcsec fibre magnitude (fourth panel) shows the impact of the optical selection made on the counterparts found, which removes the bright objects. Similarly to the second panel, we show in the fifth panel the ratios O/T, I/T, Z/T as a function of fibre magnitude. This shows that identifying sources and determining their redshift is more difficult at the faint end.
Over the complete BOSS extra-galactic area (10 800 deg$^2$, i.e. 2.1 times the SPIDERS-DR16 area), the total number of targets (26 685) is about twice that present in the SPIDERS-DR16 area (13 986), see Table 4. Here, the fraction of observed targets is 63.1% over 10 800 deg$^2$ instead of 75.4% on SPIDERS-DR16, so the completeness is lower. Furthermore, the observed targets were chosen following different targeting schemes (previous SDSS editions), so the observed sample will be further away from being a random sampling of the complete set of targets. This complicates the statistical analysis; for example, extracting an unbiased redshift distribution becomes tedious. This is the main reason we excluded this additional area from the catalogue and the analysis presented here. Using the ZWARNING=0 criterion from the SDSS pipeline (visual inspections are not available for the complete area), we obtain an estimate of the total number of good redshifts, 16 128 (95.7% of the observed), but we cannot guarantee that all of them are indeed good, due to the lack of visual inspections. To reach the 97.8% of good redshifts (as in the SPIDERS-DR16 footprint), further inspection of the spectra is required. It would also enable the proper flagging of blazars, whose redshifts are difficult to fit.

The SPIDERS-AGN XMMSL2 sample
The DR16 SPIDERS XMMSL2 catalogue is constructed in a similar fashion to the 2RXS catalogue. The existence likelihood, denoted exiML, is the maximum of the detection likelihood in any of the three bands the point sources were detected in (Saxton et al. 2008). [...] (483) good redshifts, see Table 3.

Summary of observations
By combining the observations of the 2RXS and the XMMSL2 samples, we accumulated 10 849 good redshifts out of 14 759 targets over the SPIDERS-DR16 area. The fraction of observed targets is about O/T ∼ 73.5% and could increase to 90-95% with another dedicated programme. The fraction of identified targets among the observed is high: Z/O = 97.8%. Given that the 2RXS catalogue covers the full sky, one could extend the match to spectroscopic observations to larger areas, but the completeness would then be much lower (O/T ∼ 30%) and the observed redshifts may constitute a biased sample with respect to the complete sample. In the next decade, the combination of eROSITA with SDSS-V, 4MOST, and DESI should enable the construction of a large full-sky X-ray AGN catalogue, see the discussion in Sect. 6.
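As a consistency check, the fractions quoted above can be recomputed from the counts given in this paper (14 759 targets, 11 092 observed spectra, 10 970 classified objects, 10 849 good redshifts). This is only an arithmetic sketch with illustrative variable names. With these inputs, the 73.5% value is reproduced by the ratio Z/T, while O/T evaluates to about 75.2%.

```python
# Counts quoted in the abstract and in this section.
TARGETS = 14759      # parent sample (T)
OBSERVED = 11092     # observed spectra (O)
CLASSIFIED = 10970   # classified objects
GOOD_Z = 10849       # secure redshifts (Z)


def percent(numerator, denominator):
    """Ratio expressed in percent."""
    return 100.0 * numerator / denominator


ratios = {
    "O/T": percent(OBSERVED, TARGETS),
    "Z/T": percent(GOOD_Z, TARGETS),
    "Z/O": percent(GOOD_Z, OBSERVED),
    "classified/O": percent(CLASSIFIED, OBSERVED),
}
for name, value in ratios.items():
    print(f"{name} = {value:.1f}%")
```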
To increase confidence in the automatically obtained redshifts, we visually inspect the SPIDERS spectra. The visual inspection procedure and the reconciliation of results between inspectors are detailed in Dwelly et al. (2017). After inspection, we report the successful measurement of redshifts for 97.8% of the observed targets. Please refer to Menzel et al. (2016) for a specific and detailed discussion on the accuracy of spectroscopic redshifts for X-ray selected AGNs. Overall, since the number of redshift failures is quite small, we cannot study this population statistically in depth. Nevertheless, we see a hint that the magnitude (or fibre magnitude) distribution of undetermined redshifts is skewed towards fainter magnitudes, as expected, since redshifts are more difficult to obtain for fainter objects than for brighter ones.

Classifications
In addition to the redshift confidence flag (CONF_BEST), the visual inspection enables a classification into AGN types (CLASS_BEST). However, because the SPIDERS catalogue has been assembled by combining various generations of SDSS observations and visual inspections, the final classification is heterogeneous. For simplicity, we group the observed objects with reliable redshift (CONF_BEST==3) into the following broadly defined families: AGNs (type 1 and 2), stars, AGN in clusters, and galaxies in clusters. These additional classification flags are made available at http://www.mpe.mpg.de/XraySurveys/SPIDERS/
1. Stars are identified with CLASS_BEST=="STAR".
2. Blazars: CLASS_BEST=="BLLAC" or "BLAZAR".
3. Type 1 AGN (or broad-line AGN, or un-obscured AGN of optical type 1), comprising CLASS_BEST=="BALQSO", "QSO_BAL", "QSO", "BLAGN".
4. Type 2 AGN (or narrow-line AGN, or narrow-line AGN candidates, or obscured AGN of optical type 2), comprising CLASS_BEST=="NLAGN", "GALAXY".
5. Possible cluster members: this family accounts for the possibility that ROSAT mistakenly identified a source as point-like instead of extended, due to its poor PSF.
In the latter case (5), some or all of the X-ray flux may be due to a cluster of galaxies. In order to take that eventuality into account, galaxies or QSOs are counted as possible cluster members if their redshift is within 0.01, and their position within 1 arcmin, of a redmapper cluster (Rykoff et al. 2014) or a SPIDERS cluster (Clerc et al. 2016; Finoguenov et al. 2019). These cannot be counted within the 2RXS or XMMSL2 X-ray flux limited AGN sample, since some of the associated flux may come from the host cluster. Among the good redshift class ("Z"), after visual inspection, we list (in parentheses, separated by a plus sign) the occurrences in the 2RXS+XMMSL2 catalogues in each family: type 1 AGN (8216+941), type 2 AGN candidates (1331+119), possible cluster members (503+62), and stars (278+27), see Table 5.
Notes. "exiML" refers to the existence likelihood threshold applied in the X-ray catalogue. "Z" refers to sources with good redshift measurements.
We note that among the cluster member candidates, "GALAXY" refers here to the spectra without any obvious signature of an AGN. The top panel of Fig. 3 shows that these sources are usually either associated with a low X-ray source detection likelihood (in which case the source would just be a galaxy in the field), or among the brightest members of a galaxy cluster (large extension in X-ray images, e.g., bottom panel of Fig. 3). Most of the sources classified as "GALAXY/Cluster" have a low p_any value in NWAY, indicating that the reliability of the association is also low. We note that each population samples the fibre magnitude, model magnitude, and redshift histograms in a different fashion (see Fig. 4). The stars sample the brighter end of the magnitude distribution. The AGN exclusively populate the fainter end: at faint broad-band magnitudes, redshifts can only be determined thanks to strong emission lines. The galaxies in clusters sample intermediate magnitudes.
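The grouping of CLASS_BEST values into the families listed above can be sketched as a simple lookup. This is an illustration rather than the code used for the catalogue: the function names, the returned labels, the order of precedence between the tests, and the choice of which CLASS_BEST values count as "galaxies or QSOs" for the cluster test are assumptions.

```python
TYPE1 = {"BALQSO", "QSO_BAL", "QSO", "BLAGN"}
TYPE2 = {"NLAGN", "GALAXY"}
BLAZAR = {"BLLAC", "BLAZAR"}
CLUSTER_CANDIDATE_CLASSES = {"GALAXY", "QSO"}  # assumption: classes tested for membership


def is_near_cluster(z, sep_arcmin, z_cluster, max_dz=0.01, max_sep_arcmin=1.0):
    """Redshift within 0.01 and position within 1 arcmin of a known cluster."""
    return abs(z - z_cluster) <= max_dz and sep_arcmin <= max_sep_arcmin


def classify(class_best, conf_best, near_cluster=False):
    """Map a catalogue entry to a broad family; None without a reliable redshift."""
    if conf_best != 3:
        return None
    if class_best == "STAR":
        return "star"
    if near_cluster and class_best in CLUSTER_CANDIDATE_CLASSES:
        return "cluster member candidate"
    if class_best in BLAZAR:
        return "blazar"
    if class_best in TYPE1:
        return "type 1 AGN"
    if class_best in TYPE2:
        return "type 2 AGN"
    return "other"


print(classify("QSO", 3))
print(classify("GALAXY", 3, near_cluster=is_near_cluster(0.150, 0.5, 0.145)))
```

Testing the cluster association before the AGN families reflects the statement that possible cluster members are not counted within the flux-limited AGN sample.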

AGN
Among the AGN, the majority (8216/9622 ∼ 85%) show a spectrum with broad features (emission line widths in excess of 200 km s$^{-1}$, Bolton et al. 2012). We name these type 1 AGN. A further 1331 are classified as type 2 AGN. The remaining few are either BLAZAR or broad absorption line QSO.
The type 2 AGN category is constituted by heavily obscured AGN (or candidates). Among the 1331, 602 (729) have a high (low) existence likelihood in the X-ray (i.e. above and below 10). For the population of high existence likelihood, the expected spurious fraction is of order 7%, that is, about 40 among 602. The spurious fraction should be higher (about 50%) among the 729 with low existence likelihood. More accurate X-ray observations, deeper optical data, and a detailed emission line analysis are needed to disentangle these cases. We leave such analysis for future studies, and note that machine learning algorithms using spectral features may be a key in this process (e.g. Zhang et al. 2019).
Fig. 3. Top: distribution of all sources in the X-ray detection likelihood vs. X-ray extension (in arcsec) plane. All counterparts are shown in grey (label: CTPS to ROSAT/2RXS sources). Sources classified as "GALAXY" are coloured according to their p_any parameter (see Sect. 3 of Salvato et al. 2018, for a definition of p_any). The majority of the galaxies are either associated with a very low significance X-ray detection and thus just galaxies in the field (bottom part) or with extended sources (upper right part), indicating that they could be passive galaxies that are members of clusters or local (low redshift) extended galaxies. Bottom: the central object in the figure is the counterpart associated with a 2RXS source, with a low p_any but high extension in the X-ray images. In fact, the 2RXS source in this case was extended and the associated galaxy is the central galaxy of a cluster at redshift 0.145. These types of sources populate the top/right quadrant in the top panel of the figure.
Following Sect. 5 of Coffey et al. (2019), we compute the 2RXS (XMMSL2) X-ray luminosities in the bands 0.1-2.4 (0.2-12) keV. The 2RXS (XMMSL2) flux is modelled with an absorbed power law, mod pha*powerlaw, with a slope of Γ = 2.4 (1.7) and with $n_{\rm H}$ set to that of the Milky Way. Figure 5 shows the X-ray luminosity vs. redshift for the 2RXS (XMMSL2) samples. It compares them to a set of the deep pencil beam surveys referenced in Table 1 (red points) and the upcoming eROSITA sample (purple) taken from the mock catalogue of Comparat et al. (2019). The three data sets are very complementary in sampling the redshift-luminosity plane. The SPIDERS-DR16 sample will contribute to a more quantitative estimate of the evolution with redshift of the bright end of the X-ray AGN luminosity function (Miyaji et al. 2000, 2015; Aird et al. 2015; Georgakakis et al. 2017). Indeed, this sample has a comparable number of sources to all pencil beam surveys together.
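For orientation, the conversion from an observed flux to a rest-frame luminosity for a power-law spectrum can be sketched as $L = 4\pi d_L^2 F (1+z)^{\Gamma-2}$. The code below is a minimal illustration under stated assumptions and is not necessarily the procedure of Coffey et al. (2019): it takes approximate Planck 2013 parameters (H0 = 67.3 km s$^{-1}$ Mpc$^{-1}$, Ωm = 0.315) for the flat ΛCDM cosmology, assumes the input flux is already corrected for Galactic absorption, and uses illustrative function names.

```python
import math

C_KM_S = 299792.458                 # speed of light in km/s
MPC_IN_CM = 3.0856775814913673e24   # one megaparsec in cm


def luminosity_distance_mpc(z, h0=67.3, omega_m=0.315, steps=1000):
    """Luminosity distance in flat LCDM, by trapezoidal integration of 1/E(z)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    dz = z / steps
    integral = sum(0.5 * (inv_e(i * dz) + inv_e((i + 1) * dz)) * dz for i in range(steps))
    return (1.0 + z) * (C_KM_S / h0) * integral


def xray_luminosity(flux_cgs, z, gamma):
    """Rest-frame band luminosity (erg/s) from an observed band flux (erg/cm^2/s).

    Applies the power-law k-correction (1 + z)**(gamma - 2), with gamma the photon index.
    """
    d_cm = luminosity_distance_mpc(z) * MPC_IN_CM
    k_correction = (1.0 + z) ** (gamma - 2.0)
    return 4.0 * math.pi * d_cm ** 2 * flux_cgs * k_correction


# Example: a source with a flux of 1e-12 erg/cm^2/s at z = 0.1, with the 2RXS slope.
print(f"d_L = {luminosity_distance_mpc(0.1):.0f} Mpc")
print(f"L_X = {xray_luminosity(1e-12, 0.1, 2.4):.2e} erg/s")
```

For Γ = 2 the k-correction vanishes, so the luminosity reduces to $4\pi d_L^2 F$; steeper slopes, such as the Γ = 2.4 adopted for 2RXS, increase the inferred rest-frame luminosity at fixed flux.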

Stars
A complete section on stars is presented in Sect. 4.

Redshift distribution
We find that the observed redshift distribution has the shape expected for an X-ray flux-limited sample with a broad optical magnitude range cut. In Fig. 6 we show the redshift distribution observed per square degree in the 2RXS and XMMSL2 catalogues for each classification: AGN and cluster. For XMMSL2, which has the brightest flux limit (log10 of the flux around −12), the number density per unit sky area increases and reaches its peak in the bin 0.1 < z < 0.2. For 2RXS, which has a fainter flux limit (log10 of the flux around −12.5), the peak in number density occurs in the bin 0.2 < z < 0.3. The observed distribution compares favourably with predictions from an adaptation of the mock catalogue of Comparat et al. (2019). To adapt the mock sample, we re-sample the X-ray fluxes and optical magnitudes to match the depths of the 2RXS catalogue and of the SDSS optical photometric survey. There is a discrepancy at low redshift: a deficit of AGNs in the observed sample compared to the mock. It is due to the bright magnitude and fibre magnitude cuts applied to the targeted sample; see Fig. 2. These cuts remove a part of the low redshift AGNs, but they are difficult to mock properly.
We complemented the SPIDERS-DR16 catalogue with a variety of multi-wavelength information: X-ray ([...]). Fig. 7 shows the W1 magnitude vs. the X-ray flux of the sources, adopting the same line that was suggested to be able to separate AGN and compact objects from stars. Figure 8 shows the SDSS g−r vs. r−i colours for all our SPIDERS sources. The vast majority of AGN cluster around a blue locus (g−r < 0.5 and r−i < 0.5). Sources classified as BLAZAR lie in the same blue locus. Some AGN are redder (obscured) and thus extend to the top right corner of the plot. The sequence of stars also appears clearly. Galaxies in clusters are mostly red and QSO in clusters are mostly blue. A consistent picture also emerges from the analysis of the WISE colour-colour diagrams (W1 − W2 vs. W2 − W3) shown in Fig. 9.
Fig. 7. W1 magnitude vs. X-ray flux for 2RXS sources without (grey) and with (coloured) spectroscopy, as labelled. The dashed line, taken from Salvato et al. (2018), defines the loci of AGN and compact objects (above) and stars (below). Note that here AGN contains both type 1 and type 2 (and candidates) objects.

X-ray stars
Visual screening of all spectra obtained in the SPIDERS programme and of those obtained during earlier phases of the SDSS programme and associated with 2RXS and XMMSL2 sources led to a separation of stellar objects from the large body of extra-galactic objects. The 2RXS and XMMSL2 catalogues list 290 and 37 stellar objects with attribute CLASS_BEST=="STAR", but only 278 and 27 when the criteria described in Sect. 2 are applied. The number 27 is further reduced to 16 when duplications with the 2RXS catalogue are removed.
Obtaining a spectrum of an object classified as "STAR" does not entail that the counterpart of the X-ray source has been identified; for this, a second X-ray identification screening step (XID) is needed. While the initial screening was undertaken by several individuals and a compromise had to be found in case of deviating results (classification, redshift), the XID screening step was performed by just one of the authors (AS), with the potential risk of introducing some biases or errors, but the potential advantage of a more homogeneous way of classifying stars. Screening for XID was done with the help of a few extra data products. These were: (a) an optical finding chart based on a PanSTARRS (Flewelling et al. 2016) g-band image (location of the X-ray centroid, the X-ray uncertainty, and the target indicated), (b) an X-ray to optical colour-colour diagram ($\log(f_{\rm X}/f_{\rm opt})$ vs. g − r), and (c) a long-term light curve obtained from the Catalina Real-Time Transient Survey (CRTS, Drake et al. 2009). For almost all targets, the "EXPLORE" feature of the SDSS-sciserver was used to search for possible other counterparts and to search for entries in the SIMBAD or NED databases.
Based on the available information, a first decision was made as to whether the object could be confirmed as a star. This first screening step was performed on the more general CLASS_BEST=="STAR" sample and led to a revision of a number of classifications that are documented in Table A.1. We corrected the incorrectly labelled source CLASS_BEST=="STAR" in Table A.2. Then a second decision about the reliability of the target being the counterpart of the X-ray source was made. An XID flag was assigned to each spectrum indicating this kind of reliability, ranging from XID=1 to XID=3. XID=1 means that the object is regarded as being the optical counterpart with high confidence. XID=2 means that the object could be the counterpart or at least could contribute to the observed X-ray emission. This often means that some typical ingredient or hallmark is missing or that the object seems to be blended or shows other morphological complexities. An XID=3 object is regarded as likely not being the counterpart of the X-ray source. Table A.1 contains the results and XID values for the objects classified as stars.
All stellar targets were sub-classified into three main classes: coronal emitters (including flare stars), white dwarfs (WD), and compact white-dwarf binaries, either in a detached or a semi-detached configuration. The latter are the cataclysmic binaries, where a white dwarf accretes matter from a main-sequence star via Roche-lobe overflow. The break-down of stars flagged XID=1 into those three main sub-classes for the 2RXS and the XMMSL2 samples is given in Table 6. In the star-related tables, we use the following acronyms to classify the sources:
- CV: cataclysmic variable with unknown sub-category
- CV/AM: cataclysmic variable of AM Herculis type
- CV/DN: cataclysmic variable of dwarf nova type
- WDMS: detached white dwarf/main sequence binary
- LARP: low accretion rate polar
- DB+M: a binary consisting of a white dwarf of spectral type DB and a companion star.

2RXS
The distribution of stellar spectra over the three XID bins (1/2/3) is (102/77/99). Among the 102 XID=1 sources from the 2RXS list, we find 67 single stars (coronal emitters and hot or sufficiently close white dwarfs) and 33 binaries with a compact object, most of them (29) being cataclysmic variables (CVs). Sample spectra of those typical X-ray emitters are displayed in Fig. 10. Interestingly, 75 of the 102 high-confidence (XID=1) counterparts have an NWAY p_any < 0.5, illustrating the fact that the Bayesian prior used in the X-ray to IR/optical association seems to disfavour true stellar X-ray emitters. For a stellar survey, a different prior is needed.
We list the reasons for an XID=2 classification over an XID=1: (1) the object appeared optically too faint for the given X-ray flux, (2) an M-star did not show any obvious sign of activity like Hα in emission or flares/flickering of the light curves, (3) large X-ray positional errors could cast doubt on the uniqueness of the identification, in particular if the object does not show strong signs of activity which, together with an atypical optical faintness, casts doubt on the reliability of the X-ray to optical association, (4) apparent binaries were found, so that the X-ray-WISE-SDSS association chain led to ambiguities (an unresolved double WISE counterpart to the X-ray source was associated with the wrong SDSS object), (5) the contribution of the WISE-blended source could not be quantified.
An example of such an XID=2 classification is J002317.1+191028 (7590-56944-674), an M-star that shows Hα in emission and a variable light curve and hence qualifies as an X-ray emitter, although it is found with an uncomfortably large $f_X/f_{\rm opt}$. We found that a QSO, SDSS J002319.72+190958.2, at redshift z = 1.504, lies at a similar distance from the X-ray position and could contribute to the X-ray flux or even dominate it. This object was thus put in the XID=2 bin because both objects could contribute to the X-ray emission.

Fig. 10. Sample spectra of XID=1 objects: a hot white dwarf, a flare star, a non-magnetic cataclysmic variable (dwarf nova), and a strongly magnetic cataclysmic variable (a polar or AM Herculis star). The PLATE-MJD-FIBERID combination and the type of X-ray emitter are indicated in the panels. All spectra were obtained in the current SDSS programme.
XID=3 sources were classified as such mainly for two reasons: (a) the targeted object was, with high confidence, too faint to be compatible with a stellar coronal emitter, meaning that its X-ray flux was too high or its optical brightness too low to be compatible with the maximum $\log(L_X/L_{\rm bol})$, which was assumed to be ≤ −3; (b) another, much more typical X-ray emitter was found near (often even closer to) the X-ray position (e.g. an A0 star, one of the least X-ray active stars, was targeted (3454-55003-211), but a white dwarf, SDSS J155108.25+454313.2, was found to lie closer to the X-ray position). Indeed, most of the discarded objects had QSOs, CVs or WDs as more likely counterparts. These more likely counterparts already had spectra taken by previous editions of SDSS, so in the SPIDERS programme, they were targeted as possible secondary sources to investigate their hierarchy.
A further two X-ray sources were associated with M-stars (spectra with PLATE-MJD-FIBERID 693-52254-0599 and 1046-52460-0078) but had unusually large X-ray positional uncertainties. Inspecting the area around the M-stars revealed many galaxies with concordant redshifts, that is, obvious clusters of galaxies with the BCG rather close to the targeted star. While the spectrum taken was clearly that of a star, the X-ray source was likely not point-like. While these two objects were the most pronounced cases and are for that reason discussed here separately, there are possibly more of this kind in the larger sample. As stated above, we give in Table A.2 the re-classification of these objects as a correction of the officially published catalogue.

Fig. 11. X-ray/optical colour-colour diagram highlighting the XID=1 objects on the background of all identified objects of the 2RXS sample. Hot white dwarfs, coronal emitters and close binaries are shown with blue, red and green symbols, respectively.
The distribution of the XID=1 objects in an X-ray/optical colour-colour diagram is shown in Fig. 11. The quantity plotted along the ordinate was computed as log(RXS_SRC_FLUX) + 0.4 × SDSS_MODELMAG_i + 5.61425. The optical colour g − r was built from the SDSS MODELMAG columns. The many objects in grey in the background are all identified objects in the catalogue (10 404). The white dwarfs stick out as extreme blue objects with a high X-ray to optical flux ratio. Many of the single stars are likely coronal emitters among late-type stars and are found as red objects with g − r ∼ 1.5.
The compact binaries appear on top of the abundant AGN with a median g − r ≈ 0.2 and a median $\log(f_X/f_{\rm opt})$ ≈ 0.9, but with a large dispersion in both quantities. Among the compact white dwarf binaries that are not CVs we find three objects that were previously classified as WDMS objects (detached white dwarf main sequence objects; Heller et al. 2009; Rebassa-Mansergas et al. 2012) and one magnetic pre-cataclysmic binary (a so-called LARP, low accretion rate polar; Schwope et al. 2002). The origin of their X-ray emission needs to be addressed separately, as does the extreme X-ray emission of a few of the apparently normal stars around g − r ∼ 0.7, $\log(f_X/f_{\rm opt})$ ∼ 0.5. Such a discussion, together with a more thorough presentation of the stellar content of the survey, is foreseen in a subsequent paper.
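The two axes of the colour-colour diagram can be reproduced directly from catalogue columns. The following is a minimal sketch, assuming the flux and magnitudes are passed in as plain numbers; the function names are hypothetical and only the formula for the ordinate is taken from the text.

```python
import math


def log_fx_fopt(rxs_src_flux, modelmag_i):
    """Ordinate of the colour-colour diagram.

    Implements log(RXS_SRC_FLUX) + 0.4 * SDSS_MODELMAG_i + 5.61425
    as quoted in the text; inputs are single catalogue values.
    """
    return math.log10(rxs_src_flux) + 0.4 * modelmag_i + 5.61425


def g_minus_r(modelmag_g, modelmag_r):
    """Abscissa: optical colour built from the SDSS MODELMAG columns."""
    return modelmag_g - modelmag_r


# Illustrative (made-up) values, not taken from the catalogue.
ordinate = log_fx_fopt(1e-12, 18.0)
colour = g_minus_r(19.3, 18.1)
```

Because the ordinate is linear in the i-band magnitude, a source that is one magnitude fainter at fixed X-ray flux moves up by 0.4 dex in this diagram.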

XMMSL2
For the SPIDERS-XMMSL2 stellar sources, the emerging picture is slightly different. We find 19/2/6 objects in the XID=1/2/3 bins, a much higher fraction of XID=1 candidates than in 2RXS. Among the 37 objects with CLASS_BEST=="STAR" we reclassify two as blazars (still XID=1, although not stars; 4385-55752-614, 8172-57423-839) and one further object, following the arguments given above, as a likely cluster of galaxies, which thus becomes an XID=3 object. In this case, XMMSL2 J113224.0+555745 (8170-57131-926), the BCG of the cluster lies even closer to the X-ray position than the M-star whose spectrum was taken. Other objects classified as XID=3 were F, K or M stars which appeared far too faint given the measured X-ray flux.
Among the XID=1 sources, we find 16 CVs and only three late-type coronal emitters (M5, M6). Interestingly, the majority (11 out of 19) of the XID=1 sources of the SPIDERS-XMMSL2 sample have a likelihood of any association p_any > 0.5. This confirms that a reliable X-ray positional error is key to obtaining accurate counterparts. To resolve the ambiguities mentioned in this section, it would appear advisable to additionally visualise X-ray contours on the optical (or infrared) finding charts, instead of just using coordinates.

AGN spectral properties
A detailed discussion of the optical spectral properties of the SPIDERS sample is beyond the scope of this paper. We refer the reader to Coffey et al. (2019) and Wolf et al. (2020) for an exploration of the detailed properties of SPIDERS type 1 AGN with sufficient signal-to-noise ratio in individual spectra. Wolf et al. (2020) investigated the markers of optical diversity of type 1 AGN by deriving the principal components of optical and X-ray features for a sample of sources identified in SDSS-IV/SPIDERS and compiled by Coffey et al. (2019). Making use of the large redshift and luminosity ranges probed by the SPIDERS sample, they could confirm that the broad Hβ line shape evolves significantly along the main sequence of broad line AGN (for a review see Marziani et al. 2018). Wolf et al. (2020) report that the scaling of the FeII and the continuum emission strengths depends strongly on the sign of the asymmetry of Hβ. The effect is discussed in the light of Broad Line Region outflows.
Instead, we present here a description of the general features of the sample. One benefit of having a large number of spectra is that similar objects can be stacked to increase the signal-to-noise ratio per pixel and possibly unveil new features in the spectra (e.g. Zhu et al. 2015). In the following, we stack SPIDERS-DR16 spectra to create templates for generic usage, for example, exposure time calculation for spectroscopy, redshift fitting re-simulation, etc. The stacks are made publicly available (see footnote 8).
On average, the signal-to-noise ratio per pixel grows with the number of spectra stacked together as follows:

$$\log_{10}(S/N\ \mathrm{per\ pixel}) = 0.45\,\left[1 + \log_{10}(N_{\rm spec}\ \mathrm{per\ pixel})\right]. \qquad (13)$$

The median signal-to-noise ratio per pixel in the observed spectra is $10^{0.45} \approx 2.82$. By stacking 3000 (1000) spectra one reaches a signal-to-noise ratio on the order of 100 (60).
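The scaling of Eq. (13) is easy to evaluate and to invert, for instance to estimate how many spectra a stack needs for a target signal-to-noise ratio. The sketch below assumes only the relation quoted above; the function names are hypothetical.

```python
import math


def stacked_snr(n_spec, slope=0.45):
    """S/N per pixel of a stack of n_spec spectra, following Eq. (13)."""
    return 10.0 ** (slope * (1.0 + math.log10(n_spec)))


def spectra_needed(target_snr, slope=0.45):
    """Inverse of Eq. (13): number of spectra for a target S/N per pixel."""
    return 10.0 ** (math.log10(target_snr) / slope - 1.0)


snr_single = stacked_snr(1)    # about 2.8, the median single-spectrum value
snr_1000 = stacked_snr(1000)   # about 63
snr_3000 = stacked_snr(3000)   # about 103
```

With a slope of 0.45 the gain is slightly slower than the square-root growth expected for independent, identically distributed noise, so doubling the number of spectra raises the signal-to-noise ratio by a factor of about $2^{0.45} \approx 1.37$.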

Spectral stacking method
First, we translated each observed spectrum to its rest frame, $\lambda_{\rm RF} = \lambda/(1 + z)$. Then we interpolated each spectrum and its uncertainties on a fixed wavelength grid in $\log_{10}$ wavelength between 800 Å and 11 000 Å with $\Delta \log_{10}(\lambda) = 0.0001$ using spectres (Carnall 2017). Finally, we took the median value of all fluxes in each pixel to obtain a stacked spectrum on this wavelength grid. We estimated the uncertainty on the median flux with a jackknife procedure. We note that a normalisation (or a weight, e.g. a luminosity function completeness weight) can be applied to each spectrum, but this feature was not used here. This stacking procedure was previously applied in Zhu et al. (2015), Raichoor et al. (2017), Huang et al. (2019), and Zhang et al. (2019) to stack spectra of star-forming galaxies. It is also used to stack the spectra of passive galaxies observed in the SPIDERS-CLUSTERS programme. These are presented in Clerc et al. (in prep.). Since the redshift accuracy of AGN is lower than that of star-forming galaxies (with narrow lines), some information spanning the width of a few pixels is washed out in the stacks; the broad features remain. We chose a redshift bin with width 0.2 (or 0.5) and slid the redshift window by 0.1 to obtain a consistent evolution between the stacks. If more than 100 spectra were available in a bin, then we computed the stack.
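The steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the pipeline used for the published stacks: linear interpolation (`np.interp`) stands in for the flux-conserving resampling of spectres, the jackknife is assumed to be of the delete-one kind, and all function names are hypothetical.

```python
import warnings

import numpy as np


def rest_frame_grid(lam_min=800.0, lam_max=11000.0, dloglam=1e-4):
    """Wavelength grid in Angstrom, uniform in log10(lambda)."""
    n_pix = int(np.floor(np.log10(lam_max / lam_min) / dloglam)) + 1
    return lam_min * 10.0 ** (dloglam * np.arange(n_pix))


def sliding_redshift_bins(z_min, z_max, width=0.2, step=0.1):
    """Overlapping (low, high) redshift windows of fixed width."""
    n_bins = int(np.floor((z_max - z_min - width) / step + 1e-9)) + 1
    return [(z_min + i * step, z_min + i * step + width)
            for i in range(max(n_bins, 0))]


def median_stack(wavelengths, fluxes, redshifts, grid, min_spectra=100):
    """Median stack and jackknife error; None if the bin is too small."""
    n = len(fluxes)
    if n <= min_spectra:  # the text requires more than 100 spectra per bin
        return None
    resampled = np.empty((n, grid.size))
    for i, (lam, flux, z) in enumerate(zip(wavelengths, fluxes, redshifts)):
        lam_rf = np.asarray(lam) / (1.0 + z)  # observed -> rest frame
        # Stand-in for spectres; assumes increasing wavelengths and
        # returns NaN outside the coverage of the spectrum.
        resampled[i] = np.interp(grid, lam_rf, flux, left=np.nan, right=np.nan)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)  # all-NaN pixels
        stack = np.nanmedian(resampled, axis=0)
        # Delete-one jackknife: median recomputed with each spectrum left out.
        jk = np.array([np.nanmedian(np.delete(resampled, i, axis=0), axis=0)
                       for i in range(n)])
        spread = np.nansum((jk - np.nanmean(jk, axis=0)) ** 2, axis=0)
    # Simplification: the prefactor uses the total number of spectra,
    # not the number that actually covers each pixel.
    err = np.sqrt((n - 1) / n * spread)
    err[np.isnan(stack)] = np.nan
    return stack, err
```

The jackknife loop recomputes the median once per spectrum, so its cost grows quadratically with the number of spectra; for bins with thousands of spectra a block jackknife over groups of spectra would be a cheaper variant of the same idea.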

Type 1 AGN
We selected type 1 AGN spectra in the 2RXS sample. There are enough spectra for the stacks to cover up to redshift 2.5. Figure 12 shows the stacks obtained on a rest-frame wavelength axis in the $f_\lambda$ convention. The stacks obtained are consistent with the findings of Vanden Berk et al. (2001).
We zoom in on the second and the last spectra in Figs. A.1 and A.2 to show the variety of features detected. We compare them to the SDSS DR5 spectral templates of the QSO (DR5 29) and of the luminous QSO (DR5 32) (Adelman-McCarthy et al. 2007). Emission line features are more marked (higher equivalent widths) in the SPIDERS templates.

Type 2 AGN
In SPIDERS-DR16, the sample of type 2 AGN is large enough and the spectroscopic data are homogeneous enough that we can create stacks up to redshift 0.7. We were previously lacking such stacks due to a smaller number of spectra or less homogeneous observations (exposure time, different instruments), which made the stacking procedure tedious. Figure 13 shows the stacks obtained. There, Hα seems to be somewhat broad, meaning that the type 2 classification is not perfect. Figure 14 shows the stacks of sources that are in the vicinity of optically detected clusters. The stacks show that we can separate (on average) the two populations of active galaxies in clusters and passive galaxies in clusters. The bottom panel is accompanied by the stack of passive galaxies in clusters from the SPIDERS-CLUSTER observations with their evolution as a function of cluster-centric radius; see Clerc et al. (in prep.) for full details. The stack of galaxies in clusters "contaminating" the AGN sample looks exactly like stacks of passive galaxies found to be cluster members.

Black hole mass and Eddington ratio
The FWHM of Hβ frequently serves as a virial broadening estimator and is used to estimate black hole masses (e.g. Trakhtenbrot & Netzer 2012; Mejia-Restrepo et al. 2016). The flux ratio $r_{\rm FeII} = F({\rm FeII})/F({\rm H}\beta)$ is known to correlate with the Eddington ratio (Grupe et al. 1999; Marziani et al. 2001; Du et al. 2016). These two parameters were initially among the main correlates of the original Eigenvector 1 (EV1), that is, the vector through optical and X-ray parameter space that spans the most total variance (Boroson & Green 1992). The plane spanned by FWHM$_{{\rm H}\beta}$ and $r_{\rm FeII}$ is known as the EV1 plane. The distribution of type 1 AGN in this plane has been identified as the main sequence of broad line AGN (e.g. Marziani et al. 2018, and references therein) and has proven of great use in the characterisation of the optical diversity of these sources. The stacking method described in this work can be applied in this context by using the binning of the Eigenvector 1 plane proposed by Sulentic et al. (2002). Sulentic et al. (2002) as well as Zamfir et al. (2010) have computed median composite spectra to investigate the evolution of the broad Hβ line shape along the EV1 sequence. The large number of sources available from the SPIDERS programme can be used similarly to uncover the dominating trends in the Balmer line diagnostics with increasing black hole mass and increasing Eddington ratio. In order to demonstrate the high S/N achieved with our stacks, we made use of the DR16 update of the SDSS-IV/SPIDERS type 1 AGN catalogue compiled by Coffey et al. (2019). FWHM$_{{\rm H}\beta}$ and $r_{\rm FeII}$ are listed as derived parameters in the catalogue from Coffey et al. (2019) and we identified sources in the following bins:
- A1: 0 km s$^{-1}$ < FWHM$_{{\rm H}\beta}$ < 4000 km s$^{-1}$ and 0 < $r_{\rm FeII}$ < 0.5
- B1: 4000 km s$^{-1}$ < FWHM$_{{\rm H}\beta}$ < 8000 km s$^{-1}$ and 0 < $r_{\rm FeII}$ < 0.5
- B1+: 8000 km s$^{-1}$ < FWHM$_{{\rm H}\beta}$ < 12 000 km s$^{-1}$ and 0 < $r_{\rm FeII}$ < 0.5
The spectra of these sources were stacked following the method described in Sect. 5.1.
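The bin assignment can be written as a small lookup. The sketch below is illustrative only: the function name is hypothetical, the B1+ lower edge is taken as 8000 km/s so that it is contiguous with B1, and the strict inequalities of the bin definitions are kept, so a source lying exactly on an edge is left unassigned.

```python
def ev1_bin(fwhm_hbeta, r_feii):
    """Return the EV1 bin label ("A1", "B1", "B1+") or None.

    fwhm_hbeta: FWHM of broad Hbeta in km/s.
    r_feii: flux ratio F(FeII) / F(Hbeta).
    """
    if not 0.0 < r_feii < 0.5:
        return None  # only the lowest r_FeII column is used here
    edges = (("A1", 0.0, 4000.0), ("B1", 4000.0, 8000.0), ("B1+", 8000.0, 12000.0))
    for label, low, high in edges:
        if low < fwhm_hbeta < high:
            return label
    return None


# Illustrative (made-up) sources: (FWHM in km/s, r_FeII).
labels = [ev1_bin(f, r) for f, r in [(2500.0, 0.3), (6000.0, 0.1), (9500.0, 0.4)]]
```

Grouping catalogue rows by this label gives the three lists of spectra that enter the stacks.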
Figure 15 zooms in on the Hβ line in these stacks. To guide the eye, we overplot the location of emission lines (Vanden Berk et al. 2001). For increasing FWHM of Hβ one can clearly see the gradual appearance of a distinct, very broad, slightly redshifted component in the stacked Hβ, confirming the results by Sulentic et al. (2002) and Zamfir et al. (2010). Finer bins in the EV1 plane or further key optical parameter planes will allow us to probe the physics and geometry of the Broad Line Region in future work.

Conclusions and outlook
In this work, we present the contents of the optical spectroscopic catalogue associated with X-ray point-like sources in the SPIDERS survey, published as part of the SDSS DR16. The systematic, highly complete follow-up programme assembled within four generations of SDSS delivers the largest spectroscopic redshift sample of an X-ray survey to date and represents a test-bed for a large programme of identification for large X-ray surveys in the future, especially with regard to the upcoming eROSITA all-sky survey.
The combination of wide-area X-ray surveys with optical spectroscopy enables a large number of unique scientific applications. As a further example, we discuss below a possible application for cosmology, following the works of Risaliti & Lusso (2015) and Lusso & Risaliti (2017).
Future AGN spectroscopic surveys following X-ray selected AGNs
The SRG eROSITA full-sky scans will provide a large number of targets for spectroscopic observation (Merloni et al. 2012; Predehl et al. 2016). SDSS-IV SPIDERS has demonstrated its ability to observe AGN with high completeness and to unambiguously classify the X-ray sources. For the eROSITA eRASS8 survey, with a logarithmic flux limit around −14, the peak of the number density should be around z ∼ 1 (Merloni et al. 2012; Comparat et al. 2019), compared to 0.1-0.2 in 2RXS and XMMSL2. This justifies the need for a larger spectroscopic infrastructure to achieve completeness. The next X-ray observation programme lined up is a transition programme linking SDSS-IV and SDSS-V. This programme is named eFEDS and will consist of 12 dedicated eROSITA plates covering 60 deg$^2$ within the footprint of the eROSITA Performance Verification programme. The data will be released as part of the next SDSS Data Release.

Fig. 14. Spectral stacks as a function of redshift for objects classified as galaxy in cluster. Top panel: stack of active galaxies in clusters and bottom panel: stack of passive galaxies in clusters. The bottom panel is accompanied by the stack of passive galaxies in clusters from the SPIDERS-CLUSTER observations with their evolution as a function of cluster-centric radius, see Clerc et al. (in prep.) for full details. Vertical displacements between spectra are added for clarity.
Later in September 2020, following the completion of the first full-sky scan, SDSS-V (Kollmeier et al. 2017), with its telescopes located in both hemispheres, will optimally observe the bright half of the sources. A couple of years later, using a deeper four-year full-sky scan, 4MOST will observe the fainter half of the eROSITA sources.
In the longer term, the Athena (Nandra et al. 2013) observatory will be well matched to the capabilities of the [...]

Recently, Risaliti & Lusso (2015) and Lusso & Risaliti (2017) proposed a method to construct quasar standard candles. It relies on the fact that there exists an intrinsic non-linear relation between the UV emission from the accretion disc and the X-ray emission from the surrounding corona of the AGN. This relation between X-ray luminosity and UV luminosity has been observed (Lusso et al. 2010). Our current understanding of the disc-corona system and of its non-linear scaling between UV and X-ray luminosity is not yet sufficient to prove this method in detail. In the literature, there is skepticism about whether the physical disc-corona model from Lusso & Risaliti (2017) can account for this relation. On the one hand, Kubota & Done (2018) and Panda et al. (2019) propose a model that is in agreement with the $L_X$-$L_{\rm UV}$ relation from Lusso. On the other hand, after exploring the physics of the disc and the corona within a radiatively efficient AGN model, Arcodia et al. (2019) could not find a satisfactory explanation for the tight relation observed. Some authors find $\alpha_{\rm OX}$ to be correlated with the Eddington ratio (Lusso et al. 2010), some do not (e.g. Vasudevan & Fabian 2009), and others find a correlation with black hole mass. This point is therefore yet to be entirely proven. SDSS-IV SPIDERS has demonstrated our ability to observe AGN with high completeness and to unambiguously classify the X-ray sources, as required by this method. Additionally, Coffey et al. (2019) showed our ability to accurately determine the relevant spectral features for such an analysis. A cosmological analysis of the SPIDERS 2RXS sample is limited by the depth of the X-ray data, as the X-ray properties of the AGN are not determined well enough, which impedes the best selection of type 1 AGN standard candles.
In the near future, eROSITA will provide the necessary high quality X-ray data, and we estimate below the possibility of cosmological constraints via this method using the eROSITA mock catalogue produced by Comparat et al. (2019). For all type 1 mock AGN, we simulate the quasar UV-X-ray relationship and derive distance modulus estimates following the Risaliti & Lusso (2015) method. The resulting quasar Hubble diagram is then fit using a standard ΛCDM cosmological model to place constraints on $\Omega_M$ and $\Omega_\Lambda$. We use only sources for which 4MOST will obtain optical spectra at a signal-to-noise ratio greater than or equal to 10. Among these sources, we assume that ∼10% will have reliable measurements of both the UV and X-ray flux densities (a conservative assumption). We find a best-fit cosmology compatible with the input cosmology of the simulations. The uncertainty obtained is 5% on $\Omega_M$ and 10% on $\Omega_\Lambda$. For comparison, Risaliti & Lusso (2015) with current samples constrained $\Omega_M$ and $\Omega_\Lambda$ to the ∼40% and 27% level, while the Union 2.1 supernovae sample from Suzuki et al. (2012) constrained them to the 14% and 11% level.
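The step from fluxes to distances follows from simple algebra. If $\log L_X = \gamma \log L_{\rm UV} + \beta$ and $L = 4\pi D_L^2 F$ for both bands, then $\log D_L = [\log F_X - \gamma \log F_{\rm UV} - \beta] / [2(\gamma - 1)] - \frac{1}{2}\log(4\pi)$. The sketch below illustrates this inversion together with a ΛCDM luminosity distance; it is not the forecast code, the slope and intercept used in any call are placeholders chosen by the user, and all function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458                 # speed of light [km/s]
CM_PER_MPC = 3.0856775814913673e24  # one megaparsec in cm


def luminosity_distance_mpc(z, omega_m=0.3, omega_l=0.7, h0=70.0, n_step=4096):
    """Luminosity distance [Mpc] in a LCDM model, curvature allowed."""
    omega_k = 1.0 - omega_m - omega_l
    zz = np.linspace(0.0, z, n_step)
    inv_e = 1.0 / np.sqrt(omega_m * (1 + zz) ** 3 + omega_k * (1 + zz) ** 2 + omega_l)
    d_c = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))  # trapezoid rule
    if omega_k > 1e-8:       # open universe
        d_m = np.sinh(np.sqrt(omega_k) * d_c) / np.sqrt(omega_k)
    elif omega_k < -1e-8:    # closed universe
        d_m = np.sin(np.sqrt(-omega_k) * d_c) / np.sqrt(-omega_k)
    else:                    # flat universe
        d_m = d_c
    return (1.0 + z) * C_KM_S / h0 * d_m


def log_dl_cm(log_fx, log_fuv, gamma, beta):
    """log10(D_L / cm) from log fluxes, inverting log L_X = gamma log L_UV + beta."""
    return (log_fx - gamma * log_fuv - beta) / (2.0 * (gamma - 1.0)) \
        - 0.5 * np.log10(4.0 * np.pi)


def distance_modulus(log_dl_in_cm):
    """Distance modulus mu = 5 log10(D_L / Mpc) + 25."""
    return 5.0 * (log_dl_in_cm - np.log10(CM_PER_MPC)) + 25.0
```

Because the inversion divides by $2(\gamma - 1)$, any scatter in the flux relation is amplified in $\log D_L$ as the slope approaches unity, which is one reason why large samples with well-measured fluxes are needed to populate the Hubble diagram.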
Given the large number of type 1 AGN to be detected by eROSITA, which will then be observed with optical spectroscopy, the combination of eROSITA with SDSS-V and 4MOST should be able to establish whether the method is correct. If the method is proven right, it would produce competitive and independent constraints on cosmological parameters. More accurate forecasts, in which the photometry and the spectroscopy are simulated jointly, based on the stacks presented here, to populate the Hubble diagram, are foreseen in upcoming studies (PhD Thesis of Coffey, in prep.).