Open Access
Issue
A&A
Volume 686, June 2024
Article Number A170
Number of page(s) 54
Section Catalogs and data
DOI https://doi.org/10.1051/0004-6361/202346730
Published online 12 June 2024

© The Authors 2024

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1 Introduction

Modern astronomical sky surveys are intended to be useful for a wide array of science objectives, extending beyond both the remit of their primary science goals and the lifetime of the survey’s operation. Such extended science use is typically referred to as the ‘legacy’ value of a survey. Given the ever-increasing monetary and temporal costs required to undertake large programmes at state-of-the-art observatories, the allocation of large programmes is increasingly contingent on the inclusion of planning for legacy science.

In order to take advantage of the full legacy value of any modern astronomical dataset, however, said dataset ought to follow the ‘FAIR’ principle: being Findable, Accessible, Interoperable (i.e. able to be easily combined with other data), and Reusable. Conformity with these principles is generally achieved by the storage and release of the data in an easily interfaceable system, with relevant accompanying documentation, such as in the European Southern Observatory (ESO) archive.

While all aspects of the FAIR principle are important, the ‘reusable’ principle is particularly relevant here. This manuscript details the fifth data release of the Kilo Degree Survey (KiDS; de Jong et al. 2013), with a particular focus on ensuring that the legacy value of the KiDS dataset is preserved. This includes the survey’s on-sky overlap with the VISTA Kilo-degree Infrared Galaxy Survey (VIKING; Edge et al. 2013) and coverage of wide and deep spectroscopic survey fields.

KiDS is most notable as a so-called stage-III imaging survey for cosmology (using nomenclature defined in the Dark Energy Task Force white paper; Albrecht et al. 2006). There are three principal stage-III imaging surveys for cosmology: KiDS, the Dark Energy Survey (DES; Sevilla-Noarbe et al. 2021), and the Hyper-Suprime Camera (HSC) survey (Aihara et al. 2018). These surveys predominantly differ in their complementary combinations of imaging depth, survey area, and wavelength coverage. This is illustrated in Fig. 1: the HSC survey is the deepest of the stage-III surveys, DES has the largest sky coverage, and KiDS spans the greatest range in wavelength.

The stage-III surveys are directly comparable to earlier stage-II surveys, such as the Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS; Heymans et al. 2012) and the Deep Lens Survey (DLS; Jee et al. 2016), which are notable due to their limited area but impressive photometric depth. Finally, the next generation of cosmological imaging surveys, stage-IV, stand out due to their combination of large areas and exceptional depths: Euclid (Laureijs et al. 2011; Euclid Collaboration 2022), the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST; Ivezić et al. 2019; The LSST Dark Energy Science Collaboration 2018), the Chinese Space Station Optical Survey (CSS-OS; Gong et al. 2019), and the Nancy G. Roman Telescope High Latitude Survey (Roman; Eifler et al. 2021). As with the stage-III surveys, each of these stage-IV surveys fills a particular niche combination of depth, resolution, wavelength coverage, and area on the sky. Figure 1 also includes the ground-based partner survey to Euclid, the Ultraviolet Near-Infrared Optical Northern Survey (UNIONS; Guinot et al. 2022), which has the capability to become a stage-III-like weak lensing survey in its own right.

As the final data production manuscript for the KiDS and VIKING surveys, this paper is designed as a one-stop reference for researchers and students who wish to use this dataset for their science. The manuscript therefore includes a full description of the production processes for all data products included in the fifth data release (DR5), including documentation of the input data, the analysis methods and settings, and the final data products.

The manuscript is arranged approximately in the order that the production itself takes place, which we describe in Sect. 2. Section 3 presents details of the optical reduction and photometric processing. Section 4 presents details of the near-infrared (NIR) data reduction. Section 5 details the compilation of spectroscopic data in the KiDS DR5 fields. Section 6 presents details of the NIR photometric processing, the combination into the full multi-wavelength dataset, and the computation of photometric redshifts. Section 7 describes the shape measurements for weak gravitational lensing that are contained in the dataset, which will be used for cosmic shear analyses with the final KiDS DR5 weak-lensing sample, named the KiDS-Legacy sample. Finally, Sect. 8 outlines the contents of the full data release, detailing catalogues and imaging, important practical information for access and catalogue use, and summary statistics for the full dataset.

Fig. 1

Summary statistics for stage-II (Heymans et al. 2012; Jee et al. 2016), stage-III (this work; Sevilla-Noarbe et al. 2021; Hikage et al. 2018; Guinot et al. 2022), and stage-IV (Euclid Collaboration 2022; The LSST Dark Energy Science Collaboration 2018; Gong et al. 2019; Eifler et al. 2021) cosmological imaging surveys, comparing the survey area and effective number density of sources for weak lensing analyses (square markers). The red curves show the total effective number of sources, which principally determines the cosmological constraining power. The colour bars indicate each survey’s wavelength coverage, which principally determines the photometric redshift quality of a survey. The size of the circular point within each survey’s square marker shows the typical seeing in the lensing band, which principally determines the accuracy of shape measurements. The statistics presented for future surveys are forecasts, as indicated in pink. The final DES-Y6 data release will increase in depth, relative to the DES-Y3 statistics shown, raising the final effective number density of sources.

2 Production workflow and updates

To assist with the understanding of the production process (and with the navigation of this paper) we start by providing an overview of the production workflow, and direct readers to the relevant sections of the paper that describe the workflow stages. Details of surveys, technical concepts, algorithms, and other associated jargon are provided in the relevant sections, as are the appropriate references to the literature.

The fifth data release of KiDS (DR5) includes a collection of new observations, previously released data that have been re-reduced, and previously released data that are unchanged. As such, this release supersedes the previous releases of KiDS; the fourth data release (DR4) is not simply a subset of DR5. Of particular note in this release is the inclusion of an additional 30% of survey area, a complete second-pass of i-band observations, and 25 square degrees of imaging data over deep spectroscopic-survey fields (which are useful for photometric redshift calibration, and so are given the label ‘KiDZ’). Each of these additions has particular importance for scientific analyses with KiDS, as discussed in the following sections.

Figure 2 shows the updated data-production flowchart for DR5. The flowchart represents a slight development of a similar chart for DR4 presented in Fig. 1 of Kuijken et al. (2019), with added items for the processing of the KiDZ data, the NIR-reduction pipeline (KV PIPE), the unified multi-band photometric processing pipeline (PHOTOPIPE), and the final mosaic and compilation catalogue construction. Raw and input data products to the data-production pipeline are shown in yellow (accessible via e.g. the ESO archive), calibrated and released imaging data products are shown in green (accessible via the ESO archive and KiDS collaboration web pages), released per-tile catalogue data products are shown in pink (accessible via the ESO archive), and released mosaic data products are shown in blue (accessible via the KiDS collaboration web pages). The flowchart is annotated with the sections of the manuscript containing the description (and relevant changes with respect to previous work) related to each stage. We also outline the sections here for ease of reference.

Our production workflow starts, naturally, with raw observations made with the Very Large Telescope (VLT) Survey Telescope (VST; Sect. 3.1). Observations are then transferred to two separate reduction pipelines: the full set of photometric bands is sent to the ASTROWISE pipeline, and the r-band data are also sent to the THELI-Lens (for KiDS observations) or THELI-Phot (for KiDZ) pipelines. The THELI-Lens (Sect. 3.2) and THELI-Phot (Sect. 3.3) pipelines are responsible for producing the imaging (and associated products) that are used for source detection in the KiDS and KiDZ fields, respectively, while the THELI-Lens pipeline also produces calibrated individual detector images that are used by our fiducial shape measurement code lensfit (Sect. 7).

Within ASTROWISE (Sect. 3.5) we perform four main tasks: the primary KiDS and KiDZ source detection (Sect. 3.4), production of co-added images, weight-maps, and masks in each of the optical filters (Sect. 3.5), extraction of individual per-filter source catalogues (for stand-alone ‘single-band catalogues’, Sect. 3.5.5), and measurements of forced photometry using the Gaussian Aperture and PSF (GAAP) code on the optical images (Sect. 3.6). For the primary source extraction, THELI-Lens and THELI-Phot r-band co-adds are ingested into ASTROWISE, and extraction is made on these images using SOURCE EXTRACTOR (Sect. 3.4). These catalogues form the basis of essentially all subsequent catalogues used in KiDS and KiDZ. For some use cases, however, it may be desirable to have individual source extractions in each of the optical filters (for example, for the detection of r-band drop-out sources). As such, the single-band catalogues in each of the optical bands are also an important output of the reduction process. After the definition of the source lists, however, ASTROWISE falls back to the use of its own optical imaging for photometric measurements.

After the measurement of optical forced aperture photometry, the optical GAAP catalogues are passed to our unified multi-band photometry and photo-z measurement pipeline PHOTOPIPE (Sect. 6.1). This pipeline combines the optical data with the NIR images from the VIKING survey, starting with the ‘paw-prints’ provided by the Cambridge Astronomical Survey Unit (CASU). For all sources in KiDS and KiDZ, PHOTOPIPE performs the forced photometry of the NIR bands (Sect. 6.2), construction of NIR mosaics, combination of optical and NIR photometric catalogues, estimation of photometric redshifts (Sect. 6.3), and construction of combined multi-band masks (Sect. 6.4). These catalogues are the primary data product that is released to ESO (Sect. 8). For weak-lensing applications, however, we require robust shape measurements: for tiles in the main KiDS footprint, these are performed with the lensfit algorithm (Sect. 7).

Combined ‘mosaic catalogues’ are then prepared for both the KiDS and KiDZ datasets. These are the primary science catalogues that are used by the majority of the cosmology team in KiDS. In the KiDS (i.e. main survey) areas, these catalogues are constructed by combining all per-pointing catalogues after source selection based on masks, shape measurements, photometry, and more (Sect. 7.2). In the KiDZ fields, a final ‘calibration catalogue’ is constructed by combining all catalogues from the KiDZ fields and matching them to a master spectroscopic compilation (Sect. 5). These mosaic and calibration catalogues are released to the community via the KiDS database after the conclusion of the primary cosmological analyses.

Finally, for convenience of reference, we also provide here a summary of the data release in terms of imaging quality and other relevant statistics (Table 1).

Fig. 2

Chart showing the data processing path that is taken for the KiDS and KiDZ data in DR5, from optical and NIR imaging (top) through to final mosaic catalogues (bottom). Each step outlined in this graph is discussed in the correspondingly annotated section. Yellow boxes show raw data products and data products input into our reduction pipelines. Green boxes contain imaging data products released as part of this data release. Pink boxes contain per-tile catalogue-level data products released as part of this data release. Blue boxes contain mosaic catalogue-level data products released as part of this data release.

Table 1

Summary of imaging data released in KiDS-DR5.

3 Optical dataset and reduction

In this section we detail the optical portion of the KiDS DR5 dataset, its reduction, and its differences with the equivalent data released in DR4. Of particular note: new data have been observed with the VST (Sect. 3.1), THELI-Lens data have been fully re-reduced (Sect. 3.2), new KiDZ data have been reduced with the THELI-Phot pipeline (Sect. 3.3), source detection has been recomputed for the full dataset (Sect. 3.4), ASTROWISE data have been partially re-reduced (Sect. 3.5), and multi-band GAAP photometry has been re-extracted and the post-processing updated (Sect. 3.6). As a result, despite DR5 representing largely the union of existing (from Kuijken et al. 2019) and new ASTROWISE imaging, it is unlikely that source-lists, photometry, and/or higher-level data products will be identical (when comparing DR4 to DR5) for any individual source, population, or tile.

3.1 VST observations

The VST (Capaccioli et al. 2005) is a 2.6 m modified Ritchey-Chrétien telescope on an alt-az mount, located at the ESO’s Cerro Paranal observatory in Chile. The VST hosts a single instrument at its Cassegrain focus: the 300-megapixel OmegaCAM CCD mosaic imager (Kuijken 2011). The 32 thinned CCDs that make up the ‘science array’ of the camera are 4102 × 2048-pixel e2v 44-82 devices, which uniformly sample the focal plane at a scale of 0″.214 per 15 µm pixel and have very narrow chip-gaps (25″ and 85″). The telescope and camera were co-designed for optimal image quality, and as a result they are capable of producing imaging with a point-spread function (PSF) that is quite round and shows little variation across the focal plane: seeing ellipticities (defined as e = 1 − b/a, where a and b are the major and minor axes of the PSF, respectively) across all KiDS VST observations are consistently less than 〈e〉 = 0.08 per exposure.

Observations with the VST are split into observing blocks (OBs), which each target a single tile with dithered observations using a particular filter, and which were queued in order to optimise the use of the VST (as it simultaneously carried out a suite of surveys). Observations for KiDS and KiDZ made with the VST were performed under the programmes listed in Table 2. Each tile was observed in four filters (u𝑔ri), with band-passes very similar to the u𝑔ri filters from the Sloan Digital Sky Survey (SDSS; York et al. 2000), and with various requirements regarding observational conditions. Table 3 summarises the observing constraints, number of OBs, number of dithers per OB, and total exposure time for each filter. The r-band observations were made under only the best atmospheric conditions (seeing full-width at half-maximum, FWHM, of less than 0″.8) during dark-time (maximum lunar illumination of 40%). The 𝑔- and u-band observations similarly required dark-time, but with less stringent seeing requirements: allowing maximal seeing of FWHM ≤ 0″.9 and FWHM ≤ 1″.1, respectively. The i-band observations were required to be performed under the same seeing conditions as the u-band (FWHM ≤ 1″.1), but, unlike the other bands, they could be obtained in grey or bright-time (i.e. with no restriction on maximum lunar illumination).
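As a compact restatement of these constraints, the following sketch encodes the per-filter limits described above and returns the filters that could be observed under given conditions. It is illustrative only and is not the VST queue scheduler: the names `Constraint`, `CONSTRAINTS`, and `allowed_filters` are hypothetical, the 40% lunar illumination limit quoted for the r-band is assumed to define dark-time for the u- and 𝑔-bands as well, and the strict r-band seeing limit is treated as inclusive for simplicity. Table 3 remains the reference for the actual values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Constraint:
    max_seeing_arcsec: float         # upper limit on the seeing FWHM
    max_moon_illum: Optional[float]  # fractional lunar illumination; None = unrestricted


# Values transcribed from the text; the dark-time threshold for u and g is an assumption.
CONSTRAINTS = {
    "u": Constraint(1.1, 0.40),
    "g": Constraint(0.9, 0.40),
    "r": Constraint(0.8, 0.40),
    "i": Constraint(1.1, None),
}


def allowed_filters(seeing_arcsec: float, moon_illum: float) -> List[str]:
    """Return the filters whose seeing and lunar constraints are both satisfied."""
    allowed = []
    for band, limit in CONSTRAINTS.items():
        seeing_ok = seeing_arcsec <= limit.max_seeing_arcsec
        moon_ok = limit.max_moon_illum is None or moon_illum <= limit.max_moon_illum
        if seeing_ok and moon_ok:
            allowed.append(band)
    return allowed
```

Under these assumptions, a bright night with 1″.0 seeing admits only the i-band, which is consistent with the faster progress of the i-band observations described in Sect. 3.1.1.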

The observing strategy for KiDS and KiDZ can best be described as ‘depth first, area second’: each OB immediately achieves the full depth in a given filter on a tile, and that tile is not revisited with that filter unless the observations subsequently fail quality control. The one exception to this rule is the second i-band pass, which is revisited by design (see Sect. 3.1.1). There is no requirement that the observations with a different filter on a given tile be taken within a certain time period, although a bias towards observing partially completed tiles was built into the observing scheduler.

In November 2020, towards the end of the survey data taking and after the VST was recommissioned following an extended shutdown during the COVID-19 pandemic, one of the 32 CCDs in the OmegaCAM mosaic was found to have failed. This resulted in a rectangular gap near the centre of the mosaic for all observations taken after this date. This defect affects mostly the second i-band pass of the KiDZ fields, which were almost exclusively observed after this date (Sect. 3.1.2).

Table 2

Run numbers of all VST observations taken in the KiDS and KiDZ fields.

Table 3

KiDS+KiDZ observing strategy: observing condition constraints and exposure times.

Fig. 3

Distribution of KiDS DR5 pointings on-sky. The figure shows the distribution of DR4 pointings (dark cyan) and new DR5 pointings (yellow). Pointings that were originally included in the 1500 deg² KiDS footprint, but which were subsequently de-scoped due to the limited area observed by VIKING, are shown in dark purple. The combination of the dark cyan and yellow data therefore shows the 1347 deg² of KiDS DR5 observations.

3.1.1 KiDS data acquisition

The KiDS survey footprint consists of 1347 tiles, of 1 deg × 1 deg each, corresponding to the field of view of the OmegaCAM CCD mosaic camera. It covers two regions of the sky, roughly coincident with the main survey area of the 2dF Galaxy Redshift Survey (Colless et al. 2001): a 70 deg × 9 deg stripe (in RA × Dec) across the South Galactic Pole, and an 85 deg × 8 deg stripe across the celestial equator in the North Galactic Cap, with extensions to include the Galaxy and Mass Assembly (GAMA) 9hr field (12 deg × 5 deg, Driver et al. 2011) and the Cosmic Evolution Survey field (COSMOS; 1 deg × 1 deg, Scoville et al. 2007). The layout of the KiDS survey fields is shown in Fig. 3.

The original plan for the KiDS survey, as endorsed by ESO’s Public Surveys Panel in 2005, was to cover 1500 square degrees in u𝑔ri-bands with the VST (de Jong et al. 2013). This proposed area is shown in Fig. 3 as the full span of coloured tiles. Subsequently, in 2006, ESO approved the VIKING survey (Edge et al. 2013), which would add five NIR filters to the same proposed survey area. The observations for KiDS and VIKING started in 2011 and 2009, respectively, soon after their corresponding telescopes were commissioned.

The initial VIKING time allocation resulted in about 90% of the planned area of sky being observed, at which point ESO decided to truncate that survey in favour of new projects (see Sect. 4.1.1). Given the slightly reduced area that VIKING would cover, it was decided also to truncate KiDS to the same total area, which resulted in the final footprint of 1347 tiles. Observations that were removed after this area truncation are shown in purple in Fig. 3. Some of the observing time on the VST originally allocated to the now-deprecated KiDS and VIKING tiles was reallocated to observations of deep spectroscopic fields, forming the basis of the ‘KiDZ’ dataset (Sect. 3.1.2).

A second component of the original KiDS survey plan was to re-observe the entire footprint at the end of the survey in the 𝑔-band, for proper motion and variability studies. As the survey progressed, however, it became clear that bright-time was in considerably less demand than dark-time on the VST, with the bright-time i-band observations advancing much faster than those using the dark-time u, 𝑔, and r filters. This can be seen in Fig. 4, which shows the progression of observations in each of the optical filters over the lifetime of the KiDS survey. Given the rapid progression of the i-band data, it was decided to change the filter for the repeat pass to i, rather than 𝑔. The second-pass i-band observations are visible in the figure under the i2 label. Similarly, the i-band is unique in Table 3 as being the only filter with more than one OB per tile.

The choice to switch the filters used for the second pass comes with certain scientific benefits. First, the additional depth in the i-band improves the quality of photometric redshifts (particularly beyond z = 0.9, where the 4000 Å break redshifts into the i-band). Second, the repeat observations somewhat counteract the variability in the seeing and sky brightness that is inherent to the i-band observations, due to their less stringent constraints on observing conditions. Finally, all the second-pass observations were taken after significant improvements to the telescope baffling in 2015, much reducing scattered light that particularly affected bright-time observations. Due to the rapidity of the i-band progress during the first half of the survey, the scattered light effect is particularly prevalent in the i-band. Therefore, having the second pass be in this band is of particular benefit.

Given the primary focus of the repeated i-band observations on variability and transient science, it is worth noting the somewhat correlated nature of the cadence between i-band observations on-sky. As early KiDS data targeted the GAMA fields preferentially, the largest baselines in the i-band observation cadence are in those fields. This can be seen in Fig. 5, which shows the difference between i-band observation dates over the survey, with the rectangular GAMA patches (dashed outlines) clearly visible.

A second noteworthy feature of Fig. 4 is the particularly slow progress of observations taken in the r-band during the first 18 months of data-taking. This feature was traced to the queue scheduler, which was overly conservative for dark-time observations with strict seeing requirements. Changes to the scheduler in early 2013 addressed this problem and led to a speeding up of the dark-time KiDS observations.

Fig. 4

Progression of optical observations over the full KiDS survey, for pointings within the final 1347 deg² footprint. The total number of DR5 pointings is shown as the dashed grey line. The figure demonstrates the slow initial progress made by the survey, whereby only ~5% of planned r-band observations were completed in the initial 18 months of data-taking, prompting changes in the queuing of (in particular) the dark-time observations. Conversely, at the end of the survey, the complete second-pass of i-band observations (designated i2) was completed in roughly the same time span.

3.1.2 KiDZ acquisition

As mentioned in Sects. 3.1 and 3.1.1, DR5 includes observations made by the VST over deep spectroscopic survey fields that significantly enhance KiDS science. Eight fields were identified for these observations, as both having rich existing (or ongoing) spectroscopic campaigns and being visible from Paranal: Chandra Deep Field South (CDFS), the COSMOS field, the DEEP2 Galaxy Redshift Survey (DEEP2) 02 and 23 h fields, the GAMA G15-Deep field, the Visible Multi-object Spectrograph (VIMOS) Public Extragalactic Redshift Survey (VIPERS) W1 and W4 fields, and the VIMOS VLT Deep Survey (VVDS) 14hr field. Specific details of the spectroscopic samples available in these fields are provided in Sect. 5. Of these fields, two already resided completely within the KiDS DR5 footprint (CDFS, G15-Deep) and so required no additional observations. KiDZ observations with the VST were taken under the same strategy as was used in the main survey (Sect. 3.1 and Table 3), including the second i-band pass, and VIKING-like observations on these fields were taken with VISTA. The VST KiDZ observations were taken between November 2015 and September 2021, and can be identified in the ESO archive using the run numbers listed in Table 2. The full set of targeted fields is shown in Fig. 6, with VST OBs indicated by the black outlines.

3.1.3 Optical data quality metrics

The two primary quality metrics that are relevant for our KiDS+KiDZ observations, made with the VST, are the size of the PSF and the brightness of the background. Combined, these two properties determine the effective depth of the imaging, and the selection of source galaxies. Figure 7 shows the distribution of PSF size (full-width at half-maximum) and limiting magnitude (5σ in a 2″ diameter circular aperture) for all images taken as part of the KiDS+KiDZ observing campaigns. Limiting magnitudes in each of the bands were calculated from randomly scattered apertures across the masked tiles using LAMBDAR (Wright et al. 2016). The details of the magnitude limit calculation are given in Appendix A.
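For readers unfamiliar with aperture-based depth estimates, the sketch below shows one common way to convert the scatter of fluxes measured in blank-sky apertures into an n-sigma limiting magnitude. It is a generic illustration and not the LAMBDAR procedure of Appendix A: the function name `limiting_magnitude`, the photometric zero-point of 30, and the simulated Gaussian fluxes are all assumptions made for the example.

```python
import math
import random
import statistics


def limiting_magnitude(aperture_fluxes, zeropoint, n_sigma=5.0):
    """n-sigma limiting magnitude from fluxes summed in blank-sky apertures.

    The flux scatter is taken as the sample standard deviation of the
    aperture fluxes, in the same units that define the zero-point.
    """
    sigma = statistics.stdev(aperture_fluxes)
    return zeropoint - 2.5 * math.log10(n_sigma * sigma)


# Simulated example: 5000 blank apertures with Gaussian noise of sigma = 2 flux units.
random.seed(1)
fluxes = [random.gauss(0.0, 2.0) for _ in range(5000)]
m_lim = limiting_magnitude(fluxes, zeropoint=30.0)
```

With these inputs the result is close to 30 − 2.5 log10(5 × 2) = 27.5 mag. Doubling the noise makes the limit shallower by 2.5 log10(2) ≈ 0.75 mag.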

The figures show both the distribution functions and cumulative distribution functions for each parameter. The median seeing in each optical filter is {1″.01 ± 0″.17, 0″.88 ± 0″.15, 0″.70 ± 0″.12, 0″.81 ± 0″.18, 0″.81 ± 0″.18} for the {u, 𝑔, r, i1, i2}-bands, respectively, and the corresponding median limiting magnitudes are {24.17 ± 0.10, 24.96 ± 0.11, 24.77 ± 0.13, 23.41 ± 0.26, 23.49 ± 0.28}. In both cases, the uncertainties describe the tile-to-tile scatter in these metrics, computed with the normalised median absolute deviation from the median (NMAD). The distribution of magnitude limits in the four bands can be seen to take a reasonably narrow range of values; the scatter of the limiting magnitudes in the u𝑔r-bands is consistently less than 0.15 mag. The i-band, however, does show more variability in both seeing and limiting magnitude than is evident in the other bands, largely due to these observations being taken in poorer conditions (Sect. 3.1). However, these metrics keep the independent i-band passes separate, meaning that the effective limiting magnitude of the individual tiles in the i-band would in fact be approximately 0.4 magnitudes deeper and have a 30% reduction in scatter, bringing it closer in line with the other bands (from σ ≈ 0.3 mag to σ ≈ 0.2 mag). We note that this approximation is largely fair, as the individual distributions of PSF full-width at half-maximum (FWHM) and limiting magnitudes are extremely similar for the i1 and i2 passes.
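The quoted gain of roughly 0.4 mag and the 30% reduction in scatter are consistent with a simple scaling argument for two independent passes of equal depth, sketched below. This is an assumption-laden illustration rather than the authors' calculation: it assumes background-limited noise that averages down as √N when N passes are co-added, and treats the tile-to-tile depth variations of the two passes as independent. The function names are hypothetical.

```python
import math


def stacked_depth_gain(n_passes):
    """Gain in limiting magnitude from co-adding n equal-depth passes.

    Assumes the noise on the stack falls as 1/sqrt(n), so the limiting
    flux drops by sqrt(n) and the limit deepens by 2.5*log10(sqrt(n)).
    """
    return 2.5 * math.log10(math.sqrt(n_passes))


def averaged_scatter(sigma, n_passes):
    """Scatter of the mean of n independent values that each have scatter sigma."""
    return sigma / math.sqrt(n_passes)


gain = stacked_depth_gain(2)            # about 0.38 mag
scatter = averaged_scatter(0.3, 2)      # about 0.21 mag
reduction = 1.0 - scatter / 0.3         # about 0.29, i.e. roughly 30%
```

Under these assumptions, two passes give a gain of 2.5 log10(√2) ≈ 0.38 mag, and a per-pass scatter of 0.3 mag becomes about 0.21 mag.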

Finally, systematic variation of observational conditions on-sky is an important nuisance in many scientific studies. As such, we include visualisations of the on-sky distribution of both magnitude limit and PSF FWHM, for each of the optical filters, in Appendix B.

Fig. 5

Temporal separation between the two i-band passes, as a function of position on-sky. The figure shows all KiDS DR5 pointings, coloured by the number of years separating the two i-band passes. The overall distribution of these values is shown as a KDE within the colour bar, computed with a 0.1-year rectangular bandwidth. The largest temporal separations occur typically in the GAMA fields, where KiDS observations were focussed during early survey operations. The shortest temporal separation between any two passes is at RA = 350.3 deg, Dec = −35.1 deg, which was observed on 21 August 2017 and 7 October 2017 for the two i-band passes, respectively, for a separation of 50 nights.

Fig. 6

Footprints of the KiDZ pointings. The figure shows the regions where there exist optical and/or NIR data in dark and light grey, respectively. The rectangles show the limits of all observations made with the VST (black) and dedicated observations made with VISTA (green). Dark grey regions that are not covered by our dedicated VISTA observations are either already contained within the VIKING footprint (CDFS, G15-Deep) or have VIKING-like observations constructed from existing deep observations (VIPERS, COSMOS; see Sect. 4.1.3).

3.2 THELI-Lens r-band reduction

As with previous KiDS analyses, the primary imaging used for lensing science (specifically shape measurement) in DR5 has been produced with the THELI-Lens pipeline (version 1.3.0A). Since DR4 (which used THELI-Lens version 1.0.0A), THELI-Lens has undergone modifications to both the astrometric and photometric calibration routines. In this section we document, in particular, the specific changes to THELI-Lens (with respect to the DR4 version) that are relevant to DR5 science.

Fig. 7

Primary observational properties of pointings in KiDS+KiDZ for observations in the four optical bands. Observations in the i-band are split into the two epochs (labelled i1 and i2). Each row shows the distribution of average PSF sizes (as reported by ASTROWISE; left) and limiting magnitude (5σ in a 2″ diameter circular aperture; right) determined with a KDE using the annotated kernel. The corresponding cumulative distribution functions for each panel are shown as grey dashed lines.

Astrometric calibration with Gaia

Until version 1.3.0A, THELI-Lens utilised two separate astrometric calibration samples for the northern and southern patches of KiDS: in the north SDSS (York et al. 2000) was used as an absolute astrometric reference, whereas the southern fields were absolutely calibrated to 2MASS (Skrutskie et al. 2006). In DR5, KiDS utilises a single astrometric calibration to Gaia DR2 (Gaia Collaboration 2018) stars, with positions translated to epoch J2000 by VizieR. We note that the choice to calibrate KiDS to a J2000 epoch can lead to some confusion when performing a direct match between KiDS stars and Gaia catalogues that do not contain the RAJ2000/DEJ2000 columns. For an explanation of this issue, we direct the interested reader to Appendix C.

Figure 8 shows the accuracy of the KiDS astrometric calibration to Gaia (here using Gaia DR3 stars, rather than the DR2 stars used to define the astrometric solutions), as a function of RA and Dec. Average absolute residuals are typically less than 0″.05, demonstrating that the imaging has been successfully tied to Gaia. The small residuals that remain are systematically induced by the choice to calibrate KiDS using Gaia stars at epoch J2000 positions: as KiDS observations were made after the year 2000, stellar proper motions introduce noise to the astrometric calibration. This is seen most clearly in the on-sky distribution of astrometric residuals, shown in Fig. 9: parts of KiDS that were observed earliest (i.e. the GAMA fields) clearly stand out as having systematically smaller residuals than other (later) observations.
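To give a sense of scale for the proper-motion effect described above, the sketch below applies a simple linear propagation of a catalogue position between two epochs. It is an illustration only, and is not the transformation applied by VizieR or by THELI-Lens: it ignores parallax and radial velocity, assumes proper motions in the Gaia convention (the RA component already includes the cos δ factor), and uses J2015.5, the Gaia DR2 reference epoch, as the starting point. The function name and the example star are hypothetical.

```python
import math

MAS_PER_DEG = 3.6e6  # milliarcseconds in one degree


def propagate_position(ra_deg, dec_deg, pmra_masyr, pmdec_masyr, epoch_from, epoch_to):
    """Linearly propagate (RA, Dec) in degrees between two epochs in Julian years.

    pmra_masyr is mu_alpha* = mu_alpha * cos(dec), as in Gaia catalogues.
    """
    dt = epoch_to - epoch_from
    dec_new = dec_deg + pmdec_masyr * dt / MAS_PER_DEG
    # Divide by cos(dec) to convert the on-sky RA motion into a coordinate shift.
    ra_new = ra_deg + pmra_masyr * dt / (MAS_PER_DEG * math.cos(math.radians(dec_deg)))
    return ra_new % 360.0, dec_new


# Hypothetical star with proper motion (10, -5) mas/yr, moved from J2015.5 to J2000.0.
ra0, dec0 = 180.0, -30.0
ra1, dec1 = propagate_position(ra0, dec0, 10.0, -5.0, 2015.5, 2000.0)
shift_ra_mas = (ra1 - ra0) * math.cos(math.radians(dec0)) * MAS_PER_DEG
shift_dec_mas = (dec1 - dec0) * MAS_PER_DEG
```

For this example the on-sky shift is −155 mas in RA and +77.5 mas in Dec over the 15.5 years. Offsets of this size can arise between positions referred to different epochs even when the underlying astrometric solution is accurate.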

Finally, we note that the change to the Gaia astrometric calibration introduces another possible source of bias: differing astrometric solutions between our THELI-Lens and ASTROWISE reductions. As ASTROWISE astrometry is computed with 2MASS, it is possible that the locations of sources extracted from THELI-Lens images may have systematically different positions on-sky (as determined by our Gaia astrometry) to the same sources in ASTROWISE (as determined by the 2MASS astrometry). To verify the consistency of the astrometry between our THELI-Lens and ASTROWISE reductions, we matched sources extracted from the reduced ASTROWISE r-band images (see Sect. 3.5.5) to those in our master THELI-Lens source catalogue (Sect. 3.4). Figure 10 shows the astrometric agreement between stars in these two catalogues, thereby validating the use of THELI-Lens defined positions for photometric extractions on ASTROWISE images. The median residual between stars in the two catalogues is 0″.014 in the RA axis and −0″.017 in the Dec axis. The NMAD scatter in the RA and Dec directions is 0″.097 and 0″.090, respectively.
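The summary statistics quoted above (median offset and NMAD scatter per axis) can be computed from a list of already-matched star pairs along the following lines. This is a minimal sketch under stated assumptions, not the code used for Fig. 10: the cross-match itself is assumed to have been done beforehand, the function names are hypothetical, and the factor 1.4826 is the conventional normalisation that makes the NMAD equal to the standard deviation for Gaussian data.

```python
import math
import statistics


def offsets_arcsec(pairs):
    """On-sky offsets for matched pairs ((ra1, dec1), (ra2, dec2)) given in degrees.

    Returns two lists, the RA*cos(Dec) and Dec offsets, in arcseconds.
    """
    d_ra, d_dec = [], []
    for (ra1, dec1), (ra2, dec2) in pairs:
        mean_dec = math.radians(0.5 * (dec1 + dec2))
        # Wrap the RA difference into [-180, 180) degrees to handle the 0/360 boundary.
        dra = (ra1 - ra2 + 180.0) % 360.0 - 180.0
        d_ra.append(dra * math.cos(mean_dec) * 3600.0)
        d_dec.append((dec1 - dec2) * 3600.0)
    return d_ra, d_dec


def median_and_nmad(values):
    """Median and normalised median absolute deviation from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med, 1.4826 * mad
```

A typical use would be `median_and_nmad(offsets_arcsec(pairs)[0])` for the RA axis, and the same with index 1 for the Dec axis.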

Fig. 8

Astrometric calibration of KiDS DR5, with respect to Gaia DR3 stars at epoch J2000. Values here show the median astrometric residual per pointing for stars in the magnitude range 16.5 ≤ G ≤ 19. The colour bar shows the location of each field in the Dec direction, after subtracting a constant equal to the mean declination of all fields in the relevant hemisphere. Absolute residuals are typically less than the 0″.05 level, and are therefore negligible. Residual offsets below this level are possibly attributable to barycentric motion between the J2000 and J2015 epochs (see Appendix C).

Fig. 9

On-sky variation in the KiDS DR5 astrometric calibration with respect to Gaia stars at the J2000 epoch. Residuals can be seen to fall around the earliest observations, due to the reduced proper motion differences between our observations and the assumed epoch.

3.3 Theli-Phot r-band reduction

As mentioned in Sect. 3.1.2, VST observations in the r-band were made in the KiDZ fields between November 2015 and September 2018. These observations were reduced with a slightly different version of the THELI pipeline, as the observations were not planned to be used for lensing (rather they would only be used for photometry in the context of redshift distribution calibration). This different version, which we designate ‘THELI-Phot’, is notable for its different astrometric calibration sample (2MASS) and the lack of individual exposures (which are used for shape measurement).

In practical terms, the THELI-Phot pipeline is based on THELI-Lens version 1.2.0A, with changes made to the data products that are preserved at the end of each pointing’s reduction. As mentioned previously, the primary distinction between V1.2.0A and V1.3.0A is the use of Gaia as an astrometric reference (in V1.3.0A), as opposed to the use of SDSS (north) and 2MASS (south) in V1.2.0A. This means that the astrometric solutions for KiDZ and KiDS are nominally different, at the level of existing systematic differences between Gaia and SDSS or 2MASS. Figure 11 shows the residual difference between the astrometric solutions of KiDS stars (i.e. calibrated with Gaia) with respect to SDSS stars, demonstrating that there is no significant systematic bias in the two astrometric systems. As such, we conclude that there is unlikely to be any significant systematic effect imprinted on the KiDZ data through the use of a different astrometric basis in the THELI-Phot pipeline.

Fig. 10

Astrometric agreement between sources extracted from THELI-Lens and ASTROWISE images. The sample of all sources has a median offset of 0″.014 and −0″.017 in the RA and Dec directions, respectively. The NMAD scatter in the RA and Dec directions is 0″.097 and 0″.090, respectively.

Fig. 11

Overall astrometric calibration of KiDS DR5, calibrated to Gaia, with respect to SDSS in the KiDS-N field. The sample of all sources has a median offset of 0″.031 and −0″.006 in the RA and Dec directions, respectively. The NMAD scatter in the RA and Dec directions is 0″.094 and 0″.084, respectively.

3.4 Source detection

Source detection in KiDS+KiDZ is performed with SOURCE EXTRACTOR (Bertin & Arnouts 1996), within the ASTROWISE environment. The sources are extracted from the THELI-Lens and THELI-Phot r-band images, and these source lists are subsequently used for all primary science in KiDS. Source extraction is performed in a relatively ‘hot’ mode (i.e. significant fragmentation of sources), as the main targets of interest for KiDS lensing science are small, faint sources at intermediate-to-high redshift. This means, however, that there is likely to be fragmentation of the largest sources within the footprint (see e.g. the demonstration of the hot-mode shredding of galaxies in Figs. 5 and 6 of Andrews et al. 2017). The parameters used to perform the KiDS source extraction are provided in Table 4.

Table 4

Source extraction parameters used in the creation of the KiDS DR5 source lists within ASTROWISE.

3.5 AstroWISE reduction

The ASTROWISE environment is a distributed database and data processing system, designed for calibrating and processing wide-field imaging data (McFarland et al. 2013). KiDS has utilised ASTROWISE for the production of reduced images, and forced optical photometry using GAAP (Kuijken 2008), in the u𝑔ri-bands since the beginning of the survey (de Jong et al. 2015).

For DR5 there have been a few minor modifications to the ASTROWISE imaging reduction pipeline compared to that described in Kuijken et al. (2019), which are important to document here.

3.5.1 Changes: Co-add production

There have been two changes to the co-add production for KiDS DR5: updates to the cross-talk coefficients (exclusively adding in new values), and a change to the polynomial order for defining the astrometric solution of the u-band co-adds.

Table 5 presents the cross-talk coefficients for the ASTROWISE co-add production. These coefficients have been updated to include values spanning the final observation window for KiDS and KiDZ, including the two lengthy shutdown periods of the VST as a result of the COVID-19 pandemic. As a result, the final set of cross-talk coefficients spans a considerably longer period than the previous sets. We have not explored whether this difference introduces any systematic effect in the efficiency of the cleaning of cross-talk in the post-shutdown images obtained by the VST. However, as our source detection is performed on the THELI images (which were all observed prior to this period), we believe that this is unlikely to have a noticeable impact on our analyses.

The second change to the co-add production is related to the astrometry in the u-band. For all co-adds in DR4, the polynomial order for the distortion was set to three. In DR5, we found that the paucity of stars (per chip) detected in the u-band can occasionally lead to poorly constrained polynomial fits, leading to unphysical distortion solutions for some chips. Figure 12 shows the distribution of maximal astrometric residuals between the corners of all detectors in each pointing, assuming that all chips can be simply translated onto a common RA and Dec centroid. The figure shows that there is a clear systematic difference between the u-band and the other bands, whereby a significant number of pointings have chip corners that differ by more than 1″; up to 24″ in the worst case. Investigation of this effect led to the determination that failures in the u-band distortion parameters could be resolved by restricting the distortion polynomials to linear order. We opted to fit all u-band exposures with more than 1″ chip-corner-residuals with a linear order polynomial fit instead. The 1″ threshold was chosen because this is the median size of the PSF in the u-band, and (as the threshold is determined using the maximal distortion per pointing) this ensures any remaining systematic effects are below the seeing level for typical u-band images. The u-band co-adds that have been reprocessed in this way are listed in Appendix D.
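The selection rule described above amounts to a simple threshold on the maximal chip-corner residual of each pointing. A minimal sketch of that decision follows; the function name and the input dictionary are hypothetical, and the measurement of the corner residuals themselves is assumed to have been done elsewhere.

```python
def distortion_order(max_corner_residual_arcsec,
                     threshold_arcsec=1.0, default_order=3):
    """Polynomial order for the u-band astrometric distortion of one pointing.

    Pointings whose maximal chip-corner residual exceeds the threshold
    (1 arcsec, the median u-band PSF size) are refit at linear order;
    all others keep the default third-order solution.
    """
    if max_corner_residual_arcsec > threshold_arcsec:
        return 1
    return default_order


# Hypothetical pointings and their maximal corner residuals (arcsec).
pointings = {"tile_a": 0.4, "tile_b": 24.0}
orders = {name: distortion_order(res) for name, res in pointings.items()}
```

Because the threshold is applied to the maximum residual in the pointing, any distortion error left in the third-order solutions is bounded by the typical u-band seeing.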

Table 5

Applied cross-talk coefficients.

3.5.2 Changes: Zero-point calibration

One particular complexity that arises in DR5 is the treatment of the zero-point calibration in the presence of the two i-band passes. In previous KiDS data reductions, the calibration process has utilised a stellar locus regression (SLR) between the four optical filters. We implemented the same fundamental procedure, outlined in Sect. 3.1.3 of Kuijken et al. (2019) with further details in de Jong et al. (2017). Briefly, this process of SLR calibration within ASTROWISE proceeds as follows. First, so-called ‘principal colours’ are constructed from the measured GAAP fluxes for bright, unsaturated stars in each tile (Ivezić et al. 2004). Second, these colours are shifted so that straight sections of the colour-colour diagrams align with a set of fiducial templates. Finally, an overall zero-point correction is applied to all bands, in order to match the de-reddened (r − G, 𝑔 − i) diagram to a template constructed from the SDSS survey. For this purpose, G band photometry is taken from Gaia DR2.

Given the additional i-band observation, we are required to run the SLR calibration process multiple times (i.e. running with the individual i-band passes separately). As a result, we have four SLR calibration estimates per tile: two for the GAAP minimum apertures of 0″.7 and 1″.0 (see Sect. 3.6), each computed with either the i1 or i2 imaging as the reference i-band. Performing the SLR in this way allows us to estimate the systematic uncertainty introduced in the calibration of the u𝑔r-bands by variations in the quality of the i-band data, and to flag potentially bad calibrations.

The differences in the zero-point calibration offsets as estimated with the i1 and i2 fluxes, and with the two GAAP minimum aperture sizes, are presented in Fig. 13. Looking firstly at the accuracy of the 𝑔- and r-band calibrations, we can see that the zero-points are consistently reproduced to ±0.02 magnitudes for all but a few tiles in the 𝑔-band. Similarly, looking at the calibration measured between the different GAAP settings, we see that, for all but a handful of outliers, the 𝑔- and r-band zero-points are very consistently estimated. However, we of course cannot overlook the outliers in the 𝑔r-band comparisons, nor the significantly larger dispersion between zero-points estimated in the u-band.

In the figure we can see that the u-band zero-point calibration is somewhat unreliable, as has been previously documented (see e.g. Kuijken et al. 2019). In DR4, this behaviour prompted a recalibration effort that utilised Gaia and SDSS photometry to improve the u-band zero-point accuracies. For DR5, we adopted a slightly different approach to zero-point recalibration than was utilised in DR4 (and presented in Kuijken et al. 2019), which based zero-point corrections solely on the Gaia DR2 ‘white-light’ G-band. Our zero-point corrections are instead now based on a combination of the Gaia white-light G-band, blue GBP-band, and red GRP-band magnitudes, available in Gaia Early Data Release 3 (eDR3; Gaia Collaboration 2021). Use of this information in our zero-point calibration requires knowledge of the colour transformations between the Gaia and SDSS photometric systems, for which we use relationships derived by Riello et al. (2021). However, the transformations in Riello et al. (2021) were only computed for the 𝑔ri-bands; as such, we introduce a predictor of the SDSS u-band from available Gaia magnitudes, 𝒰, in a manner similar to Kuijken et al. (2019): (1)

where the first term is the Gaia internal colour transformation, (2)

and the second term is the u-band extinction, as a function of Galactic latitude b, (3)

The predictor 𝒰 is calculated for sources that are bright (G < 19) and occupy the heart of the stellar locus, 0.7 < GBP − GRP < 1.1, in an effort to suppress non-linear terms in the preceding two equations. The dependence on Galactic latitude b was added as discussed in Kuijken et al. (2019), with the implicit assumption that the Galactic latitude correction derived in KiDS-North (i.e. where there is overlap with SDSS) can be extrapolated to KiDS-South. The coefficients contained in Eqs. (2) and (3) are derived with two separate fits between u-band PSF magnitudes from the SDSS DR16 (Ahumada et al. 2020) and Gaia eDR3, selected within 710 KiDS-North tiles that overlap with the SDSS. The u-band offsets for each tile in both KiDS-North and KiDS-South were then computed as the median of the differences ∆u = 𝒰 − uKiDS, measured for Gaia stars. A comparison between the quality of the zero-point calibration in the u-band in DR5 and DR4 is presented in Fig. 14. The figure demonstrates that there is considerably less variability (systematic and random) in the DR5 u-band zero-points compared to DR4. The scatter in the u-band zero-point reduces from ~0.035 in DR4 to ~0.018 in DR5: we note that a similar scatter (0.016) was obtained by Liang & von der Linden (2023), where the calibration of u-band data in KiDS and other photometric surveys was improved using observed colours of blue Galactic halo stars.
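The per-tile offset described above can be sketched as follows. The sketch takes the predicted magnitude 𝒰 as an input (the coefficients of Eqs. (2) and (3) are not assumed here), and the dictionary keys and function names are hypothetical.

```python
from statistics import median


def select_calibrators(stars, g_max=19.0, colour_min=0.7, colour_max=1.1):
    """Keep bright stars in the core of the stellar locus.

    Each star is a dict with Gaia magnitudes 'G', 'BP', 'RP'
    (hypothetical key names).
    """
    return [s for s in stars
            if s["G"] < g_max
            and colour_min < s["BP"] - s["RP"] < colour_max]


def tile_u_offset(stars):
    """Median of Delta u = U_pred - u_KiDS over the selected stars of a tile.

    'u_pred' is the Gaia-based predictor and 'u_kids' the measured KiDS
    u-band magnitude. Returns None if no star passes the selection.
    """
    selected = select_calibrators(stars)
    if not selected:
        return None
    return median(s["u_pred"] - s["u_kids"] for s in selected)
```

Taking the median over many stars per tile makes the offset robust to individual stars with poor photometry or an atypical colour-to-u relation.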

The distribution of zero-point calibration offset difference between the two i-band passes (and the two minimum aperture sizes) is presented in Fig. 13. Somewhat by construction, the various calibrations now all agree with one another, having been pegged to the external Gaia data. This is most apparent in the u-band, where we have applied this recalibration to every pointing. However, equally relevant is the choice to apply this recalibration process to the handful of tiles that show disagreement in the 𝑔r-bands. A similar recalibration approach can also be applied to the 𝑔ri-bands, where we verified that robust results were obtained using the colour transformations available in Riello et al. (2021). However, given the accuracy that we observe already in the 𝑔r-bands, we decided to follow this option only for the few tiles where the SLR+Gaia approach described above failed to produce reliable offsets.

After the definition of final zero point corrections, individual corrections are applied to the catalogues. As with DR4, the corrections computed with the 0″.7 minimum aperture are applied to fluxes computed with the same aperture; the same is done for the 1″.0 minimum aperture corrections and sources. In DR5, though, we must combine the corrections computed using the i1 and i2 bands in the other (non-i) bands. In this case, we opted to take the straight arithmetic mean of the corrections estimated with the two i-band images for the u𝑔r-bands.
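As a small illustration of the combination step, the sketch below averages the corrections obtained with the i1 and i2 reference images for one tile and one minimum aperture. The function name and the dictionary layout are hypothetical.

```python
def combine_zeropoint_corrections(corr_i1, corr_i2, bands=("u", "g", "r")):
    """Arithmetic mean of the corrections derived with the i1 and i2 passes.

    corr_i1 and corr_i2 map band name to zero-point correction (mag),
    both computed with the same GAAP minimum aperture.
    """
    return {band: 0.5 * (corr_i1[band] + corr_i2[band]) for band in bands}


# Hypothetical corrections for a single tile (0.7 arcsec minimum aperture).
with_i1 = {"u": 0.04, "g": 0.01, "r": 0.00}
with_i2 = {"u": 0.02, "g": 0.03, "r": 0.02}
combined = combine_zeropoint_corrections(with_i1, with_i2)
```

The i1 and i2 corrections themselves are not averaged; only the non-i bands, which appear in both SLR runs, need a single combined value.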

Figure 15 shows the residual median offsets between magnitudes for stars in KiDS and SDSS, after the full zero-point calibration and recalibration process. Systematic residuals are below 0.05 magnitudes in the u-band for the majority of the tiles (<0.07 mag for the two most outlying points). In the 𝑔ri-bands, residuals are smaller: consistently less than 0.03 magnitudes in the r- and i-bands, and typically less than 0.03 in the 𝑔-band. For our subsequent analyses, we implemented systematic error floors of σu = 0.05, and σX = 0.03 ∀ X ∊ {𝑔, r, i1, i2}, which are designed to encapsulate any residual systematic variation in the photometry (such as can be seen as a function of RA).

Fig. 12

Maximal residual between chip corners within a single pointing, assuming all chips can be shifted to a common RA and Dec centroid. Cases where the astrometric distortion parameters are poorly constrained (and so result in unphysical chip distortions) manifest as large residuals. The distribution in the u-band when using third-order distortion polynomials is clearly systematically larger than the other bands, with some chips having significant biases (greater than 10″), due to the lower number of stars creating instability in the polynomial fits. To minimise the effect of this bias, we reprocessed all u-band tiles with maximal residuals greater than 1″ using a linear polynomial distortion order.

Fig. 13

Zero-point corrections estimated in KiDS+KiDZ using SLR. Left: difference between SLR offsets estimated using i1 photometry and i2 photometry, prior to recalibration of the u-band zero-points. Centre: differences between SLR offsets estimated using 0″.7 minimum-radius aperture fluxes and 1″.0 minimum-radius aperture fluxes (see Sect. 3.6). Right: distribution of final SLR offsets used in DR5 (i.e. corrections to the nightly zero points derived from photometric standards), including Gaia recalibration.

Fig. 14

Comparison between the u-band zero-point calibration in KiDS DR4 (green) and KiDS DR5 (purple). The updated DR5 calibration procedure produces zero-points that show considerably less systematic and random variation, when compared to fluxes from SDSS.

Fig. 15

Distribution of the median offsets between stars in KiDS DR5 tiles (in the magnitude range 16.5 < r < 19) and their counterparts from SDSS imaging, for tiles with 10% or more unmasked data. The offsets here were calculated after SLR corrections and Gaia recalibration, and therefore represent the final quality of photometry in the survey. Horizontal dashed lines demonstrate the systematic zero-point uncertainty that is included per-band in our scientific analyses (such as in the computation of photo-z), which are designed to encapsulate any residual systematic variation in the photometry (such as can be seen as a function of RA, and whose origins are unclear).

3.5.3 Changes: r-band PULECENELLA masks

The ASTROWISE pipeline includes the automatic masking of image artefacts around bright, saturated stars. This masking is performed with the PULECENELLA software, which separates the artefacts into different components (de Jong et al. 2015): saturated pixels in the cores of the stars, spikes caused by diffraction of the mirror supports, spikes caused by the readout of saturated pixels, and up to three families of wide annular ‘ghost’ reflection halos with spatially dependent offsets around bright stars.

The properties of these components (the size of saturation cores, the size and offset of reflection halos, the orientation of diffraction spikes) all depend on the brightness and position of the stars in the focal plane, in a way that is stable in time for each photometric band. It is therefore possible to configure the modelling of each of these components by PULECENELLA at the beginning of the survey (de Jong et al. 2015). In DR4, however, it was noted that some of these parameters were perhaps too conservative (particularly in the r-band), producing masks that were in better agreement with the THELI-Lens ‘conservative’ masks rather than the fiducial masks. As such, in DR5 we updated the settings of the PULECENELLA software, related particularly to the size of reflection halos and orientation of diffraction spikes, to produce masks that are in better agreement with the fiducial masks generated by THELI-Lens. These changes are shown in Table 6, and their effect on a sample KiDS tile is displayed in Fig. 16.

3.5.4 Changes: u𝑔ri1i2-band manual masking

As a final step of the optical reduction process, compressed 1000 × 1000 pixel images of all 1347 × 5 ASTROWISE co-adds were visually inspected for artefacts, and manually masked by means of a polygon region file. The main contaminating features discovered during this manual masking process were scattered light from bright sources outside the field of view (primarily in the pre-mid-2015 data, when the telescope was still poorly baffled). These artefacts can appear as fairly sharp-edged bright arcs across the focal plane (due to stars or planets), or as diffuse patches (due to the Moon). Other artefacts discovered during this masking included satellite flares, airplanes, and higher-order reflections from very bright stars.

Figure 17 shows an example of a heavily masked field in KiDS DR5. The field contains bright reflections from the out-of-field star Fomalhaut (αPsA, V = 1.16), which resides ~1.5 deg NNW of the centre of the focal plane. The field has been manually masked to remove the majority of bright contaminants. Faint residual fluctuations are only visible after heavily smoothing (with a 5″ Gaussian filter) and scaling the image to emphasise the background variations.

Table 6

Updated parameters for the r-band PULECENELLA masking in KiDS DR5.

3.5.5 Single-band catalogues

As with previous KiDS data releases, we provide source catalogues extracted from individual ASTROWISE co-adds in each band. These individual extractions are of particular interest for transient and variability studies, where one expects source lists to vary intrinsically between bands and/or observations. Of particular additional interest in DR5 is the distribution of single-band extractions in the two i-band passes, as sources with significant flux differences in the two i-band passes may indicate the presence of stationary transient features, such as extragalactic supernovae.

These source extractions were performed using the same source detection parameters as used in the main r-band extraction on the THELI-Lens images, except they were applied to the ASTROWISE reduced co-adds. As a result, there are differences in the numbers of detected sources (and their observed properties), even when comparing the single-band r-band catalogues to the primary one. Also, it is important to note that cross-matching between single-band catalogues is unlikely to produce multi-band photometry that is accurate (at the same level as the dedicated multi-band photometry, Sect. 3.6), because there is no guaranteed consistency of apertures and/or deblend solutions for the independent extractions. As such, multi-band photometry obtained from position-matching single-band catalogues should be treated with caution.

Finally, as with previous releases, single-band catalogues are provided to the public largely ‘as-is’; they have not gone through the same rigorous quality control and testing as the lensing catalogues. More details of the contents of these catalogues are provided in Sect. 8.

Fig. 16

Comparison between the ASTROWISE PULECENELLA masks for a selected KiDS pointing. The images show the masks on a consistent linear colour scale, so pixels masked with the same bit(s) are shown with the same colour in each image. The DR5 implementation of these masks was refined to use slightly different parameters (see Table 6), which more closely reproduce the masking behaviour of THELI-Lens in the ASTROWISE r-band. The primary effect is a reduction (in DR5) of the size of stellar masks (cyan) and the frequency of masking of large reflection halos (green).

Fig. 17

Examples of the new ASTROWISE manual masking implemented for KiDS DR5. The figure shows a heavily contaminated field, caused by scattered light from Fomalhaut (αPsA), a visible-magnitude star (V = 1.16) that is ~1.5 deg from the centre of the focal plane. The left column shows the tile before masking, while the right column shows the tile after application of the manual and PULECENELLA masks. The upper row shows the tile at native resolution, while the bottom row shows the tile after smoothing with a 5″ Gaussian filter.

3.6 Optical multi-band GAAP photometry

One of the most important products released by the KiDS consortium in each release is the multi-band photometry measured using the GAAP code (Kuijken 2011). For a range of images (spanning various photometric bandpasses and PSFs), GAAP computes a (typically non-total) intrinsically consistent flux (i.e. probing the same pre-convolution extent in each image). This is performed as follows. Given an intrinsic source flux distribution, f (x, y), an estimate of the flux can be made with a weighted aperture, using a Gaussian kernel. In GAAP, the Gaussian kernel is defined using an estimate of the source’s major and minor axis lengths (A and B, respectively), as measured in the r-band, and the source’s orientation angle θ. The flux estimated by GAAP is therefore (4)

where x′ and y′ are the x and y coordinates shifted to the centre of the source and rotated into the galaxy frame using the position angle θ. In the presence of an arbitrary Gaussian PSF with standard deviation σPSF, this intrinsic flux can be shown to be related directly to a Gaussian-weighted flux measured on the observed image, using a modified Gaussian kernel with (5) (6)

This therefore provides a simple way to calculate fluxes across multiple images that probe the same intrinsic scales of a galaxy, provided that the image PSF is Gaussian (and assuming that sources are unblended).
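For orientation, the relations described in the preceding paragraphs take the following form in the standard GAAP formalism. This is an assumed form, written with the symbols defined above (f, A, B, x′, y′, σPSF, A′, B′) and omitting any overall normalisation factor; it is a sketch for the reader rather than a verbatim statement of Eqs. (4)–(6).

```latex
\begin{align}
  F &= \iint \mathrm{d}x\,\mathrm{d}y\; f(x,y)\,
       \exp\!\left\{-\frac{1}{2}\left[
         \left(\frac{x'}{A}\right)^{2} + \left(\frac{y'}{B}\right)^{2}
       \right]\right\}, \\
  A' &= \sqrt{A^{2} - \sigma_{\rm PSF}^{2}}, \\
  B' &= \sqrt{B^{2} - \sigma_{\rm PSF}^{2}}.
\end{align}
```

The second and third relations make explicit why the method requires A and B to exceed σPSF: a Gaussian aperture of intrinsic size A, observed through a Gaussian PSF, corresponds to a narrower Gaussian weight of size A′ on the observed image.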

To ensure that image PSFs are Gaussian, we performed a step of Gaussianisation within ASTROWISE. This process involves accurately measuring the native PSF of each image, and using a shapelets expansion (Refregier 2003; Kuijken 2006) to compute the (spatially varying) kernel required to convert the native PSF to a Gaussian (with minimal information loss). The native images are then convolved with the Gaussianisation kernel, producing images with a Gaussian PSF (at the cost of an increased correlation of the noise profile). The image PSF size is essentially unchanged in this process (as quantified via the FWHM), and the correlated noise caused by the convolution is propagated to the estimate of the flux uncertainty.

As is clear from Eq. (5), there is a physical limitation to the GAAP formalism when σPSF ≥ [A, B]. In this limit, the effective aperture sizes (A′, B′) become imaginary in one or both axes, and flux measurement is not possible. It is therefore sensible to define a minimum intrinsic aperture size for every source, that is larger than the expected PSF size in all bands. In KiDS DR5, as in DR4, we measured all sources with two intrinsic aperture sizes, which are a combination of the source intrinsic size (as the RMS of the flux distribution along the major and minor axes, in arcseconds, estimated from the THELI-Lens r-band imaging by SOURCE EXTRACTOR), a minimum aperture size rmin, and a maximum aperture size of 2″ (to limit blending effects). As such, the intrinsic aperture sizes for all sources in DR5 are defined as (7) (8)

The two distinct sets of apertures come from the use of two distinct minimum aperture radii: rmin ∊ [0″.7, 1″.0]. While valid GAAP fluxes across all bands can be obtained with the lower rmin value for most sources, data with poorer seeing require the larger aperture (this point is further discussed in Sect. 6.2).
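The interplay between the minimum aperture and the seeing can be illustrated with the short sketch below. It rests on two labelled assumptions: that the intrinsic aperture is the quadrature sum of the source size and rmin, capped at 2″ (the combination is described above only in words), and that the effective aperture on the Gaussianised image is the intrinsic aperture with σPSF subtracted in quadrature. All function names are hypothetical.

```python
import math

R_MAX_ARCSEC = 2.0  # maximum aperture size quoted in the text


def sigma_from_fwhm(fwhm_arcsec):
    """Standard deviation of a Gaussian PSF with the given FWHM."""
    return fwhm_arcsec / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def intrinsic_aperture(a_src, b_src, r_min, r_max=R_MAX_ARCSEC):
    """Intrinsic aperture axes (A, B) in arcsec.

    ASSUMPTION: source size and r_min are added in quadrature, then
    capped at r_max.
    """
    big_a = min(math.hypot(a_src, r_min), r_max)
    big_b = min(math.hypot(b_src, r_min), r_max)
    return big_a, big_b


def effective_aperture(big_a, big_b, sigma_psf):
    """Axes (A', B') of the weight function on the observed image.

    Returns None when sigma_psf >= A or B, where no real solution exists.
    """
    if sigma_psf >= big_a or sigma_psf >= big_b:
        return None
    return (math.sqrt(big_a ** 2 - sigma_psf ** 2),
            math.sqrt(big_b ** 2 - sigma_psf ** 2))


# A compact source (size ~0) observed in 1.8 arcsec seeing:
sigma = sigma_from_fwhm(1.8)  # about 0.76 arcsec
small = effective_aperture(*intrinsic_aperture(0.0, 0.0, 0.7), sigma)
large = effective_aperture(*intrinsic_aperture(0.0, 0.0, 1.0), sigma)
```

In this example the 0″.7 aperture has no valid solution, whereas the 1″.0 aperture does, which is the situation that motivates measuring every source with both settings.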

4 NIR observations and reduction

One of the unique features of KiDS is the complementary five-band NIR survey VIKING. The utility of this overlap lies particularly in estimation of photometric redshifts and redshift distribution calibration, resulting in higher-quality measurements across a larger redshift baseline, as well as in classification studies generally (see e.g. Nakoneczny et al. 2021). With the extension of KiDS to include the KiDZ observations in DR5, we also required new NIR observations over these fields. In this section we detail the existing and recently acquired NIR observations in the KiDS+KiDZ fields, their reduction, and their quality.

4.1 VISTA paw-print observations

The Visible and Infrared Survey Telescope for Astronomy (VISTA) is a 4 m ESO telescope that is also located at ESO’s Cerro Paranal observatory (albeit on a separate peak, roughly 1500 m from the VST), and is serviced only by the VISTA InfraRed Camera (VIRCAM). VIRCAM consists of 16 individual HgCdTe detectors, each with a 0.2 × 0.2 square degree angular size, but which jointly span a 1 × 1.2 deg² field of view. A single exposure of the sky therefore contains (considerable) gaps between the detectors, in what is referred to as the ‘paw-print’ pattern, and standard VIRCAM observations combine six dithers designed to fill in these gaps (see Dalton et al. 2006). Additionally, observations of a single paw-print consist of a number of small jitters, taken in quick succession, which allow the paw-prints to have reliably estimated backgrounds and to sample over detector defects. These exposures are stacked into a single ‘stacked paw-print’, and six of these are combined to form a single contiguous ~1.5 deg² ‘tile’. The task of reducing the raw images, particularly producing stacked paw-prints, was carried out by the Cambridge Astronomy Survey Unit (CASU; González-Fernández et al. 2018; Lewis et al. 2010).

VISTA observations used in DR5 span both the KiDS survey area (with observations from VIKING, Sect. 4.1.1), and the KiDZ fields (with a combination of new, dedicated VIKING-like observations, and reconstructed VIKING-like observations made from pre-existing deep VISTA observations). The relevant observing programmes are listed in Table 7.

Table 7

Run numbers of all VISTA observations taken in the KiDS and KiDZ fields.

4.1.1 VIKING observations within KiDS

As mentioned previously, the VIKING imaging survey conducted on VISTA was designed in combination with KiDS, to produce well-matched optical and NIR data. As such, the surveys share an almost identical footprint on-sky: Fig. 18 shows the coverage of VIKING observations within the northern and southern patches of KiDS. We note the excellent overlap between the two surveys, whereby only relatively small areas at the extreme edges of the northern and southern patches are without complete (ZYJHKs) information.

Initial VIKING observations were taken between 13 November 2009 and 24 August 2016, defining the final footprint. Subsequently, some observations were repeated (primarily of data with instrumental problems), with the final observation dating from 16 February 2018. The data rate of observations is shown in Fig. 19. All VIKING observations can be found in the ESO archive under programme ID 179.A-2004, and a summary of the observational requirements is provided in Table 8. A detailed description of the survey design and observing strategy is given in Edge et al. (2013) and Venemans et al. (2015).

4.1.2 KiDZ VISTA observing programme

In order to produce NIR observations in the KiDZ fields that are consistent with those existing in the KiDS fields, we undertook a campaign to obtain VIKING-like data over the KiDZ fields from 5 December 2016 to 2 October 2018 (see Table 7), with the same observing constraints and settings as the VIKING survey itself (i.e. Table 8). Figure 6 shows the distribution of these dedicated VISTA observations in each of the KiDZ fields, as green boxes. The areas of the KiDZ fields that are not covered by these dedicated observations, but which nonetheless contain both optical and NIR data (shown by the dark grey), consist of VIKING-like data that were reconstructed from pre-existing deep VISTA observations in these fields (Sect. 4.1.3).

Fig. 18

Coverage of VISTA VIKING data in the KiDS fields, demonstrating that nearly the entire KiDS DR5 footprint is covered by complete ZYJHKs-band VIKING observations.

Fig. 19

Progression of the NIR observations by the VIKING survey. Note that the J-band line has been multiplied by a factor of 0.5, to account for the fact that it is observed with twice the frequency of the other bands. The observing strategy, combining observations of ZYJ-bands and JHKs-bands, can be seen in the correlated increase in observed paws in the various filters.

Table 8

Requirements and settings for VIKING observations with VIRCAM on VISTA.

Table 9

Requirements and settings for deep VIRCAM observations with UltraVISTA and VIDEO on VISTA.

4.1.3 Constructed VIKING-like data in KiDZ deep fields

For a number of the KiDZ fields, there were pre-existing extremely deep VIRCAM observations obtained by the UltraVISTA (McCracken et al. 2012) and VIDEO (Jarvis et al. 2013) surveys. Rather than re-observing these fields, we constructed VIKING-quality data by selecting observations such that the total exposure time per pixel was at least as deep as in VIKING, and which contain paw-prints of similar seeing. The observation settings for UltraVISTA and VIDEO (and how many of such observations were selected to reach at least VIKING depth, 〈Ndither〉) are given in Table 9. We note that there are no Z-band observations within UltraVISTA; here new Z-band imaging was obtained with VISTA using VIKING observation parameters (included in Table 7). Given the very different observing strategies, it proved impossible to match the depth of the VIKING data exactly, and we therefore further degraded the photometry in these bands to the expected depth of VIKING (a process described in Hildebrandt et al. 2017 as ‘magnitude adaption’). This step is clearly particularly relevant in UltraVISTA, where typical depths are equivalent to the deepest parts of VIKING (i.e. where dither overlaps create Ndither = 6).

4.2 KiDS-VIKING pipeline reduction

The NIR data in KiDS+KiDZ were reduced with the same reduction pipeline used in previous KiDS data releases: KVPIPE (Wright et al. 2018), based on the processing pipeline developed in Driver et al. (2016) for GAMA. We opted to start from the reduced CASU paw-prints, because of the complex observing strategy and dither pattern of VISTA observations: within one tile there is a wide range of total exposure times (between 1× and 6× the individual exposure times, texp) and observational properties (up to 96 different PSFs). This variability motivated us to utilise individual stacked chips, extracted from the CASU paw-prints, as the basis for our reduction and forced photometry, rather than co-added tiles.

The processing pipeline first corrects the images for atmospheric extinction (τ) given the observation airmass (sec χ), removes the exposure time (t, in seconds) from the image units, and converts the images from various Vega zero-points (Zv) to a standard AB zero-point of 30 (using Vega to AB corrections, XAB, from González-Fernández et al. 2018). These various corrections and transformations are performed using a single multiplicative recalibration factor ℱ per VISTA detector that is applied to all pixels in the detector image I: (9)

The factor ℱ is constructed as (10)

The processing pipeline also performs a re-orientation of the individual paw-print detector stacks using SWARP (Bertin 2010) and a background subtraction (again using SWARP). The SWARP background subtraction is computed using a 256 × 256 pixel mesh, and a 3 × 3 filter for the bicubic spline. This additional background subtraction was considered optimal by Driver et al. (2016), who demonstrated that these settings allowed for maximal removal of backgrounds with minimal impact on the photometry of extended sources.

The distribution of the recalibration factor ℱ per band is shown in Fig. 20. Following Driver et al. (2016) and Wright et al. (2018), we applied a blanket quality-control selection of ℱ ≤ 30.0 to remove detectors with strong persistence or other artefacts.
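As an illustration of how the corrections listed above can be folded into one multiplicative factor, the following is a minimal sketch. The functional form is an assumption built from the standard magnitude-to-flux conversion and is not quoted from Eq. (10); in particular, referencing the extinction term to airmass one, i.e. τ(sec χ − 1), and the function and argument names, are hypothetical choices made here.

```python
import math

def recalibration_factor(zp_vega, t_exp, tau, airmass, x_ab, zp_target=30.0):
    """Sketch of a per-detector factor taking counts to AB flux at zp_target.

    Assumed form (illustrative, not taken from the text):
    F = 10^(-0.4 * (Zv - 2.5*log10(1/t) - tau*(sec(chi) - 1) + X_AB - zp_target))
    """
    exponent = (
        zp_vega
        - 2.5 * math.log10(1.0 / t_exp)  # remove the exposure time from the units
        - tau * (airmass - 1.0)          # extinction relative to zenith (assumed)
        + x_ab                           # Vega-to-AB correction
        - zp_target                      # shift to the common AB zero-point
    )
    return 10.0 ** (-0.4 * exponent)

def passes_quality_control(factor, max_factor=30.0):
    """Blanket selection quoted in the text: keep detectors with F <= 30."""
    return factor <= max_factor
```

Under this form, a detector that already has a Vega zero-point of 30, unit exposure time, zenith airmass, and no AB offset receives a factor of exactly one, and a ten-second exposure is scaled down by a factor of ten.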

4.3 NIR quality metrics

We quantify the quality of the NIR reduction using the same metrics as for the optical portion of the survey: PSF FWHM and depth (as determined in blank apertures scattered randomly over the detectors, Appendix A). Figure 21 presents the summary of PSF sizes and limiting magnitudes in each of the five NIR bands.

The PSF sizes in the NIR bands are all exceptionally consistent. This is due (at least in part) to the rapid variability of the NIR sky: PSFs vary on a second-to-second timescale, as do backgrounds. This necessitates that the ‘typical’ PSF of any one observation approaches the mean of the seeing conditions (i.e. following the central limit theorem). However, the distributions also all show considerable tails to high seeing; again, a demonstration of the difficulties in observing the NIR sky from the ground. This tail of poor seeing observations has some important consequences for our analysis and sample selection further down the line in our processing pipeline (Sects. 6.2 and 6.4).

Limiting magnitudes in the NIR filters were determined on individual detectors. However, as there are possibly many individual detectors per source, we need a reasonable method for combining the estimated magnitude limits into a final representative distribution of depths (i.e. that reflects the approximate depth that we achieve per source, in the correct proportion). We achieved this by measuring the background variance of the individual chips at a number of locations within the chip. Each of these background estimates is tagged with the central RA and Dec used for the estimate, and estimates (from a single band) that are within ∆{RA, Dec} ≤ 0.02 deg of each other are pooled into a single estimate, assuming that fluxes are combined using a simple average of their individual measurements (which is not strictly the case; see Sect. 6.2), and that the noise in overlapping chips is uncorrelated. This means that the variance of the mean stack of chips at each location on sky is simply

$$\sigma^2_{\rm stack} = \frac{1}{N^2}\sum_{i=1}^{N}\sigma_i^2, \qquad (11)$$

where N is the number of detectors overlapping with each other at this point on-sky, and $\sigma_i^2$ is the variance of the ith detector. From this calculation, we found that the median limiting magnitudes in the ZYJHKs-bands are {23.49, 22.74, 22.55, 21.96, 21.77}. The distribution of these combined limiting magnitude estimates on-sky is given in Appendix E.
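The pooling of overlapping-chip variances can be sketched as follows. This is a minimal illustration under the stated assumptions (simple averaging, uncorrelated noise); the conversion to a limiting magnitude additionally assumes that the variance refers to the flux in the measurement aperture on the AB zero-point of 30, and the function names are hypothetical.

```python
import math

def stacked_variance(chip_variances):
    """Variance of the simple mean of N uncorrelated chip measurements."""
    n = len(chip_variances)
    return sum(chip_variances) / n**2

def limiting_magnitude(variance, n_sigma=5.0, zero_point=30.0):
    """Magnitude of a source whose flux equals n_sigma times the noise."""
    return zero_point - 2.5 * math.log10(n_sigma * math.sqrt(variance))
```

For example, two overlapping chips of equal variance halve the pooled variance, which deepens the limiting magnitude by 2.5 log10(√2) ≈ 0.38 mag relative to a single chip.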

Fig. 20

Distribution of recalibration factors ℱ, derived by KVPIPE using the parameters provided by CASU, for VIKING and VIKING-like observations in the KiDS+KiDZ fields. Following Driver et al. (2016) and Wright et al. (2019), we applied a rejection of detectors with recalibration factors ℱ ≥ 30. There are additional indirect selections, however, that occur for fields with large PSF sizes (see Sect. 6.4).

Fig. 21

Primary observational properties of pointings in KiDS+KiDZ for observations in the five NIR bands. Each row shows the distribution of PSF sizes (as measured on each VISTA chip; left) and limiting magnitude (as determined by the magnitude of a 5σ source in a 2″ circular aperture; right) determined with a KDE using the annotated kernel. The corresponding cumulative distribution functions for each panel are shown as grey lines.

5 KiDZ spectroscopic compilation

As discussed in Sects. 3.1.2 and 4.1.2, we undertook a targeted campaign to observe eight spectroscopic calibration fields, which make up the ‘KiDZ’ fields. The associated spectroscopic redshift data used in DR5 is a compilation of public and proprietary measurements from a range of surveys, which we refer to here simply as ‘the full compilation’. The list of spectroscopic surveys that we draw from to create the full compilation is provided in Table 10.

The compilation includes two main components. The first are redshifts from wide-angle spectroscopic surveys that mainly cover lower redshifts and brighter magnitudes, and which overlap at least partly with the KiDS main survey area. Surveys that contribute to this portion of the dataset include SDSS (York et al. 2000), GAMA (Driver et al. 2022), 2dFLens (Blake et al. 2016), and WiggleZ (Parkinson et al. 2012). The second component comes from several deep (often pencil-beam) spectroscopic surveys that intersect the KiDZ fields, including VVDS (Ilbert et al. 2006), VIPERS (Scodeggio et al. 2018), DEEP2 (Newman et al. 2013), C3R2 (Euclid Collaboration 2021), and zCOSMOS (Lilly et al. 2009).

A number of these datasets have sources in common (due, in part, to some of the samples being compilations themselves: e.g. G10_COSMOS, zCOSMOS, and GOODS). Furthermore, some datasets include duplicate and/or multiple redshift estimates, with some appearing more than twice in their respective datasets. To remove duplicated spectroscopic redshift estimates, we adopted a strategy designed to retain only the highest-quality redshift estimates per source. Before performing this duplicate removal, however, we first homogenised the redshift quality flags among the different surveys (which are defined using various inconsistent criteria). This procedure resulted in a single redshift quality flag, based on the GAMA NQ nomenclature, that indicates the confidence in a redshift estimate’s accuracy as (12)

We note that in what follows we only use objects with NQ ≥ 3.

This spectroscopic compilation is nominally the same as used previously in van den Busch et al. (2022), although the overlap with photometric data has increased considerably due to our dedicated KiDZ imaging. van den Busch et al. (2022) describe the homogenisation process used in the construction of the compilation, and the results of the merger in the spectroscopic fields available during DR4. This process is unchanged here, but is applied to the larger KiDZ sample that is now available. Here we summarise the selection and quality flag assignment per input survey.

  • Arizona CDFS Environment Survey (ACES; Cooper et al. 2012): galaxies were selected with Z_QUALITY ≥ 3 and zErr/ɀ < 0.01. We assigned NQ = Z_QUALITY;

  • Deep Extragalactic Visible Legacy Survey (DEVILS; Davies et al. 2018): galaxies were selected with spectroscopic redshifts (i.e. zBestType = spec), and with flags starFlag = 0, mask = 0, and artefactFlag = 0. We assigned NQ = 4 if zBestSource = DEVILS and NQ = 3 otherwise;

  • Complete Calibration of the Colour-Redshift Relation (C3R2) survey: galaxies were selected from the combination of four C3R2 public datasets: DR1 (Masters et al. 2017), DR2 (Masters et al. 2019), DR3 (Euclid Collaboration 2021), and KMOS (Euclid Collaboration 2020). We required QFLAG ≥ 3 for galaxies in this sample, and assigned NQ = QFLAG;

  • Deep Imaging Multi-Object Spectrograph survey (DEIMOS; Hasinger et al. 2018): galaxies were selected with quality flag Q = 2. We assigned NQ = 4 for Qf ∊ [4, 14], and NQ = 3 otherwise;

  • DEEP2 (Newman et al. 2013): as in the previous KiDS papers (Hildebrandt et al. 2017, 2020, 2021) we selected galaxies from two equatorial fields (0226 & 2330), with Z_QUALITY ≥ 3 and zErr/ɀ < 0.01. We assigned NQ = Z_QUALITY;

  • Fiber Multi-Object Spectrograph COSMOS survey (FMOS-COSMOS; Silverman et al. 2015): galaxies were selected with quality flag q_z ≥ 2. We assigned NQ = 4 if q_z = 4 and NQ = 3 otherwise;

  • GAMA (Driver et al. 2022): galaxies from the 4th Data Release were selected with redshift quality NQ ≥ 3 and with ɀ > 0.002 to avoid stellar contamination. We propagated NQ from GAMA;

  • GAMA-G15Deep (Kafle et al. 2018; Driver et al. 2022): galaxies were selected with input redshift quality Z_QUAL ≥ 3 and with redshifts ɀ > 0.001 to avoid stellar contamination. We assigned NQ = Z_QUAL;

  • G10-COSMOS (Davies et al. 2015): galaxies were selected with Z_BEST as the redshift value, and otherwise following the documentation for selecting galaxy redshifts: Z_BEST > 0.0001, Z_USE < 3, and STAR_GALAXY_CLASS = 0. We assigned NQ = 3.5 to all sources;

  • Great Observatories Origins Deep Survey (GOODS CDFS): galaxies were selected from the public ESO compilation of spectroscopy in the CDFS field8 (Popesso et al. 2009; Balestra et al. 2010). Following the recommendations in the dataset description9 we selected ‘secure’ redshifts (assigning NQ = 4 to them) and ‘likely’ redshifts (NQ = 3);

  • Hectospec COSMOS survey (hCOSMOS; Damjanov et al. 2018): all galaxies from the published dataset were selected and assigned redshift quality NQ = 4;

  • The Large Early Galaxy Astrophysics Census (LEGA-C; van der Wel et al. 2016): galaxies were selected with f_use = 1. We assigned NQ = 4 to all sources;

  • Australian Dark Energy Survey (OzDES; Lidman et al. 2020): galaxies were selected in two patches partly overlapping with the KiDZ fields around CDFS and the VVDS 2h field, with required quality qop ∊ {3, 4}. An additional selection of ɀ > 0.002 was made to exclude stellar contaminants. We assigned NQ = qop;

  • SDSS (Abolfathi et al. 2018): galaxies from the 14th Data Release were selected with zWarning = 0 and zErr > 0. We furthermore required that zErr < 0.001, zErr/ɀ < 0.01, and ɀ > 0.001. We assigned NQ = 4 to all such selected galaxies;

  • VANDELS survey (Garilli et al. 2021): galaxies were selected with (zflg mod 10) ∊ {2, 3, 4}. We assigned NQ = 4 if (zflg mod 10) ∊ {3, 4}, and NQ = 3 otherwise. The reassignment of the quality flags here was motivated by the reportedly high redshift confidence of objects with flag values of two and three.

  • VIPERS (Scodeggio et al. 2018): galaxies were selected with 2 ≤ zflg < 10 or 22 ≤ zflg < 30. We assigned NQ = 4 if 3 ≤ zflg < 5 or 23 ≤ zflg < 25, and NQ = 3 otherwise;

  • VIMOS Ultra Deep Survey (VUDS; Le Fèvre et al. 2015): galaxies were selected with flag zflags ending with {3, 4, 9} (reliability ≥80%) and assigned NQ = 4 if 3 ≤ zflags < 5 or 13 ≤ zflags < 25, and NQ = 3 otherwise;

  • VVDS (Le Fèvre et al. 2005, 2013): galaxies were selected from the combined WIDE, DEEP, and UDEEP sub-samples, with ZFLAGS ∊ {3, 4, 23, 24}. We assigned NQ = 4 to all sources.

  • zCOSMOS: galaxies were selected from a compilation of public (Trump et al. 2009; Comparat et al. 2015; Lilly et al. 2009) and proprietary10 spectra in the COSMOS field, kindly provided to us by Mara Salvato, updated as of 1 September 2017. That dataset includes some of the surveys already included in our compilation, but also provides redshifts from various other campaigns. We used the provided quality flag and selected galaxies with 3 ≤ Q_f ≤ 5, or 13 ≤ Q_f ≤ 15, or 23 ≤ Q_f ≤ 25, or Q_f ∊ {6, 10}. We rejected sources with low-confidence redshift estimates (e.g. from grism spectroscopy), and limited the galaxies to z_spec > 0.002 to avoid stellar contamination. We assigned redshift quality as NQ = min((Q_f mod 10), 4).
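To make the per-survey flag homogenisation concrete, the rules listed above for two of the input surveys (VIPERS and VANDELS) can be written as small mapping functions. This is a sketch only: the function names are hypothetical, returning None for rejected sources is a convention chosen here, and the VANDELS flag is assumed to be integer-valued.

```python
def vipers_nq(zflg):
    """Map a VIPERS zflg to NQ, or None if the source is not selected."""
    selected = (2 <= zflg < 10) or (22 <= zflg < 30)
    if not selected:
        return None
    secure = (3 <= zflg < 5) or (23 <= zflg < 25)
    return 4 if secure else 3

def vandels_nq(zflg):
    """Map a VANDELS zflg to NQ using its last digit, or None if rejected."""
    last_digit = int(zflg) % 10
    if last_digit not in (2, 3, 4):
        return None
    return 4 if last_digit in (3, 4) else 3

def usable(nq):
    """Only objects with NQ >= 3 are used in what follows."""
    return nq is not None and nq >= 3
```

Keeping one function per survey mirrors the list above and leaves the final NQ ≥ 3 requirement as a single, survey-independent cut.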

When combining the above spectroscopic samples, we removed both internal (i.e. within the same input catalogue) and external (i.e. in different input catalogues) duplicates. In the latter case, we assigned the most reliable measurement per source based on a specific ‘hierarchy’. Namely, we joined the catalogues by crossmatching objects within 1″ radius and applied the following order of preference:

  • GAMA takes precedence over others, followed by SDSS; then:

  • COSMOS field: G10-COSMOS > DEIMOS > hCOSMOS > VVDS > LEGA-C > FMOS > VUDS > C3R2 > DEVILS > zCOSMOS;

  • CDFS field: ACES > VANDELS > VVDS > VUDS > GOODS CDFS > DEVILS > OzDES;

  • VIPERS_W1 field: VIPERS > VVDS > C3R2 > DEVILS > OzDES.

This hierarchy was followed regardless of the relative quality flags between different surveys. For objects with multiple spectroscopic measurements within a particular survey, we either selected the redshift with the highest quality flag or, if various entries for the same source have the same quality flag and the reported redshifts differ by no more than 0.005, we used the average of the provided redshift estimates. If the reported redshifts have the same quality flag but differ by more than 0.005, we excluded the source from the compilation.
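The duplicate-removal logic described above can be sketched as follows. This is an illustrative reading of the text rather than the code used for the compilation: the data layout (tuples of redshift and NQ, and a dictionary keyed by survey name) and the function names are assumptions, and the cross-match that groups entries into one source is taken as already done.

```python
# Order of preference in the COSMOS field, with GAMA and SDSS first.
COSMOS_ORDER = [
    "GAMA", "SDSS", "G10-COSMOS", "DEIMOS", "hCOSMOS", "VVDS",
    "LEGA-C", "FMOS", "VUDS", "C3R2", "DEVILS", "zCOSMOS",
]

def resolve_internal(entries, max_dz=0.005):
    """Resolve repeated (z, nq) entries for one source within one survey.

    Returns (z, nq), or None if equally flagged redshifts disagree.
    """
    best_nq = max(nq for _, nq in entries)
    best = [z for z, nq in entries if nq == best_nq]
    if max(best) - min(best) > max_dz:
        return None  # same quality flag but discrepant redshifts: exclude
    return sum(best) / len(best), best_nq

def resolve_external(candidates, order):
    """Pick the entry from the highest-ranked survey, ignoring the NQ values.

    candidates maps survey name to an already resolved (z, nq) tuple.
    """
    for survey in order:
        if survey in candidates:
            return survey, candidates[survey]
    return None
```

For instance, a source with a DEIMOS entry at NQ = 3 and a zCOSMOS entry at NQ = 4 keeps the DEIMOS redshift, because the hierarchy takes precedence over the relative quality flags.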

The final spectroscopic sample consists of 635 099 spectroscopic redshift estimates taken from 22 samples. Table 10 lists the statistics for this full compilation, prior to matching with sources detected in our KiDZ imaging. The table presents the number of spectroscopic redshift estimates, redshift range, mean redshift, and redshift scatter (computed using the NMAD) for each sample. Figure 22 shows the distribution of spectroscopic redshift estimates from the full compilation (blue and red) in the context of the individual KiDZ fields on-sky. The spectroscopic data can be seen to extend beyond the spatial extent of the KiDZ optical and NIR data (grey scale), to maximise the cross-match between available spectroscopic redshift estimates and extracted KiDZ photometric sources.

Finally, the spectroscopic redshift estimates were cross-matched to the sources detected in the KiDZ THELI-Phot-reduced images using a simple sky match at 1″ radial tolerance11. The matched sources are shown in Fig. 22 as blue points. The final sample of spectroscopic redshift estimates after matching to the KiDZ photometric data contains 126 085 sources; the statistics for these redshift estimates are also provided in Table 10.

Figure 23 shows the cross-matched KiDZ spectroscopic redshift estimates in the RA-ɀ plane, demonstrating the relative depth of the various fields, and large-scale structures contained within them. For comparison, the spectroscopic redshift estimates that were available for previous KiDS analyses are shown in orange.

Table 10

Statistics for the KiDZ spectroscopic sample.

6 Optical and NIR

At this stage in the reduction process we have calibrated optical imaging and masks in the u𝑔ri1i2-bands from THELI and ASTROWISE, source catalogues extracted with SOURCE EXTRACTOR from THELI imaging, forced GAAP photometry in the u𝑔ri1i2-bands from ASTROWISE, and calibrated ZYJHKs-band detectors from KVPIPE. These data products are prepared for all pointings in KiDS and KiDZ. The remaining tasks are therefore the measurement of GAAP photometry for all sources in the ZYJHKs-bands (on individual VIKING chips), combination of per-chip flux estimates for individual sources (per band), correction of fluxes for Galactic extinction, estimation of photometric redshifts, and construction of final ten-band masks per tile. Each of these tasks is performed within our new post-processing pipeline PHOTOPIPE.

6.1 The PhotoPipe repository

The PHOTOPIPE repository was constructed from scripts used for KiDS DR4, and is publicly available on GitHub12. The pipeline serves one primary function: to ensure that reduction is performed consistently across a wide array of complex steps and processes, even under significant changes to underlying datasets and/or methodology choices.

With regard to the scripts used for DR4, significant changes implemented in PHOTOPIPE include updated selection of source apertures used for photometry (Sect. 6.2), new masking of pathological photometric failures (Sect. 6.4), and updated calibration for THELI-Lens ‘auto’ magnitudes.

Fig. 22

Distribution of KiDZ spectroscopic redshift estimates on-sky. The figure shows the distribution of all available spectroscopic redshift estimates from the full spectroscopic compilation (red) and those that are matched to unmasked KiDZ sources (blue). The available footprint of the KiDZ imaging is shown in grey scale beneath the points, demonstrating where we have imaging but no available spectra (and vice versa).

Fig. 23

Distribution of KiDZ spectroscopic redshift estimates in RA-ɀ space. Sources in orange indicate estimates that were available for previous KiDS analyses, and sources in cyan show new estimates added here.

6.2 NIR GAAP photometry

The VIKING NIR data are treated differently than the optical data due to the inherently different dither patterns of KiDS and VIKING. As discussed in Sect. 3.1, KiDS was designed with small dithers that ensure similar PSF properties are combined at each point in the focal plane, thereby improving the stability of the final co-added images and resulting in stacks that mostly have smoothly varying PSF. In contrast, as discussed in Sect. 4, VIKING tiles require dithering by a fair fraction of the field of view, resulting in large discontinuous variations in the PSF across the mosaicked tile. Rather than modelling these complex PSF patterns directly (required for the Gaussianisation process performed by GAAP), we opted to extract the NIR photometry from the individual chips of each exposure, and perform an optimal averaging of the flux measurements at the catalogue level; that is, we performed our mosaic construction in catalogue space.

GAAP yields a flux measurement for each object and each exposure on which it appears, typically resulting in two observations for each object in the ZYHKs-bands, and four observations in the J-band. These fluxes are then combined using an approximately13 optimal inverse variance weighting, based on the flux error reported by GAAP.
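The catalogue-level combination can be illustrated with a plain inverse-variance weighted mean. This is a minimal sketch: it implements the textbook estimator, whereas the pipeline's weighting is described only as approximately optimal, and the function name is hypothetical.

```python
def combine_fluxes(fluxes, errors):
    """Inverse-variance weighted mean flux and its uncertainty.

    fluxes, errors: per-exposure flux measurements and their errors for
    one source in one band.
    """
    weights = [1.0 / e**2 for e in errors]
    total = sum(weights)
    flux = sum(w * f for w, f in zip(weights, fluxes)) / total
    return flux, total**-0.5
```

With two equally noisy exposures this reduces to the simple mean with an error smaller by √2, while a much noisier exposure contributes correspondingly little to the combined flux.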

One complication in this process, already described in Kuijken et al. (2019), is the fact that GAAP will fail to produce a flux estimate in band X if the aperture size, which is set by the r-band seeing, is smaller than the X-band seeing. Furthermore, short of complete failure, the GAAP flux measurements also become extremely noisy when the aperture size is comparable to the PSF size residual, as flux information is being determined by few data pixels. Such circumstances are infrequent but nonetheless bothersome: inspection of the PSF distributions in Figs. 7 and 21 demonstrates that the NIR PSFs consistently exceed the median r-band seeing by more than 0″.5 for roughly 10% of the observations (closer to 15% in the Z-band). To remedy this behaviour the pipeline runs GAAP twice, implementing two different choices of minimum aperture sizes: 0″.7 and 1″.0. This forces the smallest sources to have larger apertures in the 1″.0 case, in an effort to suppress the bias and failure rate (the NIR PSF sizes are more than 1″ larger than the median r-band seeing in fewer than 1% of observations). Once all flux measurements are available in all bands, the choice of which aperture to use for flux measurement is taken on an object-by-object basis, using the following criterion: (13)

where R is the vector of flux error ratios for the two apertures in each band: (a negative R identifies cases where all 0″.7 apertures fail to yield a flux.) This approach maximises the number of objects with high-S/N GAAP flux measurements in all bands, and yields the best possible photo-ɀ from the combined KiDS+VIKING data.

To demonstrate the quality of the ten-band photometric catalogues and the cross-survey calibration, in Fig. 24 we present an optical and NIR colour-colour diagram for KiDS DR5 sources. The figure shows the u − 𝑔 and 𝑔 − i optical colours versus the J − Ks colour from VIKING. The figure shows the standard stellar locus and galaxy colour distributions, with the sources that match to known SDSS stars highlighted with red contours. We also overlay the colours of all Pickles main sequence stellar templates (Pickles 1998), as observed through the relevant optical and NIR filters. We note the excellent agreement between the Pickles templates and the observed stellar locus: the only potential discrepancy is a small offset (∆u ≤ 0.05) in the u-band, where (uncertain) modelling of metallicity in the stellar population can cause significant bias. Nonetheless, the u-band discrepancy is consistent with our systematic uncertainty of ∆usys = 0.05 (Sect. 3.5.2).

6.3 Photometric redshift estimation

Photometric redshifts were estimated in a very similar way as in KiDS DR4, with the primary difference being the incorporation of the second pass i-band information. We ran the Bayesian photo-ɀ (BPZ; Benítez 2000) code on the ten-band GAAP photometry using the Bayesian prior presented in Raichoor et al. (2014), a maximum redshift of ɀ = 7, and a redshift stepping of ∆ɀ = 0.01. The two i-band measurements are treated independently, relying on BPZ to optimally use the information based on the two GAAP magnitude errors. Providing the two fluxes independently in this way is mathematically equivalent to providing BPZ with a single inverse-variance-weighted average of the two i-band flux measurements.
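The equivalence claimed for the two i-band measurements can be checked numerically for a Gaussian χ² fit. The sketch below is an illustration under that assumption and does not call BPZ; the function names are hypothetical. It shows that fitting two independent fluxes differs from fitting their inverse-variance weighted mean only by a term that does not depend on the model flux, so the relative likelihood of any two models is unchanged.

```python
def chi2_two(model, f1, s1, f2, s2):
    """Chi-squared with the two measurements entered independently."""
    return ((f1 - model) / s1) ** 2 + ((f2 - model) / s2) ** 2

def chi2_combined(model, f1, s1, f2, s2):
    """Chi-squared with the inverse-variance weighted mean entered once."""
    w1, w2 = 1.0 / s1**2, 1.0 / s2**2
    f_bar = (w1 * f1 + w2 * f2) / (w1 + w2)
    return (f_bar - model) ** 2 * (w1 + w2)

def chi2_offset(model, f1, s1, f2, s2):
    """Difference between the two; expected to be (f1 - f2)^2 / (s1^2 + s2^2)."""
    return chi2_two(model, f1, s1, f2, s2) - chi2_combined(model, f1, s1, f2, s2)
```

Evaluating `chi2_offset` at different model fluxes for fixed data returns the same value each time, which is the sense in which the two approaches carry identical information.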

The resulting photo-ɀ point-estimates were compared to the available KiDZ spec-ɀ (Sect. 5), and summary statistics for the various fields and full compilation are presented in Fig. 25. The figure contains both summary statistics computed for the entire compilation of spec-ɀ present in all of the KiDZ fields and for statistics computed in the individual KiDZ fields. In all cases the samples are binned into quasi-equal-N bins, which contain at most 2000 sources per bin. In the event that a field has fewer than 4000 spectra, the sample is split into two equal-N bins. This has the effect of producing short straight lines for some fields in the figures.

We quantify the photo-ɀ quality via the statistics of ∆ɀ = (ɀB − ɀspec)/(1 + ɀspec). In particular, we report in Fig. 25 the median, NMAD scatter, and rate of outliers (objects with |∆ɀ| > 0.15) as a function of r-band magnitude, spectroscopic redshift, and photometric redshift. The well-known dependence of photo-ɀ quality on photometric S/N, here parametrised by the r-band magnitude, is clearly visible for r > 22. For r < 22 the photo-ɀ scatter and outlier rate are roughly constant, at and , respectively, indicating that other sources of error limit the precision of the photo-ɀ at bright magnitudes. Of further note is the increase in photo-ɀ bias seen at magnitudes brighter than r = 20; here the photometric pipeline, and photo-ɀ priors, used for the KiDS-DR5 sample are not optimal, having been chosen to optimise performance at fainter magnitudes (see e.g. Fig. 10 of Wright et al. 2019).
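The three summary statistics can be computed as in the following sketch. The NMAD is taken here with the conventional 1.4826 scaling about the median, which is an assumption since the text does not spell out the definition; the function name is hypothetical.

```python
import statistics

def photoz_statistics(z_phot, z_spec, outlier_threshold=0.15):
    """Return (bias, scatter, outlier rate) of dz = (zB - zspec) / (1 + zspec)."""
    dz = [(zb - zs) / (1.0 + zs) for zb, zs in zip(z_phot, z_spec)]
    bias = statistics.median(dz)
    # Normalised median absolute deviation about the median (assumed definition).
    nmad = 1.4826 * statistics.median([abs(d - bias) for d in dz])
    outlier_rate = sum(abs(d) > outlier_threshold for d in dz) / len(dz)
    return bias, nmad, outlier_rate
```

Applying the function separately to the sources in each magnitude or redshift bin gives running statistics of the kind described above.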

The dependence of photo-ɀ quality on spectroscopic and photometric redshift is more complicated, showing several features that we attribute to the non-trivial interplay of the specific filter set, depth in the different bands, the spectral energy distribution templates, and the real galaxy population. Only at high redshift and/or photo-ɀ can the behaviour be easily attributed to S/N effects, visible by the consistent degradation of quality in all KiDZ fields.

We find photo-ɀ biases at the few percent level over all magnitudes, but these are exacerbated at the extremes in photo-ɀ and redshift. These biases are typically accompanied by an increase in photo-ɀ scatter, shown in the second row of the figure. Beyond spectroscopic and photometric redshifts of one (i.e. at ɀB ≥ 1 and ɀspec ≥ 1), the scatter in the photo-ɀ can be seen to deteriorate for all available surveys due to the increased photometric noise of these distant objects. Of particular note, though, is the increase in scatter and outlier rate observed in the COSMOS sample at ɀB ≈ 0.6 and ɀspec ≈ 0.7. Indeed, in this region the COSMOS sample of sources displays roughly twice the number of outliers as other samples. This may simply be indicative of different selection effects in the COSMOS field spectroscopic sample, compared to the other fields (as the COSMOS sample is made up of many individual spectroscopic samples with a range of targeting criteria). However, the increase in scatter is coincident with an overdensity in the large-scale structure, visible as a spike in the COSMOS number counts in the bottom row (and visually in the redshift-axis diagrams shown in Fig. 23). This increase in scatter may therefore be caused by the overdensity, which causes sources at this redshift to over-sample a particular subset of colour-redshift degeneracies (specifically those degenerate with red galaxy spectra at redshift ɀ ≈ 0.7).

Regarding the inclusion of the additional i-band information in the computation of the photo-ɀ, we find that the summary statistics presented here change only slightly when moving between nine and ten bands. We find, however, that this is primarily due to the strong selection effects present in the sample that we use to construct these data-side photo-ɀ statistics. As a demonstration of the expected performance improvements in the photo-ɀ for the wide-field photometric sample, we can utilise simulations. Using the simulated KiDS galaxy sample presented in van den Busch et al. (2020), we computed our BPZ photo-ɀ with and without an additional i-band photometric realisation.

These results are presented in Fig. 26. We find a consistent 5–10% reduction in photometric scatter and outlier rate at magnitudes fainter than r = 22, where the additional noise realisation provides useful information. More significant, however, is the improvement seen in the photo-ɀ as a function of true redshift: we see a consistent 10–20% reduction in scatter at ɀ > 0.7, coupled with 5–20% reduction in outlier rate in the same redshift range. This is understandable given the propagation of the major spectral features as a function of redshift, mainly the 4000 Å and Balmer breaks: these features enter the i-band at approximately ɀ = 0.7, and beyond this redshift the additional photometric information allows us to better localise the break (and its absence, for ɀ > 1.2).

We stress here that these biases are not reflective of the bias imprinted on typical cosmological measurements from weak lensing with KiDS. The template-based photo-ɀ shown here are typically only used to define subsamples of galaxies with redshift distributions that are largely localised and distinct along the line of sight. The redshift distributions themselves are determined via empirical methods making direct use of the spec-ɀ compilation, which here is only employed for validation. Furthermore, the underlying magnitude, colour, and morphology distributions of the sources that Fig. 25 is based on are inherently different from typical weak lensing source samples, due to selection effects and weighting. As a result, the comparison of photo-ɀ and spec-ɀ shown in Fig. 25 does not directly translate into the photo-ɀ quality of galaxy samples used for cosmological measurements, but simply illustrates the raw performance on the spec-ɀ without accounting for these differences.

Finally, we note that there are ongoing efforts to produce additional photo-ɀ estimates within the KiDS collaboration, aimed at improving photo-ɀ performance via sample selection (see e.g. Vakili et al. 2023), using machine learning of fluxes (see e.g. Bilicki et al. 2021), and using machine learning directly on images (see e.g. Li et al. 2022). These efforts, however, will not produce photo-ɀ estimates that are included in the formal DR5 ESO release. Nonetheless, such additional photo-ɀ estimates will undoubtedly be of value to the community as they are designed to produce higher accuracy and precision for (in particular) samples brighter than r = 20.

Fig. 24

Colour-colour diagrams for KiDS DR5 sources. The stellar locus can be identified by the distinct clouds of data that are coincident with both SDSS stars (black contours) and Pickles stellar templates (pink dots).

Fig. 25

Quality metrics for photo-ɀ point-estimates produced by BPZ in PHOTOPIPE. Each row shows one quality metric, computed from the distribution of ∆ɀ = (ɀB − ɀspec) /(1 + ɀspec): the running median (‘bias’, µ), the running normalised median-absolute-deviation (‘scatter’, σ), and the fraction of sources with |∆ɀ| > 0.15 (‘outlier rate’, η0.15). The columns show these statistics computed as a function of r-band magnitude (left), photo-ɀ point-estimate (ɀB, centre), and spectroscopic redshift (ɀspec, right).

6.4 Ten-band mask construction

We adopt a bit-masking scheme for the FITS masks that encodes the availability and quality of data at a given sky position. The 16 bits corresponding to different image defects are described in Table 11. Compared to previous KiDS data releases, we reorder the bits to facilitate the masking of a fiducial sample for typical weak lensing applications by selecting objects with (14)

For compatibility with previous data releases and codes, the fiducial mask as a bitwise-logical statement is (15)

(corresponding to 0111 1111 1111 1110 in binary). The first four bits encode objects and defects that are present in the THELI r-band detection image. The zeroth bit (MASK = 1) corresponds to faint halos created by reflected starlight that typically do not significantly bias flux and shape measurements, and is therefore excluded in the fiducial masking scheme given in Eq. (14). The first bit (MASK = 2) corresponds to brighter halos from reflections and the direct bright starlight. The second bit (MASK = 4) encodes areas that are masked out manually due to defects that are not captured by any of the automatic algorithms, and also rejects areas that are affected by resolved dwarf galaxies or globular clusters, which confuse our automatic star-galaxy separation and subsequent PSF measurement algorithms due to the abnormally high stellar density. The third bit (MASK = 8) encodes areas that are removed from consideration by the THELI masking of regions with abnormally low number densities (the ‘void mask’), sources automatically flagged as being asteroids, and areas of THELI imaging with zero weight (caused by, for example, chip gaps and detector saturation). The following ten bits (MASK = 16 to MASK = 8192) correspond to defects in the ten u𝑔ri1i2ZYJHKs-bands that the GAAP photometry is extracted from. The final, 14th bit that is used (MASK = 16 384) corresponds to area that is outside the limits of each pointing, chosen to knit together the overlapping VST tiles without source duplication. These are referred to as the ‘WCS’ cuts, as they are defined using constant limits of RA and Dec in the World Coordinate System (WCS) of the individual tiles.
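The bitwise form of the fiducial selection can be illustrated as below. This sketch assumes that the selection requires every bit in the quoted binary pattern to be unset, which tolerates only the zeroth bit (faint halos); the constant and function names are hypothetical.

```python
# Bits 1-14 set, bit 0 (faint reflection halos) left out of the test.
FIDUCIAL_BITS = 0b0111_1111_1111_1110

def passes_fiducial_mask(mask_value):
    """True if none of the fiducial bits are set for this source or pixel."""
    return (mask_value & FIDUCIAL_BITS) == 0

def set_bits(mask_value):
    """List the bit positions (0-15) that are set in a MASK value."""
    return [bit for bit in range(16) if mask_value & (1 << bit)]
```

With the reordered bits, this test is equivalent to requiring MASK ≤ 1 for non-negative MASK values: a value of 1 (faint halo only) passes, whereas 2 (bright halo) or 16 384 (outside the WCS cuts) does not.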

One significant change in the masks from previous KiDS releases is the removal of redundancy between the r-band mask created for the THELI detection images (bits 0–3) and the u𝑔ri1i2-band masks created for the ASTROWISE images that are used for the optical GAAP photometry (bits 4–8). Previous releases ignored the faint stellar reflection halos from the ASTROWISE r-band masks (bit 5), instead relying on the ability of the THELI r-band mask to capture those effects in bits 1 and 0 (MASK = {1, 2}). This choice was made because the default ASTROWISE masking was slightly too conservative, correlating more with the THELI-Lens conservative mask (MASK = 1) than with the standard THELI mask (MASK = 2). This effect was largely resolved with the update of the PULECENELLA mask parameters (Sect. 3.5.3), but nonetheless persists at low levels. In DR5, we apply all ASTROWISE mask bits in our fiducial mask. If we were to remove the ASTROWISE r-band masks from the fiducial masking, as in DR4, we would free up 16.0 deg² of data over the full survey area (~1.6%). The final correlations between the various bits of the MASK are provided in Appendix F.

Another addition to the masks for DR5 is the inclusion of a ‘strong-selection’ mask, which is designed to remove areas of the survey where there have been pathological photometric measurement failures within GAAP. These failures are related to the difference in the PSF sizes between the detection r-band images and the other bands, as discussed in Sect. 6.2. As discussed, this effect led initially to the requirement of measuring fluxes in apertures with two minimum radii (see Sect. 3.6), but in some cases even the use of the larger aperture is not sufficient to stop GAAP from being unable to measure a flux. As these failures are related to the size of the object aperture, when such failures occur, they preferentially affect the smallest objects first. This can imprint a redshift-dependent bias on the source distribution that is localised in particular areas on-sky. Such a selection bias is problematic for (primarily) photometric clustering studies, but also for cosmic shear (albeit at higher order) as it exacerbates variable depth (see e.g. Heydenreich et al. 2020).

We therefore opted to remove areas of the survey where such pathological failures on-sky are detected. We quantify this failure using a 2D kernel-density estimate (KDE) measurement of the source density on-sky before and after selection due to GAAP photometric failures. If the ratio of these two KDEs is less than 0.8 (i.e. more than 20% of the sources, per square arcminute, have been removed due to photometric failures), we flag these regions and remove them from the survey. This masking can be seen in Fig. 27, where we show the effect of this masking for one of the KiDZ tiles in VIPERS. The figure shows the distribution of all sources in the tile, without any preselection. The sources are coloured by whether or not they have a photometric failure in the Y-band: blue have successful photometric measurements, and red have failed measurements. In the background of the image we show the Y-band sum-image for this field, after the application of the WCS cuts (bit 14). This shows the regular tile pattern formed by the dithering of many paw-prints (Sect. 4). It is apparent from the photometric failures that there are two paw-prints that have particularly poor seeing, which causes sources covered only by these paws to have failed photometric measurements. These areas of the sky have been identified by the strong selection algorithm, and these areas have been masked (leading to the sum-image being zero, white, in these areas). We note that the sources have not all failed in these regions; however, the pathological nature of the selection means that we have nonetheless removed these areas from consideration. It should also be stressed that this strong-selection masking does not correlate with cosmic large-scale structure, as it is driven by observing conditions. As such, it is unlikely to introduce any bias in cosmological measurements from weak lensing.
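The density-ratio test described above can be sketched as follows. This is a simplified, binned approximation to a KDE (a 2D histogram smoothed with a Gaussian kernel), not the pipeline implementation. The function names, the interpretation of the 1′ kernel as a standard deviation, and the synthetic source positions are all assumptions made for illustration.

```python
import numpy as np


def gaussian_smooth(image, sigma_pix):
    """Separable Gaussian smoothing with a kernel truncated at 4 sigma."""
    half = int(np.ceil(4 * sigma_pix))
    offsets = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_pix) ** 2)
    kernel /= kernel.sum()
    smooth = np.apply_along_axis(np.convolve, 0, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 1, smooth, kernel, mode="same")


def strong_selection_mask(x, y, failed, extent, pix=1.0, sigma=1.0, threshold=0.8):
    """Flag grid cells where the smoothed density ratio (measured / all) is below threshold.

    x, y are positions in arcmin, failed is a boolean array marking photometric
    failures, and extent is (xmin, xmax, ymin, ymax) in arcmin.
    """
    xmin, xmax, ymin, ymax = extent
    xedges = np.arange(xmin, xmax + pix, pix)
    yedges = np.arange(ymin, ymax + pix, pix)
    n_all, _, _ = np.histogram2d(x, y, bins=(xedges, yedges))
    n_good, _, _ = np.histogram2d(x[~failed], y[~failed], bins=(xedges, yedges))
    dens_all = gaussian_smooth(n_all, sigma / pix)
    dens_good = gaussian_smooth(n_good, sigma / pix)
    # Cells without any sources are left unflagged (ratio set to 1).
    ratio = np.divide(dens_good, dens_all, out=np.ones_like(dens_all), where=dens_all > 0)
    return ratio < threshold


# Synthetic 60' x 60' field: sources at x < 15' fail 60% of the time (toy "bad paw-print").
rng = np.random.default_rng(1)
n = 40000
x = rng.uniform(0, 60, n)
y = rng.uniform(0, 60, n)
failed = rng.random(n) < np.where(x < 15, 0.6, 0.01)
mask = strong_selection_mask(x, y, failed, extent=(0, 60, 0, 60))
print("flagged fraction in bad strip:", mask[:12].mean())
print("flagged fraction elsewhere:   ", mask[25:].mean())
```

In this toy field the strip with a high failure rate is flagged almost entirely, while the rest of the field is left untouched, even though not every source in the flagged strip has failed.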

Fig. 26

Differences between photo-z point-estimates produced by BPZ in PHOTOPIPE computed without the second pass i-band (‘9-band’) and with the second pass i-band (‘10-band’), using a full simulated KiDS wide-field sample described in van den Busch et al. (2020). Metrics here are not directly comparable to Fig. 25 because this sample is simulated, has a different redshift baseline, and represents a full wide-field sample (unmatched to spectroscopy). Therefore, only relative differences between the nine- and ten-band cases are relevant. Metrics are computed as in Fig. 25, except that they are shown here as a percentage difference with respect to the nine-band case.

Table 11

KiDS+KiDZ MASK bits, their names, and additional information.

Fig. 27

Demonstration of the strong-selection masking in the KiDS+KiDZ catalogues for KiDZ pointing KIDZ_333p0_1p9 in the Y-band. All sources detected by SOURCE EXTRACTOR are shown coloured by the number of missing photometric bands as determined by GAAP (0: blue, 1: red). The mask is determined by the fraction of sources per unit area on sky that are missing one or more bands, as determined using a ratio of KDEs constructed on a 1′ × 1′ grid with a 1′ Gaussian kernel. The strong-selection mask removes regions with missing-source fractions greater than 20%. The effect of this mask can be seen in the background, which shows the sum image of this field’s Y-band data after masking: areas that show the strong selection effect (i.e. red dots) have a zero value in the sum image (i.e. white), and are therefore masked.

6.5 Mosaic mask construction

Finally, we constructed mosaic masks for science use and calculation of the fiducial survey area for KiDS. Mosaic masks are constructed using SWARP combining all individual ten-band masks using a ‘minimum’ combination. Masks are constructed on a predefined WCS, using an Aitoff projection and a 6″ pixel resolution.

Using these mosaic masks, we are able to calculate the area of the KiDS DR5 data when selecting specific mask bits. Table 12 presents a compilation of the available survey areas when masking specific bits, and combinations of bits. We note a few particular examples. First, the area of the survey when considering all available tiles after removing overlaps (‘WCS’, bit 14) is 1331.0 deg$^2$. This area is then further reduced when considering masking of artefacts, stars, stellar reflections, and missing chips in the THELI-Lens r-band imaging (‘WCS+THELI-Lens’, bits {1–4, 14}) to 1145.4 deg$^2$. Considering only the area available to all of the $ugri_1i_2$-bands (‘WCS+THELI-Lens+ASTROWISE’, bits {1–8, 14}), the survey area reduces to 1074.8 deg$^2$. After removal of area that does not have overlap with the VISTA $ZYJHK_{\rm s}$-bands, the survey area is further reduced to the ‘fiducial’ survey area of 1014.0 deg$^2$ (‘WCS+THELI+AW+NIR’, bits {1–14}).

Table 12

Mosaic areas computed from the 6″ mosaic masks.

7 Shape measurement and legacy sample construction

In this section we detail the construction of the lensing sample that will be used for the fiducial KiDS DR5 cosmological analyses. This ‘KiDS-Legacy’ sample is a subset of the full DR5 sample, determined (primarily) by the availability of reliable shape measurements. We first describe the new version of our fiducial shape measurement code lensfit (Miller et al. 2013, Sect. 7.1), followed by the selection of the KiDS-Legacy sample (Sect. 7.2).

7.1 DR5 lensfit

Fiducial shape measurement in KiDS DR5 was performed with lensfit v321. A demonstration of the properties of this lensfit version is provided in Li et al. (2023), where the accuracy of the measured shapes was tested with simulations. Regarding the use of lensfit for selecting the DR5 lensing sample, there are two important changes with respect to DR4 (which was produced using lensfit version v309): the values of the fitclass flags have been modified, and the calibration procedure for the lensing weight has been modified.

7.1.1 fitclass

The new definitions of the fitclass flags in lensfit v321 are presented in Table 13. These new flags require the selection to be updated, and the choice of which flags to reject is also presented in the table. A total of 31.31% of the available MASK ≤ 1 sample is flagged by the various fitclass selections, the most severe of which is the ‘insufficient data’ flag, which is assigned to 20.69% of the MASK ≤ 1 sources. However, the majority of these sources are not modelled by lensfit because they are too bright (see Sect. 7.2). In practice, the fitclass selection removes 14.03% of sources that would otherwise exist in the sample, the majority of which are either stars (9.07%) or have multiple sources detected within a single segmentation region (and so are deemed to be blended, 3.85%).

7.1.2 Weight recalibration

Galaxy shapes can be expressed as ellipticities ε, quantified using moments of the light distribution or using direct model fits to galaxy images. They take the form of complex quantities combining the source axis ratio (q) and the major axis orientation angle (θ):

$$\epsilon = \epsilon_1 + \mathrm{i}\,\epsilon_2 = \frac{1-q}{1+q}\,\exp(2\mathrm{i}\theta). \qquad (16)$$

The magnitude of any source’s ellipticity is therefore simply

$$|\epsilon| = \frac{1-q}{1+q}. \qquad (17)$$

Ellipticities are measured by lensfit using a model-fitting approach that is described at length in Miller et al. (2013). Source ellipticities are reported at their posterior-informed maximum likelihood (see Sect. 3.5 of Miller et al. 2013), and the 2D measurement variance of the likelihood in ellipticity is provided as $v_\epsilon$. Faint sources have inherently noisier ellipticity estimates than their brighter counterparts, and this is reflected in them having a broader likelihood surface and a larger variance $v_\epsilon$. To translate this uncertainty in shape measurement to subsequent scientific analyses, each source i is therefore tagged with a shape measurement weight $w_i$, which is related to the inverse of the 2D variance estimate. Individual weight estimates are computed as

$$w_i = \left[\frac{v_{\epsilon,i}\,|\epsilon|_{\rm max}^2}{2|\epsilon|_{\rm max}^2 - 4v_{\epsilon,i}} + v_{\epsilon,{\rm pop}}\right]^{-1}, \qquad (18)$$

where |ε|max is the maximum allowed ellipticity (determined by the intrinsic thickness of edge-on disk galaxies), and vε,pop is the population variance in ellipticity of all galaxies in the sample.
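The ellipticity and weight definitions above can be sketched numerically as follows. The ellipticity is taken to be the standard complex form $(1-q)/(1+q)\,\exp(2\mathrm{i}\theta)$, and the weight is assumed to follow the functional form given by Miller et al. (2013); both are assumptions of this sketch rather than a transcription of the lensfit code. The numerical values of the population variance and the maximum ellipticity are placeholders.

```python
import cmath


def ellipticity(q, theta):
    """Complex ellipticity for axis ratio q (0 < q <= 1) and position angle theta [rad]."""
    return (1.0 - q) / (1.0 + q) * cmath.exp(2j * theta)


def shape_weight(v_eps, v_pop, eps_max):
    """Weight from the 2D measurement variance (assumed Miller et al. 2013 form).

    The guard returning zero for a non-positive denominator is a choice made
    for this sketch only.
    """
    denom = 2.0 * eps_max**2 - 4.0 * v_eps
    if denom <= 0.0:
        return 0.0
    return 1.0 / (v_eps * eps_max**2 / denom + v_pop)


# Placeholder values for illustration only.
V_POP, EPS_MAX = 0.0625, 0.8

eps = ellipticity(q=0.5, theta=0.3)
print("ellipticity:", eps, " magnitude:", abs(eps))
for v in (0.0, 0.01, 0.05):
    print(f"v_eps={v:.2f}  weight={shape_weight(v, V_POP, EPS_MAX):.3f}")
```

The factor of two in the exponent makes the ellipticity invariant under a rotation of the major axis by π, and the weight decreases monotonically as the measurement variance grows, saturating at the inverse population variance for perfectly measured sources.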

Unfortunately, shapes measured by the lensfit algorithm are not free of systematic biases, and therefore require recalibration. Of particular concern is the systematic imprint of the PSF shape on the estimated properties of galaxies, as such systematic contamination of estimated ellipticities and weights would ultimately introduce bias into shear correlation functions (which are a primary probe of cosmology). As such, our recalibration procedure (described in Li et al. 2023, and with minor modifications explained below) is designed to remove residual imprints of the PSF on the data (although the resulting shape estimates nonetheless require a correction for multiplicative shear bias; see Sect. 7.1.4).

Although the KiDS PSF in the r-band is relatively round, any anisotropy can lead to inferred galaxy shapes that are correlated with the orientation of the PSF. This ‘PSF leakage’ affects both the ellipticities per source, and the associated weights. PSF leakage in measured source weights refers to the tendency of sources to be preferentially up-weighted when their on-sky orientation happens to be aligned with the PSF. PSF leakage in the measured source shapes refers to the tendency of detected sources to be preferentially aligned with the PSF, as they have higher effective surface brightnesses in this regime (and thus are less likely to have been missed during source detection).

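In KiDS-Legacy, we corrected for PSF leakage in the weights of sources by measuring the linear relationship between the variance of the shape measurement per source $v_\epsilon$ and the scalar projection $S_\epsilon$ of the PSF ellipticity $\epsilon_{\rm PSF}$ in the direction of the galaxy ellipticity ε per source:

$$S_\epsilon = \frac{\epsilon_1\,\epsilon_{{\rm PSF},1} + \epsilon_2\,\epsilon_{{\rm PSF},2}}{|\epsilon|}. \qquad (19)$$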

The relationship between $v_\epsilon$ and $S_\epsilon$ is highly dependent both on the resolution of the source and on the S/N of the lensfit model. Here resolution is defined as the ratio of the squared PSF radius (standard deviation) to the quadrature sum of the PSF radius and the circularised galaxy effective radius ($r_{ab} = r_{\rm e}\sqrt{q}$, where $r_{\rm e}$ is the effective radius) per source:

$$\mathcal{R} = \frac{r_{\rm PSF}^2}{r_{\rm PSF}^2 + r_{ab}^2}. \qquad (20)$$

We therefore opted to fit for a linear relationship between vε and Sε in bins of ℛ and S/N, which we constructed to each contain an equal number of sources. We fitted our linear regression (between all galaxies i that reside within a bin) as (21)

where 〈Sε〉 is the additive component of the model, and the sources are expected to be normally distributed about the regression given a standard deviation σS; that is, assuming homoskedasticity of the residuals in the fitting process. This regression provides an estimate of the PSF leakage into the shape measurement variance for sources in this bin of ℛ and S/N, parameterised by the multiplicative coefficient αS. Panel a of Fig. 28 shows the value of αS in our bins of ℛ and S/N, demonstrating the non-negligible systematic PSF leakage into the weights over this plane.
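A minimal sketch of the binned leakage fit described above is given below. It computes the scalar projection and the resolution as defined in the text, splits the sample into equal-count bins of resolution and S/N, and fits a straight line in each bin with ordinary least squares. The nested-quantile binning, the function names, and the synthetic data are assumptions of this sketch, and the actual binning scheme used for KiDS-Legacy may differ.

```python
import numpy as np


def scalar_projection(e1, e2, p1, p2):
    """Projection of the PSF ellipticity (p1, p2) onto the galaxy ellipticity direction."""
    return (e1 * p1 + e2 * p2) / np.hypot(e1, e2)


def resolution(r_psf, r_gal_circ):
    """Squared PSF radius over the sum of squared PSF and circularised galaxy radii."""
    return r_psf**2 / (r_psf**2 + r_gal_circ**2)


def equal_count_bins(values, n_bins):
    """Assign each value to one of n_bins quantile bins with (nearly) equal counts."""
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    return np.clip(np.searchsorted(edges, values, side="right") - 1, 0, n_bins - 1)


def leakage_per_bin(v_eps, s_eps, res, snr, n_bins=3):
    """Slope of v_eps against s_eps in equal-count bins of resolution and S/N."""
    i_res = equal_count_bins(res, n_bins)
    alpha = np.full((n_bins, n_bins), np.nan)
    for a in range(n_bins):
        in_res = np.flatnonzero(i_res == a)
        i_snr = equal_count_bins(snr[in_res], n_bins)
        for b in range(n_bins):
            sel = in_res[i_snr == b]
            alpha[a, b], _ = np.polyfit(s_eps[sel], v_eps[sel], 1)
    return alpha


# Synthetic sample with a known leakage slope of 0.05 (illustrative values only).
rng = np.random.default_rng(7)
n = 9000
e1, e2 = rng.normal(0, 0.25, n), rng.normal(0, 0.25, n)
p1, p2 = rng.normal(0, 0.02, n), rng.normal(0, 0.02, n)
s_eps = scalar_projection(e1, e2, p1, p2)
res = resolution(rng.uniform(0.8, 1.0, n), rng.uniform(0.3, 2.0, n))
snr = rng.lognormal(3.0, 0.5, n)
v_eps = 0.01 + 0.05 * s_eps + rng.normal(0, 0.001, n)
alpha = leakage_per_bin(v_eps, s_eps, res, snr)
print(np.round(alpha, 3))
```

In this toy sample the leakage slope is the same in every bin by construction; in real data the recovered slopes would vary across the plane, which is the motivation for fitting per bin.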

One development that we have made with respect to the method presented in Li et al. (2023) regards these linear fits. Inspection of the data within each bin demonstrates that the assumption of homoskedasticity is poor: measurement variance is significantly larger for sources with small intrinsic ellipticities, and therefore the residuals in the fit balloon in the middle of the fitting region. Additionally, the variance is a truncated variable, which can lead to further heteroskedasticity in the regression residuals. To combat the influence of these two effects, we implemented a conservative clipping of data with exceptionally high variance, and a correction for heteroskedasticity in the form of iterative regression with residual weighting. The result of these processes (compared to simple linear regression) is presented in panels b and c of Fig. 28.
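One way to implement clipping plus iterative residual weighting is sketched below. The text does not give the clipping threshold, the number of iterations, or how the residual scatter is estimated, so the 99th-percentile clip, the five iterations, and the windowed scatter estimate used here are all assumptions. The synthetic data mimic residuals that balloon in the middle of the fitting region.

```python
import numpy as np


def clipped_reweighted_fit(x, y, clip_quantile=0.99, n_iter=5, n_windows=10):
    """Linear fit with clipping of extreme y values and iterative residual weighting."""
    # Conservative clipping of exceptionally high values of the dependent variable.
    keep = y <= np.quantile(y, clip_quantile)
    x, y = x[keep], y[keep]
    order = np.argsort(x)
    weights = np.ones_like(y)
    for _ in range(n_iter):
        # np.polyfit expects weights of 1/sigma for Gaussian uncertainties.
        slope, intercept = np.polyfit(x, y, 1, w=weights)
        resid = y - (slope * x + intercept)
        # Estimate the local residual scatter in equal-count windows along x.
        sigma = np.empty_like(y)
        for window in np.array_split(order, n_windows):
            sigma[window] = resid[window].std()
        weights = 1.0 / np.maximum(sigma, 1e-12)
    return slope, intercept


# Synthetic data: true slope 0.05, with scatter that is largest near x = 0.
rng = np.random.default_rng(42)
s_eps = rng.uniform(-0.05, 0.05, 5000)
scatter = 0.002 * (1.0 + 5.0 * np.exp(-0.5 * (s_eps / 0.01) ** 2))
v_eps = 0.02 + 0.05 * s_eps + rng.normal(0.0, scatter)
slope, intercept = clipped_reweighted_fit(s_eps, v_eps)
print(f"recovered slope = {slope:.4f}")
```

Down-weighting the noisy central region lets the better-constrained wings of the distribution determine the slope, which is the purpose of the heteroskedasticity correction.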

Once the value of $\alpha_S$ is computed per bin, the original measurement variances (for sources in that bin) are corrected by subtracting the inferred leakage amplitude per source i:

$$v'_{\epsilon,i} = v_{\epsilon,i} - \alpha_S\,S_{\epsilon,i}. \qquad (22)$$

The distribution of $\alpha_S$ values measured after this correction (i.e. re-fitting Eq. (21) with $v_{\epsilon,i} \rightarrow v'_{\epsilon,i}$) is shown in panel d of Fig. 28. The corrected lensfit measurement variances $v'_{\epsilon,i}$ are then used to define a corrected shape measurement weight per source $w_i$ using a modified version of Eq. (18): (23)

Table 13

fitclass definitions and statistics output by lensfit v321.

Fig. 28

Sources in the resolution-S/N plane that is used for recalibration of lensfit weights and shape estimates. Top row: estimates of the PSF leakage into the shape measurement variance, estimated with simple linear regression (left) and with clipped, iterative, residual-weighted linear regression (centre). Right: difference between the leakage estimates. Bottom row: PSF leakage estimates after correction of shape variances and weights (left). The estimates of PSF leakage into the source shape distributions are shown before (centre) and after (right) correction of estimated ellipticities.

7.1.3 Ellipticity correction

The final step in our recalibration process is the correction of source ellipticities, designed to remove any residual PSF leakage into the distribution of measured shapes. This process is similarly performed in bins of resolution and S/N, but also in bins of photo-z (which are used for cosmic shear tomography). As such, this process formally does not contribute to the sample statistics calculated here, as the choice of tomographic bins for KiDS-Legacy is decided during the cosmological analysis. Nonetheless, we document the process here for posterity. Furthermore, the ellipticity correction does not fully encapsulate biases inherent to the shape measurement process, as multiplicative shear biases remain and must be corrected for (see Sect. 7.1.4).

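The first step in our shape recalibration process is to perform a weighted linear fit (using our recalibrated shape weights) between the distribution of source ellipticities $\epsilon = (\epsilon_1, \epsilon_2)$ and measured PSF ellipticities $\epsilon_{\rm PSF} = (\epsilon_{{\rm PSF},1}, \epsilon_{{\rm PSF},2})$, per source i, assuming the model

$$\epsilon_{j,i} = \alpha_j\,\epsilon_{{\rm PSF},j,i} + c_j + \mathcal{N}(0, \sigma_{\epsilon,j}), \quad j \in \{1, 2\}, \qquad (24)$$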

where c = (c1, c2) are the additive components of the regression models for each ellipticity component, and σε = (σε,1, σε,2) are the standard deviations of the residuals in the fits to each ellipticity component, which again are assumed to be Gaussian. These fits provide us with an estimate of the residual PSF leakage in the individual ellipticity components, α = (α1, α2), and an associated fit uncertainty σα = (σα,1, σα,2).

The distribution of α prior to any correction, expressed as the arithmetic mean of the two components, $\bar{\alpha} = (\alpha_1 + \alpha_2)/2$, is shown in panel e of Fig. 28. In most bins we observe modest biases that vary smoothly over the plane of resolution and S/N. However, in low S/N cases the biases drop sharply to negative values, resulting in significant within-bin variation. As a result, the direct per-bin correction approach introduced in Sect. 7.1.2 does not satisfactorily correct for all biases within each bin. Instead, we invoked a hybrid approach that first removes a continuous polynomial bias over the ℛ and S/N plane, followed by a direct bin-wise correction to account for remaining (incoherent) biases.

To correct for the overall trend, we first estimated the amplitude of PSF contamination by fitting a polynomial to the estimated distribution of α (per component) and resolution vs S/N: (25)

where the coefficients cx = (cx,1, cx,2) are fit for using weighted least squares, per component and with inverse variance weights $1/\sigma_\alpha^2$. The result of this polynomial fit is a contiguous estimate of the leakage amplitude per source, which is used to remove the overall PSF leakage into the source ellipticities: (26)

In practice, the ellipticity resulting from Eq. (26) is an intermediate product because, as mentioned previously, the ellipticities after this correction still contain residual biases due to higher-order bin-wise fluctuations in the distribution of α. These residual fluctuations are removed on a per-bin basis, using a direct correction for the residual bias in the ellipticities per bin, (27)

resulting in the final ellipticity measurements used for science: (28)

The distribution of measured biases after both corrections is also shown in panel f of Fig. 28.

After this recalibration procedure, sources with non-physical ellipticities (i.e. |ϵi| > 1) are removed from the sample; however, this number is typically very small (Li et al. 2023 found that four sources out of more than 26 million were removed by this selection in their mock analyses).

Finally, we include the on-sky distribution of shape measurement statistics in Appendix G. These diagnostics are useful for understanding the variability of source shapes, PSF sizes, and possible systematic effects present in the lensing catalogue.

Table 14

Statistics of the additional selections used for KiDS-Legacy sample definition.

7.1.4 Multiplicative bias

The ellipticity estimates from lensfit suffer from small percent-level multiplicative biases. These biases originate from different physical effects such as source detection, blending, noise, model imperfections, and so on, and strongly depend on galaxy properties such as size, S/N, shape of the light profile, and so on. Ultimately, an estimate of the total multiplicative bias requires the choice of a source sample and the careful simulation of the shape measurement process on that sample. Such simulations are presented in Li et al. (2023) for the KiDS-1000 source sample and the tomographic bins used in the KiDS-1000 cosmic shear analysis (Asgari et al. 2021). We expect similar multiplicative biases for the KiDS-DR5 source sample presented here and refer the reader to Li et al. (2023) for orientation. However, the actual estimation of these biases will differ slightly due to the differences between DR4 and DR5 and possible differences in the upcoming KiDS-Legacy analyses compared to KiDS-1000. Hence, we do not present actual numbers for the multiplicative bias here but defer this to the forthcoming science papers.

7.2 KiDS-Legacy sample selection

The final KiDS-Legacy sample is defined as the lensing sources that satisfy a number of quality criteria. These criteria are summarised here, in Table 14, and in Fig. 29. The table presents details of the selections in terms of the number of sources removed (and remaining) in the KiDS-Legacy sample after each successive selection. Similarly, Fig. 29 presents the fraction of sources that are retained in the sample as a function of r-band magnitude after each selection.
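Two different percentages are quoted for most selections below: the fraction of the reference (MASK ≤ 1) sample that a selection flags on its own, and the further fraction that it removes once all earlier selections have been applied. The following sketch shows how both can be tracked for an ordered list of selections. It assumes that the second percentage is measured relative to the sources still available at that step; the function name and the toy flags are hypothetical.

```python
def selection_summary(selections, n_total):
    """Summarise an ordered list of (name, keep_flags) selections.

    Returns one row per selection: (name, fraction of the full sample flagged,
    fraction of the currently surviving sample removed, number surviving).
    """
    surviving = [True] * n_total
    rows = []
    for name, keep in selections:
        flagged_of_total = sum(not k for k in keep) / n_total
        n_before = sum(surviving)
        surviving = [s and k for s, k in zip(surviving, keep)]
        n_after = sum(surviving)
        removed_of_remaining = (n_before - n_after) / n_before if n_before else 0.0
        rows.append((name, flagged_of_total, removed_of_remaining, n_after))
    return rows


# Toy sample of ten sources; True means the source passes the selection.
first_cut = [False, False, False, False, True, True, True, True, True, True]
second_cut = [False, False, False, True, False, True, True, True, True, True]
rows = selection_summary([("first cut", first_cut), ("second cut", second_cut)], n_total=10)
for name, of_total, of_remaining, n_left in rows:
    print(f"{name:10s}  flagged={of_total:.1%}  further removed={of_remaining:.1%}  left={n_left}")
```

In the toy case the second cut flags 40% of the full sample, yet removes only one further source, because three of the four flagged sources were already rejected by the first cut.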

The initial sample was constructed by applying the fiducial mask selection (Sect. 6.4): (29)

Next, we removed all sources that do not have successful GAAP measurements in all ten photometric bands $x \in ugri_1i_2ZYJHK_{\rm s}$: (30)

This selection removes less than 0.1% of the total MASK ≤ 1 sample.

We then removed asteroids and other transients present in the detection imaging, using a combination of extreme colour selections (Kuijken et al. 2015): (31)

The asteroid mask removes 0.94% of sources from the MASK ≤ 1 sample, consisting of both bright objects and a portion of the faintest sources in the sample (due to high photometric noise).

As the KiDS-Legacy sample is primarily intended for lensing science, we next selected on lensing-related quantities. First, we removed sources for which lensfit does not produce a shape estimate. Such sources retain a placeholder-value of exactly zero for the PSF moments (among other parameters), and are therefore able to be removed by selecting only galaxies with (32)

This causes a significant reduction of the sample, removing over 20 million sources. Figure 29 demonstrates, however, that this reduction is caused by a blanket cut of sources in r-band magnitude that are brighter than rauto = 19 or fainter than rauto = 25. These limits are internally imposed by lensfit. Within this magnitude window, very few sources are removed by this selection.

The next lensing-related selection removes sources that are considered to be blended. This selection is performed using the lensfit contamination_radius parameter, (33)

and removes sources that are contaminated by extracted neighbours. This selection removed 26.3% of the MASK ≤ 1 sample; however, most of these sources are already removed by previous selections (particularly the ‘unmeasured’ selection). As such, this blending selection only removes a further 6.99% of the available sample.

As described in Sect. 7.1, the lensfit code provides a range of flags related to the quality of the fitted model, called fitclass. fitclass selection is particularly useful for the rejection of point-like sources from the sample, which might otherwise contaminate our lensing measurements. For the lensing sample, we applied the fiducial fitclass selection: (34)

where the meanings of the various flags are given in Table 13. This fitclass selection removes more than 31% of the MASK ≤ 1 sample from consideration; however, more than half of these sources are already removed by previous flags. The remaining 14% of sources that are removed are preferentially bright, as can be seen from Fig. 29. This is consistent with the majority of these sources being stellar contaminants, a conclusion that is supported by the fitclass selection producing r-band number counts that are more consistent with a simple power law.

Next we applied a further selection to remove unresolved binary stars from the sample. This binary star selection has also been updated from that used in DR4. Figure 30 shows the distribution of objects from the MASK ≤ 1 sample in r-band magnitude and (a parameter akin to) angular extent on-sky. The centre and right panels of the figure show the sample prior to binary rejection, split by ellipticity. The highly elliptical sample (|∊| > 0.8) is used for the identification of unresolved binaries, using a cut in the size-brightness diagram, quantified using the r-band scale-length measured by lensfit (Rs = autocal_scalelength_pixels), and the r-band magnitude measured by GAAP. In DR4, the binary selection was made using a simple linear cut in this plane (the dashed red line). For DR5, we revised the cut to now perform a slightly modified linear rejection, coupled with a brightness cut: (35)

and (36)

This selection removes only 0.14% of sources in total, and only 0.02% of sources that would otherwise be currently in the lensing sample. Nonetheless, visual inspection of the r − i versus g − r colour-colour space demonstrates that these selected sources are well localised on the stellar locus.

Next we applied a blanket magnitude cut in the detection band: (37)

This selection causes a 2.64% reduction in the total source counts, and a 1.18% reduction in the available sample for lensing. The origin of this selection lies in the previous lensfit version, which did not model the shapes for sources brighter than rauto = 20. The fraction of sources lost to this selection is relatively small (1.18%), and therefore we opted to keep this selection to be consistent with previous releases.

Next we applied a resolution selection that is designed to improve weight and shape recalibration (Sect. 7.1.2): (38)

where ℛ is defined in Eq. (20). This selection is a significant one, and removes 43.58% of the total MASK ≤ 1 sample. Of the sources otherwise in the lensing sample, this selection removes 20.15%. However, these sources are (by definition) the least resolved sources in KiDS, which means that they also have the lowest shear responses and thus the lowest lensing weight. The sources are also preferentially faint, as can be seen from Fig. 29. Therefore, while this selection seems pathological (causing a >20% reduction in source number), Li et al. (2023) found (for DR4) that the cut only reduced the effective number density of lensing sources (i.e. including lensing weights) by ~2%. In KiDS-Legacy the selection has a slightly larger effect, reducing the effective number density by 4.6% (see Table H.1).

Finally, we applied a selection on whether sources have a non-zero shape measurement weight:

$$w_i > 0. \qquad (39)$$

This selection is highly correlated with the resolution selection, but can only be applied after the resolution selection has been applied (because the resolution selection is invoked in the weight recalibration, Sect. 7.1.2). Similarly, because of the weight recalibration, the weight selection can only be applied to the subset of the data that also satisfies the resolution selection, and so it is not meaningful to quote the fraction of the MASK ≤ 1 dataset that satisfies the weight requirement. Furthermore, as can be seen in Fig. 29, these are primarily faint sources. Of the currently available dataset, the weight selection removes 13.85% of the sources. We note, though, that the removal of these sources has no influence on the effective number density of lensing sources, because they carry zero lensing weight (see Appendix H).

The final KiDS-Legacy sample therefore consists of 43 205 156 sources drawn from 1014.0 deg$^2$ of sky. Using the effective number density formulation of Heymans et al. (2012), the sample has an effective number density for lensing of $n_{\rm eff} = 8.92$ arcmin$^{-2}$. Limiting this sample to the photo-z range expected for the KiDS-Legacy cosmological analyses ($0.1 < z_{\rm B} \leq 2.0$) reduces the effective number density only slightly, to $n_{\rm eff} = 8.82$ arcmin$^{-2}$. This is the fiducial effective number density for the KiDS-Legacy sample, and is what we show for KiDS in Fig. 1. Finally, for direct comparison with previous KiDS releases, the effective number density of sources in the photo-z range $0.1 < z_{\rm B} \leq 1.2$ is $n_{\rm eff} = 8.00$ arcmin$^{-2}$. This represents a modest increase from the measured effective number density in KiDS DR4, which was $n_{\rm eff} = 7.66$ arcmin$^{-2}$ (Hildebrandt et al. 2021) in the same photo-z window. We attribute this increase in effective number density primarily to the application of the strong-selection mask (Sect. 6.4), which masks areas of the survey with systematically reduced galaxy number densities.
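For orientation, the effective number density of Heymans et al. (2012) is $n_{\rm eff} = (\sum_i w_i)^2 / (A \sum_i w_i^2)$ for a survey area A. The sketch below evaluates this expression for a toy set of weights, and also computes the unweighted source density implied by the source count and area quoted above. The function name and the toy weights are hypothetical.

```python
def effective_number_density(weights, area_arcmin2):
    """n_eff = (sum w)^2 / (A * sum w^2), following Heymans et al. (2012)."""
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    if sum_w2 == 0:
        return 0.0
    return sum_w**2 / (area_arcmin2 * sum_w2)


ARCMIN2_PER_DEG2 = 3600.0

# Unweighted density from the source count and area quoted in the text.
raw_density = 43_205_156 / (1014.0 * ARCMIN2_PER_DEG2)
print(f"unweighted density = {raw_density:.2f} per arcmin^2")

# Toy weights: zero-weight sources do not change the effective number density.
toy_weights = [1.0, 0.5, 0.5, 0.0, 0.0]
with_zeros = effective_number_density(toy_weights, area_arcmin2=1.0)
without_zeros = effective_number_density([w for w in toy_weights if w > 0], area_arcmin2=1.0)
print(with_zeros, without_zeros)
```

The unweighted density of about 11.8 arcmin$^{-2}$ is larger than the quoted effective value because the weights are not uniform. The toy example also reproduces the statement made for the weight selection: removing sources with zero weight leaves the effective number density unchanged.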

Fig. 29

Selections applied to the KiDS DR5 MASK ≤ 1 sample, and how they modify the available number of sources as a function of r-band magnitude, relative to the total number counts. The number of sources that are removed by each selection is given in Table 14.

Fig. 30

Selection of binary stars in construction of the KiDS-Legacy lensing sample. Left: distribution of all DR5 sources in size and apparent magnitude. Centre: distribution of all sources remaining prior to the rejection of binary stars, which have ellipticities |ε| ≤ 0.8. The colour scale in the centre panel is the same as in the left panel. Right: as in the centre panel, but for sources with ellipticities |ε| > 0.8. This is the space in which the binary rejection is performed. The binary rejection criterion used in previous KiDS releases is shown as the dashed red line. The new binary rejection criterion for sources in DR5 is shown as the solid red line (sources above the line are discarded).

8 Data release and catalogues

The public release of KiDS data has been made through the ESO archive. As with previous releases, the data products made available via the archive include optical imaging (including science, weight, flag, and sum images), single-band detection catalogues, and multi-band catalogues (including NIR information from VIKING). However, changes to the data that are available within the survey (specifically regarding the two i-band passes) require changes to the format of the data products that are stored at ESO. Columns that are contained within the multi-band catalogues are presented in Table I.1, as well as specific columns that have changed names and/or meanings (Table I.3).

9 Summary

In this paper we present the fifth and final data release of KiDS. The data release presents optical and NIR photometry over 1370 square degrees of sky: 1347 square degrees of wide-field imaging designed for use in weak lensing studies, and 27 square degrees covering deep spectroscopic survey fields (with 4 deg$^2$ overlap). The release improves upon previous KiDS data releases in more ways than simply the 34% increase in survey area: multi-epoch photometry in the i-band allows for improved photo-z and temporal science; improved calibration samples from the larger spectroscopic survey overlap allow for a better quantification of survey performance and systematic effects; the improved calibration of photometry and astrometry produces higher-quality imaging and derived data products; improved masking reduces unnecessary loss of data (and the remaining data are of higher quality); and the updated shape measurement (and calibration) leads to reductions in systematic biases in downstream data products.

The data forming this release include images (science, weight, flag, sum, and mask frames), independent catalogues (single-band source extractions per tile), multi-band catalogues (forced photometry and photo-z catalogues per tile), and mosaic catalogues (the KiDS-Legacy catalogue for weak-lensing science and the KiDZ catalogue for calibration efforts). These data are made publicly available via the ESO archive and the KiDS collaboration web pages.

The KiDS-Legacy sample consists of 42 974 574 sources drawn from 1014.0 deg$^2$ of sky, with an effective number density of $n_{\rm eff} = 8.87$ arcmin$^{-2}$. After limiting the sample to the expected tomographic limits for KiDS-Legacy cosmological analyses ($0.1 < z_{\rm B} \leq 2.0$), the effective number density decreases slightly to $n_{\rm eff} = 8.77$ arcmin$^{-2}$. The KiDZ sample consists of 126 085 sources extracted from KiDS-depth imaging, with spectroscopic and photometric redshift estimates. This dataset represents a significant increase in the calibration sample available for future cosmological analyses with KiDS.

This paper is intended as a one-stop reference for current and future members of the community wishing to utilise KiDS data for their science. It is the sincere hope of the authors that this release will cement the legacy value of this unique dataset for years to come.

Acknowledgements

We thank the anonymous referee for their comments, which have undoubtedly improved the quality of the manuscript. A.H.W., H.H., A.D., C.M., R.R., and J.L. v.d.B. are supported by an European Research Council Consolidator Grant (No. 770935). A.H.W. and H.H. acknowledge funding from the German Science Foundation DFG, via the Collaborative Research Center SFB1491 “Cosmic Interacting Matters - From Source to Signal”. A.H.W. is supported by the Deutsches Zentrum für Luft- und Raumfahrt (DLR), made possible by the Bundesministerium für Wirtschaft und Klimaschutz. H.H. is also supported by a Heisenberg grant of the Deutsche Forschungsgemeinschaft (Hi 1495/5-1). K.K. acknowledges support from the Royal Society and Imperial College. M.R. acknowledges financial support from the INAF mini-grant 2022 “GALCLOCK”. M.B., P.J., and G.K. are supported by the Polish National Science Center through grant no. 2020/38/E/ST9/00395. M.B. is also supported by the Polish National Science Center through grant no. 2018/30/E/ST9/00698, 2018/31/G/ST9/03388 and 2020/39/B/ST9/03494, and by the Polish Ministry of Science and Higher Education through grant DIR/WK/2018/12. C.H. acknowledges support from the European Research Council under grant number 647112, and the UK Science and Technology Facilities Council (STFC) under grant ST/V000594/1. C.H., B.S., Z.Y., and M.Y. acknowledge support from the Max Planck Society and the Alexander von Humboldt Foundation in the framework of the Max Planck-Humboldt Research Award endowed by the Federal Ministry of Education and Research. H.H. acknowledges support from Vici grant 639.043.512, financed by the Netherlands Organisation for Scientific Research (NWO). S.S.L. is supported by NOVA, the Netherlands Research School for Astronomy. L.M. acknowledges support from the UK STFC under grant ST/N000919/1. M.B. acknowledges the INAF PRIN-SKA 2017 program 1.05.01.88.04 and the funding from MIUR Premiale 2016: MITIC. P.B. 
acknowledges support from the German Academic Scholarship Foundation. G.C. acknowledges the support from the grant ASI n.2018-23-HH.0. J.T.A.d.J. is supported by the NWO through grant 621.016.402. B.G. acknowledges the support of the Royal Society through an Enhancement Award (RGF/EA/181006) and the Royal Society of Edinburgh for support through the Saltire Early Career Fellowship (ref. number 1914). C.G. thanks the support from INAF theory Grant 2022: Illuminating Dark Matter using Weak Lensing by Cluster Satellites. J.H.D. acknowledges support from an STFC Ernest Rutherford Fellowship (project reference ST/S004858/1). B.J. acknowledges support by STFC Consolidated Grant ST/V000780/1. L.L. has been supported by the Deutsche Forschungsgemeinschaft through the project SCHN 342/15-1 and DFG SCHN 342/13. L.M. acknowledges support from the grants PRIN-MIUR 2017 WSCC32 and ASI n.2018-23-HH.0. S.J.N. is supported by the Polish National Science Center through grant UMO-2018/31/N/ST9/03975. L.P. acknowledges support from the DLR grant 50QE2002. C.S. acknowledges support from the Agencia Nacional de Investigación y Desarrollo (ANID) through FONDECYT grant no. 11191125 and BASAL project FB210003. H.Y.S. acknowledges the support from CMS-CSST-2021-A01 and CMS-CSST-2021-B01, NSFC of China under grant 11973070, and Key Research Program of Frontier Sciences, CAS, Grant No. ZDBS-LY-7013. T.T. acknowledges funding from the Swiss National Science Foundation under the Ambizione project PZ00P2_193352. G.V.K. acknowledges financial support from the Netherlands Research School for Astronomy (NOVA) and Target. Target is supported by Samenwerkingsverband Noord Nederland, European fund for regional development, Dutch Ministry of economic affairs, Pieken in de Delta, Provinces of Groningen and Drenthe. J.Y. 
acknowledges the support of the National Science Foundation of China (12203084), the China Postdoctoral Science Foundation (2021T140451), and the Shanghai Post-doctoral Excellence Program (2021419). Y.H.Z. acknowledges support from the UK STFC. Based on data obtained from the ESO Science Archive Facility with DOI: https://doi.org/10.18727/archive/37, and https://doi.eso.org/10.18727/archive/59 and on data products produced by the KiDS consortium. The KiDS production team acknowledges support from: Deutsche Forschungsgemeinschaft, ERC, NOVA and NWO-M grants; Target; the University of Padova, and the University Federico II (Naples). This work was performed in part at Aspen Center for Physics, which is supported by National Science Foundation grant PHY-1607611. Author Contributions: All authors contributed to the development and writing of this paper. The authorship list is given in three groups: the lead authors (A.H.W., K.K., H.H., M.R., and M.B.), followed by two alphabetical groups. The first alphabetical group includes those who are key contributors to both the scientific analysis and the data products of this manuscript and release. The second group covers those who have either made a significant contribution to the preparation of data products or to the scientific analyses of KiDS since its inception.

Appendix A Magnitude limit estimates with LAMBDAR

Background estimates in KiDS+KiDZ were made using LAMBDAR (Wright et al. 2016). LAMBDAR provides a convenient tool for estimating the correlated and uncorrelated noise present in the various sets of imaging in KiDS-DR5, as it can be run on images of arbitrary pixel scale. This allows us to simply specify the on-sky locations that we wish to sample with our sky estimates, and let the code report back the noise estimate at each of these positions. For the grid of on-sky sampling positions we chose a simple rectangular grid that sampled each of the ~1 deg² pointings with a spacing of 0.02 degrees in both RA and declination. We estimate the background level of imaging data using the two methods built into the LAMBDAR code: randoms estimation and sky estimation.
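As an illustration only, the sampling grid described above could be generated as in the following sketch. The function name `sampling_grid`, its arguments, and the example pointing centre are hypothetical and are not part of LAMBDAR. The sketch also assumes that the 0.02 degree spacing is applied directly to the RA and declination coordinates, which the text does not specify.

```python
import numpy as np

def sampling_grid(ra_centre, dec_centre, width_deg=1.0, spacing_deg=0.02):
    """Return flattened (RA, Dec) arrays of a rectangular sampling grid.

    Assumption: the spacing is applied directly to the RA and Dec
    coordinates, with no cos(Dec) correction and no RA wrap handling.
    """
    n_steps = int(round(width_deg / spacing_deg)) + 1
    offsets = np.linspace(-width_deg / 2.0, width_deg / 2.0, n_steps)
    ra_grid, dec_grid = np.meshgrid(ra_centre + offsets, dec_centre + offsets)
    return ra_grid.ravel(), dec_grid.ravel()

# Illustrative pointing centre (degrees); not taken from the survey tables.
ra, dec = sampling_grid(185.0, -0.5)
```

With these defaults each pointing is sampled at 51 positions per axis.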

The randoms estimate of noise is the fiducial one that we use for quantifying our imaging depth in DR5, as it includes the influence of correlated noise. We summarise the computation of the randoms noise estimation process here, and direct the interested reader to Sect. 3.6 of Wright et al. (2016) for a complete description of the randoms measurement process.

The randoms noise estimate is made by measuring the flux contained within the chosen aperture (i.e. circular with a 2″ diameter), shifted to multiple random locations within the local area of the image under analysis. An example of this is shown in Fig. A.1. After randomly shifting the location of the source aperture, the flux in the aperture at this new location is measured and recorded. After many realisations, the distribution of random fluxes therefore encodes the expected flux contained within an aperture of this specific size for a source randomly placed in this area of imaging. This measurement is performed on the image after application of the fiducial mask, as can be seen in Fig. A.1 where a diffraction spike has been masked at the bottom of the image. However, this estimate does not include source-masking: this means that the distribution of measured random fluxes also includes contamination from detectable sources. As such, the expectation and scatter of the random flux distribution are computed with median statistics, to suppress bias caused by outliers in the tails of the flux distribution (shown as stars in the right panel of Fig. A.1).
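The randoms procedure described above can be sketched as follows. This is a minimal illustration and not the LAMBDAR implementation. The function name, its arguments, the use of hard-edged pixel apertures, the rejection of apertures that touch masked pixels, and the choice of the scaled median absolute deviation as the robust scatter are all assumptions made for the example.

```python
import numpy as np

def randoms_noise(image, mask, x0, y0, radius_pix,
                  n_random=100, max_shift=50, seed=0):
    """Median and robust scatter of aperture fluxes at random nearby positions.

    image, mask : 2D arrays; mask is True where pixels are unusable.
    No source masking is applied, so some apertures may land on sources.
    """
    rng = np.random.default_rng(seed)
    yy, xx = np.indices(image.shape)
    data = np.where(mask, np.nan, image)
    fluxes = []
    for _ in range(n_random):
        dx, dy = rng.uniform(-max_shift, max_shift, size=2)
        in_aper = (xx - (x0 + dx)) ** 2 + (yy - (y0 + dy)) ** 2 <= radius_pix ** 2
        values = data[in_aper]
        # Assumption: skip apertures that fall off the image or touch the mask.
        if values.size == 0 or np.isnan(values).any():
            continue
        fluxes.append(values.sum())
    fluxes = np.asarray(fluxes)
    median = np.median(fluxes)
    # 1.4826 * MAD matches the standard deviation for Gaussian noise,
    # while remaining insensitive to the bright-source tail.
    sigma = 1.4826 * np.median(np.abs(fluxes - median))
    return median, sigma
```

Because median statistics are used, a handful of apertures that coincide with sources shift the estimate far less than they would shift a mean and standard deviation.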

The second estimate of the background noise is made without considering the on-sky correlation present in the background. We leverage this sky estimation method as a robustness check for our fiducial noise estimation method, and to estimate the implied degree of on-sky correlation present in the KiDS imaging. We briefly describe the sky estimation process here, and direct the interested reader to Sect. 3.5 of Wright et al. (2016) for a complete description of the measurement process.

The sky-estimate routine in LAMBDAR assumes that the pixel-noise can be computed in annular radii around the input source, without consideration of the relative correlation between adjacent pixels. The sky estimate is made on the masked image (but, again, without source masking in our application). Figure A.2 demonstrates the estimation of the uncorrelated sky, for the same location shown in Fig. A.1. The sky estimate is made on a smaller scale than in the randoms case, and is similarly constructed to be robust to contamination: estimates of the mean/median/RMS sky are made in annuli around the requested location, and annuli with anomalously high/low values are discarded. In the case of Fig. A.2, all bins were acceptable. The resulting mean/median sky estimates, and the associated RMS scatter, are provided for this location. The values of the sky at the requested locations are therefore tabulated in this method without consideration of pixel-to-pixel noise correlation.
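A simplified version of such an annular sky estimate is sketched below. It is not the LAMBDAR routine: the function name, the annulus edges, and the clipping rule used to discard anomalous annuli (a cut on the annulus medians at `clip_nsigma` robust standard deviations) are assumptions chosen for illustration, and the sketch expects at least one unmasked annulus.

```python
import numpy as np

def annular_sky(image, mask, x0, y0, r_edges, clip_nsigma=3.0):
    """Mean, median and RMS of the sky in annuli around (x0, y0).

    mask is True where pixels are unusable. Pixels are treated as
    independent, i.e. pixel-to-pixel noise correlation is ignored.
    """
    yy, xx = np.indices(image.shape)
    radius = np.hypot(xx - x0, yy - y0)
    annuli = []
    for r_in, r_out in zip(r_edges[:-1], r_edges[1:]):
        select = (~mask) & (radius >= r_in) & (radius < r_out)
        if select.any():
            annuli.append(image[select])
    medians = np.array([np.median(pix) for pix in annuli])
    centre = np.median(medians)
    spread = 1.4826 * np.median(np.abs(medians - centre))
    # Discard annuli whose median is anomalously high or low.
    keep = np.abs(medians - centre) <= clip_nsigma * spread
    pixels = np.concatenate([pix for pix, ok in zip(annuli, keep) if ok])
    return {"mean": pixels.mean(), "median": np.median(pixels), "rms": pixels.std()}
```

The returned RMS is a per-pixel quantity, so it must be scaled to the aperture size before it can be compared with an aperture-based noise estimate.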

Previous data releases in KiDS have leveraged background estimates similar to the uncorrelated method used by LAMBDAR. This means that the background estimates (and therefore reported magnitude limits) in previous KiDS releases should be similar to those computed using LAMBDAR’s uncorrelated estimate. The magnitude limits estimated using this method for DR5 are {24.26 ± 0.10, 25.15 ± 0.12, 25.07 ± 0.14, 23.66 ± 0.25, 23.73 ± 0.3} for the {u, 𝑔, r, i1, i2}-bands, respectively, in very good agreement with the estimates presented in previous releases.

An example of the estimation of these two background levels for a single pointing of KiDS is shown in Fig. A.3. The figure shows the estimate of the correlated sky noise with randoms, the estimate of the uncorrelated noise, and the ratio between these two noise estimates. This ratio is an estimate of the degree of correlation present in the imaging at that location on sky. The distribution of uncorrelated sky values can be seen to closely trace the dither pattern of an observation in KiDS: the lowest noise values occur where pixels were observed by all five exposures, and the highest noise values occur in the upper and lower extremes of the image where only one exposure is available per pixel. The correlated noise estimate shows much less of this structure, suggesting that the expected reduction in noise is being hampered by other effects. The cause of this is suggested by the third panel, which shows the noise correlation factor (Eq. A.1), that is, the ratio of the aperture noise RMS estimated with randoms to that estimated from the pixel RMS.

The correlation factor can be seen to increase in regions of the image where there is considerable background contamination from, for example, residual reflections from bright stars.
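The relation between the two noise estimates, the correlation factor, and a magnitude limit can be sketched as below. All function names are hypothetical. The sketch assumes that the uncorrelated aperture noise is the pixel RMS scaled by the square root of the number of pixels in the aperture, and that a magnitude limit is quoted as a multiple `n_sigma` of the aperture noise relative to a photometric zero-point. The default of 5 for `n_sigma` is an assumption of this example and is not stated in this appendix.

```python
import numpy as np

def aperture_noise_from_pixels(pixel_rms, n_pix):
    """Aperture noise implied by independent pixels of the given RMS."""
    return pixel_rms * np.sqrt(n_pix)

def correlation_factor(sigma_randoms, pixel_rms, n_pix):
    """Ratio of the randoms (correlated) to the pixel-based (uncorrelated) noise."""
    return sigma_randoms / aperture_noise_from_pixels(pixel_rms, n_pix)

def magnitude_limit(sigma_aperture, zero_point, n_sigma=5.0):
    """Magnitude of a source whose flux equals n_sigma times the aperture noise."""
    return zero_point - 2.5 * np.log10(n_sigma * sigma_aperture)
```

A correlation factor of unity corresponds to purely uncorrelated noise, while values above unity indicate that the aperture noise exceeds what independent pixels would produce.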

Fig. A.1

Correlated background estimated using the LAMBDAR randoms routine. Left: Local region of the image under analysis, with the 100 realisations of the random aperture shown as translucent black circles. The image has been masked using the fiducial mask, seen as the missing triangular region (a diffraction spike) at the bottom of the image. Right: Distribution of pixel values measured within the random apertures, for all apertures (black) and the individual apertures (colours). There are some apertures that are coincident with sources, resulting in large positive fluxes. These are limited in the computation of the random noise estimate through the use of median statistics. The distribution of aperture fluxes from the randoms is shown by the box-and-whisker plot, with the outlier fluxes shown as stars. This figure is a direct output of the LAMBDAR code.

Fig. A.2

Uncorrelated background estimated using the LAMBDAR sky-estimate routine. Left: Annuli used to estimate the background, shown against the input image (colour mapping for pixels in the left panel is given by the point colouring in the right panel). Right: Distribution of pixel values (points) as a function of radius from the random location chosen. The black lines show the median sky value per annulus (solid) and the uncertainty on the median (dashed). The final mean (median) sky estimate is shown as the solid (dashed) red line. This figure is a direct output of the LAMBDAR code.

Fig. A.3

Demonstration of the background estimates produced by LAMBDAR for pointing KIDS_l85p0_m0p5 in the r-band. Each panel shows a grey-scale image of the pointing, smoothed with a 2″ Gaussian kernel and after masking. Overlaid on this image are various estimates of the image noise level, shown with rectangles covering the extent of the area used for the estimate. Left: Magnitude limits estimated with blanks. Centre: Magnitude limits estimated with the pixel RMS. Right: Correlation factor, estimated as the ratio between the aperture noise RMSs estimated with blanks and pixels. This factor clearly recovers the regions of the image with increased noise correlation caused by residual bright artefacts.

Appendix B VST on-sky quality metrics

For many scientific applications, an understanding of the systematic differences in imaging quality on-sky is relevant. For imaging in KiDS the primary parameters of relevance are the magnitude limit (estimated using the method described in Appendix A), and the PSF size in each band. As such, in Fig. B.1 we show the distribution of magnitude limits estimated in each band on-sky, and in Fig. B.2 we show the distribution of average PSF sizes (per pointing, as reported by ASTROWISE). Each figure shows the parameters per filter, with the two i-band passes shown separately.

Fig. B.1

Distribution of limiting magnitudes in the optical bands for the KiDS fields. The background levels are measured using 2″ circular apertures, as described in Appendix A. To construct the on-sky mosaic, estimates per-pointing are combined using the same WCS cuts as are applied to the data (Sect. 6.5).

Fig. B.2

Distribution of PSF FWHM sizes (in arcsec) as reported by ASTROWISE for each pointing within KiDS.

Appendix C Astrometric residuals compared to Gaia

As mentioned in Sect. 3.2, the KiDS data are calibrated to Gaia stars at the J2000 epoch. In practice, this means that the positions of stars used to calibrate KiDS have been shifted (using their observed positions and estimated proper motions) back to where they would have been in the year 2000. Crucially, though, this does not just have a random effect per star: as the Solar System moves through the Milky Way, an additional coherent shift is imprinted on the stellar distribution. The purpose of calibrating to J2000 epoch stellar positions is to remove this coherent drift.

However, KiDS observations were not taken in the year 2000. Rather, the typical KiDS observation was taken around 2015–2016. As a result, the observed positions of stars in KiDS imaging are actually much more similar to the observed positions of Gaia stars at epoch J2016.

The fact that KiDS is calibrated to J2000 epoch stars, but that KiDS images were taken around epoch J2016 (similar to Gaia), leads to an apparent systematic bias when performing a direct sky match between the KiDS DR5 catalogues and Gaia stars at their epoch J2016 positions: there appears to be a systematic offset at the level of |∆RA| ~ 0″.1, which has a different sign in the northern and the southern hemispheres. This can be seen in Fig. C.1. Of particular note, though, is the low scatter in each of the systematically offset clouds. This demonstrates that, modulo the difference in overall coordinates between epoch J2000 and epoch J2016 positions, the relative positions of stars in KiDS and Gaia are very similar (because the observations were actually taken at similar epochs).

Fig. C.1

Astrometric residuals between KiDS and Gaia when comparing to J2016 epoch stellar positions. The systematic difference is caused by the systematic shift in the positions of stars between 2000 and 2016, caused by the motion of the Solar System through the Milky Way.

Fig. C.2

Astrometric residuals between KiDS and Gaia when comparing to J2000 epoch stellar positions. The systematic difference is no longer present; however, a considerable increase in scatter is visible. This is due to the KiDS stars being relatively positioned according to the J2016 epoch (determined by when the images were taken).

Figure C.2 shows the same as Fig. C.1, but now using J2000 epoch positions. When comparing the per-star astrometry between the KiDS data and the J2000 epoch Gaia stars, one expects the systematic offsets to disappear, which they do. However, we also see a considerable increase in scatter: this is due to the proper motions of stars introducing noise, as they shuffle positions between epoch J2000 and epoch J2016.
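For readers who wish to reproduce such a comparison, shifting catalogue positions between epochs can be approximated as in the sketch below. The function name and arguments are hypothetical. The sketch assumes a simple linear propagation that is valid for small shifts away from the celestial poles, ignores parallax and radial motion, and assumes that the RA proper motion already includes the cos(Dec) factor, as is the convention for the Gaia `pmra` column.

```python
import numpy as np

MAS_PER_DEG = 3.6e6  # milliarcseconds per degree

def propagate_epoch(ra_deg, dec_deg, pmra_masyr, pmdec_masyr,
                    epoch_from=2016.0, epoch_to=2000.0):
    """Linearly shift positions from epoch_from to epoch_to.

    pmra_masyr is assumed to be mu_alpha * cos(Dec), in mas/yr.
    """
    dt = epoch_to - epoch_from
    dec_new = dec_deg + pmdec_masyr * dt / MAS_PER_DEG
    ra_new = ra_deg + pmra_masyr * dt / MAS_PER_DEG / np.cos(np.radians(dec_deg))
    return ra_new % 360.0, dec_new

# A proper motion of 6.25 mas/yr accumulates a 0.1 arcsec shift over 16 years.
ra_2000, dec_2000 = propagate_epoch(180.0, 0.0, 6.25, 0.0)
```

The example value is purely arithmetic and is chosen only to show how a shift of the size discussed above can accumulate between the two epochs.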

Appendix D List of reprocessed u-band co-adds

As discussed in Sect. 3.5.1, u-band data were flagged for reprocessing with a modified astrometric distortion polynomial order, to reduce pathological errors in co-add construction. The full list of reprocessed u-band co-adds is given in Table D.1.

Table D.1

Co-adds for which the u-band was reprocessed with linear polynomial order for the astrometric distortion definition.

Appendix E VISTA on-sky quality metrics

As in Appendix B, the on-sky distributions of magnitude limits and PSF sizes in the NIR bands are relevant statistics for understanding systematic variations of photometric quality over the KiDS-DR5 survey. Figure E.1 shows the distribution of depths in each of the five NIR filters, computed using the method of Appendix A and combined as described in Sect. 4.3. The dither pattern of the VISTA observations is visible, particularly in cases where sequences of observations are particularly shallow (leading to stripes and rectangles in the image).

In Fig. E.2 we also show the distribution of VISTA PSF sizes on-sky, as reported by ESO for each paw-print. PSF sizes have been combined on-sky using the same procedure as for the magnitude limits. Again, the paw-print pattern of the VISTA telescope is visible. We note, however, that the distribution here is not necessarily representative of the information contained in the photometry, as the measurement of photometry by GAAP is PSF-size dependent. In cases where the PSF is particularly poor for an observation, that observation cannot be used in the final flux calculation (as GAAP failed to produce a flux measurement in that case).

Fig. E.1

Distribution of effective NIR magnitude limits in the KiDS fields, computed using LAMBDAR with 2″ circular apertures.

Fig. E.2

Distribution of mean PSF FWHM sizes (in arcsec) as reported by ESO for each paw-print observation of the KiDS fields. Binning is made in steps of {δRA, δDec} = {0.25/cos (Dec), 0.25} deg, as the spacing between adjacent chips in the paw-print is ~ 0.2 deg.

Appendix F Masking correlations

The areas masked by each of our various mask criteria are not unique. An area flagged because of stellar reflections in the r-band, for example, is likely to be highly correlated with areas flagged for similar reflections in the 𝑔-band, because the same stars cause the artefacts in both images. Furthermore, the KiDS observing strategy exacerbates this effect because each optical filter tiles the sky with essentially the same dither pattern. As such, it is worth noting the degree of correlation between areas masked (per pointing) by each of the bits in our mask.

Fig. F.1

Correlations between masked area per pointing, for each of the individual mask bits. Correlations are computed between the area available per pointing after the application of the relevant mask bits, for the full KiDS-DR5 survey.

Figure F.1 shows the correlation matrix of masked area per pointing, for individual masking bits. Correlations are computed between the total area masked per pointing under each condition. As such, the correlations are indirect (i.e. they do not necessarily mean the same pixels are masked in each case), but nonetheless are indicative of the approximate survey area masked in each of the cases.
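A correlation matrix of this kind can be sketched as follows. The function names, the assumption that each pointing has an integer bitmask image, and the bit numbering are hypothetical and chosen for illustration; they are not the KiDS mask definitions. Because the correlation is computed from per-pointing areas rather than per-pixel flags, it is unchanged if masked area is replaced by available area.

```python
import numpy as np

def area_per_bit(mask_image, n_bits, pixel_area):
    """Area flagged by each mask bit in one pointing (units of pixel_area)."""
    return np.array([np.count_nonzero(mask_image & (1 << bit)) * pixel_area
                     for bit in range(n_bits)])

def masked_area_correlation(area_table):
    """Pearson correlation between mask bits.

    area_table has shape (n_pointings, n_bits): one row per pointing.
    """
    return np.corrcoef(area_table, rowvar=False)
```

In use, `area_per_bit` would be evaluated once per pointing and the results stacked row by row before being passed to `masked_area_correlation`.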

The figure shows that the footprint masks in the NIR bands are highly correlated, as VIKING naturally observed a consistent footprint in each filter. Slight deviations from unity in the correlations of the NIR footprint masks come from the additional masking that we perform in the NIR (due to the PSF size, for example; see Sect. 6.4). Otherwise, there is a surprising lack of correlation in the masks, indicating that information contained in each of the masks is somewhat independent. This is particularly interesting in the case of the r-band, where masks constructed within ASTROWISE and THELI are maximally correlated at the level of ~ 50%, despite being constructed from the same raw images. This motivates our fiducial choice of masks: individual masks from ASTROWISE and THELI flag somewhat different artefacts, and so the combination of these masks is the most appropriate (and conservative) choice.

Appendix G Shape measurement properties

The distribution of shape measurement properties on-sky is a useful diagnostic of the level of variability in the lensing sample, after recalibration, which can influence modelling and hint at possible systematic effects. Figure G.1 shows the on-sky distribution of PSF ellipticity in each of the five exposures (per pointing) that contribute to our shape measurement. Figure G.2 shows the distribution of measured shapes, and their uncertainty. Finally, Fig. G.3 shows the on-sky sum of the recalibrated shape-measurement weight.

Fig. G.1

Distribution of PSF ellipticities (|∊PSF|) measured in each of the five r-band exposures used for shape measurement by lensfit. The panels show the relative homogeneity of the PSF measured across all exposures, and the amount of variability between exposures of the same pointing.

Fig. G.2

Distribution of ∊1, ∊2, |∊|, σ∊,1, and σ∊,2 measured by lensfit after recalibration. All estimates are unweighted.

Fig. G.3

Distribution of recalibrated shape-measurement weights on-sky.

Appendix H Sample selection with weighting

Here we provide the details of the sample selections used in KiDS-Legacy, as presented in Table 14, but computed using weighted values rather than direct source counts, and expressed in terms of the effective number density of sources η = A⁻¹ (∑ w)² / (∑ w²), computed using the uncalibrated lensing weight w and the survey area in square-arcminutes A. This provides an indication of the most significant selections, when considering their influence on the weighted source sample that is used for computation of, for example, shear correlation functions. The weighted selections are shown in Table H.1.

The table shows that the most significant selections, in terms of lensing weight, are the blending, fitclass, magnitude, and resolution selections. These remove, respectively: 5%, 9%, 2%, and 5% of the available effective number density.
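The effective number density defined above, and the fraction of it removed by a selection, can be computed as in the following sketch. The function names are hypothetical, and the sketch evaluates a single selection in isolation; whether the values in Table H.1 are computed for each selection individually or sequentially is not assumed here.

```python
import numpy as np

def effective_number_density(weights, area_arcmin2):
    """Effective number density: (sum w)^2 / (sum w^2) / A."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum() / area_arcmin2

def fraction_removed(weights, keep, area_arcmin2):
    """Fraction of the effective number density lost when only `keep` is retained."""
    w = np.asarray(weights, dtype=float)
    before = effective_number_density(w, area_arcmin2)
    after = effective_number_density(w[np.asarray(keep, dtype=bool)], area_arcmin2)
    return 1.0 - after / before
```

For equal weights the expression reduces to the plain source count per unit area, so differences between the weighted and unweighted statistics arise only where the weights vary between sources.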

Finally, we reproduce the fractional selection of sources as a function of r-band magnitude shown in Fig. 29, now computed as a weighted fraction in Fig. H.1. The distribution is truncated to the limits that are analysed by lensfit (i.e. 19 ≤ r ≤ 25.5).

Fig. H.1

As in Fig. 29, but now computed using sources weighted by their uncalibrated shape measurement weight. The distribution is truncated to the limits analysed by lensfit (19 ≤ r ≤ 25.5), as the weights are not defined outside this region.

Table H.1

Statistics of the KiDS-Legacy sample definition, specified in terms of the effective number density.

Appendix I ESO data products

The catalogues and images that are available within the ESO database have changed between DR4 and DR5, as the format of the data has changed. In particular, the addition of the second-pass i-band necessitates a modification to the catalogue (and metadata) formats. The columns contained within the ESO multi-band catalogues are described in Table I.1, and the columns contained within the single-band catalogues are described in Table I.2. Columns that have changed meaning between DR4 and DR5 are specifically highlighted in Table I.3. Details of the image headers that are released are provided in Table I.4.

Table I.1

Columns provided in the ten-band catalogue.

Table I.2

Columns provided in the single-band source lists.

Table I.3

Changes to variable names and definitions in DR5 compared to DR4.

Table I.4

Main keywords in the ten-band catalogue headers

References

  1. Abolfathi, B., Aguado, D. S., Aguilar, G., et al. 2018, ApJS, 235, 42
  2. Ahumada, R., Allende Prieto, C., Almeida, A., et al. 2020, ApJS, 249, 3
  3. Aihara, H., Arimoto, N., Armstrong, R., et al. 2018, PASJ, 70, S4
  4. Albrecht, A., Bernstein, G., Cahn, R., et al. 2006, [arXiv:astro-ph/0609591]
  5. Andrews, S. K., Driver, S. P., Davies, L. J. M., et al. 2017, MNRAS, 464, 1569
  6. Asgari, M., Lin, C.-A., Joachimi, B., et al. 2021, A&A, 645, A104
  7. Balestra, I., Mainieri, V., Popesso, P., et al. 2010, A&A, 512, A12
  8. Benítez, N. 2000, ApJ, 536, 571
  9. Bertin, E. 2010, Astrophysics Source Code Library [record ascl:1010.068]
  10. Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393
  11. Bilicki, M., Dvornik, A., Hoekstra, H., et al. 2021, A&A, 653, A82
  12. Blake, C., Amon, A., Childress, M., et al. 2016, MNRAS, 462, 4240
  13. Capaccioli, M., Mancini, D., & Sedmak, G. 2005, The Messenger, 120, 10
  14. Colless, M., Dalton, G., Maddox, S., et al. 2001, MNRAS, 328, 1039
  15. Comparat, J., Richard, J., Kneib, J.-P., et al. 2015, A&A, 575, A40
  16. Cooper, M. C., Griffith, R. L., Newman, J. A., et al. 2012, MNRAS, 419, 3018
  17. Dalton, G. B., Caldwell, M., Ward, A. K., et al. 2006, SPIE Conf. Ser., 6269, 62690X
  18. Damjanov, I., Zahid, H. J., Geller, M. J., Fabricant, D. G., & Hwang, H. S. 2018, ApJS, 234, 21
  19. Davies, L. J. M., Driver, S. P., Robotham, A. S. G., et al. 2015, MNRAS, 447, 1014
  20. Davies, L. J. M., Robotham, A. S. G., Driver, S. P., et al. 2018, MNRAS, 480, 768
  21. de Jong, J. T. A., Kuijken, K., Applegate, D., et al. 2013, The Messenger, 154, 44
  22. de Jong, J. T. A., Verdoes Kleijn, G. A., Boxhoorn, D. R., et al. 2015, A&A, 582, A62
  23. de Jong, J. T. A., Verdoes Kleijn, G. A., Erben, T., et al. 2017, A&A, 604, A134
  24. Driver, S. P., Hill, D. T., Kelvin, L. S., et al. 2011, MNRAS, 413, 971
  25. Driver, S. P., Wright, A. H., Andrews, S. K., et al. 2016, MNRAS, 455, 3911
  26. Driver, S. P., Bellstedt, S., Robotham, A. S. G., et al. 2022, MNRAS, 513, 439
  27. Edge, A., Sutherland, W., Kuijken, K., et al. 2013, The Messenger, 154, 32
  28. Eifler, T., Miyatake, H., Krause, E., et al. 2021, MNRAS, 507, 1746
  29. Euclid Collaboration (Guglielmo, V., et al.) 2020, A&A, 642, A192
  30. Euclid Collaboration (Stanford, S. A., et al.) 2021, ApJS, 256, 9
  31. Euclid Collaboration (Scaramella, R., et al.) 2022, A&A, 662, A112
  32. Gaia Collaboration (Brown, A. G. A., et al.) 2018, A&A, 616, A1
  33. Gaia Collaboration (Brown, A. G. A., et al.) 2021, A&A, 649, A1
  34. Garilli, B., McLure, R., Pentericci, L., et al. 2021, A&A, 647, A150
  35. Gong, Y., Liu, X., Cao, Y., et al. 2019, ApJ, 883, 203
  36. González-Fernández, C., Hodgkin, S. T., Irwin, M. J., et al. 2018, MNRAS, 474, 5459
  37. Guinot, A., Kilbinger, M., Farrens, S., et al. 2022, A&A, 666, A162
  38. Hasinger, G., Capak, P., Salvato, M., et al. 2018, ApJ, 858, 77
  39. Heydenreich, S., Schneider, P., Hildebrandt, H., et al. 2020, A&A, 634, A104
  40. Heymans, C., Van Waerbeke, L., Miller, L., et al. 2012, MNRAS, 427, 146
  41. Hikage, C., Oguri, M., Hamana, T., et al. 2019, PASJ, 71, 43
  42. Hildebrandt, H., Viola, M., Heymans, C., et al. 2017, MNRAS, 465, 1454
  43. Hildebrandt, H., Köhlinger, F., van den Busch, J. L., et al. 2020, A&A, 633, A69
  44. Hildebrandt, H., van den Busch, J. L., Wright, A. H., et al. 2021, A&A, 647, A124
  45. Ilbert, O., Arnouts, S., McCracken, H. J., et al. 2006, A&A, 457, 841
  46. Ivezić, Ž., Lupton, R. H., Schlegel, D., et al. 2004, Astron. Nachr., 325, 583
  47. Ivezić, Ž., Kahn, S. M., Tyson, J. A., et al. 2019, ApJ, 873, 111
  48. Jarvis, M. J., Bonfield, D. G., Bruce, V. A., et al. 2013, MNRAS, 428, 1281
  49. Jee, M. J., Tyson, J. A., Hilbert, S., et al. 2016, ApJ, 824, 77
  50. Kafle, P. R., Robotham, A. S. G., Driver, S. P., et al. 2018, MNRAS, 479, 3746
  51. Kuijken, K. 2006, A&A, 456, 827
  52. Kuijken, K. 2008, A&A, 482, 1053
  53. Kuijken, K. 2011, The Messenger, 146, 8
  54. Kuijken, K., Heymans, C., Hildebrandt, H., et al. 2015, MNRAS, 454, 3500
  55. Kuijken, K., Heymans, C., Dvornik, A., et al. 2019, A&A, 625, A2
  56. Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, arXiv e-prints [arXiv:1110.3193]
  57. Le Fèvre, O., Vettolani, G., Garilli, B., et al. 2005, A&A, 439, 845
  58. Le Fèvre, O., Cassata, P., Cucciati, O., et al. 2013, A&A, 559, A14
  59. Le Fèvre, O., Tasca, L. A. M., Cassata, P., et al. 2015, A&A, 576, A79
  60. Lewis, J. R., Irwin, M., & Bunclark, P. 2010, ASP Conf. Ser., 434, 91
  61. Li, R., Napolitano, N. R., Feng, H., et al. 2022, A&A, 666, A85
  62. Li, S.-S., Kuijken, K., Hoekstra, H., et al. 2023, A&A, 670, A100
  63. Liang, S., & von der Linden, A. 2023, MNRAS, 519, 2281
  64. Lidman, C., Tucker, B. E., Davis, T. M., et al. 2020, MNRAS, 496, 19
  65. Lilly, S. J., Le Brun, V., Maier, C., et al. 2009, ApJS, 184, 218
  66. Masters, D. C., Stern, D. K., Cohen, J. G., et al. 2017, ApJ, 841, 111
  67. Masters, D. C., Stern, D. K., Cohen, J. G., et al. 2019, ApJ, 877, 81
  68. McCracken, H. J., Milvang-Jensen, B., Dunlop, J., et al. 2012, A&A, 544, A156
  69. McFarland, J. P., Verdoes-Kleijn, G., Sikkema, G., et al. 2013, Exp. Astron., 35, 45
  70. Miller, L., Heymans, C., Kitching, T. D., et al. 2013, MNRAS, 429, 2858
  71. Nakoneczny, S. J., Bilicki, M., Pollo, A., et al. 2021, A&A, 649, A81
  72. Newman, J. A., Cooper, M. C., Davis, M., et al. 2013, ApJS, 208, 5
  73. Parkinson, D., Riemer-Sørensen, S., Blake, C., et al. 2012, Phys. Rev. D, 86, 103518
  74. Pickles, A. J. 1998, PASP, 110, 863
  75. Popesso, P., Dickinson, M., Nonino, M., et al. 2009, A&A, 494, 443
  76. Raichoor, A., Mei, S., Erben, T., et al. 2014, ApJ, 797, 102
  77. Refregier, A. 2003, MNRAS, 338, 35
  78. Riello, M., De Angeli, F., Evans, D. W., et al. 2021, A&A, 649, A3
  79. Scodeggio, M., Guzzo, L., Garilli, B., et al. 2018, A&A, 609, A84
  80. Scoville, N., Aussel, H., Brusa, M., et al. 2007, ApJS, 172, 1
  81. Sevilla-Noarbe, I., Bechtol, K., Carrasco Kind, M., et al. 2021, ApJS, 254, 24
  82. Silverman, J. D., Kashino, D., Sanders, D., et al. 2015, ApJS, 220, 12
  83. Skrutskie, M. F., Cutri, R. M., Stiening, R., et al. 2006, AJ, 131, 1163
  84. The LSST Dark Energy Science Collaboration (Mandelbaum, R., et al.) 2018, arXiv e-prints [arXiv:1809.01669]
  85. Trump, J. R., Impey, C. D., Elvis, M., et al. 2009, ApJ, 696, 1195
  86. Vakili, M., Hoekstra, H., Bilicki, M., et al. 2023, A&A, 675, A202
  87. van den Busch, J. L., Hildebrandt, H., Wright, A. H., et al. 2020, A&A, 642, A200
  88. van den Busch, J. L., Wright, A. H., Hildebrandt, H., et al. 2022, A&A, 664, A170
  89. van der Wel, A., Noeske, K., Bezanson, R., et al. 2016, ApJS, 223, 29
  90. Venemans, B. P., Verdoes Kleijn, G. A., Mwebaze, J., et al. 2015, MNRAS, 453, 2259
  91. Wright, A. H., Robotham, A. S. G., Bourne, N., et al. 2016, MNRAS, 460, 765
  92. Wright, A. H., Driver, S. P., & Robotham, A. S. G. 2018, MNRAS, 480, 3491
  93. Wright, A. H., Hildebrandt, H., Kuijken, K., et al. 2019, A&A, 632, A34
  94. York, D. G., Adelman, J., Anderson, J. E., Jr., et al. 2000, AJ, 120, 1579

4

The effective wavelengths of the VST and SDSS filters agree to approximately 2, 6, 100, and 50 Å in the ugri-bands, respectively.

5

These magnitude limits are slightly brighter than those presented in previous KiDS releases, due largely to their computation allowing for inclusion of correlated noise. For a direct comparison with previous releases, see Appendix A and Table 1.

6

Catalogue I/345/gaia2, columns RAJ2000 and DEJ2000.

10

Sources from this sample that are proprietary are tagged with an additional flag, and are only provided to members of the community with allowed access (or after the public release of these data).

11

In practice, the matching of the spectroscopic compilation to the KiDZ sources happens after the computation of multi-band information for the KiDZ fields, including masks, discussed in Sect. 6. Nonetheless we include this step here for simplicity.

13

Insofar as the optimal estimator would require knowledge of the true measurement variance of the sources, whereas the flux error reported by GAAP is only a partial reflection of the true variance, as it does not account for all sources of flux uncertainty.

All Tables

Table 1

Summary of imaging data released in KiDS-DR5.

Table 2

Run numbers of all VST observations taken in the KiDS and KiDZ fields.

Table 3

KiDS+KiDZ observing strategy: observing condition constraints and exposure times.

Table 4

Source extraction parameters used in the creation of the KiDS DR5 source lists within ASTROWISE.

Table 5

Applied cross-talk coefficients.

Table 6

Updated parameters for the r-band PULECENELLA masking in KiDS DR5.

Table 7

Run numbers of all VISTA observations taken in the KiDS and KiDZ fields.

Table 8

Requirements and settings for VIKING observations with VIRCAM on VISTA.

Table 9

Requirements and settings for deep VIRCAM with ultraVISTA and VIDEO on VISTA.

Table 10

Statistics for the KiDZ spectroscopic sample.

Table 11

KiDS+KiDZ MASK bits, their names, and additional information.

Table 12

Mosaic areas computed from the 6″ mosaic masks.

Table 13

fitclass definitions and statistics output by lensfit v321.

Table 14

Statistics of the additional selections used for KiDS-Legacy sample definition.

Table D.1

Co-adds for which the u-band was reprocessed with linear polynomial order for the astrometric distortion definition.

Table H.1

Statistics of the KiDS-Legacy sample definition, specified in terms of the effective number density.

Table I.1

Columns provided in the ten-band catalogue.

Table I.2

Columns provided in the single-band source lists.

Table I.3

Changes to variable names and definitions in DR5 compared to DR4.

Table I.4

Main keywords in the ten-band catalogue headers.

All Figures

Fig. 1

Summary statistics for stage-II (Heymans et al. 2012; Jee et al. 2016), stage-III (this work; Sevilla-Noarbe et al. 2021; Hikage et al. 2018; Guinot et al. 2022), and stage-IV (Euclid Collaboration 2022; The LSST Dark Energy Science Collaboration 2018; Gong et al. 2019; Eifler et al. 2021) cosmological imaging surveys, comparing the survey area and effective number density of sources for weak lensing analyses (square markers). The red curves show the total effective number of sources, which principally determines the cosmological constraining power. The colour bars indicate each survey’s wavelength coverage, which principally determines the photometric redshift quality of a survey. The size of the circular point within each survey’s square marker shows the typical seeing in the lensing band, which principally determines the accuracy of shape measurements. The statistics presented for future surveys are forecasts, as indicated in pink. The final DES-Y6 data release will increase in depth, relative to the DES-Y3 statistics shown, raising the final effective number density of sources.

Fig. 2

Chart showing the data processing path that is taken for the KiDS and KiDZ data in DR5, from optical and NIR imaging (top) through to final mosaic catalogues (bottom). Each step outlined in this graph is discussed in the correspondingly annotated section. Yellow boxes show raw data products and data products input into our reduction pipelines. Green boxes contain imaging data products released as part of this data release. Pink boxes contain per-tile catalogue-level data products released as part of this data release. Blue boxes contain mosaic catalogue-level data products released as part of this data release.

Fig. 3

Distribution of KiDS DR5 pointings on-sky. The figure shows the distribution of DR4 pointings (dark cyan) and new DR5 pointings (yellow). Pointings that were originally included in the 1500 deg² KiDS footprint, but which were subsequently de-scoped due to the limited area observed by VIKING, are shown in dark purple. The combination of the dark cyan and yellow data therefore shows the 1347 deg² of KiDS DR5 observations.

Fig. 4

Progression of optical observations over the full KiDS survey, for pointings within the final 1347 deg² footprint. The total number of DR5 pointings is shown as the dashed grey line. The figure demonstrates the slow initial progress made by the survey, whereby only ~5% of planned r-band observations were completed in the initial 18 months of data-taking, prompting changes in the queuing of (in particular) the dark-time observations. Conversely, at the end of the survey, the complete second-pass of i-band observations (designated i2) was completed in roughly the same time span.

Fig. 5

Temporal separation between the two i-band passes, as a function of position on-sky. The figure shows all KiDS DR5 pointings, coloured by the number of years separating the two i-band passes. The overall distribution of these values is shown as a KDE within the colour bar, computed with a 0.1-year rectangular bandwidth. The largest temporal separations occur typically in the GAMA fields, where KiDS observations were focussed during early survey operations. The shortest temporal separation between any two passes is at RA = 350.3 deg, Dec = −35.1 deg, which was observed on 21 August 2017 and 7 October 2017 for the two i-band passes, respectively, for a separation of 50 nights.

Fig. 6

Footprints of the KiDZ pointings. The figure shows the regions where there exist optical and/or NIR data in dark and light grey, respectively. The rectangles show the limits of all observations made with the VST (black) and dedicated observations made with VISTA (green). Dark grey regions that are not covered by our dedicated VISTA observations are either already contained within the VIKING footprint (CDFS, G15-Deep) or have VIKING-like observations constructed from existing deep observations (VIPERS, COSMOS; see Sect. 4.1.3).

Fig. 7

Primary observational properties of pointings in KiDS+KiDZ for observations in the four optical bands. Observations in the i-band are split into the two epochs (labelled i1 and i2). Each row shows the distribution of average PSF sizes (as reported by ASTROWISE; left) and limiting magnitude (5σ in a 2″ diameter circular aperture; right) determined with a KDE using the annotated kernel. The corresponding cumulative distribution functions for each panel are shown as grey dashed lines.

Fig. 8

Astrometric calibration of KiDS DR5, with respect to Gaia DR3 stars at epoch J2000. Values here show the median astrometric residual per pointing for stars in the magnitude range 16.5 ≤ G ≤ 19. The colour bar shows the location of each field in the Dec direction, after subtracting a constant equal to the mean declination of all fields in the relevant hemisphere. Absolute residuals are typically less than the 0″.05 level, and are therefore negligible. Residual offsets below this level are possibly attributable to barycentric motion between the J2000 and J2015 epochs (see Appendix C).

Fig. 9

On-sky variation in the KiDS DR5 astrometric calibration with respect to Gaia stars at the J2000 epoch. Residuals can be seen to fall around the earliest observations, due to the reduced proper motion differences between our observations and the assumed epoch.

Fig. 10

Astrometric agreement between sources extracted from THELI-Lens and ASTROWISE images. The sample of all sources has a median offset of 0″.014 and −0″.017 in the RA and Dec directions, respectively. The NMAD scatter in the RA and Dec directions is 0″.097 and 0″.090, respectively.

Fig. 11

Overall astrometric calibration of KiDS DR5, calibrated to Gaia, with respect to SDSS in the KiDS-N field. The sample of all sources has a median offset of 0″.031 and −0″.006 in the RA and Dec directions, respectively. The NMAD scatter in the RA and Dec directions is 0″.094 and 0″.084, respectively.

Fig. 12

Maximal residual between chip corners within a single pointing, assuming all chips can be shifted to a common RA and Dec centroid. Cases where the astrometric distortion parameters are poorly constrained (and so result in unphysical chip distortions) manifest as large residuals. The distribution in the u-band when using third-order distortion polynomials is clearly systematically larger than the other bands, with some chips having significant biases (greater than 10″), due to the lower number of stars creating instability in the polynomial fits. To minimise the effect of this bias, we reprocessed all u-band tiles with maximal residuals greater than 1 using a linear polynomial distortion order.

Fig. 13

Zero-point corrections estimated in KiDS+KiDZ using SLR. Left: difference between SLR offsets estimated using i1 photometry and i2 photometry, prior to recalibration of the u-band zero-points. Centre: differences between SLR offsets estimated using 0″.7 minimum-radius aperture fluxes and 1″.0 minimum-radius aperture fluxes (see Sect. 3.6). Right: distribution of final SLR offsets used in DR5 (i.e. corrections to the nightly zero points derived from photometric standards), including Gaia recalibration.

Fig. 14

Comparison between the u-band zero-point calibration in KiDS DR4 (green) and KiDS DR5 (purple). The updated DR5 calibration procedure produces zero-points that show considerably less systematic and random variation, when compared to fluxes from SDSS.

Fig. 15

Distribution of the median offsets between stars in KiDS DR5 tiles (in the magnitude range 16.5 < r < 19) and their counterparts from SDSS imaging, for tiles with 10% or more unmasked data. The offsets here were calculated after SLR corrections and Gaia recalibration, and therefore represent the final quality of photometry in the survey. Horizontal dashed lines demonstrate the systematic zero-point uncertainty that is included per-band in our scientific analyses (such as in the computation of photo-z), which are designed to encapsulate any residual systematic variation in the photometry (such as can be seen as a function of RA, and whose origins are unclear).

Fig. 16

Comparison between the ASTROWISE PULECENELLA masks for a selected KiDS pointing. The images show the masks on a consistent linear colour scale, so pixels masked with the same bit(s) are shown with the same colour in each image. The DR5 implementation of these masks was refined to use slightly different parameters (see Table 6), which more closely reproduce the masking behaviour of THELI-Lens in the ASTROWISE r-band. The primary effect is a reduction (in DR5) of the size of stellar masks (cyan) and the frequency of masking of large reflection halos (green).

Fig. 17

Examples of the new ASTROWISE manual masking implemented for KiDS DR5. The figure shows a heavily contaminated field, caused by scattered light from Fomalhaut (α PsA), a visible-magnitude star (V = 1.16) that is ~1.5 deg from the centre of the focal plane. The left column shows the tile before masking, while the right column shows the tile after application of the manual and PULECENELLA masks. The upper row shows the tile at native resolution, while the bottom row shows the tile after smoothing with a 5″ Gaussian filter.

Fig. 18

Coverage of VISTA VIKING data in the KiDS fields, demonstrating that nearly the entire KiDS DR5 footprint is covered by complete ZYJHKs-band VIKING observations.

Fig. 19

Progression of the NIR observations by the VIKING survey. Note that the J-band line has been multiplied by a factor of 0.5, to account for the fact that it is observed with twice the frequency of the other bands. The observing strategy, combining observations of ZYJ-bands and JHKs-bands, can be seen in the correlated increase in observed paws in the various filters.

Fig. 20

Distribution of recalibration factors ℱ, derived by KVPIPE using the parameters provided by CASU, for VIKING and VIKING-like observations in the KiDS+KiDZ fields. Following Driver et al. (2016) and Wright et al. (2019), we applied a rejection of detectors with recalibration factors ℱ ≥ 30. There are additional indirect selections, however, that occur for fields with large PSF sizes (see Sect. 6.4).

Fig. 21

Primary observational properties of pointings in KiDS+KiDZ for observations in the five NIR bands. Each row shows the distribution of PSF sizes (as measured on each VISTA chip; left) and limiting magnitude (as determined by the magnitude of a 5σ source in a 2″ circular aperture; right) determined with a KDE using the annotated kernel. The corresponding cumulative distribution functions for each panel are shown as grey lines.

Fig. 22

Distribution of KiDZ spectroscopic redshift estimates on-sky. The figure shows the distribution of all available spectroscopic redshift estimates from the full spectroscopic compilation (red) and those that are matched to unmasked KiDZ sources (blue). The available footprint of the KiDZ imaging is shown in grey scale beneath the points, demonstrating where we have imaging but no available spectra (and vice versa).

Fig. 23

Distribution of KiDZ spectroscopic redshift estimates in RA-z space. Sources in orange indicate estimates that were available for previous KiDS analyses, and sources in cyan show new estimates added here.

Fig. 24

Colour-colour diagrams for KiDS DR5 sources. The stellar locus can be identified by the distinct clouds of data that are coincident with both SDSS stars (black contours) and Pickles stellar templates (pink dots).

Fig. 25

Quality metrics for photo-z point-estimates produced by BPZ in PHOTOPIPE. Each row shows one quality metric, computed from the distribution of Δz = (z_B − z_spec)/(1 + z_spec): the running median (‘bias’, µ), the running normalised median-absolute-deviation (‘scatter’, σ), and the fraction of sources with |Δz| > 0.15 (‘outlier rate’, η_0.15). The columns show these statistics computed as a function of r-band magnitude (left), photo-z point-estimate (z_B, centre), and spectroscopic redshift (z_spec, right).
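
The three metrics named in the Fig. 25 caption can be illustrated with a minimal Python sketch for a single bin of sources. The function names are hypothetical, and the 1.4826 factor is the conventional NMAD normalisation, assumed here rather than taken from the text; the running statistics in the figure would repeat this calculation in bins of magnitude or redshift.

```python
from statistics import median


def delta_z(z_phot, z_spec):
    """Scaled residual (z_B - z_spec) / (1 + z_spec) for each source."""
    return [(zb - zs) / (1.0 + zs) for zb, zs in zip(z_phot, z_spec)]


def nmad(values):
    """Normalised median absolute deviation.

    The 1.4826 factor (assumed, conventional) scales the MAD so that it
    matches the standard deviation for Gaussian-distributed values.
    """
    values = list(values)
    centre = median(values)
    return 1.4826 * median([abs(v - centre) for v in values])


def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    """Return (bias, scatter, outlier rate) for one bin of sources."""
    dz = delta_z(z_phot, z_spec)
    bias = median(dz)
    scatter = nmad(dz)
    outlier_rate = sum(abs(d) > outlier_cut for d in dz) / len(dz)
    return bias, scatter, outlier_rate


# Toy example: five sources, all at z_spec = 1.
bias, scatter, eta = photoz_metrics([0.8, 1.0, 1.2, 1.4, 1.6], [1.0] * 5)
```

Median-based statistics are used throughout so that a small number of catastrophic outliers does not dominate the bias and scatter estimates.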

Fig. 26

Differences between photo-z point-estimates produced by BPZ in PHOTOPIPE computed without the second pass i-band (‘9-band’) and with the second pass i-band (‘10-band’), using a full simulated KiDS wide-field sample described in van den Busch et al. (2020). Metrics here are not directly comparable to Fig. 25 due to this sample being simulated, having a different redshift baseline, and representing a full wide-field sample (unmatched to spectroscopy). Therefore, only relative differences between the nine- and ten-band cases are relevant. Metrics are computed as in Fig. 25, except metrics here are shown as percentage difference with respect to the nine-band case.

Fig. 27

Demonstration of the strong-selection masking in the KiDS+KiDZ catalogues for KiDZ pointing KIDZ_333p0_1p9 in the Y-band. All sources detected by SOURCE EXTRACTOR are shown coloured by the number of missing photometric bands as determined by GAAP (0: blue, 1: red). The mask is determined by the fraction of sources per unit area on sky that are missing one or more bands, as determined using a ratio of KDEs constructed on a 1′ × 1′ grid with a 1′ Gaussian kernel. The strong-selection mask removes regions with missing-source fractions greater than 20%. The effect of this mask can be seen in the background, which shows the sum image of this field’s Y-band data after masking: areas that show the strong selection effect (i.e. red dots) have a zero value in the sum image (i.e. white), and are therefore masked.
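
The ratio-of-KDEs construction described in the Fig. 27 caption can be sketched in Python as follows. This is only an illustration under stated assumptions: coordinates are flat-sky offsets in arcminutes, the 1′ kernel width is interpreted as the Gaussian standard deviation, and the function and argument names are hypothetical rather than taken from the KiDS pipeline.

```python
import math


def missing_fraction_mask(x, y, n_missing, grid_x, grid_y,
                          sigma=1.0, threshold=0.2):
    """Flag grid nodes where the smoothed missing-source fraction is high.

    x, y       : source positions (arcmin, flat-sky approximation; assumed)
    n_missing  : number of missing photometric bands per source
    grid_x/y   : node coordinates of the evaluation grid (arcmin)
    sigma      : Gaussian kernel width (assumed to be the standard deviation)
    threshold  : mask nodes whose missing fraction exceeds this value
    """
    mask = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            weight_all = 0.0
            weight_missing = 0.0
            for xi, yi, ni in zip(x, y, n_missing):
                r2 = (xi - gx) ** 2 + (yi - gy) ** 2
                w = math.exp(-r2 / (2.0 * sigma ** 2))
                weight_all += w
                if ni > 0:
                    weight_missing += w
            # Ratio of the two kernel density estimates at this node;
            # the kernel normalisation cancels in the ratio.
            fraction = weight_missing / weight_all if weight_all > 0 else 0.0
            row.append(fraction > threshold)
        mask.append(row)
    return mask


# Toy example: complete sources near x = 0, incomplete sources near x = 10.
mask = missing_fraction_mask(
    x=[0.0, 0.0, 10.0, 10.0], y=[0.0, 0.0, 0.0, 0.0],
    n_missing=[0, 0, 1, 1], grid_x=[0.0, 10.0], grid_y=[0.0],
)
```

Because both densities use the same kernel, the ratio is a smoothed local fraction, which avoids noisy per-cell counts in sparsely populated regions.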

Fig. 28

Sources in the resolution-S/N plane that is used for recalibration of lensfit weights and shape estimates. Top row: estimates of the PSF leakage into the shape measurement variance, estimated with simple linear regression (left) and with clipped, iterative, residual-weighted linear regression (centre). Right: difference between the leakage estimates. Bottom row: PSF leakage estimates after correction of shape variances and weights (left). The estimates of PSF leakage into the source shape distributions are shown before (centre) and after (right) correction of estimated ellipticities.

Fig. 29

Selections applied to the KiDS DR5 MASK ≤ 1 sample, and how they modify the available number of sources as a function of r-band magnitude, relative to the total number counts. The number of sources that are removed by each selection is given in Table 14.

Fig. 30

Selection of binary stars in construction of the KiDS-Legacy lensing sample. Left: distribution of all DR5 sources in size and apparent magnitude. Centre: distribution of all sources remaining prior to the rejection of binary stars, which have ellipticities |∊| ≤ 0.8. The colour scale in the centre panel is the same as in the left panel. Right: as in the centre panel, but for sources with ellipticities |∊| > 0.8. This is the space in which the binary rejection is performed. The binary rejection criterion used in previous KiDS releases is shown as the dashed red line. The new binary rejection criterion for sources in DR5 is shown as the solid red line (sources above the line are discarded).

Fig. A.1

Correlated background estimated using the LAMBDAR randoms routine. Left: Local region of the image under analysis, with the 100 realisations of the random aperture shown as translucent black circles. The image has been masked using the fiducial mask, seen as the missing triangular region (a diffraction spike) at the bottom of the image. Right: Distribution of pixel values measured within the random apertures, for all apertures (black) and the individual apertures (colours). There are some apertures that are coincident with sources, resulting in large positive fluxes. These are limited in the computation of the random noise estimate through the use of median statistics. The distribution of aperture fluxes from the randoms is shown by the box-and-whisker plot, with the outlier fluxes shown as stars. This figure is a direct output of the LAMBDAR code.

Fig. A.2

Uncorrelated background estimated using the LAMBDAR sky-estimate routine. Left: Annuli used to estimate the background estimates, shown against the input image (colour mapping for pixels in the left panel is given by the point colouring in the right panel). Right: Distribution of pixel values (points) as a function of radius from the random location chosen. The black lines show the median sky value per annulus (solid) and the uncertainty on the median (dashed). The final mean (median) sky estimate is shown as the solid (dashed) red line. This figure is a direct output of the LAMBDAR code.

Fig. A.3

Demonstration of the background estimates produced by LAMBDAR for pointing KIDS_l85p0_m0p5 in the r-band. Each panel shows a grey-scale image of the pointing, smoothed with a 2″ Gaussian kernel and after masking. Overlaid on this image are various estimates of the image noise level, shown with rectangles covering the extent of the area used for the estimate. Left: Magnitude limits estimated with blanks. Centre: Magnitude limits estimated with the pixel RMS. Right: Correlation factor, estimated as the ratio between the aperture noise RMSs estimated with blanks and pixels. This factor clearly recovers the regions of the image with increased noise correlation caused by residual bright artefacts.
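
The correlation factor in the right panel of Fig. A.3 is a ratio of two aperture-noise estimates, which can be sketched in Python as below. This is not the LAMBDAR implementation: the scaling of pixel RMS to aperture noise by the square root of the pixel count (valid for uncorrelated noise), the NMAD-based robust RMS, and all function names are assumptions made for illustration.

```python
import math
from statistics import median


def robust_rms(values):
    """Median-based RMS estimate (1.4826 x MAD; assumed estimator)."""
    values = list(values)
    centre = median(values)
    return 1.4826 * median([abs(v - centre) for v in values])


def aperture_noise_from_pixels(pixel_rms, n_pix):
    """Aperture noise expected if pixel noise were uncorrelated (assumed)."""
    return pixel_rms * math.sqrt(n_pix)


def correlation_factor(blank_fluxes, pixel_rms, n_pix):
    """Ratio of blank-aperture noise to the uncorrelated pixel prediction."""
    return robust_rms(blank_fluxes) / aperture_noise_from_pixels(pixel_rms, n_pix)


def limiting_magnitude(aperture_rms, zero_point, n_sigma=5.0):
    """Magnitude of a source whose flux equals n_sigma times the noise."""
    return zero_point - 2.5 * math.log10(n_sigma * aperture_rms)


# Toy example: fluxes from five blank apertures, each containing 100 pixels.
factor = correlation_factor([-2.0, -1.0, 0.0, 1.0, 2.0], pixel_rms=0.1, n_pix=100)
```

A factor above unity indicates that the noise measured in blank apertures exceeds the uncorrelated prediction, which under these assumptions points to correlated pixel noise.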

Fig. B.1

Distribution of limiting magnitudes in the optical bands for the KiDS fields. The background levels are measured using 2″ circular apertures, as described in Appendix A. To construct the on-sky mosaic, estimates per-pointing are combined using the same WCS cuts as are applied to the data (Sect. 6.5).

Fig. B.2

Distribution of PSF FWHM sizes (in arcsec) as reported by ASTROWISE for each pointing within KiDS.

Fig. C.1

Astrometric residuals between KiDS and Gaia when comparing to J2016 epoch stellar positions. The systematic difference is caused by the systematic shift in the positions of stars between 2000 and 2016, caused by the motion of the Solar System through the Milky Way.

Fig. C.2

Astrometric residuals between KiDS and Gaia when comparing to J2000 epoch stellar positions. The systematic difference is no longer present; however, a considerable increase in scatter is visible. This is due to the KiDS stars being relatively positioned according to the J2016 epoch (determined by when the images were taken).

Fig. E.1

Distribution of effective NIR magnitude limits in the KiDS fields, computed using LAMBDAR with 2″ circular apertures.

Fig. E.2

Distribution of mean PSF FWHM sizes (in arcsec) as reported by ESO for each paw-print observation of the KiDS fields. Binning is made in steps of {δRA, δDec} = {0.25/cos(Dec), 0.25} deg, as the spacing between adjacent chips in the paw-print is ~0.2 deg.

Fig. F.1

Correlations between masked area per pointing, for each of the individual mask bits. Correlations are computed between the area available per pointing after the application of the relevant mask bits, for the full KiDS-DR5 survey.

Fig. G.1

Distribution of PSF ellipticities (|∊PSF|) measured in each of the five r-band exposures used for shape measurement by lensfit. The panels show the relative homogeneity of the PSF measured across all exposures, and the amount of variability between exposures of the same pointing.

Fig. G.2

Distribution of ∊1, ∊2, |∊|, σ∊,1, and σ∊,2 measured by lensfit after recalibration. All estimates are unweighted.

Fig. G.3

Distribution of recalibrated shape-measurement weights on-sky.

Fig. H.1

As in Fig. 29, but now computed using sources weighted by their uncalibrated shape measurement weight. The distribution is truncated to the magnitude limits analysed by lensfit (19 ≤ r ≤ 25.5), as the weights are not defined outside this region.
