Gaia Data Release 3
A&A
Volume 674, June 2023
Article Number A32
Number of page(s) 25
Section Catalogs and data
DOI https://doi.org/10.1051/0004-6361/202243790
Published online 16 June 2023

© The Authors 2023

Licence: Creative Commons. Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1. Introduction

This paper describes the validation of the third Gaia data release, Gaia DR3 (Gaia Collaboration 2016, 2023e). The validation of the astrometric and photometric content can be found in the Gaia EDR3 validation paper (Fabricius et al. 2021). We focus here on the new products of Gaia DR3, which are summarised in Gaia Collaboration (2023e). The main new products of Gaia DR3 are the radial velocities, together with the line broadening and GRVS magnitude, astrophysical parameters, variable stars, Solar System objects, and, for the first time, spectra (both from the spectrophotometric instrument and from the Radial Velocity Spectrometer, RVS), non-single stars, and quasar (QSO) and galaxy candidates with their associated characterisation. The processing papers1 and the online documentation2 describe the data and their internal validation in detail. The performance verification papers1 highlight the overall quality of the catalogue. Although the scientific validation process has confirmed the high quality of the Gaia DR3 data, certain issues remain. In this paper, we focus on the caveats that Gaia DR3 users should be aware of, and we provide advice to the users.

The approach followed by the validation presented in this paper is a transverse analysis of the properties of the catalogue content. Tests are either internal (including overall statistics, correlations, and clustering analysis between catalogue entries) or use external data, a Galaxy model, or clusters. The comparison with a Galaxy model was made using the Gaia object generator (GOG20) as a reference model or the Gaia universe model snapshot (GUMS20, which contains the intrinsic properties of the objects generated by GOG). They are described in detail in the online documentation3 and were released with the Gaia EDR3 set of catalogues4.

Although the validation tests were designed to be transverse, we organise the paper by product for convenience. We therefore discuss in turn the Radial Velocity Spectrometer products (Sect. 2: radial velocities, line broadening, GRVS magnitude, and RVS spectra), the low-resolution (Blue and Red Photometers, BP and RP) spectra (Sect. 3), the astrophysical parameters (Sect. 4), QSO and galaxy candidates (Sect. 5), non-single stars (Sect. 6), variables (Sect. 7), and Solar System objects (Sect. 8).

2. Radial Velocity Spectrometer

While only radial velocities were provided in Gaia DR2, more products from the Radial Velocity Spectrometer (RVS) are available in Gaia DR3 within the gaia_source table: the radial velocities (radial_velocity) down to GRVS = 14, the spectral line broadening (vbroad), and the GRVS magnitude estimated from the flux in the RVS spectra (grvs_mag). Moreover, a subset of RVS spectra is available through the rvs_mean_spectrum datalink5.

2.1. Radial velocity

The radial velocity (RV) data are presented in detail in Katz et al. (2023). The radial velocities of hot stars are specifically targeted in Blomme et al. (2023).

2.1.1. Radial velocity contaminants

During the internal validation of a preliminary version of the catalogue, we detected erroneous radial velocities due to nearby bright contaminant sources, which was a well-known issue for Gaia DR2 (Boubert et al. 2019; Seabroke et al. 2021). This is illustrated in Fig. 1 (top), where we took all pairs of sources whose components are closer than 10 arcsec and plotted their difference in radial velocity (absolute value) versus their angular separation (squared). In this plot, optical pairs should contribute a constant density of points at a given ordinate, while physical binaries should contribute pairs with small RV differences. If in a given transit, the dispersion of the spectra is oriented close to the line of separation of the two sources, the lines from both sources will be present in both spectra. If in addition, the lines from the neighbour source are confused with the lines of the (fainter) target source, this will give the target an erroneous RV, which differs from the RV of its neighbour in proportion to the separation. This will normally just result in an outlying observation, but if a particular scan direction dominates, the final radial velocity difference will become 145 km s−1 for each arcsecond of separation. In the upper panel of Fig. 1, we see a rounded front, a parabola, closely matching the predicted 145 km s−1 arcsec−1 dependence. This is a clear sign that source confusion is occurring. We also note a second, weaker front below the first. It corresponds to a similar effect, but the two sources are separated by 1.8 arcsec in the direction perpendicular to the scan, which is the limit at which two observations have independent data acquisition.
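The shape of the front can be reproduced with a few lines of arithmetic. The sketch below only assumes the 145 km s−1 arcsec−1 scale quoted above; the function names and the tolerance `tol_kms` are illustrative choices, not part of the Gaia processing. Because Fig. 1 uses the squared separation on the abscissa, a linear dependence on separation appears as a square-root curve, that is, a parabola lying on its side.

```python
import math

# Scale quoted in the text: RV offset per arcsecond of separation along the scan.
KMS_PER_ARCSEC = 145.0


def contamination_front(sep_sq_arcsec2):
    """Expected |delta RV| (km/s) on the front for a squared separation (arcsec^2)."""
    return KMS_PER_ARCSEC * math.sqrt(sep_sq_arcsec2)


def near_front(sep_arcsec, delta_rv_kms, tol_kms=20.0):
    """Flag a pair whose |delta RV| lies within tol_kms of the predicted front.

    tol_kms is an illustrative tolerance, not a value taken from the paper.
    """
    expected = KMS_PER_ARCSEC * sep_arcsec
    return abs(abs(delta_rv_kms) - expected) <= tol_kms


if __name__ == "__main__":
    for x in (1.0, 4.0, 9.0):
        print(f"sep^2 = {x:4.1f} arcsec^2 -> |dRV| ~ {contamination_front(x):6.1f} km/s")
```

A pair flagged by `near_front` is only a candidate for contamination; a genuine high-velocity star can fall on the curve by chance.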

Fig. 1.

Differences in radial velocities of the members of close pairs of sources as a function of the square of the angular separation in arcsec2 (top) and after filtering (bottom). The red lines enclose one of the criteria that were used to filter the problematic cases.

As a result, we have a population of false high radial velocities, but also sources with biased radial velocities that are not necessarily very high. Most of these problematic sources have been filtered out of the data released in Gaia DR3 based on the separation and magnitude difference of the pair for sources with |RV|> 200 km s−1 (see Katz et al. 2023). This filtering causes the fronts to almost disappear (Fig. 1 bottom), although some hints of them still remain.

In the top panel of Fig. 1, a vertical band of sources with very large radial velocity differences at very small separations is visible as well. From the pairs with separations smaller than 1.6 arcsec (corresponding to 2.56 arcsec2 in these plots) and velocity differences above 500 km s−1 or below −500 km s−1 (enclosed within the red lines), we filtered out the member with the higher RV, which in general corresponds to the fainter member of the pair. This amounts to a total of 57 stars.
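The selection described above can be written as a small predicate. This is a minimal sketch under stated assumptions: the thresholds are those quoted in the text, "higher RV" is read here as the larger absolute radial velocity, and the function name and the `"a"`/`"b"` labels are hypothetical.

```python
def member_to_drop(sep_arcsec, rv_a, rv_b, max_sep=1.6, min_diff=500.0):
    """Return 'a' or 'b' for the member to filter out, or None to keep the pair.

    A pair is affected when its separation is below max_sep (arcsec) and its
    RV difference exceeds min_diff (km/s) in absolute value. The member with
    the larger |RV| is dropped (an interpretation of "higher RV").
    """
    if sep_arcsec >= max_sep or abs(rv_a - rv_b) <= min_diff:
        return None
    return "a" if abs(rv_a) > abs(rv_b) else "b"
```

For example, `member_to_drop(1.0, 700.0, 20.0)` selects member `"a"`, while a pair separated by 2 arcsec is left untouched.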

We also used the binary catalogue by El-Badry et al. (2021) to test the internal consistency of the radial velocities. More than 100 000 pairs with a probability of 90% of being bound according to this catalogue have a radial velocity for both members in Gaia DR3. The comparison of their radial velocities before filtering (Fig. 2, left) indicates that most of the pairs closely follow the one-to-one line (agreement of RV), but some sources have suspiciously high radial velocities, especially the secondary members. This sample also shows correlated velocity differences and separations in a plot similar to Fig. 1. Some problematic high radial velocities still remain in the catalogue after filtering (Fig. 2, right panel).

Fig. 2.

Radial velocity of the primary star against the secondary for stars of binary pairs from the El-Badry et al. (2021) catalogue before (left) and after (right) filtering.

For 770 sources, |radial_velocity| > 600 km s−1. We expect most of them to be real high-velocity stars, but due to the low signal-to-noise ratio of most of the spectra, it is difficult to know which fraction of the measurements is truly spurious. We can stack all of them, however, to improve the summed signal. All the spectra were shifted to the rest frame using their measured radial velocity. If the RV value that was used was correct, the stacked spectra should have strong lines in the expected places, such as the calcium triplet lines. This is shown in Fig. 3. However, if the radial velocity that was used for the correction was incorrect, the triplet lines are shifted and appear in the wrong place: for RV > 600 km s−1, they are at least 1.7 nm to the left of the expected position, while for RV < −600 km s−1, they appear at least 1.7 nm to the right (dashed lines). These secondary peaks are also seen in the figure. They are less sharp because the incorrect velocity corrections range from 600 to 900 km s−1 (i.e. between 1.7 and 2.6 nm), so they do not all peak in the same place. In any case, the figure clearly shows that most measurements are good, that is, most of the sources are real high-velocity targets.
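The 1.7 nm and 2.6 nm displacements follow directly from the Doppler formula, Δλ = λ v / c. The short check below uses rounded rest wavelengths for the calcium triplet and the non-relativistic approximation, which is adequate at these velocities; the constant and function names are illustrative.

```python
C_KMS = 299_792.458  # speed of light in km/s

# Approximate rest wavelengths of the Ca II triplet in nm (rounded air values).
CA_TRIPLET_NM = (849.8, 854.2, 866.2)


def doppler_shift_nm(rest_nm, rv_kms):
    """Non-relativistic wavelength shift (nm) for a radial velocity in km/s."""
    return rest_nm * rv_kms / C_KMS


if __name__ == "__main__":
    for rv in (600.0, 900.0):
        shifts = ", ".join(f"{doppler_shift_nm(w, rv):.2f}" for w in CA_TRIPLET_NM)
        print(f"RV = {rv:5.0f} km/s -> shifts of {shifts} nm")
```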

Fig. 3.

Stacked spectra for all the sources with a radial velocity > 600 km s−1 (421 sources, top) and <−600 km s−1 (349 sources, bottom). Solid vertical lines indicate the position of the calcium triplet, and dashed lines show the same lines shifted by 1.7 nm, indicating where the spectral line would be if the radial velocity correction were incorrect by 600 km s−1.

Radial velocities are provided down to GRVS = 14. A few sources are nevertheless fainter than G = 16 (see Fig. 4). A strongly negative (GRVS − G) could indicate contamination from nearby bright stars (affecting the RV estimation), but also (for the faintest GRVS) an under-subtraction of the background. We recommend caution for radial velocities with (GRVS − G) < −3.

Fig. 4.

Uncertainty in radial velocity as a function of G.

2.1.2. Radial velocity systematics

Comparison with external catalogues shows that the radial velocity zero-point offset is smaller than 0.1 km s−1 with respect to the radial velocity standard catalogue of Soubiran et al. (2018), Carmenes (Lafarga et al. 2020), and SIM (Makarov & Unwin 2015), but it is about −0.2 km s−1 with respect to GALAH DR3 (Zwitter et al. 2021), APOGEE DR16 (Ahumada et al. 2020), and GES DR3 (Gilmore et al. 2012). The fraction of 5σ outliers is smaller than 3%. The radial velocity zero-point decreases with metallicity in all surveys, as illustrated in Fig. 5. The global change in radial velocity with magnitude is not consistent across the surveys. However, Katz et al. (2023) used subsamples on which they found a consistent trend between APOGEE and GALAH and propose a magnitude-term correction. For stars with rv_template_teff > 8500 K, the correction derived in Blomme et al. (2023) is to be used instead.

Fig. 5.

Variation in radial velocity difference with APOGEE DR16 as a function of the APOGEE metallicity.

The comparison of the median radial velocity with the GOG model does not show any systematic significant difference throughout the sky, at least none that can be attributed to the data themselves. Figure 6 shows the median value of the radial velocity throughout the whole sky and per magnitude bin for DR3, EDR3, and GOG20. From G = 4 to G = 15, the DR3 values agree with GOG20 at the level of 1.5 km s−1 or lower. We note that it was at the level of 1 km s−1 or lower with EDR3. The EDR3 values are not reliable at G > 13 because there are too few stars. This limit is pushed to G > 15 for DR3. The dependence on the G magnitude seen in the data is not predicted by the model and might indicate some systematics in the data or in the model that are not yet understood.

Fig. 6.

Radial velocities averaged over the whole sky as a function of G magnitude for DR3 (blue), GOG20 (green), and EDR3 (pink).

2.1.3. Radial velocity uncertainties

The different methods that were used to compute the radial velocity (indicated by rv_method_used, see Katz et al. 2023) lead to different error distributions as a function of magnitude (Fig. 4). In particular, the limit of using one method or the other is GRVS = 12, which produces the plume of large errors at G ∼ 12.

We tested the uncertainties on radial velocity given by radial_velocity_error using 48 944 stars in 804 open clusters in which at least ten members with GRVS < 14 had radial velocities. Membership was derived using parallaxes and proper motions. The sample used for this test includes the open clusters catalogued by Cantat-Gaudin et al. (2020) and new clusters in Castro-Ginard et al. (2022). These clusters are typically closer and more populated than average: 70% are located within 2 kpc of the Sun, and 50% of them have more than 140 identified members. We computed the difference ΔRV between the radial velocity of each star and the bulk cluster radial velocity (defined as the median of RVS radial velocities), and we compared this value with the nominal uncertainty of each star. The results are shown in Fig. 7. In the ideal case, ΔRV/radial_velocity_error should follow a normal distribution (centred on zero and with a dispersion of 1), but Fig. 7 shows that bright stars (with high rv_expected_sig_to_noise and small radial_velocity_error) tend to have a much broader dispersion than faint stars. It should be noted that several effects can broaden the distribution, such as the gravitational redshift and convective blueshift, which affect stars of different spectral types differently, unrecognised binaries, non-members that are still present in the distribution, and the intrinsic internal dispersion of the clusters. While some of the effects are difficult to quantify, the internal velocity dispersion is about 0.5–1 km s−1 (see e.g., Torres et al. 2021). It is difficult to produce diagnostics on a per-cluster basis because the effect is revealed in a statistical way. However, the pattern seems to be identical for all clusters, and this is in favour of an instrumental effect.
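The per-cluster statistic described above can be sketched as follows. The inputs are plain lists of radial velocities and their nominal errors for the members of one cluster; the function names are hypothetical, and the robust dispersion (1.4826 times the median absolute deviation) is a choice made here for illustration, not necessarily the estimator used for Fig. 7.

```python
import statistics


def normalised_residuals(rv, rv_err):
    """Return (RV - cluster median RV) / error for the members of one cluster."""
    bulk = statistics.median(rv)
    return [(v - bulk) / e for v, e in zip(rv, rv_err)]


def robust_dispersion(values):
    """1.4826 * MAD: a robust estimate of the standard deviation."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return 1.4826 * mad


if __name__ == "__main__":
    # Toy cluster: five members with identical nominal errors of 0.5 km/s.
    rv = [10.0, 10.5, 9.5, 11.0, 9.0]
    err = [0.5] * 5
    res = normalised_residuals(rv, err)
    print(res, robust_dispersion(res))
```

A dispersion of the normalised residuals well above 1 in a given error bin would point to underestimated uncertainties, subject to the broadening effects listed in the text.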

Fig. 7.

Radial velocity uncertainties tested with open clusters. Top panel: absolute value of the difference between the radial velocity of a star and its cluster median |ΔRV| normalised by the radial_velocity_error. The black line is the lowess (locally weighted scatterplot smoothing). The slope of the lowess for lower values of radial_velocity_error indicates that the errors can be underestimated at the bright end (but see the text for a discussion). Bottom panel: difference between the radial velocity of a star and its cluster median ΔRV normalised by the radial velocity error in different radial velocity error bins.

Comparison with external catalogues and the wide binary catalogue of El-Badry et al. (2021) confirms that the errors are underestimated for GRVS < 12, but also for Teff < 4500 K and Teff > 6000 K (Fig. 8). As the external catalogues provide different estimates of the error underestimation because of the uncertainties in their own error estimates, we used the wide binaries to estimate a correction. To limit the impact of the gravitational redshift, we selected pairs of stars with similar colours and magnitudes, requiring differences smaller than 0.1 mag in GBP − GRP and G. To avoid the additional dependence on the temperature, we selected only systems in which both components lay within 4500 < rv_template_teff < 6000 K. We further removed 5σ radial velocity outliers. This led to a total of 2452 systems that could be used. Following the APOGEE and GALAH comparison, we split the fit into bright and faint regimes at GRVS = 12 mag (which corresponds to the magnitude separation between the different methods that were used to derive the radial velocity) and fitted a second-order polynomial for the factor fσ to be applied to the standard deviation,

$$ \begin{aligned} f_\sigma (G_\mathrm{RVS}) = a + b\,G_\mathrm{RVS} + c\,G_\mathrm{RVS}^2 , \end{aligned} $$ (1)

Fig. 8.

Standard error factor fσ that should be applied to radial_velocity_error as a function of magnitude (left) and temperature (right), estimated from the comparison with GALAH. In green we over-plot fσ estimated from the wide binaries (Eq. (1) and Table 1).

by maximising the product of the likelihoods of

$$ \begin{aligned} \frac{RV_1-RV_2}{\sqrt{(f_\sigma (G_\mathrm{RVS1}) \sigma _\mathrm{RVS1})^2+(f_\sigma (G_\mathrm{RVS2}) \sigma _\mathrm{RVS2})^2}} \end{aligned} $$ (2)

to be normally distributed. The coefficients we obtained are illustrated in Fig. 8 and are provided in Table 1. The bright side is not constrained, so the fit should not be extrapolated to GRVS < 8 mag. Based on the comparison with Soubiran et al. (2018), the value at GRVS = 8 mag seems to be a good estimate for GRVS < 8 mag. The Teff ranges showing a strong departure in Fig. 8 correspond to systematic offsets between rv_template_teff and the GALAH temperature. For cool stars, the deviation with APOGEE is found only for Teff < 4000 K. The effect of the random temperature template mismatch is included in the correction provided in Table 1, but the systematics are not, because we used an internal comparison. In the range 4500 < rv_template_teff < 6000 K, the median absolute deviation between rv_template_teff and the GALAH Teff is 250 K. When the correction of Table 1 was applied to wide binaries with hotter or cooler Teff templates, no correlation of an additional factor with rv_template_teff was detected.
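The way Eqs. (1) and (2) fit together can be sketched in code. The coefficients below are placeholders (fσ = 1 everywhere), not the values of Table 1, and must be replaced before any real use. The sketch also assumes that GRVS = 12 itself belongs to the bright regime and that the factor is held at its GRVS = 8 value for brighter stars, as the text suggests. The function names are hypothetical, and the log-likelihood is one plausible reading of "maximising the product of the likelihoods", not a reproduction of the authors' code.

```python
import math

# PLACEHOLDER coefficients (a, b, c), for illustration only. The real values
# are those of Table 1 of the paper and must be substituted before use.
COEFFS_BRIGHT = (1.0, 0.0, 0.0)  # assumed to apply for 8 <= GRVS <= 12
COEFFS_FAINT = (1.0, 0.0, 0.0)   # assumed to apply for GRVS > 12


def f_sigma(grvs, bright=COEFFS_BRIGHT, faint=COEFFS_FAINT):
    """Error inflation factor of Eq. (1), held constant below GRVS = 8."""
    g = max(grvs, 8.0)  # do not extrapolate the bright side
    a, b, c = bright if g <= 12.0 else faint
    return a + b * g + c * g * g


def normalised_difference(rv1, rv2, err1, err2, grvs1, grvs2, **kw):
    """Quantity of Eq. (2) for one wide binary."""
    s1 = f_sigma(grvs1, **kw) * err1
    s2 = f_sigma(grvs2, **kw) * err2
    return (rv1 - rv2) / math.hypot(s1, s2)


def neg_log_likelihood(pairs, **kw):
    """Minus the log of the product of Gaussian likelihoods over all pairs.

    Each pair is a tuple (rv1, rv2, err1, err2, grvs1, grvs2). The log(s)
    term penalises arbitrarily inflated errors; minimising this function
    over the coefficients is equivalent to maximising the likelihood.
    """
    total = 0.0
    for rv1, rv2, e1, e2, g1, g2 in pairs:
        s = math.hypot(f_sigma(g1, **kw) * e1, f_sigma(g2, **kw) * e2)
        total += 0.5 * ((rv1 - rv2) / s) ** 2 + math.log(s) + 0.5 * math.log(2.0 * math.pi)
    return total
```

With real coefficients, the corrected uncertainty of a single star would be `f_sigma(grvs) * radial_velocity_error`; any general-purpose minimiser could be applied to `neg_log_likelihood` to recover the coefficients from a sample of pairs.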

Table 1.

Coefficients to derive the standard error factor fσ that should be applied to radial_velocity_error according to Eq. (1) for GRVS > 8 mag.

2.2. Vbroad

The estimation of the line-broadening parameter, vbroad, is detailed in Frémat et al. (2023). The comparison of vbroad with external catalogues shows that values lower than ∼10 km s−1 are systematically overestimated, while higher values tend to be underestimated for FGK stars, as illustrated in Fig. 9. The comparison with GALAH (Zwitter et al. 2021) shows a similar behaviour. More details about the validation of the spectral line-broadening parameter can be found in Frémat et al. (2023).

Fig. 9.

Comparison of the spectral line broadening parameter with the De Medeiros et al. (2014) catalogue of FGK stars, colour-coded by the template temperature.

2.3. GRVS magnitude

The estimation of the GRVS magnitude, grvs_mag, is detailed in Sartoretti et al. (2023). The comparisons of GRVS with the HIPPARCOS magnitude and Tycho2 colours indicate no saturation issues. The comparison of GRVS with the Gaia G magnitude and GBP − GRP colour shows a change in behaviour at GRVS > 12. To illustrate this, we used here a sample of solar metallicity dwarfs selected from APOGEE DR16 (Ahumada et al. 2020) that had low extinction (A0 < 0.05 mag according to Lallement et al. 2019). An empirical robust spline regression was derived to model the global relation of G − GRVS versus GBP − GRP. The residuals from this spline are plotted as a function of magnitude in Fig. 10. The effect appears to be much larger than the internal variations observed with the G, GBP, and GRP magnitudes (Fabricius et al. 2021), but is still only at the 10 mmag level.

Fig. 10.

Residuals from a global relation GRVS − G = f(GBP − GRP) for a sample of APOGEE low-extinction solar metallicity dwarfs.

Figure 11 shows the relative difference between DR3 and GOG20 in the G = 12–13 magnitude range. The agreement is very good, except in the bulge and Galactic plane, where the excess of simulated stars is large. This is expected as only sources with unblended spectra were used to estimate GRVS (Sartoretti et al. 2023). Exploring these maps at other magnitude bins shows that the completeness for GRVS measurements is still high at G = 14 outside of the Galactic plane, and at fainter magnitudes, the star counts start to drop. In the Galactic plane, the data already start to be incomplete at G = 11.

Fig. 11.

Relative difference of the number of stars with a GRVS value between DR3 and GOG20 (DR3-GOG20)/DR3 in the magnitude range 12 < G < 13 in Galactic coordinates. −1 (+1) corresponds to a deficit (an excess) of 100% in DR3 data with regard to the GOG20 model.

2.4. RVS spectra

The main properties of the RVS spectra, available through the rvs_mean_spectrum datalink table, are described in Seabroke et al. (in prep.). The sky distribution of the sources with spectra, presented in Fig. 12, is non-uniform. There are patches with a higher density of sources, and some regions are basically empty. More details about how this sample was selected are given in Seabroke et al. (in prep.).

Fig. 12.

Galactic distribution of the sources for which RVS spectra are available in the HEALPix map of order 6. White patches are regions without sources.

The continuum calibration of the spectra was performed using different methods, which resulted in different continuum levels. For faint targets, GRVS > 12 (or rv_method_used = 2), the method set the median value of the flux at 1, which forces the continuum to be slightly above this value (see Fig. 13, top). For brighter targets (or rv_method_used = 1), the continuum is slightly below 1 (see Fig. 13, middle). When the target is red, the continuum is not flattened, and a positive slope is visible (see Fig. 13, bottom).

Fig. 13.

Three example spectra with three different continuum levels.

3. Spectrophotometry

Gaia DR3 provides low-resolution spectral data for about 220 million sources for the first time. These data consist of two sets of coefficients with the corresponding uncertainties and correlation matrices, available in the xp_continuous_mean_spectrum table through the datalink interface5. One set of coefficients is for the BP instrument, and the other set is for the RP instrument. The only exception is DR3 source_id = 5405570973190252288. This very red and faint source only has an RP spectrum. The coefficients are those of the expansion of the internal spectrum in basis functions, where the internal spectrum is expressed in units of electrons per second per pseudo-wavelength within the Gaia aperture as a function of pseudo-wavelength (De Angeli et al. 2023). Externally calibrated spectra can be obtained through the GaiaXPy tool6 (see also the cosmos pages7 for the configuration files that allow producing these externally calibrated spectra). For a subset of the spectra with G < 15, these externally calibrated sampled spectra are available directly in the xp_sampled_mean_spectrum table through the datalink interface. However, the direct usage of the coefficients is strongly recommended (De Angeli et al. 2023). GaiaXPy can also be used to transform an external spectrum into the Gaia XP (shorthand for BP and/or RP) continuous representation. A detailed description of the data is provided by De Angeli et al. (2023) for the internal spectra, and by Montegriffo et al. (2023) for the external spectra. Tests were performed to ensure the validity of the spectrophotometric data, both for internal and external spectra. The tests are described below.

3.1. Ensemble properties of the spectrophotometric data

Figure 14 shows the sky density distribution for all sources with xp_continuous_mean_spectrum in Gaia DR3. The natural variation in source density includes high densities along the Galactic plane, in particular towards the Galactic centre, and decreasing densities towards the Galactic poles. In addition to this natural variation, several distinct regions with a lower source density are seen. These artificial patterns in the sky distribution of the sources result from the selection process of spectra to be published in Gaia DR3, in particular, the requirement of at least 15 observations.

Fig. 14.

Density of Gaia DR3 sources in the sky with available xp_continuous_mean_spectrum in Galactic coordinates.

Figure 15 shows the distribution of the sources with XP spectra in the colour – apparent magnitude diagram. The natural increase in the number of sources at fainter apparent magnitudes is clearly visible. In addition to this, artificial structures are superimposed. For a G magnitude brighter than 17.65, all available XP spectra are included in Gaia DR3, while for fainter sources, only a subset is included, with a focus on red sources. This results in the break in the distribution at G = 17.65, and in the generally smaller number of sources at larger magnitudes, with a larger proportion of red sources. A detailed description of the selection process is provided by De Angeli et al. (2023).

Fig. 15.

Magnitude-colour diagram for sources for which xp_continuous_mean_spectrum is available in Gaia DR3.

3.2. Tests of source coefficients

The first test we performed on the coefficients of the XP spectra determined the stability of the representation of internal spectra. The internal BP and RP spectra are represented by a linear combination of basis functions, and the integrated flux of a source is thus a linear combination of the integrals of the basis functions over the entire real axis. The absolute values of these integrals are about one, therefore we might expect the absolute values of the coefficients of the spectrum of any particular source to not be significantly higher than the integrated flux of the source. Coefficients that are very large compared to the integrated flux would indicate that the source spectrum is a linear combination of basis functions that mostly cancel each other out, and would thus be an indicator of an unstable representation of the internal spectrum. We compared the absolute values of all coefficients of all sources with the integrated flux, and they are all lower than 3.8 times the integrated flux. In most cases, the values are significantly lower. We therefore see no indications for excessively large contributions from different basis functions to internal spectra that are cancelling each other out.

The basis functions used to represent BP and RP spectra are constructed such that they are efficient in representing typical stellar spectra (Carrasco et al. 2021; De Angeli et al. 2023). As a consequence, the broad structure of a spectrum is represented by the low-order basis functions, and detailed spectral patterns are represented by higher-order basis functions. The absolute values of the XP source coefficients should therefore decrease in general with the order of the coefficient. Figure 16 shows the distributions of the BP and RP coefficients, normalised with respect to the L2-norm, for all sources in Gaia DR3. In both instruments, most coefficients of the majority of XP spectra are close to zero. To study this further, we compared the sum of absolute values for the first five coefficients with the sum of the absolute values of the remaining higher-order coefficients. We computed the difference between the first and the second sum, with the uncertainty on the difference, and considered sources for which the difference was smaller than five times its uncertainty. Of all the sources with XP spectra in Gaia DR3, 26 037 sources have BP and 5470 sources have RP spectra that fulfil this criterion. The majority of these sources are concentrated in the Galactic plane and towards the direction of the Galactic centre. These sources may therefore be affected by crowding, resulting in a contamination of the spectra by flux from nearby sources and thus unexpected spectral shapes that require an unusual combination of basis functions. The larger number of BP sources as compared to RP may result from the larger number of faint sources in BP.
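The comparison of low-order and high-order coefficients can be sketched as below. The function name is hypothetical, and the uncertainty on the difference is computed here under the simplifying assumption of uncorrelated coefficient errors; the released spectra come with full correlation matrices, which a faithful implementation would propagate.

```python
import math


def low_vs_high_order(coeffs, errors, n_low=5, threshold=5.0):
    """Compare the summed |coefficients| of the first n_low bases with the rest.

    Returns (difference, uncertainty, flagged). A source is flagged when the
    low-order sum does not exceed the high-order sum by at least
    threshold * uncertainty. Uncorrelated errors are assumed (a simplification).
    """
    low = sum(abs(c) for c in coeffs[:n_low])
    high = sum(abs(c) for c in coeffs[n_low:])
    diff = low - high
    sigma = math.sqrt(sum(e * e for e in errors))
    return diff, sigma, diff < threshold * sigma


if __name__ == "__main__":
    # Toy spectrum with 55 coefficients dominated by the first few bases.
    coeffs = [100.0, 50.0, 20.0, 10.0, 5.0] + [1.0] * 50
    errors = [1.0] * 55
    print(low_vs_high_order(coeffs, errors))
```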

Fig. 16.

Source mean spectrum coefficients for all sources in the xp_continuous_mean_spectrum table for BP (top) and RP (bottom). The colour index indicates the source density.

The xp_summary table contains two parameters specifying the number of relevant coefficients for each source in BP and RP, respectively (bp_n_relevant_bases and rp_n_relevant_bases). All coefficients with indices larger than the specified number are considered to be consistent with being zero. Figure 17 shows the histogram of the number of relevant bases for BP and RP.

Fig. 17.

Number of relevant bases in the xp_summary table for BP (blue) and RP (red).

No source has zero or 54 relevant coefficients. In the first case, this is because 55 coefficients are assigned as relevant when no coefficient is found to be relevant and the spectrum is therefore consistent with random noise alone. The lack of 54 relevant coefficients arises because a single remaining coefficient does not allow the standard deviation to be computed (De Angeli et al. 2023).

3.3. Tests of the spectral shape

Due to the lower instrumental response at its edges, the flux values of the sampled internal spectra should be lower in the outer samples than in the central samples for sources with significant flux. These regions with low fluxes at either side are referred to as the wings of the spectra. To evaluate the behaviour of the spectra at the wings at either side, we integrated the fluxes over the pseudo-wavelength ranges [−∞, 0] and [0, 5] and compared them to the integrated flux over the interval [5, 10]. Analogously, on the other side of the spectra, the integrals over the pseudo-wavelength intervals [60, ∞] and [55, 60] were compared with the integral over the interval [50, 55]. For the comparison, the difference between the integrals was computed and normalised with respect to its uncertainty. The four normalised differences are smaller than five for 204–796 RP sources and for 88–7411 BP sources. As was already the case for the test of the decreasing coefficients, a small part of these sources is homogeneously distributed in the sky, while the majority is concentrated in the Galactic plane and in the direction of the Galactic centre. This indicates crowding and the resulting contamination of the XP spectra with flux from nearby stars as a reason for the non-decreasing spectral wings. The larger number of BP spectra in this test that do not meet the threshold as compared to RP spectra may be a result of the larger number of faint spectra in BP. Figure 18 shows the distribution of sources with normalised differences larger than five when all XP coefficients are used and when the representation is truncated at xp_n_relevant_bases. The truncation results in an increase in the number of sources above the chosen threshold, in particular, for very red sources. A possible reason might be that the truncation results in an underestimated error.

Fig. 18.

Comparison of the number of sources failing the small wings test in BP (blue/cyan) and RP (red/magenta) when all source coefficients are considered (solid lines) or only truncated coefficients are taken into account (dashed lines) as a function of colour.

Noise may cause parts of the spectrum to be negative, in particular, for faint sources. In order to determine the number of negative values in the sampled XP spectra, we defined the negativity of a spectrum as

$$ \begin{aligned} z = \frac{\int _{-\infty }^{\infty } \left|f(u)\right|\, \mathrm{d}u - \int _{-\infty }^{\infty } f(u)\, \mathrm{d}u}{2\, \int _{-\infty }^{\infty } \left|f(u)\right|\, \mathrm{d}u} . \end{aligned} $$

This measure for negativity z is zero if the sampled XP spectrum is positive at all values of the pseudo-wavelength u, and one if it is negative at all values of u. Figure 19 shows the distribution of sources as a function of the L1 norm of the spectrum and the value of z for BP and RP. The majority of sources follows a general trend of low negativity for large L1 norms and increasing negativity and a wider spread in the distribution as the norm decreases. The latter case corresponds to faint sources with increasing negativity due to noise. The more pronounced tail of sources with small L1 norms in BP results from the larger number of faint BP spectra as compared to RP. Only a small fraction of sources clearly lies beyond the general relation between the L1 norm and z. These outliers in general result from an over-subtraction of the background in the spectra, shifting its overall flux level towards negative values.
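For a spectrum sampled on a uniform pseudo-wavelength grid, the negativity can be approximated with simple sums. The following is a minimal sketch; the function name is illustrative, and the integrals over the real axis are replaced by sums over the available samples.

```python
def negativity(flux, step=1.0):
    """Negativity z of a spectrum sampled on a uniform pseudo-wavelength grid.

    The integrals are approximated by sums times the grid step. The step
    cancels in the ratio and is kept only to make the approximation explicit.
    """
    abs_int = sum(abs(f) for f in flux) * step
    if abs_int == 0.0:
        raise ValueError("spectrum is identically zero")
    net_int = sum(flux) * step
    return (abs_int - net_int) / (2.0 * abs_int)
```

The limiting cases match the definition: an all-positive spectrum gives z = 0, an all-negative one gives z = 1, and `negativity([3.0, -1.0])` gives 0.25, the fraction of the absolute flux that is negative.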

thumbnail Fig. 19.

Distribution of sources in the z versus L1 norm plane for BP (top panel) and RP (bottom panel).

3.4. Wiggling patterns

We tested whether truncation efficiently removes unnecessary wiggling patterns in the XP spectra. Here we considered 6377 main-sequence star members (Cantat-Gaudin et al. 2020) of 17 open clusters8. By considering only the members of these open clusters, we ensured that our testing sample is composed of stars with metallicities similar to the solar value. We employed XP spectra that were externally calibrated by GaiaXPy with the default constant wavelength step used for the xp_sampled_mean_spectrum table.

First, we defined a coefficient that measures the wiggling level in XP spectra. For each ith wavelength sampled portion of the spectrum, we calculated

$$ \delta^{n}_{i} = \frac{\left|f_i - \mathrm{mean}(f_{i-n}, f_{i+n})\right|}{\mathtt{phot\_g\_mean\_flux}}, $$(3)

where $f_i$ is the flux associated with the ith wavelength sample. The wiggling coefficient is thus defined as the average of the $\delta^{n}_{i}$ across the entire spectrum or a portion of it,

$$ w_{n} = \sum_{i} \frac{\delta^{n}_{i}}{N}, $$(4)

where N is the number of wavelengths over which $\delta^{n}_{i}$ is determined.

The wiggling coefficient w3 was calculated for each star within the spectral range 450–900 nm. This coefficient is higher when the spectra contain more undesired wiggles, but also for later spectral types, whose spectra typically contain more molecular bands. In order to ensure that we probed the wiggling and not real spectral features, we therefore defined a differential coefficient

$$ \Delta w_{3} = \log_{10}(w_{3}) - \log_{10}(\overline{w_{3}}), $$(5)

where w3 is the coefficient defined in Eq. (4) and measured for the jth star, while $\overline{w_{3}}$ is the coefficient measured on the average spectrum calculated over a sample of 100 dwarf stars with $|\mathtt{phot\_g\_mean\_mag} - \mathtt{phot\_g\_mean\_mag}_{j}| < 0.01$ mag and $|\mathtt{bp\_rp} - \mathtt{bp\_rp}_{j}| < 0.005$. By averaging the spectra of stars with very similar phot_g_mean_mag and bp_rp, we obtained a single spectrum that contains the typical absorption features found in spectra of stars similar to the jth star and that is cleaned from wiggles. Therefore, Δw3 is representative of the actual wiggling shown by the jth spectrum, without the contribution from molecular bands or any other spectral feature.
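Equations (3) to (5) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the validation code: the function names are hypothetical, mean(f_{i-n}, f_{i+n}) is read as the arithmetic mean of the two samples, the samples closer than n to either edge are skipped, and the input flux list is assumed to be already restricted to the spectral range of interest.

```python
import math

def wiggling_coefficient(flux, g_flux, n=3):
    """w_n of Eq. (4): mean of delta_i^n (Eq. (3)) over samples where i-n and i+n exist.

    flux: sampled spectrum; g_flux: phot_g_mean_flux of the source.
    """
    if len(flux) <= 2 * n:
        raise ValueError("spectrum too short for this n")
    deltas = []
    for i in range(n, len(flux) - n):
        neighbour_mean = 0.5 * (flux[i - n] + flux[i + n])
        deltas.append(abs(flux[i] - neighbour_mean) / g_flux)
    return sum(deltas) / len(deltas)

def differential_wiggling(w_star, w_reference):
    """Delta w_n of Eq. (5): w_star is measured on the star, w_reference on the averaged spectrum."""
    return math.log10(w_star) - math.log10(w_reference)
```

A perfectly linear spectrum gives a coefficient of zero in this sketch, while a spectrum alternating between two values gives the largest coefficient for n = 1.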

Figure 20 shows the cumulative histograms of the differential wiggling coefficients Δw3 derived for stars in different bins of phot_g_mean_mag. The coefficients derived from non-truncated spectra are plotted in the left panel, and those from truncated spectra are shown in the right panel. While fainter stars tend to have larger Δw3 in their non-truncated spectra due to the lower signal-to-noise ratio, this dependence is significantly reduced by truncation.

thumbnail Fig. 20.

Cumulative histogram of the differential wiggling coefficient Δw3 measured for stars with 0.5 < bp_rp < 0.7 and within different bins of phot_g_mean_mag. Left panel: Δw3 values measured in the non-truncated spectra, right panel: these values for truncated spectra.

Wiggling in XP spectra might be enhanced by strong spectral features. It is especially important to test this possibility in spectra of young accretors, which are typically characterised by strong Hα emission lines. Therefore, we used XP spectra from 197 members of the star-forming regions Chamaeleon I, IC 348, Lupus, NGC 2024, NGC 2068, ONC, Ophiuchus, and R Coronae Australis.

We defined a coefficient that measures the height of the Hα line,

$$ \mathrm{H}_{\mathrm{H\alpha}} = \frac{f_{\mathrm{H\alpha}}}{\mathrm{mean}(f_{(626{-}636)\cup (676{-}686)})}, $$(6)

where $f_{\mathrm{H\alpha}}$ is the flux measured at 656 nm, corresponding to the centre of the Hα line, while the mean flux in the denominator is measured at the base of the Hα line, that is, at all wavelengths within 626–636 nm and within 676–686 nm.
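A minimal Python sketch of Eq. (6) is given below for illustration. The function name `halpha_height` is hypothetical, and taking the sample nearest to 656 nm as the line-centre flux is an assumption that depends on the wavelength sampling of the spectrum.

```python
def halpha_height(wavelength_nm, flux):
    """Height of the H-alpha line (Eq. (6)) from a sampled, externally calibrated spectrum."""
    # Sample closest to the line centre (nearest-neighbour choice is an assumption).
    centre = min(range(len(wavelength_nm)),
                 key=lambda k: abs(wavelength_nm[k] - 656.0))
    # Samples at the base of the line, on both sides.
    base = [f for w, f in zip(wavelength_nm, flux)
            if 626.0 <= w <= 636.0 or 676.0 <= w <= 686.0]
    if not base:
        raise ValueError("no samples in the 626-636 nm or 676-686 nm windows")
    return flux[centre] / (sum(base) / len(base))
```

A featureless flat spectrum yields a height of one in this sketch, and an emission line yields a value above one.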

The differential wiggling coefficient Δw10 (see Eqs. (4) and (5)) is shown in Fig. 21 (left panel) as a function of $\mathrm{H}_{\mathrm{H\alpha}}$ for non-truncated spectra. Each error bar represents the standard deviation of the coefficient $\overline{w_{10}}$ measured on the comparison sample of 100 dwarfs. The plot shows that wiggling increases with the height of the Hα line. Therefore, we conclude that the presence of strong spectral features enhances wiggling in XP spectra.

thumbnail Fig. 21.

Differential wiggling Δw10 as a function of $\mathrm{H}_{\mathrm{H\alpha}}$. Non-truncated spectra are shown on the left, and truncated spectra are shown on the right.

In order to test whether truncation is able to fix or alleviate the problem, we repeated the experiment on truncated XP spectra. The results are shown in the right panel of Fig. 21, which indicates that truncation does not significantly remove the additional wiggling produced by strong Hα lines. Instead, we found that the height of the Hα line is affected by truncation. We observe that for 23% of stars with $\mathrm{H}_{\mathrm{H\alpha}}^{\mathrm{non-trunc}} > 1.1$, $\mathrm{H}_{\mathrm{H\alpha}}$ is reduced by more than 5% by truncation.

3.5. Tests of the integrated fluxes

The XP integrated photometry and the XP spectra follow different calibration procedures that have only low-level processing steps in common. Although some differences might occur between the fluxes integrated from the XP spectra and the integrated photometry, mainly because of potential differences in passband calibration and noise, we expect the two processes to give comparable results in principle. In order to test this, we computed the ratio of the photometric and spectrum flux.

The distribution of this ratio shows that most sources have a value close to one (Fig. 22). For BP, however, there is a significant population of sources with values higher than one. This might be a result of a threshold of one electron/s that was applied in the selection of transits in the integrated photometry. Transits with a flux below this threshold were excluded from the computation of the mean flux, resulting in a biased mean flux for faint sources (Riello et al. 2021). This threshold was not applied in the computation of the mean spectra, thus avoiding the bias towards too high fluxes and leading to a better behaviour at low BP fluxes for the integrated flux from the spectra.

thumbnail Fig. 22.

Histogram of the ratio r of the photometric and spectrum flux in BP (left) and RP (right).

We compared the flux uncertainties derived from photometric fluxes with those derived from the spectra. Although these flux uncertainties are similar, those derived from the photometric calibration tend to be slightly larger than those derived from the spectra. The ratio of the uncertainty on the XP photometric flux to the uncertainty on the flux resulting from integrating the spectra is shown in Fig. 23 for BP and RP as a function of BP and RP photometric magnitude. The shift towards larger photometric errors is clear, together with a dependence on the source magnitude. This behaviour might result from underestimated uncertainties, in particular for low-order coefficients in the source representation (De Angeli et al. 2023) to which the integration of the spectra is particularly sensitive. The distribution of the uncertainty ratio has strong tails towards extreme values. The photometric uncertainties of hundreds of sources are 100–1000 times larger than those derived from the spectra.

thumbnail Fig. 23.

Distribution of sources in decadic logarithm of the ratio of photometric and spectroscopic uncertainty and XP magnitude for BP (top panel) and RP (bottom panel).

3.6. Uncertainties of the XP coefficients

The analysis that we performed on the coefficients of the XP spectra tested whether their uncertainties were evaluated correctly. To do this, we compared pairs of stars using a chi-square,

$$ \chi^2 = (X_1-X_2)^{T}(C_1+C_2)^{-1}(X_1-X_2), $$(7)

where X1 and X2 are the coefficients of the two stars in either the BP or RP channel, while C1 and C2 are the associated covariance matrices. In order to ensure that we compared stars with the same metallicity and reddening, we applied Eq. (7) only to pairs of stars belonging to the same open cluster (i.e. membership probability ≥ 0.7 from Cantat-Gaudin et al. 2020). We also excluded all stars with ruwe > 1.4 from the comparison (to remove binaries) and stars belonging to open clusters younger than 100 Myr. This latter selection was necessary to avoid contamination due to differential extinction. Furthermore, the two stars must have G magnitudes and GBP − GRP colours that were consistent within their uncertainties. By applying all these selection criteria, we obtained a controlled sample of 1560 stellar pairs whose χ2 values we were able to derive.

The χ2 values were then used to calculate the associated p-values (the null hypothesis being that χ2 follows the expected chi-square distribution, the degree of freedom being the number of coefficients) separately for the BP and RP channels. If the χ2 indeed followed a chi-square distribution, the p-values should be distributed uniformly. The cumulative histograms of the p-value distributions of the 1560 stellar pairs are shown in Fig. 24 (blue lines). The p-values of 49% and 56% of the pairs are below 0.01 in BP and RP, respectively. This indicates that the uncertainties of the XP coefficients are underestimated.
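For illustration, the pair statistic of Eq. (7) and the fraction of failing pairs can be sketched with NumPy as below. The function names are hypothetical and this is not the validation code. The p-value itself would follow from the survival function of the chi-square distribution with as many degrees of freedom as coefficients (for instance `scipy.stats.chi2.sf`), which is left out of the sketch.

```python
import numpy as np

def pair_chi2(x1, c1, x2, c2):
    """Chi-square of Eq. (7) for two coefficient vectors and their covariance matrices."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    cov = np.asarray(c1, dtype=float) + np.asarray(c2, dtype=float)
    # Solve (C1 + C2) y = diff rather than forming the explicit inverse.
    return float(diff @ np.linalg.solve(cov, diff))

def failing_fraction(p_values, threshold=0.01):
    """Fraction of pairs whose p-value lies below the threshold."""
    p = np.asarray(p_values, dtype=float)
    return float(np.mean(p < threshold))
```

If the uncertainties were correctly estimated, the p-values would be uniform and only about 1% of the pairs would fall below a threshold of 0.01.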

thumbnail Fig. 24.

Zoom on the p-value distributions obtained for the two bands BP (left) and RP (right). The pairs with a p-value below 0.01 failed the test.

As a second test, we applied a more stringent criterion in the selection of the pairs that were to be tested. Specifically, we imposed that the G, GBP, and GRP magnitudes of the two stars must be consistent within their uncertainties. This reduced our sample to 501 pairs. The corresponding p-value distributions are plotted as orange lines in Fig. 24: 25% and 26% of these pairs fail our test in BP and RP, respectively.

When we applied the further condition that the magnitudes G1 and G2 must be fainter than 16 mag, we further reduced the sample to 437 pairs. The difference between this new sample, which is plotted as green lines in Fig. 24, and the previous sample is too small to observe significant effects in the p-value distribution. The fraction of pairs that fails our test now decreased to 22% and 24% in BP and RP, respectively.

Finally, we studied the p-value distributions obtained from the pairs composed of stars with the same number of bp_n_relevant_bases and rp_n_relevant_bases. In this way, we compared spectra with similar wiggling levels. Applying these criteria for the two bands separately, we obtained a sample of 148 and 109 pairs for BP and RP, respectively. The cumulative distributions are plotted in Fig. 24 as red lines. The fraction of stars that do not pass the test decreases to 21% for BP and 15% for RP, but it is still significant.

In order to estimate how much the errors are underestimated, we multiplied the covariance matrix by various factors and then repeated the experiment on the 437 pairs that are fainter than G = 16 mag. The resulting p-value distributions are shown in Fig. 25. The figure shows that the variances are underestimated by a factor that is between 1.2 and 1.5.
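As a worked note on this rescaling, and assuming that the same factor k is applied to both covariance matrices, Eq. (7) becomes

$$ \chi^2_k = (X_1-X_2)^{T}\left[k\,(C_1+C_2)\right]^{-1}(X_1-X_2) = \frac{\chi^2}{k}, $$

so that every χ2 value is divided by k and the associated p-values increase. A factor of 1.2 to 1.5 on the variances corresponds to a factor of roughly 1.1 to 1.2 on the standard uncertainties.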

thumbnail Fig. 25.

Same as Fig. 24 for the 437 pairs that are fainter than G = 16 mag, but have a covariance matrix (Cov) of one to three times its original value for the two bands BP (left) and RP (right).

However, a detailed study of the coefficient error underestimation is presented in De Angeli et al. (2023) by dividing the data into two groups of transits for the same source and comparing the obtained values. They show that the error underestimation depends on the coefficients. The lower-order coefficients lead to the highest underestimation.

3.7. Comparison with external spectra

Figure 26 shows the median flux difference, normalised by the errors, between the XP sampled and the CALSPEC9 spectra (Bohlin et al. 2014) for the sources in common. A dip at ∼600 nm is visible. Figure 27 presents the median normalised flux difference within 560 < λ < 620 nm with both CALSPEC and NGSL (Heap & Lindler 2016) as a function of magnitude. It shows that the strength of this dip is magnitude dependent and has a saturation effect. Figure 26 seems to suggest a difference in flux level between BP and RP, but it is not statistically significant in the CALSPEC or the NGSL sample. However, when the MILES library is used (Falcón-Barroso et al. 2011) and the MILES spectra are normalised to the absolute flux of the XP spectra in the common wavelength range, this difference in flux level becomes significant. The bluest wavelengths show a colour-dependent trend that is illustrated in Fig. 28. See also Montegriffo et al. (2023) for a discussion of these features.

thumbnail Fig. 26.

Median flux difference, normalised by the errors, between the XP sampled spectra and the CALSPEC spectra as a function of wavelength. Dotted lines correspond to the 1σ confidence interval.

thumbnail Fig. 27.

Median flux difference, normalised by the errors, within 560 < λ < 620 nm between the XP sampled spectra and the CALSPEC (black dots) or NGSL (grey dots) spectra as a function of magnitude.

thumbnail Fig. 28.

Median flux difference, normalised by the errors, within λ < 350 nm between the XP sampled spectra and the CALSPEC (black dots) or NGSL (grey dots) spectra as a function of colour.

4. Stellar astrophysical parameters

An overview of the Gaia DR3 astrophysical products produced by 13 different modules10 is presented in Creevey et al. (2023). The non-stellar content part of the astrophysical parameters is discussed in Sect. 5. This section focuses on the stellar content, which is presented in detail in Fouesneau et al. (2023).

The astrophysical parameters are available in two tables: astrophysical_parameters and astrophysical_parameters_supp. We present here only the validation results of the main parameters. In particular, the specialised modules (ESP; Creevey et al. 2023) are barely discussed here. The Outlier Analysis tables (oa_neuron_information, oa_neuron_xp_spectra) are not discussed here either. They were successfully checked for internal but not external consistency. Tests on the ESP and OA modules can be found in Fouesneau et al. (2023). GSP-Phot parameters (Andrae et al. 2023) were derived using several spectral libraries (MARCS, PHOENIX, A, and OB). The values obtained with these different libraries are presented in astrophysical_parameters_supp, while astrophysical_parameters contains the values that were obtained with what was selected as the best library, indicated in the field libname_gspphot. We mainly discuss the best library results of GSP-Phot here.

In this section, the comparisons with the Galaxy model use GUMS as a reference. In contrast to GOG, GUMS contains most of the astrophysical parameters, but they are error free.

4.1. DSC

The development of the discrete source classifier (DSC) was mainly driven by extragalactic source completeness (see the online documentation11 and Delchambre et al. 2023). The purity for QSOs and galaxies is discussed in Sect. 5. We did not find correlations between the classprob_dsc_binarystar probabilities and the Multiple Source Classifier (MSC) results or known binaries. White dwarfs are also often confused with hot main-sequence stars. We therefore advise against using the physical binary and white dwarf class probabilities12.

4.2. Extinction

The extinction is provided as the monochromatic extinction A0 at 541.4 nm by GSP-Phot (azero_gspphot), the hot star module ESP-HS (azero_esphs), and the multiple source module MSC (azero_msc). GSP-Phot and ESP-HS also provide AG and E(GBP − GRP).

The well-known and expected temperature-extinction degeneracy is discussed in Andrae et al. (2023), as is the effect of requiring the extinction to be positive on the mean of low-extinction regions. Andrae et al. (2023) also showed that the GSP-Phot extinction values azero_gspphot are globally overestimated with a saturation at 10 mag (by construction) in their comparison with Bayestar19 (Green et al. 2019). We find the same trend in our comparison with the monochromatic extinction A0 at 550 nm derived from APOGEE, Gaia, and 2MASS by Lallement et al. (2018) shown in Fig. 29. The Lallement et al. (2018) A0 are consistent with the AV values from StarHorse provided with APOGEE DR16 (Queiroz et al. 2020) with only the expected deviation for large extinctions between AV and A0 (see Sect. 11.2.3.1.4 of the online documentation). We further confirmed this overestimation of the GSP-Phot extinction values with the bstep extinctions provided with GALAH DR3 (Buder et al. 2021) and the Lallement et al. (2019) 3D extinction map. It is also confirmed with clusters in Fouesneau et al. (2023). For nearby stars, the extinction stays low because of the ad hoc extinction prior (Andrae et al. 2023). However, at high Galactic latitudes, 22% of the stars have azero_gspphot_lower > 0.16, the highest extinction value expected according to the map of Schlegel et al. (1998). As illustrated in Fig. 30, high extinction values occur at the bottom of the main sequence, for red giants (which can be confused with extincted hot stars), and also for some stars near GBP − GRP ∼ 0.2 and ∼2, which also have an impact on the temperature that is estimated for these stars.

thumbnail Fig. 29.

Density plot of the comparison of the monochromatic extinctions of GSP-Phot with those derived by Lallement et al. (2018). The green line corresponds to the 1.02 relation that is expected given the slight wavelength difference between the two A0.

thumbnail Fig. 30.

Hertzsprung-Russell diagram of low-extinction stars (A0 < 0.05 mag according to Lallement et al. 2019) with a relative parallax uncertainty lower than 10%, colour-coded with the mean extinction azero_gspphot. The colour is saturated as black for values higher than 1 mag. In this low-extinction sample, MG is simply G + 5 + 5 log10(ϖ/1000), with ϖ in mas.

The global overestimation of GSP-Phot extinction values naturally leads to an overestimation of the total Galactic extinctions provided in the table total_galactic_extinction_map_opt. This is shown in the comparison with Planck (Planck Collaboration Int. XLVIII 2016) in Fig. 31. As a0_uncertainty provides the error on the mean and can become very small when the number of stars becomes high, we recommend using a0_uncertainty × $\sqrt{\mathtt{num\_tracers\_used}}$ instead. While the overall appearance of the Galactic extinction map is as expected (large-scale dust filaments are clearly visible in the approximate expected relative intensity; see Delchambre et al. 2023), the A0 estimates are systematically overestimated in a large region of about 20 deg around the Galactic centre. At high Galactic latitudes, the uncertainties are large enough that the A0 overestimation is not visible in Fig. 31.

thumbnail Fig. 31.

Comparison of the total Galactic extinction map (a0) with the Planck E(B − V), normalised by the error: $R_N=(\mathtt{a0}-E(B-V)\times3.1)/(\mathtt{a0\_uncertainty}\times \sqrt{\mathtt{num\_tracers\_used}})$. The white area corresponds to locations with status > 0.

For the nearby star sample (ϖ > 20 mas), for which no significant extinction is expected, the multiple source classifier (MSC) gives a lower extinction bound azero_msc_lower > 0.05 for 37% of the sources. When compared to Traven et al. (2020), the MSC extinction is also globally overestimated. We advise that the GSP-Phot extinctions be preferred even for binary stars.

For the hot-star module, extinction azero_esphs can reach very high values for white dwarfs that are treated as hot main-sequence stars. We recommend filtering them out using a colour-absolute magnitude diagram.
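The recommended rescaling of a0_uncertainty and the normalised residual of Fig. 31 can be sketched in Python as follows. The function names are hypothetical; the inputs correspond to the a0, a0_uncertainty, and num_tracers_used fields of the extinction map and to an external E(B − V) value, and the factor 3.1 is the one quoted in the caption.

```python
import math

def a0_dispersion(a0_uncertainty, num_tracers_used):
    """Rescale the error on the mean by sqrt(N), as recommended in the text."""
    return a0_uncertainty * math.sqrt(num_tracers_used)

def normalised_residual(a0, ebv_planck, a0_uncertainty, num_tracers_used, r_v=3.1):
    """R_N: (a0 - r_v * E(B-V)) divided by the rescaled uncertainty."""
    return (a0 - r_v * ebv_planck) / a0_dispersion(a0_uncertainty, num_tracers_used)
```

For example, a pixel with a0_uncertainty = 0.01 mag built from 16 tracers has a rescaled uncertainty of 0.04 mag in this sketch.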

Another estimate of the extinction is provided through the diffuse interstellar band (DIB) at 862 nm that is present in the RVS spectra: dibew_gspspec. The details of the measurement are presented in Recio-Blanco et al. (2023), and details of its performance are reported in Gaia Collaboration (2023d). The DIB equivalent width correlates well with the extinction (Gaia Collaboration 2023d). The large number of outliers in wavelength seen with wide binaries (Appendix A) is due to wavelength clusters around 862.5 and 861.8 nm, most of which are removed when only sources are kept that have dibqf_gspspec < 2. This selection criterion is globally recommended for the DIB parameters (Gaia Collaboration 2023d).

4.3. Teff, log g, metallicity, and abundances

The comparison of the astrophysical parameters with the GUMS model is satisfactory within the model uncertainties outside the Galactic plane. The exception is the GSP-Phot metallicity, which follows extinction patterns, as expected from the extinction-temperature degeneracy.

Stars with 3000 < Teff < 8000 K are analysed by GSP-Phot using the MARCS and PHOENIX spectral libraries. Comparing the values of Teff, we find a median difference Teff(MARCS-PHOENIX) = −63 K (median absolute deviation, MAD = 145 K) due to the different temperature scale. Teff estimated using the best library compilation presents nonphysical clustering of points on the Hertzsprung-Russell (HR) diagram due to edge effects at the library borders. This is most evident for the OB library border at 15 000 K.

Figure 32 shows the comparison of the atmospheric parameters for the sources with values derived by both GSP-Phot (Andrae et al. 2023) and GSP-Spec (Recio-Blanco et al. 2023). The agreement between the two methods is reasonable for the temperature, shows an offset in the surface gravity for small log g, and has a very large dispersion for the global metallicity.

thumbnail Fig. 32.

Density plot of the comparison of the temperature (left), surface gravity (middle), and global metallicity (right) provided by GSP-Phot (y-axis) and GSP-Spec (x-axis). GSP-Phot has been filtered with parallax_over_error > 5 and teff_gspphot < 10 000. GSP-Spec parameters have been filtered with flags_gspspec[1,4,8,13] = 0 for Teff, flags_gspspec[2,5,8,13] = 0 for log g, and flags_gspspec[3,6,8] = 0 for [M/H]. The dashed green line shows the one-to-one correspondence. The median absolute deviation is indicated in each panel.
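The GSP-Spec flag selections quoted in the caption can be expressed with a small Python helper. This is a sketch under two assumptions that should be checked against the data model: flags_gspspec is handled as a string of digit characters, and the bracketed indices count characters starting from 1. The names `flags_are_zero`, `TEFF_FLAGS`, `LOGG_FLAGS`, and `MH_FLAGS` are hypothetical.

```python
def flags_are_zero(flags_gspspec, positions):
    """True if every listed flag character (1-based positions) equals '0'."""
    return all(flags_gspspec[p - 1] == "0" for p in positions)

# Positions quoted in the caption for each parameter.
TEFF_FLAGS = (1, 4, 8, 13)
LOGG_FLAGS = (2, 5, 8, 13)
MH_FLAGS = (3, 6, 8)
```

A source would then be kept for the temperature comparison when `flags_are_zero(flags, TEFF_FLAGS)` is true, and similarly for log g and [M/H].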

Figure 33 shows the comparison of the atmospheric parameters derived by GSP-Phot and GSP-Spec with APOGEE DR16 (Ahumada et al. 2020). The plots look similar for GALAH DR3. The large dispersion of the GSP-Phot metallicity explains the dispersion seen in Fig. 32. The GSP-Spec Teff is slightly overestimated for Teff > 5500 K versus APOGEE and GALAH, while GSP-Phot is not. GSP-Phot log g for small log g is overestimated. GSP-Spec log g is globally underestimated, and a correction is proposed in Recio-Blanco et al. (2023). GSP-Spec median offsets and dispersion versus external catalogues after proposed corrections are quite good and are provided in Recio-Blanco et al. (2023). We also caution the user against using GSP-Spec log g values for AGB stars.

thumbnail Fig. 33.

Density plot of the comparison of the temperature (top), surface gravity (middle), and global metallicity (bottom) provided by GSP-Phot (left) and GSP-Spec (right) with APOGEE DR16. GSP-Phot has been filtered with parallax_over_error > 10, and teff_gspphot < 10 000. GSP-Spec parameters have been filtered with flags_gspspec[1,4,8,13] = 0 for Teff, flags_gspspec[2,5,8,13] = 0 for log g, and flags_gspspec[3,6,8] = 0 for [M/H]. The RVS spectrum signal-to-noise ratio was not filtered.

The shift in metallicity for GSP-Phot shown in Figs. 32 and 33 is larger than the literature values for open clusters. In Fig. 34 we averaged (median) the mh_gspphot inside each cluster. We used only open clusters (i.e. of about solar metallicity). We find a trend of the difference with literature values (Gaia-literature) versus [M/H] literature. A zero-point difference of −0.55 is found. GSP-Phot metallicities should be used with caution, and ideally, with a calibration (see Andrae et al. 2023).

thumbnail Fig. 34.

Comparison of mh_gspphot with literature values from open clusters. We plot the median value for each cluster. The error bars show the dispersion around the median. The red line indicates the zero value.

All the GSP-Spec parameters are found to be correlated with magnitude and metallicity. Figure 35 illustrates this correlation for teff_gspspec using APOGEE DR16. The plot is similar to that for GALAH DR3. This correlation with magnitude and metallicity leads to other unexpected correlations with extinction or sky position that are shown in the comparison with the GUMS model.

thumbnail Fig. 35.

Correlation between the GSP-Spec parameters and magnitude (left) and metallicity (right) illustrated here with the temperature residuals compared to APOGEE DR16.

We also tested the correlation of the abundances with magnitude, temperature, log g, and logchisq_gspspec with open clusters and present it for NGC 7789 in Fig. 36. The plots show clear positive trends between mh_gspspec and the two stellar parameters teff_gspspec and logg_gspspec. This correlation is similar to that obtained with magnitude, as expected for a cluster in which temperature and gravity are correlated with magnitude. It is also consistent with what is seen in the comparison with external catalogues (Fig. 35). For alphafe_gspspec, we observe correlations of the opposite sign. The [M/H] and [α/Fe] calibrations proposed in Recio-Blanco et al. (2023) alleviate these trends, but they do not remove them completely.

thumbnail Fig. 36.

Abundance trends for mh_gspspec (top) and alphafe_gspspec (bottom) as a function of teff_gspspec (left column), logg_gspspec (middle column), and phot_g_mean_mag (right column) for the stellar members of NGC 7789. The symbols are colour-coded as a function of logchisq_gspspec. Circles indicate the uncorrected alphafe_gspspec values, and crosses represent the calibrated mh_gspspec and alphafe_gspspec values. The parameters have been filtered with flags_gspspec[1:7,9:13] = 0 and flags_gspspec[8] ≤ 2.

The GSP-Spec abundances do not correlate very well with external catalogue values in general (Recio-Blanco et al. 2023). However, calibration formulas are proposed in Recio-Blanco et al. (2023), and Gaia Collaboration (2023c) showed that they allow retrieving the expected chemo-kinematical correlations in the disc.

GSP-Spec ANN metallicities (available in astrophysical_parameters_supp) have underestimated uncertainties (Table A.1) and are offset by ∼−0.2 with respect to APOGEE DR16. See Recio-Blanco et al. (2023) for a proposed calibration of the GSP-Spec ANN parameters as well.

The MSC metallicity and gravity are overestimated compared to those obtained with GALAH (Traven et al. 2020). Hot stars are found to be assigned a temperature of about 7500 K due to the empirical calibration based on APOGEE. The poor convergence of the MSC values can be flagged as low logposterior_msc values. The test using wide binaries (Table A.1) indicates a low number of outliers for the metallicity, but the errors are so large that most of the possible values are covered. We advise using the MSC parameters with caution in general (see also Fouesneau et al. 2023 and the online documentation).

Tests using wide binaries (Table A.1) and open clusters show a strong underestimation of the errors for all GSP-Phot parameters with an associated large number of outliers. For GSP-Spec, they show an underestimation of the errors for mh_gspspec, alphafe_gspspec, cafe_gspspec, and crfe_gspspec, while for some other elements, the uncertainties are slightly overestimated. GSP-Spec values were discretised at two decimals, except for dibew_gspspec, which was discretised at three decimals, and Teff, which was stored as an integer. This might cause some parameters to have similar upper and lower values, in which case the discretisation step should be used as an uncertainty estimate.
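One possible way of applying the discretisation advice is sketched below in Python. The function name `uncertainty_with_floor` is hypothetical, and taking half the interval between the lower and upper values, floored at the discretisation step, is an interpretation made here rather than a prescription of the text. The step would be 0.01 for most GSP-Spec parameters, 0.001 for dibew_gspspec, and 1 for Teff.

```python
def uncertainty_with_floor(lower, upper, step=0.01):
    """Half-width of the [lower, upper] interval, never smaller than the discretisation step."""
    half_width = 0.5 * (upper - lower)
    return max(half_width, step)
```

With this sketch, a metallicity whose lower and upper values are both stored as −0.10 is assigned an uncertainty of 0.01 instead of zero.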

The published GSP-Phot MCMC samples contain 2000 points for G < 12, but only the last 100 points are made available for G > 12, except for a random 1% subset that was given the full 2000 points. When only 100 points are available, the upper or lower values of the GSP-Phot parameters, which were determined on the full 2000 steps, may not be fully consistent with the MCMC. This inconsistency is an indication of convergence issues. However, failed convergence usually does not relate to strong outliers, which are cases when the MCMC has converged to a very different solution. We find that 18% of the 2000-point chains present some problems such as multiple solutions, local maxima of the posterior probability, or edge effects. The MSC inflated their errors in post-processing, therefore the MSC MCMCs are not consistent with the provided upper or lower values.

4.4. Distance and absolute magnitude

The global distribution of GSP-Phot distances against parallax is shown in Fig. 37, where we consider the sources with ϖ/σϖ > 5. While a large fraction of sources follows the inverse parallax curve, 37% are 5σ outliers (considering only the parallax error). We measured the clustering in this space using the Kullback-Leibler divergence (KLD; Kullback & Leibler 1951; Fabricius et al. 2021), which is higher away from the plane and in particular around the Large and Small Magellanic Clouds (LMC and SMC).

thumbnail Fig. 37.

Global distribution of distance_gspphot against parallax for sources with ϖ/σϖ > 5. The solid black line represents the 1/parallax relation. The map (l, b) on the right shows the sky distribution of the clustering between the two parameters.

Strong outliers in the GSP-Phot distances are seen in the comparison of the estimates of the two wide binary components (Appendix A), even though the relative precision of the parallax is better than 20% in this sample. When the known cluster distances of Fouesneau et al. (2023) are used, the distances are shown to be systematically underestimated at large distances, where the relative parallax precision is poor. This is also confirmed with the APOGEE DR16 red clump sample. This seems to be due to a too strong prior (Andrae et al. 2023). Fouesneau et al. (2023) also show that the MSC distances distance_msc present a higher dispersion than the GSP-Phot distances, even for known binaries.

The GSP-Phot absolute magnitude estimate mg_gspphot is compared to the absolute magnitude computed directly from the parallax for a sample of stars with negligible extinction in Fig. 38. It shows the combination of distance outliers (leading to the strong outliers) and extinction overestimation for stars with MG ≳ 7 (leading to a bias; see Fig. 30). Moreover, mg_gspphot is not correctly estimated in stars farther away than 1–2 kpc as an effect of the underestimated distance. This is clearly illustrated in Fig. 39, where the distance modulus (m − M) is derived from distance_gspphot.

thumbnail Fig. 38.

Density plot of the difference between mg_gspphot and the absolute magnitude computed directly with the parallax for a sample of stars with negligible extinction (A0 < 0.05 according to Lallement et al. 2019) and a parallax_over_error > 10.

thumbnail Fig. 39.

Colour-magnitude diagram of NGC 6791 (left panel), mg_gspphot vs. GBP − GRP (central panel), and G vs. distance modulus (m − M) derived from distance_gspphot (right panel). The blue line in the right panel shows the literature value, and the green lines in the left and central panels show the PARSEC isochrone, which has the same parameters as the cluster.

We recommend using the deviation between the GSP-Phot distances and the parallax13 to filter GSP-Phot outliers. For a number of applications, it may be preferable to estimate the distance and absolute magnitude from the parallax (see Luri et al. 2018) rather than to use the GSP-Phot estimates.
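As a rough illustration of this recommendation, the Python sketch below computes an absolute magnitude directly from the parallax and a simple deviation statistic between distance_gspphot and the parallax. The function names and the exact form of the statistic (the difference between 1000/distance_gspphot and the parallax, in units of the parallax uncertainty) are assumptions made for this example. They are not the definition used in the validation, and the direct inversion of the parallax is only reasonable for precise, positive parallaxes (see Luri et al. 2018).

```python
import math


def abs_mag_from_parallax(g_mag, parallax_mas, a_g=0.0):
    """Absolute magnitude M_G = G + 5 log10(parallax / mas) - 10 - A_G.

    Only meaningful for positive parallaxes with a small relative error.
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    return g_mag + 5.0 * math.log10(parallax_mas) - 10.0 - a_g


def gspphot_parallax_deviation(distance_gspphot_pc, parallax_mas, parallax_error_mas):
    """Illustrative (assumed) outlier statistic.

    Difference between the parallax implied by distance_gspphot (in pc)
    and the measured parallax, in units of the parallax uncertainty.
    """
    implied_parallax_mas = 1000.0 / distance_gspphot_pc
    return (implied_parallax_mas - parallax_mas) / parallax_error_mas
```

A source could then be treated as a GSP-Phot distance outlier when the absolute value of this statistic exceeds a threshold chosen for the science case.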

4.5. Stellar evolution parameters

The stellar evolution parameters radius, mass, age, evolution stage, and gravitational redshift are provided by the FLAME module (Creevey et al. 2023). They are derived either from GSP-Phot parameters (fields named _flame in the table astrophysical_parameters) or from GSP-Spec parameters (fields named _flame_spec in astrophysical_parameters_supp). For the mass, age, and evolutionary stage, they use solar metallicity evolution models. The estimates of these parameters for non-solar metallicity stars should therefore be used with caution.

Figure 40 shows the comparison between the FLAME radius and the radius from the JSDC stellar diameter catalogue (Bourges et al. 2017, v2, selecting stars with χ2 < 2) for stars with relative parallax uncertainties smaller than 10%. The parallax is used to transform the JSDC angular diameter into radius. The radius derived by FLAME using GSP-Spec Teff, radius_flame_spec, is overestimated for blue main-sequence stars and for red giants, but it is underestimated for very red giants (GBP − GRP > 2.2). The radius derived by FLAME using GSP-Phot Teff, radius_flame, has the same properties, but fewer outliers than radius_gspphot provided directly by GSP-Phot because FLAME directly uses the parallax to derive the luminosity for this sample with a good parallax signal-to-noise ratio (see flags_flame). We therefore recommend using radius_flame for a radius estimation.
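The conversion of an angular diameter into a linear radius with the parallax can be sketched as follows. This is a minimal example that assumes the angular diameter and the parallax are both given in mas and that the parallax can simply be inverted, which is acceptable only for small relative parallax uncertainties such as the 10% limit used here. The function name and constants are chosen for the example.

```python
import math

MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)  # milliarcsecond in radians
PC_IN_M = 3.0856775814913673e16                   # parsec in metres
RSUN_IN_M = 6.957e8                               # nominal solar radius in metres


def radius_from_angular_diameter(theta_mas, parallax_mas):
    """Linear radius in solar radii from an angular diameter and a parallax.

    R = (theta / 2) * d, with d = 1000 / parallax in pc.
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    distance_m = (1000.0 / parallax_mas) * PC_IN_M
    return 0.5 * theta_mas * MAS_TO_RAD * distance_m / RSUN_IN_M
```

For instance, a star with an angular diameter of about 0.93 mas at a distance of 10 pc (parallax of 100 mas) has a radius close to one solar radius.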

Fig. 40.

Comparison between JSDC radius and the FLAME radii based on GSP-Phot (left) or GSP-Spec (right), colour-coded with the GBP − GRP colour.

Masses from FLAME compare well with asteroseismic estimates for dwarfs and subgiants (using Serenelli et al. 2017; Godoy-Rivera et al. 2021), but strong outliers are seen for giants (using Yu et al. 2018). Comparison with the GUMS model confirms the presence of a high-mass tail in the FLAME data that is not predicted by the model. This tail is present in all Galactic directions, even at high latitudes. It is associated with an excess of young stars. These young (< 2 Gyr) and massive (> 2 ℳ⊙) stars are on the giant branch. We therefore recommend that the FLAME masses of sources whose flags_flame[_spec] first character is 1 (giant flag) be used only within the 1−2 ℳ⊙ range, with caution, and taking their large uncertainties into account.
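The recommendation on the giant flag can be expressed as a simple selection function. The sketch below is only illustrative: the function name is hypothetical, and the zero-padding step assumes that a flag read as an integer (e.g. 0 or 10) corresponds to a two-character string ("00" or "10").

```python
def passes_giant_mass_recommendation(flags_flame, mass_msun):
    """Return False for giants (first flag character '1') outside 1-2 solar masses.

    Sources that are not flagged as giants are not affected by this cut.
    """
    flags = str(flags_flame).zfill(2)  # assumed two-character flag
    is_giant = flags[0] == "1"
    if not is_giant:
        return True
    return 1.0 <= mass_msun <= 2.0
```

The masses that pass this cut should still be used with caution and with their uncertainties.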

The overestimation of the GSP-Phot extinction for low-mass stars (MG ≳ 7, Fig. 30) also has an impact on the FLAME parameters. The impact on masses ℳ ≲ 0.7 ℳ⊙ is illustrated in Appendix D of Gaia Collaboration (2023a). It has an impact on the luminosity similar to what is shown in Fig. 38.

Strong outliers in the luminosity of giants are visible in the APOGEE red clump sample. A few mismatches of the evolutionary stage may also occur for giants that are confused with dwarfs with high extinction. They can be spotted in an HR diagram using an independent extinction estimate.

The gravitational redshift determined by FLAME is compared to the one used in GALAH DR3 (Zwitter et al. 2021) in Fig. 41 for sources with a gravitational redshift error from FLAME lower than 1 km s−1. The gravitational redshift based on GSP-Spec (gravredshift_flame_spec) has fewer outliers than the redshift based on GSP-Phot (gravredshift_flame), but it has a small bias of 0.05 km s−1, corresponding to the bias in log g discussed above.

Fig. 41.

Density plot of the comparison between the GALAH gravitational redshift and the FLAME redshifts based on GSP-Phot (left) or GSP-Spec (right).

5. QSO and galaxies

Gaia DR3 includes two tables of extragalactic candidate sources, one for quasars and one for galaxies, called qso_candidates and galaxy_candidates (QSO and galaxy tables, for simplicity). These tables contain two main types of added-value columns: on the one hand, the different labels that are provided can be used to tune the purity-to-completeness ratio of the sample, and on the other hand, each table also contains physical properties of the objects such as redshift, size, or variability. The astrophysical parameters (Creevey et al. 2023) associated with these tables, that is, classification and redshifts, are described in Delchambre et al. (2023); the surface brightness profiles are described in Ducourant et al. (2023); and the variability is presented in Rimoldini et al. (2023) and, for AGNs, in Carnerero et al. (2023). Moreover, a global analysis of these tables is presented in Gaia Collaboration (2023b). It is worth noting that these tables have been constructed with the aim of completeness, and as we show below, this means that their default purity is rather low. However, it is possible to obtain a high-purity subsample (Gaia Collaboration 2023b), as discussed below.

5.1. Purity

The different labels that are included within the QSO and galaxy tables can be used to create subsamples with different properties (see Sect. 8 of Gaia Collaboration 2023b for a selection leading to 94% and 95% purity in the QSO and galaxy tables, respectively). The QSO and galaxy tables contain four common labels: vari_best_class_name = “AGN”/“GALAXY” (classification according to the stellar variability patterns; Rimoldini et al. 2023), classlabel_dsc = “quasar”/“galaxy” (from the discrete source classifier, DSC; Creevey et al. 2023), classlabel_dsc_joint = “quasar”/“galaxy” (similar to the previous label, but more restrictive because it requires DSC-Specmod and DSC-Allosmod to agree, both with a score higher than 50%), and classlabel_oa14 (assigned by the self-organising map, SOM; Creevey et al. 2023). It is worth noting that the results of the SOM were not used to construct these tables. In other words, this label was attached to the sources that were previously selected as candidates by other means.

In addition to these shared class labels, we can also identify unique labels for each table. In the QSO table we have access to the astrometric_selection_flag (ASF), which allows us to select only sources with a high probability of being quasars based on their astrometry. We can also use the source_selection_flags to isolate the QSOs in the qso_catalogue_name table (bit 3 set to 1), hereafter called QuasarObject list, which effectively corresponds to the sources that are found in well-known QSO catalogues that had enough raw data to be processed successfully by the QSO pipeline. Finally, in the galaxy table, we can select the sources whose morphology was fit reliably. We refer to this subset as EO solution.

In Fig. 42 we show the astrometric properties, normalised by their formal uncertainties, of the different subsamples described above. Because extragalactic sources are so far away from the Sun, the astrometry of these objects might be expected to be dominated by observational errors. This is what we indeed observe for some of the subsamples, for example those in panels a and c, and also in panel e in the QSOs and panel b in the galaxies. However, the other subsamples show clear deviations from the expected standard normal probability distribution. These deviations in the DSC and OA subsamples are due mostly to the astrometric signal of the Magellanic Clouds and the Galactic disc (Fig. 43). We note, however, that while ∼94% of the sources in the QSO table have a 5p or 6p astrometric solution, the galaxy table is mostly dominated by 2p sources (∼71%). Therefore, the conclusions we draw concern only a portion of the galaxy table. In either case, it is clear that some subsamples, namely vari_best_class_name and classlabel_dsc_joint, are purer than others (classlabel_dsc and classlabel_oa) that were built for completeness.

Fig. 42.

Astrometric properties of the different subsamples contained within the QSO (top) and galaxy (bottom) candidate tables. Each panel contains the distribution of parallaxes and proper motions, normalised to their errors. The grey line corresponds to a normal distribution.

The sky distribution of the sources in these subsamples is presented in Fig. 43. These plots are difficult to relate directly to the purity because some modules remove the LMC, the SMC, and the disc plane by force or using criteria that depend on the density, while others, such as DSC, do not. Moreover, a constant misclassification rate over the sky leads to a higher density of misclassified objects where the object density is higher. However, in the LMC and SMC areas, more than 10% of the sources are in the qso_candidates table, so here the classification does not work well.

Fig. 43.

Sky distribution of the different subsamples contained within the QSO (left) and galaxy (right) candidate tables.

A good idea of the main stellar types of the stellar contaminants in the QSO and galaxy tables can be obtained by positioning those with a relative parallax uncertainty lower than 20% in an HR diagram (Fig. 44). It shows that the QSO candidate stellar contaminants are mainly white dwarfs and stars with GBP − GRP ∼ 0.4, while galaxy candidate stellar contaminants are mainly stars with GBP − GRP ∼ 1.4 or 0.8. Figure 44 also shows that the criteria for the purer samples proposed in Gaia Collaboration (2023b) are efficient, but still retain a few contaminants. We use here host_galaxy_detected = ‘true’ instead of host_galaxy_flag < 6 because the latter leads to eight times more contaminants in our sample. These additional contaminants come from the EO input catalogue, but no host galaxy has been detected for them. By construction, the entire QSO sample of Fig. 44 has astrometric_selection_flag = ‘false’.

Fig. 44.

Gaia DR3 low-extinction HR diagram (grey scale). The position of sources in the QSO (top) and galaxy (bottom) candidate tables with parallax_over_error > 5 is overplotted in red, with an intensity that scales with the square root of the number of sources. Coloured points correspond to the stricter selection of candidates proposed in Gaia Collaboration (2023b).

5.2. Morphological parameters

We compared the extended object morphological parameters of the galaxy table (Ducourant et al. 2023) with the GAMA (Kelvin et al. 2012) and Dark Energy Survey (DES; Tarsitano et al. 2018) Sérsic profiles and with the SDSS DR16 (Ahumada et al. 2020) de Vaucouleurs profiles. Figure 45 illustrates the comparison with the DES Sérsic profiles. Saturation of the effective radius radius_sersic and radius_de_vaucouleurs at 8000 mas and of the Sérsic n-index at 8 is visible. It corresponds to the boundaries of the algorithm. Compared to DES, the Gaia DR3 index seems to be spread out more or less uniformly, with DES preferring n = 4. Essentially, the comparison with external catalogues of galaxy profiles shows an overestimation of the Sérsic index and an underestimation of the ellipticity. Both are a consequence of the fact that Gaia observes a smaller area around the galaxies than external catalogues, so that its measurements favour the central regions and are biased towards bulges (Ducourant et al. 2023).

Fig. 45.

Comparison of the Sérsic index (panel a) and ellipticity (panel b) from DES with the Gaia measurements.

The morphological parameters are accompanied by their formal uncertainties. Since these are estimated from the variance resulting from the search for a minimum in the residuals between model and observations, the provided uncertainties reflect the quality of the convergence rather than the precision of the estimation. As a consequence, a fraction of sources may appear to have extremely small uncertainties, while in reality this is just the byproduct of a correlation with the convergence velocity.

5.3. Redshifts

The provided galaxy redshift upper and lower values do not correspond to confidence intervals, but to prediction limits based on machine learning. Still, the comparison with external catalogues indicates that (redshift_ugc_upper − redshift_ugc_lower)/2 gives a good estimate of the uncertainty. A redshift peak at about 0.07 (red arrow in the orange histogram in Fig. 46) is found, which corresponds either to very bright galaxies or to stellar contaminants with convergence issues. The redshift range 0.070–0.071 should therefore be ignored (Delchambre et al. 2023). A global overestimation of the redshifts for bright sources (G < 19) is also observed.
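The two pieces of advice in this paragraph can be combined in a small helper, sketched here in Python. The function name is hypothetical, and treating the bounds of the 0.070–0.071 range as inclusive is an assumption of this example.

```python
def galaxy_redshift_with_uncertainty(redshift_ugc, redshift_ugc_lower, redshift_ugc_upper):
    """Return (redshift, sigma), or None if the redshift lies in the 0.070-0.071 range.

    sigma is the half-width of the prediction interval, which the comparison
    with external catalogues indicates is a good uncertainty estimate.
    """
    if 0.070 <= redshift_ugc <= 0.071:  # range to be ignored (bounds assumed inclusive)
        return None
    sigma = 0.5 * (redshift_ugc_upper - redshift_ugc_lower)
    return redshift_ugc, sigma
```

Sources for which the function returns None can simply be dropped from a redshift analysis.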

Fig. 46.

Redshift distribution of the QSO (blue) and galaxy (orange) candidate sources. The dotted blue line corresponds to the quasars that were selected with the recommended QSOC redshift flags (flags_qsoc = 0 or flags_qsoc = 16). The two peaks marked by the red arrows are discussed in the text.

The QSO redshifts are log-normally distributed. To compare them to the literature, we therefore used Z = log(redshift_qsoc + 1), which is normally distributed with a standard deviation of σ = (log(redshift_qsoc_upper + 1) − log(redshift_qsoc_lower + 1))/2 (Delchambre et al. 2023). The comparison with LQAC5 (Souchay et al. 2019) presents 33% of outliers at 5σ, which reduces to 8% when the flag flags_qsoc = 0 or flags_qsoc = 16 is used. This is due to the degeneracies between spectral lines and redshift in the XP spectra (see Delchambre et al. 2023; Gaia Collaboration 2023b). A peak, this time at about 0.08, is also visible in the redshift distribution of the QSO (red arrow in the blue histogram in Fig. 46). The reason for this peak is that the MgII emission line is misclassified as Hβ, a characteristic emission line of this specific redshift range (Delchambre et al. 2023). However, only a small number of sources contributes to this peak, and most of them have a non-zero flags_qsoc.
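A comparison of this kind can be sketched as follows. The function names are hypothetical, and the natural logarithm is used as an assumption. Because the same base enters Z and σ, the normalised difference does not depend on this choice.

```python
import math


def qso_redshift_nsigma(redshift_qsoc, redshift_qsoc_lower, redshift_qsoc_upper, redshift_lit):
    """Normalised difference between the Gaia and a literature redshift.

    Uses Z = log(z + 1) and sigma = (log(z_upper + 1) - log(z_lower + 1)) / 2.
    """
    z_gaia = math.log(redshift_qsoc + 1.0)
    z_lit = math.log(redshift_lit + 1.0)
    sigma = 0.5 * (math.log(redshift_qsoc_upper + 1.0) - math.log(redshift_qsoc_lower + 1.0))
    return (z_gaia - z_lit) / sigma


def has_recommended_qsoc_flags(flags_qsoc):
    """True for the flag values quoted in the text (0 or 16)."""
    return flags_qsoc in (0, 16)
```

A source would count as a 5σ outlier when the absolute value returned by qso_redshift_nsigma exceeds 5.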

6. Non-single stars

Gaia DR3 provides four tables for non-single stars (NSSs). The table nss_two_body_orbit contains orbital two-body models, covering astrometric (Halbwachs et al. 2023; Holl et al. 2023a), spectroscopic (Gosset et al., in prep.; Damerdji et al., in prep.), and eclipsing (Siopis et al., in prep.) binaries as well as combinations of these. The model that is used is indicated in the field nss_solution_type, and the parameters that are solved for a given solution are described in the bit_index field15. The tables nss_acceleration_astro and nss_non_linear_spectro contain astrometric (Halbwachs et al. 2023) and spectroscopic (Gosset et al., in prep.) acceleration solutions, and nss_vim_fl contains variability-induced mover (VIM) solutions (Halbwachs et al. 2023). Gaia Collaboration (2023a) also present the overall content of these non-single star tables.

6.1. Astrometric orbital elements

The orbital solutions for the astrometric binaries are presented using the so-called Thiele-Innes coefficients. They express the orbital motion of the photocentre on the sky with a linear formulation. These coefficients replace the more usual Campbell elements a0, i, ω, and Ω, which are the semi-major axis, inclination, argument of periastron, and position angle of the ascending node, respectively. The relations between the two parameter sets are described in Halbwachs et al. (2023).
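For orientation, the Python sketch below implements the standard textbook relations between the Campbell elements and the Thiele-Innes coefficients A, B, F, and G, together with their inversion. It is a point-estimate conversion only and does not propagate uncertainties. The function names are hypothetical, angles are in radians, and the choice of Ω in the range [0, π) is an assumption that reflects the usual ambiguity between (ω, Ω) and (ω + π, Ω + π) for astrometry alone. The authoritative formulas for Gaia DR3 are those of Halbwachs et al. (2023).

```python
import math


def campbell_to_thiele_innes(a0, inc, omega, node):
    """Thiele-Innes coefficients (A, B, F, G) from Campbell elements (angles in radians)."""
    co, so = math.cos(omega), math.sin(omega)
    cn, sn = math.cos(node), math.sin(node)
    ci = math.cos(inc)
    a = a0 * (co * cn - so * sn * ci)
    b = a0 * (co * sn + so * cn * ci)
    f = -a0 * (so * cn + co * sn * ci)
    g = -a0 * (so * sn - co * cn * ci)
    return a, b, f, g


def thiele_innes_to_campbell(a, b, f, g):
    """Campbell elements (a0, inc, omega, node) from Thiele-Innes coefficients."""
    u = 0.5 * (a * a + b * b + f * f + g * g)
    v = a * g - b * f                      # equals a0^2 * cos(inc)
    a0 = math.sqrt(u + math.sqrt(max((u + v) * (u - v), 0.0)))
    inc = math.acos(max(-1.0, min(1.0, v / (a0 * a0))))
    omega_plus_node = math.atan2(b - f, a + g)
    omega_minus_node = math.atan2(-b - f, a - g)
    omega = 0.5 * (omega_plus_node + omega_minus_node)
    node = 0.5 * (omega_plus_node - omega_minus_node)
    if node < 0.0:                         # enforce the assumed convention 0 <= node < pi
        node += math.pi
        omega += math.pi
    omega %= 2.0 * math.pi
    return a0, inc, omega, node
```

A round trip through both functions recovers the input elements as long as the input Ω lies in [0, π) and the orbit is neither exactly face-on nor exactly edge-on in a degenerate way.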

In the transformation from Thiele-Innes to Campbell coefficients, it may be tempting to use Monte Carlo simulations that take the correlation matrix into account instead of using local linear approximation formulas. In 87% of the NSS sample, Gaussian errors in Thiele-Innes coefficients are transformed into asymmetric distributions for at least one of the Campbell elements. However, mostly in the case of very low eccentricities, a number of sources show a significance parameter that disagrees with the signal-to-noise ratio derived from Monte Carlo simulations (Fig. 47). This seems to be due to an overestimation of the Thiele-Innes coefficient errors. Despite this, the local linear approximation formulas work well in deriving the error on the a0 parameter, even with a very strong overestimation of the Thiele-Innes coefficient errors. Because Orbital solutions are filtered to have significance > 5, using a Gaussian error model for a0 is reasonable, whereas OrbitalTargetedSearch solutions need to be filtered. Due to an issue with the significance of AstroSpectroSB1 (see the online documentation), it needs to be verified for these solutions that the signal-to-noise ratio is higher than 5. The issue is also present for the spectroscopic part of the AstroSpectroSB1 solutions, for which local linear approximation errors on a1 can be used as soon as the resulting signal-to-noise ratio is confirmed to be higher than 5. The local linear approximation formulas for the Thiele-Innes coefficients can be found in the appendix of Halbwachs et al. (2023). Overall, to handle the Thiele-Innes coefficients, usual Monte Carlo techniques such as MCMC should not be used. Codes using automatic differentiation such as ADMB (Fournier et al. 2012) and TMB (Kristensen et al. 2016) have been tested and work well for signal-to-noise ratios higher than 5.

Fig. 47.

Density plot of the signal-to-noise ratio of the semi-major axis of the photocentre orbit (a0) derived from a Monte Carlo method as a function of the value provided in the field significance.

The covariance matrix for very low eccentricity solutions may be problematic. In these cases, the eccentricity and periastron time should be set to zero. For AstroSpectroSB1 with eccentricity and argument of periastron fixed to zero (bit_index = 65435), c_thiele_innes is fixed to the non-circular value instead of zero. The statistical properties of the distribution of the orbital elements are discussed in the appendix of Gaia Collaboration (2023a).

6.2. External comparisons

The comparison with external catalogues16 shows that the orbital parameters agree well with literature values when the periods are consistent. It also confirms that the center_of_mass_velocity agrees better with literature binary values than with the gaia_source.radial_velocity. The strongest disagreements with external catalogues on the radial velocity semi-amplitude of the primary are for stars that are known to be SB2 (double-line spectroscopic binary), but are treated as SB1 (single-line) by NSS (Gosset et al., in prep.).

The comparison of the literature orbits with astrometric acceleration solutions in the nss_acceleration_astro table indicates that a significant fraction might have had an orbital solution and that some Acceleration7 could have been Acceleration9. This is intrinsic to the decision chain explained in Halbwachs et al. (2023). The acceleration values disagree with the expectations from the known orbits and the Gaia observation times. The acceleration values should therefore be used with caution.

The NSS parallaxes show a median difference with the gaia_source parallaxes that is smaller than a few μas. The HR diagram derived using NSS parallaxes (orbital and acceleration) is slightly sharper than the diagram derived with the gaia_source parallaxes, which indicates that the parallaxes are slightly more precise. The (statistical) improvement of the solutions does not guarantee that the accelerations are all physical, however. When they are compared to the long-term proper motion provided in the HIPPARCOS-Gaia catalogue of accelerations (Kervella et al. 2022; Gaia Collaboration 2023a), the NSS proper motions improve versus the gaia_source proper motions for the orbital solutions (moving from 21% of 5σ outliers to 9%), but not the acceleration solutions (which have a much higher median signal-to-noise ratio of the proper motion anomaly than the orbital solutions) for which the comparison is slightly worse (moving from 80% outliers to 85%). This highlights that proper motion and accelerations may both have absorbed the orbital motion.

The temperature ratio of eclipsing binaries corresponds well to the ratio derived by Eker et al. (2014), except for sources with a low g_luminosity_ratio. The correspondence with the MSC temperature ratio is poor, but these ratios are to be used with caution (see Sect. 4.3). The uncertainties on the eclipsing binary inclinations are suspiciously small.

6.3. Spurious solutions and error rescaling

To achieve the required radial velocity precision, the precise position of the spectra at the epoch on the focal plane needs to be known. For this purpose, the expected astrometric position as given by the predicted standard astrometric motion is used, rather than the measured astrometric position at the epoch, which would not be precise enough. However, if the astrometric motion is perturbed, or if the astrometric solution is not correct, then the computed epoch RV will absorb this astrometric perturbation. This means that the epoch radial velocities could increase by up to ≈0.146 × astrometric_excess_noise (km s−1) in the case of binaries. At best, this adds an unmodelled additional dispersion; at worst, it may also introduce a small trend. This may lead to spurious SB1 solutions with a short period and a large ruwe, as well as to spurious solutions around the precession period (62.97 days); see Gaia Collaboration (2023a).
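The order of magnitude of this effect for a given source can be evaluated with a few lines of Python. The sketch below assumes that astrometric_excess_noise is expressed in mas, as in the gaia_source table. The function names and the comparison with a user-supplied radial velocity uncertainty are illustrative choices, not part of the validation procedure.

```python
def max_epoch_rv_perturbation_kms(astrometric_excess_noise_mas):
    """Upper bound quoted in the text: about 0.146 km/s per mas of excess noise."""
    return 0.146 * astrometric_excess_noise_mas


def rv_may_be_perturbed(astrometric_excess_noise_mas, rv_uncertainty_kms):
    """Hypothetical check: does the bound exceed the given RV uncertainty?"""
    return max_epoch_rv_perturbation_kms(astrometric_excess_noise_mas) > rv_uncertainty_kms
```

For example, an excess noise of 2 mas corresponds to a possible perturbation of up to about 0.29 km s−1.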

The absence of a gaia_source.radial_velocity value for an SB1 solution should alert the user: the source might have been considered peculiar, potentially SB2, too hot, too cool, with emission lines, or contaminated by a nearby star. Radial velocity variations can also be due to stellar pulsations instead of an orbital motion (Gaia Collaboration 2023a), so the variability information should also be checked for suspicious solutions.

While the errors were rescaled according to the goodness of fit for the astrometric solutions (Orbital, AstroSpectroSB1, VIM, and acceleration), this is not the case for the others. Because the mean goodness-of-fit distribution of SB2 and eclipsing solutions is quite large, we recommend rescaling the formal uncertainties for these solutions. The goodness_of_fit provided for SB2 solutions can deviate by up to 1.6 from the one that can be recomputed using obj_func.
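One possible way of rescaling the formal uncertainties is sketched below. It assumes that goodness_of_fit is the Gaussianised chi-square statistic F2 (Wilson-Hilferty transformation), that obj_func can be treated as the χ² of the fit, and that the number of degrees of freedom ν is supplied by the user. All three points are assumptions of this example, and the function names are hypothetical. The scale factor is the square root of the reduced χ² that corresponds to a given F2.

```python
import math


def f2_from_chi2(chi2, dof):
    """Gaussianised chi-square: F2 = sqrt(9*dof/2) * ((chi2/dof)**(1/3) + 2/(9*dof) - 1)."""
    k = 2.0 / (9.0 * dof)
    return ((chi2 / dof) ** (1.0 / 3.0) + k - 1.0) / math.sqrt(k)


def error_scale_from_f2(f2, dof):
    """Factor sqrt(chi2/dof) implied by a given F2, obtained by inverting the formula above."""
    k = 2.0 / (9.0 * dof)
    cube_root = f2 * math.sqrt(k) + 1.0 - k   # equals (chi2/dof)**(1/3)
    return max(cube_root, 0.0) ** 1.5
```

The formal uncertainties would then be multiplied by this factor, typically only when it is larger than 1.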

7. Variability

Gaia DR3 provides variability information for about 11.8 million sources, including 10.5 million variable sources of about 30 types of variability (Eyer et al. 2023) and 1.3 million sources (variable or not) in the Gaia Andromeda Photometric Survey (GAPS; Evans et al. 2023). Time-series photometry is released for all these 11.8 million sources in the epoch_photometry datalink table, and their statistical parameters and the links to their other potential variability tables are listed in the vari_summary table. The variability associated with galaxies provided in the galaxy_candidates table is mostly an artefact of their extension (Holl et al. 2023b), and these sources are therefore not in the vari_summary or epoch_photometry tables. Here, we present a brief overview of some issues we found during the scientific validation; for further details, we refer readers to the online documentation2 and papers1.

A number of sources show more than one type of variability. While most overlaps between different classes can be explained scientifically, some stars have contradictory classifications. For example, 3159 sources are classified as both long-period and short-timescale variables. Detailed analyses of the final classification for these sources are provided in Lebzelter et al. (2023).

Intensity-averaged magnitudes in the BP (int_average_bp) and RP (int_average_rp) bands for four and two RR Lyrae stars, respectively, have unreliable negative values reaching BP = −88 ± 22 mag. These six sources are faint RR Lyrae variables (G ∼ 18.5 − 19 mag) for which the specific objects study pipeline for Cepheids and RR Lyrae stars (SOS Cep&RRL; Clementini et al. 2023) failed to fit the data points with the model. The values that were provided are accordingly unreliable. However, other parameters that were calculated for these stars, such as intensity-averaged G magnitudes and pulsation periods, are correct. It was therefore decided to include these sources in the DR3 sample of RR Lyrae stars despite the incorrect BP and RP intensity-averaged magnitude estimates.

For 286 RR Lyrae stars, absorption in the G passband (g_absorption) reaches unreliably high values from 10 to 3367 mag. This is likely caused by the imprecise estimation of the GRP magnitudes for the faint sources (see Clementini et al. 2023 for further details).

8. Solar System objects

Gaia DR3 provides information for 158 152 Solar System objects (SSO) with more than 20 million observations (epoch astrometry). A large dataset of ultra-accurate observations like this is made available in a single day for the first time.

The sample contains 156 801 known numbered minor planets, 1 320 unmatched moving objects and 31 natural satellites of planets. The source selection is described in Tanga et al. (2023).

All the main categories of Solar System bodies are present among the known numbered minor planets: 447 Near-Earth asteroids (NEAs), 154 771 main-belt asteroids (MBAs), and a total of 1 551 Jupiter Trojans, Centaurs, and more distant objects. Figure 48 shows the different categories in the semi-major axis and eccentricity plane.

Fig. 48.

Asteroid population in Gaia DR3 in the (a, e) plane, where a is the semi-major axis in au and e is the eccentricity of the minor planets. The legend shows the different categories: blue squares for NEAs, red stars for MBAs, and green dots for Jupiter Trojans. For the sake of clarity, the plot does not show Centaurs and more distant objects.

The table sso_source in the Gaia archive contains the number of observations for each source. This count is incorrect for four sources. The online documentation17 explains how to obtain the correct number of observations.

8.1. Unmatched sources and natural satellites of planets

A small subsample of the data consists of 1 320 objects that were considered unknown at the time of processing. We refer to them as unmatched sources. Tanga et al. (2023) performed a search to determine how many unmatched sources can now be identified (February 2022), and they found an identification for 712 sources. It is also possible that some of the still-unmatched sources will be identified or linked to known objects when the observations are sent to the Minor Planet Center18. All the sources will still appear as unmatched in the sso_source and sso_observation tables in the Gaia archive.

The sample also contains natural satellites of planets for the first time. For a complete description of the process of selection, we refer to Tanga et al. (2023, Sect. 3.1).

8.2. Orbit determination process

We used an orbit determination process to assess the quality of the data. This process is similar to the process that was carried out to validate Gaia DR2 (Gaia Collaboration 2018; Arenou et al. 2018).

For every known numbered minor planet in Gaia DR3, we selected only the Gaia observations. We used a modified version of the OrbFit software19 to fit the orbits to Gaia observations alone. It is important to note that this software is completely independent of everything that runs in the Gaia data processing, and we improved it to fully exploit the accuracy of Gaia observations.

The results of the orbital fit can be summarised as an orbit (if the fit converges), post-fit residuals in the (αcos(δ), δ) space and in the (AL, AC) space, and the rejection of poor-quality or mistakenly linked observations.

The modified version of the OrbFit software makes use of a non-linear weighted least-squares algorithm to fit the orbits. The weight matrix for Gaia is the quadratic sum of the systematic and random matrices, available in the sso_observation table.
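The construction of such a weight matrix can be illustrated for a single two-dimensional observation. In the sketch below, the quadratic sum is interpreted as the sum of the random and systematic covariance matrices, and the weight matrix is taken as the inverse of this total covariance, as is usual in weighted least squares. This interpretation, the function name, and the restriction to 2 × 2 matrices are assumptions of the example and do not describe the OrbFit implementation.

```python
def weight_matrix_2x2(cov_random, cov_systematic):
    """Inverse of the total covariance (random + systematic) for one 2D observation."""
    total = [[cov_random[i][j] + cov_systematic[i][j] for j in range(2)] for i in range(2)]
    det = total[0][0] * total[1][1] - total[0][1] * total[1][0]
    if det <= 0.0:
        raise ValueError("total covariance matrix must be positive definite")
    return [
        [total[1][1] / det, -total[0][1] / det],
        [-total[1][0] / det, total[0][0] / det],
    ]
```

For diagonal matrices, this reduces to weights of 1/(σ_random² + σ_systematic²) on each coordinate.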

We also corrected the observations for the light bending. This differs from the approach that was applied in the validation of Gaia DR2.

8.2.1. Orbit determination results: Orbit failures

The orbit fit procedure worked for almost all the known sources. It only failed for 198 objects. The reasons for this vary: the time spanned by the observations was too short, too few observations were available, or a combination of the two, as shown in Fig. 49.

Fig. 49.

Time spanned by the observations in Gaia DR3 (in days) vs. the number of observations for each known source. The red stars represent the objects for which the orbit determination process did not converge.

The quality of the observations is not affected by the non-convergence of the orbit. They were therefore all accepted and are available in the sso_observation table.

8.2.2. Orbit-determination results: Post-fit residuals

The orbit-determination process is based on finding and removing bad-quality observations, so that they do not affect the goodness of fit. The orbit-determination software we used to validate the data rejects observations with χ2 > 25 (5σ). This can happen because the quality of the data is not as expected, because the weights that are used are too low, or because the observations do not belong to the object. The latter case is called mistaken linkage or incorrect identification. We decided to remove from the data only the observations for which the absolute value of the along-scan post-fit residuals was higher than 250 mas and for which the absolute value of the across-scan post-fit residuals was higher than 2500 mas. In this way, we cleaned the database from possible contaminants. At the same time, we wished to keep the largest possible number of observations so that the community could search for interesting features (e.g., the presence of satellites). As a consequence, the sample can still contain some contaminants. For example, we may only have removed part of a transit, but we decided to adopt a unique approach that is valid for all the observations. Some observations can also be rejected during the orbit-determination process, but this does not affect the overall quality of the data.

After removing the bad-quality observations, we analysed the post-fit residuals in the along-scan and across-scan directions. Figure 50 shows the histogram of the post-fit residuals along-scan (ΔAL) and across-scan (ΔAC). These residuals are obtained as a rotation of the residuals in αcos(δ) and δ, where the rotation angle is the position angle as given in the sso_observation table. The whole procedure has been described in Gaia Collaboration (2018).
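The rotation can be sketched as follows. The convention adopted here is an assumption of this example: the position angle of the along-scan direction is measured from north towards east, and the across-scan axis is obtained by rotating the along-scan axis by 90°. The actual sign and angle conventions should be checked against the Gaia documentation and Gaia Collaboration (2018).

```python
import math


def rotate_to_scan_frame(d_alpha_star, d_delta, position_angle_deg):
    """Rotate residuals in (alpha*cos(delta), delta) into (AL, AC) residuals.

    Assumes the position angle gives the along-scan direction, measured
    from north (0 deg) towards east (90 deg).
    """
    theta = math.radians(position_angle_deg)
    d_al = d_alpha_star * math.sin(theta) + d_delta * math.cos(theta)
    d_ac = -d_alpha_star * math.cos(theta) + d_delta * math.sin(theta)
    return d_al, d_ac
```

Because this is a pure rotation, the length of the residual vector is preserved.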

Fig. 50.

Histogram of the post-fit residuals of the selected observations in the along-scan (left) and across-scan (right) directions.

The mean of the post-fit along-scan residuals is 0.03 mas, and the standard deviation is slightly larger than 5 mas. This is exactly what we expected as a result of the orbit-determination fit (we recall that we discarded observations at 5σ level). Post-fit residuals in the across-scan direction are expected to be far larger than the corresponding along-scan residuals as a result of the geometry of the spacecraft observations (Gaia Collaboration 2018), as the histogram in Fig. 50 shows. The mean in this case is close to 13 mas, which shows that the across-scan observations still contain a small bias. The standard deviation is larger than 200 mas. This is close to what we expected.

We now examine the (ΔAL, ΔAC) post-fit residuals as a function of the G magnitude (Fig. 51). For very bright sources (G < 13 mag), a full two-dimensional window is transmitted, which means that across-scan information is available, corresponding to what we show in Fig. 51, where across-scan residuals are at the milliarcsecond level when G < 13 mag. Figure 51 shows the increase in along-scan residuals when the source is fainter (G > 19 mag). They almost reach the detectability limit, but usually remain very small (inside the [−10, 10] mas interval) for all the other sources.

Fig. 51.

Density plot of the post-fit residuals as a function of the G magnitude in the along-scan (left) and across-scan (right) directions.

Additional information about the residuals and a comparison with Gaia DR2 are available in Tanga et al. (2023). These authors used a different set of residuals, obtained from the internal processing of the observations and not from the validation. The same paper shows that these residuals and the corresponding orbits can be considered equivalent to those that were computed during the validation process.

8.3. Orbit accuracy: Comparison with known catalogues

The post-fit uncertainty of the semi-major axis (σa) is a good estimator of the orbit quality. We compared the post-fit σa obtained using Gaia observations alone with that available from the JPL Small Body Database20, which makes use of all the available observations; see Fig. 52. The black line in the figure is the bisector of the first quadrant: objects below the line have a smaller semi-major axis uncertainty when Gaia observations alone are used. It is clear that Gaia DR3 alone is still not enough to reach the final accuracy expected for Gaia (Gaia Collaboration 2018, Fig. 32), but the number of orbits for which the accuracy is better using Gaia alone has increased substantially since Gaia DR2.

Fig. 52.

Quality of the orbit determination measured by the post-fit uncertainty of the semi-major axis for the whole sample of objects contained in Gaia DR3 with respect to the current measurements from the JPL Small Body Database. The black line is the bisector of the first quadrant.
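As an illustration of how the comparison against the bisector can be quantified, the following minimal sketch counts the fraction of objects for which the Gaia-only σa is smaller than the reference value. The function name and the input arrays are hypothetical; the sketch only assumes that both uncertainties are expressed in the same unit and that entries with missing or non-positive values are to be ignored.

```python
import numpy as np


def fraction_better_with_gaia(sigma_a_gaia, sigma_a_ref):
    """Fraction of objects lying below the bisector sigma_gaia = sigma_ref.

    Both inputs are post-fit semi-major axis uncertainties in the same
    unit (e.g. au). Non-finite or non-positive entries are ignored,
    which is an assumption of this sketch.
    """
    gaia = np.asarray(sigma_a_gaia, dtype=float)
    ref = np.asarray(sigma_a_ref, dtype=float)
    valid = np.isfinite(gaia) & np.isfinite(ref) & (gaia > 0) & (ref > 0)
    if not valid.any():
        return float("nan")
    # Strictly below the bisector: Gaia alone gives the smaller uncertainty.
    return float(np.mean(gaia[valid] < ref[valid]))
```

Evaluating the same quantity on the Gaia DR2 sample would give a simple number to track the improvement between releases that Fig. 52 shows graphically.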

9. Conclusions

The third data release of Gaia, DR3, provides a very large amount of new data. This complex and diverse dataset has a number of caveats that the users should be aware of. In this paper we summarised the main issues we found during the transverse validation, and we provided links to the relevant papers or documentation and recommendations. In particular, we highlighted that the flags provided with the data products should be used whenever available (e.g. flags_gspspec, flags_flame, dibqf_gspspec, and flags_qsoc). We warned about the error underestimation of the XP coefficients, GSP-Phot parameters, GSP-Spec ANN, and spectroscopic and eclipsing binary solutions with a poor goodness-of-fit, and we provided a correction formula for the radial velocity error estimates. The DSC white dwarf and binary star classifications should not be used. A number of parameters should be used with caution (MSC parameters, mh_gspphot, distance_gspphot, mg_gspphot, FLAME mass and ages for giants, radial velocities for (GRVS − G) < −3, and astrometric binary acceleration values). The corrections proposed in Recio-Blanco et al. (2023) should be applied to the GSP-Spec parameters. Some systematics, such as those presented for vbroad, the XP spectra dip, or the extinction overestimation, are to be taken into account according to the science case. Filters need to be applied to the QSO and galaxy candidates to obtain purer samples (Gaia Collaboration 2023b). Monte Carlo techniques should not be used with the Thiele-Innes astrometric binary orbital parameters.
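As an example of how such flags can be applied in practice, the sketch below tests a selection written as flags_gspspec[1,4,8,13] = 0, the notation used in this paper for the Teff filter. The helper name is hypothetical. We assume here that flags_gspspec is stored as a string of digit characters and that the positions quoted in the paper are 1-based; both points should be verified against Recio-Blanco et al. (2023) and the online documentation before use.

```python
def flags_are_zero(flags, positions):
    """Return True if all the requested flag characters equal '0'.

    flags     : flag chain given as a string of digits (assumed format).
    positions : 1-based character positions, as in the notation
                flags_gspspec[1,4,8,13] = 0 (assumed indexing).
    A missing or too-short chain is treated as failing the selection.
    """
    if flags is None or len(flags) < max(positions):
        return False
    return all(flags[p - 1] == "0" for p in positions)


# Positions quoted in this paper for the GSP-Spec Teff, log g and [M/H] filters.
TEFF_POSITIONS = (1, 4, 8, 13)
LOGG_POSITIONS = (2, 5, 8, 13)
MH_POSITIONS = (3, 6, 8)
```

Keeping one position list per parameter makes it explicit that a source can pass the filter for one parameter and fail it for another, so the selection should be applied parameter by parameter and not as a single global cut.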

This paper focused on limitations in the data released in Gaia DR3. We encourage a study of the other papers that accompany this data release for an overview of the high quality of the Gaia products and a glance at the wonderful science outcomes that can be expected from this wealth of data. We hope that this paper will help users to find their way through the data so that they can make the best of it.


4

GOG is published in the Gaia archive in the table gaiaedr3.gaia_source_simulation and GUMS in the table gaiaedr3.gaia_universe_model.

8

IC 4651, Melotte 20, Melotte 22, NGC 2099, NGC 2168, NGC 2287, NGC 2506, NGC 2516, NGC 2632, NGC 3114, NGC 3532, NGC 3766, NGC 457, NGC 6405, NGC 6475, Stock 2, and Trumpler 19.

9

2021 March update.

10

DSC: Discrete source classifier

GSP-Phot: General stellar parametriser from photometry

GSP-Spec: General stellar parametriser from spectroscopy

FLAME: Final luminosity age mass estimator

ESP-CS: Extended stellar parametriser for cool stars

ESP-UCD: Extended stellar parametriser for ultra-cool dwarfs

ESP-HS: Extended stellar parametriser for hot stars

ESP-ELS: Extended stellar parametriser for emission line stars

MSC: Multiple star classifier

QSOC: QSO classifier

UGC: Unresolved galaxy classifier

OA: Outlier analysis

TGE: Total Galactic extinction.

12

They are needed to be able to adapt the DSC probabilities to a new prior, however; see Sect. 11.3.2.7 of the online documentation.

13

(parallax - 1000/distance_gspphot)/parallax_error.

14

OA distinguishes extragalactic sources by bins of redshift, so we select all quasars/galaxies at all redshifts to create this label.

18

The Minor Planet Center is the single worldwide location for receipt and distribution of positional measurements of minor planets, comets, and outer irregular natural satellites of major planets (https://www.minorplanetcenter.net/).

Acknowledgments

This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia archive website is https://archives.esac.esa.int/gaia. This work has made extensive use of Aladin and of the SIMBAD and VizieR databases operated at the Centre de Données Astronomiques (Strasbourg) in France, and of the software TOPCAT (Taylor 2005). This work has been supported by the Agence Nationale de la Recherche (ANR project SEGAL ANR-19-CE31-0017). It has also received funding from the project ANR-18-CE31-0006 and from the European Research Council (ERC grant agreement No. 834148). ZKR acknowledges funding from the Netherlands Research School for Astronomy (NOVA). This work was partially funded by the Spanish MICIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe” by the “European Union” through grant RTI2018-095076-B-C21, and the Institute of Cosmos Sciences University of Barcelona (ICCUB, Unidad de Excelencia ‘María de Maeztu’) through grant CEX2019-000918-M.

References

  1. Ahumada, R., Allende Prieto, C., Almeida, A., et al. 2020, ApJS, 249, 3
  2. Andrae, R., Fouesneau, M., Sordo, R., et al. 2023, A&A, 674, A27 (Gaia DR3 SI)
  3. Arenou, F., Luri, X., Babusiaux, C., et al. 2018, A&A, 616, A17
  4. Blomme, R., Frémat, Y., Sartoretti, P., et al. 2023, A&A, 674, A7 (Gaia DR3 SI)
  5. Bohlin, R. C., Gordon, K. D., & Tremblay, P. E. 2014, PASP, 126, 711
  6. Boubert, D., Strader, J., Aguado, D., et al. 2019, MNRAS, 486, 2618
  7. Bourges, L., Mella, G., Lafrasse, S., et al. 2017, VizieR Online Data Catalog, II/346
  8. Buder, S., Sharma, S., Kos, J., et al. 2021, MNRAS, 506, 150
  9. Cantat-Gaudin, T., Anders, F., Castro-Ginard, A., et al. 2020, A&A, 640, A1
  10. Carnerero, M. I., Raiteri, C. M., Rimoldini, L., et al. 2023, A&A, 674, A24 (Gaia DR3 SI)
  11. Carrasco, J. M., Weiler, M., Jordi, C., et al. 2021, A&A, 652, A86
  12. Castro-Ginard, A., Jordi, C., Luri, X., et al. 2022, A&A, 661, A118
  13. Clementini, G., Ripepi, V., Garofalo, A., et al. 2023, A&A, 674, A18 (Gaia DR3 SI)
  14. Creevey, O. L., Sordo, R., Pailler, F., et al. 2023, A&A, 674, A26 (Gaia DR3 SI)
  15. De Angeli, F., Weiler, M., Montegriffo, P., et al. 2023, A&A, 674, A2 (Gaia DR3 SI)
  16. Delchambre, L., Bailer-Jones, C. A. L., Bellas-Velidis, I., et al. 2023, A&A, 674, A31 (Gaia DR3 SI)
  17. De Medeiros, J. R., Alves, S., Udry, S., et al. 2014, A&A, 561, A126
  18. Ducourant, C., Krone-Martins, A., Galluccio, L., et al. 2023, A&A, 674, A11 (Gaia DR3 SI)
  19. Eker, Z., Bilir, S., Soydugan, F., et al. 2014, PASA, 31, e024
  20. El-Badry, K., Rix, H.-W., & Heintz, T. M. 2021, MNRAS, 506, 2269
  21. Evans, D. W., Eyer, L., Busso, G., et al. 2023, A&A, 674, A4 (Gaia DR3 SI)
  22. Eyer, L., Audard, M., Holl, B., et al. 2023, A&A, 674, A13 (Gaia DR3 SI)
  23. Fabricius, C., Luri, X., Arenou, F., et al. 2021, A&A, 649, A5
  24. Falcón-Barroso, J., Sánchez-Blázquez, P., Vazdekis, A., et al. 2011, A&A, 532, A95
  25. Fouesneau, M., Frémat, Y., Andrae, R., et al. 2023, A&A, 674, A28 (Gaia DR3 SI)
  26. Fournier, D. A., Skaug, H. J., Ancheta, J., et al. 2012, Optim. Methods Software, 27, 233
  27. Frémat, Y., Royer, F., Marchal, O., et al. 2023, A&A, 674, A8 (Gaia DR3 SI)
  28. Gaia Collaboration (Prusti, T., et al.) 2016, A&A, 595, A1
  29. Gaia Collaboration (Spoto, F., et al.) 2018, A&A, 616, A13
  30. Gaia Collaboration (Arenou, F., et al.) 2023a, A&A, 674, A34 (Gaia DR3 SI)
  31. Gaia Collaboration (Bailer-Jones, C. A. L., et al.) 2023b, A&A, 674, A41 (Gaia DR3 SI)
  32. Gaia Collaboration (Recio-Blanco, A., et al.) 2023c, A&A, 674, A38 (Gaia DR3 SI)
  33. Gaia Collaboration (Schultheis, M., et al.) 2023d, A&A, 674, A40 (Gaia DR3 SI)
  34. Gaia Collaboration (Vallenari, A., et al.) 2023e, A&A, 674, A1 (Gaia DR3 SI)
  35. Gallenne, A., Pietrzyński, G., Graczyk, D., et al. 2019, A&A, 632, A31
  36. Gilmore, G., Randich, S., Asplund, M., et al. 2012, Messenger, 147, 25
  37. Godoy-Rivera, D., Tayar, J., Pinsonneault, M. H., et al. 2021, ApJ, 915, 19
  38. Green, G. M., Schlafly, E., Zucker, C., Speagle, J. S., & Finkbeiner, D. 2019, ApJ, 887, 93
  39. Halbwachs, J. L., Kiefer, F., Lebreton, Y., et al. 2020, MNRAS, 496, 1355
  40. Halbwachs, J.-L., Pourbaix, D., Arenou, F., et al. 2023, A&A, 674, A9 (Gaia DR3 SI)
  41. Heap, S. R., & Lindler, D. 2016, in The Science of Calibration, eds. S. Deustua, S. Allam, D. Tucker, & J. A. Smith, ASP Conf. Ser., 503, 211
  42. Holl, B., Sozetti, A., Sahlmann, J., et al. 2023a, A&A, 674, A10 (Gaia DR3 SI)
  43. Holl, B., Fabricius, C., Portell, J., et al. 2023b, A&A, 674, A25 (Gaia DR3 SI)
  44. Jancart, S., Jorissen, A., Babusiaux, C., & Pourbaix, D. 2005, A&A, 442, 365
  45. Katz, D., Sartoretti, P., Guerrier, A., et al. 2023, A&A, 674, A5 (Gaia DR3 SI)
  46. Kelvin, L. S., Driver, S. P., Robotham, A. S. G., et al. 2012, MNRAS, 421, 1007
  47. Kervella, P., Arenou, F., & Thévenin, F. 2022, A&A, 657, A7
  48. Kiefer, F., Halbwachs, J. L., Arenou, F., et al. 2016, MNRAS, 458, 3272
  49. Kiefer, F., Halbwachs, J. L., Lebreton, Y., et al. 2018, MNRAS, 474, 731
  50. Kounkel, M., Covey, K. R., Stassun, K. G., et al. 2021, AJ, 162, 184
  51. Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H., & Bell, B. M. 2016, J. Stat. Software, 70, 1
  52. Kullback, S., & Leibler, R. A. 1951, Ann. Math. Stat., 22, 79
  53. Lafarga, M., Ribas, I., Lovis, C., et al. 2020, A&A, 636, A36
  54. Lallement, R., Capitanio, L., Ruiz-Dern, L., et al. 2018, A&A, 616, A132
  55. Lallement, R., Babusiaux, C., Vergely, J. L., et al. 2019, A&A, 625, A135
  56. Lebzelter, T., Mowlavi, N., Lecoeur-Taibi, I., et al. 2023, A&A, 674, A15 (Gaia DR3 SI)
  57. Luri, X., Brown, A. G. A., Sarro, L. M., et al. 2018, A&A, 616, A9
  58. Makarov, V. V., & Unwin, S. C. 2015, MNRAS, 446, 2055
  59. Montegriffo, P., De Angeli, F., Andrae, R., et al. 2023, A&A, 674, A3 (Gaia DR3 SI)
  60. Planck Collaboration Int. XLVIII. 2016, A&A, 596, A109
  61. Pourbaix, D. 2000, A&AS, 145, 215
  62. Pourbaix, D., Tokovinin, A. A., Batten, A. H., et al. 2004, A&A, 424, 727
  63. Price-Whelan, A. M., Hogg, D. W., Foreman-Mackey, D., & Rix, H.-W. 2017, ApJ, 837, 20
  64. Queiroz, A. B. A., Anders, F., Chiappini, C., et al. 2020, A&A, 638, A76
  65. Recio-Blanco, A., De Laverny, P., Palicio, P. A., et al. 2023, A&A, 674, A29 (Gaia DR3 SI)
  66. Riello, M., De Angeli, F., Evans, D. W., et al. 2021, A&A, 649, A3
  67. Rimoldini, L., Holl, B., Gavras, P., et al. 2023, A&A, 674, A14 (Gaia DR3 SI)
  68. Sartoretti, P., Marchal, O., Babusiaux, C., et al. 2023, A&A, 674, A6 (Gaia DR3 SI)
  69. Schlegel, D. J., Finkbeiner, D. P., & Davis, M. 1998, ApJ, 500, 525
  70. Seabroke, G. M., Fabricius, C., Teyssier, D., et al. 2021, A&A, 653, A160
  71. Serenelli, A., Johnson, J., Huber, D., et al. 2017, ApJS, 233, 23
  72. Soubiran, C., Jasniewicz, G., Chemin, L., et al. 2018, A&A, 616, A7
  73. Souchay, J., Gattano, C., Andrei, A. H., et al. 2019, A&A, 624, A145
  74. Tanga, P., Pauwels, T., Mignard, F., et al. 2023, A&A, 674, A12 (Gaia DR3 SI)
  75. Tarsitano, F., Hartley, W. G., Amara, A., et al. 2018, MNRAS, 481, 2018
  76. Taylor, M. B. 2005, Astronomical Data Analysis Software and Systems XIV, eds. P. Shopbell, M. Britton, & R. Ebert, 347, 29
  77. Torres, G., Latham, D. W., & Quinn, S. N. 2021, ApJ, 921, 117
  78. Traven, G., Feltzing, S., Merle, T., et al. 2020, A&A, 638, A145
  79. Yu, J., Huber, D., Bedding, T. R., et al. 2018, ApJS, 236, 42
  80. Zwitter, T., Kos, J., Buder, S., et al. 2021, MNRAS, 508, 4202

Appendix A: Wide binaries

We used the El-Badry et al. (2021) catalogue of wide binaries, further limited to a chance alignment probability R_chance_align < 0.1, to test the radial velocity error underestimation in Sect. 2.1.3. Here we test the astrophysical parameter compatibility of the two components. A summary of the results is listed in Table A.1. The percentage of systems consistent within 1σ was computed after removing the 5σ outliers. If this percentage is significantly lower (higher) than about 68, the errors are underestimated (overestimated).
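The consistency test described above can be sketched as follows. The function name is hypothetical, and we assume, for illustration only, that the uncertainties of the two components are independent, so that the difference is normalised by their quadratic sum; the text does not state how the normalisation was actually done.

```python
import numpy as np


def fraction_within_one_sigma(x1, e1, x2, e2, outlier_sigma=5.0):
    """Fraction of pairs consistent within 1 sigma after outlier removal.

    x1, x2 : parameter values for the two components of each binary.
    e1, e2 : their quoted uncertainties (assumed independent, so the
             difference is normalised by sqrt(e1**2 + e2**2)).
    Pairs with a normalised difference above outlier_sigma are removed
    before the fraction is computed.
    """
    x1, e1, x2, e2 = (np.asarray(a, dtype=float) for a in (x1, e1, x2, e2))
    z = (x1 - x2) / np.hypot(e1, e2)
    z = z[np.isfinite(z)]
    kept = z[np.abs(z) <= outlier_sigma]
    if kept.size == 0:
        return float("nan")
    return float(np.mean(np.abs(kept) < 1.0))
```

For well-estimated Gaussian uncertainties, the returned fraction should be close to 0.68; a significantly lower value points to underestimated uncertainties, and a significantly higher value to overestimated ones.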

Table A.1.

Comparison of the astrophysical parameters derived for the two components of a wide binary (El-Badry et al. 2021).

