Gaia Data Release 3
Open Access
Issue A&A, Volume 674, June 2023
Article Number A17
Number of page(s) 35
Section Stellar structure and evolution
DOI https://doi.org/10.1051/0004-6361/202243990
Published online 16 June 2023

© The Authors 2023

Licence: Creative Commons. Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1. Introduction

Cepheids made their appearance on the scene when Edward Pigott discovered their first representative, η Aql, in 1784, opening a field of astrophysical research that is still fully active today. The variable stars that are collectively called Cepheids are actually an ensemble of different types which we now separate into three groups: Classical Cepheids (DCEPs, whose prototype is δ Cep), type II Cepheids (T2CEPs), and anomalous Cepheids (ACEPs).

The crucial role played by DCEPs resides in their period–luminosity (PL) and period–Wesenheit (PW) relations, which represent fundamental tools at the basis of the extragalactic distance ladder (e.g. Leavitt & Pickering 1912; Madore 1982; Caputo et al. 2000; Riess et al. 2016). However, DCEPs are also important astrophysical objects for stellar evolution and Galactic studies. Indeed, as their pulsational properties (mainly periods) are linked to the intrinsic stellar parameters (effective temperature, mass, luminosity), DCEPs can be used as an independent test for stellar evolution models. Moreover, given their young age (∼50−500 Myr) they are preferentially located in the Milky Way (MW) thin disc, and, thanks to precise distances that can be derived from their PL and PW relations, DCEPs can be used to model the disc and trace their birthplaces in the spiral arms, where star formation is most active (e.g. Skowron et al. 2019; Poggio et al. 2021, and references therein). Furthermore, if the chemical abundance of the DCEPs is available, they can be used to trace the metallicity gradient of the MW (e.g. Genovali et al. 2014; Luck & Lambert 2011; Luck 2018; Ripepi et al. 2022a, and references therein).

While DCEPs are luminous, young, and massive (M ∼ 3−11 M⊙) stars, T2CEPs are more evolved objects: they are older than 10 Gyr, and are more luminous and slightly less massive than RR Lyrae variables (M ∼ 0.55−0.7 M⊙; see e.g. Caputo 1998; Sandage & Tammann 2006, for a more extended description of T2CEP properties). T2CEPs are preferentially metal-poor objects and, like RR Lyrae variables, populate the main Galactic components, that is, the disc, bulge, and halo. T2CEPs pulsate with periods from ∼1 to ∼24 d and are separated into BL Herculis (BLHER; periods between 1 and 4 d) and W Virginis (WVIR; periods between 4 and 24 d) stars. Historically, a third class of variables is considered as an additional subgroup of the T2CEP class, namely the RV Tauri (RVTAU) stars (see e.g. Feast et al. 2008, and references therein), with periods from about 20 to 150 d and often less regular light curves. The latter are post-asymptotic giant branch stars on their way to becoming planetary nebulae. This evolutionary phase corresponds to the latest stage in the evolution of intermediate-mass stars and therefore the link between RVTAU and the low-mass WVIR stars should be considered with caution. T2CEPs follow very tight PL and PW relations, especially in the near-infrared (NIR; see e.g. Matsunaga et al. 2011; Ripepi et al. 2015, and references therein) and are therefore excellent distance indicators.

The third Cepheid-like class of pulsating stars is represented by the anomalous Cepheids (ACEPs). These have periods of between approximately 0.4 d and 2.5 d and absolute magnitudes brighter than those of RR Lyrae stars by 0.3 mag to 2 mag (Caputo et al. 2004, and references therein). ACEP variables are thought to be in their central He-burning evolutionary phase and to have masses of between approximately 1.3 and 2.1 M⊙, as well as metallicities lower than Z = 0.0004 (corresponding to an iron abundance lower than ∼−1.6 dex, for Z⊙ = 0.0152; see Caputo 1998; Marconi et al. 2004, for details). Similarly to RR Lyrae stars, the ACEPs can pulsate in the fundamental or first overtone modes and in both cases show well-defined PL and PW relations, especially in the Large Magellanic Cloud (LMC; Ripepi et al. 2014; Soszyński et al. 2015a).

Great advances in the study of variable sources have been made thanks to the Gaia mission (Gaia Collaboration 2016a) and its subsequent data releases (DR1, DR2 and EDR3, Gaia Collaboration 2016b, 2018, 2021a; Riello et al. 2021). Indeed, the multi-epoch nature of Gaia observations makes the satellite a very powerful tool to identify, characterise, and classify many different classes of variable stars across the whole Hertzsprung-Russell diagram (HRD; see Gaia Collaboration 2019). In Gaia DR1, time-series photometry in the G-band and parameters derived from the G light curves were released for a small number of objects in and around the LMC, including 599 Cepheids of all types and 2595 RR Lyrae stars (Clementini et al. 2016, hereafter Paper I). In 2018, Gaia DR2 released more than 550 000 variable sources belonging to a variety of different classes (see Holl et al. 2018), including about 9500 Cepheids of all types and about 140 000 RR Lyrae stars (Clementini et al. 2019, hereafter Paper II). In December 2020, Gaia Early Data Release 3 (EDR3; Gaia Collaboration 2021a) published average photometry, parallaxes, and proper motions but no time-series data. Epoch data are now made available with Gaia Data Release 3 (DR3, see Gaia Collaboration 2023a), providing multiband time-series photometry for nearly 12 million variable sources (see Eyer et al. 2023).

The Specific Objects Study (SOS) Cep&RRL pipeline (SOS Cep&RRL pipeline hereafter) was developed to validate and characterise Cepheids and RR Lyrae stars observed by Gaia. The pipeline has been described in detail in Papers I and II, to which we refer the interested reader. The general properties of the entire sample of variable objects released in Gaia DR3 are discussed in Eyer et al. (2023), which also describes the chain of subsequent steps carried out in the general variability analysis before the SOS Cep&RRL processing of the data (see also Holl et al. 2018).

In this paper, we describe the properties of the Cepheids for which time-series data are released in DR3 – along with their characteristic parameters – and that populate the vari_cepheid catalogue, which is part of the data release. More specifically, we (i) illustrate the changes we implemented in the SOS Cep&RRL pipeline to process the DR3 photometric and radial velocity (RV) time series of candidate Cepheids provided by the general variable star classification pipelines (Eyer et al. 2023; Rimoldini et al. 2023); (ii) discuss the procedures adopted to clean the sample of Cepheids that are released in Gaia DR3; (iii) present the ensemble properties of the DR3 Cepheids; and (iv) describe the validation procedures adopted to estimate the completeness and contamination of the sample.

A complementary paper (Clementini et al. 2023) describes the SOS Cep&RRL pipeline and the related results for the RR Lyrae variables.

2. SOS Cep&RRL pipeline: changes from DR2 to DR3

The main steps of the SOS Cep&RRL pipeline for candidate Cepheids are shown in Figs. 1 and 3 of Papers I and II. The procedures used for DR3 are almost the same as for DR1 and DR2, but with some important changes that we describe in the following sections:

2.1. Pipeline changes

  1. Subregions in the sky. For the processing of the DR2 data, the SOS pipeline subdivided the sky into three regions: two around the LMC and the Small Magellanic Cloud (SMC), respectively, and a third one, called All Sky, including all the remaining stars, which mainly belonged to the MW. This subdivision was needed because of the different observational properties of Cepheids and RR Lyrae stars in the Magellanic Clouds (MCs) and in the MW. Indeed, Cepheids (and RR Lyrae stars) in the LMC and SMC are all more or less at the same distance from us within each galaxy, meaning that we can simply use their apparent magnitudes to define their position in and around the reference PL or PW relations, whereas for the MW we need absolute magnitudes calculated from Gaia parallaxes to place these stars on the PL and PW diagrams. These differences, in turn, required different steps in the SOS pipeline. In DR3, we enlarge the regions around the LMC and SMC and introduce two new subregions encircling the Andromeda (M 31) and Triangulum (M 33) galaxies, whose brightest Cepheids are within reach of the Gaia mission. These four subregions are listed in Table 1. The fifth subregion is composed of all the remaining sky after excluding the four subregions defined above; for continuity with Paper II, we refer to this fifth subregion as All Sky. The large majority of the stars contained in this subregion belong to the MW, with a small fraction of objects belonging to dwarf galaxies that are satellites of our Galaxy.

  2. Treatment of multi-mode DCEPs. To avoid spurious detections of multi-mode DCEPs, we only searched for more than one pulsation mode in the time series of stars with a number of epochs greater than or equal to 40. In addition, we introduced an analysis of the residuals after the fit of the G light curve with just one pulsation mode, and only retained objects showing a dispersion larger than or equal to 0.025 mag as potential multi-mode pulsators (a similar procedure, although with a larger scatter, is adopted for RR Lyrae stars, see Clementini et al. 2023).

  3. RV curves treatment. As RV time-series are published for a small sample of Cepheids and RR Lyrae stars (see also Sartoretti et al. 2022) as part of DR3, a new module of the pipeline analyses the RV curves, providing average RV values, peak-to-peak amplitudes, and the epoch of minimum RVs (see Clementini et al. 2023, for more details).

  4. Update of the PL and PW relations. We have updated the PL and PW relations that are used in the pipeline, adding those needed to deal with M 31 and M 33 data. As all these relations are significantly changed with respect to DR2, we describe them in detail in Sect. 2.2.

  5. Errors with bootstrap. We applied a bootstrap technique to estimate the uncertainties on all Cepheid parameters published in DR3. Specifically, to estimate the uncertainties on the Fourier fit parameters (period, amplitudes and phases) and on all the other quantities characterising the light and RV curves (e.g. mean magnitudes, mean RV, peak-to-peak amplitudes, etc.), the input data were randomly re-sampled (allowing data point repetitions) and all parameters were recalculated on each simulated sample. This procedure was repeated 100 times, and the respective uncertainty was estimated for each parameter by considering the robust standard deviation (1.486 ⋅ MAD) of the distributions obtained with the bootstrap method. A similar procedure was applied to estimate the uncertainties on all other released quantities, such as the metallicity and the Fourier parameters.

  6. Fine tuning of the ROFABO outlier rejection operator. The photometric and RV time series are inserted in the SOS Cep&RRL pipeline after undergoing a chain of routines that process the observations to obtain a standard time, magnitudes, RVs, and their respective uncertainties; these constitute the input time-series data. Among these operators, standard outlier rejection techniques are applied to remove as many bad points as possible, without affecting the scientific information contained in the time series. To improve the rejection of outliers from the time series of Cepheids and RR Lyrae stars, the SOS Cep&RRL pipeline adopted a customised set of configuration parameters for the ‘Remove Outliers on both FAint and Bright sides Operator’ (ROFABO; see Eyer et al. 2023) routine. To determine the ROFABO configuration parameters that maximise the rejection of bad points while preserving good measurements (specifically for Cepheids and RR Lyrae stars), we used the SOS pipeline to process a sample of hundreds of time series affected by different kinds of outliers, together with time series not presenting obvious bad measurements. A specific ROFABO function for the SOS Cep&RRL pipeline with configuration parameters fine-tuned as described above was then added to the whole operator chain.
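The bootstrap estimate described in item 5 can be illustrated with a minimal sketch. It is not the pipeline code: the function names, the toy magnitudes, and the use of the mean magnitude as the estimated parameter are assumptions made for illustration, while the 100 resamplings and the 1.486 ⋅ MAD factor are the values quoted in the text.

```python
import random
import statistics


def robust_std(values, factor=1.486):
    """Robust standard deviation: factor * median absolute deviation (MAD).

    The factor 1.486 is the one quoted in the text.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return factor * mad


def bootstrap_uncertainty(data, estimator, n_boot=100, seed=0):
    """Resample `data` with repetition and return the robust scatter of `estimator`."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Draw n points with replacement (data point repetitions allowed).
        sample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(sample))
    return robust_std(estimates)


# Toy time series (hypothetical): 50 magnitudes scattered around 12.0 mag.
toy_rng = random.Random(42)
mags = [12.0 + toy_rng.gauss(0.0, 0.02) for _ in range(50)]
sigma_mean = bootstrap_uncertainty(mags, statistics.fmean)
```

In the pipeline the recalculated quantities are the Fourier fit parameters and the other light and RV curve parameters; here any function of the resampled data can be passed as `estimator`.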

Table 1.

Sky subregions considered by the SOS Cep&RRL pipeline.

2.2. New PL and PW relations employed in the SOS Cep&RRL pipeline

A number of significant changes with respect to DR2 were introduced in the branch of the SOS Cep&RRL pipeline that processes the candidate Cepheids. Specifically, (1) we adopted new PL and PW relations directly derived from the Gaia data, while in DR2 we used photometry in the Johnson system transformed into the Gaia bands (see Sect. 3.2 of Paper II); and (2) for DR3 we used the new Wesenheit magnitudes defined by Ripepi et al. (2019), that is W(G, GBP − GRP) = G − 1.90(GBP − GRP), which replaced the W(G, G − GRP) magnitudes used in DR2 (see Eq. (5) in Paper II).
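As a quick worked example, the Wesenheit magnitude defined above can be computed as follows. The function name and the input magnitudes are hypothetical; only the coefficient 1.90 comes from the text.

```python
def wesenheit(g, bp, rp, coeff=1.90):
    """W(G, GBP - GRP) = G - coeff * (GBP - GRP), with coeff = 1.90 as in the text."""
    return g - coeff * (bp - rp)


# Hypothetical mean magnitudes: G = 15.0, GBP = 15.4, GRP = 14.6.
w = wesenheit(15.0, 15.4, 14.6)  # 15.0 - 1.90 * 0.8
```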

To calculate the PL and PW relations we gathered Cepheids of all types known from the literature and used the SOS pipeline to analyse their light curves in the Gaia bands to obtain periods and intensity-averaged magnitudes in the G, GBP, and GRP bands (see Sect. 2.1 of Clementini et al. 2016, for details on how the SOS Cep&RRL pipeline determines these quantities). The calculation of the PL and PW relations required different approaches for the different subregions as specified in the following:

  1. LMC and SMC. For both galaxies, we adopted 9649 DCEPs and 262 ACEPs from Soszyński et al. (2017) for reference, while the T2CEPs (338 objects) were taken from Soszyński et al. (2018). We retrieved the DR3 time-series photometry of these stars and used the SOS Cep&RRL pipeline to derive periods and intensity-averaged magnitudes in the G, GBP, and GRP bands for the objects with more than 20 epochs (we only wanted good light curves to build the reference PL and PW relations). We discarded all objects for which the SOS and the literature periods did not agree to within 1%. After all these steps, we were left with the number of stars listed in the last column of Table 2. Linear PL and PW relations were derived from them using the Python LtsFit package (Cappellari et al. 2013), which has a robust outlier-removal procedure.

  2. M 31 and M 33. Given the faint apparent magnitude of the Cepheids in these distant galaxies, for reference we adopted the PL and PW relations that we calculated for the LMC (see above) – which have the lowest scatter – and simply re-scaled the zero points to take into account the difference in distance moduli between the LMC and M 31/M 33. For the latter, we adopted μM 31 = 24.40 mag (the typical value for the M 31 globular clusters, see Perina et al. 2009) and μM 33 = 24.57 mag (Conn et al. 2012). However, a different choice for the distance moduli of M 31 and M 33 would not affect our analysis and results, as we used rather large magnitude intervals (up to 0.6−0.8 mag) around the PL and PW relations to select the candidate Cepheids.

  3. All Sky. The first step consisted in collecting a reliable sample of Cepheids of all types in the MW. To this aim, we adopted the most updated lists of Cepheids available as of October 2020, namely Ripepi et al. (2019, all types); Skowron et al. (2019, only DCEPs); Soszyński et al. (2020, including DCEPs, ACEPs, and T2CEPs); and Chen et al. (2020, only DCEPs and T2CEPs, with the former not classified according to the pulsation mode and the latter in the different T2CEP subtypes). After removing overlaps between catalogues, we filtered the resulting list of objects adopting the Gaia EDR3 astrometry. In particular, we retained only objects with relative error on parallax better than 20% and RUWE < 1.41. This choice was driven by the need to clean the sample of contaminants, particularly binaries, which are easily spotted in the PW diagram as they are usually significantly subluminous compared to Cepheids. At the end of this procedure, we were left with a ‘clean’ sample of All Sky Cepheids, for which numbers divided into various types and/or modes are provided in the last column of Table 3. We note that the T2CEP sample only includes BLHER and WVIR stars, because the physical connection with RVTAU stars is questioned (see Introduction). Table 3 shows that for ACEPs, we have only four stars in each pulsation mode. Therefore, for ACEPs, we adopted the slope of the LMC PW relation and fitted only the zero point. For DCEPs and T2CEPs, the number of objects is instead sufficient to obtain good PWs. To preserve the symmetry of the uncertainties on the parallax as much as possible when fitting the relations, we adopted the astrometry-based luminosity (ABL; Feast & Catchpole 1997; Arenou & Luri 1999):

    $$ \begin{aligned} \mathrm{ABL} = 10^{0.2\,W} = 10^{0.2(\alpha +\beta \log P)} = \varpi \, 10^{0.2\,w-2}, \end{aligned} $$ (1)

    where W and w are the absolute and apparent Wesenheit magnitudes and ϖ is the parallax. The fitting procedure is similar to that adopted in Ripepi et al. (2019, 2022a) and is not repeated here. The resulting coefficients for the PW relations of All Sky Cepheids of different type and/or mode are summarised in Table 3.
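A short sketch may help to show why the ABL formalism is convenient: the observed side of Eq. (1) is linear in the parallax, so it remains defined for zero or negative parallaxes, unlike an absolute magnitude computed from a logarithm. The function names and the coefficients α = −2.7 and β = −3.3 below are purely illustrative assumptions (they are not the values of Table 3), and the parallax is assumed to be in mas.

```python
import math


def abl_observed(parallax_mas, w_apparent):
    """Right-hand side of Eq. (1): parallax (assumed in mas) times 10^(0.2 w - 2)."""
    return parallax_mas * 10.0 ** (0.2 * w_apparent - 2.0)


def abl_model(period_days, alpha, beta):
    """Model side of Eq. (1): 10^(0.2 (alpha + beta log P))."""
    return 10.0 ** (0.2 * (alpha + beta * math.log10(period_days)))


# Illustrative (made-up) PW coefficients, NOT the Table 3 values.
ALPHA, BETA = -2.7, -3.3

# A 10-day Cepheid at 1 kpc: parallax = 1 mas, distance modulus = 10 mag.
w_abs = ALPHA + BETA * math.log10(10.0)   # absolute Wesenheit magnitude
w_app = w_abs + 10.0                      # apparent Wesenheit magnitude
observed = abl_observed(1.0, w_app)
predicted = abl_model(10.0, ALPHA, BETA)

# A negative parallax still gives a finite (negative) ABL value.
abl_negative = abl_observed(-0.05, w_app)
```

In a fit, α and β would be adjusted so that the model side matches the observed side within the parallax uncertainties; that step is omitted here.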

Table 2.

Coefficients and scatter values of the PL and PW relations used for the sky regions including the LMC and SMC.

Table 3.

Same as in Table 2, but for All Sky Cepheids.

The PL and PW relations described above represent a fundamental tool of the Cepheid branch in the SOS Cep&RRL pipeline, as we use them for a first classification of the candidate Cepheids of different types and/or modes (see Papers I and II for full details on the pipeline). In practice, we define a band across each PL and PW relation, as ±n × σ, where σ is the dispersion of each relation. For DR3, we used 1σ for the ACEPs, 4σ (10σ for the ABL formalism) for the DCEPs, and 2σ for the T2CEPs. These values were calibrated using the LMC, SMC, and All Sky samples of known Cepheids defined above so as to minimise the overlap between contiguous variable types and modes, and at the same time maximise the number of correct classifications.
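The band selection can be sketched as a simple test on the distance of a star from a linear relation. The helper name and the coefficients used in the example are hypothetical; the n values are those quoted above for the magnitude-based relations (the 10σ used with the ABL formalism is not covered by this sketch).

```python
import math

# Band half-widths in units of sigma, as quoted in the text.
N_SIGMA = {"ACEP": 1.0, "DCEP": 4.0, "T2CEP": 2.0}


def in_band(period_days, magnitude, alpha, beta, sigma, n_sigma):
    """True if the star lies within +/- n_sigma * sigma of mag = alpha + beta * log10(P)."""
    expected = alpha + beta * math.log10(period_days)
    return abs(magnitude - expected) <= n_sigma * sigma


# Hypothetical PW relation: alpha = 15.9, beta = -3.3, sigma = 0.08 mag.
# A 10-day star is expected at 12.6 mag; the test star sits 0.2 mag away.
dcep_ok = in_band(10.0, 12.8, 15.9, -3.3, 0.08, N_SIGMA["DCEP"])   # within 4 sigma
acep_ok = in_band(10.0, 12.8, 15.9, -3.3, 0.08, N_SIGMA["ACEP"])   # outside 1 sigma
```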

3. Application of the SOS Cep&RRL pipeline to the DR3 data: cleaning of the sample

The Gaia DR3 data analysed by the SOS Cep&RRL pipeline consist of G and integrated GBP and GRP time-series photometry collected between 25 July 2014 and 28 May 2017, spanning a period of 34 months (for reference, DR2 was based on 22 months of observations). In addition to the time-series photometry, for DR3 we also analysed the RV time series (see Sartoretti et al. 2022, for the general procedures used to measure RV in Gaia) for a selected sample of 799 Cepheids of all types. Among these, 798 are Cepheids present in the vari_cepheid catalogue, while one object, previously classified as RR Lyrae, was found to be a DCEP_MULTI variable (source_id = 5861856101075703552) and is present in the vari_rrlyrae catalogue (see Clementini et al. 2023, for full details).

The general treatment of the light and RV curves and the processing steps that precede the SOS Cep&RRL pipeline are schematically summarised by Holl et al. (2018) and Eyer et al. (2023). In particular, the SOS Cep&RRL pipeline processed candidate Cepheids (and RR Lyrae stars) identified as such by the supervised classification of the general variability pipeline (see Eyer et al. 2023; Rimoldini et al. 2023, for details) with various probability levels. In order to maximise the number of DCEPs known from the literature that are recovered, we considered classification candidates that also have low probability levels. Among the Cepheid candidates, the SOS Cep&RRL pipeline only retained objects with at least 12 measurements in the G-band for analysis, while the RV time series were only processed for sources with seven RV measurements or more.

At the end of this first processing, we obtained a sample of about 1 million Cepheid candidates of all types. Among them, only about 5000 were in M 31 and M 33. To reduce the huge number of candidate Cepheids in the LMC, SMC, and particularly in the All Sky sample to more manageable numbers, we applied the following series of filters:

  1. Separation of known or suspected Cepheids in the literature. From the whole sample of Cepheid candidates, we separated sources that are known or suspected Cepheids of all types in the literature. This was done for each of the five subregions defined in Table 1. This first step was necessary to avoid filtering out possible good objects in the following cutting steps. The majority of the literature Cepheids were then validated by visual inspection as described in Sect. 4. For the known Cepheids in the LMC and SMC, we retained those mentioned in Sect. 2.2, while to the All Sky known Cepheids we added all the objects classified as Cepheids as of February 2021 in the SIMBAD database (available at CDS, Centre de Données astronomiques de Strasbourg, Wenger et al. 2000). For M 31 and M 33, we adopted the samples by Kodric et al. (2018) and Pellerin & Macri (2011), respectively. After eliminating overlaps, the overall literature sample within the one million candidates contains about 16 000 objects. These literature Cepheids were selected for visual inspection, with the exclusion of about 9000 Cepheids in the MCs, for which the OGLE classification is already reliable.

  2. Goodness of the light curves. We filtered the remaining sample based on uncertainties on the light curve parameters. More specifically, we applied the cuts listed in Table 4. This allowed us to filter out about 10% of the sources and, in particular, to reduce the All Sky sample to approximately 667 000 sources.

  3. Probability of the classifiers (LMC and SMC samples). As the number of candidates remaining from the previous steps was still too large for the LMC and SMC, we reconsidered the probability adopted in selecting Cepheid candidates from the classifiers of the general variability pipeline. Again adopting the highly reliable sample of literature objects in the MCs, for each classifier we calculated the probability threshold that recovers 95% of the known Cepheids. This procedure was very effective, leaving us with only about 2500 new Cepheid candidates in the two MCs.

  4. Filtering of aliasing periods (M 31 and M 33 samples). As discussed in Holl et al. (2023), instrumental effects produce false variable sources with typical periods that are strictly correlated with the position of the objects on the sky. These effects are particularly disturbing in the case of M 31 and M 33, given that for these galaxies we have only the G-band photometry for reference. Luckily, as the range in coordinates spanned by the M 31 and M 33 data is rather small, the aliases correlated with the position on the sky produce narrow peaks in period. A histogram of the periods reveals five and seven narrow period peaks in M 31 and M 33, respectively. Filtering out the stars in those intervals left us with 1923 candidate Cepheids in M 31 and 1332 stars in M 33 for further verification.

  5. Filtering on number of epochs, limiting magnitude, amplitude, and period (All Sky sample). As the All Sky sample resulting from the previous filtering was still too large, we applied the following further filtering: G < 19.0 mag, amp(G) > 0.15 mag, maximum period Pmax = 100 days and number of epochs in the G-band > 30. The selection on the number of epochs was motivated by the need to measure accurate periods, while the limits in magnitude and amplitude allowed us to significantly reduce the number of spurious variability detections caused by instrumental effects (see e.g. Holl et al. 2023) which are more likely among faint sources, whose GBP and GRP magnitudes are also in most cases not accurate. The cut in period is justified because very few Cepheids, that is, both DCEPs and RVTAU, are expected to exceed a period of 100 days. In the end, the above filtering left us with 166k candidates for further analysis.

  6. Machine learning filtering. While the sample in the MCs was small enough to be checked visually, the All Sky sample was still too large. We therefore applied an additional filtering based on machine learning techniques. We adopted a supervised classification method based on a reliable training set. To build the training set, we adopted a sample of Cepheids of all types similar to that described in Sect. 2.2 – but not limited in relative parallax error – to increase the statistics, including about 4100 objects in total. To this sample, we added about 2250 contaminants of different types, including RR Lyrae stars, long-period variables, eclipsing binaries, and so on, taken from objects for which the general classification pipeline assigns a very high probability of belonging to the given class. In addition, we verified that the vast majority of the contaminants were also known in the literature with a classification in agreement with that assigned by the classification pipeline. After establishing the training set, we defined the input attributes for the machine learning algorithm. Based on parameters that are already used by the SOS Cep&RRL pipeline, we adopted: the first periodicity, the second periodicity (if any), the absolute magnitudes in all bands, the absolute Wesenheit magnitudes (in G, GBP − GRP), the amplitudes in all bands, the amplitude ratios (amp(GBP)/amp(GRP); amp(GBP)/amp(G); amp(G)/amp(GRP)), colours (GBP − GRP; GBP − G; G − GRP) and the Fourier parameters (R21; R31; ϕ21; ϕ31). The classes fed to the algorithm were: ACEP_F, ACEP_1O, DCEP_F, DCEP_1O, DCEP_MULTI, BLHER, WVIR, RVTAU, and OTHER, where the last tag included all the non-Cepheid objects. To execute the machine learning procedure we used the H2O platform. After ingesting the training set, we divided it into training and validation sets in proportions of 85% and 15%, respectively. We then carried out several tests to find the best model for our case amongst those offered by the H2O package. The model that returned the highest precision in detecting the right classes and modes was found to be the XGBOOST algorithm. We applied this model to the sample of 166k All Sky candidate Cepheids returned by the selection described in point (5) above, obtaining a probability of belonging to one of the classes mentioned above for each candidate. A quick visual examination of samples of light curves for objects with a probability larger than 50% of being Cepheids of any type revealed that there were no reliable candidates with probability < 90%. We therefore considered only candidates with probability larger than 90%, giving a total of 10 273 sources. Finally, to further restrict the number of stars for visual inspection, we adopted the peak-to-peak amplitudes, requiring that: 1.3 ≤ amp(GBP)/amp(GRP) ≤ 2.0; 1.1 ≤ amp(G)/amp(GRP) ≤ 1.5 and 100 × σG/amp(G) ≤ 2.0. These broad limits include the large majority of bona fide Cepheids according to tests carried out on the training set adopted for the machine learning procedure. After applying this last filtering, we were left with 7349 stars for subsequent visual inspection.
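The numerical cuts applied to the All Sky sample in steps 5 and 6 can be summarised in a small sketch. The `Candidate` record and its field names are hypothetical (they are not archive column names), σG is simply taken as a given input, and the period cut is read as P ≤ 100 days; the thresholds themselves are those quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for the quantities used by the cuts."""
    g_mag: float         # mean G magnitude
    amp_g: float         # peak-to-peak amplitude in G
    amp_bp: float        # peak-to-peak amplitude in GBP
    amp_rp: float        # peak-to-peak amplitude in GRP
    period_days: float
    n_epochs_g: int
    sigma_g: float       # sigma_G entering the 100 * sigma_G / amp(G) cut
    prob_cepheid: float  # classifier probability of being a Cepheid


def passes_basic_cuts(c):
    """Step 5: magnitude, amplitude, period, and number-of-epochs limits."""
    return (c.g_mag < 19.0 and c.amp_g > 0.15
            and c.period_days <= 100.0 and c.n_epochs_g > 30)


def passes_ml_and_amplitude_cuts(c):
    """Step 6: probability above 90% plus the amplitude-based limits."""
    return (c.prob_cepheid > 0.90
            and 1.3 <= c.amp_bp / c.amp_rp <= 2.0
            and 1.1 <= c.amp_g / c.amp_rp <= 1.5
            and 100.0 * c.sigma_g / c.amp_g <= 2.0)


# Two made-up candidates: a pulsator-like star and one with amplitude ratios near unity.
good = Candidate(15.2, 0.60, 0.75, 0.45, 5.4, 45, 0.005, 0.97)
binary_like = Candidate(15.2, 0.40, 0.41, 0.40, 0.8, 45, 0.005, 0.95)
```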

Table 4.

Constraints on the results from the light-curve fitting.

In summary, at the end of the whole filtering procedure described in this section, we were left with about 20 100 Cepheids for subsequent visual inspection in order to validate the classification provided by the SOS Cep&RRL pipeline.
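The classifier threshold used in step 3 of the list above (the probability that recovers 95% of the known Cepheids) can be sketched as follows. This is one possible reading of that step; the function name and the toy probabilities are assumptions.

```python
import math


def threshold_for_recall(known_probs, recall=0.95):
    """Largest probability threshold that keeps at least `recall` of the known objects."""
    ranked = sorted(known_probs, reverse=True)
    n_keep = math.ceil(recall * len(ranked))
    return ranked[n_keep - 1]


# Toy classifier probabilities for 20 known Cepheids (hypothetical values).
probs = [i / 20 for i in range(1, 21)]
thr = threshold_for_recall(probs)
recovered = sum(p >= thr for p in probs) / len(probs)
```

New candidates would then be retained only if their probability from the same classifier is at least `thr`.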

4. Correction of the SOS Cep&RRL pipeline classification

As mentioned in the previous section, a number of sources were selected for further inspection to verify their classification. Different procedures were adopted for the LMC/SMC, M 31/M 33, and All Sky samples because of the different characteristics of the available data. More specifically, for the LMC and SMC, the literature samples have a robust classification and we already knew from DR2 that the SOS Cep&RRL pipeline provides reliable classifications for the Cepheids in these two galaxies. For this reason, we did not visually check the known Cepheids in the MCs, but only the new candidates. On the contrary, the classification of Cepheids in M 31 and M 33 required careful validation because of the low signal-to-noise ratio (S/N) of the Gaia data and the much less established literature for Cepheids in these galaxies. Concerning the All Sky sample, the literature sample is likely contaminated by both non-Cepheids and incorrect classifications (i.e. incorrect Cepheid types and pulsation modes) because their classification in these two respects does not rely on solid distances but mainly on the analysis of the light curve shapes. For this reason, we checked all the known Cepheids in addition to the new candidates for the All Sky sample.

4.1. Visual inspection of the Gaia DR3 light curves

In general, for each star, we evaluated the shape of the light curves in all Gaia bands, the position in the period–Fourier parameters diagrams (P − R21; P − R31; P − ϕ21; P − ϕ31), the position on the PL and PW relations, and the amplitude ratios amp(GBP)/amp(GRP) and amp(G)/amp(GRP). In the case of negative parallax values, the ABL function was used. We adopted the very useful ‘OGLE atlas of light curves’ as reference for the shapes of the light curves of Cepheids of all types. For Cepheids in M 31 and M 33, we only have the G-band light curves and the position in the PL relation for reference, as for G ∼ 20 − 21 mag the GRP and GBP magnitudes are often missing or totally unreliable, hence the Wesenheit relation was not usable in most cases.

In the All Sky sample, the main difficulty was distinguishing DCEP_1O from first-overtone RR Lyrae stars with periods shorter than 0.4 days whenever light curves were not very well defined and parallaxes had relative errors greater than 10%−20%. Similarly, in some cases, it was difficult to separate DCEP_1O and ACEP_1O with periods ∼0.7−0.8 days. Even more challenging was distinguishing ACEP_F from ab-type RR Lyrae for periods of around 0.6 days and from DCEP_F in the period range 1.0−1.4 days. These difficulties arose mainly from the very similar shape of the light curves for these types of variable stars, which can only be distinguished based on fine details of the light curves, such as humps and bumps, which are not always clearly visible. Also, WVIR and DCEP_F can be confused when light curves are noisy and parallaxes inaccurate. In all these cases, the Fourier parameters also provide ambiguous results because they stem directly from the light curve shape.

A main source of contamination is given by contact binary stars, whose light curves mimic those of the overtone Cepheids and, to some extent, also those of the WVIR variables. To mitigate this problem, we always inspected the light curves folded according to once and twice the period provided by the SOS Cep&RRL pipeline. In this way, it was often possible to identify stars for which there was a small but detectable difference between the light curve minima. This check, in conjunction with the amplitude ratios amp(GBP)/amp(GRP) and amp(G)/amp(GRP) – which for binaries tend to assume values close to unity (see Sect. 4 in Paper II), while much larger for pulsating stars (see e.g. Table 4 in Ripepi et al. 2019) –, allowed us to detect and reject the large majority of potential contaminants that are contact binaries.
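The check on the light curve folded at twice the period can be mimicked with a minimal sketch. It is only an illustration of the idea, not the procedure used for the visual inspection: the function names and the noiseless toy light curves are assumptions, and real data would require noise-aware estimates of the minima.

```python
import math


def fold(times, period, epoch=0.0):
    """Return phases in [0, 1) for the given period and reference epoch."""
    return [((t - epoch) / period) % 1.0 for t in times]


def minima_depth_difference(times, mags, period):
    """Fold at twice `period` and compare the faintest magnitudes of the two minima."""
    # Use the faintest point (largest magnitude) as reference epoch, so that
    # one minimum sits at phase 0 and the other, if any, near phase 0.5.
    epoch = times[mags.index(max(mags))]
    phases = fold(times, 2.0 * period, epoch)
    near_zero = [m for ph, m in zip(phases, mags) if ph < 0.25 or ph >= 0.75]
    near_half = [m for ph, m in zip(phases, mags) if 0.25 <= ph < 0.75]
    return abs(max(near_zero) - max(near_half))


times = [0.37 * k for k in range(200)]
period = 0.4  # period returned by the (hypothetical) period search

# Toy pulsator: strictly periodic with `period`.
pulsator = [12.0 + 0.14 * math.cos(2 * math.pi * t / period) for t in times]
# Toy contact binary: true period is 2 * `period`, with unequal minima.
binary = [12.0 + 0.14 * math.cos(2 * math.pi * t / period)
          + 0.02 * math.cos(2 * math.pi * t / (2 * period)) for t in times]

diff_pulsator = minima_depth_difference(times, pulsator, period)
diff_binary = minima_depth_difference(times, binary, period)
```

For the toy pulsator the two minima have the same depth, whereas the toy binary shows a difference of about 0.04 mag between alternate minima.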

During visual inspection, many objects classified as DCEP_1O variables by the SOS Cep&RRL pipeline were found to show larger scatter than other sources of the same magnitude, leading us to suspect they might be missed multi-mode objects. As discussed in detail in Sect. 4.2, we searched for secondary periodicities in the light curves of the stars in this sample, finding that many of them are actually multi-mode pulsators. A large fraction of them were missed simply because of the overly strict constraint on the number of epochs and scatter in the light curves introduced in the SOS Cep&RRL pipeline (see point 2 of Sect. 2.1), which allowed us to minimise the number of spurious detections but at the same time also prevented us from detecting many genuine multi-mode pulsators.

4.2. Multi-mode Cepheids

DCEPs in the All Sky sample that, during visual inspection, were suspected to be multi-mode pulsators were further investigated by analysing their light curves with software external to the SOS Cep&RRL pipeline. In particular, we used the Period04 package (Lenz & Breger 2005) for a first selection of the most promising candidates and to determine their periodicities. We then used a custom program written in Python to carry out the non-linear fitting with truncated Fourier series, the prewhitening of the first periodicity, and then the fitting of all periodicities together. In close similarity with the SOS Cep&RRL pipeline, we finally determined the period uncertainties with a bootstrap procedure. The re-processing led to the identification of 109 DCEP_MULTI variables in addition to the 86 DCEP_MULTI for which the SOS Cep&RRL pipeline provided the correct classification. The list of additional DCEP_MULTI variables and their periods are provided in Table 5 with relative errors. The Petersen diagram (period ratios vs longer period) for the DCEP_MULTI in the All Sky sample is shown in Fig. 1, where the loci occupied by the different period ratios are taken from Soszyński et al. (2020).
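The sequence of fitting, prewhitening, and joint fitting can be illustrated with a short sketch. It is not the custom program used here: for simplicity it replaces the non-linear fit with a linear least-squares fit at frequencies taken from a trial grid, and the light curve, periods, amplitudes, and function names are all hypothetical.

```python
import numpy as np

def harmonics(times, freq, n_harm):
    """Cosine and sine columns for the first n_harm harmonics of freq."""
    t = np.asarray(times, dtype=float)
    cols = []
    for k in range(1, n_harm + 1):
        cols.append(np.cos(2 * np.pi * k * freq * t))
        cols.append(np.sin(2 * np.pi * k * freq * t))
    return np.column_stack(cols)

def fit_fixed(times, mags, freqs, n_harm=2):
    """Least-squares fit of truncated Fourier series at fixed frequencies."""
    t = np.asarray(times, dtype=float)
    blocks = [np.ones((t.size, 1))] + [harmonics(t, f, n_harm) for f in freqs]
    design = np.hstack(blocks)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(mags, dtype=float), rcond=None)
    return coeffs, design @ coeffs

def search(times, mags, grid, n_harm=1):
    """Frequency on the grid that minimises the residual sum of squares."""
    mags = np.asarray(mags, dtype=float)
    rss = [np.sum((mags - fit_fixed(times, mags, [f], n_harm)[1]) ** 2) for f in grid]
    return float(grid[int(np.argmin(rss))])

# Synthetic double-mode light curve (illustrative values only).
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 400.0, 150))
mag = (12.0 + 0.30 * np.cos(2 * np.pi * t / 3.00)
       + 0.10 * np.cos(2 * np.pi * t / 2.13)
       + rng.normal(0.0, 0.01, t.size))

grid = np.arange(0.2, 0.6, 1e-4)              # trial frequencies in cycles per day
f1 = search(t, mag, grid)                     # dominant periodicity
residual = mag - fit_fixed(t, mag, [f1])[1]   # prewhitening of the first periodicity
f2 = search(t, residual, grid)                # secondary periodicity
_, model = fit_fixed(t, mag, [f1, f2])        # all periodicities fitted together
print(1 / f1, 1 / f2, np.std(mag - model))
```

A real analysis would refine the frequencies with a non-linear optimiser and estimate their uncertainties, for example with a bootstrap, as described in the text.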

Fig. 1.

Petersen diagram for confirmed DCEP_MULTI objects published in the Gaia DR3 catalogue (red filled circles) and for additional DCEP_MULTI objects detected in the re-processing of the data (blue filled circles). PL and PS represent the longest and shortest pulsation periods of the multi-mode object. Labels show the typical location of the different multi-mode pulsation combinations identified in these sources. Black squares mark six objects known in the literature as ARRDs (see Sect. 6.4).
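For reference, the coordinates of a star in the Petersen diagram follow directly from its two periods. The helper below is a hypothetical illustration (the function name and example periods are not taken from the catalogue).

```python
def petersen_point(period_a, period_b):
    """Return (P_L, P_S / P_L) for two pulsation periods given in days."""
    p_long = max(period_a, period_b)
    p_short = min(period_a, period_b)
    return p_long, p_short / p_long

# Hypothetical double-mode pulsator with periods of 3.00 and 2.13 days.
p_long, ratio = petersen_point(2.13, 3.00)
print(p_long, round(ratio, 3))
```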

Table 5.

Re-processing of the Gaia data for DCEP_MULTI objects not detected as such by the SOS Cep&RRL pipeline.

4.3. Final classification

The processing of the SOS pipeline along with the validation, cleaning, and re-classification procedures described in the previous sections produced a final catalogue of 15 021 Cepheids of all types, which populate the vari_cepheid table in the Gaia DR3 archive. Despite our efforts to clean the sample from spurious objects, after a deeper analysis, 15 sources turned out to be non-Cepheid variables (these objects are listed with the new classification in Table 6), bringing the total number of bona fide Cepheids of all types in Gaia DR3 to 15 006.

Table 6.

Reclassification of objects incorrectly classified by the SOS Cep&RRL pipeline.

In total, we changed the SOS Cep&RRL classification of 1160 stars. This corresponds to about 8% of the total sample. The new classifications are given in Table 6. Taking into account all re-classifications, in Table 7 we report the breakdown of the DR3 Cepheids by type in the different subregions in which we divided our sample.

Table 7.

Number and type or mode classification of Cepheids confirmed by the SOS Cep&RRL pipeline and published in Gaia DR3.

Comparison with the literature, which is discussed in more detail in Sect. 6.1, along with a cross-match with the SIMBAD database⁵ (Wenger et al. 2000), allowed us to calculate the number of Cepheids of any type already known in the literature, the number that are classified as variables but of non-Cepheid type, and the number of new discoveries. The result of this exercise is reported in the last line of Table 7. The largest number of new or reclassified objects belongs to the All Sky sample, but we note that many new Cepheids were also discovered in M 31 and M 33.

5. Properties of the Cepheids in the Gaia DR3

A summary of the parameters provided by the SOS Cep&RRL pipeline that form the entries of the vari_cepheid table is provided in Table 8. In the following subsections we describe the main properties of the Cepheids in Gaia DR3.

Table 8.

Links to Gaia archive table to retrieve the pulsation characteristics: period(s), epochs of maximum light and minimum radial velocity (E), peak-to-peak amplitudes, intensity-averaged mean magnitudes, mean radial velocity, ϕ21, R21, ϕ31, R31 Fourier parameters with related uncertainties and metallicity computed by the SOS Cep&RRL pipeline for the 15 021 objects (15 006 Cepheids and 15 stars of different type) released in Gaia DR3.

Examples of light and RV curves for DCEPs of different pulsation modes are shown in Fig. A.1. Similarly, Fig. A.2 displays the Gaia time series for the prototypes of the T2CEP classes, namely BL Her, W Vir, and RV Tau. Finally, Fig. A.3 shows the light and RV curves for ACEP_F and ACEP_1O variables.

5.1. Number of epochs

An important quantity affecting the quality of the results is the number of epochs in the light and RV curves. This feature strictly depends on the position of a specific object in the sky, as the Gaia scanning law is extremely non-uniform (see Gaia Collaboration 2016a). The more epochs available for the analysis of the time series, the more precise the determination of the periods, amplitudes, and so on. Figure 2 shows histograms with the number of epochs in G band for each subsample (the number of epochs in GBP and GRP provides similar distributions). Restricted regions in the sky such as the SMC, M 31, and M 33 show narrower intervals of epochs than both the All Sky and LMC samples. The latter shows an extended tail with many DCEPs having more than 140 epochs because they are located in the region of the EPSL (Ecliptic Pole Scanning Law) which was covered continuously during the first 28 days of the Gaia mission (see Gaia Collaboration 2016b). Unfortunately, for M 31 and M 33, which are the most difficult subregions because of the large distance, the number of epochs is small (less than 40 on average for M 31), making it difficult to study the Cepheids in these systems.

Fig. 2.

Number of epochs in the G-band time series. From top to bottom, the different panels show the data for the different subsamples corresponding to the five regions of the sky defined in Sect. 2.

Concerning the RVs, the number of useful epochs for the Cepheids with RV time series published in DR3 is displayed in Fig. 3. There are 15 and 9 DCEPs with RV time series in the LMC and SMC, respectively. The rest of the objects belong to the All Sky sample.

Fig. 3.

Number of epochs in the RV time series for the labelled subsamples.

5.2. Spatial distribution

The spatial distribution of Cepheids of different types in the All Sky sample is shown in Fig. 4. The different distributions reflect the progenitor stellar populations of the different types: DCEPs are concentrated in the Galactic disc, as expected for a young population⁶; ACEPs, which are intermediate-age objects, are preferentially located in the Galactic halo; T2CEPs are present in almost all Galactic components, namely disc, thick disc, halo, and bulge, where they are more concentrated. The spatial distributions of the LMC and SMC Cepheids are shown in Fig. 5. In these galaxies too, the DCEPs trace the young populations inhabiting the LMC bar and the spiral arms (see e.g. Ripepi et al. 2022b, and references therein) as well as the body and the wing of the SMC (see e.g. Ripepi et al. 2017, and references therein). The spatial distributions of ACEPs and T2CEPs are more sparse and connected with the spheroids describing the intermediate-old populations in both galaxies (see e.g. Gaia Collaboration 2021b).

Fig. 4.

Map in Galactic coordinates of the different Cepheid types in the MW. The objects are colour coded according to their apparent G magnitude.

Fig. 5.

Map of the different Cepheid types in the MCs. The objects are colour coded according to their apparent G magnitude. The map is a zenithal equidistant projection centred at equatorial coordinates RA, Dec = 56.0, −73.0 deg (J2000).

Figure 6 shows the spatial distribution of the Cepheids in M 31 and M 33. In this case, we mostly find DCEPs, except for two RVTAU stars detected in M 31. The spatial distribution of the M 31 DCEPs closely follows the galaxy spiral arms, where young stars are expected, while the DCEP distribution in M 33 is less ordered because of the different morphology of the galaxy and the different viewing angle from the Sun.

Fig. 6.

Map of the DCEPs in M 31 (top panel) and M 33 (bottom panel). The symbols are colour coded based on the apparent G magnitude of the DCEPs. The two black crosses identify two RVTAU stars in M 31. The maps are in zenithal equidistant projection centred at equatorial coordinates (RA, Dec)M 31 = 10.6, 41.2 deg (J2000) and (RA, Dec)M 33 = 23.5, 30.65 deg (J2000).

5.3. Fourier parameters

An important product of the SOS Cep&RRL pipeline is the set of Fourier parameters R21, R31, ϕ21, and ϕ31, which are a valuable tool for distinguishing the different types of variables. The Fourier parameters for the Cepheids in the All Sky sample are shown in Fig. 7, separated in different panels for DCEPs, ACEPs, and T2CEPs in the interest of clarity. The different distributions occupy the expected location for each variable type, confirming the efficacy of our classification. The same considerations are valid for the LMC and SMC, as shown in Figs. B.1 and B.2. In the cases of M 31 and M 33 (Figs. B.3 and B.4), the Fourier parameters show a less clear morphology because the light curves are mostly noisy, as we are analysing objects with magnitudes at the limit of Gaia's capabilities. Nevertheless, it is remarkable that, especially for M 31, the morphology of the P − R21 and P − ϕ21 relations is similar to that displayed by the much closer All Sky, LMC, and SMC samples.
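As a reminder of how these quantities are built, the sketch below uses the standard definitions R_k1 = A_k/A_1 and ϕ_k1 = ϕ_k − kϕ_1 for a light curve written as A_0 + Σ A_k cos(2πk f t + ϕ_k). The cosine convention, the function name, and the example amplitudes and phases are assumptions; the convention adopted by the SOS Cep&RRL pipeline may differ, which would shift the phase differences.

```python
import numpy as np

def fourier_parameters(amps, phases):
    """Amplitude ratios R_k1 and phase differences phi_k1 for k = 2, 3, ...

    amps   : A_1, A_2, ... (harmonic amplitudes)
    phases : phi_1, phi_2, ... (harmonic phases in radians)
    Phase differences are wrapped into [0, 2*pi).
    """
    amps = np.asarray(amps, dtype=float)
    phases = np.asarray(phases, dtype=float)
    order = np.arange(2, amps.size + 1)
    ratios = amps[1:] / amps[0]
    phase_diffs = (phases[1:] - order * phases[0]) % (2 * np.pi)
    return ratios, phase_diffs

# Hypothetical three-harmonic fit.
ratios, phase_diffs = fourier_parameters([0.40, 0.10, 0.04], [1.0, 2.5, 4.0])
print("R21, R31 =", ratios)
print("phi21, phi31 =", phase_diffs)
```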

Fig. 7.

Fourier parameters for the All Sky sample. From top to bottom the different panels show the results for DCEPs, ACEPs, and T2CEPs, respectively.

5.4. PL and PW diagrams

Figure 8 shows the PW relations for the All Sky sample, separately for the different Cepheid types and modes. These relationships were adopted by the SOS Cep&RRL pipeline to select and classify the different types of Cepheids, as discussed in Sect. 2.2⁷. There is a large scatter in Fig. 8 because we also plot objects with very large parallax errors (pulsators with negative parallaxes cannot be shown in the figure). Much better defined PW relationships are obtained by plotting only objects with a relative parallax error smaller than 20%, as shown in Fig. 9.
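The quantities plotted in these diagrams can be sketched as follows. The colour coefficient of 1.90 is an assumption taken from the Gaia-band Wesenheit definition of Ripepi et al. (2019) and is not stated in this section; the function names and example values are hypothetical. The absolute Wesenheit magnitude follows from the distance modulus with the parallax ϖ in mas, W_abs = W + 5 log10(ϖ) − 10, which is undefined for non-positive parallaxes.

```python
import numpy as np

COLOUR_COEFF = 1.90  # assumed coefficient (Ripepi et al. 2019); not given in this section

def apparent_wesenheit(g, bp, rp, coeff=COLOUR_COEFF):
    """Wesenheit magnitude W = G - coeff * (G_BP - G_RP)."""
    g, bp, rp = (np.asarray(x, dtype=float) for x in (g, bp, rp))
    return g - coeff * (bp - rp)

def absolute_wesenheit(g, bp, rp, parallax_mas):
    """Absolute W from the parallax in mas; NaN where the parallax is not positive."""
    plx = np.asarray(parallax_mas, dtype=float)
    with np.errstate(invalid="ignore", divide="ignore"):
        shift = 5.0 * np.log10(plx) - 10.0
    return np.where(plx > 0, apparent_wesenheit(g, bp, rp) + shift, np.nan)

def precise_parallax(parallax_mas, error_mas, max_rel_err=0.2):
    """True where the parallax is positive and its relative error is below the cut."""
    plx = np.asarray(parallax_mas, dtype=float)
    err = np.asarray(error_mas, dtype=float)
    return (plx > 0) & (err < max_rel_err * plx)

# Hypothetical stars: the second has a negative parallax and cannot be plotted.
w_abs = absolute_wesenheit([10.0, 10.0], [10.5, 10.5], [9.5, 9.5], [1.0, -0.5])
keep = precise_parallax([1.0, 1.0, -0.5], [0.1, 0.3, 0.1])
print(w_abs, keep)
```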

Fig. 8.

PW relation for the All Sky sample. The different types and modes of the Cepheids displayed in the figures are labelled in each panel.

Fig. 9.

Same as in Fig. 8 but restricting the sample to objects with σϖ/ϖ< 0.2.

Contrary to the All Sky sample, for the LMC and SMC, we can use the PL relations in the G band in addition to the PW relations, as the reddening in these galaxies is in general rather low and approximately constant over each galaxy. The PL diagrams are shown in Figs. C.1 and C.2 for the LMC and SMC, respectively. Both the PL and the PW diagrams are well defined, especially in the LMC, while the large depth along the line of sight significantly increases the dispersion in the SMC (see Ripepi et al. 2017, and references therein).

5.5. Colour–magnitude diagrams

Colour–magnitude diagrams (CMDs) for the Cepheids in all the subregions are shown in Figs. 10 and D.1–D.3. The CMDs for the All Sky sample show very large dispersions, as the reddening along the disc and the bulge – where most of the DCEPs and T2CEPs reside – can be of several magnitudes. Not surprisingly, the dispersion of ACEPs is smaller, as the majority of these objects are situated in the halo, where reddening is on average rather low.

Fig. 10.

CMD of the All Sky Cepheid sample.

The MCs have approximately constant and low reddening, so the CMDs of the Cepheids in these galaxies are more informative, with the DCEP_1O clearly bluer than the DCEP_F, as expected. The ‘spur’ of LMC DCEPs of both modes extending up to GBP − GRP ∼ 1.5 mag reminds us that in the LMC there are regions with high reddening values. The range in colours spanned by ACEPs and different types of T2CEPs reflects their locations in the instability strip. The CMDs of M 31 and M 33 DCEPs are shown only for completeness, as the colours are totally unreliable in most cases.

5.6. Period–amplitude diagrams

Figures 11 and E.1–E.3 display the period versus amplitude in the G band (P-Amp(G)) relations for the different subregions and Cepheid types. The morphology of these plots for DCEPs in the All Sky and MC samples is as expected from the literature (see e.g. Ripepi et al. 2017, 2022b, for the SMC and LMC, respectively). The DCEP_1O, as well as most of the DCEP_MULTI objects, have Amp(G) < 0.5 mag, while the DCEP_F objects show the characteristic double peak at periods of 2−3 days and 11−12 days in the All Sky and LMC samples. The P-Amp(G) distribution in the SMC is instead significantly different: the DCEP_1Os show larger amplitudes and the first peak of the DCEP_F pulsators occurs at shorter periods and larger amplitudes than in the All Sky and LMC samples, while the second peak is only barely visible with much smaller amplitudes than the first one, again in contrast with the All Sky and LMC samples. All these differences are most likely due to the much lower metallicity of the SMC DCEPs with respect to the MW and LMC samples (see e.g. De Somma et al. 2023). The P-Amp(G) diagrams for the DCEPs in the M 31 and M 33 galaxies appear rather different from the other samples. This is mainly because only a handful of stars with periods shorter than 10 days were detected in these galaxies, which means that the first amplitude peak for DCEP_F is completely missed. Instead, we observe the second peak, at least in M 31, but shifted to about P ∼ 30 days. However, this feature requires confirmation, as in M 31, Gaia is operating at the extreme limits of its capabilities.

Fig. 11.

Period–amplitude (G) diagram for the All Sky sample.

The P-Amp(G) distributions of ACEPs and T2CEPs are also very interesting: (i) as expected, ACEP_1O objects have smaller amplitudes than those of ACEP_F; (ii) at periods in the range 1−2 days, ACEP_F can reach significantly higher amplitudes than both DCEP_F and BLHER, providing us with an additional tool to distinguish them from the different Cepheid types; and (iii) the period separation between different T2CEP types also corresponds to a difference in amplitude, meaning that the WVIR stars have a minimum and a maximum at the extreme periods characterising this class. These features are clearly visible in the data of the All Sky survey because of the large sample size, but are also clearly discernible in the LMC, while in the SMC the paucity of T2CEPs prevents any conclusions.

5.7. Radial velocities

One of the new products of Gaia DR3 is the publication of time-series RV data. The final catalogue of Cepheids of all types includes 799 objects for which RV time series are released. The SOS Cep&RRL pipeline only obtained average RV and peak-to-peak amplitude values for 786 objects, as for 13 objects the number of epochs is smaller than seven, which is the minimum required for the RV curve fitting. In total, the time series are released for 582 DCEP_F, 133 DCEP_1O, 14 DCEP_MULTI, 12 BLHER, 35 WVIR, 17 RVTAU, 3 ACEP_F, and 2 ACEP_1O pulsators. Among the DCEP_Fs, 15 and 9 objects belong to the LMC and SMC, respectively. In addition to the time series, median RV values calculated by the general RV data processing in Gaia (Sartoretti et al. 2022) are published for 3190 Cepheids of all types in the gaia_source table. As shown in Fig. 12, there is excellent agreement between the two estimates for the 736 stars in common between the two samples (see Clementini et al. 2023, for further details). Indeed, the median and mean differences between the two average values are 0.43 and 0.33 km s−1, respectively, with a standard deviation of 6.40 km s−1.
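The summary statistics quoted for this comparison are simple to reproduce once the two RV estimates are matched star by star. The sketch below uses hypothetical velocities and a hypothetical function name; it does not query the Gaia archive.

```python
import numpy as np

def compare_estimates(values_a, values_b):
    """Median, mean, and sample standard deviation of the paired differences a - b."""
    diff = np.asarray(values_a, dtype=float) - np.asarray(values_b, dtype=float)
    return {
        "median": float(np.median(diff)),
        "mean": float(np.mean(diff)),
        "std": float(np.std(diff, ddof=1)),
    }

# Hypothetical average RVs (km/s) from two pipelines for four stars in common.
rv_sos = [10.0, -5.0, 30.0, 2.0]
rv_source = [9.0, -5.5, 31.0, 1.5]
stats = compare_estimates(rv_sos, rv_source)
print(stats)
```

The same function applies unchanged to any other pair of matched estimates, for instance photometric and spectroscopic metallicities.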

Fig. 12.

Comparison between the average RV calculated by the SOS Cep&RRL pipeline from fitting the RV curves and the mean values published in the gaia_source table (see Sartoretti et al. 2022, for details).

The spatial distributions of Cepheids with average RV values from both the general and the SOS Cep&RRL pipelines are shown in Fig. 13 and are colour coded according to the RV values. As expected, the objects lying in the disc (mainly DCEPs, see Gaia Collaboration 2023b, for an example of exploitation of these data) show low values of RV, while the halo Cepheids show both highly positive and negative RV values. The LMC and SMC are clearly identified by the RV values shared by all stars belonging to the two galaxies.

Fig. 13.

RV maps defined by the 3190 Cepheids in the DR3 gaia_source table (top panel) and 786/799 Cepheids in the DR3 vari_cepheid table (bottom panel).

The uncertainties measured by the SOS Cep&RRL pipeline on the average RV (⟨RV⟩) and on the RV peak-to-peak amplitude (Amp(RV)) are shown in Fig. 14. The typical uncertainties on ⟨RV⟩ are on the order of 1−1.5 km s−1, as expected (see Clementini et al. 2023). However, there are a few objects showing large errors as measured by the bootstrap procedure. These cases are often correlated with the low number of RV epochs available for these Cepheids (see Fig. 3). Similarly, the typical uncertainty is ∼3−4 km s−1 for the Amp(RV), but there are a few objects with uncertainties larger than 30−40 km s−1, which can be an indication of unreliable Amp(RV) values. This is verified in Fig. 15, where, in analogy to the photometry, we show the relation between amplitude in RV and period. The general trend closely follows that shown from photometry, with DCEP_F objects having larger amplitudes than DCEP_1O or DCEP_MULTI objects and showing the typical bell shape starting from a minimum amplitude at a period of ∼9 days and a maximum at ∼20 days. The figure shows that despite the large uncertainties in the Amp(RV) of some objects, only a few Cepheids appear out of their expected position in this plot. We conclude that the RV amplitudes calculated by the SOS Cep&RRL pipeline are generally reliable.
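The link between few epochs and large bootstrap errors can be illustrated with a simplified sketch. The pipeline bootstraps the fit to the RV curve; here, as an assumption made for brevity, the statistic is just the mean of the measurements, and the velocities and function name are hypothetical.

```python
import numpy as np

def bootstrap_error(values, statistic=np.mean, n_boot=1000, seed=0):
    """Standard deviation of `statistic` over resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    samples = rng.choice(values, size=(n_boot, values.size), replace=True)
    return float(np.std([statistic(sample) for sample in samples], ddof=1))

# Seven hypothetical RV measurements (km/s) and a longer series with the same scatter.
few_epochs = np.array([-10.0, -5.0, 0.0, 5.0, 10.0, 3.0, -3.0])
many_epochs = np.tile(few_epochs, 6)

err_few = bootstrap_error(few_epochs)
err_many = bootstrap_error(many_epochs)
print(err_few, err_many)  # the shorter series gives the larger uncertainty
```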

Fig. 14.

Uncertainties on the average and peak-to-peak RV values measured by the SOS Cep&RRL pipeline for a sample of 786 Cepheids.

Fig. 15.

Period–amplitude (RV) for the 786 Cepheids whose RV curves were analysed by the SOS Cep&RRL pipeline. The different Cepheid types are labelled. The size of the circles surrounding the symbols is proportional to the uncertainty in Amp(RV) (see also Fig. 14).

5.8. Metallicities

An additional product of the SOS Cep&RRL pipeline is the set of photometric iron abundances inferred from the Fourier parameters R21 and R31 according to the calibration by Klagyivik et al. (2013), which is valid for DCEP_Fs with periods shorter than 6.3 days and for an interval of metallicity reaching the average [Fe/H] values of the LMC and SMC DCEPs (see Clementini et al. 2019, for details). As the metallicity estimates rely on the R21 and R31 Fourier parameters, which sometimes have large errors calculated with the bootstrap technique, we suggest using the [Fe/H] values with uncertainties larger than ∼0.5 dex with care. The catalogue includes a total of 5265 DCEP_Fs with [Fe/H] estimates. However, as we have changed the classification (see Sect. 4) for the 142 objects reported in Table 6, some of these objects are no longer DCEP_Fs, and therefore their metallicity estimates are incorrect and should not be used. The DCEP_Fs with an [Fe/H] estimate number 1053, 1882, 2174, 7, and 7 in the All Sky, LMC, SMC, M 31, and M 33 samples, respectively (5123 in total, that is, the 5265 catalogue entries minus the 142 re-classified objects). The distribution of the metallicities in the SMC, LMC, and All Sky samples is shown in Fig. 16. The figure shows that, as expected, the DCEPs in the All Sky sample (exclusively MW objects) are, on average, more metal rich than the LMC ones, which in turn are more metal rich than those in the SMC. From a quantitative point of view, we can see that the peak of the All Sky distribution is [Fe/H] ∼ +0.05 dex, which is in general agreement with the literature (see e.g. Ripepi et al. 2019). In contrast, the LMC and SMC distributions peak at approximately −0.2 dex and −0.3 dex, respectively. These values are significantly larger than those found in the literature, namely [Fe/H]LMC = −0.41 dex (σ = 0.08 dex; Romaniello et al. 2022) and [Fe/H]SMC = −0.75 dex (σ = 0.08 dex; Romaniello et al. 2008). Therefore, the photometric metallicities are not particularly reliable for metallicity values lower than [Fe/H] ∼ −0.3 dex, which is not unexpected as the work by Klagyivik et al. (2013) relies on very few calibrators in this metallicity range.

Fig. 16.

Photometric metallicities in the LMC, SMC, and All Sky samples.

For M 31 and M 33, the PL relations are more accurate than the PW relations because the magnitudes in the GBP and GRP bands, where available, are less accurate than those in the G band, which leads to much greater dispersion in the PW relations (see Fig. C.3). The PL relations for both M 31 and M 33 show a remarkable linearity up to about G ∼ 21 mag.

We can perform a more detailed comparison between the photometric metallicities from the SOS Cep&RRL pipeline and the literature by cross-matching the All Sky sample with the list of DCEPs that have metallicities measured from high-resolution spectroscopy recently published by Ripepi et al. (2022a)⁸. The metallicity estimates for the 185 DCEPs in common between the two samples are displayed in Fig. 17. The photometric [Fe/H] values appear to be systematically higher than the spectroscopic abundances. The average difference is [Fe/H]Lit − [Fe/H]SOS = −0.08 dex, with σ = 0.16 dex and no apparent trend with the [Fe/H]Lit value. The mean shift and relative dispersion are modest, meaning that as far as the All Sky sample is concerned, or at least in the metallicity range −0.3 < [Fe/H] < +0.4 dex, the photometric metallicities can be used. We speculate that for lower values, the metallicity sensitivity of the R21 and R31 parameters may vanish. This could explain the poor performance of the method for the LMC and SMC DCEP samples (see Table 9).

Fig. 17.

Comparison between photometric metallicities computed by the SOS Cep&RRL pipeline ([Fe/H]SOS) and metal abundances from high-resolution spectroscopy available in the literature ([Fe/H]Lit).

Table 9.

Gaia source_id of sources for which the SOS Cep&RRL pipeline provides a metallicity estimate which should not be used as these stars are not DCEP_F pulsators.

5.9. Cepheids hosted by stellar clusters and satellite dwarf galaxies of the MW

We searched for any association of Cepheids in the All Sky sample with stellar clusters hosted by the MW or with dwarf galaxies orbiting our Galaxy. For the open clusters (OCs), we adopted the list of likely member stars by Cantat-Gaudin et al. (2020) supplemented with new data provided by Castro-Ginard et al. (2022) and Tarricq et al. (2022); for the globular clusters (GCs) we used the continuously updated list by Clement et al. (2001); for the dwarf galaxies we used a variety of literature sources including Soszyński et al. (2017, 2018). Results are shown in Table 10. An additional 35 objects from the All Sky sample can be associated with the MCs, 45 with Galactic GCs, 24 with OCs, and one with the Draco dwarf spheroidal galaxy (variable data for Draco by Kinemuchi et al. 2008).

Table 10.

Association of Cepheids in the All Sky sample with open and globular clusters and with dwarf galaxies that are satellites of the MW.

6. Validation

In the following sections we discuss the many different procedures adopted to validate the catalogue of Cepheids of all types published in Gaia DR3. Also, we discuss its completeness and contamination level.

6.1. Literature adopted for the validation

To validate the results of the SOS Cep&RRL pipeline classification, we adopted different literature sources according to the different subregions of reference. Starting with the All Sky, for the DCEPs we adopted the recent compilation by Pietrukowicz et al. (2021, hereafter, P21) – including 3352 reliable bona fide DCEPs – which is mainly based on results from the OGLE survey (Udalski et al. 2018; Soszyński et al. 2020). For ACEPs and T2CEPs, we adopted the results of the OGLE survey (Soszyński et al. 2020, and references therein) complemented by entries in Chen et al. (2020), which is based on the ZTF (Zwicky Transient Facility) survey, and by Drake et al. (2014) and Torrealba et al. (2015), which are based on the Catalina sky survey (CSS). As the classification of the latter papers does not distinguish the mode or type of pulsation, we assigned the fundamental mode to the ACEP detected by CSS⁹ and separated BLHER from WVIR and WVIR from RVTAU using period thresholds of 4 and 24 days, respectively (in analogy with the SOS Cep&RRL pipeline). The total sample of sources with a positive cross-match with the Gaia DR3 catalogue includes 3917 Cepheids. We note that we have intentionally not included results from Gaia DR2 re-classifications by Ripepi et al. (2019) to preserve the independence of the counterpart. We have also not included Cepheids by ASAS-SN (All Sky Automated Survey for Supernovae; Shappee et al. 2014; Jayasinghe et al. 2019) or ATLAS (Asteroid Terrestrial-impact Last Alert System; Heinze et al. 2018), which adopt automatic classification procedures and perform no careful visual inspection of the light curves. However, many stars originally detected by these surveys were analysed by Pietrukowicz et al. (2021) and are included in their catalogue.
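The period-based separation of the T2CEP subtypes amounts to a two-threshold rule, sketched below. The thresholds of 4 and 24 days are those quoted in the text; the function name and the treatment of stars exactly at a threshold are assumptions.

```python
def t2cep_subtype(period_days, blher_max=4.0, wvir_max=24.0):
    """Assign a type II Cepheid subtype from the pulsation period in days.

    Stars exactly at a threshold are assigned to the longer-period class,
    which is an arbitrary choice made for this sketch.
    """
    if period_days < blher_max:
        return "BLHER"
    if period_days < wvir_max:
        return "WVIR"
    return "RVTAU"

for period in (1.5, 12.0, 40.0):
    print(period, t2cep_subtype(period))
```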

As for the MCs, we adopted the OGLE catalogue by Soszyński et al. (2019a), including 9650 DCEPs, 343 T2CEPs, and 278 ACEPs. A cross-match with Gaia DR3 results provides 4638 and 4608 matches for the LMC and SMC, respectively.

For M 31 we used the work by Kodric et al. (2018) who provide the classification for 2247 Cepheids, including DCEP_F, DCEP_1O, and RVTAU stars. We have 262 stars in common with this work. As for M 33, 112 of the 185 objects classified as Cepheid from the SOS Cep&RRL pipeline are present in the work by Pellerin & Macri (2011). However, these latter authors do not provide a classification in DCEPs or T2CEPs, and therefore we refrained from any comparison.

6.2. Accuracy of the classification, completeness, and contamination

On the basis of the literature data discussed in the previous section, we produced confusion matrices for the LMC, SMC, and All Sky samples. There are 2739 stars in common with P21, corresponding to 82% of the sample. A further 130 objects are published in the general classification (Rimoldini et al. 2023) as the SOS Cep&RRL pipeline found an incorrect period for these objects. Therefore, taking the latter objects into account, the completeness of the Gaia catalogue for the All Sky DCEP sample is at least 85.6%. However, the catalogue by Pietrukowicz et al. (2021) is not free of contamination, especially for the DCEP_1Os, which can be easily confused with binaries if the distance is not used in the classification. This is illustrated in Fig. 18, which shows the PW relation for a selected sample of DCEPs with parallax relative errors of better than 20% and good astrometric solution (RUWE ≤ 1.4). The vast majority of the objects shown in the figure are common to the Gaia DR3 catalogue and Pietrukowicz et al. (2021), and the figure nicely depicts the expected linear relations for both DCEP_F and DCEP_1O pulsators. The second sample includes objects present only in the Pietrukowicz et al. (2021) list. Most of the DCEP_1O are clearly too faint to be DCEPs or any other type of Cepheid, and are likely binaries contaminating the DCEPs sample. Although the number of objects with a good parallax is too small to obtain statistical significance, it is plausible that the completeness of the Gaia DR3 catalogue for DCEPs is larger than 85.6% once the purity of the comparison samples is taken into account.
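The completeness figures follow from the counts quoted in Sects. 6.1 and 6.2, as the short check below shows. The function name is hypothetical, and combining the LMC and SMC matches into a single value for the MCs is a simplification made for this sketch.

```python
def completeness(recovered, reference):
    """Percentage of reference objects that are recovered."""
    return 100.0 * recovered / reference

# All Sky DCEPs: stars in common with P21, without and with the 130 objects
# published only in the general classification.
all_sky = completeness(2739, 3352)
all_sky_with_general = completeness(2739 + 130, 3352)

# Magellanic Clouds: LMC plus SMC matches against the OGLE DCEPs, T2CEPs, and ACEPs.
magellanic = completeness(4638 + 4608, 9650 + 343 + 278)

print(round(all_sky), round(all_sky_with_general, 1), round(magellanic))
```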

Fig. 18.

PW relation for a selected sample of DCEPs. Red and blue small filled circles show the DCEP_Fs and DCEP_1Os in common between Gaia DR3 and Pietrukowicz et al. (2021, abbreviated as P21 in the labels), respectively. Cyan and green large filled circles show DCEP_Fs and DCEP_1Os present in the P21 catalogue only. For all objects, we applied a selection in parallax, requiring that the relative precision be better than 20%. We also required the RUWE parameter to be lower than 1.4, so as to ensure a good astrometric solution (see text).

The completeness for ACEPs and T2CEPs is more difficult to establish as there are no homogeneous catalogues for these Cepheid types, except for regions of the sky covered by the OGLE survey. Therefore, we restricted our estimates to the bulge and a portion of the disc (see e.g. Soszyński et al. 2020), and calculated the ratio of the number of ACEPs and T2CEPs in DR3 and the OGLE catalogues. Given the small numbers involved compared with DCEPs, we summed ACEPs and T2CEPs, obtaining an overall completeness of about 25%. Such a low completeness compared to the DCEPs is due to the fact that the large majority of the OGLE ACEPs and T2CEPs are in the bulge, a region where Gaia still has a low number of epochs on average. In addition, the bulge is also almost devoid of DCEPs, meaning that the low detection efficiency of Gaia in this region does not impact the DCEP completeness.

The confusion matrix of the All Sky sample is shown in Fig. F.1. The apparent accuracy of our DCEPs classification (‘Recall’ column) is satisfactorily high, being 96%, 92%, and 95% for DCEP_F, DCEP_1O, and DCEP_MULTI, respectively. A similar result is obtained for T2CEP variables, namely > 94% for all Cepheid types. The percentages are less good for the ACEPs which are much more difficult to classify, given the similarities in light curve shape with DCEP and BLHER variables. We therefore tend to classify more ACEPs than the literature, where the classification is usually only based on the light curve shape. Precision is again very high for T2CEPs and DCEPs with the exception of DCEP_MULTI, of which we appear to have missed about 30%. This is not surprising, as for many pulsators we just do not have enough epochs to resolve more than one pulsating mode. For ACEPs, the precision is about 70%, which means that we are able to detect a large fraction of the literature ACEPs.
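For readers less familiar with these metrics, the sketch below computes recall and precision per class from a confusion matrix using the standard definitions. The counts are hypothetical and are not those of Fig. F.1, and the orientation (rows for the literature class, columns for the class assigned in this work) is an assumption.

```python
import numpy as np

def recall_precision(matrix):
    """Per-class recall and precision from a square confusion matrix.

    Rows are the reference (literature) classes, columns the assigned classes.
    Recall is the diagonal divided by the row sums, precision the diagonal
    divided by the column sums.
    """
    counts = np.asarray(matrix, dtype=float)
    diagonal = np.diag(counts)
    return diagonal / counts.sum(axis=1), diagonal / counts.sum(axis=0)

classes = ["DCEP_F", "DCEP_1O", "DCEP_MULTI"]
hypothetical_counts = [
    [96, 3, 1],
    [6, 92, 2],
    [2, 28, 70],
]
recall, precision = recall_precision(hypothetical_counts)
for name, rec, prec in zip(classes, recall, precision):
    print(f"{name}: recall {rec:.2f}, precision {prec:.2f}")
```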

The same kind of comparison is shown in Figs. F.2 and F.3 for the LMC and SMC, respectively. The results are very good in the LMC for both accuracy and precision for all types, with the exception of the DCEP_MULTI, which we massively missed and classified as DCEP_1O because the low number of epochs prevented the detection of the second (or third) periodicity. The results are slightly worse in the SMC, where the elongation along the line of sight produces far less separated PL and PW relations. This especially impacts the ACEPs, which were confused with DCEPs, introducing a 2% contamination among the latter. In the SMC, we missed a smaller percentage of DCEP_MULTI sources.

Concerning the overall completeness (i.e. ignoring the subclassification in types or modes), in both the LMC and SMC the Gaia DR3 catalogue includes 90% of the known Cepheids of all types. As for M 31, we do not show the confusion matrix as the agreement between our classification and the literature is 100%. The completeness is much lower, because we were only able to detect reasonable light curves for the brightest Cepheids in M 31, which is due to the Gaia limiting magnitude. This corresponded to only 12.1% of the known Cepheids of all types. We do not have an accurate literature control sample for the M 33 Cepheids, and therefore we only mention that we detected about 23% of the known Cepheids in this galaxy.

6.3. Contamination by variables other than Cepheids

In the previous section, we established the reliability of the Cepheid classification in the Gaia DR3 catalogue by comparison with high-quality Cepheid catalogues in the literature. For the All Sky sample, we use the same literature catalogues, namely OGLE (Soszyński et al. 2019b), ZTF (Chen et al. 2020), and CSS (Drake et al. 2014), which also list variability types other than Cepheids, to assess the possible contamination of the Gaia DR3 catalogue by non-Cepheids. We found 93 such objects, which are listed in Table 11. The main source of possible contamination is from RR Lyrae stars, eclipsing binaries, and eruptive variables. Although our comparison is restricted to the aforementioned surveys, we can nevertheless conclude that contamination of the Gaia DR3 Cepheid catalogue is on the order of 1%−2%.

Table 11.

Potential contaminants of type other than Cepheids.

6.4. The case of ARRD stars

Anomalous double-mode RR Lyrae stars (ARRDs) differ from normal RRDs because of the smaller period ratio between the 1O and F pulsation modes (see Soszyński et al. 2016a,b). The ARRDs were originally discovered in the LMC, but Soszyński et al. (2019b) reported the presence of many ARRDs among the OGLE bulge and disc collection of RR Lyrae stars as well. Six of these ARRDs are in the All Sky sample, where they are classified as DCEP_MULTI. The position of these stars in the Petersen diagram is highlighted in Fig. 1. Five objects lie in the region where DCEPs pulsate in the F/1O multi-mode, while one (Gaia EDR3 4091104989668551936) is placed in the locus of 2O/3O pulsators. However, the two periods of the latter differ from those found by OGLE and could be incorrect, as we have only 23 epochs in Gaia. Adoption of the OGLE periods would also place this sixth source close to the F/1O DCEP multi-mode pulsators.
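As an illustration of how a source is placed in the Petersen diagram, the following minimal sketch computes the ratio of the shorter to the longer period and assigns a tentative label. The function names and the numerical boundaries of the bands are assumptions chosen for this example; they are rough typical values and not the limits adopted in the SOS Cep&RRL pipeline.

```python
# Illustrative period-ratio bands (assumed values, not the pipeline's limits).
ILLUSTRATIVE_BANDS = {
    "F/1O": (0.69, 0.76),
    "1O/2O": (0.78, 0.82),
    "2O/3O": (0.83, 0.86),
}


def period_ratio(p_a, p_b):
    """Return P_S / P_L, the shorter period divided by the longer one."""
    p_long, p_short = max(p_a, p_b), min(p_a, p_b)
    if p_short <= 0:
        raise ValueError("periods must be positive")
    return p_short / p_long


def label_mode_pair(p_a, p_b, bands=ILLUSTRATIVE_BANDS):
    """Assign a tentative mode-combination label from the period ratio."""
    ratio = period_ratio(p_a, p_b)
    for label, (low, high) in bands.items():
        if low <= ratio <= high:
            return label
    return "unclassified"
```

With these assumed bands, a pair of periods with a ratio of 0.72 would be labelled F/1O. A wrong period for either mode, as may happen with only 23 epochs, moves the source to a different band, which is the situation described for the sixth object.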

The location of the six objects in the PW plane is shown in Fig. 19. The uncertainty of the W values for three objects is rather high because of the large uncertainty in their parallaxes. Nevertheless, the location on the PW relation of all six objects seems compatible with them being short-period DCEPs. We conclude that at least some of the objects classified as ARRDs in the MW are actually DCEPs and not RR Lyrae variables. This is due to the difficulty in determining the distances in the MW compared with the LMC, where all the objects are at approximately the same distance from us.
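The link between parallax uncertainty and the uncertainty on W can be made explicit with a short sketch. It assumes the Wesenheit magnitude W = G − 1.90 (GBP − GRP) and a direct conversion from parallax to absolute magnitude with first-order error propagation; the coefficient, the function names, and the neglect of photometric errors are assumptions of this example. The pipeline itself uses the ABL formulation, which is not reproduced here.

```python
import math

COLOUR_COEFF = 1.90  # assumed coefficient in W = G - 1.90 * (G_BP - G_RP)


def apparent_wesenheit(g, bp, rp, coeff=COLOUR_COEFF):
    """Apparent Wesenheit magnitude from Gaia G, G_BP, and G_RP."""
    return g - coeff * (bp - rp)


def absolute_wesenheit(g, bp, rp, parallax_mas, parallax_err_mas):
    """Absolute Wesenheit magnitude and its parallax-driven uncertainty.

    Uses M = m + 5 log10(parallax [mas]) - 10 and first-order propagation,
    which is only reasonable for small relative parallax errors.
    """
    if parallax_mas <= 0:
        raise ValueError("this direct conversion needs a positive parallax")
    w_abs = apparent_wesenheit(g, bp, rp) + 5.0 * math.log10(parallax_mas) - 10.0
    sigma_w = (5.0 / math.log(10.0)) * parallax_err_mas / parallax_mas
    return w_abs, sigma_w
```

Under these assumptions, a relative parallax error of 20% already corresponds to about 0.43 mag in W, which is comparable to the separation between neighbouring PW sequences.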

Fig. 19.

Position on the PW diagram of the six stars that are known in the literature as ARRD stars but are classified here as DCEP_MULTI (black filled circles). For reference, red and blue dots show the PW for the same DCEP_F and DCEP_1O samples displayed in Fig. 18.

6.5. Validation with TESS photometry

For validation we used photometric data collected by the Transiting Exoplanet Survey Satellite (TESS, Ricker et al. 2015), which is collecting continuous photometry over a large (24° × 96°) area with four cameras with adjacent fields of view over segments of 27 days in length, called sectors. In mission years 1, 2, and 3, the field of view was rotated around the centre of camera 4, positioned towards the southern, then the northern, and then again the southern ecliptic pole, while avoiding a 12-degree band along the Ecliptic. In year 4, five sectors were rotated so that all cameras were pointing towards the Ecliptic and observations cover a roughly 230° segment of it. We searched the full-frame image data up to Sector 43, which was the fourth sector in year 4 and the second along the Ecliptic. The sampling cadence of the full-frame images was 30 min in years 1–2 and was shortened to 10 min in the first extended mission (years 3–4).

The spatial resolution of TESS is limited to 21″ per pixel. Therefore, although it is capable of reaching the brighter Cepheids in the LMC and SMC, the images suffer from severe crowding and blending (Plachy et al. 2021). To avoid this, we only looked at Galactic Cepheids in this study. We cross-matched the Gaia coordinates with the sector coverage using the Web TESS Viewing Tool and then queried the TESS Quick Look Pipeline (QLP) database for light curves (Kunimoto et al. 2021; Huang et al. 2020a,b). The pre-processed QLP light curves have a faint limit of T = 15 mag, which corresponds to approximately the same GRP magnitude, and are produced primarily for searches of exoplanet transits. As a consequence, not all Cepheid candidates have good QLP light curves. Therefore, we also extracted photometry from the full-frame images with the eleanor software, which is capable of both pixel aperture and PSF photometry and post-processing of the light curves via regression against a systematic error model or via principal component analysis (Feinstein et al. 2019). We then selected the best light curves from the QLP and the four eleanor results (raw, corrected, PCA-corrected, and PSF photometry), and applied further corrections: sigma clipping to remove outliers and detrending to remove residual slow variations. For the trend removal, we used the method described by Bódi et al. (2022). Briefly, the algorithm searches for the dominant periodicity in the light curve, computes the phase dispersion of the folded data, and then fits a polynomial to the data by minimising the phase dispersion. In this way, even high-order polynomials can be fitted that still follow the changes in average brightness and are much less affected by the effects of incomplete pulsation cycles at the edges.
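The outlier-removal step can be sketched as follows. This is a generic iterative sigma clipping about the median with a robust scatter estimate; the function name, the 5σ threshold, and the use of the median absolute deviation are assumptions of this example and are not taken from the processing described above. It does not implement the detrending method of Bódi et al. (2022).

```python
import numpy as np


def sigma_clip_mask(flux, n_sigma=5.0, max_iter=10):
    """Return a boolean mask that is True for points kept after clipping.

    The scatter is estimated from the median absolute deviation (MAD),
    scaled by 1.4826 so that it matches sigma for Gaussian noise.
    """
    flux = np.asarray(flux, dtype=float)
    keep = np.isfinite(flux)
    for _ in range(max_iter):
        if not keep.any():
            break
        centre = np.median(flux[keep])
        sigma = 1.4826 * np.median(np.abs(flux[keep] - centre))
        if sigma == 0:
            break
        new_keep = keep & (np.abs(flux - centre) <= n_sigma * sigma)
        if new_keep.sum() == keep.sum():
            break  # converged: no further points rejected
        keep = new_keep
    return keep
```

For a large-amplitude pulsator it is usually safer to apply such a mask to the residuals of a preliminary pulsation model rather than to the raw light curve, so that genuine maxima and minima are not rejected.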

We then calculated the pulsation periods and the Ai1 and ϕi0 relative Fourier coefficients of the first few harmonics from the processed light curves and compared them to those of the OGLE I-band measurements (Soszyński et al. 2015a,b, 2018, 2019b, 2021). This validation only focused on the periods, light curve shapes, and Fourier coefficients and we did not use positions on the PL or PW relations for classification here. If the software failed to calculate the Fourier coefficients, we only classified the star if we deemed the light curve shape conclusive enough through visual inspection. For the DCEP_MULTI candidates, we fitted all possible pulsation frequencies and calculated the frequency ratios. We also checked for the presence of significant secondary periodicities in the single-mode stars and calculated period ratios for any potential DCEP_MULTI stars. As TESS sectors are 27 d in duration, we were effectively limited to < 20 d periods. For some long-period stars, we were able to stitch data from consecutive sectors but this was limited to high and low ecliptic latitudes and was prone to brightness differences and other systematic errors.
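A Fourier decomposition of this kind can be sketched as a linear least-squares fit of a truncated Fourier series at a known period, followed by the computation of amplitude ratios and phase differences relative to the first harmonic. The sine-series convention, the function name, and the output keys (R21, phi21, and so on) are assumptions of this example; surveys differ in their conventions, so values must be converted before they are compared.

```python
import numpy as np


def fourier_parameters(t, mag, period, n_harm=3):
    """Fit mag(t) = A0 + sum_k A_k sin(2 pi k t / P + phi_k) and return
    the ratios R_k1 = A_k / A_1 and differences phi_k1 = phi_k - k phi_1.
    """
    t = np.asarray(t, dtype=float)
    mag = np.asarray(mag, dtype=float)
    arg = 2.0 * np.pi * t / period
    columns = [np.ones_like(t)]
    for k in range(1, n_harm + 1):
        columns.append(np.sin(k * arg))
        columns.append(np.cos(k * arg))
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, mag, rcond=None)
    a = coeffs[1::2]  # sine coefficients, A_k cos(phi_k)
    b = coeffs[2::2]  # cosine coefficients, A_k sin(phi_k)
    amp = np.hypot(a, b)
    phi = np.arctan2(b, a)
    result = {}
    for k in range(2, n_harm + 1):
        result[f"R{k}1"] = amp[k - 1] / amp[0]
        result[f"phi{k}1"] = (phi[k - 1] - k * phi[0]) % (2.0 * np.pi)
    return result
```

Because the model is linear in the sine and cosine coefficients once the period is fixed, no iterative optimisation is needed; the fit can fail or become unstable when the phase coverage is poor, which is one reason a visual check remains useful.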

Overall we searched for light curves for 4690 stars and were able to classify 2378 (51%) of those. The validation results show strong agreement between the Gaia and TESS classifications. The largest discrepancy occurs among the 1O/2O DCEP_MULTI stars, where we identified a significant number of further stars classified as single-mode DCEP_1O in DR3. We also identified six stars as 1O/2O/3O DCEP_MULTI pulsators. This subclass is not included in DR3 but is known among the OGLE Cepheids.

Finally, we investigated the possible reasons for missing a significant number of double-mode (DM) Cepheids in the DR3 classification. Figure 20 displays four diagnostic quantities: the upper panel shows the brightness of the stars (in the GRP band) against the number of epochs in the light curves; the lower panel shows the amplitude ratio of the modes (calculated from the Fourier amplitudes of the pulsation frequencies) against the logarithm of the periods. The plots indicate that the number of epochs and brightness had some effect on the detection, with the brightest and most well-sampled stars having the highest positive detection rate in Gaia. The main driver for detection success appears to be the mode amplitude ratio, with all stars above 40% identified from the Gaia data. Longer period DCEP_MULTI stars also seem to be easier to discover from the sparse photometry collected by the mission.

Fig. 20.

Comparison of the parameters of the multi-mode stars detected in Gaia (blue circles) or from the TESS light curves (red dots). The upper plot compares the GRP brightness (which is close to the TESS passband) and the number of photometric epochs available in DR3. The lower plot compares the amplitude ratio of the modes and the logarithm of the longer pulsation period.

6.6. Validation of RV data

It is important to validate the RV curves of Cepheids published in Gaia DR3, as they have important applications, especially in the case of DCEPs. For example, the Baade-Wesselink method is widely used to estimate the radius and the distance of radial pulsators of different types, and of Cepheids in particular, from the combination of light and RV curves (e.g. Wesselink 1946; Gautschy 1987; Ripepi et al. 1997; Gieren et al. 2018, and references therein). We searched the literature for Cepheids with complete and reliable RV curves. As a result, we considered 14 DCEPs, whose properties are listed in Table 12. The comparison between the centre-of-mass velocities estimated from the Gaia RVs and those from the literature shows good agreement within 1−2σ. The only discrepant object is V340 Nor, for which the SOS Cep&RRL pipeline uncertainty is perhaps underestimated. In summary, the Cepheid RV time series released with Gaia DR3 are reliable and can be used to derive the intrinsic parameters of the stars. Examples of the comparison between Gaia and literature RVs are shown in Fig. 21.
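Agreement "within 1−2σ" can be quantified as the velocity difference in units of the combined uncertainty. The sketch below assumes independent Gaussian errors added in quadrature; the function name is hypothetical and this is not necessarily the exact procedure used for Table 12.

```python
import math


def sigma_offset(v_gaia, err_gaia, v_lit, err_lit):
    """Difference between two velocities in units of the combined error."""
    combined = math.hypot(err_gaia, err_lit)
    if combined == 0:
        raise ValueError("at least one uncertainty must be non-zero")
    return (v_gaia - v_lit) / combined
```

An underestimated uncertainty on either side inflates this statistic, which is the explanation suggested for the discrepant object.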

Fig. 21.

Examples of the comparison between the Gaia and the literature RV curves for a DCEP_1O (DT Cyg) and a DCEP_F (S Cru).

Table 12.

Data for the validation of 14 DCEPs with high accuracy RV curves available in the literature.

7. Conclusions

In this paper, we present the Gaia DR3 catalogue of Cepheids of all types. We discuss the changes in the SOS Cep&RRL pipeline with respect to DR2, including the derivation of a full set of PL and PW relations adopted in the pipeline. The major novelties in DR3 compared to the previous release are the analysis of DCEPs in the distant galaxies M 31 and M 33, and the analysis of the RV data for a subsample of 799 Cepheids of all types, including 24 objects belonging to the LMC and SMC.

We describe the techniques adopted to carry out a first gross cleaning of the sample for the large number of spurious objects retrieved from the general classification catalogue. In this process we also made use of machine learning techniques which significantly helped to single out the most promising candidates for further analysis.

To obtain maximum purity in the sample, we visually analysed almost all the candidates and corrected the classification provided by the Gaia SOS Cep&RRL pipeline when it was incorrect. In this context, the G time series of a number of suspected multi-mode pulsators were re-processed to determine correct pulsation periods.

In total, the Gaia DR3 catalogue counts 15 006 Cepheids of all types, among which 327 objects were known variable stars with a different classification in the literature, while, to our knowledge, 474 stars either had not been reported previously or had non-Cepheid type classification in the literature, and are therefore likely new Cepheid discoveries by Gaia.

The validation of the DR3 catalogue was carried out via comparison with literature results and through analysis of a consistent sample of light curves from TESS. The overall purity of the sample is very high and certainly larger than 90%−95%. The completeness varies significantly from one region in the sky to another and also as a function of Cepheid type. Completeness is larger than 90% in the LMC and SMC overall, and is on the order of 10%−20% in M 31 and M 33. Concerning the All Sky sample, which is largely dominated by MW objects, the completeness for DCEPs is likely between 85% and 90%, with contamination of a few percent. The completeness is lower for ACEPs and especially T2CEPs, which are located in large numbers in the MW bulge, a region for which Gaia has not yet collected a sufficient amount of epoch data. Validation of the RV curves with literature data showed that the Gaia RV curves for Cepheids are generally accurate and usable for astrophysical purposes.

Compared to DR2, the Cepheids in DR3 represent a huge improvement both quantitatively, given the addition of about 5000 Cepheids of all types, and qualitatively, as the DR3 Cepheid catalogue has a much improved purity, especially for the All Sky sample. In addition, a significant benefit of DR3 is the release of RV time series for 799 Cepheids of all types.

The following release (DR4) will present further improvements compared to DR3, mainly due to the additional 24 months of data, which in turn will lead to more accurate period determinations. For the next release, we plan to thoroughly use the machine learning technique that was implemented to clean the DR3 sample. In this respect, the present Gaia DR3 Cepheid sample, with its high purity, will represent an excellent training set.


1

Section 14.1.2 of “Gaia Data Release 2 Documentation release 1.2”; https://gea.esac.esa.int/archive/documentation/GDR2/

2

We recall that the RR Lyrae stars are discussed in a companion paper (Clementini et al. 2023).

3

https://h2o.ai

6

In the figure we have removed from the All Sky sample objects physically bound to the LMC and SMC (see Sect. 5.9).

7

We remind the reader that these PW relations are used with the ABL formulation in the SOS Cep&RRL pipeline (see Sect. 2.2).

8

In this and many other phases of this work, we made use of the TOPCAT package (Tool for OPerations on Catalogues And Tables; Taylor 2005).

9

For analogy with their studies on RR Lyrae stars, for which they only consider fundamental mode pulsators.

Acknowledgments

We wish to thank our anonymous Referee whose comments helped us to improve the manuscript. This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. The Italian participation in DPAC has been supported by Istituto Nazionale di Astrofisica (INAF) and the Agenzia Spaziale Italiana (ASI) through grants I/037/08/0, I/058/10/0, 2014-025-R.0, 2014-025-R.1.2015 and 2018-24-HH.0 to INAF (PI M.G. Lattanzi). The Swiss participation has been supported by the Swiss State Secretariat for Education, Research and Innovation through the “Activités Nationales Complémentaires”. UK community participation in this work has been supported by funding from the UK Space Agency, and from the UK Science and Technology Research Council. This work was supported in part by the French Centre National de la Recherche Scientifique (CNRS), the Centre National d’Etudes Spatiales (CNES), the Institut des Sciences de l’Univers (INSU) through the Service National d’Observation (SNO) Gaia. This research was supported by the ‘SeismoLab’ KKP-137523 Élvonal grant of the Hungarian Research, Development and Innovation Office (NKFIH), and by the LP2018-7 Lendület grant of the Hungarian Academy of Sciences. This research has made use of the SIMBAD database, operated at CDS, Strasbourg, France. It is a pleasure to thank M.B. Taylor for developing the TOPCAT software, which was very useful in carrying out this work.

References

  1. Anderson, R. I. 2014, A&A, 566, L10
  2. Anderson, R. I., Casertano, S., Riess, A. G., et al. 2016, ApJS, 226, 18
  3. Andrievsky, S. M., Kovtyukh, V. V., Luck, R. E., et al. 2002, A&A, 392, 491
  4. Andrievsky, S. M., Lépine, J. R. D., Korotin, S. A., et al. 2013, MNRAS, 428, 3252
  5. Arenou, F., & Luri, X. 1999, in Harmonizing Cosmic Distance Scales in a Post-HIPPARCOS Era, eds. D. Egret, & A. Heck, ASP Conf. Ser., 167, 13
  6. Bersier, D. 2002, ApJS, 140, 465
  7. Bersier, D., Burki, G., Mayor, M., & Duquennoy, A. 1994, A&AS, 108, 25
  8. Bódi, A., Szabó, P., Plachy, E., Molnár, L., & Szabó, R. 2022, PASP, 134, 014503
  9. Cantat-Gaudin, T., Anders, F., Castro-Ginard, A., et al. 2020, A&A, 640, A1
  10. Cappellari, M., Scott, N., Alatalo, K., et al. 2013, MNRAS, 432, 1709
  11. Caputo, F. 1998, A&ARv, 9, 33
  12. Caputo, F., Marconi, M., Musella, I., & Santolamazza, P. 2000, A&A, 359, 1059
  13. Caputo, F., Castellani, V., Degl’Innocenti, S., Fiorentino, G., & Marconi, M. 2004, A&A, 424, 927
  14. Castro-Ginard, A., Jordi, C., Luri, X., et al. 2022, A&A, 661, A118
  15. Chen, X., Wang, S., Deng, L., et al. 2020, ApJS, 249, 18
  16. Clement, C. M., Muzzin, A., Dufton, Q., et al. 2001, AJ, 122, 2587
  17. Clementini, G., Ripepi, V., Leccia, S., et al. 2016, A&A, 595, A133
  18. Clementini, G., Ripepi, V., Molinaro, R., et al. 2019, A&A, 622, A60
  19. Clementini, G., Ripepi, V., Molinaro, R., et al. 2023, A&A, 674, A18 (Gaia DR3 SI)
  20. Conn, A. R., Ibata, R. A., Lewis, G. F., et al. 2012, ApJ, 758, 11
  21. De Somma, G., Marconi, M., Molinaro, R., et al. 2023, ApJS, submitted
  22. Drake, A. J., Graham, M. J., Djorgovski, S. G., et al. 2014, ApJS, 213, 9
  23. Drake, A. J., Djorgovski, S. G., Catelan, M., et al. 2017, MNRAS, 469, 3688
  24. Eyer, L., Audard, M., Holl, B., et al. 2023, A&A, 674, A13 (Gaia DR3 SI)
  25. Feast, M. W., & Catchpole, R. M. 1997, MNRAS, 286, L1
  26. Feast, M. W., Laney, C. D., Kinman, T. D., van Leeuwen, F., & Whitelock, P. A. 2008, MNRAS, 386, 2115
  27. Feinstein, A. D., Montet, B. T., Foreman-Mackey, D., et al. 2019, PASP, 131, 094502
  28. Gaia Collaboration (Prusti, T., et al.) 2016a, A&A, 595, A1
  29. Gaia Collaboration (Brown, A. G. A., et al.) 2016b, A&A, 595, A2
  30. Gaia Collaboration (Brown, A. G. A., et al.) 2018, A&A, 616, A1
  31. Gaia Collaboration (Eyer, L., et al.) 2019, A&A, 623, A110
  32. Gaia Collaboration (Brown, A. G. A., et al.) 2021a, A&A, 649, A1
  33. Gaia Collaboration (Luri, X., et al.) 2021b, A&A, 649, A7
  34. Gaia Collaboration (Vallenari, A., et al.) 2023a, A&A, 674, A1 (Gaia DR3 SI)
  35. Gaia Collaboration (Drimmel, R., et al.) 2023b, A&A, 674, A37 (Gaia DR3 SI)
  36. Gallenne, A., Kervella, P., Borgniet, S., et al. 2019, A&A, 622, A164
  37. Gautschy, A. 1987, Vistas Astron., 30, 197
  38. Genovali, K., Lemasle, B., Bono, G., et al. 2014, A&A, 566, A37
  39. Gieren, W. 1977, A&AS, 28, 193
  40. Gieren, W., Storm, J., Konorski, P., et al. 2018, A&A, 620, A99
  41. Gorynya, N. A., Samus’, N. N., Rastorguev, A. S., & Sachkov, M. E. 1996, Astron. Lett., 22, 175
  42. Heinze, A. N., Tonry, J. L., Denneau, L., et al. 2018, AJ, 156, 241
  43. Holl, B., Audard, M., Nienartowicz, K., et al. 2018, A&A, 618, A30
  44. Holl, B., Fabricius, C., Portell, J., et al. 2023, A&A, 674, A25 (Gaia DR3 SI)
  45. Huang, C. X., Vanderburg, A., Pál, A., et al. 2020a, Res. Notes Am. Astron. Soc., 4, 204
  46. Huang, C. X., Vanderburg, A., Pál, A., et al. 2020b, Res. Notes Am. Astron. Soc., 4, 206
  47. Jayasinghe, T., Stanek, K. Z., Kochanek, C. S., et al. 2019, MNRAS, 486, 1907
  48. Kienzle, F., Moskalik, P., Bersier, D., & Pont, F. 1999, A&A, 341, 818
  49. Kinemuchi, K., Harris, H. C., Smith, H. A., et al. 2008, AJ, 136, 1921
  50. Klagyivik, P., Szabados, L., Szing, A., Leccia, S., & Mowlavi, N. 2013, MNRAS, 434, 2418
  51. Kodric, M., Riffeser, A., Hopp, U., et al. 2018, AJ, 156, 130
  52. Kunimoto, M., Huang, C., Tey, E., et al. 2021, Res. Notes Am. Astron. Soc., 5, 234
  53. Leavitt, H. S., & Pickering, E. C. 1912, Harv. Coll. Obs. Circ., 173, 1
  54. Lenz, P., & Breger, M. 2005, Commun. Asteroseismol., 146, 53
  55. Luck, R. E. 2018, AJ, 156, 171
  56. Luck, R. E., & Lambert, D. L. 2011, AJ, 142, 136
  57. Madore, B. F. 1982, ApJ, 253, 575
  58. Marconi, M., Fiorentino, G., & Caputo, F. 2004, A&A, 417, 1101
  59. Matsunaga, N., Feast, M. W., & Soszyński, I. 2011, MNRAS, 413, 223
  60. Pellerin, A., & Macri, L. M. 2011, ApJS, 193, 26
  61. Perina, S., Federici, L., Bellazzini, M., et al. 2009, A&A, 507, 1375
  62. Petterson, O. K. L., Cottrell, P. L., & Albrow, M. D. 2004, MNRAS, 350, 95
  63. Petterson, O. K. L., Cottrell, P. L., Albrow, M. D., & Fokin, A. 2005, MNRAS, 362, 1167
  64. Pietrukowicz, P., Soszyński, I., & Udalski, A. 2021, Acta Astron., 71, 205
  65. Plachy, E., Pál, A., Bódi, A., et al. 2021, ApJS, 253, 11
  66. Poggio, E., Drimmel, R., Cantat-Gaudin, T., et al. 2021, A&A, 651, A104
  67. Ricker, G. R., Winn, J. N., Vanderspek, R., et al. 2015, J. Astron. Telesc. Instrum. Syst., 1, 014003
  68. Riello, M., De Angeli, F., Evans, D. W., et al. 2021, A&A, 649, A3
  69. Riess, A. G., Macri, L. M., Hoffmann, S. L., et al. 2016, ApJ, 826, 56
  70. Rimoldini, L., Holl, B., Gavras, P., et al. 2023, A&A, 674, A14 (Gaia DR3 SI)
  71. Ripepi, V., Barone, F., Milano, L., & Russo, G. 1997, A&A, 318, 797
  72. Ripepi, V., Marconi, M., Moretti, M. I., et al. 2014, MNRAS, 437, 2307
  73. Ripepi, V., Moretti, M. I., Marconi, M., et al. 2015, MNRAS, 446, 3034
  74. Ripepi, V., Cioni, M.-R. L., Moretti, M. I., et al. 2017, MNRAS, 472, 808
  75. Ripepi, V., Molinaro, R., Musella, I., et al. 2019, A&A, 625, A14
  76. Ripepi, V., Catanzaro, G., Clementini, G., et al. 2022a, A&A, 659, A167
  77. Ripepi, V., Chemin, L., Molinaro, R., et al. 2022b, MNRAS, 512, 563
  78. Romaniello, M., Primas, F., Mottini, M., et al. 2008, A&A, 488, 731
  79. Romaniello, M., Riess, A., Mancino, S., et al. 2022, A&A, 658, A29
  80. Sandage, A., & Tammann, G. A. 2006, ARA&A, 44, 93
  81. Sartoretti, P., Blomme, R., David, M., et al. 2022, Gaia DR2 Documentation, European Space Agency; Gaia Data Processing and Analysis Consortium, 6
  82. Shappee, B. J., Prieto, J. L., Grupe, D., et al. 2014, ApJ, 788, 48
  83. Skowron, D. M., Skowron, J., Mróz, P., et al. 2019, Science, 365, 478
  84. Soszyński, I., Udalski, A., Szymański, M. K., et al. 2015a, Acta Astron., 65, 233
  85. Soszyński, I., Udalski, A., Szymański, M. K., et al. 2015b, Acta Astron., 65, 297
  86. Soszyński, I., Pawlak, M., Pietrukowicz, P., et al. 2016a, Acta Astron., 66, 405
  87. Soszyński, I., Smolec, R., Dziembowski, W. A., et al. 2016b, MNRAS, 463, 1332
  88. Soszyński, I., Udalski, A., Szymański, M. K., et al. 2017, Acta Astron., 67, 103
  89. Soszyński, I., Udalski, A., Szymański, M. K., et al. 2018, Acta Astron., 68, 89
  90. Soszyński, I., Udalski, A., Szymański, M. K., et al. 2019a, Acta Astron., 69, 87
  91. Soszyński, I., Udalski, A., Wrona, M., et al. 2019b, Acta Astron., 69, 321
  92. Soszyński, I., Udalski, A., Szymański, M. K., et al. 2020, Acta Astron., 70, 101
  93. Soszyński, I., Pietrukowicz, P., Skowron, J., et al. 2021, Acta Astron., 71, 189
  94. Storm, J., Carney, B. W., Gieren, W. P., et al. 2004, A&A, 415, 521
  95. Storm, J., Gieren, W., Fouqué, P., et al. 2011, A&A, 534, A94
  96. Szabados, L. 1989, Commun. Konkoly Obs. Hung., 94, 1
  97. Tarricq, Y., Soubiran, C., Casamiquela, L., et al. 2022, A&A, 659, A59
  98. Taylor, M. B. 2005, in Astronomical Data Analysis Software and Systems XIV, eds. P. Shopbell, M. Britton, & R. Ebert, ASP Conf. Ser., 347, 29
  99. Torrealba, G., Catelan, M., Drake, A. J., et al. 2015, MNRAS, 446, 2251
  100. Udalski, A., Soszyński, I., Pietrukowicz, P., et al. 2018, Acta Astron., 68, 315
  101. Usenko, I. A., Kniazev, A. Y., Berdnikov, L. N., & Kravtsov, V. V. 2014, Astron. Lett., 40, 800
  102. Wenger, M., Ochsenbein, F., Egret, D., et al. 2000, A&AS, 143, 9
  103. Wesselink, A. J. 1946, Bull. Astron. Inst. Neth., 10, 91

Appendix A: Light curve examples

Fig. A.1.

Light and RV curves for a selected sample of DCEPs of different modes.

Fig. A.2.

Light and RV curves for the prototypes of the BLHER (left), WVIR (centre), and RVTAU (right) classes.

Fig. A.3.

Light and RV curves for ACEP_F (left) and ACEP_1O (right) variables.

Appendix B: Fourier parameters for the LMC, SMC, M31, and M33 Cepheid samples

Fig. B.1.

Same as in Fig. 7 but for the LMC.

Fig. B.2.

Same as in Fig. 7 but for the SMC.

Fig. B.3.

Fourier parameters for the M31 DCEPs.

Fig. B.4.

Fourier parameters for the M33 DCEPs.

Appendix C: PL and PW relations for the LMC, SMC, M31, and M33 Cepheid samples

Fig. C.1.

PL in the G-band and PW relations for the LMC Cepheids. The top panels show results for the DCEPs, while the bottom panels display ACEPs and T2CEPs.

Fig. C.2.

Same as in Fig. C.1 but for the SMC.

Fig. C.3.

PL in the G-band and PW relations for the Cepheids in M31 (left panel) and M33 (right panel).

Appendix D: CMDs for the LMC, SMC, M31, and M33 Cepheid samples

Fig. D.1.

CMD in apparent G magnitude of the LMC Cepheid sample.

Fig. D.2.

CMD in apparent G magnitude of the SMC Cepheid sample.

Fig. D.3.

CMD in apparent G magnitude of the M31 (left panel) and M33 (right panel) Cepheid samples.

Appendix E: Period–amplitude diagram for the LMC, SMC, M31, and M33 Cepheid samples

Fig. E.1.

Period–amplitude(G) diagram for the LMC sample.

Fig. E.2.

Period–amplitude(G) diagram for the SMC sample.

Fig. E.3.

Period–amplitude(G) diagram for the M31 (left panel) and M33 (right panel) samples.

Appendix F: Confusion matrices

Fig. F.1.

Confusion matrix for the All Sky sample. The percentages in parentheses are calculated with respect to the literature.

Fig. F.2.

As in Fig. F.1 but for the LMC.

Fig. F.3.

As in Fig. F.1 but for the SMC.


All Figures

thumbnail Fig. 1.

Petersen diagram for confirmed DCEP_MULTI objects published in the Gaia DR3 catalogue (red filled circles) and for additional DCEP_MULTI objects detected in the re-processing of the data (blue filled circles). PL and PS represent the longest and shortest pulsation periods of the multi-mode object. Labels show the typical location of the different multi-mode pulsation combinations identified in these sources. Black squares mark six objects known in the literature as ARRDs (see Sect. 6.4).

In the text
thumbnail Fig. 2.

Number of epochs in the G-band time series. From top to bottom, the different panels show the data for the different subsamples corresponding to the five regions of the sky defined in Sect. 2.

In the text
thumbnail Fig. 3.

Number of epochs in the RV time series for the labelled subsamples.

In the text
thumbnail Fig. 4.

Map in Galactic coordinates of the different Cepheid types in the MW. The objects are colour coded according to their apparent G magnitude.

In the text
thumbnail Fig. 5.

Map of the different Cepheid types in the MCs. The objects are colour coded according to their apparent G magnitude. The map is a zenithal equidistant projection centred at equatorial coordinates RA, Dec = 56.0, −73.0 deg (J2000).

In the text
thumbnail Fig. 6.

Map of the DCEPs in M 31 (top panel) and M 33 (bottom panel). The symbols are colour coded based on the apparent G magnitude of the DCEPs. The two black crosses identify two RVTAU stars in M 31. The maps are in zenithal equidistant projection centred at equatorial coordinates (RA, Dec)M 31 = 10.6, 41.2 deg (J2000) and (RA, Dec)M 33 = 23.5, 30.65 deg (J2000).

In the text
thumbnail Fig. 7.

Fourier parameters for the All Sky sample. From top to bottom the different panels show the results for DCEPs, ACEPs, and T2CEPs, respectively.

In the text
thumbnail Fig. 8.

PW relation for the All Sky sample. The different types and modes of the Cepheids displayed in the figures are labelled in each panel.

Fig. 9.

Same as in Fig. 8 but restricting the sample to objects with σ_ϖ/ϖ < 0.2.

Fig. 10.

CMD of the All Sky Cepheid sample.

Fig. 11.

Period–amplitude (G) diagram for the All Sky sample.

Fig. 12.

Comparison between the average RV calculated by the SOS Cep&RRL pipeline from fitting the RV curves and the mean values published in the gaia_source table (see Sartoretti et al. 2022, for details).

Fig. 13.

RV maps defined by the 3190 Cepheids in the DR3 gaia_source table (top panel) and 786/799 Cepheids in the DR3 vari_cepheid table (bottom panel).

Fig. 14.

Uncertainties on the average and peak-to-peak RV values measured by the SOS Cep&RRL pipeline for a sample of 786 Cepheids.

Fig. 15.

Period–amplitude (RV) diagram for the 786 Cepheids whose RV curves were analysed by the SOS Cep&RRL pipeline. The different Cepheid types are labelled. The size of the circles surrounding the symbols is proportional to the uncertainty in Amp(RV) (see also Fig. 14).

Fig. 16.

Photometric metallicities in the LMC, SMC, and All Sky samples.

Fig. 17.

Comparison between photometric metallicities computed by the SOS Cep&RRL pipeline ([Fe/H]_SOS) and metal abundances from high-resolution spectroscopy available in the literature ([Fe/H]_Lit).

Fig. 18.

PW relation for a selected sample of DCEPs. Red and blue small filled circles show the DCEP_Fs and DCEP_1Os in common between Gaia DR3 and Pietrukowicz et al. (2021, abbreviated as P21 in the labels), respectively. Cyan and green large filled circles show DCEP_Fs and DCEP_1Os present in the P21 catalogue only. For all objects, we applied a selection in parallax, requiring that the relative precision be better than 20%. We also required the RUWE parameter to be lower than 1.4, so as to ensure a good astrometric solution (see text).

Fig. 19.

Position on the PW diagram of the six stars that are known in the literature as ARRD stars but are classified here as DCEP_MULTI (black filled circles). For reference, red and blue dots show the PW for the same DCEP_F and DCEP_1O samples displayed in Fig. 18.

Fig. 20.

Comparison of the parameters of the multi-mode stars detected in Gaia (blue circles) or from the TESS light curves (red dots). The upper plot compares the GRP brightness (which is close to the TESS passband) and the number of photometric epochs available in DR3. The lower plot compares the amplitude ratio of the modes and the logarithm of the longer pulsation period.

Fig. 21.

Examples of the comparison between the Gaia and the literature RV curves for a DCEP_1O (DT Cyg) and a DCEP_F (S Cru).

Fig. A.1.

Light and RV curves for a selected sample of DCEPs of different modes.

Fig. A.2.

Light and RV curves for the prototypes of the BLHER (left), WVIR (centre), and RVTAU (right) classes.

Fig. A.3.

Light and RV curves for ACEP_F (left) and ACEP_1O (right) variables.

Fig. B.1.

Same as in Fig. 7 but for the LMC.

Fig. B.2.

Same as in Fig. 7 but for the SMC.

Fig. B.3.

Fourier parameters for the M31 DCEPs.

Fig. B.4.

Fourier parameters for the M33 DCEPs.

Fig. C.1.

PL in the G-band and PW relations for the LMC Cepheids. The top panels show results for the DCEPs, while the bottom panels display ACEPs and T2CEPs.

Fig. C.2.

Same as in Fig. C.1 but for the SMC.

Fig. C.3.

PL in the G-band and PW relations for the Cepheids in M31 (left panel) and M33 (right panel).

Fig. D.1.

CMD in apparent G magnitude of the LMC Cepheid sample.

Fig. D.2.

CMD in apparent G magnitude of the SMC Cepheid sample.

Fig. D.3.

CMD in apparent G magnitude of the M31 (left panel) and M33 (right panel) Cepheid samples.

Fig. E.1.

Period–amplitude(G) diagram for the LMC sample.

Fig. E.2.

Period–amplitude(G) diagram for the SMC sample.

Fig. E.3.

Period–amplitude(G) diagram for the M31 (left panel) and M33 (right panel) samples.

Fig. F.1.

Confusion matrix for the All Sky sample. The percentages in parentheses are calculated with respect to the literature.

Fig. F.2.

As in Fig. F.1 but for the LMC.

Fig. F.3.

As in Fig. F.1 but for the SMC.
