Issue | A&A, Volume 674, June 2023, Gaia Data Release 3
---|---
Article Number | A14
Number of page(s) | 105
Section | Catalogs and data
DOI | https://doi.org/10.1051/0004-6361/202245591
Published online | 16 June 2023
Gaia Data Release 3: All-sky classification of 12.4 million variable sources into 25 classes
1. Department of Astronomy, University of Geneva, Chemin d’Ecogia 16, 1290 Versoix, Switzerland
2. Department of Astronomy, University of Geneva, Chemin Pegasi 51, 1290 Versoix, Switzerland
3. RHEA for European Space Agency (ESA), Camino bajo del Castillo, s/n, Urbanizacion Villafranca del Castillo, Villanueva de la Cañada, 28692 Madrid, Spain
4. Institute of Astronomy, KU Leuven, Celestijnenlaan 200D, 3001 Leuven, Belgium
5. Sednai Sàrl, Geneva, Switzerland
6. Université de Caen Normandie, Côte de Nacre, Boulevard Maréchal Juin, 14032 Caen, France
7. Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge CB3 0HA, UK
8. Konkoly Observatory, Research Centre for Astronomy and Earth Sciences, Eötvös Loránd Research Network (ELKH), MTA Centre of Excellence, Konkoly Thege Miklós út 15-17, 1121 Budapest, Hungary
9. ELTE Eötvös Loránd University, Institute of Physics, Pázmány Péter sétány 1A, 1117 Budapest, Hungary
10. INAF – Osservatorio Astrofisico di Torino, Via Osservatorio 20, 10025 Pino Torinese, Italy
11. INAF – Osservatorio di Astrofisica e Scienza dello Spazio di Bologna, Via Gobetti 93/3, 40129 Bologna, Italy
12. INAF – Osservatorio Astrofisico di Catania, Via S. Sofia 78, 95123 Catania, Italy
13. European Space Agency (ESA), European Space Astronomy Centre (ESAC), Camino bajo del Castillo, s/n, Urbanizacion Villafranca del Castillo, Villanueva de la Cañada, 28692 Madrid, Spain
14. School of Physics and Astronomy, Tel Aviv University, Tel Aviv 6997801, Israel
15. Lohrmann Observatory, Technische Universität Dresden, Mommsenstraße 13, 01062 Dresden, Germany
16. Astronomical Observatory, University of Warsaw, Al. Ujazdowskie 4, 00-478 Warszawa, Poland
17. Department of Physics and Astronomy, University of Catania, Via S. Sofia 64, 95123 Catania, Italy
18. University of Vienna, Department of Astrophysics, Tuerkenschanzstrasse 17, 1180 Vienna, Austria
19. INAF – Osservatorio Astronomico di Capodimonte, Via Moiariello 16, 80131 Napoli, Italy
20. Telespazio Vega UK Ltd for ESA/ESAC, Camino bajo del Castillo, s/n, Urbanizacion Villafranca del Castillo, Villanueva de la Cañada, 28692 Madrid, Spain
21. Porter School of the Environment and Earth Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
Received: 30 November 2022
Accepted: 17 December 2022
Context. Gaia DR3 contains 1.8 billion sources with G-band photometry, 1.5 billion of which also have GBP and GRP photometry, complemented by positions on the sky, parallax, and proper motion. The median number of field-of-view transits in the three photometric bands is between 40 and 44 measurements per source and covers 34 months of data collection.
Aims. We pursue a classification of Galactic and extra-galactic objects that are detected as variable by Gaia across the whole sky.
Methods. Supervised machine learning (eXtreme Gradient Boosting and Random Forest) was employed to generate multi-class, binary, and meta-classifiers that classified variable objects with photometric time series in the G, GBP, and GRP bands.
Results. Classification results comprise 12.4 million sources (selected from a much larger set of potential variable objects) and include about 9 million variable stars classified into 22 variability types in the Milky Way and nearby galaxies such as the Magellanic Clouds and Andromeda, plus thousands of supernova explosions in distant galaxies, 1 million active galactic nuclei, and almost 2.5 million galaxies. The identification of galaxies was made possible by the artificial variability of extended objects as detected by Gaia, so they were published in the galaxy_candidates table of the Gaia DR3 archive, separate from the classifications of genuine variability (in the vari_classifier_result table). The latter contains 24 variability classes or class groups of periodic and non-periodic variables (pulsating, eclipsing, rotating, eruptive, cataclysmic, stochastic, and microlensing), with amplitudes from a few milli-magnitudes to several magnitudes.
Key words: catalogs / galaxies: general / methods: data analysis / quasars: general / stars: variables: general
© The Authors 2023
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1. Introduction
Time-dependent brightness variations of celestial objects may be caused by different phenomena: intrinsic physical changes, such as pulsations, eruptions, and cataclysmic outbursts, or extrinsic reasons that depend on the direction of observation, such as eclipsing binaries, stars rotating with spots or with ellipsoidal shapes, and microlensing events, as shown in Fig. 1 of Gaia Collaboration (2019). The detection of variability requires multi-epoch observations and, depending on the signal sampling, a certain set of classes can be identified. Gaia’s sparse sampling allows for the detection of periodic signals ranging from minutes to years and for medium to long-term non-periodic variability. The chance of detection of up to approximately six-week long transient phenomena, for example, crucially depends on the sampling at a given location in the sky (see Appendix A in Eyer et al. 2017), which follows from the scanning law properties (Gaia Collaboration 2016b). Although the scanning law of Gaia was designed for astrometric goals, it allows for the identification of a broad variety of variability types, with different possible levels of completeness (Eyer et al. 2023).
As the time span of Gaia data collection progressively increased from Gaia data release 1 (DR1) to DR2 and DR3 (14, 22, and 34 months, respectively; Gaia Collaboration 2016a, 2018, 2023c), the classified variability types increased from Cepheids and RR Lyrae stars in a limited region of the sky (in DR1; Eyer et al. 2017), to an all-sky classification of the DR1 classes plus long-period variables and δ Scuti or SX Phoenicis stars (in DR2; Rimoldini et al. 2019), and 20 further variability classes in DR3 (presented in this article and listed in Sect. 3.1.1). For brevity, we refer to Table 1 for selected publications related to these classes, with representatives identified in various surveys.
Table 1. Classification training classes (see Sect. 3.1.1 for class label definitions), with the specification of their components (whose approximate representation is indicated in brackets when greater than 500; see Sects. 3.1.2 and 3.1.3 for details on the creation of class subsets), the number of training sources NTRN, and references.
Machine learning is a practical tool to automate classification tasks that involve multiple known classes and a possibly high number of attributes to identify such classes and distinguish them from others (e.g. see Debosscher et al. 2007; Sarro et al. 2009; Blomme et al. 2010; Richards et al. 2011; Dubath et al. 2011). Herein, we present how a supervised classification was applied to Gaia DR3 data to classify variable sources into two dozen classes (plus galaxies). In particular, we describe the details concerning the construction of the training set and of the classifiers, the verification of the results, and the generation of an overall classification score. Selection procedures, parameter distributions, and assessments of candidates are presented for each class.
Some of the classification results are further processed by specific object studies (SOSs) dedicated to single classes, typically describing a subset of the most reliable candidates in detail. Such single-class processing modules are available in DR3 for active galactic nuclei (AGNs; Carnerero et al. 2023), Cepheids (Ripepi et al. 2023), compact companions (Gomel et al. 2023), eclipsing binaries (Mowlavi et al. 2023), long-period variables (Lebzelter et al. 2023), main-sequence oscillators (Gaia Collaboration 2023b), planetary transits (Panahi et al. 2022), and RR Lyrae stars (Clementini et al. 2023). Other SOS modules, such as microlensing events (Wyrzykowski et al. 2023), short-timescale variables (see Sect. 10.12 of the Gaia DR3 documentation; Rimoldini et al. 2022), and solar-like rotation modulation stars (Distefano et al. 2023), were executed independently of the classification results, as they relied on their own candidate selection. A summary of the variability results from all modules is presented in Eyer et al. (2023).
This article is organised as follows. The classification input data are outlined in Sect. 2; the preparation, application, and verification of supervised learning procedures are described in Sect. 3; the results for each class are presented in Sect. 4; and conclusions are drawn in Sect. 5. Special training selections applied to a subset of classes are detailed in Appendix A; selected classification attributes are listed in Appendix B; additional class labels from the literature (among the false positive classes listed in Table 3) are defined in Appendix C; some examples of queries to facilitate the exploitation of classification results in the Gaia archive are provided in Appendix D; and common diagrams for all classes, including a summary of trained and classified sources, an assessment of the results with respect to the literature, and sample light curves, are presented in Appendix E. All table names in the Gaia archive that are mentioned in the text assume the prefix gaiadr3 (as shown in Appendix D).
2. Data
As part of the Gaia variability pipeline (Eyer et al. 2023), the general classification module received – as input – sources with photometric time series in the G, GBP, and GRP bands (Riello et al. 2021) that had at least five field-of-view (FoV) measurements in the G band, which were already identified as potential variable sources and characterised by basic statistics and periodicity parameters. Before any computation, sources and associated epoch FoV transits were processed by the chain of operators described in Sect. 10.2.3 of the Gaia DR3 documentation (Rimoldini et al. 2022) and Sect. 3.1 of Eyer et al. (2023), which selected, transformed, and cleaned time series from spurious or doubtful observations. The balance between outlier removal and signal preservation favoured the latter, considering that some of the targeted variability types relied on a small number of outlier-like measurements (such as Algol-type eclipsing binaries and microlensing events). All time series and derived statistical numbers hereafter refer to these cleaned time series. The median number of FoV measurements in the three photometric bands is between 40 and 44 per source (Eyer et al. 2023), within a time span of typically 900–1000 days in the G band.
While the processing of Gaia (early) DR3 photometry included significant calibration improvements with respect to DR2 (Riello et al. 2021), some low-level uncalibrated systematic effects remained and their impact on epoch photometry is described in Evans et al. (2023). Among instrumental effects, scan-angle dependent signals were induced mainly by asymmetric extended sources (such as barred spiral galaxies and tidally distorted stars) and multiple close pairs (≲1″) of point-like sources (Holl et al. 2023). Although such signals helped the identification of galaxies from photometric variations, in general data artefacts might interfere with the correct identification of classes with genuine variability, especially those associated with low signal-to-noise ratios.
The classification of variables also employed astrometrically derived parameters such as parallax and proper motion (Lindegren et al. 2021b). However, Gaia DR3 astrophysical parameters (Andrae et al. 2023; Creevey et al. 2023; Delchambre et al. 2023; Fouesneau et al. 2023) could not be included as they were processed in parallel and became available after the results of the variability pipeline were finalised.
A subset of classified sources were analysed in more detail by subsequent SOS modules, typically focusing on specific classes, as mentioned in Sect. 1. The results of all variability modules were subject to additional source filtering before their ingestion into the public Gaia archive (Babusiaux et al. 2023). Statistical parameters of all the photometric time series published in Gaia DR3 are available in the vari_summary table.
3. Method
For Gaia DR3, general classification relied on supervised machine learning, that is, training classifiers with sources of known variability types and applying the resulting models to classify sources of an unknown variability type. Known variables in the literature were cross-matched with Gaia sources, verified, selected, and characterised by attributes derived from the Gaia data. The use of both cross-matched sources and (optimised) classification attributes for training was described in Rimoldini et al. (2019) and is not repeated herein.
An extensive cross-match of Gaia sources was compiled by Gavras et al. (2023), which provided millions of variable objects from the literature and represented over 100 variability types. The robustness of the cross-match method, which included astrometric and photometric information in the identification of matches, and the verification of the genuineness of literature classifications ensured the reliability of training sources (critical to supervised classification) and of the validation of the results.
3.1. Training set
Potential training sources from the literature were vetted for each class to ensure the correct class membership. This was repeated for every catalogue that was deemed trustworthy for training the class under investigation. The reliance of supervised classification on known objects makes it vulnerable to biases from the literature, for instance, related to their data acquisition and classification methods. Thus, in addition to class verification, the cross-matched objects were probed in several dimensions to identify intrinsic biases, such as limited sky coverage or apparent magnitude range with respect to the ones of Gaia, in order to prevent (or minimise) the transfer of literature selection functions to the Gaia classifications.
3.1.1. Published classes
Since it was difficult to know a priori the full list of classes that could be identified in Gaia DR3, the verification of literature classifications and source selection for training purposes were performed for every variability type defined in Gavras et al. (2023). However, only the actions on classes relevant to the published results are presented herein.
The published variability classes, corresponding acronyms, including the types trained within class groups or a (non-comprehensive) list of sub-types for some classes, are presented as follows:
1. α2 Canum Venaticorum (ACV) or (magnetic) chemically peculiar (MCP, CP) or rapidly oscillating Am- and Ap-type (ROAM and ROAP) or SX Arietis (SXARI) star (collectively denominated as ACV|CP|MCP|ROAM|ROAP|SXARI);
2. α Cygni-type star (ACYG);
3. active galactic nucleus (AGN); from the perspective of variability, the general term AGN is preferred to quasar (or QSO), because brightness variability is caused by the activity of galactic nuclei, from processes in the accretion disc around a supermassive black hole (such as in Seyfert galaxies and QSOs), which can lead to the formation of relativistic plasma jets (identified as blazars when directed towards us);
4. β Cephei variable (BCEP);
5. B-type emission line (BE) star or γ Cassiopeiae (GCAS) or S Doradus (SDOR) or Wolf-Rayet (WR) star (denoted as BE|GCAS|SDOR|WR);
6. Cepheid (CEP), including anomalous Cepheid (ACEP), BL Herculis variable (BLHER, also known as CWB), W Virginis variable (CW), δ Cephei star (DCEP), RV Tauri-type star (RV), and generic type II Cepheid (T2CEP);
7. cataclysmic variable (CV), excluding supernova and symbiotic star (mentioned separately);
8. δ Scuti (DSCT) or γ Doradus (GDOR) or SX Phoenicis (SXPHE) star, including hybrid δ Scuti + γ Doradus (DSCT+GDOR) stars (labelled as DSCT|GDOR|SXPHE);
9. eclipsing binary (ECL), including Algol (β Persei) type (EA), β Lyrae type (EB), and W Ursae Majoris type (EW);
10. ellipsoidal variable (ELL);
11. star with exoplanet transits (EP);
12. long-period variable (LPV), including long secondary period variable (LSP), Mira (o Ceti) type (M), Mira or semi-regular variable (M|SR), OGLE small amplitude red giant (OSARG), small amplitude red giant (SARG), and semi-regular variable (SR) of sub-types SRA, SRB, SRC, SRD, and SRS;
13. microlensing event (MICROLENSING);
14. R Coronae Borealis variable (RCB);
15. RR Lyrae star (RR), including fundamental-mode (RRAB), first overtone (RRC), double-mode (RRD) RR Lyrae star (and anomalous double-mode, ARRD);
16. RS Canum Venaticorum variable (RS);
17. short-timescale object (S);
18. subdwarf B star (SDB) of type V1093 Herculis (V1093HER) or V361 Hydrae (V361HYA);
19. supernova (SN);
20. solar-like star (SOLAR_LIKE), including BY Draconis type (BY), rotating spotted star (ROT), BY and/or ROT star (BY|ROT), and flaring star (FLARES);
21. slowly pulsating B-type variable (SPB);
22. symbiotic variable star (SYST), including Z Andromedae type (ZAND);
23. variable white dwarf (WD), including a generic class (ZZ), objects of spectral type DA, such as ZZ Ceti (known as ZZA or DAV), DB, such as V777 Herculis (known as ZZB, DBV, V777HER, GD358), or DO, such as GW Virginis pre-white dwarf (comprising ZZO, DOV, GWVIR, PG1159); extremely low mass and hot ZZ Ceti variables were labelled as ELM_ZZA and HOT_ZZA, respectively;
24. young stellar object (YSO), including dipper stars (DIP), eruptive YSOs such as FU Orionis type variables (FUOR), pulsating pre-main-sequence stars (PULS_PMS), Herbig Ae or Be types (HAEBE), including UX Orionis stars (UXOR), and T Tauri star (TTS), among which classical (CTTS), weak-lined (WTTS), and late G to early K type pre-main-sequence (GTTS) stars.
A brief definition of the 24 variability class labels is presented also in the vari_classifier_class_definition table of the Gaia archive. These labels identify the best classification of a variable source in the field best_class_name of table vari_classifier_result.
In addition, galaxies (labelled as GALAXY) are classified not because they are intrinsically variable, but because they appear to be photometrically variable in the Gaia processing (see Sect. 2). To prevent misinterpretation, galaxies are published exclusively in the extra-galactic galaxy_candidates table, together with results on galaxies from other Gaia pipelines (Ducourant et al. 2023; Delchambre et al. 2023).
3.1.2. Source selection
For each class in Gavras et al. (2023), two sets of sources were selected: one for training and one for testing purposes, generally with no sources in common, except for classes that were insufficiently represented. Among the trained types, sub-types, and possible combinations thereof that were part of the published classes (listed in Sect. 3.1.1 and in Table 1), the ones associated with the following labels (in alphabetical order) were represented by fewer than about 500–600 (with a median of 86) sources, after the selections described in the subsequent paragraphs of this section: ACEP, ACV, ACYG, ARRD, BCEP, BY|ROT, CP, CTTS, DIP, DSCT+GDOR, ELM_ZZA, EP, FLARES, FUOR, GTTS, GWVIR, HAEBE, HOT_ZZA, MICROLENSING, PULS_PMS, RCB, ROAM, ROAP, RV, SDOR, SN, SPB, SRA, SRC, SRD, SXARI, SXPHE, SYST, T2CEP, UXOR, V1093HER, V361HYA, V777HER, WR, WTTS, ZAND, ZZ, and ZZA. In such cases, higher priority was given to the quality of results (by training with all known objects) rather than to their assessment, so the independence of training and test sets of poorly represented types was not pursued.
For both training and test sets, up to six subsets of sources were created with different sample sizes (up to about 500 for the first one and then approximately 1000, 2000, ..., 5000, depending on the available number of sources per class), in order to have, for a given class, a set of the chosen size ready for use, while preserving the representation of the full magnitude range and sky coverage (see Sect. 3.1.3).
General conditions and selection procedures were applied to all sources of known variables from the cross-match with Gaia (with a few exceptions), as follows.
1. At least five FoV transit observations in the G band.
2. Due to the importance of colour information, at least one observation in GBP or GRP was required (at this stage, independently of the trained attributes described in Sect. 3.1.4).
3. The distribution of angular distances of the cross-match was verified for each class and catalogue: potential spurious matches at significantly higher angular distances than most were removed, unless their correlation with parallax or median G-band magnitude suggested nearby objects. Selections beyond simple angular distance thresholds are detailed in Appendix A.
4. The minimum standard deviation versus median magnitude in the G band was generally set to the third quartile of the standard deviations of 1.6 million reference sources binned in 0.05 mag intervals (see Appendix E), except for a small number of classes such as planetary transits and γ Doradus stars.
5. For catalogues whose distribution of sources in the sky had over- or under-represented regions due to, for example, limitations of ground-based observations or special targeted regions (and not because of the natural distribution of such objects), sources were sampled to prevent or minimise biases resulting from these limitations. The same procedure was also applied to classes that were extremely rich with respect to the few thousand sources per class needed for training (not to overwhelm the identification of rare classes trained with only a few tens or hundreds of sources). In particular, the sky was subdivided into 3072 Hierarchical Equal Area iso-Latitude Pixelization (HEALPix; Górski et al. 2002) pixels (corresponding to a resolution of 4, with mean angular spacing of 3.6645 deg) and the k nearest neighbours (with k depending on the catalogue) to each HEALPix centre were identified within a radius of 5 degrees. This radius allowed for sources to be drawn from nearby HEALPix as well, in order to avoid gaps or distributions reflecting single HEALPix boundaries. Sources were then randomly selected from such k neighbours according to the desired downsampling and possible duplicated sources were removed.
6. The steep increase in the number of objects towards faint magnitudes can make the bright end unnoticeable by classifiers, so sources were sampled in median G magnitude to improve the detectability of objects at the bright end, while keeping the full range of represented magnitudes, for each class with more than 500 sources. In particular, the magnitude distribution was binned in 100 intervals and unique sources were randomly sampled up to a maximum number per bin. The latter depended linearly on the magnitude by a custom (positive) coefficient and offset per class. The coefficient and offset were adjusted for each of the sub-sample sizes introduced earlier (for training and testing). This procedure made it possible to retain the brightest and the faintest objects, as well as an indication of the most represented magnitudes, with a supplementary option to keep full catalogues of special relevance before adding sources from other catalogues.
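The magnitude sampling of item 6 can be illustrated with a minimal sketch. It assumes sources are given as (identifier, median G magnitude) pairs; the function name, the coefficient and offset values, and the floor of one source per populated bin are illustrative assumptions, not the values or code used for Gaia DR3.

```python
import random

def sample_by_magnitude(sources, coeff, offset, n_bins=100, seed=0):
    """Randomly keep at most cap(mag) sources per magnitude bin.

    sources: list of (source_id, median_g_mag) pairs.
    coeff, offset: hypothetical per-class values; cap = coeff * mag + offset.
    """
    rng = random.Random(seed)
    mags = [m for _, m in sources]
    lo, hi = min(mags), max(mags)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    bins = {}
    for sid, mag in dict(sources).items():  # dict() drops duplicated ids
        idx = min(int((mag - lo) / width), n_bins - 1)
        bins.setdefault(idx, []).append((sid, mag))
    kept = []
    for idx in sorted(bins):
        centre = lo + (idx + 0.5) * width
        # Linear cap in magnitude; the floor of 1 is an assumption that
        # keeps sparsely populated (e.g. bright) bins represented.
        cap = max(1, int(coeff * centre + offset))
        members = bins[idx]
        kept.extend(members if len(members) <= cap else rng.sample(members, cap))
    return kept
```

With a positive coefficient, the cap grows towards faint magnitudes, so the sample still indicates where most objects lie while the few bright sources are not swamped.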
Additional per-class selections were applied in specific cases to ensure genuine class representation (see Appendix A). The distribution of training sources in various diagrams is presented in Appendix E, for each published class, as listed in Sect. 3.1.1.
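The HEALPix figures quoted in item 5 of the selection list (3072 pixels and a mean spacing of 3.6645 deg at resolution 4) follow from the standard HEALPix relations: the number of pixels is 12 × nside², with nside = 2^resolution, and the mean spacing is the square root of the mean pixel area. The short check below uses only these relations; the function names are illustrative.

```python
import math

def healpix_pixel_count(level):
    """Number of HEALPix pixels at a given resolution level (nside = 2**level)."""
    nside = 2 ** level
    return 12 * nside ** 2

def mean_spacing_deg(n_pixels):
    """Square root of the mean pixel area, expressed in degrees."""
    area_sr = 4.0 * math.pi / n_pixels  # full sky is 4*pi steradians
    return math.degrees(math.sqrt(area_sr))

n_pix = healpix_pixel_count(4)
spacing = mean_spacing_deg(n_pix)
print(n_pix, round(spacing, 4))  # 3072 3.6645
```

A mean spacing of about 3.7 deg is smaller than the 5 deg search radius, which is consistent with the statement that sources could be drawn from neighbouring pixels as well.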
3.1.3. Training classes
Commonly confused or physically related (sub-)types were combined as listed in the second column of Table 1, where the values in parentheses indicate an approximate number of representatives or an upper limit when fewer sources were available (see Sect. 3.1.2). In general, larger sample sizes were assigned to more frequent types, although not with a realistic occurrence relative to other types (as known from the literature), otherwise (unweighted) decision-tree based classifiers might optimise models that neglected rare types. As some types overlapped in part with other ones (such as CP and MCP), some of the sets had sources in common, thus duplicated sources were removed. For one per cent of the training set, some of the literature classifications were in conflict with other classes (represented in different class groups). In such cases, duplicated sources that were part of generic class groups (such as S and SOLAR_LIKE) were removed in favour of specific classes and the remaining few hundred sources in conflict (none of which belonged to rare classes) were removed from the training set. Special objects that should not be missed, such as class prototypes, were added to the training set for several types. A similar procedure (except for the addition of special objects) was followed for the test set.
Classes that were trained, classified, but not published in Gaia DR3 (following the verification filters mentioned in Sect. 3.3), included blue large-amplitude pulsators (BLAP, Pietrukowicz et al. 2017), FK Comae Berenices-type variables, heartbeat stars, high mass X-ray binaries, poorly studied irregular variables, post-common envelope binaries (or pre-cataclysmic variables), protoplanetary nebulae embedding yellow supergiant post-AGB stars, PV Telescopii-type variables, strong reflection (re-radiation) in close binary systems, UV Ceti stars, general sources with variable X-ray emission, ZZ Leporis stars, and a selection of constant objects from HIPPARCOS (ESA 1997), Optical Gravitational Lensing Experiment-IV (OGLE-IV; Soszyński et al. 2012), Sloan Digital Sky Survey (SDSS; Ivezić et al. 2007), Transiting Exoplanet Survey Satellite (TESS; Ricker et al. 2015), and Zwicky Transient Facility (ZTF; Bellm 2014), from the cross-match of Gavras et al. (2023); the class of constant sources was introduced to clean the classification of variables from false variability detections. Such omitted classes are not listed in Table 1 to prevent false expectations.
3.1.4. Attributes
Classification attributes were used to characterise the light variations and the general properties of sources. About 60 attributes were generated, including time series statistics, photometric colours, astrometric parameters, periodicity indicators, combinations of photometric and astrometric quantities, comparisons of statistics and correlations between the GBP and GRP bands, and stochastic model parameters for AGNs (Butler & Bloom 2011). Their effectiveness in the identification of variability classes was tested with Random Forest classifiers (Breiman 2001), which assessed the attribute usefulness with ‘out-of-bag’ objects (unused for training). Such classifiers were used to select attributes by adding the most useful one to the selection iteratively, until the reduction of the total error rate was insignificant. About 10% of training sources per class (or 20% for classes with fewer than 100 training objects that were not merged with other types in the training set) were used in the identification of optimal attributes. The choices of method for attribute selection and of training-set downsampling were due to the necessity of computational efficiency, given the limited time available.
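The iterative attribute selection described above amounts to a greedy forward selection, sketched below. The error estimate is passed in as a generic callable (in the article this role is played by Random Forest out-of-bag errors); the function name and the `min_gain` stopping threshold are hypothetical, as the article does not quantify what counted as an insignificant reduction.

```python
def forward_select(attributes, error_of, min_gain=0.001):
    """Greedy forward selection of attributes.

    error_of: callable mapping a list of attribute names to an error rate.
    min_gain: hypothetical threshold below which an improvement is ignored.
    """
    selected = []
    remaining = list(attributes)
    best_error = 1.0  # error rate before any attribute is selected
    while remaining:
        # Try each remaining attribute on top of the current selection.
        trial_errors = {a: error_of(selected + [a]) for a in remaining}
        candidate = min(trial_errors, key=trial_errors.get)
        if best_error - trial_errors[candidate] < min_gain:
            break  # no attribute reduces the error significantly
        selected.append(candidate)
        remaining.remove(candidate)
        best_error = trial_errors[candidate]
    return selected, best_error
```

Each iteration requires one model fit per remaining attribute, which is one reason why a downsampled training set helps to keep such a selection affordable.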
The final list of attributes and their definitions are presented in Appendix B. The adaptation of Butler & Bloom (2011) parameters (qso_variability and non_qso_variability) to the Gaia data is described in Carnerero et al. (2023).
The predictive power of classification attributes was limited by the unaccounted reddening of colours, extinguished magnitudes, and no priors on parallax (as a function of celestial location), in addition to the effects of residual outliers, artificial signals, and other photometric imperfections.
3.2. Classifiers
The numbers of classified variable sources and related classes are significantly higher in Gaia DR3 with respect to the previous data release. The richness of variability types was a major challenge and it was addressed with a multitude of classifiers, which provided different perspectives on the peculiarities of each type of variable object.
Classifiers were trained with two machine learning algorithms, as implemented in the H2O platform (H2O.ai 2020): Distributed Random Forest (DRF; Breiman 2001) and eXtreme Gradient Boosting (XGBoost; Chen & Guestrin 2016). Both methods had their pros and cons; DRF results seemed more robust with realistic posterior probability distributions, while XGBoost was more effective in the identification of rare and subtle classes (such as planetary transits).
According to the attribute selection (Sect. 3.1.4), optimal results were achieved with 300 trees and with the square root of the number of attributes as the number of randomly sampled attributes to test at each split of any DRF tree. The same number of trees was used for XGBoost too.
The following types of classifiers were trained:
- multi-class DRF and XGBoost classifiers with all classes at the same level (two versions, with minor updates to some class groups),
- binary DRF (unweighted and weighted) and XGBoost classifiers of one class versus all other classes,
- meta-classifiers that aimed to combine the best per-class results of the DRF and XGBoost methods, applied to both multi-class and binary classifiers,
- a two-stage classifier for main-sequence pulsating OBAF-type stars (Gaia Collaboration 2023b).

These led to over 100 classifiers in total, with up to 12 classifiers per class. For some of the least represented classes, there were classifiers that could not recover any training sources, so they were excluded. Also, results from all the available classifiers were not necessarily used for each class.
In the case of binary classifiers, DRF was more sensitive to the class imbalance than XGBoost, especially for classes that were two orders of magnitude less numerous than the rest of the training set. However, higher weights could be associated with the less represented classes, so DRF was executed also by weighting the targeted class such that the latter became as relevant as all other classes grouped into one, that is, equivalent to a binary classifier with two perfectly balanced classes.
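The balancing weight implied by this description is simply the ratio of the two class sizes, as in the following sketch. The function name and the example counts are illustrative assumptions; how the weight was passed to the DRF implementation is not specified here.

```python
def balancing_weight(n_target, n_others):
    """Weight per target-class source so that the class total equals n_others."""
    if n_target <= 0:
        raise ValueError("the targeted class needs at least one training source")
    return n_others / n_target

# Hypothetical example: a class two orders of magnitude rarer than the rest.
weight = balancing_weight(n_target=200, n_others=20000)
print(weight)  # 100.0
```

With this weight, the 200 targeted sources carry the same total weight as the 20000 others, which mimics a binary problem with two balanced classes.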
In the two-stage classifier for main-sequence pulsating OBAF-type stars, the first stage separated ACV, BCEP, BE, CP, DSCT, GCAS, GDOR, MCP, PULS_PMS, SPB, SXARI, and SXPHE stars (including variability types that may be confused with the pulsating OBAF-type stars, when using only Gaia data) from all other types. In the second stage, the first group of types was split into the targeted ones (BCEP, DSCT, GDOR, SXPHE, and SPB) versus the others.
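The labelling of training types in the two stages can be sketched as follows. The two sets of types are taken from the text above, whereas the function name and the label strings (OBAF_GROUP, OTHER, TARGET, CONFUSER) are hypothetical.

```python
# Types separated from all others in the first stage (from the text).
STAGE1_GROUP = {"ACV", "BCEP", "BE", "CP", "DSCT", "GCAS", "GDOR", "MCP",
                "PULS_PMS", "SPB", "SXARI", "SXPHE"}
# Targeted types isolated in the second stage (from the text).
STAGE2_TARGETS = {"BCEP", "DSCT", "GDOR", "SXPHE", "SPB"}

def stage_labels(var_type):
    """Return (stage-1 label, stage-2 label or None) for a training type."""
    if var_type not in STAGE1_GROUP:
        return ("OTHER", None)  # never reaches the second stage
    second = "TARGET" if var_type in STAGE2_TARGETS else "CONFUSER"
    return ("OBAF_GROUP", second)

# Every second-stage target must have passed the first stage.
assert STAGE2_TARGETS <= STAGE1_GROUP
```

For instance, `stage_labels("DSCT")` gives `("OBAF_GROUP", "TARGET")`, `stage_labels("ACV")` gives `("OBAF_GROUP", "CONFUSER")`, and a type outside the group, such as RR, gives `("OTHER", None)`.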
As a result, for each source, only the combined solution of all these different classifiers is recorded in the best_class_name and best_class_score fields of the vari_classifier_result table (as explained in Sect. 3.3), and the vari_classifier_definition table lists only a single ‘combined’ classifier instead of the over 100 classifiers that were used to compile the classifications of all sources.
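As an illustration of how these two fields might be used, the sketch below builds an ADQL query string on the vari_classifier_result table. It is not one of the queries of Appendix D: the function name, the score threshold, and the row limit are assumptions, and the string would still have to be submitted to the Gaia archive with a TAP client.

```python
def best_class_query(class_name, min_score=0.0, limit=10):
    """Build an ADQL query on gaiadr3.vari_classifier_result.

    class_name: a published label such as 'RR' or 'DSCT|GDOR|SXPHE'.
    min_score, limit: illustrative filter and row limit (assumptions).
    """
    # Accept only letters, digits, '|' and '_' to keep the string literal safe.
    if not class_name.replace("|", "").replace("_", "").isalnum():
        raise ValueError("unexpected characters in class name")
    return (
        f"SELECT TOP {int(limit)} source_id, best_class_name, best_class_score "
        "FROM gaiadr3.vari_classifier_result "
        f"WHERE best_class_name = '{class_name}' "
        f"AND best_class_score >= {float(min_score)}"
    )

print(best_class_query("RR", min_score=0.5, limit=5))
```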
As a side note, the training set for the preceding module of general variability detection (see Sect. 10.2.3 of the Gaia DR3 documentation; Rimoldini et al. 2022) included the training set described in Sects. 3.1.2 and 3.1.3 (except for the constant objects) as a single ‘variable’ class, for the targeted variable sources for Gaia DR3. In addition, a similar number of sources was selected from the 75% least variable ones among 1.6 million reference sources binned in 0.05 mag intervals (see Appendix E), as the other class. A binary XGBoost classifier then identified variable objects according to a classifier probability greater than 0.5 (not published in Gaia DR3), which were subsequently processed by the over 100 classifiers of variability types.
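Selecting the 75% least variable reference sources per magnitude bin is equivalent to keeping those at or below the third quartile of the G-band standard deviation in each 0.05 mag bin. A minimal sketch follows, assuming that the selection was made bin by bin; the function names and the input layout are illustrative.

```python
import statistics
from collections import defaultdict

def q3_threshold_per_bin(reference, bin_width=0.05):
    """Third quartile of G-band standard deviations per magnitude bin.

    reference: iterable of (median_g_mag, std_g) pairs for reference sources.
    Returns {bin_index: Q3}; bins with fewer than two sources are skipped.
    """
    bins = defaultdict(list)
    for mag, std in reference:
        bins[int(mag // bin_width)].append(std)
    return {
        idx: statistics.quantiles(stds, n=4)[2]  # third of three cut points
        for idx, stds in bins.items() if len(stds) >= 2
    }

def is_low_variability(mag, std, thresholds, bin_width=0.05):
    """True if the source lies at or below Q3 in its magnitude bin."""
    q3 = thresholds.get(int(mag // bin_width))
    return q3 is not None and std <= q3
```

The same per-bin third quartile is the quantity used in Sect. 3.1.2 as the general minimum standard deviation required of training sources.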
3.3. Verification and filtering of results
After the automated execution of the classifiers (Sect. 3.2), trained with the selected sources (Sect. 3.1.2) and attributes (Sect. 3.1.4), the results of all classifiers for a given class were verified, cleaned of contaminants, and assessed. The classes whose value was deemed sufficient to be published (separately or merged with other classes) are listed in Sect. 3.1.1, while unconvincing results of other classes were excluded (Sect. 3.1.3).
Depending on class, different selective conditions were applied to the classifier results, such as minimum thresholds on the posterior probability to belong to a given class, adjustments of the minimum variability level, colour and/or magnitude cuts, conditions on astrometric and time series parameters, and limitations on environment crowding, among other restrictions, often guided by the distributions of known objects (among the classified ones) and of the candidates. The list of such filters and thresholds is presented for each class in Sect. 4, to help the interpretation of the results where they are described. Exceptionally, following the feedback from SOS modules, a small fraction of sources (of particular interest) might override the general selection conditions (such as the special objects mentioned in Sect. 3.1.3 or bona fide sources with peculiar behaviour), provided they were classified correctly.
Additional sources were removed following a further post-processing verification, typically involving suspect features for a very small fraction of candidates.
3.4. Classification score
After the verification filters, the remaining candidates were assigned a classification score, which was derived from the posterior probabilities of the classifiers used to identify the sources of a given class. Classifier posterior probabilities were not calibrated, so their values were not directly comparable, because of the use of different methods and classifier structures. In order to treat the posterior probabilities of all classifiers on the same footing for a given class, they were converted into normalised ranks (for each classifier), which could then be compared across different classifiers.
For each classifier, this normalised rank was computed by sorting posterior probabilities of a certain class in ascending order. The source rank depended on its location in this ordered list and identical probabilities corresponded to identical ranks. Denoting the number of different probabilities (that is, the number of sources with posterior probabilities of a given class minus that of sources with duplicated probability values for that class) by N, the possible score values were represented by the ranks of the probabilities, from the first to the N-th, normalised by N: {1, 2, ..., N} / N (some of which were repeated in presence of duplicated probabilities). Thus, for each classifier, the score associated with a source of a given class ranged from a minimum value greater than zero to a maximum of one. The median of such normalised ranks from different classifiers, for a given source and class (best_class_name), was stored as best_class_score in the vari_classifier_result table. Exceptionally, for BCEP and WD, the normalised ranks from probabilities of the general variability detection classifier (Sect. 3.2) were included in the score evaluation, when higher variability implied higher reliability.
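The rank normalisation and the median combination described above can be sketched as follows. This is a minimal illustration with toy probabilities and hypothetical function names; it assumes that the same sources are scored by every classifier, which need not hold in the actual processing.

```python
from statistics import median

def normalised_ranks(probabilities):
    """Rank of each probability among the distinct values (ascending),
    divided by the number N of distinct values; ties share a rank."""
    distinct = sorted(set(probabilities))
    n = len(distinct)
    rank_of = {p: i + 1 for i, p in enumerate(distinct)}
    return [rank_of[p] / n for p in probabilities]

def class_score(ranks_per_classifier):
    """Median of the normalised ranks of one source over the classifiers."""
    return median(ranks_per_classifier)

probs_a = [0.20, 0.90, 0.90, 0.55, 0.70]  # classifier A, five sources
probs_b = [0.10, 0.30, 0.80, 0.20, 0.40]  # classifier B, same sources
ranks_a = normalised_ranks(probs_a)  # [0.25, 1.0, 1.0, 0.5, 0.75]
ranks_b = normalised_ranks(probs_b)  # [0.2, 0.6, 1.0, 0.4, 0.8]
scores = [class_score(pair) for pair in zip(ranks_a, ranks_b)]
```

Classifier A has a duplicated probability, so N = 4 and the two tied sources share the maximum rank of one, while the minimum rank (1/N) stays above zero, as stated in the text.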
Occasional gaps in the distribution of the classification score of some classes were created by the removal of sources associated with multiple classes (Sect. 3.5) and also of some questionable candidates, following the post-processing verification. The average and extreme values of the classification scores are presented in Table 2 for each class.
Table 2. Statistics of classification results for each class (source counts, classification and F1 scores, completeness and contamination rates) and their distribution with respect to the corresponding SOS modules (sources in ‘common’ between the ones classified as a given class and the corresponding SOS module, ‘extra’ sources in classification, sources classified as ‘other’ classes or ‘missed’ by classification).
3.5. Multi-class source reduction
After the verification filters (Sect. 3.3) and assignment of classification scores (Sect. 3.4), the combination of all per-class candidates revealed about 110 thousand sources (less than 1% of all classified sources) associated with multiple (usually two) classes. The vast majority of them were not due to genuine concurrent variability types (as may happen, for example, in an eclipsing binary system with intrinsically variable components, perhaps with transiting exoplanets too). Multiple variability was not pursued in Gaia DR3, so a single class per source was enforced.
The following set of rules was applied to multi-class sources in order to reduce them to a single class per source.
1. The least numerous classes would be unfairly reduced if their candidates were left to compete with those of the most numerous classes, so rare classes of ACYG, BCEP, BE|GCAS|SDOR|WR, EP, MICROLENSING, RCB, SDB, SN, SPB, SYST, and WD were safeguarded against any other alternative class.
2. For 1290 sources that remained with multiple classes after the application of all the other rules, the class corresponding to the highest classification score was kept.
3. Candidates of some classes were considered dispensable because they were byproducts of classification (such as galaxies), or related to SOS modules that did not rely on classification and were expected to provide higher quality candidates than a general classification (with expert procedures dedicated to a single class). Such classes included S, GALAXY, SOLAR_LIKE, and RS, ordered from the most to the least dispensable class. Dispensable classes were dropped from multi-class sources, provided at least one non-dispensable class per source remained. For sources with all dispensable multi-classes, only the least dispensable one was kept (following the above mentioned list ordered by decreasing dispensability).
4. For the special objects (Sect. 3.1.3) with multiple classes, only the known class was kept, when available among the classifications, otherwise the standard treatment of other multi-class sources was followed.
5. For multi-class sources that were present also in at least one SOS module, different scenarios were possible:
(a) the classified classes included at least one class for which a corresponding SOS module did not exist or whose source was absent from the corresponding SOS module:
  - all classes with the source absent from the corresponding SOS module, provided the latter existed, were removed (e.g. for a source classified as BE+AGN and present in the upper main-sequence oscillator SOS module but not in the AGN SOS one, the AGN class was removed),
  - the classes with the source present in the corresponding SOS modules were kept and all other classes were removed (e.g. for a source classified as BE+AGN+RR and present only in the AGN SOS module, the BE and RR classes were removed);
(b) all classes had the source present in the SOS modules: classes unmatched by the corresponding SOS classes were removed, as long as at least one of the other classes matched an SOS class, otherwise item (5c) was applied;
(c) for all classes that had the source present in SOS modules but that did not match any of the SOS classes, the class matching criterion was extended to include the similarity of such classes with the following SOS modules, for partial validation:
  - BE|GCAS|SDOR|WR and GALAXY sources in the AGN SOS module,
  - CEP sources in the RR Lyrae and the long-period variable SOS modules,
  - ECL sources in the RR Lyrae SOS module,
  - ELL and EP sources in the eclipsing binary SOS module,
  - RR sources in the Cepheid or eclipsing binary SOS modules,
  - S in the eclipsing binary or the RR Lyrae SOS modules,
  - YSO in the rotational modulation SOS module;
  classes not similar to those of SOS modules were removed, as long as at least one of the other classes was similar to an SOS class, otherwise item (2) was applied.
The existence of SOS modules matching some of the multiple classes, but without the multi-class sources present in any of them, added no information to the selection rules (bona fide candidates can be excluded from the corresponding SOS modules for various reasons, such as insufficient sampling for reliable model results).
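A simplified sketch of the reduction to a single class per source is given below. It covers only rules 1, 3, and 2 (applied in this order, which is an assumption except for rule 2 being the last one, as stated in the text); the treatment of special objects (rule 4) and of SOS modules (rule 5) is not modelled, and the function name is hypothetical.

```python
# Rare classes safeguarded by rule 1 (from the text).
RARE = {"ACYG", "BCEP", "BE|GCAS|SDOR|WR", "EP", "MICROLENSING", "RCB",
        "SDB", "SN", "SPB", "SYST", "WD"}
# Ordered from the most to the least dispensable class (rule 3).
DISPENSABLE = ["S", "GALAXY", "SOLAR_LIKE", "RS"]

def reduce_classes(candidates):
    """candidates: dict mapping class name -> classification score.
    Returns the single class kept after simplified rules 1, 3, and 2."""
    classes = dict(candidates)
    rare = {c: s for c, s in classes.items() if c in RARE}
    if rare:                                  # rule 1: rare classes win
        classes = rare
    kept = {c: s for c, s in classes.items() if c not in DISPENSABLE}
    if kept:                                  # rule 3: drop dispensable
        classes = kept
    elif len(classes) > 1:                    # rule 3: all dispensable
        least = max(classes, key=DISPENSABLE.index)
        classes = {least: classes[least]}
    return max(classes, key=classes.get)      # rule 2: highest score

print(reduce_classes({"RR": 0.9, "SN": 0.2}))
```

In this sketch, a rare class such as SN prevails over RR regardless of the scores, a dispensable class such as S is dropped in favour of ECL, and the highest score decides only when no other rule applies.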
3.6. Classification versus SOS modules
Although SOS modules provided purer class samples than those from classification, the latter did not necessarily represent a superset of the corresponding SOS modules. As shown in Table 2, in addition to the ‘extra’ sources in classification and the ones in ‘common’ with respect to the SOS modules, some sources were classified as ‘other’ classes (see Fig. 7 of Eyer et al. 2023), different from the expected SOS modules, and others were ‘missed’ in the published classification results. The reasons for the ‘other’ and ‘missed’ classifications are listed in the following.
- Some SOS modules did not depend on classification but relied on special variability detection or extraction of candidates (as for microlensing events, short timescales, and solar-like rotation modulation).
- For the SOS modules that depended on classification:
  (a) they received input candidates before the multi-class source treatment described in Sect. 3.5, so sources could be assigned to ‘other’ classes as a consequence of enforcing a single class per source;
  (b) some of these SOS modules included classification candidates from multiple classes, due to similar features (such as long-period variables and symbiotic stars);
  (c) given the advanced class-specific SOS verification, permissive classifier probability thresholds or their combination with other parameters allowed for a larger initial set of candidates (before verification cuts) than the one considered for classification, for a given class.
- The results of several SOS modules were not mutually exclusive (see Fig. 6 of Eyer et al. 2023), while only one class per source was required by classification.
4. Results
From the about 400 million sources identified as potentially variable by the preceding general variability detection module (Sect. 3.2), 12 428 245 objects were selected for the Gaia DR3 classification results. Among them, 9 976 881 variable sources, classified into 24 groups of variability types, and 2 451 364 galaxies were published in the vari_classifier_result and galaxy_candidates tables, respectively. The latter includes also results from other coordination units4 of the Gaia Data Processing and Analysis Consortium (DPAC), as described in Gaia Collaboration (2023a). The exclusion of galaxies from the variability result tables was motivated by their spurious light curve variability (Holl et al. 2023). For the same reason, galaxy light curves were not published, except for the ones that were automatically included in the Gaia Andromeda Photometric Survey (Evans et al. 2023).
The classified variable sources are identified by source_id, they are assigned class labels (best_class_name), and they are associated with classification scores (best_class_score) in the vari_classifier_result table. The galaxies whose classification was based on photometric time series are labelled as GALAXY in the field vari_best_class_name, with the classification score stored in the field vari_best_class_score of the galaxy_candidates table (see Appendix D).
Figure 1 depicts the number of classified sources per class group, from the most to the least numerous class (galaxies and R Coronae Borealis stars, respectively). Besides common and rare classes, certain class groups such as ACV|CP|MCP|ROAM|ROAP|SXARI and S were published for exploratory use to the benefit of the community.
Fig. 1. Number of sources per class, sorted in decreasing order. The bars shaded in yellow correspond to variability types published in the vari_classifier_result table, while galaxies, identified by their artificial photometric variations in Gaia, are published exclusively in galaxy_candidates. The ACV|CP|MCP|ROAM|ROAP|SXARI class group is abbreviated as ACV|CP|...|SXARI.
Table 2 summarises the classification results for each class: the source counts, the distribution of classification scores (Sect. 3.4), completeness (the ratio of identified to all known sources of a given class), contamination (the fraction of contaminants among the classifications of a given class, i.e. 1− purity), the F1 score (the harmonic mean of completeness and purity), the maximum F1 value and the corresponding minimum classification score (for an optimal balance between completeness and contamination), and a comparison with SOS modules (explained in Sect. 3.6). Sample ADQL queries in Appendix D include indications of how to retrieve all the candidates of a given class (from classification and SOS modules) and how to reproduce the comparison of results from classification versus SOS modules in Table 2.
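The relation between the metrics defined above can be illustrated with a short sketch. The formulas follow the definitions in the text (purity is one minus contamination and F1 is the harmonic mean of completeness and purity); the function names and the toy curve of rates versus minimum classification score are hypothetical and do not reproduce any value of Table 2.

```python
def f1_score(completeness, contamination):
    """Harmonic mean of completeness and purity (1 - contamination)."""
    purity = 1.0 - contamination
    if completeness + purity == 0:
        return 0.0
    return 2.0 * completeness * purity / (completeness + purity)

def best_min_score(curve):
    """curve: iterable of (min_score, completeness, contamination).
    Returns the maximum F1 and the minimum score at which it occurs."""
    return max((f1_score(c, k), s) for s, c, k in curve)

# Toy rates: raising the minimum score lowers both completeness
# and contamination.
curve = [(0.0, 0.90, 0.40), (0.3, 0.80, 0.20), (0.6, 0.50, 0.05)]
best_f1, best_threshold = best_min_score(curve)
```

In the toy curve, the intermediate threshold gives the best balance, because the gain in purity at the highest threshold does not compensate for the loss of completeness.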
Completeness and contamination rates are computed for each class globally, with no restriction on sky location, magnitude, amplitude, signal-to-noise, period range, or other parameters, thus they might be more conservative than the detailed estimates presented in the papers describing the SOS module results. Typically, such rates depend on surveys from the literature in optical bands (or with detectable counterparts in Gaia), which are employed as reference. For this study, we take advantage of the catalogues of variable objects compiled by Gavras et al. (2023), in particular considering a subset of reliable catalogues (flagged in their selection field). More details on the class composition of the true and unknown positives, in addition to the false positives and negatives, are presented in Table 3.
Table 3. Completeness and contamination details of classification results for each class (the class group ACV|CP|MCP|ROAM|ROAP|SXARI was abbreviated as ACV|CP|...|SXARI).
For an overview of the identified classes of the least sampled sources, Fig. 2 shows the occurrence of classified objects as a function of median G-band magnitude for sources with up to 10, 15, and 20 G-band observations. Variable objects and galaxies are shown separately and in both cases the least sampled sources are the most numerous at the faint end, as a consequence of Gaia’s magnitude limit, with thicker tails of the magnitude distribution towards bright magnitudes from objects with more observations. Similarly, the distributions in the sky of variable classifications with up to 10, 15, and 20 G-band observations are presented in Fig. 3. The gaps created by sources with more measurements (following the Gaia scanning law) split the sky distributions in several regions: variables with up to 15 and 20 G-band observations tend to be more numerous in regions intersected by the Ecliptic, while sources with up to 10 measurements in the G band concentrate towards the Galactic plane. Although galaxies are not shown in Fig. 3, most of them are located in the previously mentioned regions except for the Galactic plane, for all samples (including the galaxies with up to 10 observations). The top-five most common classes among the classified sources with the lowest number of observations are listed as follows (with the number of sources satisfying the sampling conditions indicated in parentheses):
- GALAXY (43 428), LPV (30 203), SN (2515), CV (818), and AGN (579), for sources with up to 10 G-band FoV observations;
- GALAXY (273 603), LPV (104 133), AGN (20 846), RR (4295), and SN (2833), for sources with up to 15 G-band FoV observations;
- GALAXY (643 935), LPV (170 681), ECL (104 043), AGN (96 760), and RR (22 005), for sources with up to 20 G-band FoV observations.
Fig. 2. Number of the least-sampled sources in 0.2 mag intervals as a function of the median G-band magnitude. The number of variable sources (in the vari_classifier_result table of the Gaia archive) is colour coded by the maximum number of selected FoV observations in the G band (num_selected_g_fov up to 10, 15, and 20) as shown in the legend (with orange, green, and blue colours, respectively). The bars shaded in grey refer to the same conditions but for the galaxies identified by their artificial variability (published in the galaxy_candidates table), including unpublished values of num_selected_g_fov for galaxies outside the Gaia Andromeda Photometric Survey. The white vertical line marks the median G magnitude of 20.7. The fields num_selected_g_fov and median_mag_g_fov are published in the vari_summary table.
Fig. 3. Sky map of the least-sampled sources in Galactic coordinates (white grid), colour coded as in Fig. 2 for variable sources with num_selected_g_fov up to 10, 15, and 20. Galactic longitude is zero at the centre and increases towards the left. The thin line in black denotes the Ecliptic.
Galaxies are the most common sources with few observations because they are the faintest classified objects and thus susceptible to reduced detectability at the faint end of the Gaia detection limit. Long-period variables rank second, as expected from their relatively high occurrence and some of their features (such as high intrinsic luminosity, red colours, and often high amplitudes), which ease their identification even at large distances. Active galactic nuclei appear among the top classes, given their magnitude distribution at the faint end. About 83% of supernovae have fewer than 11 measurements in the G band, because the originating stars become detectable only after explosion and the subsequent brightness decay further reduces the time span of detectability (for the distribution of the time intervals of Gaia observations, see Appendix A.4.3 in Eyer et al. 2017).
The following sub-sections present more details for each class, among which the number of selected classifiers (as an upper limit as not all sources were necessarily classified as a given class by all classifiers), the verification cuts, special considerations on completeness or contamination (when applicable), and the references to relevant articles. A selection of diagrams illustrating the general properties of classified sources for each class is presented in Appendix E, followed by a visualisation of completeness versus contamination rates as a function of minimum classification score, F1 score versus minimum classification score, and samples of light curves in the G band versus time or phase (the latter is contingent upon the presence of a single dominant period). An overview of all variability results (including classifications), with figures and tables combining metrics of several classes, is presented in Eyer et al. (2023).
4.1. ACV|CP|MCP|ROAM|ROAP|SXARI stars
This class group accounts for 10 779 variable sources from a set of different types, which are particularly challenging and share similar features (according to the Gaia DR3 data): α2 Canum Venaticorum, (Magnetic) Chemically Peculiar star, Rapidly Oscillating Am or Ap star, and SX Arietis variable. Some types are physically unrelated to others, nevertheless they were grouped because classifiers often confused them. For example, the rapid and low amplitude oscillations of ROAM and ROAP could not be captured with Gaia’s per-FoV photometry, so these types were modelled only with global properties, such as the location in the observational Hertzsprung–Russell diagram, thus causing significant contamination by constant stars, in addition to contaminants from other variability types such as DSCT and eclipsing binaries (see Table 3). Perhaps, their classification will improve with per-CCD photometry in the next Gaia data release. This group of variability types includes also some SPB stars, which are not mentioned in this class group denomination because of the attempt to split them into their own class (Sect. 4.21). This class was considered by the upper main-sequence oscillator SOS module, but no source satisfied its requirements (see Sect. 10.14 of the Gaia DR3 documentation; Rimoldini et al. 2022).
These candidates were selected from three multi-class and binary meta-classifiers (Sect. 3.2) with some minimum probability level. The following additional conditions were required (employing field names in the vari_summary table).
1. The values of std_dev_mag_g_fov were set above the third quartile of the standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E), as the ROAP training objects below this threshold (as mentioned in Appendix A) were most likely amplifying the contamination by constant stars.
2. To further reduce contaminants, a higher level of variability probability than the one used for the general variability detection (Sect. 3.2) was required.
3. As predictions still tended to be less variable than those exhibited in the literature, a minimum threshold was set for the single-band Stetson index: stetson_mag_g_fov > 1.5.
4. The reddened colour median_mag_bp − median_mag_rp was set to be bluer than 1 mag, in order to remove sparse outliers, whose presence among the training sources (see Fig. E.1d) was subsequently deemed questionable. In fact, the bulk of results was bluer than 0.5 mag and redder objects were associated with low scores (Fig. E.2b).
5. While training objects reduced steeply beyond a median G-band magnitude of about 10, most classifications extended towards fainter magnitudes; such dubious predictions were limited by the conservative condition median_mag_g_fov < 10 mag.
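The published thresholds among the conditions above can be combined in a short filter, as sketched below. The field names are those of the vari_summary table quoted in the text; the function name and the std_dev_threshold argument (standing in for the per-bin third quartile of the reference sources of condition 1) are hypothetical, and condition 2 is omitted because the variability probability is not published.

```python
def passes_acv_group_cuts(row, std_dev_threshold):
    """row: dict keyed by vari_summary field names.
    std_dev_threshold: third quartile of the reference standard
    deviations in the magnitude bin of the source (hypothetical input)."""
    colour = row["median_mag_bp"] - row["median_mag_rp"]
    return (row["std_dev_mag_g_fov"] > std_dev_threshold  # condition 1
            and row["stetson_mag_g_fov"] > 1.5            # condition 3
            and colour < 1.0                              # condition 4
            and row["median_mag_g_fov"] < 10.0)           # condition 5

# Toy candidates (all values are invented for illustration).
bright_blue = {"std_dev_mag_g_fov": 0.012, "stetson_mag_g_fov": 2.3,
               "median_mag_bp": 8.10, "median_mag_rp": 8.05,
               "median_mag_g_fov": 8.08}
too_faint = dict(bright_blue, median_mag_g_fov=12.4)
```

A candidate must satisfy all conditions at once, so the second toy source is rejected only because of its faint median magnitude.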
According to Table 3, additional candidates of this class group may be found among the false positives of the SPB (221) and ACYG (7) classes, where the number of known sources indicated in parentheses is a lower limit.
4.2. α Cygni stars (ACYG)
The classification of the α Cygni-type variables comprised 329 candidates, selected from two binary meta-classifiers (Sect. 3.2) with some minimum probability threshold, with no further condition. It is one of the rare types for which the training set sources were included in the assessment of completeness and contamination rates (in Table 2) because of the lack of known stars of this type. Among the contaminating classes, GCAS stars are the most common, as expected from the partial overlap in the bright and blue part of the colour–magnitude diagram (see Figs. E.5b and E.14b).
From the diagrams that include median_mag_g_fov from the vari_summary table (as in Figs. E.5b,c,g), two clumps of candidates separated at median_mag_g_fov ≈ 9.5 mag are clearly visible; the fainter one is also associated with lower scores and, as apparent from the sky map in Fig. E.5a, 82% of them (73/89) are projected in the direction of the Magellanic Clouds, with an average best_class_score of 0.3 (in the vari_classifier_result table). The bright and faint clumps were not so obvious from the training sources, as the relative occurrence in the Magellanic Clouds was strongly underrepresented and consequently related to lower scores than Galactic sources.
The upper main-sequence oscillator SOS module did not consider ACYG classifications as input, but it implicitly did so as its candidates were selected before the implementation of the rules that reduced multi-class sources to a single class per source (Sect. 3.5) and eventually five ACYG candidates became upper main-sequence oscillator candidates too.
4.3. Active galactic nuclei (AGN)
The 1 035 207 AGN candidates were selected from 11 multi-class and binary meta-classifiers (Sect. 3.2) following some minimum probability level and were filtered according to the following conditions (employing field names in the vari_summary and gaia_source tables), guided mostly by the Gaia celestial reference frame objects (Gaia-CRF3; Gaia Collaboration 2022) among the classified AGN sources.
1. A higher level of variability probability than the one used for the general variability detection (Sect. 3.2) was required in order to focus on truly variable AGNs, considering the possible contribution of artefacts such as the one described by Holl et al. (2023), when the host galaxy is detectable; the consequent offset with respect to the general threshold is visible in Fig. E.8e.
2. The renormalised unit weight error ruwe was set to be lower than 1.2, as the astrometric measurements of the vast majority of AGNs fit well the single-source model of the astrometric solution (97% of Gaia-CRF3 objects satisfy ruwe < 1.2).
3. The abbe_mag_g_fov parameter was set to be lower than 0.9, as long timescale variations observed with Gaia’s average sampling typically lead to low values for this parameter (see Fig. E.8f, where the classification score mirrors the density of training objects in Fig. E.7f).
4. The number of sources within 100 arcsec from each AGN candidate (computed by excluding the contribution of the AGN source at the centre) was set to be less than 126, in order to avoid crowded stellar fields in the foreground, as the steep increase in the number of stars per solid angle can lead to regions with excessive false positive rates, sometimes even where very few AGNs are detectable in the optical wavelengths following an enhanced interstellar extinction. The number density threshold corresponds to a suspicious increase of the fraction of unknown-to-known AGNs towards higher crowding. The most evident sky regions affected by this condition are the Magellanic Clouds and the Galactic plane, as shown in Fig. E.8a. Occasional AGN candidates appear at low Galactic latitudes, especially towards the Galactic anti-centre, most of which are associated with low classification scores.
5. For sources with existing parallax, the latter was assumed to be insignificant, as for any extra-galactic object, after correcting it for a global systematic offset of −0.017 mas (Lindegren et al. 2021a), consequently its ratio with an assumed Gaussian uncertainty was expected to follow a standard normal distribution (with zero average and unit variance). We required parallax measurements not to deviate beyond five sigma: | parallax+0.017 mas | / parallax_error < 5. Only 282 of the published AGN candidates had no parallax (nor proper motion) available, while the remaining 1 034 925 sources had (at least) five-parameter astrometric solutions.
6. The same rationale of item (5) was applied to AGN classifications with existing proper motion (with negligible offset; see Lindegren et al. 2021a; Gaia Collaboration 2022), leading to the following condition, which accounts also for the correlation between the measured proper motion along the right ascension and declination directions:
pmra² / pmra_error² + pmdec² / pmdec_error² − 2 (pmra / pmra_error) × (pmdec / pmdec_error) × pmra_pmdec_corr < 5² (1 − pmra_pmdec_corr²).
7. Three conditions were set in the Gaia reddened colour–colour diagrams as follows:
(a) median_mag_g_fov − median_mag_rp > −3.7 (median_mag_bp − median_mag_g_fov) − 0.7,
(b) median_mag_g_fov − median_mag_rp < −0.75 (median_mag_bp − median_mag_g_fov) + 1.55,
(c) median_mag_g_fov − median_mag_rp > 1.6 (median_mag_bp − median_mag_g_fov) − 0.6,
whose impact is clearly visible in Fig. E.8c.
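The astrometric and colour conditions of items (5) to (7) can be written compactly as in the following sketch. The thresholds and the parallax offset are those quoted in the text, while the function names, the argument names, and the test values are hypothetical; inputs are assumed to be in the units of the gaia_source and vari_summary tables.

```python
PARALLAX_OFFSET_MAS = -0.017  # global systematic offset quoted in the text

def parallax_ok(parallax, parallax_error, n_sigma=5.0):
    """Offset-corrected parallax within n_sigma of zero (item 5)."""
    return abs(parallax - PARALLAX_OFFSET_MAS) / parallax_error < n_sigma

def proper_motion_ok(pmra, pmra_error, pmdec, pmdec_error, corr,
                     n_sigma=5.0):
    """Correlation-aware proper motion significance (item 6)."""
    x = pmra / pmra_error
    y = pmdec / pmdec_error
    lhs = x * x + y * y - 2.0 * x * y * corr
    return lhs < n_sigma ** 2 * (1.0 - corr * corr)

def colours_ok(g, bp, rp):
    """Three cuts in the reddened colour-colour plane (item 7), using
    median magnitudes in the G, BP, and RP bands."""
    bp_g = bp - g
    g_rp = g - rp
    return (g_rp > -3.7 * bp_g - 0.7
            and g_rp < -0.75 * bp_g + 1.55
            and g_rp > 1.6 * bp_g - 0.6)

# The correlation matters: the same normalised components can pass or
# fail depending on the sign of the product relative to the correlation.
aligned = proper_motion_ok(0.4, 0.1, 0.4, 0.1, corr=0.9)
opposed = proper_motion_ok(0.4, 0.1, -0.4, 0.1, corr=0.9)
```

The last two lines illustrate why the correlation term is included: components that deviate along the direction favoured by the correlation are less significant than the same deviations in the opposite direction.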
The set of conditions listed above proved efficient in the selection of the bulk of AGN candidates while keeping low contamination, however, peculiar objects may not conform to these general rules. In order to recuperate known AGNs with different behaviour from average and those of particular interest for which Gaia time series might provide useful complementary information, some criteria were relaxed and the following objects were included (provided they were classified as AGN): blazars (Massaro et al. 2015; Bonato et al. 2018; Chang et al. 2017), known and possible strong gravitationally lensed AGNs5, the top-thousand brightest AGNs (among which 3C 273; Gaia DR3 source_id 3700386905605055360), and the supermassive binary black hole candidate OJ 287 (Gaia DR3 source_id 660820614442429056). Similarly, about 50 thousand AGN candidates were recovered following the feedback from the AGN SOS module. According to Table 3, additional candidates may be found among the false positives of the GALAXY (1884) and CV (37) classes, where the numbers of known sources indicated in parentheses represent lower limits.
In comparison with the AGN SOS module requirements described in Carnerero et al. (2023), the conditions applied to classification results were generally more permissive, in particular:
1. at least five (instead of 20) FoV transits in the G band,
2. no cut in the ruwe versus abbe_mag_g_fov plane, but a slightly more restrictive limit on the ruwe parameter (less than 1.2 instead of 1.3),
3. no constraint on the amount of scan-angle dependent signal (Holl et al. 2023),
4. no requirements on the following parameters, which are defined and published in the vari_agn table: fractional_variability_g (Vaughan et al. 2003), structure_function_index (Simonetti et al. 1985), qso_variability and non_qso_variability (Butler & Bloom 2011).
Parallax was used as classification attribute (among others listed in Appendix B) and consequently it introduced a bias in the distribution of the parallax significance of AGN candidates with respect to best_class_score (in the vari_classifier_result table). Figure 4a illustrates a comparison between the top-150 000 (high-score) AGN classifications and the bottom-150 000 (low-score) candidates: there is a distinct deficit of sources with positive parallax among the most reliable AGN candidates, clearly yielding to stellar objects in the Galaxy (often associated with positive parallax), while the low-score sample is biased in the opposite direction, in addition to being the main contributor to thicker than Gaussian tails at both ends. Eventually, the interplay among high and low score AGNs leads to an overall distribution that approximates a standard normal distribution. As expected, no training AGN source and only 13 low-score AGN candidates fulfil the condition of parallax_over_error > 5 required for observational Hertzsprung–Russell diagrams (Figs. E.7d and E.8d). A different distribution of high versus low score candidates is expected for every classification attribute that can discriminate AGN from other classes, such as the G-band magnitude and colours (Figs. E.8b,c), the Abbe parameter (Fig. E.8f), and the Butler & Bloom (2011) metrics (Carnerero et al. 2023).
Fig. 4. Distribution of parallax (a) and proper motion components, pmra along the right ascension (b) and pmdec along the declination (c), normalised by the corresponding uncertainties and binned in intervals of 0.02, for the 1 034 925 AGN candidates with at least five-parameter astrometric solutions, for the top-150 000 candidates (denoted as high scores; best_class_score > 0.8995154), and for the bottom-150 000 candidates (denoted as low scores; best_class_score < 0.1615997), colour-coded as indicated in the legend. The bias from the inclusion of parallax among classification attributes is evident in panel a (see Sect. 4.3 for details).
Proper motion was not among the classification attributes and Figs. 4b,c do not show evident biases in the distribution of proper motion components, except for the expected thicker tails and more extreme values from low-score candidates with respect to those from sources in the high-score sample. The asymmetric proper motion distribution of low-score AGNs, in particular in the declination direction (Fig. 4c), is characteristic of the motion of stars in the Galaxy (see Fig. D.3 in Gaia Collaboration 2022) and thus it suggests stellar contamination. Table 4 summarises the averages and standard deviations of the distributions illustrated in Fig. 4, in order to help assess biases and similarity with respect to a standard normal distribution.
While the completeness rate of AGN candidates in Table 2 is consistent with the assessment of Carnerero et al. (2023), the contamination rate is deemed underestimated, given the large set of 200 081 ‘unknown’ positives stated in Table 3. Comparing the 1 034 925 AGN candidates with at least five-parameter astrometric solutions with 17 catalogues from the literature as described in Gaia Collaboration (2022), only about 45 thousand sources were unknown. While many of them may still be genuine AGNs, literature catalogues may also have false positives, thus an approximate estimate for the contamination is ∼5%.
A comparison of the sources from all of the classes identified by variability that overlap with those from QSO modules published by other Gaia coordination units (gathered in the qso_candidates table) is presented in Tables 12.9 and 12.12 of the Gaia DR3 documentation (Teyssier & Gaia QSO Working Group 2022). It is noted that the unfiltered qso_candidates table has significant stellar contamination (estimated at 76% in Gaia Collaboration 2023a, where a query to select a purer sub-sample is indicated).
4.4. β Cephei stars (BCEP)
The classified β Cephei type candidates include 1475 sources, 174 of which are shared with the upper main-sequence oscillator SOS module (see Sect. 10.14 of the Gaia DR3 documentation; Rimoldini et al. 2022), although multi-periodicity may not be easily detectable with the current average number of observations per source. More information on the upper main-sequence oscillators is provided in Gaia Collaboration (2023b).
Training sources for the β Cephei class included a tail of objects at the blue end with lower intrinsic luminosity than possible for this class (see Fig. E.11d), explaining the contamination by GCAS mentioned in Table 3. Figures E.11b,e,g, show that candidates with fainter apparent magnitudes tend to have lower scores, mirroring the decreasing occurrence of training sources towards faint magnitudes (only three training objects had G median magnitude fainter than 12).
The BCEP candidates were selected from 11 binary, multi-class, multi-stage, and meta-classifiers (Sect. 3.2) according to a minimum probability threshold. Two conditions were then set in the Gaia reddened colour–colour diagram, in order to remove sparse outliers with respect to the general colour–colour relation followed by all other sources of this class (with a model of G − GRP given GBP − GRP, based on known objects; see Fig. E.11c):
1. median_mag_g_fov − median_mag_rp < model(G − GRP | GBP − GRP) + 0.03 mag,
2. median_mag_g_fov − median_mag_rp > model(G − GRP | GBP − GRP) − 0.03 mag,
employing field names in the vari_summary table.
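The two conditions above amount to keeping sources within a band of ±0.03 mag around a colour–colour model. A minimal sketch follows; the coefficients of the model are not given in this section, so `toy_model` is a purely hypothetical placeholder and not the fit used for the catalogue.

```python
def within_colour_band(g, bp, rp, model, half_width=0.03):
    """True if G - GRP lies within +/- half_width mag of model(GBP - GRP).

    g, bp, rp stand for median_mag_g_fov, median_mag_bp, median_mag_rp.
    """
    g_rp = g - rp
    predicted = model(bp - rp)
    return predicted - half_width < g_rp < predicted + half_width


def toy_model(bp_rp):
    # Hypothetical linear relation, NOT the model used in the paper.
    return 0.5 * bp_rp + 0.02


print(within_colour_band(10.00, 10.05, 9.95, toy_model))  # inside the band
print(within_colour_band(10.20, 10.05, 9.95, toy_model))  # outside the band
```

Passing `half_width=0.025` gives the analogous band quoted for the DSCT|GDOR|SXPHE class in Sect. 4.8.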
Although not listed in the top-6 false positive types of Table 3, additional BCEP candidates may be found among the false positives of the following classes: ACV|CP|MCP|ROAM|ROAP|SXARI (3), BE|GCAS|SDOR|WR (1), CV (1), and RS (1), where the numbers of known sources indicated in parentheses represent lower limits.
4.5. BE|GCAS|SDOR|WR stars
This class group includes a set of 8560 eruptive variables of the following types: B-type emission line star, γ Cassiopeiae, S Doradus, and Wolf-Rayet star. Stars of types BE|GCAS, SDOR, and WR can significantly overlap in the blue and bright end of the Hertzsprung–Russell diagram and share similarities in the irregular light changes. Therefore, although candidates of these types were selected independently (including the verification filters mentioned below), they were subsequently merged, preserving their original classification scores. The bimodal distribution in apparent magnitude, clearly visible in Figs. E.13b,e,g and E.14b,e,g, is not related to different class components, but to sources (mostly BE|GCAS) located in the Galaxy (bright clump) versus the Magellanic Clouds (faint clump). The latter is also associated with the clump with scattered colours in Figs. E.13c and E.14c.
The BE|GCAS candidates were obtained from two binary and multi-class classifiers (Sect. 3.2) with some minimum probability level. They were further constrained by the following criteria (employing field names in the vari_summary and gaia_source tables).
1. A higher level of variability probability than the one used for the general variability detection (Sect. 3.2) was required, in order to prioritise high-amplitude candidates.
2. The renormalised unit weight error ruwe was restricted to values lower than 1.4, which was consistent with the vast majority of known objects of this class.
3. The values of the Abbe parameter were constrained for the three Gaia bands, to increase the contribution of correlated time-dependent variations, given the sampling of Gaia:
   (a) abbe_mag_g_fov < 0.8,
   (b) abbe_mag_bp < 1,
   (c) abbe_mag_rp < 1.
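To illustrate why low Abbe values favour correlated, time-dependent variations, the sketch below uses the usual definition of the Abbe value, A = n / (2(n − 1)) · Σ(x_{i+1} − x_i)² / Σ(x_i − x̄)², applied to a time-sorted series. That the catalogue fields follow exactly this definition is an assumption of this example; the function names are hypothetical, and criteria 1 and 2 are left out because the variability probability threshold is not quoted here.

```python
def abbe_value(mags):
    """Abbe value of a time-sorted magnitude series (needs n >= 2 and non-zero variance)."""
    n = len(mags)
    mean = sum(mags) / n
    successive = sum((b - a) ** 2 for a, b in zip(mags, mags[1:]))
    total = sum((m - mean) ** 2 for m in mags)
    return n / (2.0 * (n - 1)) * successive / total


def passes_abbe_cuts(abbe_g, abbe_bp, abbe_rp):
    """Thresholds of criterion 3 for BE|GCAS candidates."""
    return abbe_g < 0.8 and abbe_bp < 1.0 and abbe_rp < 1.0


smooth_trend = [float(i) for i in range(10)]   # slow, correlated change
alternating = [float(i % 2) for i in range(10)]  # point-to-point flicker

print(abbe_value(smooth_trend))  # well below 1
print(abbe_value(alternating))   # above 1
```

A smooth trend yields a value close to zero, whereas uncorrelated noise scatters around one, so the thresholds above select light curves with coherent changes between consecutive observations.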
Exceptionally, probabilities from the general variability detection classifier were included in the computation of the classification score, to further increase the weight of high-amplitude candidates. Given the similarities between Be pulsators and SPB stars and related confusion from classifiers, the upper main-sequence oscillator SOS module (see Sect. 10.14 of the Gaia DR3 documentation; Rimoldini et al. 2022) considered, among other input sources, classifications of BE|GCAS stars (223 of which were published as upper main-sequence oscillators).
The SDOR classifications were derived from ten binary, multi-class, and meta-classifiers with minimum probability thresholds, some of which were combined with apparent-magnitude limits of median_mag_g_fov < 13 or 17 mag, to exclude a suspicious component characterised by very faint and low-probability candidates, as some of them extended above the chosen minimum probability. The condition of ruwe < 2 was applied to all candidates, as no known SDOR star existed above such a limit. Eight further questionable candidates were removed following visual inspection.
The WR candidates were selected from four binary and meta-classifiers according to minimum probability thresholds. The number of sources within 100 arcsec from each WR candidate (computed by excluding the contribution of the WR source at the centre) was set to be greater than 157, as these relatively young stars were expected to be within the Galactic disc and no WR star was known to exist in lower star density fields.
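The local source density used in such criteria can be sketched as a count of catalogue neighbours within a given angular radius, excluding the candidate itself. This brute-force version is only a minimal illustration with hypothetical function names and toy coordinates; how the counts were actually computed for the catalogue is not described here, and a realistic implementation would rely on a spatial index.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


def count_neighbours(target, catalogue, radius_arcsec=100.0):
    """Count sources within radius_arcsec of target, excluding the target itself.

    target and catalogue entries are (source_id, ra_deg, dec_deg) tuples.
    """
    tid, tra, tdec = target
    return sum(
        1
        for sid, ra, dec in catalogue
        if sid != tid and angular_sep_arcsec(tra, tdec, ra, dec) <= radius_arcsec
    )


# Toy catalogue for illustration only.
catalogue = [(1, 10.0, 0.0), (2, 10.01, 0.0), (3, 10.0, 0.02), (4, 10.5, 0.0)]
print(count_neighbours(catalogue[0], catalogue))  # sources 2 and 3 are within 100 arcsec
```

The WR selection would then keep candidates whose count exceeds 157, while other classes in this section apply upper limits to the same quantity.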
According to Table 3, additional BE|GCAS|SDOR|WR candidates may be found among the false positives of the following classes: BCEP (55), SPB (42), ACV|CP|MCP|ROAM|ROAP|SXARI (34), and ACYG (20), where the number of known sources indicated in parentheses is a lower limit.
4.6. Cepheids (CEP)
The classification of Cepheid variables included 16 141 stars of types δ Cephei, anomalous Cepheid, and type-II Cepheid. For this class, the selection of Cepheids from the relevant SOS module (Ripepi et al. 2023) was followed, with the addition of 1154 candidates from three binary and multi-class classifiers (Sect. 3.2) with strict probability conditions, including about 400 sources that were rejected from the SOS module because reliable periods (and thus models) could not be achieved with the Gaia data, although such sources were believed to be genuine Cepheids. The pulsating nature of these objects is clearly visible in Figs. E.16g and E.17g (until noise prevails at the faint end), indicating greater amplitudes in the GBP than in the GRP band. Low-amplitude candidates (std_dev_mag_g_fov < 0.06 mag) and those at the faint end (median_mag_g_fov > 20.3 mag) are associated with low classification scores (Fig. E.17e). The main classes responsible for the few percent of contamination are RR Lyrae stars, eclipsing binaries, and long-period variables (Table 3).
Although not listed in the top-6 false positive types of Table 3, additional CEP candidates may be found among the false positives of the following classes: RS (494), RR (274), LPV (199), ECL (114), DSCT|GDOR|SXPHE (33), YSO (13), BE|GCAS|SDOR|WR (11), CV (7), RCB (4), S (4), AGN (3), ELL (2), SYST (2), ACV|CP|MCP|ROAM|ROAP|SXARI (1), and BCEP (1), where the numbers of known sources indicated in parentheses represent lower limits.
4.7. Cataclysmic variables (CV)
The 7306 candidates of cataclysmic variables (excluding supernovae and symbiotic stars, described in Sects. 4.19 and 4.22, respectively) were selected from five binary, multi-class, and meta-classifiers (Sect. 3.2), after the application of minimum probability thresholds. The verification of CV classifications led to the following conditions (employing field names in the vari_summary table).
1. A higher level of variability probability than the one used for the general variability detection (Sect. 3.2) was required to omit doubtful low signal-to-noise candidates.
2. A suspicious clump of extremely blue candidates was filtered out by median_mag_bp − median_mag_g_fov > − 4 mag.
3. A thin tail of blue outliers was excluded by the condition median_mag_bp − median_mag_rp > − 0.5 mag.
4. The number of sources within 100 arcsec from each CV candidate (computed by excluding the contribution of the CV source at the centre) was set to be less than 408, to prevent an increase of candidates towards the end of the tail of the distribution of CV from the literature.
5. The following two conditions served to exclude two clumps of candidates in the outlier_median_g_fov versus skewness_mag_g_fov space that were only marginally populated by literature instances:
   (a) outlier_median_g_fov > 20,
   (b) skewness_mag_g_fov > − 4.
The two features with skewness < − 3 that appear in the plot of skewness_mag_g_fov versus abbe_mag_g_fov (Fig. E.20f) are related to faint sources, with median G ≈ 20.7 mag, that exhibit a single bright outlier; these CV candidates should be dismissed.
According to Table 3, additional CV candidates may be found among the false positives of the SN (22) and WD (12) classes, where the numbers of known sources indicated in parentheses represent lower limits.
4.8. DSCT|GDOR|SXPHE stars
This class group includes 748 058 candidates of types δ Scuti, γ Doradus, and SX Phoenicis, in particular as DSCT and SXPHE types could not be distinguished without metallicity or other indicators of type I versus type II star. These classifications were also the main contributors to the upper main-sequence oscillator SOS module (see Sect. 10.14 of the Gaia DR3 documentation; Rimoldini et al. 2022), which were further analysed in Gaia Collaboration (2023b).
This sample is dominated by low amplitude candidates (std_dev_mag_g_fov < 0.01 mag for 537 769 sources), which are expected to be significantly contaminated, as suggested from their distribution of std_dev_mag_bp / std_dev_mag_rp < 1 between median_mag_g_fov of 13 and 17 mag, as shown in Fig. E.23g. About seven thousand candidates with GBP − GRP < 0.25 mag, associated with low scores (0.01 on average and clearly visible in Figs. E.23b,d), should be disregarded.
The DSCT|GDOR|SXPHE candidates were extracted from ten binary, multi-class, multi-stage, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. The following additional criteria were applied (employing field names in the vari_summary table).
1. Removal of outliers in the Gaia colours:
   (a) median_mag_bp − median_mag_rp > −0.5 mag,
   (b) median_mag_bp − median_mag_g_fov > −1 mag,
   (c) median_mag_g_fov − median_mag_rp < 1 mag.
2. Constraints on the colour–colour scatter around the mean relation (with a model of G − GRP given GBP − GRP, based on known objects; see Fig. E.23c):
   (a) median_mag_g_fov − median_mag_rp < model(G − GRP | GBP − GRP) + 0.025 mag,
   (b) median_mag_g_fov − median_mag_rp > model(G − GRP | GBP − GRP) − 0.025 mag.
The conditions described above were overridden by candidates that existed also in the upper main-sequence oscillator SOS module.
According to Table 3, additional DSCT|GDOR|SXPHE candidates may be found among the false positives of the following classes: RS (792), AGN (275), ACV|CP|MCP|ROAM|ROAP|SXARI (49), BCEP (11), WD (4), and SDB (1), where the numbers of known sources indicated in parentheses represent lower limits.
4.9. Eclipsing binaries (ECL)
The classification of eclipsing binaries included 2 184 356 systems of types β Persei (Algol), β Lyrae, and W Ursae Majoris. They were obtained from four meta-classifiers (Sect. 3.2) and followed the selection of the corresponding SOS module described by Mowlavi et al. (2023), including the following conditions (employing field names in the vari_summary table).
1. A minimum apparent brightness in the G band as defined by median_mag_g_fov < 20 mag (see Figs. E.26b,e,g).
2. A minimum value of the time series skewness, computed from G-band magnitudes: skewness_mag_g_fov > − 0.2 (see Fig. E.26f).
3. A minimum number of clean FoV transits in the G band: num_selected_g_fov > 15.
4. Further constraints on period and model properties listed in the vari_eclipsing_binary table.
The ratio between GBP and GRP amplitudes is close to one (Fig. E.26g), as expected from non-pulsating objects (apart from the increasing scatter towards faint magnitudes due to shot noise). Low-amplitude and low-skewness candidates tend to be associated with low classification scores (Figs. E.26e,f).
According to Table 3, additional ECL candidates may be found among the false positives of the following classes: RS (10 908), S (5713), RR (4318), DSCT|GDOR|SXPHE (3402), LPV (2507), ELL (2135), SOLAR_LIKE (286), YSO (219), BE|GCAS|SDOR|WR (152), CEP (104), ACV|CP|MCP|ROAM|ROAP|SXARI (61), CV (28), ACYG (6), EP (3), SDB (3), WD (3), and MICROLENSING (2), where the numbers of known sources indicated in parentheses represent lower limits.
4.10. Ellipsoidal variables (ELL)
The classification of 65 300 ellipsoidal variables targeted input sources for the compact companion SOS module (Gomel et al. 2023). The training objects were not sufficiently representative of the whole sky (Fig. E.28a) and thus low scores were assigned to candidates beyond the Galactic bulge region (Fig. E.29a). Most of them are likely contaminants, considering that low score classifications contribute negligibly to completeness while they double the contamination rate (see Fig. E.30a). As expected, most of the contaminants are represented by W Ursae Majoris eclipsing binaries (Table 3).
The ELL candidates were selected from ten binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. They fulfil the following conditions (employing field names in the vari_summary and gaia_source tables).
1. The renormalised unit weight error ruwe was set to values lower than 1.2, to restrict an excess of candidates with high ruwe values (and thus unreliable astrometric solutions).
2. In order to better fit the distribution of known ELL stars in standard deviation versus median G magnitude, the minimum level of variability was raised (see Fig. E.29e):
   (a) std_dev_mag_g_fov > 1.5 times the third quartile of the standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E),
   (b) std_dev_over_rms_err_mag_g_fov > 4.
3. Additional constraints removed candidates from colour and magnitude ranges that were poorly represented in the literature:
   (a) median_mag_bp − median_mag_g_fov > 0.3 mag,
   (b) median_mag_bp − median_mag_g_fov > 0.022 (median_mag_bp − median_mag_rp)² + 0.7 (median_mag_bp − median_mag_rp) − 0.65 mag,
   (c) median_mag_g_fov < 19 mag.
4. The number of sources within 100 arcsec from each ELL candidate (computed by excluding the contribution of the ELL source at the centre) was set to be greater than 314, to avoid a suspicious concentration of candidates around the Galactic anti-centre.
These criteria were overridden by candidates that existed also in the compact companion SOS module (including the intention of the last item above).
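The magnitude-dependent variability floor used above (a multiple of the third quartile of the G-band standard deviations of reference sources, evaluated in 0.05 mag intervals) can be sketched as follows. The function names, the toy reference values, and the choice of quartile estimator (the default of Python's `statistics.quantiles`) are assumptions of this example; the actual reference set is described in Appendix E.

```python
import math
from collections import defaultdict
from statistics import quantiles

BIN_WIDTH = 0.05  # mag, as quoted in the text


def q3_by_magnitude_bin(ref_mags, ref_stds, bin_width=BIN_WIDTH):
    """Third quartile of the reference standard deviations per magnitude bin."""
    bins = defaultdict(list)
    for mag, std in zip(ref_mags, ref_stds):
        bins[math.floor(mag / bin_width)].append(std)
    # quantiles(..., n=4) returns [Q1, Q2, Q3]; skip bins with a single source.
    return {k: quantiles(v, n=4)[2] for k, v in bins.items() if len(v) >= 2}


def passes_variability_floor(mag, std, q3_table, factor=1.5, bin_width=BIN_WIDTH):
    """True if std exceeds factor times Q3 of the bin containing mag."""
    q3 = q3_table.get(math.floor(mag / bin_width))
    return q3 is not None and std > factor * q3


# Toy reference sources (not Gaia data), all in the same 0.05 mag bin.
table = q3_by_magnitude_bin([15.01, 15.02, 15.03, 15.04],
                            [0.010, 0.020, 0.030, 0.040])
print(passes_variability_floor(15.025, 0.06, table))  # above the floor
print(passes_variability_floor(15.025, 0.05, table))  # below the floor
```

The factor of 1.5 corresponds to the ELL criterion; the SDB selection in Sect. 4.18 uses the third quartile itself, that is, `factor=1.0`. Candidates falling in bins without reference sources are rejected in this sketch, which is a design choice of the example rather than a documented rule.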
According to Table 3, additional ELL candidates may be found among the false positives of the ECL (1056), LPV (731), SPB (9), BCEP (3), and ACYG (2) classes, where the number of known sources indicated in parentheses is a lower limit.
4.11. Exoplanets (EP)
The 214 stars classified with exoplanet transits have percent-level variations in the G band, above and below the general variability threshold, as shown in Fig. E.32e. These stars are relatively bright to allow for high per-FoV photometric precision, thus they are also nearby and distributed rather homogeneously across the sky (Fig. E.32a).
The weakness of the signal and the typically small number of observations in transit implied the necessity of a dedicated period search algorithm such as that used in the planetary transit SOS module (Panahi et al. 2022), rather than the generic computationally efficient generalised Lomb-Scargle method used for all the classes (Heck et al. 1985; Zechmeister & Kürster 2009). Consequently, the EP classifications were limited to the selection of the corresponding SOS module and their scores were set to one.
A binary XGBoost classifier (Sect. 3.2) was used to extract a sample of 18 383 potential candidates, without the application of a minimum probability threshold. All of these sources were processed by the planetary transit SOS module, which returned 214 objects, including known stars with planetary transits as well as new candidates.
According to Table 3, at least 93 additional EP candidates may be found among the false positives of the SOLAR_LIKE class.
4.12. Long-period variables (LPV)
The classification of long-period variables included 2 325 775 stars, comprising long secondary period variables, o Ceti (Mira) stars, (OGLE) small amplitude red giants, and semi-regular types. They were obtained from four meta-classifiers (of binary and multi-class classifiers, see Sect. 3.2) with minimum probability thresholds and selected according to the criteria defined in the LPV SOS module (Lebzelter et al. 2023), some of which were more permissive for classification candidates, as indicated in the following (employing field names in the vari_summary and gaia_source tables).
1. A minimum number of clean FoV transits in the G band: num_selected_g_fov > 9 (instead of 12 in the SOS module).
2. The Gaia reddened GBP − GRP colour redder than 1 mag, estimated as median_mag_bp − median_mag_rp > 1 mag.
3. A minimum fraction of clean FoV transits in the GRP versus G band: num_selected_rp / num_selected_g_fov > 0.5 (instead of 0.8 in the SOS module).
4. A minimum amplitude assessed from the 5th to the 95th percentile of the magnitude distribution of G-band time series: trimmed_range_mag_g_fov > 0.1 mag (Fig. E.35c).
5. Given the conditions above, LPV candidates satisfied in addition at least one of the following conditions:
   (a) std_dev_over_rms_err_mag_g_fov > 4, for a significant signal-to-noise level,
   (b) median_mag_g_fov < 14 mag, for a substantial apparent brightness,
   (c) trimmed_range_mag_g_fov > 0.5 mag, for a high amplitude,
   (d) a high probability for a source to be classified as an LPV by any of two of the meta-classifiers.
6. The number of groups of observations in the G band (visibility_periods_used), separated from other groups by the absence of measurements for at least four days, was not constrained, while it was greater than ten in the SOS module.
For further details, including an analysis of the difference between the SOS module and the classification LPV candidates, see Lebzelter et al. (2023).
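The logical structure of the LPV criteria (four mandatory conditions combined with at least one of four alternatives) can be sketched as a simple predicate. The dictionary keys follow the field names quoted above; the function name and the `high_meta_probability` flag are hypothetical, since the probability level of item 5(d) is not quoted in this section.

```python
def passes_lpv_cuts(row, high_meta_probability=False):
    """Apply items 1-5 of the LPV selection to one source (a dict of fields)."""
    mandatory = (
        row["num_selected_g_fov"] > 9
        and row["median_mag_bp"] - row["median_mag_rp"] > 1.0
        and row["num_selected_rp"] / row["num_selected_g_fov"] > 0.5
        and row["trimmed_range_mag_g_fov"] > 0.1
    )
    at_least_one = (
        row["std_dev_over_rms_err_mag_g_fov"] > 4
        or row["median_mag_g_fov"] < 14
        or row["trimmed_range_mag_g_fov"] > 0.5
        or high_meta_probability  # stand-in for item 5(d)
    )
    return mandatory and at_least_one


# Toy source for illustration only.
source = {
    "num_selected_g_fov": 30,
    "num_selected_rp": 28,
    "median_mag_bp": 16.5,
    "median_mag_rp": 13.9,
    "median_mag_g_fov": 15.0,
    "trimmed_range_mag_g_fov": 0.8,
    "std_dev_over_rms_err_mag_g_fov": 2.0,
}
print(passes_lpv_cuts(source))  # accepted through the high-amplitude branch
```

Item 6 needs no code, as visibility_periods_used was left unconstrained for the classification candidates.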
Faint and/or blue candidates in Figs. E.35b,d tend to be associated with low scores and might include spurious candidates, as their distribution in the sky suggests scanning law features (Fig. E.35a) and the step around G ≈ 20 mag indicates marginal candidates bordering with other classes.
According to Table 3, additional LPV candidates may be found among the false positives of the following classes: RS (5761), ELL (443), YSO (234), SYST (215), CEP (73), AGN (50), RCB (42), BE|GCAS|SDOR|WR (20), and MICROLENSING (3), where the numbers of known sources indicated in parentheses represent lower limits.
4.13. Microlensing events
The classification of 254 candidate microlensing events was performed independently from the microlensing SOS module (Wyrzykowski et al. 2023) and the latter did not consider these classification results as input of potential candidates, as it already extracted all possible candidates with methods specific to this class. Nevertheless, the classification candidates had the advantage of fewer assumptions and requirements than the SOS counterparts, in particular regarding the time series modelling. Among the microlensing candidates, 187 are in common with the SOS module and 67 are unique to classification.
Microlensing sources were selected from ten binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. The following additional criteria were applied to such candidates (employing field names in the vari_summary and gaia_source tables).
1. In order to focus on high amplitude events, the std_dev_mag_g_fov parameter was set to be greater than 0.06 mag, as only a handful of the known sources (among the classified ones) were below this threshold.
2. To further support the rationale of the previous item, at the cost of completeness, the most significant candidates were selected by outlier_median_g_fov > 100.
3. The ratio of the mean magnitude, weighted by squared uncertainties, to the unweighted mean magnitude was required to be less than one, as expected from the bias introduced by weighting towards brighter measurements (this condition was fulfilled by almost all known events).
4. The renormalised unit weight error ruwe was set to be lower than 1.4, considering the distribution of ruwe for known microlensing events peaked at close to one.
5. The colour median_mag_g_fov − median_mag_rp was set bluer than 1.7 mag, following the distribution of known events.
Some of the criteria listed above were overridden by visual inspection of a selection of candidates.
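The reasoning behind item 3 can be illustrated numerically: brighter epochs have smaller magnitudes and typically smaller uncertainties, so an uncertainty-weighted mean is pulled towards the brightening event. The sketch below assumes inverse-variance weights (1/σ²), which is an interpretation of "weighted by squared uncertainties" and not a confirmed detail of the pipeline; the function name and the numbers are hypothetical.

```python
def weighted_to_unweighted_mean_ratio(mags, errs):
    """Ratio of the inverse-variance weighted mean magnitude to the plain mean."""
    weights = [1.0 / e ** 2 for e in errs]  # assumed weighting scheme
    weighted = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    unweighted = sum(mags) / len(mags)
    return weighted / unweighted


# Toy light curve: three baseline epochs and two brighter, more precise epochs.
mags = [18.0, 18.0, 18.0, 16.0, 16.0]
errs = [0.02, 0.02, 0.02, 0.005, 0.005]

ratio = weighted_to_unweighted_mean_ratio(mags, errs)
print(ratio < 1.0)  # the weighted mean is brighter (numerically smaller)
```

For a constant source the ratio equals one, so values below one flag time series whose most precise measurements are also the brightest.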
As the probability of source-lens alignment along the line-of-sight to the observer increases in high-density regions, microlensing candidates are found, as expected, prevalently towards the Galactic bulge and a minority of cases in the Galactic disc (see Fig. E.38a). They populate the region of negative skewness_mag_g_fov and low abbe_mag_g_fov values, which are typical of bright and time-dependent outburst-like events (Fig. E.38f).
Although not listed in the top-6 false positive types of Table 3, at least three additional microlensing candidates may be found among the false positives of the LPV class.
4.14. R Coronae Borealis stars (RCB)
The 153 stars classified as R Coronae Borealis variables were obtained from two meta-classifiers (of binary and multi-class classifiers, see Sect. 3.2), after filtering out low probability candidates and confirmation by visual inspection. These high-amplitude variables are characterised by sudden fading by up to several magnitudes (Fig. E.41e), followed by an irregular recovery (sample light curves are shown in Figs. E.42c–f). They are simultaneously eruptive and pulsating, although the amplitude of the latter is an order of magnitude lower. As a consequence of the long timescale variations, these objects are also characterised by low abbe_mag_g_fov values (Fig. E.41f) and long-period variables are the main source of contamination (Table 3).
Although not listed in the top-6 false positive types of Table 3, additional RCB candidates may be found among the false positives of the LPV (5) and RS (4) classes, where the numbers of known sources indicated in parentheses represent lower limits.
4.15. RR Lyrae stars (RR)
The classification of RR Lyrae stars includes 297 778 variables of fundamental-mode, first-overtone, double mode (and anomalous double mode) types. The selection of candidates followed the one of the relevant SOS module (Clementini et al. 2023). Additional candidates were obtained from four binary and multi-class classifiers (Sect. 3.2) after the application of strict minimum probability thresholds.
Among the 26 202 extra RR candidates in classification with respect to the RR Lyrae SOS module, 9239 are known RR Lyrae stars in the literature compilation of Gavras et al. (2023; or 8276 if counting only those flagged by their selection field), for which the correct period could not be recovered with the Gaia DR3 data (Clementini et al. 2023). Other candidates were found to be eclipsing binaries (as already noted in Sect. 4.9), which constitute the main source of contamination for this class (Table 3). About 800 AGN and galaxy contaminants, as well as dubious RR candidates, are listed in Tables 5 and 6 of Clementini et al. (2023).
Faint sources, affected by extinction in the Galactic disc and bulge, and low-amplitude candidates tend to be associated with low classification scores (Figs. E.44a,b,e). Other low-score candidates are found between the main sequence and the white dwarf sequence (Fig. E.44d) and are located mainly in the Galactic bulge and disc (some are in the Magellanic Clouds too). They are characterised by higher values of the renormalised unit weight error (ruwe) than the other candidates: among all RR classifications with parallax_over_error > 5, about 80% have ruwe < 1.22, while only half of the ones between the main and white-dwarf sequences fulfil this condition. The pulsating nature of the RR candidates is confirmed in Fig. E.44g, until the contribution of noise overcomes the expected GBP to GRP band amplitude ratio.
According to Table 3, additional RR candidates may be found among the false positives of the following classes: ECL (5833), DSCT|GDOR|SXPHE (525), AGN (264), GALAXY (244), S (168), CEP (158), CV (41), and BCEP (3), where the numbers of known sources indicated in parentheses represent lower limits.
4.16. RS Canum Venaticorum stars (RS)
The classification of RS Canum Venaticorum type binary systems includes 742 263 candidates. Although they are presented separately from SOLAR_LIKE stars, any confusion between these two classes does not constitute real contamination, as at least one of the components of an RS binary system is a solar-like star. Thus, as expected, 20 050 RS sources are included also in the vari_rotation_modulation table of the solar-like SOS module (Distefano et al. 2023) and the top RS false positive classes listed in Table 3 are ROT and BY, which should not be considered as contaminants (as they are typical solar-like types).
The RS candidates were selected from ten binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. The following additional criteria were applied (employing field names in the vari_summary and gaia_source tables).
1. The renormalised unit weight error ruwe was set to be lower than 1.1, to exclude many candidates in a range sparsely populated by literature RS instances.
2. To better fit the bulk of known RS stars in standard deviation versus median G magnitude and exclude suspicious low-amplitude candidates, the minimum level of variability was raised (Fig. E.47e) by std_dev_mag_g_fov > 10^0.2 times the third quartile of the standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E).
3. To exclude a clump of negatively skewed candidates (by artefact) and for symmetry reasons, the following condition was applied: | skewness_mag_g_fov | < 3 (as visible in Fig. E.47f).
4. Two conditions in the Gaia reddened colour–colour diagrams excluded a significant excess of candidates where only few known RS were present and followed the general colour–colour relation of all other known RS sources (with a model of GBP − G given G − GRP, as shown in Fig. E.47c):
   (a) median_mag_bp − median_mag_g_fov < model(GBP − G | G − GRP) + 0.07 mag,
   (b) median_mag_bp − median_mag_g_fov > model(GBP − G | G − GRP) − 0.06 mag.
According to Table 3, additional RS candidates may be found among the false positives of the following classes: ECL (10 791), DSCT|GDOR|SXPHE (1598), YSO (1542), ELL (210), RR (165), SOLAR_LIKE (148), and CEP (36), where the numbers of known sources indicated in parentheses represent lower limits.
4.17. Short-timescale objects (S)
This class was trained with stars exhibiting rapid light variations that however were not well studied in the literature. It resulted in 512 005 short-timescale candidates, originally targeting possible input for the short-timescale SOS module (described in Sect. 10.12 of the Gaia DR3 documentation; Rimoldini et al. 2022). Eventually, a similar goal was pursued independently (with different data types, as the SOS module employed per-CCD photometry), leading to two rather complementary sets. This class is meant to be exploratory and considered as a sample where not well-defined short-timescale sources (including those from other classes published in Gaia DR3) can be found, such as, for instance, nine BLAP sources from Pietrukowicz et al. (2017).
The assessment of completeness and contamination in Table 2 is not truly meaningful for this class, as completeness is relative to objects that might be better resolved in Gaia and thus classified accordingly, while the main source of contamination derives from EW-type eclipsing binaries (Table 3), which do vary on short timescales.
The S candidates were obtained from seven binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. The lowest probability candidates excluded by such thresholds corresponded also to the least sampled sources, causing the appearance of regions (intersected by the Ecliptic) almost devoid of candidates, as a consequence of the Gaia scanning law (see Fig. E.50a). Additional selection criteria are described in the following (employing field names in the vari_summary table).
1. To remove a suspicious peak of low-amplitude candidates without counterpart in the literature, a minimum standard deviation threshold was set: std_dev_mag_g_fov > 0.05 mag, as noticeable in Fig. E.50e.
2. The tails of colour distributions of S candidates that were overly represented in proportion to the literature were reduced by the following conditions:
   (a) median_mag_bp − median_mag_rp < 2.6 mag,
   (b) median_mag_bp − median_mag_g_fov < 1.3 mag,
   (c) median_mag_g_fov − median_mag_rp < 1.6 mag (see Fig. E.50c).
3. To exclude a population of candidates with magnitude time series associated with extremely negative skewness and a few positively skewed outliers, only skewness_mag_g_fov values between −1.4 and 4 were accepted (Fig. E.50f).
4. The number of sources within a radius of 3 arcsec of each S candidate (excluding the contribution of the candidate at the centre) was set to be less than 5, to exclude candidates in the most crowded environments.
5. The minimum ratio of two (unpublished) spectral shape components in the GBP band (SSC ids 0 and 2; see footnote 6) was set to match approximately the one from S objects in the literature.
6. A constraint on the amount of scan-angle dependent signal (Holl et al. 2023), quantified by the Spearman correlation r_{ipd, G} between the G-band epoch photometry and the model of the Image Parameter Determination (IPD; see Sect. 3.3.6 of the Gaia DR3 documentation in Castañeda et al. 2022), was applied as r_{ipd, G} < 0.45, where the upper limit corresponded to the minimum between two apparent distributions. This condition halved the number of S sources in the literature, but it was necessary to avoid a sample otherwise dominated by candidates with spurious signals.
Although not listed in the top-6 false positive types of Table 3, additional S sources from the literature are found among the following classes: ECL (24), RR (12), DSCT|GDOR|SXPHE (9), CV (3), RS (2), WD (2), and AGN (1), where the numbers of known sources indicated in parentheses represent lower limits.
4.18. Subdwarf B-type stars (SDB)
The classification of subdwarf B variables returned 893 candidates meant to represent stars of types V1093 Herculis and V361 Hydrae. While they occupy the expected location in the observational Hertzsprung–Russell diagram (Fig. E.53d), their variability might not be due to pulsation only and the presence of the latter cannot be assured when concurrent high-amplitude phenomena exist, such as:
- the reflection effect due to irradiation of a cool companion by a hot subdwarf primary and consequent re-radiation from the illuminated side of the cool companion;
- the tidal distortion of the cool companion, which generates ellipsoidal-like flux variations (and possible mass transfer);
- the possible presence of spots on a rotating SDB star.
The first two items are related to binary systems with a hot subdwarf and a cool companion. They cause larger variations in the red band than in the blue band. The difference between the GBP and GRP amplitudes of the reflection and tidal effects arises from the fact that the blue part of the flux variations of the cool companion is diluted in the blue-band flux of the hot subdwarf, leading to a smaller percentage of flux variation in GBP than in GRP (for example, see Schaffenroth et al. 2014). A multi-periodic analysis of the SDB candidates should be performed to account for the effects of binarity and/or spots, in addition to the stellar oscillations.
These candidates were selected from ten binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. Additional conditions are described as follows (employing field names in the vari_summary and gaia_source tables).
1. To highlight candidates with clear signals, the minimum level of variability was raised by requiring std_dev_mag_g_fov to be greater than the third quartile of the standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E). Additionally, a higher level of variability probability than the one used for the general variability detection (Sect. 3.2) was required (Fig. E.53e).
2. Sparse outliers with respect to the general colour–colour relation followed by all other candidates were removed by the condition median_mag_bp − median_mag_g_fov > 0.33 (median_mag_bp − median_mag_rp) − 0.03 mag.
3. The renormalised unit weight error ruwe was restricted to values lower than 1.15, to remove a tail of SDB candidates at high ruwe values, which was marginally represented in the literature.
4. The possible association of SDB candidates with crowded regions (mostly around the Galactic bulge) was limited by setting the number of sources within 100 arcsec from each SDB candidate (computed by excluding the contribution of the SDB source at the centre) to be less than 471.
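The magnitude-dependent variability floor of condition 1 can be sketched as follows, assuming that the reference sources are supplied as (median G magnitude, standard deviation in G) pairs. The binning by floor division, the "inclusive" quantile method, and all function names are assumptions of this example; the text does not specify how the third quartile was computed.

```python
import math
import statistics
from collections import defaultdict
from typing import Dict, Iterable, Tuple

BIN_WIDTH_MAG = 0.05  # interval width quoted in the text


def bin_index(mag: float) -> int:
    """Index of the 0.05 mag interval containing mag (assumed binning scheme)."""
    return math.floor(mag / BIN_WIDTH_MAG)


def third_quartile_by_bin(reference: Iterable[Tuple[float, float]]) -> Dict[int, float]:
    """Map each magnitude bin to the third quartile of the reference standard deviations."""
    groups = defaultdict(list)
    for mag, std in reference:
        groups[bin_index(mag)].append(std)
    # statistics.quantiles needs at least two values; sparser bins are skipped.
    return {
        b: statistics.quantiles(stds, n=4, method="inclusive")[2]
        for b, stds in groups.items()
        if len(stds) >= 2
    }


def exceeds_variability_floor(mag: float, std_dev_mag_g_fov: float,
                              q3_by_bin: Dict[int, float]) -> bool:
    """True when the candidate scatter exceeds the third quartile of its magnitude bin."""
    threshold = q3_by_bin.get(bin_index(mag))
    return threshold is not None and std_dev_mag_g_fov > threshold
```

In this sketch a candidate falling in a bin without reference sources is rejected, which is a conservative choice made for the example rather than a documented behaviour.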
Although not listed in the top-6 false positive types of Table 3, at least one additional SDB candidate may be found among the false positives of the WD class.
4.19. Supernovae (SN)
The 3029 classified supernovae represent some of the most extreme types of cataclysmic variables. As anticipated in Sect. 4, SN candidates are the least sampled sources, given their transient detectability, with an average number of clean observations in the G band num_selected_g_fov of 8 (versus 45) within an average time interval time_duration_g_fov of 100 (versus 941) days, where the comparison in parentheses refers to all sources in the vari_classifier_result table.
The G − GRP versus GBP − G diagram in Fig. E.56c suggests that the galaxies hosting the SN candidates are detected, because the extra flux that is collected for extended sources in the GBP and GRP bands with respect to G (see Sect. 4.25) causes the overall negative slope of the colour–colour distribution. Classification scores of the candidates tend to be high for sources with a high signal-to-noise ratio (Fig. E.56e). The linear features observed in the skewness_mag_g_fov versus abbe_mag_g_fov diagram (Fig. E.56f) are due to sources with the fewest observations (typically five).
The SN candidates were obtained from ten binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. Additional verification filters are described as follows (employing field names in the vari_summary and gaia_source tables).
1. Given the slow SN luminosity decay with respect to Gaia’s average sampling, most candidates have an Abbe value lower than about 0.5. A tail of candidates with Abbe greater than 1 (with no counterpart in the literature) was excluded by the condition abbe_mag_g_fov < 1 (Fig. E.56f).
2. Obvious contamination-dominated candidates at low Galactic latitudes (b) were removed by requiring | b | > 7 degrees (Fig. E.56a).
3. In order to favour candidates with a clear signal, the single-band Stetson index stetson_mag_g_fov was set to be greater than 8.
4. To remove a small number of outliers in the GBP − GRP distribution, median_mag_bp − median_mag_rp was restricted between −1 and 2 mag.
5. The minimum number of clean measurements in the G band was raised to five by num_selected_g_fov > 4 (it was 3).
6. The number of sources within 100 arcsec from each SN candidate (computed by excluding the contribution of the SN at the centre) was set to be less than 314, to remove a high-density tail of candidates prone to contamination.
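The six verification filters above can be collected in a single check, sketched below for one candidate record. The field names follow the vari_summary and gaia_source tables where the text names them; neighbours_within_100_arcsec is a hypothetical name for the source count of condition 6, and treating the colour range as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnCandidate:
    abbe_mag_g_fov: float
    b: float                            # Galactic latitude in degrees
    stetson_mag_g_fov: float
    median_mag_bp: float
    median_mag_rp: float
    num_selected_g_fov: int
    neighbours_within_100_arcsec: int   # hypothetical field, candidate itself excluded


def failed_sn_filters(c: SnCandidate) -> List[str]:
    """Return the labels of the conditions that the candidate does not satisfy."""
    bp_rp = c.median_mag_bp - c.median_mag_rp
    checks = {
        "abbe_mag_g_fov < 1": c.abbe_mag_g_fov < 1.0,
        "|b| > 7 deg": abs(c.b) > 7.0,
        "stetson_mag_g_fov > 8": c.stetson_mag_g_fov > 8.0,
        "-1 <= BP-RP <= 2 mag": -1.0 <= bp_rp <= 2.0,  # inclusive bounds assumed
        "num_selected_g_fov > 4": c.num_selected_g_fov > 4,
        "neighbours < 314": c.neighbours_within_100_arcsec < 314,
    }
    return [label for label, ok in checks.items() if not ok]
```

Returning the list of failed conditions, rather than a single boolean, makes it easier to see which filter removes a given known supernova when the thresholds are compared with the literature.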
According to Table 3, at least 106 additional SN candidates may be found among the false positives of the GALAXY class.
4.20. Solar-like stars (SOLAR_LIKE)
The classification of 1 934 844 stars with solar-like variability, such as flaring and rotating spotted stars, was obtained independently of the rotational modulation SOS module (Distefano et al. 2023). Further insights on stellar chromospheric activity using Gaia’s radial velocity spectrometer are described in Lanzafame et al. (2023).
The scanning law features in the sky map of candidates (Fig. E.59a) reflect the dependence of the solar-like identifications on the number of observations, which was learnt from the distribution of training sources that included Gaia DR2 results, as shown in Fig. E.58a. The average number of G-band clean observations (num_selected_g_fov) is 58 for solar-like candidates, with respect to 45 for all classification results.
The contamination by constant stars (Table 3) is not unexpected, given the low-amplitude signal (for example, see Figs. E.60c–f), and the true nature of such objects depends on the photometric precision of those sources in Gaia DR3 with respect to that of other surveys.
The solar-like candidates were obtained from 11 binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds and two additional conditions in the Gaia reddened colour–colour diagrams, to enforce the general colour–colour relation followed by most solar-like sources in the literature (with a model of G − GRP given GBP − GRP, based on known objects; see Fig. E.59c):
- median_mag_g_fov − median_mag_rp < model(G − GRP | GBP − GRP) + 0.01 mag,
- median_mag_g_fov − median_mag_rp > model(G − GRP | GBP − GRP) − 0.02 mag,
employing field names in the vari_summary table. The impact of these conditions is also apparent in other colour–colour diagrams, such as the one in Fig. E.59c.
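The two conditions define an asymmetric band around the colour–colour model, which may be sketched as below. The model itself is not given in this section, so it is passed in as a callable; toy_model is a placeholder with an arbitrary slope, introduced only to make the example runnable, and does not represent the relation derived from known objects.

```python
from typing import Callable

UPPER_OFFSET_MAG = 0.01  # tolerance above the model, from the text
LOWER_OFFSET_MAG = 0.02  # tolerance below the model, from the text


def within_colour_band(median_mag_g_fov: float,
                       median_mag_bp: float,
                       median_mag_rp: float,
                       model: Callable[[float], float]) -> bool:
    """True when G - GRP lies inside the asymmetric band around model(GBP - GRP)."""
    g_rp = median_mag_g_fov - median_mag_rp
    bp_rp = median_mag_bp - median_mag_rp
    expected = model(bp_rp)
    return expected - LOWER_OFFSET_MAG < g_rp < expected + UPPER_OFFSET_MAG


def toy_model(bp_rp: float) -> float:
    """Placeholder relation (arbitrary slope); NOT the model used in the paper."""
    return 0.5 * bp_rp
```

With this placeholder, a source at GBP − GRP = 1 mag is kept only if its G − GRP colour lies between 0.48 and 0.51 mag, which shows how narrow the accepted band is compared with typical colour ranges.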
According to Table 3, additional SOLAR_LIKE candidates may be found among the false positives of the following classes: RS (41 913), ECL (5804), YSO (1922), DSCT|GDOR|SXPHE (1052), RR (314), WD (3), and SDB (1), where the numbers of known sources indicated in parentheses represent lower limits.
4.21. Slowly pulsating B-type stars (SPB)
The classified slowly pulsating B stars include 1228 sources, among which 434 contributed to the upper main-sequence oscillator SOS module (see Sect. 10.14 of the Gaia DR3 documentation; Rimoldini et al. 2022). Detailed analyses of this type of object are presented in Gaia Collaboration (2023b).
The classified SPB candidates were obtained from eight binary, multi-class, multi-stage, and meta-classifiers (Sect. 3.2), after the application of minimum probability thresholds and of an additional condition on the GBP − GRP colour, namely median_mag_bp − median_mag_rp < 0.15 mag (employing field names in the vari_summary table). The main reason for the colour cut was to remove suspicious candidates, employing a stricter colour range than the one used for training (see Figs. E.61b versus E.62b), according to the SPB candidates that were also part of the results of the upper main-sequence oscillator SOS module. Most SPB contaminants originated from classes in the ACV|CP|MCP|ROAM|ROAP|SXARI group (Table 3), as expected from the presence of known SPB stars contaminating this group of types.
Although not listed in the top-6 false positive types of Table 3, additional SPB candidates may be found among the false positives of the following classes: DSCT|GDOR|SXPHE (17), ACV|CP|MCP|ROAM|ROAP|SXARI (13), and RS (1), where the numbers of known sources indicated in parentheses represent lower limits.
4.22. Symbiotic stars (SYST)
The classification of 649 symbiotic variable stars was obtained from a single binary meta-classifier (Sect. 3.2), after the application of a minimum probability threshold that selected the high probability component (to which almost all known symbiotic stars belonged) of a bimodal distribution.
The SYST candidates are prevalently distributed in high stellar density regions, such as the Galactic bulge, disc, and Magellanic Clouds (Fig. E.65a). The stars in the red clump in the colour–magnitude diagram in Fig. E.65b are mostly in the Galactic disc and bulge. Given the long timescale variations typical of this class (see Figs. E.66c–f for a sample of light curves), the SYST candidates tend to have small values of the abbe_mag_g_fov parameter (Fig. E.65f) and long-period variables constitute their main source of contamination (Table 3).
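The link between long-timescale variations and small Abbe values can be illustrated with a short sketch. It assumes the commonly used definition of the Abbe value, A = n / (2 (n − 1)) × Σ (m_{i+1} − m_i)² / Σ (m_i − mean)², applied to time-sorted magnitudes; whether abbe_mag_g_fov follows exactly this normalisation should be checked against the Gaia DR3 data model.

```python
from typing import Sequence


def abbe_value(mags: Sequence[float]) -> float:
    """Abbe value of a time-sorted magnitude series (assumed standard definition)."""
    n = len(mags)
    if n < 2:
        raise ValueError("at least two measurements are required")
    mean = sum(mags) / n
    total = sum((m - mean) ** 2 for m in mags)
    if total == 0:
        raise ValueError("Abbe value undefined for a constant series")
    # Sum of squared differences between consecutive measurements.
    successive = sum((b - a) ** 2 for a, b in zip(mags, mags[1:]))
    return n / (2 * (n - 1)) * successive / total
```

A smooth trend such as [0, 1, 2, 3, 4] gives 0.25, whereas a series alternating between two values, [0, 1, 0, 1], gives 2.0: slowly varying sources sampled many times per variation therefore fall at the low end of the distribution.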
Although not listed in the top-6 false positive types of Table 3, additional SYST candidates may be found among the false positives of the LPV (6) and RS (1) classes, where the numbers of known sources indicated in parentheses represent lower limits.
4.23. Variable white dwarfs (WD)
The classification of 910 white dwarf variables was intended to target pulsating stars of types ZZ Ceti, V777 Herculis, and GW Virginis. However, training sources, in particular the ones based simply on variability and location in the observational Hertzsprung–Russell diagram (such as Eyer et al. 2020), most likely included photometric variations due to binarity and spots (see Sect. 4.18), and not necessarily pulsation. For this reason, there are fewer WD candidates than the corresponding training sources (which account for less than a quarter of the selected candidates). The reflection and tidal effects in binary systems with a WD and a cooler companion cause larger variations in GRP than in GBP, as observed in many training and classified sources (Figs. E.67g and E.68g). The selection of the most variable WD candidates implicitly favoured effects of binarity or spots with respect to those from stellar oscillations.
Given the low intrinsic brightness of WDs, the ones detectable by Gaia are nearby (for example, see Fig. 2 of Gaia Collaboration 2019) and thus they are distributed rather homogeneously in the sky, as shown in Figs. E.67a and E.68a. The computation of the classification score included probabilities from one multi-class meta-classifier and from the general variability detection classifier, to increase the relevance of high-amplitude candidates. The main contaminating classes are represented by those of cataclysmic variables and post-common envelope binaries (Table 3), both of which consist of systems that include a WD.
The variable WD candidates were obtained from 11 binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. Additional verification filters are described in the following (employing field names in the vari_summary and gaia_source tables).
1. Since WD candidates tended to be on average less variable than instances from the literature, a higher level of variability probability than the one used for the general variability detection (Sect. 3.2) was required (Fig. E.68e).
2. A significant clump of candidates at GBP − GRP ≈ 1 mag, with no WD counterpart in the literature and directed towards the Galactic bulge, was excluded by the condition median_mag_bp − median_mag_rp < 0.5 mag (Figs. E.68b,d).
3. The renormalised unit weight error ruwe was set to be lower than 1.15, to remove contaminants towards the Magellanic Clouds (only one known WD lost among the classified ones).
4. The corrected GBP and GRP flux excess factor (Riello et al. 2021) divided by its scatter was set to be less than 3.6, to exclude outliers with significant flux excess in GBP and GRP typical of extended objects (for example, see Sect. 4.25). The removed objects corresponded to the blue-end of the GBP − G distribution, which was devoid of literature WD counterparts.
Although not listed in the top-6 false positive types of Table 3, additional variable WD candidates may be found among the false positives of the DSCT|GDOR|SXPHE (4) and ECL (2) classes, where the numbers of known sources indicated in parentheses represent lower limits.
4.24. Young stellar objects (YSO)
The classification of young stellar objects included 79 375 sources of several types (listed in item 24 of Sect. 3.1.1). These candidates were validated in detail by Marton et al. (2023). The low completeness and high contamination (mainly by RS and SOLAR_LIKE stars; see Table 3) are expected for YSOs identified in the optical wavelengths. While complementary observations in the infrared bands would help reduce the confusion between YSO and solar-like stars, the contamination rate and false positives listed in Tables 2 and 3, respectively, are significantly overestimated, considering subsequent studies on the RS and BY classifications of Chen et al. (2020), as reported in Marton et al. (2023).
The YSO candidates were selected from three binary and multi-class classifiers (Sect. 3.2) with minimum probability thresholds. Additional verification filters are described as follows (employing field names in the vari_summary and gaia_source tables), often with a common criterion that trimmed one or both ends of a distribution, when a high fraction of unknown YSO candidates was removed at the cost of a low fraction of known YSOs (among the classified sources).
1. The minimum variability probability of the general variability detection (Sect. 3.2) was set to be greater than the 5th percentile of known YSOs (Fig. E.71e).
2. The G-band χ2 value was required to be greater than the 5th percentile of known YSOs.
3. The std_dev_over_rms_err_mag_g_fov parameter was set to be greater than the 5th percentile of known YSOs.
4. The single-band stetson_mag_g_fov index was set to be greater than the 5th percentile of known YSOs.
5. The proper motion components in the right ascension and declination directions (pmra and pmdec, respectively), were restricted between the 1st and the 99th percentiles of known YSOs.
6. The parallax_over_error ratio was set to be greater than 3, for a minimal significance of parallax and also for a selection of sources that were not too distant for YSOs to be observable.
7. Two conditions in GBP − G versus GBP − GRP selected the YSO candidates close to the general colour–colour relation followed by most YSOs in the literature (with a model of GBP − G given GBP − GRP, based on known objects; see Fig. E.71c):
(a) median_mag_bp − median_mag_g_fov < model(GBP − G | GBP − GRP) + 0.1 mag,
(b) median_mag_bp − median_mag_g_fov > model(GBP − G | GBP − GRP) − 0.04 mag.
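The common criterion of trimming distributions at percentiles of the known YSOs (conditions 3 to 6) can be sketched as follows. The percentile estimator with linear interpolation, the dictionary-based inputs, and the inclusive proper-motion bounds are assumptions of this example; conditions 1, 2, and 7 are omitted because they rely on quantities or models not specified here.

```python
from typing import Dict, Sequence

LOWER_TRIMMED = ("std_dev_over_rms_err_mag_g_fov", "stetson_mag_g_fov")  # > 5th percentile
TWO_SIDED = ("pmra", "pmdec")                                            # 1st to 99th percentile
MIN_PARALLAX_OVER_ERROR = 3.0


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile (q in [0, 100]) with linear interpolation between order statistics."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def passes_yso_trims(candidate: Dict[str, float],
                     known: Dict[str, Sequence[float]]) -> bool:
    """Apply the percentile-based cuts, using the known YSOs as the reference sample."""
    for field in LOWER_TRIMMED:
        if not candidate[field] > percentile(known[field], 5):
            return False
    for field in TWO_SIDED:
        low, high = percentile(known[field], 1), percentile(known[field], 99)
        if not low <= candidate[field] <= high:  # inclusive bounds assumed
            return False
    return candidate["parallax_over_error"] > MIN_PARALLAX_OVER_ERROR
```

Deriving each threshold from the known sample, instead of fixing it by hand, keeps the loss of known YSOs at a predictable level (about 5% per one-sided cut) while removing the tails populated mostly by unknown candidates.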
According to Table 3, additional YSO candidates may be found among the false positives of the following classes: AGN (328), CV (20), BE|GCAS|SDOR|WR (9), and RCB (5), where the numbers of known sources indicated in parentheses represent lower limits.
4.25. Galaxies (GALAXY, in galaxy_candidates)
The classification of 2 451 364 galaxies was made possible by their apparent variability in the Gaia photometry. As mentioned in Sect. 2, galaxies can be affected by spurious signals peculiar to Gaia’s detection and measurement strategy (Holl et al. 2023). The main role of galaxies for the classification of variable objects was to reduce the impact of artificial variations on the identification of candidates of genuine variability types, as already noticed in Gaia DR2 (Clementini et al. 2019).
Unlike stars, galaxies occupy the red end in G − GRP and the blue end in GBP − G (see Fig. E.74c), because the GBP and GRP window size is much larger than the G one, thus more flux from extended objects is included in the sum of GBP and GRP bands than in the G band. This discrepancy is estimated by the field gaia_source.phot_bp_rp_excess_factor (Riello et al. 2021). As expected, only a few training sources or galaxy candidates fulfil the condition of parallax_over_error > 5 required for observational Hertzsprung–Russell diagrams (Figs. E.73d and E.74d).
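The flux excess can be made concrete with a short sketch, assuming that phot_bp_rp_excess_factor is the ratio of the summed GBP and GRP mean fluxes to the G mean flux, as described by Riello et al. (2021). The function and argument names are illustrative, and the flux values in the usage note are invented for the example.

```python
def bp_rp_excess_factor(flux_bp: float, flux_rp: float, flux_g: float) -> float:
    """Ratio of the summed BP and RP fluxes to the G flux (assumed definition)."""
    if flux_g <= 0:
        raise ValueError("the G-band flux must be positive")
    return (flux_bp + flux_rp) / flux_g


def has_extended_like_excess(factor: float, threshold: float = 5.0) -> bool:
    """True when the excess factor exceeds a threshold (default: the value used for GALAXY)."""
    return factor > threshold
```

For instance, invented fluxes of 600, 700, and 1000 (in the same units) give a factor of 1.3, while 3000, 3500, and 1000 give 6.5, the kind of value that results when the larger GBP and GRP windows collect light missed by the G-band window.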
Galaxy candidates were obtained from three binary, multi-class, and meta-classifiers (Sect. 3.2) with minimum probability thresholds. They were further filtered by the following conditions (employing field names in the vari_summary and gaia_source tables).
1. Removal of extremely blue and red outliers:
(a) median_mag_bp − median_mag_rp in the range from 0 to 3 mag,
(b) median_mag_bp − median_mag_g_fov > − 4 mag,
(c) median_mag_g_fov − median_mag_rp in the range from 0.5 to 5.5 mag.
2. The distribution of the GBP and GRP flux excess factor phot_bp_rp_excess_factor showed a bimodal distribution and known galaxies corresponded to the mode at high values, so phot_bp_rp_excess_factor was set to be greater than five (excluding the first peak of the distribution).
3. With respect to point sources, extended objects are often associated with higher positional uncertainty (further amplified by the causes of the spurious photometric variations) and the condition astrometric_excess_noise > 7 mas matched the distribution of known galaxies.
4. The number of sources within 100 arcsec from each GALAXY candidate (computed by excluding the contribution of the galaxy at the centre) was set to be less than 314, affecting mostly candidates around the Galactic plane and behind the Magellanic Clouds (Fig. E.74a).
5. It was further required that num_selected_rp > 5.
6. Two (unpublished) spectral shape components in the GBP band (SSC ids 2 and 3) were set to be greater than minimum thresholds, according to the galaxy distributions in the literature.
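The source-density condition (condition 4) relies on counting neighbours within 100 arcsec of each candidate, excluding the candidate itself. A brute-force sketch is given below, using the haversine formula for the angular separation; it is meant for small samples only, and the function names and the strict inequality at the radius are assumptions of this example. For catalogue-scale data, a spatial index or an archive-side cone search would be needed instead.

```python
import math
from typing import List, Tuple


def angular_separation_arcsec(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation (haversine formula) for coordinates given in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def count_neighbours(index: int, positions: List[Tuple[float, float]],
                     radius_arcsec: float = 100.0) -> int:
    """Number of other sources closer than radius_arcsec to positions[index]."""
    ra0, dec0 = positions[index]
    return sum(
        1
        for i, (ra, dec) in enumerate(positions)
        if i != index and angular_separation_arcsec(ra0, dec0, ra, dec) < radius_arcsec
    )


def passes_density_cut(index: int, positions: List[Tuple[float, float]],
                       max_neighbours: int = 314) -> bool:
    """True when the neighbour count is below the limit quoted for GALAXY candidates."""
    return count_neighbours(index, positions) < max_neighbours
```

The same counting applies to the SDB and SN selections, with the respective limits of 471 and 314 passed as max_neighbours.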
According to Table 3, additional GALAXY candidates may be found among the false positives of the following classes: LPV (1364), S (244), AGN (48), and SN (2), where the numbers of known sources indicated in parentheses represent lower limits. In addition to those of galaxy contaminants in other classes, as mentioned in the beginning of Sect. 4, only the light curves of the sources included in the Gaia Andromeda Photometric Survey (Evans et al. 2023) are published (Fig. E.76).
A comparison of the sources from all of the classes identified by variability that overlap with those from galaxy modules published by other Gaia coordination units (gathered in the galaxy_candidates table) is presented in Tables 12.15 and 12.18 of the Gaia DR3 documentation (Teyssier & Gaia QSO Working Group 2022). It is noted that the unfiltered galaxy_candidates table has significant stellar contamination (see Gaia Collaboration 2023a, where a query to select a purer sub-sample is indicated). An example of a query to extract our GALAXY candidates from the galaxy_candidates table is presented in Appendix D.
5. Conclusions
The Gaia DR3 photometric time series provided sufficient information to classify ten million variable objects into two dozen variability class groups across the whole sky. This combination of number of sources and classes made it one of the largest and most uniformly constructed variable source catalogues in the literature. The cross-match of Gaia sources with an extensive compilation of known variability types (Gavras et al. 2023) enabled a detailed exploitation of the knowledge in the literature for supervised machine learning and for the assessment of the results. A multi-classifier approach made it possible to obtain suitable models for a large variety of variability classes, involving several types of pulsating stars, eclipsing binaries, ellipsoidal variables, spotted stars, eruptive and cataclysmic phenomena, stochastic variations of AGNs, microlensing events, and planetary transits. Almost half of the genuine variable sources (4.7 million) and several classes are available uniquely as classification results (in the vari_classifier_result table), while the other variable sources are (also) included among the SOS module results. Galaxies were detected by an artificial signal of Gaia, which might have led to a biased yet, nevertheless, numerous addition to the extra-galactic content of this data release.
In Gaia DR4, the number of photometric epochs will double and additional input data types, such as the GBP and GRP spectra (as time series, as well as averaged in time) and radial velocities, will be available. Together with ongoing developments in our attribute extraction and classification techniques, the discernibility of variability types is expected to further improve and allow for a significant increase in the number of classified sources and related classes.
Due to the scanning law coverage, only from Gaia DR2 onwards were sufficient epochs available at all locations on the sky, for example, see Fig. 1 in Holl et al. (2018).
Acknowledgments
This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia archive website is https://archives.esac.esa.int/gaia. The Gaia mission and data processing have financially been supported by, in alphabetical order by country: the Algerian Centre de Recherche en Astronomie, Astrophysique et Géophysique of Bouzareah Observatory; the Austrian Fonds zur Förderung der wissenschaftlichen Forschung (FWF) Hertha Firnberg Programme through grants T359, P20046, and P23737; the BELgian federal Science Policy Office (BELSPO) through various PROgramme de Développement d’Expériences scientifiques (PRODEX) grants, the Research Foundation Flanders (Fonds Wetenschappelijk Onderzoek) through grant VS.091.16N, the Fonds de la Recherche Scientifique (FNRS), and the Research Council of Katholieke Universiteit (KU) Leuven through grant C16/18/005 (Pushing AsteRoseismology to the next level with TESS, GaiA, and the Sloan DIgital Sky SurvEy – PARADISE); the Brazil-France exchange programmes Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) and Coordenação de Aperfeicoamento de Pessoal de Nível Superior (CAPES) - Comité Français d’Evaluation de la Coopération Universitaire et Scientifique avec le Brésil (COFECUB); the Chilean Agencia Nacional de Investigación y Desarrollo (ANID) through Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) Regular Project 1210992 (L. 
Chemin); the National Natural Science Foundation of China (NSFC) through grants 11573054, 11703065, and 12173069, the China Scholarship Council through grant 201806040200, and the Natural Science Foundation of Shanghai through grant 21ZR1474100; the Tenure Track Pilot Programme of the Croatian Science Foundation and the École Polytechnique Fédérale de Lausanne and the project TTP-2018-07-1171 ‘Mining the Variable Sky’, with the funds of the Croatian-Swiss Research Programme; the Czech-Republic Ministry of Education, Youth, and Sports through grant LG 15010 and INTER-EXCELLENCE grant LTAUSA18093, and the Czech Space Office through ESA PECS contract 98058; the Danish Ministry of Science; the Estonian Ministry of Education and Research through grant IUT40-1; the European Commission’s Sixth Framework Programme through the European Leadership in Space Astrometry (https://www.cosmos.esa.int/web/gaia/elsa-rtn-programme) Marie Curie Research Training Network (MRTN-CT-2006-033481), through Marie Curie project PIOF-GA-2009-255267 (Space AsteroSeismology & RR Lyrae stars, SAS-RRL), and through a Marie Curie Transfer-of-Knowledge (ToK) fellowship (MTKD-CT-2004-014188); the European Commission’s Seventh Framework Programme through grant FP7-606740 (FP7-SPACE-2013-1) for the Gaia European Network for Improved data User Services (https://gaia.ub.edu/twiki/do/view/GENIUS/) and through grant 264895 for the Gaia Research for European Astronomy Training (https://www.cosmos.esa.int/web/gaia/great-programme) network; the European Cooperation in Science and Technology (COST) through COST Action CA18104 ‘Revealing the Milky Way with Gaia (MW-Gaia)’; the European Research Council (ERC) through grants 320360, 647208, and 834148 and through the European Union’s Horizon 2020 research and innovation and excellent science programmes through Marie Skłodowska-Curie grant 745617 (Our Galaxy at full HD – Gal-HD) and 895174 (The build-up and fate of self-gravitating systems in the Universe) as well 
as grants 687378 (Small Bodies: Near and Far), 682115 (Using the Magellanic Clouds to Understand the Interaction of Galaxies), 695099 (A sub-percent distance scale from binaries and Cepheids – CepBin), 716155 (Structured ACCREtion Disks – SACCRED), 951549 (Sub-percent calibration of the extragalactic distance scale in the era of big surveys – UniverScale), and 101004214 (Innovative Scientific Data Exploration and Exploitation Applications for Space Sciences – EXPLORE); the European Science Foundation (ESF), in the framework of the Gaia Research for European Astronomy Training Research Network Programme (https://www.cosmos.esa.int/web/gaia/great-programme); the European Space Agency (ESA) in the framework of the Gaia project, through the Plan for European Cooperating States (PECS) programme through contracts C98090 and 4000106398/12/NL/KML for Hungary, through contract 4000115263/15/NL/IB for Germany, and through PROgramme de Développement d’Expériences scientifiques (PRODEX) grant 4000127986 for Slovenia; the Academy of Finland through grants 299543, 307157, 325805, 328654, 336546, and 345115 and the Magnus Ehrnrooth Foundation; the French Centre National d’Études Spatiales (CNES), the Agence Nationale de la Recherche (ANR) through grant ANR-10-IDEX-0001-02 for the ‘Investissements d’avenir’ programme, through grant ANR-15-CE31-0007 for project ‘Modelling the Milky Way in the Gaia era’ (MOD4Gaia), through grant ANR-14-CE33-0014-01 for project ‘The Milky Way disc formation in the Gaia era’ (ARCHEOGAL), through grant ANR-15-CE31-0012-01 for project ‘Unlocking the potential of Cepheids as primary distance calibrators’ (UnlockCepheids), through grant ANR-19-CE31-0017 for project ‘Secular evolution of galaxies’ (SEGAL), and through grant ANR-18-CE31-0006 for project ‘Galactic Dark Matter’ (GaDaMa), the Centre National de la Recherche Scientifique (CNRS) and its SNO Gaia of the Institut des Sciences de l’Univers (INSU), its Programmes Nationaux: Cosmologie et Galaxies 
(PNCG), Gravitation Références Astronomie Métrologie (PNGRAM), Planétologie (PNP), Physique et Chimie du Milieu Interstellaire (PCMI), and Physique Stellaire (PNPS), the ‘Action Fédératrice Gaia’ of the Observatoire de Paris, the Région de Franche-Comté, the Institut National Polytechnique (INP) and the Institut National de Physique nucléaire et de Physique des Particules (IN2P3) co-funded by CNES; the German Aerospace Agency (Deutsches Zentrum für Luft- und Raumfahrt e.V., DLR) through grants 50QG0501, 50QG0601, 50QG0602, 50QG0701, 50QG0901, 50QG1001, 50QG1101, 50QG1401, 50QG1402, 50QG1403, 50QG1404, 50QG1904, 50QG2101, 50QG2102, and 50QG2202, and the Centre for Information Services and High Performance Computing (ZIH) at the Technische Universität Dresden for generous allocations of computer time; the Hungarian Academy of Sciences through the Lendület Programme grants LP2014-17 and LP2018-7 and the Hungarian National Research, Development, and Innovation Office (NKFIH) through grant KKP-137523 (‘SeismoLab’); the Science Foundation Ireland (SFI) through a Royal Society - SFI University Research Fellowship (M. 
Fraser); the Israel Ministry of Science and Technology through grant 3-18143 and the Tel Aviv University Center for Artificial Intelligence and Data Science (TAD) through a grant; the Agenzia Spaziale Italiana (ASI) through contracts I/037/08/0, I/058/10/0, 2014-025-R.0, 2014-025-R.1.2015, and 2018-24-HH.0 to the Italian Istituto Nazionale di Astrofisica (INAF), contract 2014-049-R.0/1/2 to INAF for the Space Science Data Centre (SSDC, formerly known as the ASI Science Data Center, ASDC), contracts I/008/10/0, 2013/030/I.0, 2013-030-I.0.1-2015, and 2016-17-I.0 to the Aerospace Logistics Technology Engineering Company (ALTEC S.p.A.), INAF, and the Italian Ministry of Education, University, and Research (Ministero dell’Istruzione, dell’Università e della Ricerca) through the Premiale project ‘MIning The Cosmos Big Data and Innovative Italian Technology for Frontier Astrophysics and Cosmology’ (MITiC); the Netherlands Organisation for Scientific Research (NWO) through grant NWO-M-614.061.414, through a VICI grant (A. Helmi), and through a Spinoza prize (A. 
Helmi), and the Netherlands Research School for Astronomy (NOVA); the Polish National Science Centre through HARMONIA grant 2018/30/M/ST9/00311 and DAINA grant 2017/27/L/ST9/03221 and the Ministry of Science and Higher Education (MNiSW) through grant DIR/WK/2018/12; the Portuguese Fundação para a Ciência e a Tecnologia (FCT) through national funds, grants SFRH/BD/128840/2017 and PTDC/FIS-AST/30389/2017, and work contract DL 57/2016/CP1364/CT0006, the Fundo Europeu de Desenvolvimento Regional (FEDER) through grant POCI-01-0145-FEDER-030389 and its Programa Operacional Competitividade e Internacionalização (COMPETE2020) through grants UIDB/04434/2020 and UIDP/04434/2020, and the Strategic Programme UIDB/00099/2020 for the Centro de Astrofísica e Gravitação (CENTRA); the Slovenian Research Agency through grant P1-0188; the Spanish Ministry of Economy (MINECO/FEDER, UE), the Spanish Ministry of Science and Innovation (MICIN), the Spanish Ministry of Education, Culture, and Sports, and the Spanish Government through grants BES-2016-078499, BES-2017-083126, BES-C-2017-0085, ESP2016-80079-C2-1-R, ESP2016-80079-C2-2-R, FPU16/03827, PDC2021-121059-C22, RTI2018-095076-B-C22, and TIN2015-65316-P (‘Computación de Altas Prestaciones VII’), the Juan de la Cierva Incorporación Programme (FJCI-2015-2671 and IJC2019-04862-I for F. 
Anders), the Severo Ochoa Centre of Excellence Programme (SEV2015-0493), and MICIN/AEI/10.13039/501100011033 (and the European Union through European Regional Development Fund ‘A way of making Europe’) through grant RTI2018-095076-B-C21, the Institute of Cosmos Sciences University of Barcelona (ICCUB, Unidad de Excelencia ‘María de Maeztu’) through grant CEX2019-000918-M, the University of Barcelona’s official doctoral programme for the development of an R+D+i project through an Ajuts de Personal Investigador en Formació (APIF) grant, the Spanish Virtual Observatory through project AyA2017-84089, the Galician Regional Government, Xunta de Galicia, through grants ED431B-2021/36, ED481A-2019/155, and ED481A-2021/296, the Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), funded by the Xunta de Galicia and the European Union (European Regional Development Fund – Galicia 2014-2020 Programme), through grant ED431G-2019/01, the Red Española de Supercomputación (RES) computer resources at MareNostrum, the Barcelona Supercomputing Centre - Centro Nacional de Supercomputación (BSC-CNS) through activities AECT-2017-2-0002, AECT-2017-3-0006, AECT-2018-1-0017, AECT-2018-2-0013, AECT-2018-3-0011, AECT-2019-1-0010, AECT-2019-2-0014, AECT-2019-3-0003, AECT-2020-1-0004, and DATA-2020-1-0010, the Departament d’Innovació, Universitats i Empresa de la Generalitat de Catalunya through grant 2014-SGR-1051 for project ‘Models de Programació i Entorns d’Execució Parallels’ (MPEXPAR), and Ramon y Cajal Fellowship RYC2018-025968-I funded by MICIN/AEI/10.13039/501100011033 and the European Science Foundation (‘Investing in your future’); the Swedish National Space Agency (SNSA/Rymdstyrelsen); the Swiss State Secretariat for Education, Research, and Innovation through the Swiss Activités Nationales Complémentaires and the Swiss National Science Foundation through an Eccellenza Professorial Fellowship (award PCEFP2_194638 for R. 
Anderson); the United Kingdom Particle Physics and Astronomy Research Council (PPARC), the United Kingdom Science and Technology Facilities Council (STFC), and the United Kingdom Space Agency (UKSA) through the following grants to the University of Bristol, the University of Cambridge, the University of Edinburgh, the University of Leicester, the Mullard Space Sciences Laboratory of University College London, and the United Kingdom Rutherford Appleton Laboratory (RAL): PP/D006511/1, PP/D006546/1, PP/D006570/1, ST/I000852/1, ST/J005045/1, ST/K00056X/1, ST/K000209/1, ST/K000756/1, ST/L006561/1, ST/N000595/1, ST/N000641/1, ST/N000978/1, ST/N001117/1, ST/S000089/1, ST/S000976/1, ST/S000984/1, ST/S001123/1, ST/S001948/1, ST/S001980/1, ST/S002103/1, ST/V000969/1, ST/W002469/1, ST/W002493/1, ST/W002671/1, ST/W002809/1, and EP/V520342/1. The Ground Based Optical Tracking (GBOT) programme uses observations collected at (i) the European Organisation for Astronomical Research in the Southern Hemisphere (ESO) with the VLT Survey Telescope (VST), under ESO programmes 092.B-0165, 093.B-0236, 094.B-0181, 095.B-0046, 096.B-0162, 097.B-0304, 098.B-0030, 099.B-0034, 0100.B-0131, 0101.B-0156, 0102.B-0174, and 0103.B-0165; and (ii) the Liverpool Telescope, which is operated on the island of La Palma by Liverpool John Moores University in the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofísica de Canarias with financial support from the United Kingdom Science and Technology Facilities Council, and (iii) telescopes of the Las Cumbres Observatory Global Telescope Network. This work made use of software from H2O (H2O.ai 2020), Postgres-XL (https://www.postgres-xl.org), Java (https://www.oracle.com/java/), R (R Core Team 2018), TBase database management system (https://github.com/Tencent/TBase), and TOPCAT/STILTS (Taylor 2005).
References
- Abbas, M. A., Grebel, E. K., Martin, N. F., et al. 2014, MNRAS, 441, 1230 [Google Scholar]
- Akras, S., Guzman-Ramirez, L., Leal-Ferreira, M. L., & Ramos-Larios, G. 2019, ApJS, 240, 21 [NASA ADS] [CrossRef] [Google Scholar]
- Alfonso-Garzón, J., Domingo, A., Mas-Hesse, J. M., & Giménez, A. 2012, A&A, 548, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Andrae, R., Fouesneau, M., Sordo, R., et al. 2023, A&A, 674, A27 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Arenou, F., & Luri, X. 1999, ASP Conf. Ser., 167, 13 [Google Scholar]
- Babusiaux, C., Fabricius, C., Khanna, S., et al. 2023, A&A, 674, A32 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Baluev, R. V. 2009, MNRAS, 395, 1541 [CrossRef] [Google Scholar]
- Beauchamp, A., Wesemael, F., Bergeron, P., et al. 1999, ApJ, 516, 887 [NASA ADS] [CrossRef] [Google Scholar]
- Belczyński, K., Mikołajewska, J., Munari, U., Ivison, R. J., & Friedjung, M. 2000, A&AS, 146, 407 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Bellm, E. 2014, in The Third Hot-wiring the Transient Universe Workshop, eds. P. R. Wozniak, M. J. Graham, A. A. Mahabal, & R. Seaman, 27 [Google Scholar]
- Benkő, J. M., Bakos, G. Á., & Nuspl, J. 2006, MNRAS, 372, 1657 [CrossRef] [Google Scholar]
- Bergeat, J., Knapik, A., & Rutily, B. 2001, A&A, 369, 178 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Bernhard, K., Hümmerich, S., Otero, S., & Paunzen, E. 2015, A&A, 581, A138 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Blomme, J., Debosscher, J., De Ridder, J., et al. 2010, ApJ, 713, L204 [NASA ADS] [CrossRef] [Google Scholar]
- Boettcher, E., Willman, B., Fadely, R., et al. 2013, AJ, 146, 94 [NASA ADS] [CrossRef] [Google Scholar]
- Bognár, Z., Kawaler, S. D., Bell, K. J., et al. 2020, A&A, 638, A82 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Bonato, M., Liuzzo, E., Giannetti, A., et al. 2018, MNRAS, 478, 1512 [NASA ADS] [CrossRef] [Google Scholar]
- Bradley, P. A., Guzik, J. A., Miles, L. F., et al. 2015, AJ, 149, 68 [CrossRef] [Google Scholar]
- Braga, V. F., Stetson, P. B., Bono, G., et al. 2016, AJ, 152, 170 [Google Scholar]
- Braga, V. F., Contreras Ramos, R., Minniti, D., et al. 2019, A&A, 625, A151 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Bredall, J. W., Shappee, B. J., Gaidos, E., et al. 2020, MNRAS, 496, 3257 [NASA ADS] [CrossRef] [Google Scholar]
- Breiman, L. 2001, Mach. Learn., 45, 5 [Google Scholar]
- Butler, N. R., & Bloom, J. S. 2011, AJ, 141, 93 [Google Scholar]
- Carnerero, M. I., Raiteri, C. M., Rimoldini, L., et al. 2023, A&A, 674, A24 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Castañeda, J., Hobbs, D., Fabricius, C., et al. 2022, Gaia DR3 documentation Chapter 3: Pre-processing, Gaia DR3 documentation, European Space Agency; Gaia Data Processing and Analysis Consortium, Online at https://gea.esac.esa.int/archive/documentation/GDR3/index.html [Google Scholar]
- Chang, S. W., Byun, Y. I., & Hartman, J. D. 2015, ApJ, 814, 35 [NASA ADS] [CrossRef] [Google Scholar]
- Chang, Y. L., Arsioli, B., Giommi, P., & Padovani, P. 2017, A&A, 598, A17 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Chen, T., & Guestrin, C. 2016, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’16 (New York, NY, USA: ACM), 785 [CrossRef] [Google Scholar]
- Chen, X., Wang, S., Deng, L., et al. 2020, ApJS, 249, 18 [NASA ADS] [CrossRef] [Google Scholar]
- Clementini, G., Ripepi, V., Molinaro, R., et al. 2019, A&A, 622, A60 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Clementini, G., Ripepi, V., Garofalo, A., et al. 2023, A&A, 674, A18 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Córsico, A. H., Althaus, L. G., Miller Bertolami, M. M., & Kepler, S. O. 2019, A&ARv, 27, 7 [Google Scholar]
- Corwin, T. M., Sumerel, A. N., Pritzl, B. J., et al. 2006, AJ, 132, 1014 [NASA ADS] [CrossRef] [Google Scholar]
- Corwin, T. M., Borissova, J., Stetson, P. B., et al. 2008, AJ, 135, 1459 [Google Scholar]
- Creevey, O. L., Sordo, R., Pailler, F., et al. 2023, A&A, 674, A26 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Cunha, M. S., Antoci, V., Holdsworth, D. L., et al. 2019, MNRAS, 487, 3523 [NASA ADS] [CrossRef] [Google Scholar]
- Dall’Ora, M., Clementini, G., Kinemuchi, K., et al. 2006, ApJ, 653, L109 [CrossRef] [Google Scholar]
- Dall’Ora, M., Kinemuchi, K., Ripepi, V., et al. 2012, ApJ, 752, 42 [CrossRef] [Google Scholar]
- Debosscher, J., Sarro, L. M., Aerts, C., et al. 2007, A&A, 475, 1159 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Debosscher, J., Blomme, J., Aerts, C., & De Ridder, J. 2011, A&A, 529, A89 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Delchambre, L., Bailer-Jones, C. A. L., Bellas-Velidis, I., et al. 2023, A&A, 674, A31 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- De Medeiros, J. R., Ferreira Lopes, C. E., Leão, I. C., et al. 2013, A&A, 555, A63 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Demers, S., & Battinelli, P. 2007, A&A, 473, 143 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Distefano, E., Lanzafame, A. C., Brugaletta, E., et al. 2023, A&A, 674, A20 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Drake, A. J. 2006, AJ, 131, 1044 [Google Scholar]
- Drake, A. J., Catelan, M., Djorgovski, S. G., et al. 2013a, ApJ, 763, 32 [NASA ADS] [CrossRef] [Google Scholar]
- Drake, A. J., Catelan, M., Djorgovski, S. G., et al. 2013b, ApJ, 765, 154 [Google Scholar]
- Drake, A. J., Gänsicke, B. T., Djorgovski, S. G., et al. 2014a, MNRAS, 441, 1186 [NASA ADS] [CrossRef] [Google Scholar]
- Drake, A. J., Graham, M. J., Djorgovski, S. G., et al. 2014b, ApJS, 213, 9 [Google Scholar]
- Drake, A. J., Djorgovski, S. G., Catelan, M., et al. 2017, MNRAS, 469, 3688 [NASA ADS] [CrossRef] [Google Scholar]
- Dubath, P., Rimoldini, L., Süveges, M., et al. 2011, MNRAS, 414, 2602 [Google Scholar]
- Ducourant, C., Krone-Martins, A., Galluccio, L., et al. 2023, A&A, 674, A11 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Dufour, P., Béland, S., Fontaine, G., Chayer, P., & Bergeron, P. 2011, ApJ, 733, L19 [NASA ADS] [CrossRef] [Google Scholar]
- Dunlap, B. H., Barlow, B. N., & Clemens, J. C. 2010, ApJ, 720, L159 [CrossRef] [Google Scholar]
- Eker, Z., Ak, N. F., Bilir, S., et al. 2008, MNRAS, 389, 1722 [NASA ADS] [CrossRef] [Google Scholar]
- ESA 1997, in The HIPPARCOS and TYCHO catalogues. Astrometric and Photometric Star Catalogues Derived from the ESA HIPPARCOS Space Astrometry Mission, ESA Spec. Publ., 1200 [Google Scholar]
- Evans, D. W., Eyer, L., Busso, G., et al. 2023, A&A, 674, A4 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Eyer, L., Mowlavi, N., Evans, D. W., et al. 2017, A&A, submitted [arXiv:1702.03295] [Google Scholar]
- Eyer, L., Rimoldini, L., & Rohrbasser, L. 2020, in Stars and their Variability Observed from Space, eds. C. Neiner, W. W. Weiss, & D. Baade, 11 [Google Scholar]
- Eyer, L., Audard, M., Holl, B., et al. 2023, A&A, 674, A13 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Flesch, E. W. 2019, ArXiv e-prints [arXiv:1912.05614] [Google Scholar]
- Fouesneau, M., Frémat, Y., Andrae, R., et al. 2023, A&A, 674, A28 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (Brown, A. G. A., et al.) 2016a, A&A, 595, A2 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (Prusti, T., et al.) 2016b, A&A, 595, A1 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (Brown, A. G. A., et al.) 2018, A&A, 616, A1 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (Eyer, L., et al.) 2019, A&A, 623, A110 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (Klioner, S. A., et al.) 2022, A&A, 667, A148 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (Bailer-Jones, C. A. L., et al.) 2023a, A&A, 674, A41 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (De Ridder, J., et al.) 2023b, A&A, 674, A36 (Gaia DR3 SI) [CrossRef] [EDP Sciences] [Google Scholar]
- Gaia Collaboration (Vallenari, A., et al.) 2023c, A&A, 674, A1 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Garofalo, A., Cusano, F., Clementini, G., et al. 2013, ApJ, 767, 62 [NASA ADS] [CrossRef] [Google Scholar]
- Gavras, P., Rimoldini, L., Nienartowicz, K., et al. 2023, A&A, 674, A22 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Gianninas, A., Bergeron, P., & Fontaine, G. 2005, ApJ, 631, 1100 [NASA ADS] [CrossRef] [Google Scholar]
- Gomel, R., Mazeh, T., Faigler, S., et al. 2023, A&A, 674, A19 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Górski, K. M., Banday, A. J., Hivon, E., & Wandelt, B. D. 2002, ASP Conf. Ser., 281, 107 [Google Scholar]
- H2O.ai 2020, H2O: Scalable Machine Learning Platform, version 3.30.0.1 [Google Scholar]
- Hamanowicz, A., Pietrukowicz, P., Udalski, A., et al. 2016, Acta Astron., 66, 197 [NASA ADS] [Google Scholar]
- Hartman, J. D., Bakos, G. Á., Kovács, G., & Noyes, R. W. 2010, MNRAS, 408, 475 [Google Scholar]
- Heck, A., Manfroid, J., & Mersch, G. 1985, A&AS, 59, 63 [NASA ADS] [Google Scholar]
- Heinze, A. N., Tonry, J. L., Denneau, L., et al. 2018, AJ, 156, 241 [Google Scholar]
- Hermes, J. J., Montgomery, M. H., Winget, D. E., et al. 2012, ApJ, 750, L28 [Google Scholar]
- Hermes, J. J., Montgomery, M. H., Gianninas, A., et al. 2013a, MNRAS, 436, 3573 [Google Scholar]
- Hermes, J. J., Montgomery, M. H., Winget, D. E., et al. 2013b, ApJ, 765, 102 [Google Scholar]
- Hey, D. R., Holdsworth, D. L., Bedding, T. R., et al. 2019, MNRAS, 488, 18 [NASA ADS] [CrossRef] [Google Scholar]
- Holl, B., Audard, M., Nienartowicz, K., et al. 2018, A&A, 618, A30 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Holl, B., Fabricius, C., Portell, J., et al. 2023, A&A, 674, A25 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Howell, S. B., Mason, E., Boyd, P., Smith, K. L., & Gelino, D. M. 2016, ApJ, 831, 27 [NASA ADS] [CrossRef] [Google Scholar]
- Hümmerich, S., Mikulášek, Z., Paunzen, E., et al. 2018, A&A, 619, A98 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Ivezić, Ž., Smith, J. A., Miknaitis, G., et al. 2007, AJ, 134, 973 [Google Scholar]
- Jayasinghe, T., Kochanek, C. S., Stanek, K. Z., et al. 2018, MNRAS, 477, 3145 [Google Scholar]
- Jayasinghe, T., Stanek, K. Z., Kochanek, C. S., et al. 2019a, MNRAS, 486, 1907 [NASA ADS] [Google Scholar]
- Jayasinghe, T., Stanek, K. Z., Kochanek, C. S., et al. 2019b, MNRAS, 485, 961 [Google Scholar]
- Kahraman Aliçavuş, F., Niemczura, E., De Cat, P., et al. 2016, MNRAS, 458, 2307 [CrossRef] [Google Scholar]
- Kepler, S. O., Fraga, L., Winget, D. E., et al. 2014, MNRAS, 442, 2278 [NASA ADS] [CrossRef] [Google Scholar]
- Kim, D.-W., Protopapas, P., Bailer-Jones, C. A. L., et al. 2014, A&A, 566, A43 [CrossRef] [EDP Sciences] [Google Scholar]
- Kinemuchi, K., Smith, H. A., Woźniak, P. R., McKay, T. A., & ROTSE Collaboration 2006, AJ, 132, 1202 [Google Scholar]
- Kirk, B., Conroy, K., Prša, A., et al. 2016, AJ, 151, 68 [Google Scholar]
- Kochanek, C. S., Shappee, B. J., Stanek, K. Z., et al. 2017, PASP, 129, 104502 [Google Scholar]
- Kurtz, D. W., Shibahashi, H., Dhillon, V. S., et al. 2013, MNRAS, 432, 1632 [Google Scholar]
- Lanzafame, A. C., Distefano, E., Messina, S., et al. 2018, A&A, 616, A16 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Lanzafame, A. C., Brugaletta, E., Frémat, Y., et al. 2023, A&A, 674, A30 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Lebzelter, T., Mowlavi, N., Lecoeur-Taibi, I., et al. 2023, A&A, 674, A15 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Lindegren, L., Bastian, U., Biermann, M., et al. 2021a, A&A, 649, A4 [EDP Sciences] [Google Scholar]
- Lindegren, L., Klioner, S. A., Hernández, J., et al. 2021b, A&A, 649, A2 [EDP Sciences] [Google Scholar]
- Ma, C., Arias, F. E., Bianco, G., et al. 2013, VizieR Online Data Catalog: I/323 [Google Scholar]
- Marquette, J. B., Beaulieu, J. P., Buchler, J. R., et al. 2009, A&A, 495, 249 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Martínez-Arnáiz, R., Maldonado, J., Montes, D., Eiroa, C., & Montesinos, B. 2010, A&A, 520, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Marton, G., Ábrahám, P., Rimoldini, L., et al. 2023, A&A, 674, A21 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Massaro, E., Maselli, A., Leto, C., et al. 2015, Ap&SS, 357, 75 [Google Scholar]
- Mauron, N., Maurin, L. P. A., & Kendall, T. R. 2019, A&A, 626, A112 [EDP Sciences] [Google Scholar]
- Medhi, B. J., Messina, S., Parihar, P. S., et al. 2007, A&A, 469, 713 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Mennickent, R. E., Pietrzyński, G., Gieren, W., & Szewczyk, O. 2002, A&A, 393, 887 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Messina, S., Desidera, S., Turatto, M., Lanzafame, A. C., & Guinan, E. F. 2010, A&A, 520, A15 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Messina, S., Desidera, S., Lanzafame, A. C., Turatto, M., & Guinan, E. F. 2011, A&A, 532, A10 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Mould, J., Saha, A., & Hughes, S. 2004, ApJS, 154, 623 [NASA ADS] [CrossRef] [Google Scholar]
- Mowlavi, N., Holl, B., Lecœur-Taïbi, I., et al. 2023, A&A, 674, A16 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Mróz, P., Udalski, A., Poleski, R., et al. 2015, Acta Astron., 65, 313 [NASA ADS] [Google Scholar]
- Musella, I., Ripepi, V., Clementini, G., et al. 2009, ApJ, 695, L83 [NASA ADS] [CrossRef] [Google Scholar]
- Musella, I., Ripepi, V., Marconi, M., et al. 2012, ApJ, 756, 121 [NASA ADS] [CrossRef] [Google Scholar]
- Niemczura, E. 2003, A&A, 404, 689 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Nitta, A., Kleinman, S. J., Krzesinski, J., et al. 2009, ApJ, 690, 560 [NASA ADS] [CrossRef] [Google Scholar]
- Palaversa, L., Ivezić, Ž., Eyer, L., et al. 2013, AJ, 146, 101 [CrossRef] [Google Scholar]
- Panahi, A., Zucker, S., Clementini, G., et al. 2022, A&A, 663, A101 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Pawlak, M., Graczyk, D., Soszyński, I., et al. 2013, Acta Astron., 63, 323 [NASA ADS] [Google Scholar]
- Pawlak, M., Soszyński, I., Udalski, A., et al. 2016, Acta Astron., 66, 421 [Google Scholar]
- Pellerin, A., & Macri, L. M. 2011, ApJS, 193, 26 [NASA ADS] [CrossRef] [Google Scholar]
- Pietrukowicz, P., Dziembowski, W. A., Latour, M., et al. 2017, Nat. Astron., 1, 0166 [Google Scholar]
- Pigulski, A., Pojmański, G., Pilecki, B., & Szczygieł, D. M. 2009, Acta Astron., 59, 33 [NASA ADS] [Google Scholar]
- Pojmanski, G. 2002, Acta Astron., 52, 397 [NASA ADS] [Google Scholar]
- Poleski, R., Soszyński, I., Udalski, A., et al. 2010, Acta Astron., 60, 1 [EDP Sciences] [Google Scholar]
- Pritzl, B. J., Smith, H. A., Catelan, M., & Sweigart, A. V. 2002, AJ, 124, 949 [Google Scholar]
- Pritzl, B. J., Smith, H. A., Stetson, P. B., et al. 2003, AJ, 126, 1381 [Google Scholar]
- Quirion, P. O., Fontaine, G., & Brassard, P. 2007, ApJS, 171, 219 [Google Scholar]
- R Core Team 2018, R: A Language and Environment for Statistical Computing (Vienna, Austria: R Foundation for Statistical Computing) [Google Scholar]
- Reinhold, T., & Gizon, L. 2015, A&A, 583, A65 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Renson, P., & Manfroid, J. 2009, A&A, 498, 961 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Richards, J. W., Starr, D. L., Butler, N. R., et al. 2011, ApJ, 733, 10 [NASA ADS] [CrossRef] [Google Scholar]
- Richards, J. W., Starr, D. L., Miller, A. A., et al. 2012, ApJS, 203, 32 [NASA ADS] [CrossRef] [Google Scholar]
- Ricker, G. R., Winn, J. N., Vanderspek, R., et al. 2015, J. Astron. Telescopes Instrum. Syst., 1, 014003 [Google Scholar]
- Riello, M., De Angeli, F., Evans, D. W., et al. 2021, A&A, 649, A3 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Rimoldini, L. 2014, Astron. Comput., 5, 1 [NASA ADS] [CrossRef] [Google Scholar]
- Rimoldini, L., Holl, B., Audard, M., et al. 2019, A&A, 625, A97 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Rimoldini, L., Eyer, L., Audard, M., et al. 2022, in Gaia DR3 documentation Chapter 10: Variability, Gaia DR3 documentation, European Space Agency; Gaia Data Processing and Analysis Consortium, Online at https://gea.esac.esa.int/archive/documentation/GDR3/index.html, 10 [Google Scholar]
- Ripepi, V., Molinaro, R., Musella, I., et al. 2019, A&A, 625, A14 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Ripepi, V., Clementini, G., Molinaro, R., et al. 2023, A&A, 674, A17 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Ritter, H., & Kolb, U. 2003, A&A, 404, 301 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Roelens, M., Eyer, L., Mowlavi, N., et al. 2018, A&A, 620, A197 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Sabogal, B. E., Mennickent, R. E., Pietrzyński, G., & Gieren, W. 2005, MNRAS, 361, 1055 [NASA ADS] [CrossRef] [Google Scholar]
- Sabogal, B. E., Mennickent, R. E., Pietrzyński, G., et al. 2008, A&A, 478, 659 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Sarro, L. M., Debosscher, J., López, M., & Aerts, C. 2009, A&A, 494, 739 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Sarro, L. M., Debosscher, J., Neiner, C., et al. 2013, A&A, 550, A120 [Google Scholar]
- Schaffenroth, V., Classen, L., Nagel, K., et al. 2014, A&A, 570, A70 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Sesar, B., Banholzer, S. R., Cohen, J. G., et al. 2014, ApJ, 793, 135 [NASA ADS] [CrossRef] [Google Scholar]
- Sesar, B., Hernitschek, N., Mitrović, S., et al. 2017, AJ, 153, 204 [NASA ADS] [CrossRef] [Google Scholar]
- Shappee, B. J., Prieto, J. L., Grupe, D., et al. 2014, ApJ, 788, 48 [Google Scholar]
- Shibayama, T., Maehara, H., Notsu, S., et al. 2013, ApJS, 209, 5 [Google Scholar]
- Siegel, M. H. 2006, ApJ, 649, L83 [NASA ADS] [CrossRef] [Google Scholar]
- Sikora, J., David-Uraz, A., Chowdhury, S., et al. 2019, MNRAS, 487, 4695 [Google Scholar]
- Simonetti, J. H., Cordes, J. M., & Heeschen, D. S. 1985, ApJ, 296, 46 [NASA ADS] [CrossRef] [Google Scholar]
- Skottfelt, J., Bramich, D. M., Figuera Jaimes, R., et al. 2015, A&A, 573, A103 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Slawson, R. W., Prša, A., Welsh, W. F., et al. 2011, AJ, 142, 160 [Google Scholar]
- Soszyński, I., Poleski, R., Udalski, A., et al. 2008a, Acta Astron., 58, 163 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2008b, Acta Astron., 58, 293 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2009a, Acta Astron., 59, 1 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2009b, Acta Astron., 59, 239 [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2009c, Acta Astron., 59, 335 [NASA ADS] [Google Scholar]
- Soszyński, I., Poleski, R., Udalski, A., et al. 2010a, Acta Astron., 60, 17 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2010b, Acta Astron., 60, 165 [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2010c, Acta Astron., 60, 91 [NASA ADS] [Google Scholar]
- Soszyński, I., Dziembowski, W. A., Udalski, A., et al. 2011a, Acta Astron., 61, 1 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Pietrukowicz, P., et al. 2011b, Acta Astron., 61, 285 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2011c, Acta Astron., 61, 217 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Poleski, R., et al. 2012, Acta Astron., 62, 219 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2013, Acta Astron., 63, 21 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2014, Acta Astron., 64, 177 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2015, Acta Astron., 65, 297 [NASA ADS] [Google Scholar]
- Soszyński, I., Pawlak, M., Pietrukowicz, P., et al. 2016a, Acta Astron., 66, 405 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2016b, Acta Astron., 66, 131 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2017, Acta Astron., 67, 297 [NASA ADS] [Google Scholar]
- Soszyński, I., Udalski, A., Wrona, M., et al. 2019, Acta Astron., 69, 321 [Google Scholar]
- Soszyński, I., Udalski, A., Szymański, M. K., et al. 2020, Acta Astron., 70, 101 [Google Scholar]
- Southworth, J. 2011, MNRAS, 417, 2166 [Google Scholar]
- Spano, M., Mowlavi, N., Eyer, L., et al. 2011, A&A, 536, A60 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Stankov, A., & Handler, G. 2005, ApJS, 158, 193 [NASA ADS] [CrossRef] [Google Scholar]
- Stetson, P. B. 1996, PASP, 108, 851 [NASA ADS] [CrossRef] [Google Scholar]
- Suh, K.-W., & Hong, J. 2017, J. Kor. Astron. Soc., 50, 131 [NASA ADS] [CrossRef] [Google Scholar]
- Süveges, M., Sesar, B., Váradi, M., et al. 2012, MNRAS, 424, 2528 [CrossRef] [Google Scholar]
- Szkody, P., Anderson, S. F., Brooks, K., et al. 2011, AJ, 142, 181 [Google Scholar]
- Szkody, P., Dicenzo, B., Ho, A. Y. Q., et al. 2020, AJ, 159, 198 [NASA ADS] [CrossRef] [Google Scholar]
- Taylor, M. B. 2005, ASP Conf. Ser., 347, 29 [Google Scholar]
- Teyssier, D., & Gaia QSO Working Group 2022, Gaia DR3 documentation Chapter 12: Integrated extragalactic tables, Gaia DR3 documentation, European Space Agency; Gaia Data Processing and Analysis Consortium, Online at https://gea.esac.esa.int/archive/documentation/GDR3/index.html, 12 [Google Scholar]
- Torrealba, G., Catelan, M., Drake, A. J., et al. 2015, MNRAS, 446, 2251 [NASA ADS] [CrossRef] [Google Scholar]
- Udalski, A., Soszyński, I., Pietrukowicz, P., et al. 2018, Acta Astron., 68, 315 [Google Scholar]
- Uytterhoeven, K., Moya, A., Grigahcène, A., et al. 2011, A&A, 534, A125 [CrossRef] [EDP Sciences] [Google Scholar]
- Van Reeth, T., Tkachenko, A., Aerts, C., et al. 2015, ApJS, 218, 27 [Google Scholar]
- Varga-Verebélyi, E., Kun, M., Szegedi-Elek, E., et al. 2020, in Origins: From the Protosun to the First Steps of Life, eds. B. G. Elmegreen, L. V. Tóth, & M. Güdel, 345, 378 [Google Scholar]
- Vaughan, S., Edelson, R., Warwick, R. S., & Uttley, P. 2003, MNRAS, 345, 1271 [Google Scholar]
- Walkowicz, L. M., Basri, G., Batalha, N., et al. 2011, AJ, 141, 50 [NASA ADS] [CrossRef] [Google Scholar]
- Watkins, L. L., Evans, N. W., Belokurov, V., et al. 2009, MNRAS, 398, 1757 [NASA ADS] [CrossRef] [Google Scholar]
- Watson, C. L., Henden, A. A., & Price, A. 2006, Soc. Astron. Sci. Ann. Symp., 25, 47 [NASA ADS] [Google Scholar]
- Williams, K. A., Montgomery, M. H., Winget, D. E., Falcon, R. E., & Bierwagen, M. 2016, ApJ, 817, 27 [Google Scholar]
- Woźniak, P. R., Williams, S. J., Vestrand, W. T., & Gupta, V. 2004, AJ, 128, 2965 [CrossRef] [Google Scholar]
- Wu, C.-J., Ip, W.-H., & Huang, L.-C. 2015, ApJ, 798, 92 [Google Scholar]
- Wyrzykowski, Ł., Kruszyńska, K., Rybicki, K. A., et al. 2023, A&A, 674, A23 (Gaia DR3 SI) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Zechmeister, M., & Kürster, M. 2009, A&A, 496, 577 [CrossRef] [EDP Sciences] [Google Scholar]
- Žerjal, M., Zwitter, T., Matijevič, G., et al. 2017, ApJ, 835, 61 [CrossRef] [Google Scholar]
Appendix A: Special training selections
In addition to the general selections applied to all training-set sources (Sect. 3.1.2), (sub)class-specific filtering conditions are listed below, often as a function of literature catalogue, using fields defined in the Gaia DR3 archive tables gaia_source and vari_summary. The International Variable Star Index (VSX) version used is 2019-11-12; for brevity, it is not repeated at each mention of Watson et al. (2006).
- BE: conditions reflecting their bright and blue nature.
  (a) parallax_over_error > 2, median_mag_bp − median_mag_rp < 0.5 mag, and median_mag_g_fov − 19 < 0 mag, for sources in Mennickent et al. (2002) (Small Magellanic Cloud);
  (b) parallax_over_error > 5, median_mag_bp − median_mag_rp < 0.5 mag, and absolute G magnitude < 0 mag, for sources in Richards et al. (2012);
  (c) parallax_over_error > 2, for sources in Sabogal et al. (2005) (Large Magellanic Cloud);
  (d) parallax_over_error > 5, median_mag_bp − median_mag_rp < 1 mag, and absolute G magnitude < 0 mag, for sources in Watson et al. (2006).
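The BE conditions (b) and (d) rely on an absolute G magnitude that the appendix does not define explicitly. The following minimal sketch assumes the standard distance modulus with the parallax in mas and no extinction correction, M_G = G + 5 log10(parallax) − 10; the function names are illustrative and not part of the Gaia pipeline.

```python
import math


def absolute_g_mag(median_mag_g_fov, parallax_mas):
    """Absolute G magnitude from the distance modulus (assumption: no
    extinction correction, parallax in mas, distance = 1000 / parallax pc)."""
    if parallax_mas <= 0:
        raise ValueError("a positive parallax is required")
    return median_mag_g_fov + 5.0 * math.log10(parallax_mas) - 10.0


def passes_be_richards2012(parallax_over_error, median_mag_bp,
                           median_mag_rp, median_mag_g_fov, parallax_mas):
    """Condition (b) of the BE item, as read from the appendix text."""
    return (
        parallax_over_error > 5
        and median_mag_bp - median_mag_rp < 0.5
        and absolute_g_mag(median_mag_g_fov, parallax_mas) < 0
    )
```

For example, a source with G = 6 mag and a parallax of 2 mas has an absolute magnitude of about −2.5 mag under this assumption and therefore satisfies the magnitude part of the cut.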
- Cepheids:
  (a) ACEP: period > 1 d, for sources in Drake et al. (2014b);
  (b) CEP: std_dev_mag_g_fov above the median standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E);
  (c) DCEP: std_dev_mag_g_fov > 0.01 mag, for sources in Soszyński et al. (2012, 2015, 2017, 2020) and Udalski et al. (2018);
  (d) RV: median_mag_bp − median_mag_rp > 0.5 mag and std_dev_mag_g_fov > 0.1 mag;
  (e) T2CEP: std_dev_mag_g_fov > 0.03 mag.
- CV: std_dev_mag_g_fov > 0.1 mag.
- DSCT|SXPHE: std_dev_mag_g_fov > 0.02 mag and a lower limit at the median standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E).
- Eclipsing binaries and ellipsoidals: std_dev_mag_g_fov above custom percentiles (as listed below) of the standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E), according to the comparison between the literature period and that recovered by the relevant SOS modules.
  (a) EA:
    - the 95th percentile, for sources in Drake et al. (2014b, 2017), Palaversa et al. (2013), and Rybizki (catalogue GAIA_ECL_RYBIZKI_2018 in Gavras et al. 2023);
    - the 90th percentile, for sources in Chen et al. (2020), ESA (1997), Pawlak et al. (2013, 2016), Pigulski et al. (2009), and Soszyński et al. (2016a);
    - the 85th percentile, if the period from Gaia data matches that of the literature, for sources in Kirk et al. (2016), Pawlak et al. (2016), and Soszyński et al. (2016a);
    - the 80th percentile, if the period from Gaia data matches that of the literature, otherwise the 95th percentile, for sources in Jayasinghe et al. (2018, 2019a,b) and Watson et al. (2006);
    - skewness_mag_g_fov > 0.9, for sources in Chen et al. (2020), ESA (1997), Kirk et al. (2016), Jayasinghe et al. (2018, 2019a,b), Drake et al. (2014b, 2017), Palaversa et al. (2013), Pigulski et al. (2009), Rybizki (catalogue GAIA_ECL_RYBIZKI_2018 in Gavras et al. 2023), and Watson et al. (2006).
  (b) EB:
    - the 95th percentile, for sources in Drake et al. (2014b) and Rybizki (catalogue GAIA_ECL_RYBIZKI_2018 in Gavras et al. 2023);
    - the 90th percentile, for sources in Pawlak et al. (2013);
    - the 85th percentile, if the period from Gaia data matches that of the literature, for sources in Kirk et al. (2016);
    - the 85th percentile, if the period from Gaia data matches that of the literature, otherwise the 95th percentile, for sources in Jayasinghe et al. (2018, 2019a,b), Pojmanski (2002), and Watson et al. (2006);
    - the 80th percentile, if the period from Gaia data matches that of the literature, otherwise the 90th percentile, for sources in ESA (1997).
  (c) ECL:
    - the 75th percentile (same as the general level), but only if std_dev_mag_g_fov > 0.01 mag and the period from Gaia data matches that of the literature, for sources in Watson et al. (2006);
    - the 75th percentile (same as the general level), but only if std_dev_mag_g_fov > 0.01 mag and (the period from Gaia data matches that of the literature or std_dev_mag_g_fov > 0.025 mag), for sources in Soszyński et al. (2012).
  (d) EW:
    - the 95th percentile, for sources in Drake et al. (2014b) and Pigulski et al. (2009);
    - the 90th percentile, for sources in ESA (1997), Pawlak et al. (2013), and Rybizki (catalogue GAIA_ECL_RYBIZKI_2018 in Gavras et al. 2023);
    - the 90th percentile, if the period from Gaia data matches that of the literature, for sources in Pawlak et al. (2016), Soszyński et al. (2016a), and Watson et al. (2006);
    - the 80th percentile, if the period from Gaia data matches that of the literature, otherwise the 95th percentile, for sources in Kirk et al. (2016), Jayasinghe et al. (2018, 2019a,b), and Pojmanski (2002);
    - the 80th percentile, if the period from Gaia data matches that of the literature, otherwise the 90th percentile, for sources in Pawlak et al. (2016) and Chen et al. (2020).
  (e) ELL:
    - the 75th percentile (same as the general level), but only if std_dev_mag_g_fov > 0.03 mag, for sources in Jayasinghe et al. (2018, 2019a,b);
    - the 75th percentile (same as the general level), but only if std_dev_mag_g_fov > 0.01 mag and (the period from Gaia data matches that of the literature or std_dev_mag_g_fov > 0.025 mag), for all other literature catalogues.
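The percentile thresholds used for the eclipsing binaries and ellipsoidals (and the median thresholds used for other classes) can be illustrated with a short sketch. It assumes that the 0.05 mag intervals are bins in median G magnitude and that each threshold is the chosen percentile of std_dev_mag_g_fov among the reference sources falling in the same bin; the actual procedure is described in Appendix E and may differ. All function names are hypothetical.

```python
import math
from collections import defaultdict

BIN_WIDTH = 0.05  # mag, width of the G magnitude intervals quoted in the text


def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 100]."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def bin_index(mag):
    """Index of the magnitude interval containing `mag`."""
    return math.floor(mag / BIN_WIDTH)


def build_thresholds(ref_mags, ref_stds, q):
    """Per-bin q-th percentile of the reference standard deviations in G."""
    bins = defaultdict(list)
    for mag, std in zip(ref_mags, ref_stds):
        bins[bin_index(mag)].append(std)
    return {k: percentile(v, q) for k, v in bins.items()}


def required_percentile(period_matches, matched=80, unmatched=95):
    """Percentile for catalogues with a period-dependent level, e.g. the
    80th/95th pair quoted for EA sources in Watson et al. (2006)."""
    return matched if period_matches else unmatched


def exceeds_threshold(mag, std, thresholds):
    """True if `std` lies above the threshold of its magnitude bin.
    Bins without reference sources reject the source (an assumption)."""
    limit = thresholds.get(bin_index(mag))
    return limit is not None and std > limit
```

In practice one table of thresholds would be built per percentile level (75, 80, 85, 90, 95) from the reference sources, and each training candidate would be compared with the table selected by its literature catalogue and period agreement.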
- EP: skewness_mag_g_fov > 0 and signal in the Gaia data confirmed by the planetary transit SOS module.
- GALAXY: objects at the brightest and reddest ends, overlapping with known stellar distributions, were removed by requiring median_mag_g_fov > 3 [(median_mag_bp − median_mag_rp) − 1.9] + 20 mag. In addition, a few extremely blue outliers were excluded by requiring median_mag_bp − median_mag_rp > 0 mag.
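The two GALAXY conditions are read here as requirements that a source must satisfy to be kept, since bright and red objects fail the first inequality and very blue objects fail the second. A minimal sketch of this reading, with a hypothetical function name, is as follows.

```python
def keep_galaxy_training_source(median_mag_g_fov, median_mag_bp, median_mag_rp):
    """Return True if a GALAXY training candidate survives both cuts.

    Interpretation (an assumption): the inequalities quoted in the text are
    the conditions for keeping a source, not for rejecting it.
    """
    bp_rp = median_mag_bp - median_mag_rp
    faint_or_blue_enough = median_mag_g_fov > 3.0 * (bp_rp - 1.9) + 20.0
    not_extremely_blue = bp_rp > 0.0
    return faint_or_blue_enough and not_extremely_blue
```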
- GCAS: conditions that reflect their bright and blue nature include parallax_over_error > 5, median_mag_bp − median_mag_rp < 1 mag, and absolute G magnitude less than 1.5 and 1.0 mag, for sources in Jayasinghe et al. (2018, 2019a,b) and Watson et al. (2006), respectively.
- Long-period variables: empirical relations were used to identify bright red giant stars in the observational Hertzsprung–Russell diagram, namely median_mag_bp − median_mag_rp > 1 mag and a threshold for the absolute G magnitude expressed in terms of parallax, colour, and apparent magnitude as follows. Stars in the (reddened) red clump have an absolute G magnitude of about 1.8 (median_mag_bp − median_mag_rp) − 1.8 mag, which can be expressed as a parallax (in mas) of approximately 10^[1.7 + 0.36 (median_mag_bp − median_mag_rp) − 0.2 median_mag_g_fov]. It was found that, generally, stars brighter than the ones in the red clump could be identified by parallax (star) − 0.12 mas < parallax (red clump), where the offset of −0.12 mas served to include the scatter from parallax noise of stars in the Magellanic Clouds. This expression was adapted to include red giant branch stars fainter than the red clump, as follows: parallax − 0.12 mas < 10^[1.8 + 0.6 (median_mag_bp − median_mag_rp) − 0.2 median_mag_g_fov].
  (a) M: std_dev_mag_g_fov > 0.1 mag;
  (b) SRA, SRB, SRC: the cross-match with Alfonso-Garzón et al. (2012) and Watson et al. (2006) was limited to angular distances less than 1.5 and 2.5 arcsec, respectively, unless parallax > 2 mas;
  (c) SRS: the cross-match with Watson et al. (2006) was limited to angular distances less than 1.5 arcsec, unless parallax > 2 mas.
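The red-clump parallax expression quoted for the long-period variables follows from the distance modulus with the parallax in mas, M_G = G + 5 log10(parallax) − 10, which gives log10(parallax) = 2 + 0.2 M_G − 0.2 G. Substituting M_G = 1.8 (BP − RP) − 1.8 yields an exponent of 1.64 + 0.36 (BP − RP) − 0.2 G, consistent with the rounded constant of 1.7 in the text. The sketch below encodes both forms and the final selection; the function names are illustrative only.

```python
def red_clump_abs_mag(bp_rp):
    """Approximate absolute G magnitude of the (reddened) red clump."""
    return 1.8 * bp_rp - 1.8


def parallax_from_abs_mag(abs_mag, g_mag):
    """Invert M_G = G + 5 log10(parallax / mas) - 10 for the parallax (mas)."""
    return 10.0 ** (2.0 + 0.2 * abs_mag - 0.2 * g_mag)


def red_clump_parallax_mas(bp_rp, g_mag):
    """Rounded expression quoted in the text (constant 1.7 instead of 1.64)."""
    return 10.0 ** (1.7 + 0.36 * bp_rp - 0.2 * g_mag)


def is_bright_red_giant(parallax_mas, bp_rp, g_mag):
    """Final selection: red colour and parallax below the adapted limit."""
    limit_mas = 10.0 ** (1.8 + 0.6 * bp_rp - 0.2 * g_mag)
    return bp_rp > 1.0 and parallax_mas - 0.12 < limit_mas
```

The rounded and unrounded red-clump parallaxes differ by a constant factor of 10^0.06, about 15 per cent, independent of colour and magnitude.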
- MICROLENSING: skewness_mag_g_fov < 0 and confirmation of the signal presence in the Gaia data by the microlensing SOS module.
- RCB: visual inspection and selection were applied to this rare class.
- ROAP: std_dev_mag_g_fov above the median standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E).
- RR Lyrae stars:
  (a) RRAB, RRD: std_dev_mag_g_fov > 0.056 mag;
  (b) RRC: std_dev_mag_g_fov > 0.056 mag, first overtone period > 0.2 d, and, for median_mag_g_fov < 16.5 mag, | std_dev_mag_bp/std_dev_mag_rp − 1 | > 0.1, where the latter was meant to remove a typical feature of eclipsing binaries, when sources had sufficient signal-to-noise ratio in the GBP and GRP bands;
  (c) a lower minimum threshold of 0.03 mag was applied to std_dev_mag_g_fov for hundreds of RR Lyrae stars that should not be missed.
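The RRC conditions in item (b) combine three tests, the last of which applies only to sources brighter than 16.5 mag in G. A minimal sketch follows, with a hypothetical function name and the thresholds taken from the text; the exception in item (c), which lowers the variability threshold to 0.03 mag for selected stars, is not modelled.

```python
def passes_rrc_selection(std_dev_mag_g_fov, first_overtone_period_d,
                         median_mag_g_fov, std_dev_mag_bp, std_dev_mag_rp):
    """RRC training-set filter as read from item (b) of the RR Lyrae entry."""
    if std_dev_mag_g_fov <= 0.056 or first_overtone_period_d <= 0.2:
        return False
    if median_mag_g_fov < 16.5:
        # A BP/RP scatter ratio close to 1 is treated in the text as a
        # typical feature of eclipsing binaries, so such sources are removed.
        return abs(std_dev_mag_bp / std_dev_mag_rp - 1.0) > 0.1
    # Fainter sources are not tested on the BP/RP ratio.
    return True
```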
-
RS:
-
(a)
cross-match angular distance < 0.1 parallax + 0.2 mas, for sources in Chen et al. (2020), Eker et al. (2008), and Watson et al. (2006);
-
(b)
removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
-
SN: std_dev_mag_g_fov > 0.1 mag.
-
Solar-like stars:
-
(a)
FLARES:
-
cross-match angular distance less than 0.1 parallax + 0.2 mas, for sources in Shibayama et al. (2013), Walkowicz et al. (2011), and Wu et al. (2015);
-
removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
-
(b)
ROT:
-
cross-match angular distance less than 0.1 parallax+0.4 mas and 0.1 parallax+0.6 mas, for sources in Distefano (catalogue GAIA_ROT_GAIA_2017 in Gavras et al. 2023) and Watson et al. (2006), respectively;
-
removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
-
(c)
SOLAR_LIKE:
-
cross-match angular distance less than 0.05 parallax + 0.25 and 0.1 parallax + 0.7 mas, for sources in Medhi et al. (2007) and Žerjal et al. (2017), respectively;
-
removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
-
White dwarfs:
-
(a)
GWVIR: std_dev_mag_g_fov above the 85th percentile of the standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources (see Appendix E), for sources in Eyer et al. (2020);
-
(b)
ELM_ZZA: absolute G magnitude greater than 5 mag.
-
Young stellar objects:
-
(a)
DIP, WTTS:
-
parallax > 1 mas;
-
removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
-
(b)
HAEBE:
-
median_mag_g_fov < 16 mag;
-
removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
-
(c)
UXOR: removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
-
(d)
TTS: cross-match angular distance less than 0.1 parallax + 0.7 mas, for sources in Varga-Verebélyi et al. (2020) and Watson et al. (2006).
-
(e)
YSO:
-
parallax > 0.9 mas;
-
| Galactic latitude | < 30°;
-
cross-match angular distance less than 0.05 parallax + 0.65 mas, for sources in Varga-Verebélyi et al. (2020) and Watson et al. (2006);
-
removal of outliers with respect to the GBP − G versus G − GRP relation followed by all other sources of this class.
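The empirical red giant selection quoted for long-period variables at the start of this list can be sketched in a few lines of Python. This is only an illustration of the two inequalities given in the text: the function names are hypothetical, parallaxes are assumed to be in mas and magnitudes in mag, and the example values are invented.

```python
def red_clump_parallax(bp_rp, g_mag):
    """Approximate parallax (mas) of a reddened red-clump star of given colour and G magnitude."""
    return 10.0 ** (1.7 + 0.36 * bp_rp - 0.2 * g_mag)


def is_bright_red_giant(parallax, median_mag_bp, median_mag_rp, median_mag_g_fov):
    """Apply the colour cut and the adapted parallax threshold quoted in the text."""
    bp_rp = median_mag_bp - median_mag_rp
    if bp_rp <= 1.0:  # colour cut: median_mag_bp - median_mag_rp > 1 mag
        return False
    # Adapted threshold that also admits red giant branch stars fainter than the red clump.
    threshold = 10.0 ** (1.8 + 0.6 * bp_rp - 0.2 * median_mag_g_fov)
    # The 0.12 mas offset tolerates parallax noise (e.g. for Magellanic Cloud stars).
    return parallax - 0.12 < threshold


# Invented values, for illustration only.
print(is_bright_red_giant(0.02, 16.0, 14.0, 15.0))  # distant red star
print(is_bright_red_giant(5.0, 16.0, 14.0, 15.0))   # nearby red star of the same colour
```

The rounded constant 1.7 in the red clump relation follows from the distance modulus, which gives 1.64 for an absolute magnitude of 1.8 (median_mag_bp − median_mag_rp) − 1.8 mag.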
Appendix B: Classification attributes
The classification attributes selected to characterise training set sources for classifier models are listed in terms of parameters in the vari_summary table, unless a different table is mentioned or prepended to field names:
-
the Abbe value (abbe_mag_g_fov) of FoV transit magnitudes in the G band;
-
the astrometry-based luminosity (Arenou & Luri 1999) as gaia_source.parallax 10^(0.2 median_mag_g_fov − 2);
-
the possibly reddened colour index GBP − GRP, estimated by median_mag_bp − median_mag_rp;
-
the possibly reddened colour index G − GRP, estimated by median_mag_g_fov − median_mag_rp;
-
the sample-size unbiased unweighted variance and kurtosis (central moments) of FoV-transit magnitudes in the G band, denoised assuming Gaussian uncertainties (Rimoldini 2014);
-
the duration of the time series (time_duration_g_fov), from the first to the last FoV transit in the G band;
-
the unweighted 95th percentile of magnitude changes per time interval between successive FoV transits in the G band;
-
the qso_variability and non_qso_variability parameters from Butler & Bloom (2011), computed from FoV-transit magnitudes in the G band, after adaptations to the Gaia data (these values are published only in the vari_agn table, see Carnerero et al. 2023);
-
the ratio between the sample-size biased unweighted standard deviation of FoV-transit magnitudes in the G band and the root-mean-square of the corresponding uncertainties (std_dev_over_rms_err_mag_g_fov);
-
the square root of the sample-size unbiased unweighted variance (std_dev_mag_g_fov) of FoV-transit magnitudes in the G band;
-
the source parallax (gaia_source.parallax);
-
the Pearson correlation coefficient for the magnitudes of FoV transits in the GBP and GRP bands;
-
the sample-size unbiased unweighted skewness moment of FoV transit magnitudes in the G band, standardised by the variance of such measurements (skewness_mag_g_fov);
-
the ratio between the third spectral shape coefficients (footnote 6) in the GBP and GRP bands;
-
the ratio between the standard deviations in magnitude of FoV transit observations in the GBP and GRP bands, that is std_dev_mag_bp / std_dev_mag_rp;
-
the single-band Stetson variability index (Stetson 1996) computed from FoV transit magnitudes in the G band, pairing observations within 0.1 days (stetson_mag_g_fov);
-
a Wesenheit-like magnitude of FoV transits in the G band as median_mag_g_fov−2(median_mag_bp−median_mag_rp);
-
parameters derived from the least-squares periodogram (Heck et al. 1985; Zechmeister & Kürster 2009) configured as described in Sect. 10.2.3 of the Gaia DR3 documentation (Rimoldini et al. 2022):
-
(a)
the top frequencies (corresponding to the highest periodogram amplitudes) in the frequency ranges 0.1–1 and 1–25 d^−1;
-
(b)
the signal detection efficiencies (the difference between the maximum and mean periodogram amplitudes, divided by the standard deviation of such amplitudes) in the frequency ranges 0.1–1 and 1–25 d^−1;
-
(c)
the false alarm probabilities (Baluev 2009) of the top frequencies in the ranges 0.0007–0.1, 0.1–1, and 1–25 d^−1;
-
(d)
the highest periodogram amplitudes in the frequency ranges 0.0007–0.1, 0.1–1, and 1–25 d^−1.
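Three of the attributes listed above are given as explicit formulas and can be sketched as follows. This is a minimal illustration, not the pipeline code: the function names are hypothetical, the parallax is assumed to be in mas, and the use of the population standard deviation in the signal detection efficiency is an assumption, since the text does not specify the estimator.

```python
import statistics


def astrometric_luminosity(parallax, median_mag_g_fov):
    """parallax * 10^(0.2 * G - 2); equals 10^(0.2 * M_G) for a parallax in mas."""
    return parallax * 10.0 ** (0.2 * median_mag_g_fov - 2.0)


def wesenheit_like_magnitude(median_mag_g_fov, median_mag_bp, median_mag_rp):
    """G - 2 * (BP - RP), as defined in the attribute list."""
    return median_mag_g_fov - 2.0 * (median_mag_bp - median_mag_rp)


def signal_detection_efficiency(amplitudes):
    """(max - mean) / standard deviation of the periodogram amplitudes.

    Assumption: population standard deviation; requires non-constant amplitudes.
    """
    mean = statistics.fmean(amplitudes)
    return (max(amplitudes) - mean) / statistics.pstdev(amplitudes)


# Invented values: a star at 10 pc (parallax of 100 mas) with G = 5 mag has M_G = 5 mag.
print(astrometric_luminosity(100.0, 5.0))
print(wesenheit_like_magnitude(15.0, 16.0, 14.5))
print(signal_detection_efficiency([1.0, 1.0, 1.0, 5.0]))
```

Unlike the absolute magnitude, the astrometry-based luminosity remains defined for zero or negative parallaxes, which is convenient for a classification attribute.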
Appendix C: Additional class labels
Class labels that were not targeted for publication in Gaia DR3 but that appear among the false positives in Table 3 are defined as follows (consistently with Gavras et al. 2023):
-
CST: non-variable object such as a constant star, including formerly suspected variables whose variability was not detected in subsequent observations;
-
HMXB: high-mass X-ray binary system with a massive star and a compact companion;
-
L: slow irregular variable or insufficiently studied object that could belong to other classes (such as SR);
-
PCEB: post-common envelope binary (or pre-cataclysmic variable);
-
RAD_VEL_VAR: object with variable radial velocity;
-
SB: spectroscopic binary.
Appendix D: Sample ADQL queries
Source identifiers and classification scores of AGN and SN candidates can be queried as follows.
SELECT source_id, best_class_name, best_class_score FROM gaiadr3.vari_classifier_result WHERE best_class_name in ('AGN', 'SN')
Source identifiers and classification scores of GALAXY candidates can be queried as follows.
SELECT source_id, vari_best_class_name, vari_best_class_score FROM gaiadr3.galaxy_candidates WHERE vari_best_class_name = 'GALAXY'
All possible candidates of RR Lyrae and Cepheid classes can be retrieved from the union of classification and of the corresponding SOS modules as illustrated in the following query.
SELECT s.source_id, ra, dec, best_classification AS RR_class, type_best_classification AS CEP_class, best_class_name AS CLS_class FROM gaiadr3.gaia_source AS s LEFT OUTER JOIN gaiadr3.vari_rrlyrae AS rr ON s.source_id=rr.source_id LEFT OUTER JOIN gaiadr3.vari_cepheid AS cep ON s.source_id=cep.source_id LEFT OUTER JOIN gaiadr3.vari_classifier_result AS cls ON s.source_id=cls.source_id WHERE best_class_name in ('RR', 'CEP') OR rr.source_id IS NOT NULL OR cep.source_id IS NOT NULL
Classified source identifiers of objects that are in ‘common’, ‘extra’, ‘other’, or ‘missed’ with respect to SOS modules (see Table 2) can be retrieved as shown in the following examples (assuming the AGN class).
-
Classified sources in ‘common’ with the SOS module:
SELECT c.source_id FROM gaiadr3.vari_classifier_result AS c INNER JOIN gaiadr3.vari_agn AS x ON c.source_id = x.source_id WHERE best_class_name = 'AGN'
-
‘Extra’ classified sources with respect to the SOS module:
SELECT c.source_id FROM gaiadr3.vari_classifier_result AS c LEFT JOIN gaiadr3.vari_agn AS x ON c.source_id = x.source_id WHERE best_class_name = 'AGN' AND x.source_id IS NULL
-
Sources in the SOS module but classified as ‘other’ classes:
SELECT c.source_id, best_class_name FROM gaiadr3.vari_classifier_result AS c RIGHT JOIN gaiadr3.vari_agn AS x ON c.source_id = x.source_id WHERE best_class_name != 'AGN'
-
Sources in the SOS module but ‘missed’ by classification:
SELECT x.source_id FROM gaiadr3.vari_classifier_result AS c RIGHT JOIN gaiadr3.vari_agn AS x ON c.source_id = x.source_id WHERE c.source_id IS NULL
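The four categories retrieved by the joins above correspond to simple set operations, which may help when the two tables have already been downloaded. The sketch below assumes that the classifier results are held in a dictionary from source_id to best_class_name and the SOS module in a set of source_id values; the function name and the identifiers in the example are hypothetical.

```python
def compare_with_sos(classified, sos_ids, target="AGN"):
    """Split sources into 'common', 'extra', 'other', and 'missed' for one class.

    classified: dict mapping source_id -> best_class_name (classifier results)
    sos_ids:    set of source_id values present in the corresponding SOS table
    """
    as_target = {sid for sid, name in classified.items() if name == target}
    return {
        "common": as_target & sos_ids,   # INNER JOIN, classified as target
        "extra": as_target - sos_ids,    # LEFT JOIN, no SOS counterpart
        "other": {sid for sid in sos_ids
                  if sid in classified and classified[sid] != target},
        "missed": sos_ids - set(classified),  # RIGHT JOIN, no classification
    }


# Invented identifiers, for illustration only.
result = compare_with_sos({1: "AGN", 2: "AGN", 3: "RR", 4: "ECL"}, {2, 3, 5})
print(result)
```

Source 4 in the example falls in none of the categories, because it is neither classified as the target class nor present in the SOS module.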
Appendix E: Common diagrams for all classes
Sources from training and classification results are shown for each class in different diagrams, on top of a set of background sources for reference purposes (depicted in grey). Such diagrams are described in terms of vari_summary parameters (unless stated otherwise) and labelled according to the following items:
-
(a)
sky maps in Aitoff projection, in Galactic coordinates (with Galactic longitude of zero at the centre and increasing towards the left);
-
(b)
G versus GBP − GRP colour–magnitude diagrams as median_mag_g_fov vs median_mag_bp − median_mag_rp;
-
(c)
GBP − G versus G − GRP colour–colour diagrams as median_mag_bp − median_mag_g_fov versus median_mag_g_fov − median_mag_rp;
-
(d)
absolute G magnitude versus the reddened GBP − GRP colour for observational Hertzsprung–Russell diagrams as median_mag_g_fov+5[1 + log10(gaia_source.parallax/1000)] versus median_mag_bp − median_mag_rp (for sources with gaia_source.parallax_over_error > 5);
-
(e)
time series standard deviation as a function of magnitude as std_dev_mag_g_fov versus median_mag_g_fov (with a white curve illustrating the third quartile of the standard deviations in G, in 0.05 mag intervals, of 1.6 million reference sources, defined in subsequent paragraphs of this Appendix);
-
(f)
metrics targeting non-periodic variations, such as skewness_mag_g_fov versus abbe_mag_g_fov;
-
(g)
metrics targeting periodic variations of pulsating stars, such as log10 (std_dev_mag_bp/std_dev_mag_rp) versus median_mag_g_fov.
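The coordinates of the observational Hertzsprung–Russell diagram in item (d) can be computed as in the sketch below. The function name is hypothetical, the parallax is assumed to be in mas, and no extinction correction is applied, so the colour remains reddened as in the text.

```python
import math


def hr_diagram_point(median_mag_g_fov, median_mag_bp, median_mag_rp,
                     parallax, parallax_over_error):
    """Return (colour, absolute G magnitude), or None if the source is not plotted.

    Only sources with parallax_over_error > 5 are kept, as stated in item (d).
    """
    if parallax_over_error <= 5.0 or parallax <= 0.0:
        return None
    colour = median_mag_bp - median_mag_rp
    abs_mag = median_mag_g_fov + 5.0 * (1.0 + math.log10(parallax / 1000.0))
    return colour, abs_mag


# Invented values: at 10 pc (parallax of 100 mas) apparent and absolute magnitudes coincide.
print(hr_diagram_point(10.0, 11.0, 9.5, 100.0, 50.0))
print(hr_diagram_point(18.0, 19.0, 17.0, 0.3, 2.0))
```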
Training sources are illustrated with red points, with darker shades corresponding to higher number density, while classification results are colour-coded by best_class_score (in the vari_classifier_result table).
Classification results (in the vari_classifier_result table) include additional plots on:
-
(a)
completeness versus contamination, colour-coded by the minimum best_class_score;
-
(b)
F1 score versus the minimum best_class_score;
-
(c)
sample light curves in the G band as a function of time or phase (folded by the most significant period that was published in the corresponding SOS module, in absence of which the literature period was used), after the application of the operators described in Sect. 10.2.3 of the Gaia DR3 documentation (Rimoldini et al. 2022) and Sect. 3.1 of Eyer et al. (2023).
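The curves in items (a) and (b) can be reproduced schematically as follows. The sketch assumes the usual definitions, which should be checked against the main text: completeness as the fraction of true class members recovered, contamination as the fraction of selected sources that are not true members, and the F1 score as the harmonic mean of completeness and (1 − contamination). Function names and input values are hypothetical.

```python
def metrics(candidates, n_true_total, min_score):
    """Completeness, contamination, and F1 score for best_class_score >= min_score.

    candidates:   list of (best_class_score, is_true_member) for sources assigned to the class
    n_true_total: number of true members of the class in the reference sample
    """
    selected = [is_true for score, is_true in candidates if score >= min_score]
    true_pos = sum(selected)
    false_pos = len(selected) - true_pos
    completeness = true_pos / n_true_total if n_true_total else 0.0
    contamination = false_pos / len(selected) if selected else 0.0
    precision = 1.0 - contamination if selected else 0.0
    denom = precision + completeness
    f1 = 2.0 * precision * completeness / denom if denom else 0.0
    return completeness, contamination, f1


def best_min_score(candidates, n_true_total):
    """Minimum score that maximises the F1 score, scanning the observed scores."""
    thresholds = sorted({score for score, _ in candidates})
    return max(thresholds, key=lambda t: metrics(candidates, n_true_total, t)[2])


# Invented example: five candidates, four true members in the reference sample.
cands = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
print(metrics(cands, 4, 0.0))
print(best_min_score(cands, 4))
```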
For all but the observational Hertzsprung–Russell diagrams, about 1.6 million reference background sources (depicted in grey) were selected by randomly sampling sources over the full magnitude range, with an upper limit of 6000 objects per 0.05 mag interval, and then by filtering out sources with fewer than five FoV transits in the G band and those without any measurement in both GBP and GRP.
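The selection of reference sources described above amounts to a capped random sampling per magnitude bin followed by a quality filter. The sketch below illustrates the idea under stated assumptions: sources are plain dictionaries keyed by vari_summary field names, the function name and random seed are hypothetical, and "without any measurement in both GBP and GRP" is read as lacking measurements in both bands at once.

```python
import math
import random
from collections import defaultdict


def sample_reference_sources(sources, bin_width=0.05, max_per_bin=6000,
                             min_transits=5, seed=0):
    """Cap the number of sources per magnitude bin, then apply the quality filter."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for src in sources:
        bins[math.floor(src["median_mag_g_fov"] / bin_width)].append(src)

    sampled = []
    for key in sorted(bins):
        members = bins[key]
        if len(members) > max_per_bin:
            members = rng.sample(members, max_per_bin)  # random subset of the bin
        sampled.extend(members)

    # Filtering happens after sampling, following the order given in the text.
    return [
        src for src in sampled
        if src["num_selected_g_fov"] >= min_transits
        and (src["num_selected_bp"] > 0 or src["num_selected_rp"] > 0)
    ]


# Invented catalogue: 100 sources in one bin and 3 in another, capped at 10 per bin.
catalogue = [{"median_mag_g_fov": 15.01, "num_selected_g_fov": 10,
              "num_selected_bp": 5, "num_selected_rp": 5} for _ in range(100)]
catalogue += [{"median_mag_g_fov": 18.02, "num_selected_g_fov": 10,
               "num_selected_bp": 5, "num_selected_rp": 5} for _ in range(3)]
print(len(sample_reference_sources(catalogue, max_per_bin=10)))
```

Because the filter is applied after the cap, the final number of sources per bin can be smaller than the cap even in densely populated bins.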
For the observational Hertzsprung–Russell diagrams, about 4.25 million background sources were derived from the following concurrent conditions on parameters available in the gaiadr3.gaia_source table:
parallax_over_error > 10
ruwe < 1.2
visibility_periods_used > 11
phot_g_mean_flux > 0
phot_bp_mean_flux_over_error > 10
phot_rp_mean_flux_over_error > 10
and in table gaiadr3.vari_summary (including sources with unpublished values for the following fields):
num_selected_bp > 10
num_selected_rp > 10;
in order to limit the number of sources, their distribution in log(parallax) was binned into 100 intervals and evened out by random source sampling (for more details, see Gavras et al. 2023).
The diagrams presented in this Appendix represent only a selection of the verification metrics employed during training and classification assessment. Nevertheless, these figures often capture salient characteristics that are peculiar to each class. Figures related to the training set might include sources with unpublished photometric time series, as classification results did not necessarily include all training objects and not all classified sources were included in the Gaia archive (Babusiaux et al. 2023).
Fig. E.1. ACV|CP|MCP|ROAM|ROAP|SXARI: 1572 training sources.
Fig. E.2. ACV|CP|MCP|ROAM|ROAP|SXARI: 10 779 classified sources.
Fig. E.3. ACV|CP|MCP|ROAM|ROAP|SXARI: completeness, contamination, F1-score, and sample light curves. The dashed lines indicate the maximum completeness (with minimum best_class_score of zero) in panel (a) and the minimum best_class_score that maximises the F1-score (for an optimal balance between completeness and contamination) in panel (b). For all period-folded light curves, times are colour-coded according to the same legend as shown in panel (c).
Fig. E.4. ACYG: 59 training sources.
Fig. E.5. ACYG: 329 classified sources.
Fig. E.7. AGN: 3089 training sources.
Fig. E.8. AGN: 1 035 207 classified sources.
Fig. E.10. BCEP: 173 training sources.
Fig. E.11. BCEP: 1475 classified sources.
Fig. E.12. Same as Fig. E.3, but for BCEP.
Fig. E.13. BE|GCAS|SDOR|WR: 3546 training sources.
Fig. E.14. BE|GCAS|SDOR|WR: 8560 classified sources.
Fig. E.15. Same as Fig. E.3, but for BE|GCAS|SDOR|WR.
Fig. E.16. CEP: 4448 training sources.
Fig. E.17. CEP: 16 141 classified sources.
Fig. E.18. Same as Fig. E.3, but for CEP: (c) anomalous Cepheid, (d) δ Cephei, (e) BL Herculis, and (f) W Virginis stars.
Fig. E.19. CV: 1815 training sources.
Fig. E.20. CV: 7306 classified sources.
Fig. E.21. Same as Fig. E.3, but for CV.
Fig. E.22. DSCT|GDOR|SXPHE: 4259 training sources.
Fig. E.23. DSCT|GDOR|SXPHE: 748 058 classified sources.
Fig. E.24. Same as Fig. E.3, but for DSCT|GDOR|SXPHE.
Fig. E.25. ECL: 6360 training sources.
Fig. E.26. ECL: 2 184 356 classified sources.
Fig. E.27. Same as Fig. E.3, but for ECL.
Fig. E.28. ELL: 2864 training sources.
Fig. E.29. ELL: 65 300 classified sources.
Fig. E.30. Same as Fig. E.3, but for ELL.
Fig. E.31. EP: 66 training sources.
Fig. E.32. EP: 214 classified sources.
Fig. E.33. Same as Fig. E.3, but for EP.
Fig. E.34. LPV: 5353 training sources.
Fig. E.35. LPV: 2 325 775 classified sources.
Fig. E.36. Same as Fig. E.3, but for LPV: (c,d) C-rich and (e,f) O-rich long-period variables.
Fig. E.37. MICROLENSING: 116 training sources.
Fig. E.38. MICROLENSING: 254 classified sources.
Fig. E.39. Same as Fig. E.3, but for MICROLENSING.
Fig. E.40. RCB: 69 training sources.
Fig. E.41. RCB: 153 classified sources.
Fig. E.42. Same as Fig. E.3, but for RCB.
Fig. E.43. RR: 6377 training sources.
Fig. E.44. RR: 297 778 classified sources.
Fig. E.45. Same as Fig. E.3, but for RR: (c,d) fundamental-mode, (e) first overtone, and (f) double-mode RR Lyrae stars.
Fig. E.46. RS: 2548 training sources.
Fig. E.47. RS: 742 263 classified sources.
Fig. E.48. Same as Fig. E.3, but for RS.
Fig. E.49. S: 1965 training sources.
Fig. E.50. S: 512 005 classified sources.
Fig. E.51. Same as Fig. E.3, but for S, including an eclipsing binary with halved period in panel (e).
Fig. E.52. SDB: 62 training sources.
Fig. E.53. SDB: 893 classified sources.
Fig. E.54. Same as Fig. E.3, but for SDB. Given the multiple significant periods of SDB stars, only time series are shown in panels (c)–(f).
Fig. E.55. SN: 86 training sources.
Fig. E.56. SN: 3029 classified sources.
Fig. E.57. Same as Fig. E.3, but for SN.
Fig. E.58. SOLAR_LIKE: 2628 training sources.
Fig. E.59. SOLAR_LIKE: 1 934 844 classified sources.
Fig. E.60. Same as Fig. E.3, but for SOLAR_LIKE.
Fig. E.61. SPB: 149 training sources.
Fig. E.62. SPB: 1228 classified sources.
Fig. E.63. Same as Fig. E.3, but for SPB.
Fig. E.64. SYST: 316 training sources.
Fig. E.65. SYST: 649 classified sources.
Fig. E.66. Same as Fig. E.3, but for SYST.
Fig. E.67. WD: 1075 training sources.
Fig. E.68. WD: 910 classified sources.
Fig. E.69. Same as Fig. E.3, but for WD. Given the multiple significant periods of WD variables, only time series are shown in panels (c)–(f).
Fig. E.70. YSO: 5148 training sources.
Fig. E.71. YSO: 79 375 classified sources.
Fig. E.72. Same as Fig. E.3, but for YSO.
Fig. E.73. GALAXY: 3116 training sources.
Fig. E.74. GALAXY: 2 451 364 classified sources.
Fig. E.75. Same as Fig. E.3, but for GALAXY (light curves are limited to the subset with published photometric time series in the Gaia Andromeda Photometric Survey; Evans et al. 2023).
Fig. E.76. GALAXY (subset of candidates in the Gaia Andromeda Photometric Survey; Evans et al. 2023): 7579 classified sources.
All Tables
Classification training classes (see Sect. 3.1.1 for class label definitions), with the specification of their components (whose approximate representation is indicated in brackets when greater than 500; see Sects. 3.1.2 and 3.1.3 for details on the creation of class subsets), the number of training sources NTRN, and references.
Statistics of classification results for each class (source counts, classification and F1 scores, completeness and contamination rates) and their distribution with respect to the corresponding SOS modules (sources in ‘common’ between the ones classified as a given class and the corresponding SOS module, ‘extra’ sources in classification, sources classified as ‘other’ classes or ‘missed’ by classification).
Completeness and contamination details of classification results for each class (the class group ACV|CP|MCP|ROAM|ROAP|SXARI was abbreviated as ACV|CP|...|SXARI) (this table is continued on the next page).
All Figures
Fig. 1. Number of sources per class, sorted in decreasing order. The bars shaded in yellow correspond to variability types published in the vari_classifier_result table, while galaxies, identified by their artificial photometric variations in Gaia, are published exclusively in galaxy_candidates. The ACV|CP|MCP|ROAM|ROAP|SXARI class group is abbreviated as ACV|CP|...|SXARI.
Fig. 2. Number of the least-sampled sources in 0.2 mag intervals as a function of the median G-band magnitude. The number of variable sources (in the vari_classifier_result table of the Gaia archive) is colour-coded by the maximum number of selected FoV observations in the G band (num_selected_g_fov up to 10, 15, and 20) as shown in the legend (with orange, green, and blue colours, respectively). The bars shaded in grey refer to the same conditions but for the galaxies identified by their artificial variability (published in the galaxy_candidates table), including unpublished values of num_selected_g_fov for galaxies outside the Gaia Andromeda Photometric Survey. The white vertical line marks the median G magnitude of 20.7. The fields num_selected_g_fov and median_mag_g_fov are published in the vari_summary table.
Fig. 3. Sky map of the least-sampled sources in Galactic coordinates (white grid), colour-coded as in Fig. 2 for variable sources with num_selected_g_fov up to 10, 15, and 20. Galactic longitude is zero at the centre and increases towards the left. The thin line in black denotes the Ecliptic.
Fig. 4. Distribution of parallax (a) and proper motion components, pmra along the right ascension (b) and pmdec along the declination (c), normalised by the corresponding uncertainties and binned in intervals of 0.02, for the 1 034 925 AGN candidates with at least five-parameter astrometric solutions, for the top-150 000 candidates (denoted as high scores; best_class_score > 0.8995154), and for the bottom-150 000 candidates (denoted as low scores; best_class_score < 0.1615997), colour-coded as indicated in the legend. The bias from the inclusion of parallax among classification attributes is evident in panel a (see Sect. 4.3 for details).
Fig. E.6. Same as Fig. E.3, but for ACYG.
Fig. E.9. Same as Fig. E.3, but for AGN.