Gaia Data Release 3: The first Gaia catalogue of eclipsing-binary candidates

Context. Gaia Data Release 3 (DR3) provides a number of new data products that complement the early DR3 made available two years ago. Among these is the first Gaia catalogue of eclipsing-binary candidates, containing 2 184 477 sources with brightnesses from a few magnitudes to 20 mag in the Gaia G band and covering the full sky.


Introduction
Most stars are in binary systems and a fraction of them appear to an observer as eclipsing. These eclipsing systems allow us, under certain conditions, to determine fundamental parameters of stars, such as mass and radius, together with the orbital parameters. They are stringent tests for stellar evolution when the two stars are in wide systems, while they are laboratories for many physical processes when the two stars interact with one another. Some eccentric systems can also serve as a test of the theory of general relativity thanks to the determination of their apsidal motion. In addition, when one of the components is oscillating and provides suitable conditions to perform asteroseismology, the system provides an independent determination of stellar parameters and tests for the asteroseismic scaling relations.
Corresponding author: N. Mowlavi (Nami.Mowlavi@unige.ch)
Clearly, eclipsing binaries are exceptionally interesting objects for astronomy. Still, the number of well-studied cases is relatively small. For example, the catalogue of well-studied systems presented by Southworth (2015) contains 170 binaries, based on an initial compilation of 45 eclipsing binaries by Andersen (1991).
With the advent of large-scale multi-epoch ground-based photometric surveys, pioneered by the microlensing-search tailored 'Expérience pour la recherche d'objets sombres' (EROS1; Renault et al. 1998), the 'Massive compact halo object' experiment (MACHO; Alcock et al. 1997), and the 'Optical gravitational lensing experiment' (OGLE2; Udalski et al. 1992), the opportunities to find eclipsing binaries increased dramatically. The precursor of Gaia, HIPPARCOS, already provided an all-sky survey of eclipsing binaries (ESA 1997). The number of eclipsing binaries was rather limited, about 900 among 11 597 detected variables (from 118 218 monitored stars), yet it contained ∼30% new candidates. Before this Gaia Data Release 3 (DR3), the largest catalogue specifically dedicated to eclipsing binaries came from the OGLE4 survey team with the publication of 40 204 sources in the Large Magellanic Cloud (LMC) and 8 401 sources in the Small Magellanic Cloud (SMC; Pawlak et al. 2016), and 450 598 sources towards the Galactic Bulge (Soszyński et al. 2016). In parallel, multiple other large-scale multi-epoch surveys provide additional opportunities with automated classification of their variable stars. Such is the case, for example, for (number of eclipsing binaries given in parentheses) the Trans-Atlantic Exoplanet Survey (TRES; Devor et al. 2008, 773), the All Sky Automated Survey (ASAS; Pojmanski 2002; Pigulski et al. 2009, 1055 and 180, respectively), the Lincoln Near-Earth Asteroid Research survey (LINEAR; Palaversa et al. 2013, 2700), the EROS2 survey (Kim et al. 2014, ∼45 600), the CATALINA survey (Drake et al. 2017, 23 312), the Asteroid Terrestrial-impact Last Alert System survey (ATLAS; Heinze et al. 2018, ∼110 000), or the Zwicky Transient Facility survey (Chen et al. 2020, ∼350 000). The catalogue of variable stars made available by the American Association of Variable Star Observers (AAVSO) through their Variable Star Index (VSX) database also provides a wealth of data for the study of eclipsing binaries (Watson et al. 2006).
Space missions dedicated to exoplanet search provide another source of data for the study of eclipsing binaries. Their strengths come from continuous, high-cadence observation over long time scales, combined with the high photometric precision that can be obtained from space. Catalogues dedicated to eclipsing binaries from these missions include, for example, Kirk et al. (2016) for Kepler (2878 candidates including ellipsoidal variables) and Prša et al. (2022) for the Transiting Exoplanet Survey Satellite (TESS; 4584 eclipsing binaries). They are, however, limited in terms of sky coverage and/or brightness range.
The Gaia space mission from the European Space Agency (ESA) offers a new opportunity to study eclipsing binaries. Launched at the end of 2013, this all-sky survey started its nominal mission in July 2014 (Gaia Collaboration et al. 2016). Among the strong points of the mission for variability analysis, we can mention, in addition to its well-known astrometric capabilities, the large dynamic range reached in stellar brightness, from a few magnitudes to fainter than 20 mag, the specific scanning law leading to irregularly sampled time series, and the quasi-simultaneity (within tens of seconds) of the observations in G photometry, $G_{\rm BP}$ and $G_{\rm RP}$ spectrophotometry, and RVS (Radial Velocity Spectrometer) spectroscopy. Data products based on 34 months of astrometry and photometry data have been released in the early Data Release 3 (EDR3, 3 December 2020; Gaia Collaboration et al. 2021; Riello et al. 2021). These have been complemented with numerous additional data products in DR3 (13 June 2022; Gaia Collaboration et al. 2022b), including variability catalogues for more than ten million variable objects (Eyer et al. 2022).
This paper presents the first Gaia catalogue of eclipsing binaries, published as part of Gaia DR3. It is the largest such catalogue to date, with more than two million candidates. A balance was reached between completeness and purity. The selection of the eclipsing binaries starts with the classification of variable objects performed within the Gaia Data Processing and Analysis Consortium (DPAC) as described in Rimoldini et al. (2022), followed by a specific eclipsing binary module that automatically selects a geometric two-Gaussian model (see Mowlavi et al. 2017) and orbital period based on the G light curves, after which a final filtering step on various statistical parameters is made. The $G_{\rm BP}$ and $G_{\rm RP}$ time series were not used. The eclipsing binary processing pipeline is described in Sect. 2. In particular, the section describes candidate selection, orbital period search, the two-Gaussian model used to fit the morphology of the G light curves, and the procedure implemented to automate the selection of the best model and orbital period, as well as to derive uncertainties for the determined parameters. Section 2 also details the content of the catalogue. Recommendations for catalogue exploitation using published parameters are given in Sect. 3. The quality of the catalogue is then addressed in Sect. 4, with an estimate of catalogue completeness and an investigation of the new Gaia candidates. Illustrative samples of candidates with good parallaxes are presented in Sect. 5, with a specific application to the period-eccentricity analysis of bright candidates. Section 6 ends the main body of the text with a summary and conclusions.
Additional content is presented in four appendices. Appendix A analyses the various types of two-Gaussian models used to fit the eclipsing binary light curves. Appendix B elaborates on the eccentricity proxy that can be derived from the light curve. Appendix C presents additional figures referenced in the main body of the text. Appendix D completes the acknowledgments.

The catalogue
The 2 184 477 sources published in table gaiadr3.vari_eclipsing_binary (under Variability in the Gaia archive) constitute the Gaia DR3 catalogue of eclipsing binaries. The candidates were selected considering a mixture of various criteria with the goal of reaching a relatively good degree of completeness while limiting the level of contamination. The list of sources in this catalogue is essentially the same as the list of variables identified as eclipsing binaries in the general Gaia DR3 classification table vari_classifier_result (variability type ECL, for details see Rimoldini et al. 2022). Small differences nevertheless exist between the two tables. Nineteen sources are present in the classification table but are not in the catalogue of eclipsing binaries. Periods and light curve characterisation are thus not available for these sources. Conversely, the catalogue of eclipsing binaries contains 140 candidates not listed in the classification table due to a post-processing step of the classification table that modified the label of a small fraction of sources. In this paper, we restrict the analysis to the catalogue of eclipsing binaries.
From the two million eclipsing binary candidates, 86 918 have further been processed within the DPAC to derive orbital solutions. The results are published in table gaiadr3.nss_two_body_orbit (under Non-single stars in the Gaia archive), with nss_solution_type='EclipsingBinary'. We refer to Siopis et al. (2022) for a presentation of that table. In addition, 155 of them have combined photometric + spectroscopic solutions (identified in the table with nss_solution_type='EclipsingSpectro'). We refer to Gaia Collaboration et al. (2022a) for further information.
The distribution on the sky of the eclipsing binary candidates from the catalogue is shown in Fig. 1. The G light curves contain between 16 and 259 cleaned field-of-view measurements, depending on the sky position according to the Gaia scanning law. For each candidate, an orbital period is provided in the catalogue, together with a geometrical characterisation of its G light curve and a global ranking that ranges from 0.4 to 0.84 (Eqs. 4 and 5 in Sect. 2.2), where a higher value indicates a better light curve characterisation. Figure 2 gives the G magnitude distribution for the full catalogue (in black) and for the sub-samples with the highest (>0.6, in green) and lowest (<0.5, in red) global rankings.
Fig. 2. [...] and of the samples with global ranking larger than 0.6 (filled green histogram) and smaller than 0.5 (red spiked histogram). The abscissa scale is truncated at the lower side for better visibility.
The eclipsing binary pipeline is presented in Sects. 2.1 to 2.3. The input to the pipeline is briefly described in Sect. 2.1. The geometrical characterisation of the light curves is detailed in Sect. 2.2, and our post-pipeline selection criteria are presented in Sect. 2.3. The content of the catalogue is summarised in Sect. 2.4.

Eclipsing binary pipeline input
The eclipsing binary module that generated the candidates published here is part of the variability pipeline consisting of several stages described in Eyer et al. (2017, 2022). After a general variability detection performed on all Gaia sources, variable source candidates go through a classification stage (Rimoldini et al. 2022). Sources classified as eclipsing binaries are then fed to our eclipsing binary module.
Not all sources initially classified as eclipsing binaries are published in DR3. An initial selection keeps only sources that are brighter than 20 mag in G, that have at least sixteen cleaned field-of-view measurements in their G light curves, and for which the skewness in the G time series is larger than −0.2. This constitutes ∼20 million sources. The eclipsing binary pipeline then processes the G light curves (as described in Sect. 2.2), and a final selection further filters out sources according to period and folded light curve properties (see Sect. 2.3).
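The three initial cuts can be written as a single predicate. The following is only a minimal sketch: the `SourceSummary` container and its field names are hypothetical, and the choice of a mean G magnitude as the brightness statistic is an assumption, since the text does not specify which statistic is used.

```python
from dataclasses import dataclass


@dataclass
class SourceSummary:
    """Hypothetical per-source summary; field names are illustrative only."""
    g_mag: float        # representative G magnitude (assumed: mean of the time series)
    n_fov: int          # number of cleaned field-of-view measurements in G
    skewness_g: float   # skewness of the G time series (in magnitude)


def passes_initial_selection(src: SourceSummary) -> bool:
    """Apply the three cuts quoted in the text before light-curve modelling."""
    return src.g_mag < 20.0 and src.n_fov >= 16 and src.skewness_g > -0.2
```

Eclipses produce excursions towards fainter (numerically larger) magnitudes, which is consistent with a lower bound on the skewness of the magnitude time series.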

Light curve characterisation
For each eclipsing binary candidate, a geometric model of its G-band light curve is constructed by fitting to the cleaned G-band time series up to two Gaussians and one cosine. The Gaussian components aim at modelling the geometrical light curve shape of the eclipses and the cosine component that of an ellipsoidal-like variability. The model serves the purposes of characterising the geometry of the light curve, of selecting the most probable orbital period based purely on the photometry, and of providing a ranking among all sources.
The 'two-Gaussian' model is introduced in Sect. 2.2.1, and its derived parameters are described in Sect. 2.2.2. The period search method is then presented in Sect. 2.2.3, followed in Sect. 2.2.4 by the procedure to estimate the uncertainty on these parameters. Our final per-source light curve model selection strategy is given in Sect. 2.2.5.

Two-Gaussian model parameters
The geometrical model fitted to the G light curve consists of up to two Gaussians and a cosine. The model can contain any combination of these three components, not all necessarily present. It is called a 'two-Gaussian' model irrespective of the number of components it eventually contains. A full description of the model is given in Mowlavi et al. (2017), to which we refer for more details. We here summarise the model components and associated parameters.

Notes. (a) Cosine function with half the orbital period.

Fig. 3. Schematic representation of the two-Gaussian model parameters used in Eq. (3) to fit folded light curves of eclipsing binaries. The abscissa is the phase and the ordinate represents magnitude in reverse order. Three cases are shown with their primary eclipses (arbitrarily) located at phase 0.2. Case (a) illustrates the modelling of a well-detached eccentric system using two non-overlapping Gaussians. Case (b) shows a very tight circular system modelled with two overlapping Gaussians. Case (c) represents a tight circular system with an out-of-eclipse ellipsoidal variation modelled with a cosine component. The red dashed horizontal line in each panel indicates the value of the constant C in Eq. (3). The green areas delimit the eclipse durations. The thin black dotted lines in the middle and bottom panels show the individual Gaussian and/or cosine components of the two-Gaussian models. The thick black solid lines show the resulting two-Gaussian models.
A Gaussian component k is defined as

$$G_{\mu_k, d_k, \sigma_k}(\varphi) = d_k \exp\left[-\frac{(\varphi - \mu_k)^2}{2\sigma_k^2}\right], \tag{1}$$

where $\varphi$ is the orbital phase, i.e. (observation time − reference time $T_0$) modulo (orbital period), and $\mu_k$, $d_k$, $\sigma_k$ are the Gaussian parameters (phase location of the centre, depth in magnitude, and width in phase, respectively) of the first (k = 1) and second (k = 2) Gaussian, when present. A schematic representation of a model with two Gaussians mimicking a well-detached binary system is shown in the top panel of Fig. 3, while the middle panel illustrates the case of a tighter system modelled with two overlapping Gaussians. We note that there are not always two Gaussians in the models and, when there are two, the first Gaussian is not necessarily the deepest of the two. When a Gaussian component is included, its mirror functions at phases below zero and above one are automatically added to take into account the contribution of the tails of the Gaussian function from adjacent phases due to the periodicity of the eclipses (see Eq. (2) of Mowlavi et al. 2017). This is necessary for a correct inclusion of wide Gaussians.
The cosine component, when present, has a period equal to half of the orbital period. It is given by

$$A_{\rm ell} \cos\left[4\pi(\varphi - \mu_{\rm ell})\right], \tag{2}$$

where $A_{\rm ell}$ is the amplitude of the cosine function. If there are any Gaussian components, $\mu_{\rm ell}$ is either equal to $\mu_1$ or $\mu_2$, depending on whether the cosine is centred on the first or second Gaussian component, respectively. If the model contains only a cosine, $\mu_{\rm ell}$ is fitted to the data as an independent parameter. When all the components are present, the model reads

$$C + G_{\mu_1, d_1, \sigma_1}(\varphi) + G_{\mu_2, d_2, \sigma_2}(\varphi) + A_{\rm ell} \cos\left[4\pi(\varphi - \mu_{\rm ell})\right], \tag{3}$$

where C is the reference level. The list of model types according to the number of components, and the number of parameters for each model type, are summarised in Table 1. We note that this model is adequate to represent eccentric systems only in the absence of a cosine component, and that reflection, which would be described with a cosine component with a period equal to the orbital period, is not included in this first Gaia catalogue of eclipsing binaries.
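As an illustration of how a model light curve can be rebuilt from the published parameters, the sketch below evaluates the sum of a reference level, up to two Gaussians, and an optional cosine in phase. It assumes the functional forms described in this section (a Gaussian in phase scaled by its depth in magnitude, and a cosine with half the orbital period centred on $\mu_{\rm ell}$); the function names and the use of a single mirror on each side are choices made for this example, not the pipeline implementation.

```python
import numpy as np


def periodic_gaussian(phase, mu, depth, sigma, n_mirrors=1):
    """Gaussian dip in magnitude, plus mirror copies shifted by integer phases.

    The mirrors account for Gaussian tails leaking in from adjacent cycles;
    n_mirrors=1 (one copy on each side) is an assumption of this sketch.
    """
    phase = np.asarray(phase, dtype=float)
    total = np.zeros_like(phase)
    for shift in range(-n_mirrors, n_mirrors + 1):
        total += depth * np.exp(-((phase - mu - shift) ** 2) / (2.0 * sigma**2))
    return total


def two_gaussian_model(phase, c, gaussians=(), a_ell=None, mu_ell=None):
    """Model magnitude at the given phases.

    gaussians: iterable of (mu, depth, sigma) tuples, zero to two entries.
    a_ell, mu_ell: cosine amplitude and centre; omit both for no cosine.
    """
    phase = np.asarray(phase, dtype=float)
    mag = np.full_like(phase, c)
    for mu, depth, sigma in gaussians:
        mag += periodic_gaussian(phase, mu, depth, sigma)
    if a_ell is not None:
        mag += a_ell * np.cos(4.0 * np.pi * (phase - mu_ell))
    return mag
```

Because magnitudes increase towards fainter fluxes, positive depths and a positive cosine amplitude both make the system fainter at the eclipse phase.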
All parameters necessary to reconstruct the geometric model are published in the catalogue. They are summarised in Table 2. The orbital period is given as a frequency (to which a frequency uncertainty can be associated, see Sect. 2.2.4). The model component parameters are given in field names prepended with "geom_model_". The reference time used for phase folding is also published.
Article number, page 4 of 40
Notes. (a) Reference time given in barycentric JD in TCB − 2455197.5 d. (b) Null if no Gaussian component in the model. (c) Null if only one Gaussian component in the model. (d) Null if no cosine component with half the period of the geometric model. (e) Null if no cosine component with half the period of the geometric model. Equal to one of the geom_model_gaussian*_phase and associated error if model type contains "_WITH_ELLIPSOIDAL*".

Derived geometric model parameters
In addition to the 'two-Gaussian' model parameters, several parameters are derived from the geometric model and published in the catalogue in field names prepended with "derived_" (see Table 2). These derived parameters give eclipse characteristics (phase location, phase duration, depth) based on the geometric model as given by Eq. (3). The deepest and second deepest eclipse information are stored in the 'primary' and 'secondary' eclipse fields, respectively. We recall that the underlying Gaussian model components 1 and 2 have no specific order.
Derived eclipse parameters are only provided in association with a Gaussian component. A dip in the folded light curve that results from a cosine component and that has no associated Gaussian does not have derived eclipse parameters. Therefore, models containing a cosine and a Gaussian, for example, only have one set of derived eclipse parameters. Only the "derived_primary_*" fields are then filled in the catalogue. Likewise, purely cosine models have no derived parameters.
A&A proofs: manuscript no. GaiaDR3_EB_v3.0
The derived eclipse phase locations are obtained by starting at the centre of the Gaussian (1 or 2) and identifying the closest zero-derivative (flat) point in the light curve, which is not necessarily located at the same positions as the centres of the Gaussians if they are not offset by 0.5 in phase or when there is an ellipsoidal component. The derived eclipse depth is defined as the distance between the model value at the derived primary or secondary eclipse phase and the brightest model value, and the derived eclipse duration in phase is defined as 5.6 $\sigma_k$, $\sigma_k$ being defined in Eq. (1), with a maximum of 0.4 (see Mowlavi et al. 2017). These last two definitions equally apply for models with and without an ellipsoidal component.
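The depth and duration definitions above reduce to two short expressions. The sketch below is illustrative only; the function names are hypothetical, and the model is assumed to have been evaluated on a phase grid fine enough to locate its brightest value.

```python
import numpy as np


def derived_eclipse_duration(sigma, factor=5.6, max_duration=0.4):
    """Eclipse duration in phase: 5.6 times the Gaussian width, capped at 0.4."""
    return min(factor * sigma, max_duration)


def derived_eclipse_depth(model_mag_at_eclipse, model_mags):
    """Depth in magnitude relative to the brightest model value.

    The brightest point is the numerically smallest magnitude of the model,
    here taken from a sampled model light curve (an assumption of this sketch).
    """
    return model_mag_at_eclipse - float(np.min(model_mags))
```

For instance, a Gaussian width of 0.02 in phase gives a duration of about 0.11, while any width above roughly 0.071 saturates at the 0.4 cap.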

Period search
The orbital period is obtained in two steps. First, a list of up to twenty candidate periods is established from the G light curve as described in this section. Two-Gaussian models are then fitted to the light curve for each of these periods, and the best model is selected as described in Sect. 2.2.5. The period of this best model is the orbital period published in the catalogue together with the best model parameters.
Due to the variety of eclipsing binary light curve geometries, we combined the results of three different period search methods to identify the list of candidate periods. The three methods are the Generalised Least-Squares (Heck et al. 1985; Cumming et al. 1999; Zechmeister & Kürster 2009), the Phase Dispersion Minimisation (PDM; Jurkevich 1971; Stellingwerf 1978; Schwarzenberg-Czerny 1997) and the String Length (Lafler & Kinman 1965; Burke et al. 1970) methods. The choice for these three different methods is based on earlier internal tests on HIPPARCOS (ESA 1997) eclipsing binaries showing the largest correct period recovery to be found in the union of this ensemble. The unweighted procedure has been used in all cases because the observations in the eclipses are fainter, and their uncertainties consequently larger, than their corresponding out-of-eclipse values, and they would therefore be down-weighted in a weighted procedure. Periodograms are computed using these three methods in the frequency range between 0.005 and 15 d$^{-1}$ (spanning 1.6 h to 200 d) using a fixed frequency step of $10^{-5}$ d$^{-1}$.
The two most significant peaks in each of the three periodograms are then gathered in a list of candidate frequencies, to which half and twice their values are added for all three methods, as well as one third and four times their values for Generalised Least-Squares. In this way, a set of twenty candidate periods is constructed, some of which might of course overlap between the different methods.
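To give a feel for the size of the search and for what an unweighted periodogram statistic looks like, the sketch below counts the points of the quoted frequency grid and implements a simple Lafler-Kinman-style string-length statistic. This is an assumption-laden illustration: the exact variants of the three methods used in the pipeline are not specified here, and the function names are hypothetical.

```python
import numpy as np

# Frequency grid quoted in the text: 0.005 to 15 per day in steps of 1e-5 per day.
F_MIN, F_MAX, F_STEP = 0.005, 15.0, 1e-5
N_GRID = int(round((F_MAX - F_MIN) / F_STEP)) + 1  # about 1.5 million trial frequencies


def string_length_statistic(times, mags, frequency):
    """Unweighted string-length statistic at one trial frequency (lower is better).

    Sum of squared magnitude differences between phase-adjacent points,
    normalised by the total squared deviation from the mean.
    """
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)
    order = np.argsort((times * frequency) % 1.0)
    m = mags[order]
    diffs = np.diff(np.append(m, m[0]))  # close the loop from phase 1 back to 0
    return np.sum(diffs**2) / np.sum((mags - mags.mean()) ** 2)


def best_frequency(times, mags, frequencies):
    """Trial frequency that minimises the string-length statistic."""
    stats = [string_length_statistic(times, mags, f) for f in frequencies]
    return frequencies[int(np.argmin(stats))]
```

No measurement uncertainties enter the statistic, in line with the unweighted approach described above, so that faint in-eclipse points are not down-weighted.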

Model parameter uncertainties
Due to the often low duty-cycle of eclipsing signals (e.g. down to an adopted minimum of three observations in eclipse), estimation of the uncertainties in our models can be inherently imprecise. As formal errors from the least-squares fit do not capture any modelling errors, we opted for the jackknife method to get a sense of the uncertainties around our best-fit solution parameters.
For this data release, we implemented a Jackknife method with non-robust mean and variance estimates (Wall & Jenkins 2003). Essentially this means that, in order to estimate the uncertainties of the best fit model parameters p (including frequency, reference level, and derived parameters) of a source with N observations $X_{i=1\rightarrow N}$, we re-fit the model N times, where each time one of the observations $X_i$ is left out. Generally, for each re-fit, this recovers a similar, but not identical, parameter solution $p_i$, of which the variance (Eq. 6.20 in Wall & Jenkins 2003) is used to populate the uncertainty estimate. Because instances of the N jackknife re-fits can cause non-convergence, a minimum of 30% converged solutions was required to estimate the uncertainties. If more than 70% of the re-fits failed, the model is rejected from the list of model candidates for the given source (see Sect. 2.2.5). Even though most Jackknife solutions converged, some included wildly large values, which is reflected in some of the published uncertainties. Alternatively, the Jackknife samples showed in some cases too little variation for a good uncertainty estimate, resulting in some near-zero uncertainty estimates. We intend to improve upon that in DR4 by implementing a more robust estimate of the variance.
The Jackknife method described above allows us to estimate the uncertainties of not only the geometric model parameters, but also of the frequency, reference level, and derived parameters. These uncertainties are generally more informative (and larger) than the formal errors obtained from a simple linear covariance estimation at the best-fit parameter set, because the latter does not include any modelling errors and assumes that observation uncertainties are correctly estimated.
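The leave-one-out procedure can be sketched as follows. The variance used here is the standard textbook jackknife form, (n − 1)/n times the sum of squared deviations of the re-fit values from their mean; whether this matches the cited equation term by term, and how non-converged re-fits enter the count n, are assumptions of this example. The function name is hypothetical.

```python
import numpy as np


def jackknife_uncertainty(refit_values, n_obs, min_converged_fraction=0.3):
    """Jackknife standard deviation of one parameter from leave-one-out re-fits.

    refit_values: one value per re-fit; None or non-finite marks non-convergence.
    n_obs: number of observations N, i.e. the number of attempted re-fits.
    Returns None when fewer than 30% of the re-fits converged, in which case
    the model would be dropped from the candidate list.
    """
    vals = np.array(
        [v for v in refit_values if v is not None and np.isfinite(v)], dtype=float
    )
    if vals.size < min_converged_fraction * n_obs:
        return None
    n = vals.size  # assumption: only converged re-fits are counted
    variance = (n - 1) / n * np.sum((vals - vals.mean()) ** 2)
    return float(np.sqrt(variance))
```

Because the mean and the sum of squares are not robust, a single wildly discrepant re-fit inflates the result, which is the behaviour noted above for some published uncertainties.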
As the frequency is among the most important parameters, we applied more stringent checks and bounds on its estimated Jackknife uncertainty. We set frequency_error = MAX(frequency_error, 0.001/time_duration_g_fov), where time_duration_g_fov is the duration between the first and last observations, as published in the gaiadr3.vari_summary table. Additionally we identified that for frequency_error × time_duration_g_fov > 0.6, no correct period is recovered in our literature cross-match. Therefore, all models with a value above this limit have been rejected. These lower and upper bounds on the frequency uncertainty $f_{\rm orb,err}$ correspond to, respectively, 0.1% and 60% phase deviations at the last cycle of the observations with the given period $P_{\rm orb} = 1/f_{\rm orb}$.
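The two bounds amount to a floor and a rejection test on the accumulated phase error, frequency_error × time_duration_g_fov. A minimal sketch follows; the function name is hypothetical and the order of operations (floor first, then rejection) is an assumption.

```python
def bounded_frequency_error(frequency_error, time_duration_g_fov,
                            floor_phase=0.001, max_phase=0.6):
    """Apply the lower bound, then the rejection limit, to a frequency uncertainty.

    frequency_error is in 1/d and time_duration_g_fov in d, so their product
    is the phase drift accumulated over the full time span.
    Returns None when the model would be rejected.
    """
    bounded = max(frequency_error, floor_phase / time_duration_g_fov)
    if bounded * time_duration_g_fov > max_phase:
        return None
    return bounded
```

For a 1000 d time span, the floor is 10⁻⁶ d⁻¹ and any uncertainty above 6 × 10⁻⁴ d⁻¹ leads to rejection.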
The uncertainties on all model parameters are published in the catalogue in field names appended with "_error".They include the geometrical model parameters as well as the orbital frequency and derived parameters.

Model selection strategy
For each of the up-to-twenty candidate periods identified in Sect. 2.2.3, seven two-Gaussian models are fitted to the G light curve by considering all possible combinations of the two-Gaussian components, including a simple constant model in order to do a proper model comparison against a non-variable model. This results in a list of up to 140 model candidates per source, considering the six model types listed in Table 1 and the additional constant model. The models are then cleaned and sorted according to their Bayesian Information Criterion (BIC) score (Feigelson & Babu 2012, Eq. 3.54), which allows us to compare model fits for all combinations of the candidate periods and geometric models, and the best model is selected. These steps are each briefly described in the next paragraphs.
In the first, cleaning step, models having component parameters that we deem non-physical are removed from the list of model candidates. Visual inspection of earlier iterations of our pipeline on Gaia data revealed that the geometric model parameters may model features that we deem non-physical. This is the case when two Gaussian components are too close to each other. We therefore remove a model from the list of model candidates if the derived primary and secondary eclipse locations are distant by less than 0.08 in phase to avoid stacking Gaussians on the same eclipse. We also remove models with one Gaussian if its width is larger than 0.4 in phase, as well as models with one Gaussian and a cosine component if the Gaussian width is larger than 0.4 in phase (as such a wide Gaussian is partially degenerate with the ellipsoidal component). The pipeline also checks the uncertainties of the geometric model parameters, and rejects models that have uncertainties larger than 10 mag for the reference level (C in Eq. 3) or for the cosine amplitude ($A_{\rm ell}$), or larger than one for the phase locations ($\mu_1$, $\mu_2$) or widths ($\sigma_1$, $\sigma_2$) of the Gaussians. No condition is given on the uncertainties of the Gaussian depths as this quantity can be unconstrained for well-detached systems with narrow eclipses.
After this first pruning of models, we order the list of remaining model candidates by their BIC score. In the adopted BIC convention, a higher BIC score identifies a better model fit to the data taking into account the number of free parameters in each model and giving a higher weight to models that have a smaller number of parameters. We then retain all models that have a BIC score within 30 of the highest BIC score. All these model candidates are considered to be equivalently good at this point. This list is then filtered according to several exclusion criteria. We remove the constant model that was added to the list of models, if it remains in the list of model candidates, models that have a phase coverage less than 0.6 (the phase coverage is computed by binning the phase-folded data in ten bins and counting the fraction of filled bins), and models that have fewer than three observations in an eclipse. If multiple models survive at this point, a pre-defined model ranking is used to select the model with the highest rank according to the model ranking indicated in Table 1. It must be noted that this model ranking inevitably introduces priors in the model selection. For example, circular systems with two equal-depth eclipses will be favoured over eccentric systems displaying only one eclipse (these two cases differ by a factor of two in their orbital periods). If no candidate model remains in the final list, the source is removed from the catalogue.
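The selection steps of this paragraph can be sketched as a short filter chain. Everything named here is hypothetical: the `Candidate` container, the integer `rank` standing in for the ranking of Table 1 (lower number meaning higher rank), and the inclusive treatment of the BIC window are assumptions of this example.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Candidate:
    """Hypothetical record for one (period, model type) fit."""
    name: str
    bic: float               # convention of the text: higher is better
    rank: int                # stand-in for the Table 1 ranking (lower = preferred)
    phase_coverage: float
    n_obs_in_eclipse: int
    is_constant: bool = False


def phase_coverage(phases, n_bins=10):
    """Fraction of phase bins that contain at least one observation."""
    folded = np.asarray(phases, dtype=float) % 1.0
    counts, _ = np.histogram(folded, bins=n_bins, range=(0.0, 1.0))
    return np.count_nonzero(counts) / n_bins


def select_model(candidates, bic_window=30.0):
    """Keep fits within the BIC window, apply exclusions, then pick by rank."""
    if not candidates:
        return None
    best_bic = max(c.bic for c in candidates)
    shortlist = [c for c in candidates if best_bic - c.bic <= bic_window]
    shortlist = [
        c for c in shortlist
        if not c.is_constant and c.phase_coverage >= 0.6 and c.n_obs_in_eclipse >= 3
    ]
    if not shortlist:
        return None  # the source would be removed from the catalogue
    return min(shortlist, key=lambda c: c.rank)
```

Note that the constant model can define the highest BIC and still be discarded afterwards, so the window is measured from the best score before the exclusion criteria are applied, as in the text.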

Post-pipeline source filtering and model ranking
In this first Gaia catalogue of eclipsing binaries, the output of the pipeline underwent a large variety of verification and validation checks that led to the application of additional filters outlined here.The first concerns the periods found in the time series, requiring that the internal second best model (see Sect. 2.2.5) must have a period compatible with the one found in the best model (i.e. with period ratios equal to 0.5, 1, 1.5 or 2).Additional criteria further consider the Abbe value on the folded light curves in combination with various frequency limits and global ranking criteria.Finally, sources with periods smaller than 0.2 d were removed because of the larger occurrence in DR3 of aliases at these small periods.
In order to compare the models of all sources in the catalogue, a global ranking is computed based on the Fraction of Variance Unexplained (FVU). This quantity is defined as the ratio of the variance of the residuals to the variance of the signal, and is given by

$$\mathrm{FVU} = \frac{\sum_{i=1}^{N_G} \left(G_{\mathrm{obs},i} - G_{\mathrm{model},i}\right)^2}{\sum_{i=1}^{N_G} \left(G_{\mathrm{obs},i} - \overline{G}\right)^2}. \tag{4}$$

In this equation, $G_{\mathrm{obs},i}$ is the ith measurement of the $N_G$ observations in the G time series, $G_{\mathrm{model},i}$ is the value of the model at that time, and $\overline{G}$ the mean G magnitude. A global ranking that ranges between zero and one is then derived using a linear transformation of the base ten logarithm of the FVU (Eq. 5). The constants in this equation are empirically derived to map the log(FVU) values in the range from zero to one.
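The FVU follows directly from its definition as the ratio of residual variance to signal variance. The sketch below computes it for a pair of observed and model magnitude arrays; the function name is hypothetical, and the linear mapping to the global ranking is not sketched because its empirical constants are not available in this text.

```python
import numpy as np


def fraction_of_variance_unexplained(g_obs, g_model):
    """Sum of squared residuals divided by the sum of squared deviations from the mean.

    A perfect model gives 0; a model no better than the mean magnitude gives 1.
    """
    g_obs = np.asarray(g_obs, dtype=float)
    g_model = np.asarray(g_model, dtype=float)
    residual = np.sum((g_obs - g_model) ** 2)
    signal = np.sum((g_obs - g_obs.mean()) ** 2)
    return residual / signal
```

Since a lower FVU means a better fit while a higher global ranking indicates a better light curve characterisation, the linear transformation of log(FVU) must have a negative slope.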
Our last source filter uses this global ranking.Only sources with a global ranking larger than 0.4 are published in the catalogue.

Catalogue content
The data fields published in the catalogue are listed in Table 2. They include the orbital frequency, the geometrical model parameters of the G-band light curve, the parameters derived from the model, the uncertainties on these parameters, and the global ranking. Orbital frequencies are published rather than orbital periods for consistency with the internal model parameterisation and subsequent uncertainty estimates by the Jackknife method.
The model type is one of the six possible combinations of two Gaussians and a cosine function. They are listed in Table 1, together with the number of sources present in the catalogue for each type. All model parameters are named with a prefix 'geom_'. The numbering of the first and second Gaussians follows the order of dip detection in the pipeline, and does not necessarily correspond to an order where the deepest Gaussian would be Gaussian one and the shallowest Gaussian would be Gaussian two.
The two-Gaussian model represents a purely geometrical description of the light curve morphology and is not intended to model the physical properties of the binary system. From the two-Gaussian model, however, estimates of the phase locations, durations and depths of the primary and secondary eclipses are derived by identifying the deepest and second deepest dips, respectively, in the model light curve (see Sect. 2.2.2). These quantities are published in the catalogue with data field names prefixed with 'derived_'.
As mentioned in Sect. 2.2.4, the current uncertainty estimation is not robust against outlying samples in the Jackknife method, and thus can lead to arbitrarily high uncertainties in some cases. This explains the presence in the table of unrealistically large estimates of the errors on some parameters. In addition, values above 3.4E38 have been converted to NULL values, as they cannot fit in a numeric float type in the database. As a result, there are 1131 sources which have NULL values for geom_model_gaussian1_depth_error, 824 sources for geom_model_gaussian2_depth_error, 776 sources for derived_primary_ecl_depth_error, and 1145 sources for derived_secondary_ecl_depth_error, despite the presence for these same sources of non-NULL values for the quantities to which the errors are associated.
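The 3.4E38 threshold matches the largest finite value of an IEEE 754 single-precision float. Assuming that this is the database type in question (the text only says "numeric float type"), the conversion to NULL could look like the following sketch, where the function name is hypothetical and `None` stands in for a database NULL.

```python
import numpy as np

# Largest finite single-precision value, approximately 3.4028235e38.
FLOAT32_MAX = float(np.finfo(np.float32).max)


def to_database_float(value):
    """Return the value as a single-precision float, or None if it cannot be stored."""
    if value is None or not np.isfinite(value) or abs(value) > FLOAT32_MAX:
        return None
    return float(np.float32(value))
```

Users filtering on the "_error" fields should therefore treat NULL as "very large or undefined" rather than as "small", and test for NULL explicitly.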

Light curve models
The automated procedure that processes the data of the two million Gaia eclipsing binary candidates finds the best two-Gaussian model fit to the G light curves. As stressed in Sect. 2.4, the model represents a purely geometrical description of the light curve morphology. The model parameters are not necessarily linked to physical properties of the binary system despite a good description of light curve geometry, due, for example, to a lack of phase coverage, spurious feature identifications in the light curves, or a potentially wrong period determination. The model parameters can, however, in a large number of cases, inform on the physical properties of the eclipses (depth, duration, eccentricity) and the ellipsoidal variability (amplitude).
A detailed analysis of the light curve models is presented in Appendix A. In that appendix, the light curves are classified in samples that have two, one, or no Gaussian components, with a naming convention starting with 2G, 1G, or 0G, respectively, and with a letter E added when an ellipsoidal component is present. In addition, groups 2G and 2GE are further sub-classified depending on model parameters by post-fixing the group name with -A, -B, -C, -D, -X, -Y and -Z. The definitions of the groups are given in Table A.1 of the appendix. This basic classification is only meant to guide the user on the catalogue content and various types of light curve morphologies. In this section, we present examples of known eclipsing binaries in each of these groups, all available in the catalogue of Avvakumova et al. (2013).
The overwhelming majority of G light curves are modelled with two Gaussians (94% of sources in the catalogue, see Table 1). Among these, two thirds have strictly two Gaussian components. The most obvious eclipsing binary configuration whose light curve can be modelled in this way is that of well-detached systems, with constant out-of-eclipse light. The two Gaussians have similar widths, but not necessarily similar depths. In Appendix A, they define Sample 2G-A (285 320 candidates). The G folded light curve of V614 Ven in this sample is displayed in the top panel of Fig. 4. We remind that only the G data have [...]

Tighter systems in which one or both stars fill their Roche lobes can also display light curves reminiscent of detached systems (e.g., Pojmanski 2002; Paczyński et al. 2006), and hence be found in Sample 2G-A. This can happen, for example, when the star that fills its Roche lobe is much fainter than its companion, such that the induced ellipsoidal variability is below the detection limit (which depends on the photometric precision of the instrument). The secondary eclipse would then also be much shallower than the primary eclipse. Such configurations typically characterise Algol-type binaries, which are understood to result from a past mass-transfer episode. Algol itself is not available in Gaia DR3 due to its brightness (2.1 mag in V), but the example of SW Cyg, an A2Ve+KI system (Malkov 2020), is given in the top panel of Fig. 5. The absence of detected out-of-eclipse variability thus does not necessarily imply a well-detached system. The variety of binary configurations in Sample 2G-A is also attested by the depth ratio distribution shown in blue in the top panel of Fig. 6. The histogram covers all values from close to zero to one, with two main peaks, one at small ratios below 0.2, and another at depth ratios close to one.
Some light curves are modelled with a very narrow primary Gaussian and a wide secondary. In the sub-classification presented in Appendix A, they are gathered in Sample 2G-D. The secondary Gaussians of these cases are, on average, much shallower than their primary Gaussians, as shown in Fig. 6 (second panel, cyan histogram). When the primary eclipse is very narrow, the detection of the secondary eclipse may be challenging, due for example to insufficient measurements in the eclipse and/or to a too shallow secondary eclipse. The probability that the pipeline fails to correctly detect the secondary, or that the orbital period is incorrect, is thus much greater than for Sample 2G-A candidates. The second example in Fig. 5 displays a case in Sample 2G-D, V745 Cep, classified as a semi-detached system in Avvakumova et al. (2013), where both Gaussians correctly identify the eclipses.
Tight systems are generally modelled with two Gaussians and a cosine to account for the ellipsoidal out-of-eclipse variability. These light curves belong to either Sample 2GE-A or 2GE-B in Appendix A, depending on the amplitude of the ellipsoidal variability. Sample 2GE-A (162 630 sources) contains candidates with small to medium peak-to-peak amplitudes of 2 A_ell < 0.11 mag, while Sample 2GE-B (265 276 sources) has 2 A_ell > 0.11 mag. Sample 2GE-A is similar to Sample 2G-A except for the additional cosine component. Four such examples are shown in Fig. 7, with increasing ellipsoidal amplitude (relative to primary eclipse depth) from the top to the third case, and with a total eclipse in the fourth case. The two famous eclipsing binaries β Lyr and W UMa, the prototypes of the classical EB- and EW-type eclipsing binaries, respectively, belong to Sample 2GE-B. Their light curves are shown in Fig. 8.
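To make the role of the model components concrete, the following Python sketch evaluates a two-Gaussian plus cosine model in magnitudes and applies the 0.11 mag amplitude threshold that separates Samples 2GE-A and 2GE-B. The exact parametrisation used by the pipeline is not reproduced in this section, so the functional form below is an assumption: Gaussians added to a constant magnitude and wrapped in phase, and a cosine with half the orbital period whose magnitude maximum coincides with the chosen reference phase. All function and parameter names are hypothetical.

```python
import math


def two_gaussian_model(phase, c0, gaussians, a_ell=0.0, mu_ell=0.0):
    """Model magnitude at a given orbital phase (assumed parametrisation).

    gaussians: list of (depth_mag, mu, sigma) tuples, with depth > 0 meaning
    a dimming (larger magnitude) centred at phase mu with width sigma.
    a_ell: semi-amplitude (mag) of the ellipsoidal cosine, which is assumed
    to have half the orbital period.
    """
    mag = c0
    for depth, mu, sigma in gaussians:
        # Sum neighbouring cycles so the Gaussian is periodic in phase.
        for k in (-1, 0, 1):
            mag += depth * math.exp(-((phase - mu + k) ** 2) / (2.0 * sigma ** 2))
    mag += a_ell * math.cos(4.0 * math.pi * (phase - mu_ell))
    return mag


def ellipsoidal_group(a_ell, threshold=0.11):
    """Split by peak-to-peak amplitude 2*A_ell, as quoted in the text.

    The text does not say where 2*A_ell == threshold goes; this sketch
    assigns that boundary case to 2GE-B.
    """
    return "2GE-A" if 2.0 * a_ell < threshold else "2GE-B"
```

For instance, a single 0.5 mag Gaussian centred at phase 0 on a 15 mag baseline gives 15.5 mag at mid-eclipse, and a semi-amplitude of 0.03 mag falls in Sample 2GE-A under this threshold.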
Very tight systems, including semi-detached systems with large ellipsoidal variability, in-contact systems, or systems with a common envelope, have their light curves modelled in several ways using a two-Gaussian model. The most common way consists of two wide overlapping Gaussians of similar width. They form Sample 2G-B, containing 834 093 sources. The Gaussians are located at a phase separation of about 0.5 from each other. The majority of them have similar eclipse depths, as seen from the green histogram in Fig. 6 (top panel). 1687 Aql is such an example, shown in the third panel of Fig. 5. An example in Sample 2G-B with significantly unequal eclipse depths, NS Cam, is shown in the fourth panel.
A small fraction of candidates modelled with two wide overlapping Gaussians have non-equal Gaussian widths. This feature can model asymmetries in the light curves of tight systems. They form Sample 2G-C (24 081 sources). Their eclipse depth ratio distribution is very similar to that of Sample 2G-B (red dotted histogram in Fig. 6, top panel). An example is given in the bottom panel of Fig. 5 with KS Eri, a binary system displaying the O'Connell effect.
In less than 4% of the DR3 eclipsing binary candidates, the light curve is modelled without a second Gaussian. They belong to samples 1G and 1GE, depending on whether or not the model contains a cosine component. The lack of a secondary Gaussian can be due to several reasons. One of them is the lack of eclipse phase coverage. Such is the case for KN And (Fig. 9, top panel) and V379 Per (second panel). The absence of a second Gaussian can also be due to the presence of a cosine component that models by itself the secondary eclipse. This is the case for RZ Col, shown in the third panel of Fig. 9.
Finally, a single cosine may be sufficient to model a light curve. Such light curves form Sample 0GE (36 227 sources). DU Car provides an example in Fig. 9 (bottom panel).
About one fifth of the ∼2 million sources that contain two Gaussians in their light curve model do not fall in one of the above categories 2G-A, 2G-B, 2G-C, 2G-D, 2GE-A, 2GE-B, 1G, 1GE or 0GE. They form three additional categories, 2G-X, 2G-Y and 2GE-Z, depending on their model parameters. We refer to Appendix A for more details. The probability that their model components reflect physical configurations of the eclipsing binaries is much lower than for the other groups, and they are to be investigated on a case-by-case basis. Examples of nevertheless correct cases in each of these three samples, where the Gaia period agrees with the literature period, are shown in Fig. 10.
In conclusion, the two-Gaussian model provides a powerful tool to study the two million eclipsing binary candidates published in Gaia DR3. The classification provided in Appendix A gives some insight into the type of binary system, keeping in mind that each group defined in that appendix contains a variety of different light curve morphologies. In addition, there is an inherent degeneracy in light curve morphology between different types of binary systems that makes it impossible to discriminate between them solely based on G photometry. The case of detached and semi-detached systems was mentioned above. Of the 119 semi-detached systems listed in Malkov (2020), 95 are present in the DR3 catalogue and 74 have Gaia periods compatible within 5% with the values gathered by that author. Among these 74 sources, 60 have an ellipsoidal component in their two-Gaussian model (36 in Sample 2GE-A, 19 in 2GE-B and five in 1GE), and 14 do not (five in 2G-A, six in 2G-B, one in 1G, and two in 2G-X).

Global ranking
The global ranking is directly linked to the fraction of the variance unexplained by the two-Gaussian model through Eq. (5). As such, it informs on the reliability of a candidate to be an eclipsing binary, a larger global ranking corresponding to a better fit to the light curve, and hence to a more reliable eclipsing binary candidate. A poor global ranking, however, does not necessarily imply a false detection, as the ranking relies on the assumption that the functions included in the model can adequately describe the light curve of an eclipsing binary. The two-Gaussian model will fail to recognise an eclipsing binary if some physics dominating the shape of the light curve is not modelled by these functions. Such would be the case, for example, for ellipsoidal variables on an eccentric orbit, including heartbeat stars, or for close binaries featuring a reflection effect (which translates into a cosine function with a period equal to the orbital period). Sources in the catalogue with a low global ranking will therefore need additional investigation to confirm and characterise their binary nature. Sources with a high global ranking, on the other hand, have a high probability of being eclipsing binaries.
The distribution of the global ranking is shown in Fig. 11 for the full catalogue (black histogram). It ranges from 0.40 to 0.84, with a maximum of the distribution around 0.51. Candidates with low global rankings are, on average, fainter than the ones with high global rankings. This is illustrated by the green and red histograms in Fig. 2, where the sample with rankings larger than 0.6 (filled green histogram) peaks around 17.3 mag, while sources with rankings less than 0.5 (red hatched histogram) are located at much fainter magnitudes around 19 mag. It mainly results from the fact that faint sources have larger epoch G uncertainties than bright sources, which in turn generally leads to poorer eclipsing binary light curve characterisation, and hence lower rankings. Figure 12, which plots the signal-to-noise ratio in G (std_dev_over_rms_err_mag_g_fov in the Gaia archive) versus global ranking, overall supports this explanation.
The histograms of the global ranking for the various samples discussed in the previous section are shown in Fig. 13. The largest global rankings, on average, are found in samples whose model components have a higher probability of representing physical features (eclipses and ellipsoidal variability). These are samples 2G-A and 2G-B without an ellipsoidal component (respectively blue and green distributions in the top panel of Fig. 13), and samples 2GE-A and 2GE-B with an ellipsoidal component (respectively blue and green distributions in the bottom panel). We note that the presence of an ellipsoidal component leads to

Orbital periods
The distribution of the orbital periods of the full sample is shown in Fig. 14 [...]

They have narrower distributions than the ones of well-detached systems, peaking at ∼0.35 days (second and third panels in Fig. 16). The tighter systems include tighter detached, contact, and ellipsoidal systems. No excess is observed at ∼0.25 d in the period distributions of these systems.
The last category, shown in the bottom panel, gathers samples 2G-X, 2G-Y and 2GE-Z, whose light curve model components are not necessarily linked to physical features of the binary system. It contains about one fifth of the full catalogue. The period distributions reveal much more complex structures. Many peaks are observed over the full range of periods, with a predominance at 0.25 days and above twenty days. These distributions support the conclusion drawn in Sect. 3.1 that the periods and the eclipsing binary nature of the sources in these samples need confirmation.
Additional insight into the period distributions is provided in Sect. A.6 of Appendix A.

Catalogue quality
We assess the quality of our catalogue by comparing our results with literature data, based on the Gaia DR3 cross-matches presented in Gavras et al. (2022). For the Gaia DR3 catalogue of eclipsing binaries, there are 606 393 cross-matches. The main surveys and the number of cross-matched sources are listed in Table 3. The largest number of cross-matches relates to the ZTF survey (42%), followed by OGLE4 (17%), ASAS-SN (14%), ATLAS (10%), CATALINA (8%), and PS1 (5%). The remaining 4% of cross-matches come from a variety of sources not detailed here.
The statistics of the Gaia DR3 cross-matches with the literature are reported in Table 4. The first two-row set (labelled 'All' in the XMs column) gives the statistics for the sample of all cross-matches, irrespective of whether the source is classified as an eclipsing binary in the literature or not. The table lists the number of sources, the number of sources that have a period reported in the literature, and the number of sources for which the literature period is compatible with the Gaia period, either directly (1:1 ratio, see Sect. 4.1) or within a factor of two (1:1, 1:2 or 2:1 ratios). The second two-row set (labelled 'EB') then provides the same statistics, but only for the subsample of cross-matches that are also classified as eclipsing binaries in the literature. The last two-row set (labelled 'non-EB') finally gives the statistics for the complementary subsample of cross-matches that are classified in the literature with a variability type other than eclipsing binary. Table 4 shows that the great majority (87%) of the Gaia DR3 eclipsing binaries cross-matched with literature data are also identified in the literature as eclipsing binaries. This is a good score given that the classification of large catalogues is performed through automated procedures, a process that necessarily introduces a fraction of wrong classifications that will impact the comparison between two independent catalogues. Among the non-EB cross-matches, we note that the Gaia eclipsing binary candidates cross-matched with non-eclipsing binaries in the literature include 1205 candidates classified as ellipsoidal variables in OGLE4.

Notes. (a) Within the rectangular sky region shown in Fig. 19. (b) Within the polygon sky region shown in Fig. 20.
We first compare in Sect. 4.1 our periods with the ones found in the literature. The questions of completeness and purity are then addressed in Sects. 4.2 and 4.3, respectively.

Orbital periods
Almost all Gaia DR3 eclipsing binary candidates that have a cross-match in the literature also have a period published in the literature (99% of them, see Table 4), allowing a direct comparison with our periods. To do so, for any given source, we evaluate the phase deviation r_{P,lit} at the end of the observation obtained when adopting the literature period P_lit instead of the Gaia period P_Gaia. This is computed by multiplying the relative difference between the literature and Gaia periods by the number of cycles during the observation, and is given by

$$ r_{P,\mathrm{lit}} = \frac{|P_\mathrm{lit} - P_\mathrm{Gaia}|}{P_\mathrm{lit}} \; \frac{\Delta T}{P_\mathrm{Gaia}} , \tag{6} $$

where ∆T is the duration of the G light curve. Its cumulative distribution is shown as a grey filled histogram in Fig. 17 for cross-matches that are classified as eclipsing binaries in the literature. More than 85% of the sources have a phase deviation of less than 0.5 at the last cycle of their observation. The histogram also shows that when this is not the case, r_{P,lit} is much larger than one, indicating a significant difference between the Gaia and literature periods. We therefore consider the Gaia and literature periods to be equal when r_{P,lit} < 1. The number of such sources is reported in the fourth column of Table 4. If we also include sources with Gaia periods that are half or twice the literature periods (replacing P_lit by 0.5 P_lit or 2 P_lit in Eq. (6)), the percentage of sources having compatible Gaia and literature data increases to 93% (fifth column in the table).
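The period comparison described above can be sketched in a few lines of Python. The phase deviation is computed as the relative period difference multiplied by the number of cycles over the light curve duration, and periods are declared equal when the deviation is below one. The function names, the order in which the factors 1, 0.5 and 2 are tried, and the 1000-day duration used in the example are assumptions made for illustration.

```python
def phase_deviation(p_gaia, p_lit, delta_t):
    """Phase drift accumulated over delta_t when using p_lit instead of p_gaia.

    All three arguments must be in the same time unit (e.g. days).
    """
    return abs(p_lit - p_gaia) / p_lit * delta_t / p_gaia


def matching_factor(p_gaia, p_lit, delta_t, factors=(1.0, 0.5, 2.0)):
    """Return the first factor f such that f * p_lit equals p_gaia (r < 1).

    Returns None when no tested factor gives a phase deviation below one.
    """
    for f in factors:
        if phase_deviation(p_gaia, f * p_lit, delta_t) < 1.0:
            return f
    return None


# Illustrative values only: a light curve spanning 1000 days.
same = matching_factor(1.0, 1.0005, 1000.0)    # 1:1 within the criterion
half = matching_factor(0.5, 1.0, 1000.0)       # Gaia period is half P_lit
none = matching_factor(1.0, 1.3, 1000.0)       # incompatible periods
```

With a 1000-day baseline and a one-day period, a difference of 0.0005 d already produces a phase deviation of about 0.5, which shows how strict the r < 1 criterion is for short-period systems.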
In contrast, fewer than 6% of the sources not classified as eclipsing binaries in the literature have equal Gaia and literature periods (red histogram in Fig. 17 and fourth column in Table 4). Interestingly, this number increases to 32% when considering compatible periods (P_lit/P_Gaia ∈ (0.5, 1, 2) in the table). This can easily be understood if the sources have sinusoidal-like light curves. The detected period can then easily differ by a factor of two from the orbital period if the source is an eclipsing binary or an ellipsoidal variable, and a survey may pick either one of these periods.
The comparison between Gaia and literature periods is shown in Fig. 18. The upper panel displays the periods of the cross-matches classified as eclipsing binaries in both Gaia DR3 and the literature. They are distributed as expected, with the presence of (mainly) P_Gaia:P_lit = 1:2, 2:1 and 2:3 ratios in addition to the overwhelming 1:1 cases. Alias features are also seen. The distribution of the Gaia eclipsing binary candidates cross-matched with sources classified as non-eclipsing binaries in the literature, on the other hand, reveals the imprints of the underlying literature catalogues in the distributions of P_lit (bottom panel in the figure). At literature periods below one day, we see the imprint of ∼31 600 sources from PS1_RRL_SESAR_2017 (Sesar et al. 2017), while the main contribution at literature periods above twenty days comes from ∼17 200 sources from ATLAS_VAR_HEINZE_2018 (Heinze et al. 2018). The former catalogue targets RR Lyrae variables, while the cross-matches in the latter catalogue were assigned the tailored 'OMIT' classification type in Gavras et al. (2022) to gather sources whose classification in the literature was considered to be 'too generic, uncertain, or with insufficient variability characterisation' (see Gavras et al. 2022). In Heinze et al. (2018), they are mainly assigned, in decreasing order of number of cross-matches with our eclipsing binaries, the types NSINE (pure sine wave fit, but noisy data), SINE, or MSINE (modulated sine wave). These cross-matches not classified as eclipsing binaries in the literature or considered uncertain by Gavras et al. (2022) are, however, a minority in the full sample of cross-matches.
In summary, the Gaia periods are compatible with literature periods in about 85% of cases. This includes cases where the literature period is twice or half the Gaia period.

Completeness of the Gaia catalogue
To estimate the completeness of our catalogue, we compare it with the OGLE4 catalogues of eclipsing binaries, which are available for the Large (LMC) and Small (SMC) Magellanic Clouds (Pawlak et al. 2016) and for the Galactic Bulge (Soszyński et al. 2016). The sky distributions of the Gaia DR3 eclipsing binaries towards the LMC and Galactic Bulge are displayed in the top panels of Figs. 19 and 20, respectively, and the distributions of the OGLE4 eclipsing binaries in the second panels. Sources in common between the Gaia DR3 and OGLE4 catalogues are shown in the third panels.
Two steps are taken to estimate the completeness of the Gaia catalogue relative to the OGLE4 catalogues. We first restrict the OGLE4 catalogues to sources present in the full Gaia DR3 archive, with a cross-match search radius of one arcsecond. The statistics are given in Table 5. Gaia cross-matches are found for practically all OGLE4 eclipsing binaries in the LMC and SMC, but for only 87% of the OGLE4 sources in the Galactic Bulge (420 321/473 798 in Table 5). The 13% of OGLE4 sources from the Bulge that are not in the Gaia archive are all very red faint sources with OGLE I magnitudes mainly between 18 and 20.5 mag. Their sky distribution is shown in Fig. 21.
We then limit the OGLE4 samples to sources brighter than 20 mag in G to comply with the input magnitude selection of the Gaia eclipsing binaries (see Sect. 2.1). The final OGLE4 samples contain 35 392, 7843 and 315 523 sources in the LMC, SMC and Galactic Bulge, respectively (see Table 5). Of these OGLE4 samples, 28% are present in the Gaia catalogue of eclipsing binaries (36% had we considered 19 mag as the faint limit for both the OGLE4 and Gaia catalogues). The recovery rates are larger in the Magellanic Clouds than in the Bulge, as detailed in Table 5, reaching 48% in the SMC while being 26% in the Bulge. An investigation of the 72% missing OGLE4 sources reveals that ∼45% were excluded from the initial selection (see Sect. 2.1), with ∼40% not being classified as eclipsing binaries and another ∼5% having fewer than sixteen measurements in their G light curves, mainly in the Bulge. The remaining ∼27% of missing sources were filtered out later, in the final selection procedure (Sect. 2.3).
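The two-step completeness estimate described above can be summarised with a short Python sketch. It assumes that the positional cross-match with the Gaia archive has already been performed, so that each reference source carries either a Gaia identifier or None. The dictionary keys, the function name and the toy data are hypothetical and only serve to illustrate the bookkeeping.

```python
def completeness(reference_sources, gaia_eb_ids, g_limit=20.0):
    """Fraction of reference eclipsing binaries recovered in the Gaia catalogue.

    reference_sources: iterable of dicts with keys
        'gaia_id' : Gaia identifier of the counterpart, or None if the source
                    has no counterpart within the cross-match radius;
        'g_mag'   : G magnitude of the counterpart (ignored if gaia_id is None).
    gaia_eb_ids: set of Gaia identifiers in the eclipsing-binary catalogue.
    """
    # Step 1: keep sources with a Gaia counterpart.
    # Step 2: keep sources brighter than the magnitude limit.
    sample = [s for s in reference_sources
              if s["gaia_id"] is not None and s["g_mag"] < g_limit]
    if not sample:
        return float("nan")
    recovered = sum(1 for s in sample if s["gaia_id"] in gaia_eb_ids)
    return recovered / len(sample)


# Toy data: four reference sources, two of which survive both steps.
toy_reference = [
    {"gaia_id": None, "g_mag": None},   # no Gaia counterpart
    {"gaia_id": 11, "g_mag": 20.4},     # fainter than the limit
    {"gaia_id": 12, "g_mag": 18.2},     # kept, recovered
    {"gaia_id": 13, "g_mag": 19.1},     # kept, missing from the catalogue
]
toy_rate = completeness(toy_reference, gaia_eb_ids={12, 99})
```

The same function can be evaluated with a brighter limit, such as g_limit=19.0, to reproduce the kind of comparison quoted in parentheses above.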
A small fraction of the missing OGLE4 eclipsing binaries that were not classified as eclipsing binaries in Gaia DR3 are present in other variability tables published in DR3 (tables gaiadr3.vari_* in the Gaia archive).They consist of 2195 short time-scale variables, 426 binary candidates with a compact companion, 384 long-period variables, 89 main-sequence oscillators, 32 rotation modulation variables, 31 Cepheids, and one Active Galactic Nucleus.
In summary, the completeness of the Gaia catalogue of eclipsing binaries amounts to between 25% and 50% depending on the sky region, when compared to the OGLE4 catalogues of eclipsing binaries. The missing OGLE4 sources were excluded from the Gaia catalogue at candidate selection steps in our processing pipeline. A significant increase in the number of eclipsing binary candidates is thus expected for the next Gaia release, DR4. This is reflected in their I, V and G magnitudes shown in

New Gaia candidates
We investigate in this section the Gaia eclipsing binary candidates that are not present in the OGLE4 catalogues of eclipsing binaries, using the LMC and Galactic Bulge regions as test cases. For this purpose, two sky areas well covered by the OGLE4 surveys are defined towards these regions: the rectangular area shown in Fig. 19 towards the LMC, and the polygon area shown in Fig. 20 towards the Galactic Bulge. There are 26 020 Gaia eclipsing binary candidates towards the LMC and 96 199 sources towards the Galactic Bulge. More than half of them are new relative to the OGLE4 catalogues (53% in the LMC and 65% in the Galactic Bulge, see Table 6).
The sky distribution of the new Gaia candidates towards the LMC is shown in the bottom panel of Fig. 19. They are mostly concentrated in the bar, where the sky density of stars is highest. Their magnitude distribution peaks around 19 mag in G (red histogram in the top panel of Fig. 22), similar to the magnitude distribution of the full Gaia sample in the defined sky area (black histogram). In contrast, the magnitude distribution of the Gaia-OGLE4 cross-match sample reveals a plateau between ∼18.5 and ∼19.5 mag (green histogram). The origin of this plateau is unclear, as the full OGLE4 sample in the defined sky area shows a continuously increasing distribution of G up to 20 mag (not shown here). We checked that the new faint sources are not contaminated by the potential presence of nearby brighter eclipsing binaries.
The distribution of the new Gaia candidates towards the LMC in the colour-magnitude diagram is shown in the upper-left panel of Fig. 23, together, in the upper-middle panel, with the distribution of the Gaia candidates that have an OGLE4 cross-match. The new Gaia candidates are seen to lie not only on the main sequence, as is the case for the OGLE4 cross-matches, but also, for one third of them, on the red side of the diagram at G_BP − G_RP > 0.6 mag. Such red candidates are much less abundant in the cross-matched sample. We also note in this regard the larger colour dispersion observed at the faint end of the main sequence for the new Gaia candidates (upper-left panel in Fig. 23) compared to the narrower colour dispersion for the cross-matched sample (upper-middle panel). The correctness of the red colours must, therefore, be checked, as the G_BP and G_RP values of these faint sources may be affected by residual background estimates and/or multiple source blending in the BP and RP spectra. The BP+RP flux excess factor available in the Gaia archive (phot_bp_rp_excess_factor in the gaia_source table) provides a handy tool for this purpose (assuming G is correct). This quantity, denoted C by Riello et al. (2021), evaluates the excess of the integrated BP and RP fluxes in comparison with the G flux. It is shown versus G_BP − G_RP in the bottom-left panel of Fig. 23. While many sources are seen to have an excess factor between 1.1 and 1.2, as expected, a non-negligible fraction of them have excess factors significantly above 1.2. This is particularly true for the red sources. These large excess factors lead to unreliable G_BP and/or G_RP magnitudes, and can thus be at the origin of both the large colour dispersion observed at the faint end of the main sequence and the presence of an excess of red sources at G ≳ 19 mag. We note that the values of C of the cross-matched sources have a much cleaner distribution around the expected values (bottom-middle panel of Fig. 23).
The G values, on the other hand, should be reliable, within the uncertainties expected at the faint magnitudes. A visual inspection of the G light curves of the new Gaia candidates provides confidence that at least one third of them are genuine eclipsing binaries. In fact, about 17% of these new Gaia candidates towards the LMC with no OGLE4 cross-match are classified as eclipsing binaries in other surveys, using cross-matches from Gavras et al. (2022). They were mainly identified from the EROS2 survey by Kim et al. (2014).
Eleven examples of good light curves from the new Gaia candidates not in the OGLE4 catalogue are shown in Figs. 24 and 25. Figure 24 shows sources located outside the bar of the LMC. Two of them, sources Gaia DR3 4651581458546544000 and Gaia DR3 4655269942119876864, are also referenced as eclipsing binaries in EROS2. The light curves in Fig. 25 [...] the cross-match sample. Since the majority of these new Gaia candidates are faint and lie in crowded regions of the LMC, the eclipsing binary signature in the G light curves can be mingled with variability of non-astrophysical origin. This would explain their low global rankings. The fraction of new Gaia candidates is much smaller if we limit the samples to sources with larger global rankings. In a sample limited to global rankings larger than 0.5, for example, the fraction of new Gaia candidates towards the LMC is three times smaller than in the full sample (17% new candidates with respect to OGLE4, compared to 53% considering all global rankings, see Table 6). It is interesting to note that the global ranking distribution of the Gaia candidates not in OGLE4 but identified as eclipsing binaries in other surveys (blue histogram in Fig. 26) peaks at values between those of the new candidates and those of the Gaia-OGLE4 cross-match [...]

The situation of the new Gaia candidates in the Galactic Bulge with respect to OGLE4 is very similar to the situation described above for the LMC, with the additional observation that the sky density is much higher in the Galactic Bulge than in the LMC, and that the sources are much redder due to heavy extinction. As a result, C reaches very large values, above three, as shown in Fig. 27. This is particularly true for the new Gaia candidates having no OGLE4 cross-match (bottom panel in the figure) as compared to the sample with OGLE4 cross-match (top panel). Among these new Gaia candidates compared to OGLE4, only very few have a detection as an eclipsing binary in other surveys. Their magnitude and global ranking distributions are shown in blue in the bottom panels of Figs. 22 and 26, respectively (note the one hundred times amplification factor compared to the distributions of the other samples). The larger sky crowdedness towards the Galactic Bulge may explain the larger fraction of new Gaia candidates in that region of the sky (65% compared to 53% towards the LMC, see Table 6). This fraction is still 35% in a sample limited to global ranking > 0.5 towards the Galactic Bulge.
In summary, more than half of the Gaia eclipsing binaries in the two test regions are new with respect to OGLE4, the percentage being larger in crowded regions than in less dense regions of the sky. They generally have low global rankings and large BP+RP flux excess factors, which calls for caution when using their G_BP and G_RP magnitudes. In many cases, however, they show genuine G light curves of eclipsing binaries.

Illustrative samples with good parallaxes
Samples of the Gaia DR3 eclipsing binary candidates towards the LMC and the Galactic Bulge have been briefly discussed in Sects. 4.2 and 4.3. In this section, we illustrate the catalogue with samples of candidates with good parallaxes. We consider positive parallaxes with relative uncertainties better than 15% (409 437 sources), and restrict the sample to sources with good BP+RP flux excess factors to exclude obviously wrong G_BP − G_RP colours in colour-magnitude diagrams. We use for this the corrected BP+RP flux excess factor C* proposed by Riello et al. (2021), using the G_BP and G_RP median values in their Eq. (6). The distribution of this quantity for the sample with good parallaxes is shown in Fig. 28. We limit our sample to C* < 0.5. This removes 2% of the initial sample, leading to 400 996 sources. The resulting parallax distribution versus G magnitude is shown in Fig. 29. Most sources are brighter than ∼18 mag. Sources fainter than this value lie within 2 kpc of the Sun.
The absolute M_G magnitude versus G_BP − G_RP colour diagram, hereafter called the observational Hertzsprung-Russell [...]

Almost half of the eclipsing binary candidates in this good parallax sample have an ellipsoidal component in their light curve model, with 18% of the sample belonging to group 2GE-A and 22% to 2GE-B. An additional 24% are tight systems with light curves described by two wide Gaussians (group 2G-B), and 10% belong to group 2G-A.
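The selection of the good parallax sample and the computation of the absolute magnitude used in the observational HR diagram can be sketched as follows. The absolute magnitude is computed from the standard relation M_G = G + 5 log10(parallax in mas) − 10, neglecting interstellar extinction, which is an assumption of this sketch rather than a statement of how the figures were produced. The corrected excess factor C* is taken as an input, since the correction of Riello et al. (2021) is not reproduced here. Function names are hypothetical.

```python
import math


def in_good_parallax_sample(parallax, parallax_error, c_star,
                            max_rel_error=0.15, max_c_star=0.5):
    """Apply the cuts quoted in the text: positive parallax, relative
    parallax uncertainty better than 15%, and corrected excess factor
    C* below 0.5."""
    if parallax <= 0.0:
        return False
    return parallax_error / parallax < max_rel_error and c_star < max_c_star


def absolute_g_magnitude(g_mag, parallax_mas):
    """M_G from the apparent G magnitude and the parallax in milliarcseconds.

    Extinction is ignored (assumption of this sketch)."""
    return g_mag + 5.0 * math.log10(parallax_mas) - 10.0


# Fraction removed by the C* cut, from the source counts quoted in the text.
removed_fraction = 1.0 - 400996 / 409437   # about 0.02
```

A source at 100 pc (parallax of 10 mas) with G = 10 mag has M_G = 5 mag under this relation, which provides a quick sanity check of the units.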
The orbital period distribution across the observational HR diagram is shown in the third panel from the top of Fig. 30, with the per-bin median period colour-coded. The period is seen to be well correlated, on average, with the stellar radius, as expected for tight systems. This, however, is not observed for intrinsically faint main-sequence candidates with absolute magnitudes M_G fainter than about 7 mag, nor for candidates below the main sequence, where cataclysmic variables are found. The global rankings of these faint candidates are generally also very low, as can be seen from the bottom panel of Fig. 30. The observational HR diagram is much cleaner if we restrict the sample to candidates with high global rankings. This is illustrated in Fig. 31 with the sample of good parallaxes restricted to sources with a global ranking larger than 0.6, to which all 24 081 tight binaries from Sample 2G-C are also added. More than half of this new sample of 144 994 sources have an ellipsoidal component (among which 21% in Sample 2GE-A and 38% in Sample 2GE-B), and 15% are in Sample 2G-B. The period distributions in the two samples, the full sample with good parallaxes and the one restricted to high global rankings, are very similar, but systems with a strong ellipsoidal component (2GE-B) become predominant at short periods in the sample restricted to high global rankings, as shown in Fig. 32.
Bright candidates without detected ellipsoidal variability - About ten percent of the sample with good parallaxes have two Gaussians in their light curve and no detected ellipsoidal variability (group 2G-A), pointing to detached systems (and some semi-detached ones, see Sect. 3.1). As expected, they have, on average, longer periods than the tighter systems (see Fig. 32).

Fig. 34. Same as Fig. 33, but for non-Gaia literature period versus Gaia period for those sources that have a period published in the literature. The dashed line is P_Lit = P_Gaia, and the dotted lines are the P_Lit = 2 P_Gaia and P_Lit = 0.5 P_Gaia relations. Top panel: sources that have eccentricity proxies smaller than the eccentricity limit shown in Fig. 33. Bottom panel: sources with eccentricity proxies larger than this limit.
A key question concerns the circularisation of these systems at short periods. While we do not model the binary systems, and hence do not know their eccentricity, an eccentricity proxy e_proxy can be derived from the relative eclipse locations and durations provided by the two-Gaussian model. The relevant equations are recalled in Appendix B. It is shown in that appendix that most candidates in groups with potential tidal interactions (groups 2G-B, 2G-C, 2GE-A and 2GE-B) have eccentricity proxies compatible with circular systems, while large eccentricity proxies are found in Sample 2G-A of systems without detected tidal interaction. However, the analysis also reveals the unexpected presence of short-period systems with large eccentricity proxies (top panel of Fig. B.1 in the appendix), whereas these systems are expected to have been circularised.
In order to check the status of these short-period systems with large eccentricity proxies, we focus on the subset of bright candidates with G < 12 mag. There are 401 such candidates in the good parallax sample with large global rankings defined above. Their P-e_proxy diagram is shown in Fig. 33. The eccentricity limit above which all systems at a given period are expected to be circular is shown by the dashed blue line, following Eq. (4.4) of Mazeh (2008) with E = 0.98, A = 3.25, B = 6.3 and C = 0.23. The figure shows 48 systems that are unexpectedly above this limit, which we call outlier sources. The uncertainty on their eccentricity proxy, a quantity that can be computed from the two-Gaussian model parameter uncertainties (see Eq. B.4 in Appendix B), does not resolve the issue. If we take the 1σ lower values of the eccentricity proxies considering their ε(e_proxy) uncertainties, 27 candidates would remain above the limit (the 1σ downward corrections are shown as vertical grey line segments in Fig. 33), and still 14 with a 2σ downward correction. The depth of the secondary eclipse is colour-coded in Fig. 33, red corresponding to shallow dips and blue to deep dips. Most of the outliers are seen to have very shallow secondary eclipses, less than 10 mmag deep (red markers in the figure). The figure also encodes the secondary over primary depth ratios, with the size of the markers proportional to this ratio such that smaller circles correspond to smaller depth ratios. The large majority of the outliers are seen to have their secondary eclipse much shallower than their primary eclipse (small marker size). These features call for a careful check of the reliability of the secondary eclipse.
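The outlier test described above can be sketched as follows. The functional form f(P) = E - A exp(-(B P)^C) is an assumption about Eq. (4.4) of Mazeh (2008), which is not reproduced in this text; only the four parameter values are quoted here. The function names and the floor at zero are illustrative choices.

```python
import math

# Parameter values quoted in the text for Eq. (4.4) of Mazeh (2008).
E, A, B, C = 0.98, 3.25, 6.3, 0.23

def eccentricity_limit(period_days):
    """Assumed form f(P) = E - A * exp(-(B * P)**C), floored at zero."""
    return max(0.0, E - A * math.exp(-((B * period_days) ** C)))

def is_outlier(period_days, e_proxy, e_proxy_err=0.0, n_sigma=0):
    """True if the proxy, lowered by n_sigma uncertainties, exceeds the limit."""
    lowered = e_proxy - n_sigma * e_proxy_err
    return lowered > eccentricity_limit(period_days)
```

Calling `is_outlier` with `n_sigma` set to 0, 1 and 2 mirrors the three counts given in the text (nominal value, 1σ and 2σ downward corrections).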
A comparison of the Gaia periods with literature periods, when available, provides additional insight. The top panel of Fig. 34 shows this comparison for candidates that have acceptable eccentricity proxies, i.e. below the expected upper limit. A very good match between P_Gaia and P_lit is seen for a large majority of the sources, with most of them having a 1:1 ratio and some a 1:2 ratio (P_Gaia ≃ 2 P_lit). In contrast, only very few outlier sources have P_Gaia ≃ P_lit, as shown in the bottom panel of Fig. 34. But the majority of them still have a Gaia period 'compatible' with the literature period within a factor of two, with P_Gaia ≃ 0.5 P_lit. This suggests that the Gaia pipeline treated the two distinct eclipses as one unique eclipse, and detected a (usually very shallow) artefact in the folded light curve to represent an imaginary secondary eclipse. A typical example of such a case is shown in the top panel of Fig. 35, where a very shallow secondary dip is identified at phase around 0.1, barely visible in the figure. If we increase the Gaia period of the outlier sources by a factor of two, the majority of them become compatible with the expected maximum eccentricity given their period in Fig. 33. Note that in that case, the eccentricity proxy would also take another value as the two-Gaussian model will be different.
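A minimal sketch of the period comparison discussed above, sorting a source into the 1:1, factor-two, or "other" category. The 5% relative tolerance and the function name are assumptions made for illustration; the text does not state how "compatible" is quantified for this figure.

```python
def period_relation(p_gaia, p_lit, rel_tol=0.05):
    """Classify the Gaia-to-literature period ratio.

    rel_tol is a hypothetical tolerance, not a value taken from the text.
    """
    ratio = p_gaia / p_lit
    targets = (
        ("P_Gaia = P_lit", 1.0),
        ("P_Gaia = 2 P_lit", 2.0),
        ("P_Gaia = 0.5 P_lit", 0.5),
    )
    for label, target in targets:
        if abs(ratio - target) <= rel_tol * target:
            return label
    return "other"
```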
The above analyses suggest a quick way to clean the P-e_proxy diagram in Fig. 33 by removing all small red points. If we do this, a few outlier sources would still remain that have significant secondary eclipse depths both in terms of absolute depth (non-red colour) and of secondary over primary depth ratio (large marker sizes). We investigate here the four systems with the shortest periods. They are, in increasing order of orbital period, Gaia DR3 4524651705941314432 (0.2411507 d), 5712304991851559040 (0.2450746 d), 2589992273084612480 (0.2818606 d), and 4455992049496820992 (0.3682685 d). They are identified with open diamonds in Fig. 33, and their G, G_BP and G_RP light curves are shown in Fig. 35. All two-Gaussian models to the G light curves (dashed lines in the figure) look acceptable. A closer look, however, reveals some features that put in question the periods extracted from the G light curves, which have only a small number of measurements in the eclipses. Investigation of the G_BP and G_RP light curves, which were not used in the DR3 processing of eclipsing binaries, suggests that the periods derived from the G light curves may be incorrect, at least for the second and third sources. Therefore, the periods of these outlier sources should be double checked.
Two of these four outliers are identified in the ASAS-SN survey as eclipsing binaries (Jayasinghe et al. 2019), but with different periods than the Gaia period. Gaia DR3 4524651705941314432 is mentioned with a period of 3.3972697 d (ASASSN-V J075432.26-211826.4), and Gaia DR3 5712304991851559040 with a period of 5.9965616 d (ASASSN-V J184156.16+192755.8). Their P_lit versus P_Gaia values are highlighted by the diamonds in the bottom panel of Fig. 34. The Gaia DR3 4524651705941314432 light curve is compatible with the ASAS-SN period (see Fig. C.8 in Appendix C, which displays the Gaia light curve folded with the ASAS-SN period). This period, however, was not selected by the automated Gaia pipeline due to the scarcity of points that would result in the secondary eclipse. Regarding Gaia DR3 5712304991851559040, the period proposed by ASAS-SN is not compatible with the Gaia light curve. A check in the ASAS-SN database of light curves (https://asas-sn.osu.edu/variables) actually reveals that the ASAS-SN period is not very good for the ASAS-SN light curve either. However, twice this period (11.9931232 d) would be compatible with both the ASAS-SN and Gaia light curves. But here too, the scarcity of Gaia measurements that would result in the secondary eclipse prevented the Gaia pipeline from choosing this period.
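The role played by the number of points falling in an eclipse can be illustrated by folding a time series at a trial period and counting the measurements inside a phase window. This is a generic sketch, not the DR3 pipeline; the function names, the reference epoch `t0` and the window definition are assumptions.

```python
def fold(times, period, t0=0.0):
    """Return the phases in [0, 1) of observation times folded at `period`."""
    return [((t - t0) / period) % 1.0 for t in times]

def count_in_window(phases, centre, half_width):
    """Count phases within `half_width` of `centre`, wrapping around phase 1."""
    n = 0
    for ph in phases:
        d = abs(ph - centre) % 1.0
        d = min(d, 1.0 - d)  # shortest distance on the phase circle
        if d <= half_width:
            n += 1
    return n
```

Folding the same times at a period and at twice that period, then counting the points near the expected secondary eclipse phase, shows how few measurements may constrain the longer-period solution.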
In summary, Gaia results of eclipsing binaries are generally very good given the available Gaia time series. Further investigations must, however, be performed for specific cases. This is especially true for well-detached eclipsing binaries with long periods, for which an incorrect (very) short period may instead be chosen as the best solution by the DR3 pipeline.

Summary and conclusions
This paper presents the first Gaia catalogue of eclipsing binaries made available in June 2022 within Gaia DR3. It contains more than two million candidates, filtered from a larger set of eclipsing binary candidates identified by the general classification pipeline of the variability processing modules within the Gaia DPAC (Rimoldini et al. 2022). The orbital periods are determined based on the cleaned G light curves. A two-Gaussian model is used to characterise the morphology of their light curves, containing up to two Gaussian functions and one sine function, and a global ranking is provided that quantifies the quality of the model fit to the G light curves. The model adequately identifies the eclipses and ellipsoidal variability when they are clearly detectable in the light curves, and several groups of eclipsing binaries, from wide to tight systems, are identified in Sect. 3.1 based on the model parameters. The two-Gaussian model, however, can contain components that are not relevant to describe real eclipses or ellipsoidal variability, though they reliably describe the geometry of the light curves. The G_BP and G_RP light curves were not considered in the processing of the eclipsing binaries in DR3.
About 600 000 of the Gaia candidates have a crossmatch in the literature, of which 88% are also identified as eclipsing binaries in the literature. The Gaia and literature periods are similar for 86% of the sources identified as eclipsing binaries in both Gaia and the literature. This number increases to 93% when also considering period ratios of two. The overall completeness of the catalogue is estimated to lie between 25% and 50% depending on the sky region, based on a comparison with OGLE4 catalogues of eclipsing binaries towards the Magellanic Clouds and the Galactic Bulge. More than half of the catalogue consists of new candidates, with larger percentages of new candidates in dense regions of the sky.
Illustrative samples with good parallaxes, consisting mostly of tight systems, confirm that their properties, such as their distribution in the observational HR diagram and their periods, are overall as expected. The analysis of the period-eccentricity diagram for a subset of detached systems, on the other hand, highlights the challenges in dealing with these systems, and illustrates the usage of the catalogue.
This release represents the largest catalogue of eclipsing binary candidates available so far in the literature. Looking to the future, the next Gaia Data Release 4 (DR4) will provide an even larger catalogue of candidates, with improved characterisation due not only to the larger time baseline and increased number of measurements (DR4 will be based on 66 months of data instead of the 34 months in DR3), but also due to further improvements in our processing pipeline by, for example, taking into account G_BP and G_RP time series.

[...] between the two fields of view. For eclipse durations in this time interval, observations of a particular source may be lacking in the core of its eclipse depending on the observation time distribution over the 34 months covered in DR3. Let us consider such a source where only very few observations fall in its eclipse time window during these 34 months. For durations larger than 4.2 h, the probability to have measurements inside the eclipse is large as the source will be observed during a minimum of two successive FOV passages. For durations smaller than 4.2 h (but larger than 1.8 h), the probability to have enough observations in the middle of the eclipse decreases with decreasing eclipse duration. If measurements are still available at the edge of the eclipse, the eclipse will be caught by the pipeline, but with rather unconstrained eclipse depth. For durations below 1.8 h, the probability to have observations only at the edges of the eclipse decreases considerably as this duration is shorter than the shortest time interval between two successive FOVs. These aspects explain the excess of sources with (too) large primary eclipse depths in Fig. A.6 for eclipse durations between ∼0.07 and ∼0.17 days.

The phase separation between the locations of the two Gaussians is expected to be ∼0.5 for these tight systems. This is indeed confirmed for the majority of the sources in the sample (see second panel from top in Fig. A.1).

Finally, one must mention a note on the value of derived_primary_ecl_depth reported in the catalogue for these 2G-C binaries (a similar note applies to derived_secondary_ecl_depth). This value represents the depth of the faintest point of the primary dip in the modelled light curve. For most of the candidates in the catalogue, the value of derived_primary_ecl_depth is similar to the depth of the deepest Gaussian. When there is a significant overlap between the two Gaussian components, as is the case for candidates in Sample 2G-C, a significant difference exists between the depths of the Gaussians and the derived depths in the modelled light curve. This is illustrated in the top panel of Fig. A.10, which compares these two values for the primary Gaussian of models containing only two Gaussians. Sample 2G-C is identified as the distinctive subsample below (and somewhat parallel to) the diagonal line, with primary Gaussian depths (on the abscissa) that are two to ten times larger than the actual depth in their light curves (on the ordinate).
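The difference between a Gaussian depth and the depth actually reached by the modelled light curve can be reproduced with a toy model. The sketch below is not the DR3 implementation: it assumes a light curve, in magnitudes, made of two Gaussians wrapped in phase, and it assumes that the derived depth is measured from the brightest to the faintest point of the model. All names are hypothetical.

```python
import math

def wrapped_gaussian(phase, centre, sigma):
    """Unit-height Gaussian in phase, summed over neighbouring cycles."""
    return sum(
        math.exp(-((phase - centre + k) ** 2) / (2.0 * sigma ** 2))
        for k in (-2, -1, 0, 1, 2)
    )

def model_depth(depth1, depth2, sigma1, sigma2,
                centre1=0.0, centre2=0.5, n=1000):
    """Peak-to-peak range (mag) of a two-Gaussian model sampled on n phases.

    Assumption: the derived depth runs from the brightest point of the
    modelled light curve to the faintest point of the primary dip.
    """
    mags = [
        depth1 * wrapped_gaussian(i / n, centre1, sigma1)
        + depth2 * wrapped_gaussian(i / n, centre2, sigma2)
        for i in range(n)
    ]
    return max(mags) - min(mags)

# Narrow, well-separated Gaussians: the model depth equals the Gaussian depth.
narrow = model_depth(0.3, 0.1, 0.02, 0.02)
# Wide, overlapping Gaussians: the model never returns to the baseline.
wide = model_depth(0.3, 0.3, 0.2, 0.2)
```

With the wide Gaussians chosen here (width of 0.2 in phase), the Gaussian depth is several times larger than the depth of the modelled dip, which is the behaviour described for Sample 2G-C.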
Sample 2G-D. The remaining areas in Fig. A.1 (top panel) other than the ones defined by samples 2G-A, 2G-B and 2G-C contain a variety of light curve geometries. Among them, the ones having narrow primary and much wider secondary Gaussians stand out, with secondary Gaussian depths much smaller, in general, than the primary Gaussian depths, as seen in the third panel of Fig. A.1. While their primary Gaussians correctly identify the presence of a detached eclipse in most cases, caution must be taken regarding the reality of the second Gaussian identification. The automated algorithm can indeed fail to detect a narrow secondary eclipse in the case of inadequate phase coverage and/or too shallow secondary, and pick up, instead, a wide and shallow feature in the light curve geometry for the secondary eclipse. This can [...]

Sample 2GE-B. In this sample, the large amplitudes of the ellipsoidal component in the two-Gaussian models dictate the overall morphology of the light curves. The peak-to-peak amplitudes range from 0.11 mag (by definition) to above 0.35 mag (see Fig. A.16, top panel). The Gaussian components, on the other hand, determine the sharpness of the eclipses in the light curve.
The depth of the Gaussian component is typically between half and twice the peak-to-peak amplitude of the ellipsoidal component. Three example light curves are shown in Fig. A.18. In the top example, the Gaussian (depth of 0.15 mag) is less prominent than the ellipsoidal component (peak-to-peak amplitude of 0.25 mag). The middle example shows a case with a stronger Gaussian component (depth of 0.51 mag) than the ellipsoidal amplitude (peak-to-peak amplitude of 0.25 mag). The impact of the Gaussian component on the otherwise sine-like shape of the light curve is clearly visible. The bottom example illustrates a case with a secondary Gaussian much shallower than the primary Gaussian. This last case characterises a small fraction of the candidates in Sample 2GE-B that have Gaussian depth ratios smaller than about 0.3. In Fig. A.16 (bottom panel), they are seen to be an extension of the distribution of Sample 2GE-A towards larger ellipsoidal amplitudes.
Another indication that the Gaussian models of Sample 2GE-Z should be taken with caution comes from their global rankings. The distribution of the rankings of all sources with two Gaussians and a cosine is shown in Fig. A.19 against ∆_0.5(ϕ_ecl). Sample 2GE-Z, defined by ∆_0.5(ϕ_ecl) > 0.07, has a distribution peaked towards low global rankings, while samples 2GE-A and 2GE-B, located at ∆_0.5(ϕ_ecl) < 0.07, are predominantly found at larger rankings. Moreover, the candidates in Sample 2GE-Z that do have large global rankings, have, on average, [...]

[...] 5524068296237910272) shows a case where the period is probably a factor of two too small, the second set (Gaia DR3 3033226683222463872) shows a case where observations may be missing at the phase window of the second eclipse, and the third case (Gaia DR3 5541461814286341632) illustrates a case where the lack of enough phase coverage within an eclipse leads to an overestimate of the eclipse depth.

Samples                        Main binary types                                                   % in catalogue
2G-A, 2GE-A, 2G-D, 1G          Wide systems (detached or, under some conditions, semi-detached)    27%
2G-B, 2G-C, 2GE-B, 1GE, 0GE    Tight systems                                                       56%
2G-X, 2G-Y, 2GE-Z              To be investigated                                                  17%

The categorisation presented in Table A.2 is not intended to provide a thorough classification of the two million eclipsing binary candidates, a task that would require additional analysis. Rather, it offers a convenient quick analysis and overview of the catalogue content. It must also be stressed that the definition of the samples as given in Table A.1 is based on well-defined cuts on σ_p, σ_s, A_ell and ∆_0.5(ϕ_ecl), which introduces an additional source of uncertainty in the classification.
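For quick bookkeeping, the grouping of the samples into broad binary types listed in the table can be written as a simple lookup. The dictionary layout and the function name are illustrative choices; only the sample labels and percentages come from the table.

```python
# Broad binary types, their member samples, and their share of the catalogue (%).
BROAD_TYPES = {
    "Wide systems": (("2G-A", "2GE-A", "2G-D", "1G"), 27),
    "Tight systems": (("2G-B", "2G-C", "2GE-B", "1GE", "0GE"), 56),
    "To be investigated": (("2G-X", "2G-Y", "2GE-Z"), 17),
}

def broad_type(sample):
    """Return the broad type of a sample label, or None if it is not listed."""
    for name, (samples, _percent) in BROAD_TYPES.items():
        if sample in samples:
            return name
    return None
```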

Appendix B: Eccentricity proxy
A proxy for the eccentricity can be derived using the two-Gaussian model results based on the derived relative locations and durations of the eclipses. At small eccentricities, the projected eccentricity e_proxy cos ω can be approximated from the phase separation of the eclipses (Eq. B.1), where e_proxy is the eccentricity proxy and ω is the periastron argument. Equation B.1 is readily computable from the derived model parameters. In addition to eclipse locations, the models also provide the durations w_ecl,1 and w_ecl,2 of the primary and secondary eclipses, respectively. From these parameters, e_proxy sin ω can be computed (Eq. B.2). The uncertainty ε(e_proxy) on the eccentricity proxy can be computed from Eq. B.3 by propagation of the uncertainties ε(ϕ_ecl,1), ε(ϕ_ecl,2), ε(w_ecl,1) and ε(w_ecl,2) of ϕ_ecl,1, ϕ_ecl,2, w_ecl,1 and w_ecl,2, respectively (Eq. B.4).

[...] can be as large as 0.15 even at these small eccentricity proxies. These small eccentricities are therefore compatible with circularised systems.
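The quantities described at the start of this appendix can be sketched as below. The displayed equations are not available in this text, so the code assumes the commonly used small-eccentricity approximations e cos ω ≈ (π/2)(ϕ_ecl,2 − ϕ_ecl,1 − 0.5) and e sin ω ≈ (w_ecl,2 − w_ecl,1)/(w_ecl,2 + w_ecl,1), combined in quadrature. The exact form and sign conventions used in the catalogue processing may differ. The uncertainty is a first-order propagation that assumes independent errors, and the function names are hypothetical.

```python
import math

def eccentricity_proxy(phi1, phi2, w1, w2):
    """Eccentricity proxy from eclipse phases and durations (assumed relations)."""
    dphi = (phi2 - phi1) % 1.0
    e_cos = (math.pi / 2.0) * (dphi - 0.5)
    e_sin = (w2 - w1) / (w2 + w1)
    return math.hypot(e_cos, e_sin)

def eccentricity_proxy_error(phi1, phi2, w1, w2,
                             err_phi1, err_phi2, err_w1, err_w2):
    """First-order uncertainty on the proxy, assuming independent errors."""
    dphi = (phi2 - phi1) % 1.0
    e_cos = (math.pi / 2.0) * (dphi - 0.5)
    e_sin = (w2 - w1) / (w2 + w1)
    # Variances of the two projected components.
    var_cos = (math.pi / 2.0) ** 2 * (err_phi1 ** 2 + err_phi2 ** 2)
    var_sin = 4.0 * ((w2 * err_w1) ** 2 + (w1 * err_w2) ** 2) / (w1 + w2) ** 4
    e = math.hypot(e_cos, e_sin)
    if e == 0.0:
        # The derivative is undefined at e = 0; return a conservative bound.
        return math.sqrt(var_cos + var_sin)
    return math.sqrt(e_cos ** 2 * var_cos + e_sin ** 2 * var_sin) / e
```

Two equal-width eclipses separated by exactly 0.5 in phase give a proxy of zero, while a phase separation of 0.6 with equal widths gives a proxy of about 0.16 under these assumptions.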
In contrast, large eccentricity proxies (e_proxy ≳ 0.3) are found in group 2G-A (top panel in Fig. B.1), which contains well detached systems with no tidal effect. In particular, many short-period systems are seen to have large eccentricity proxies, contrary to the expectation of them being circularised. This can be seen in the figure, where the eccentricity limit, at any given orbital period, above which systems are expected to be circularised is shown by the dashed blue line based on Eq. (4.4) of Mazeh (2008) (with E = 0.98, A = 3.25, B = 6.3 and C = 0.23). However, a careful analysis of the systems with eccentricities larger than this limit, performed in Sect. 5 of the main body of this article, concludes that the eccentricity and/or orbital period of these systems are incorrect. Caution must thus be taken in the interpretation of the two-Gaussian model results when analysing specific cases.
Large eccentricity proxies are also derived for candidates in groups 2G-D, 2G-X, 2G-Y and 2GE-Z (Fig. B.1). These groups, however, have been shown in Sect. 3.1 to have unreliable light curve models. Their large eccentricities are thus mostly spurious, and need confirmation on a case-by-case basis.

Fig. 1. Sky density map of the Gaia catalogue of eclipsing binaries, in Galactic coordinates, colour-coded according to the colour scale shown on the right of the figure.

Fig. 2. Distribution of G magnitude of the full sample (black histogram) and of the samples with global ranking larger than 0.6 (filled green histogram) and smaller than 0.5 (red spiked histogram). The abscissa scale is truncated at the lower side for better visibility.

Fig. 4. Folded light curves of V614 Ven. Top panel: G light curve. The two-Gaussian model is superposed in dotted line. The green areas indicate the derived eclipse durations. Bottom panel: G_BP and G_RP light curves, shifted by a value equal to their respective median magnitudes as written in the top of the panel. The dotted model and green areas shown in the bottom panel are the ones from the G light curve, shifted to match zero median magnitude. The Gaia period, global ranking and the light curve classification (in brackets; see text) are given in the title of the figure after the Gaia DR3 ID and GCVS name.

Fig. 5. Same as top panel of Fig. 4, but for additional eclipsing binaries for which the G light curves are modelled with only two Gaussians. From top to bottom: SW Cyg, V745 Cep, 1687 Aql, NS Cam, and KS Eri.

Fig. 6. Distribution of the derived eclipse depth ratio (secondary over primary) for the various samples having two Gaussians in their light curve models without (top and middle panels) or with (bottom panel) an ellipsoidal component, as labeled in the panels (see text). The histograms are area-normalized.

Fig. 7. Same as top panel of Fig. 4, but for sources with light curves modelled with two Gaussians and a cosine of small to medium amplitude. From top to bottom: AO Ser, DN Cas, MU Cyg and HN Cas.

Fig. 8. Same as top panel of Fig. 4, but for sources with light curves modelled with two Gaussians and a cosine of large amplitude. From top to bottom: β Lyr and W UMa.

Fig. 9. Same as top panel of Fig. 4, but for sources with light curves modelled with only one Gaussian (KN And, top panel), one Gaussian and a cosine (V379 Per in second panel and RZ Col in third panel) or only a cosine (DU Car, bottom panel). No green area is displayed for eclipses that are not modelled with a Gaussian component in the light curve model.

Fig. 10. Same as top panel of Fig. 4, but for additional sources with light curves modelled with at least two Gaussians. From top to bottom: AE Mon, V444 Cyg and RU Gem.

Fig. 11. Distribution of the global ranking of DR3 EB candidates. The full sample is displayed in thick grey, with the candidates therein that have a cross-match with known EBs in the literature shown in thin black. The sample with positive parallax uncertainties better than 15% is displayed in thick green, with the candidates therein with a literature cross-match shown in thin cyan.

Fig. 12. Density map of the signal-to-noise of the G time series (standard deviation of the measurements over root mean square of their uncertainties) of all eclipsing binary candidates versus their global ranking. The density in the map is colour-coded according to the colour scale shown on the right of the figure.

Fig. 13. Same as Fig. 6, but for the global ranking. The histograms of the samples are displayed in two panels as labeled in the panels.

Fig. 14. Distribution of the orbital periods of the DR3 eclipsing binary candidates. The full sample is displayed in black line, and the sample with global ranking larger than 0.6 in filled green. The top panel shows the number of counts per bin on a linear scale, while the bottom panel shows them on a logarithmic scale.

Fig. 15. Density map of the orbital period versus global ranking of the DR3 eclipsing binary candidates.
Figure A.27, in particular, shows the distribution of the global ranking versus G magnitude for each sample.
(black histogram), on a linear scale for the number of

Fig. 16. Period distributions of various samples according to their two-Gaussian model parameters as labeled in the panels (see text). The filled grey histograms represent the combined samples in each panel.
shown in the top panel (samples 2G-A, 2GE-A, 2G-D, 1G). The periods span all values, with a peak at around one day and an extended tail above twenty days. The observed period distributions reflect the real distribution of these mainly detached eclipsing binaries convolved with the (complex) selection function resulting from the Gaia eclipsing binary identification, period determination, and light curve modelling procedures. We also note the presence of the alias peak at the six-hour rotation period. The excess is predominant in Sample 2G-D containing very narrow primary Gaussians (red histogram), while it is absent in Sample 2GE-A (green histogram) where the presence of a small- to medium-amplitude ellipsoidal component (2 A_ell < 0.11 mag) in the light curve model better constrains the period. The periods of tighter systems are shown in the second (samples 2G-B, 2GE-B, 1GE) and third (samples 0GE, 2G-C) panels.
Figure A.26, in particular, shows the distribution of the periods versus G magnitude for each sample.

Fig. 17. Cumulative distribution of the phase deviation r_P,lit at the end of the observation obtained when adopting the literature period P_lit instead of the Gaia period P_Gaia (see Eq. 6). The sample of Gaia eclipsing binary candidates that are also classified as eclipsing binaries in the literature is shown by the filled grey histogram, and the sample which has literature cross-matches but with a classification other than eclipsing binary in the literature is shown by the red histogram.

Fig. 18. Gaia period versus literature period for all Gaia DR3 eclipsing binary candidates that have a cross-match in the literature. Top panel: Candidates also identified as eclipsing binaries in the literature. Bottom panel: Candidates not classified as eclipsing binaries in the literature. The imprints of literature catalogues are visible in the distributions of P_lit (see text).

Table 6. Number of Gaia DR3 eclipsing binary candidates in selected sky areas of the LMC and Galactic Bulge. The first dataset (columns 2-3) includes all global rankings, while the second set (columns 4-5) consists of sources with global_ranking > 0.50.

Fig. 19. Sky distributions (density maps) in equatorial coordinates of eclipsing binary candidates around the LMC. The panels show, from top to bottom, Gaia DR3 candidates, OGLE4 candidates, the Gaia-OGLE4 crossmatches (two arcsecond radius), and the new Gaia candidates with respect to OGLE4. Sources highlighted by filled circles in the bottom panel have their G light curves displayed in Figs. 24 and 25. They are located in the bar of the LMC for the cyan markers and outside the bar for the blue markers. The orange area delineates the sub-region of the sky used in the text to compute the fraction of Gaia new candidates towards the LMC.

Fig. 21. Same as Fig. 19, but for OGLE4 sources in the Galactic Bulge that have no source counterpart in the Gaia DR3 archive.

Fig. 22. Median G magnitude distributions of Gaia eclipsing binaries in specific regions of the sky. Top panel: Sky region towards the LMC shown in Fig. 19. Bottom panel: Sky region towards the Galactic Bulge shown in Fig. 20. The black histograms show all Gaia DR3 eclipsing binary candidates in the given sky area. The sources among them that have or do not have a cross-match (two arcsecond radius) with the OGLE4 catalogue of eclipsing binaries are shown by the green and red histograms, respectively. The blue histograms are the distributions of sources with no OGLE4 cross-match, but which have a cross-match with eclipsing binary candidates identified in other surveys. The abscissa range is truncated on the bright side for better visibility.

Fig. 23. Colour-magnitude (top row) and BP+RP flux excess factor versus colour (bottom row) diagrams of the Gaia eclipsing binaries in the sky area towards the LMC shown in Fig. 19. Left panels: New Gaia eclipsing binary candidates with respect to the OGLE4 catalogue of LMC eclipsing binaries. Middle panels: Gaia sources having a crossmatch (two arcsecond radius) with the OGLE4 catalogue. Right panels: Gaia candidates having no crossmatch with the OGLE4 candidates but having a crossmatch with EB candidates in other surveys. The colour of each bin is proportional to the number of counts per bin according to the colour scales shown on the right for each row. Sources highlighted by filled circles have their G light curves displayed in Figs. 24 and 25. They are located in the bar of the LMC for blue circle markers and outside the bar for cyan diamond markers. The axes ranges have been truncated for better visibility.
Fig. C.6 in Appendix C.

Fig. 24. Same as top panel of Fig. 4, but for five Gaia candidates outside the LMC bar that are not present in the OGLE4 catalogues of eclipsing binaries, sorted with increasing G_BP − G_RP colour from top to bottom. The sources are Gaia DR3 4655269942119876864, 4651511051157939968, 4651581458546544000, 4651853274236480256, 4651018852223135104. Their sky positions are identified in Fig. 19.
Fig. 26. Same as Fig. 22, but for the global ranking.

Fig. 27. BP+RP flux excess factor versus colour of the Gaia eclipsing binaries in the sky area towards the Galactic Bulge shown in Fig. 20. Top panel: Gaia candidates having OGLE4 crossmatches. Bottom panel: Gaia candidates having no OGLE4 crossmatch. The axes ranges have been truncated for better visibility.

Fig. 28. Corrected BP+RP flux excess factor versus colour of Gaia DR3 eclipsing binaries having positive parallax uncertainties better than 15%. Median values of G_BP, G_RP and G are used in all quantities. The axes ranges have been truncated for better visibility.

Fig. 30. Observational HR diagrams of Gaia DR3 stars with good parallaxes. Top panel: Density map of a random sample of ten million stars with parallax uncertainties better than 5% and additional conditions on the number of measurements and source image quality (see text). The yellow lines are evolution tracks of, from bottom to top, 0.8, 1, 1.5 and 2 M⊙ solar-metallicity stellar models from Ekström et al. (2012). Second panel: Density map of Gaia eclipsing binary candidates with parallax uncertainties better than 15% and corrected BP+RP flux excess factors less than 0.5. Contour lines (logarithmic scale) of the sample shown in the top panel are drawn in grey. Third panel: Same as second panel, but colour-coded with the median value of the orbital period in each bin. Bottom panel: Same as second panel, but colour-coded with the median value of the global ranking in each bin. Median values of the G_BP, G_RP and G cleaned time series are used in all panels except in the top one, where mean values are used due to the unavailability of median values for all sources in Gaia DR3. The colours in the figures are coded according to the colour scales on the right of each panel. The ranges of the axes and colour scales are truncated for better visibility.

Fig. 31. Same as third panel of Fig. 30, but for the subset with global rankings larger than 0.6 or belonging to the group 2G-C.

Fig. 32. Period distributions of the good parallax sample. The filled grey histogram represents the full distribution, while the orange, blue, magenta, red and green histograms represent the 2G-A, 2G-B, 2G-C, 2GE-A and 2GE-B subsamples, respectively. Top panel: All candidates in the good parallax sample with C* < 0.5. Bottom panel: Same as top panel, but restricted to candidates with global rankings larger than 0.6 except for group 2G-C.

Fig. 33. Eccentricity proxy versus orbital period of Sample 2G-A eclipsing binaries (well-detached and without ellipsoidal component) that are brighter than 12 mag in G and that have parallax uncertainties better than 15%, corrected BP+RP flux excess factor smaller than 0.5, and global rankings larger than 0.6. The colour of each marker is related to the depth of the secondary eclipse (in G magnitude) according to the colour scale drawn on the right of the figure, with ratios larger than 0.45 rendered in black. The size of each marker is proportional to the secondary over primary depth ratio. The vertical line segments indicate the 1σ uncertainty of the eccentricity proxy. For clarity, it has only been drawn on the small eccentricity side. The blue dashed line is Eq. (4.4) from Mazeh (2008) with E = 0.98, A = 3.25, B = 6.3 and C = 0.23. The five sources above this line that are highlighted with open diamonds have their light curves displayed in Fig. 35.

Fig. A.5. Same as Fig. A.3, but for a case (Gaia DR3 5684917462874690560) with an insufficient coverage of the primary eclipse that leads to a poor constraint of its depth.

Fig. A.6. Density map of the primary eclipse depth versus duration (in days) for the samples 2G-A (top panel), 2G-D (middle panel) and 1G (bottom panel). The primary eclipse duration is taken equal to w_ecl,1 P_orb. The larger occurrence of eclipse depths larger than about one magnitude for eclipse durations between ∼0.07 and ∼0.17 days is linked to the equivalent time intervals between successive observations in the two Gaia fields of view (see text, Sect. A.1). The axes ranges have been limited for better visibility.
Fig. A.6 for eclipse durations between ∼0.07 and ∼0.17 days.

Sample 2G-B. The second sample identified along the line of equal-Gaussian widths in the top panel of Fig. A.1 lies at phase widths between 0.06 and 0.15. It is the most populated region in the diagram. Their larger Gaussian widths lead to the absence of flat inter-eclipse phases. These are tighter binaries than candidates in Sample 2G-A. The phase separation between the two eclipses is close to 0.5 (see second panel from top of Fig. A.1), as expected for these types of eclipsing binaries. The distribution of the eclipse depth ratio (green thick histogram in the top panel of Fig. 6) peaks at one with a tail extending down to below 0.4. Example light curves are shown in Fig. A.7. In Sample 2G-B, spurious cases can happen when a potentially wrong period is obtained. An example of such a case is shown in the second source from top in Fig. A.4. Visual inspec-

Fig. A.7. Same as Fig. A.3, but for two sources in Sample 2G-B of light curves modelled with only two Gaussians.The top set is for EW-type eclipsing binary candidate Gaia DR3 5256648720295981184 and the bottom set is for EB-type eclipsing binary candidate Gaia DR3 1807942504467443456.

Fig. A.8. Same as Fig. A.3, but for two ellipsoidal variable candidates from Sample 2G-C of light curves modelled with only two Gaussians. The top set is for a typical ellipsoidal variable (Gaia DR3 1872983530689181312) and the bottom set is for an ellipsoidal variable with light amplitude modulation (Gaia DR3 1980590328522237824).

Fig. A.9. Same as Fig. 6, but for the global ranking. The histograms are not area-normalized.

Fig. A.10. Density map of the depth of the primary eclipse versus the depth of the deepest Gaussian.The sample of sources whose light curves are modelled with two Gaussians and without an ellipsoidal component is shown in the top panel, while those with two Gaussians and a cosine function are shown in the bottom panel.The density in the maps is colour coded according to the colour scales shown on the right of each panel.The axes ranges have been restricted for better visibility.
Fig. A.1), with 90% of them having a deviation from a 0.5 separation of less than 0.015 in phase. The ones with larger deviations are spurious cases. An example light curve of such an apparently spurious case is shown by the third case in Fig. A.4. The global ranking of the candidates in Sample 2G-C is generally lower than those of Samples 2G-A and 2G-B, with only very few cases above 0.6. This is shown in the top panel of Fig. A.9. Nevertheless, the light curves are very good in the majority of cases.
Fig. A.1. We therefore define Sample 2G-D with σ p 0.02 and σ s 3 σ p. The histogram of their depth ratio is shown by the cyan dotted histogram in Fig. 6 (bottom panel). Most of them have depth ratios smaller than 0.2. The sample contains about 5% of the full catalogue.

Fig. A.11. Same as Fig. A.3, but for two candidates in Sample 2G-D of light curves modelled with only two Gaussians.The top set shows a case with a convincing primary eclipse detection while the secondary eclipse is spurious (Gaia DR3 2056622657089571968).The bottom set shows a seemingly good case from the G light curve (Gaia DR3 4077271415493214080).

Fig. A.13. Same as the top panel of Fig. A.1, but for the sample of sources whose light curves are modelled with two Gaussians and an ellipsoidal component (samples 2GE-A, 2GE-B and 2GE-Z).The dashed line delineates the region defined for Sample 2G-A eclipsing binaries of the models containing two Gaussians but no ellipsoidal component (Table A.1 and Fig. A.2).

Fig. A.14. Peak-to-peak amplitude distributions of the ellipsoidal component (cosine) of the two-Gaussian models in the various samples containing the cosine component as labeled in the figure.The abscissa scale has been limited for better visibility.

Fig. A.15. Density map of the ellipsoidal variation amplitude (peak-to-peak) versus deviation from 0.5 of the phase separation between primary and secondary eclipse locations for the sample of sources having two Gaussians and an ellipsoidal component. The expression of $\Delta_{0.5}(\varphi_{\mathrm{ecl}})$ is given by Eq. (A.1). The dashed lines delineate the three samples 2GE-A (lower-left region), 2GE-B (upper-left region) and 2GE-Z (right region) defined in the text. The axes scales have been limited for better visibility.

Fig. A.16. Density maps of two-Gaussian related quantities versus ellipsoidal amplitude (peak-to-peak) of the samples 2GE-A ($2A_{\mathrm{ell}} < 0.11$) and 2GE-B ($2A_{\mathrm{ell}} \geq 0.11$) of sources having two Gaussians and an ellipsoidal component. Top panel: Primary Gaussian depth, with 1:1 (solid), 2:1 (upper dashed) and 1:2 (lower dashed) lines to guide the eye. Bottom panel: Secondary to primary Gaussian depths ratio. The axes scales are truncated for better visibility.
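The amplitude criterion separating Samples 2GE-A and 2GE-B can be sketched as follows. This is an illustration only: the function name is hypothetical, and the input is assumed to be a source already known to lie outside Sample 2GE-Z, whose boundary in phase separation is not quoted in this caption.

```python
def classify_2ge_by_amplitude(peak_to_peak_ell):
    """Split non-2GE-Z sources on the peak-to-peak ellipsoidal amplitude 2*A_ell.

    Assumption: the caller has already excluded Sample 2GE-Z, which is
    defined from the eclipse phase separation rather than from the amplitude.
    """
    threshold = 0.11  # value of 2*A_ell quoted in the caption of Fig. A.16
    return "2GE-A" if peak_to_peak_ell < threshold else "2GE-B"


# Usage with illustrative amplitudes
for amplitude in (0.03, 0.11, 0.40):
    print(amplitude, classify_2ge_by_amplitude(amplitude))
```

A source sitting exactly at the threshold is assigned to Sample 2GE-B, following the non-strict inequality given for that sample.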

Fig. A.17. Same as Fig. A.3, but for three candidates in Sample 2GE-A of light curves modelled with two Gaussians and an ellipsoidal component.The top set shows a case with a weak ellipsoidal variability (Gaia DR3 1920242361505734784), the middle set with a mild ellipsoidal component (Gaia DR3 509661748731132800), and the bottom set with very different eclipse depths (Gaia DR3 520547841550454784).

Fig. A.18. Same as Fig. A.3, but for three candidates in Sample 2GE-B of light curves modelled with two Gaussians and an ellipsoidal component.The top case shows an example of light curve where the ellipsoidal component is the major contributor to the light shape (Gaia DR3 3444083186030598272), while the second case shows an example with a larger Gaussian depth than the ellipsoidal variability amplitude (Gaia DR3 249126318130979840).The bottom case exemplifies a Sample 2GE-B source with a small depth ratio (of 0.14) between the secondary and primary Gaussian depths.

Fig. A.19. Same as Fig. A.15, but for the global ranking versus eclipse phase separation relative to 0.5. Top panel: Density map. Bottom panel: Primary to secondary Gaussian depth ratio colour-coded according to the colour scale shown on the right of the panel.

Fig. A.20. Same as Fig. A.3, but for three candidates in Sample 1G of light curves modelled with only one Gaussian. The top set (Gaia DR3 5524068296237910272) shows a case where the period is probably a factor of two too small, the second set (Gaia DR3 3033226683222463872) shows a case where observations may be missing at the phase window of the second eclipse, and the third set (Gaia DR3 5541461814286341632) illustrates a case where the lack of sufficient phase coverage within an eclipse leads to an overestimate of the eclipse depth.

Fig. A.21. Same as the top panel of Fig. A.16, but for the sample 1GE with one Gaussian and an ellipsoidal component.The axes scales are kept identical to those in Fig. A.16 for easy comparison.

Fig. A.22. Same as Fig. A.3, but for three candidates in Sample 1GE of light curves modelled with one Gaussian and an ellipsoidal component. The top, middle and bottom sets show cases with large (Gaia DR3 5310453356151345408), mild (Gaia DR3 4513043989925301760), and small (Gaia DR3 4052342222634008320) ellipsoidal component relative to the Gaussian depth. The secondary eclipse of the middle case may have gone undetected due to a lack of Gaia measurements at a phase 0.5 away from the primary eclipse.

Fig. A.23. Density map of the signal-to-noise (computed as the ratio of the standard deviation over the root-mean-square of the G magnitude uncertainties) versus G magnitude for the sample of candidates modelled with only a cosine function (Sample 0GE). The contours delineate the density of sources in the sample 2GE-B (modelled with two Gaussians and an ellipsoidal component with large amplitude; see text). Six contours are shown on a linear scale of the density of sources on the map.
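The signal-to-noise definition given in this caption can be sketched as below. The function name is illustrative, and the use of the sample standard deviation (rather than the population one) is an assumption, since the caption does not specify the normalisation.

```python
import math
import statistics


def signal_to_noise(g_mags, g_mag_errors):
    """Standard deviation of the G magnitudes over the RMS of their uncertainties."""
    # Assumption: sample standard deviation (n - 1 in the denominator).
    scatter = statistics.stdev(g_mags)
    rms_error = math.sqrt(sum(e * e for e in g_mag_errors) / len(g_mag_errors))
    return scatter / rms_error


# Usage with made-up measurements: a 0.2 mag modulation with 0.01 mag uncertainties
mags = [10.0, 10.2, 10.0, 10.2]
errors = [0.01, 0.01, 0.01, 0.01]
print(round(signal_to_noise(mags, errors), 2))
```

A value close to one indicates a scatter comparable to the measurement uncertainties, so that the variability is not significant by this measure.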

Fig. A.25. Same as Fig. A.3, but for three candidates in Sample 0GE (light curves modelled with only a cosine). The first two examples show the most common cases of faint candidates, with (top case, Gaia DR3 5923298253884922624) and without (middle case, Gaia DR3 4037337565429141376) a clear confirmation of the variability in the $G_{\mathrm{BP}}$ and $G_{\mathrm{RP}}$ light curves. The bottom example shows a relatively rarer case in this sample of a bright candidate (Gaia DR3 5236421790821776256).

Fig. A.26. Density maps of period versus G magnitude for the various samples defined in Table A.1.

Fig. A.27. Same as Fig. A.26, but for global ranking versus G magnitude.
Figures A.26 and A.27 summarise the periods and global rankings versus G magnitude for each sample.

Fig. B.1. Density maps of the eccentricity proxy versus orbital period for samples with good parallaxes and global rankings as defined in the text. The dashed lines are Eq. (4.4) from Mazeh (2008) with E = 0.98, A = 3.25, B = 6.3 and C = 0.23. The axes ranges are truncated for better visibility.

Fig. B.3. Histograms of the relative uncertainty on the eccentricity proxy for various samples with conditions on the parallax, corrected BP+RP flux excess factor and global ranking as written in the figure.

Fig. C.1. $G_{\mathrm{BP}}$ and $G_{\mathrm{RP}}$ folded light curves of the eclipsing binaries whose G light curves are shown in Fig. 5 in the main body of the article. The $G_{\mathrm{BP}}$ and $G_{\mathrm{RP}}$ magnitudes are shifted by a value equal to the median magnitudes of their respective light curves, given in the top of each panel. The dotted lines represent the two-Gaussian models determined from the G light curves, with the green areas indicating the derived eclipse durations.

Figures C.1 to C.5 show the $G_{\mathrm{BP}}$ and $G_{\mathrm{RP}}$ folded light curves of the eclipsing binaries displayed in Sect. 3.1, except for V614 Ven which has its $G_{\mathrm{BP}}$ and $G_{\mathrm{RP}}$ light curves already shown in Fig. 4. Figure C.6 shows the magnitude distributions of the OGLE4 samples of eclipsing binaries used in Sect. 4.2 to estimate the completeness of the Gaia catalogue. The figure plots the distributions of the OGLE4 sources in the I, V and G bands, separately for the OGLE4 samples towards the LMC (top panel), SMC (middle panel) and Galactic Bulge (bottom panel). The G distribution of the Gaia-OGLE4 crossmatches is also shown in each panel by the filled blue histograms. Figure C.7 shows the observational diagram of the sample of eclipsing binary candidates with parallax uncertainties better than 15%, with the absolute magnitudes $M_G$ shifted by 0.75 mag to compare with the distribution of a random sample of Gaia DR3 sources. For this purpose, the contour lines of the random ten million sources shown in the top panel of Fig. 30 (see Sect. 5 in the main text of the article) have not been shifted by 0.75 mag. Figure C.8 shows the light curves of two Gaia candidates discussed in Sect. 5, folded with the ASAS-SN periods of the respective cross-matched ASAS-SN sources.

Fig. C.2. Same as Fig. C.1, but for the eclipsing binaries whose G light curves are shown in Fig. 7 in the main body of the article.

Fig. C.3. Same as Fig. C.1, but for the eclipsing binaries whose G light curves are shown in Fig. 8 in the main body of the article.

Fig. C.8. Light curves of Gaia DR3 4524651705941314432 and Gaia DR3 5712304991851559040 shown in Fig. 35 in the main body of the text, folded with the periods published by the ASAS-SN survey for the respective cross-matches (ASASSN-V J075432.26-211826.4 with P = 3.3972697 d, and ASASSN-V J184156.16+192755.8 with P = 5.9965616 d, respectively).
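Folding a light curve with a given period, as done for this figure, amounts to converting each observation time into an orbital phase. The sketch below is a generic illustration: the function name, the reference epoch `t0` and the observation times are all hypothetical, and only the period value is taken from the caption.

```python
def fold_times(times, period, t0=0.0):
    """Return the phase in [0, 1) of each observation time for a given period.

    times and t0 must be in the same unit as period (days here).
    """
    return [((t - t0) / period) % 1.0 for t in times]


# Usage with one of the ASAS-SN periods quoted in the caption
asas_sn_period = 3.3972697             # days
observation_times = [0.0, 1.0, 10.0]   # made-up observation times (days)
phases = fold_times(observation_times, asas_sn_period)
```

Plotting the magnitudes against `phases` instead of time gives the folded light curve; comparing the result obtained with the Gaia and ASAS-SN periods shows which period yields the more coherent curve.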

Table 1. Types of geometric models fitting the G light curves. The columns give the model type, the number of model parameters, the ranking used in model prioritization, a description of the model type, and the number of sources of the given type in the Gaia DR3 table vari_eclipsing_binary.

Table 2. Data fields in the Gaia DR3 table of eclipsing binaries (Gaia DR3 table vari_eclipsing_binary), with their units (col. 2), the mathematical symbol used in this paper (if used, col. 3), and a short description (col. 4).
source_id - Unique source identifier of the EB candidate
model_type - Geometric model type fitting the G light curve (Table 1)
num_model_parameters - Number of free parameters of the geometric model

Table 3. Surveys cross-matched with the Gaia DR3 catalogue of eclipsing binaries. The first column gives the survey. The second column gives the percentage of cross-matched sources belonging to that survey. The third column gives the catalogue label(s) used by Gavras et al. (2022), on which we based our cross-matches. The literature references corresponding to the catalogue labels are given in their Table 1.

Table 4. Some statistics on period comparison between the Gaia DR3 catalogue of eclipsing binaries and literature data. The numbers of sources with parallax uncertainties better than 10% are indicated in italics below the parent sample. See text for a description of the table.

Table A.2. Main binary system types expected in the samples defined in Table A.2.