Issue | A&A Volume 682, February 2024 | |
---|---|---|
Article Number | A5 | |
Number of page(s) | 21 | |
Section | Stellar structure and evolution | |
DOI | https://doi.org/10.1051/0004-6361/202347694 | |
Published online | 26 January 2024 |
Classification and parameterization of a large Gaia sample of white dwarfs using XP spectra⋆
1
Département de Physique, Université de Montréal, Montréal, Québec, Canada
e-mail: o.vincent@umontreal.ca
2
School of Physics & Astronomy, University of Leicester, Leicester, UK
e-mail: mab@leicester.ac.uk
3
Astronomisches Rechen–Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12–14, 69120 Heidelberg, Germany
e-mail: jordan@ari.uni-heidelberg.de
Received: 10 August 2023
Accepted: 22 October 2023
Context. The latest Gaia data release in June 2022, DR3, in addition to the refinement of the astrometric and photometric parameters from DR2, added a number of important data products to those available in earlier releases, including radial velocity data, information on stellar multiplicity, and XP spectra of a selected sample of stars. Gaia has proved to be an important search tool for white dwarf stars, which are readily identifiable from their absolute G magnitudes as low luminosity objects in the Hertzsprung–Russell (H–R) diagram. Each data release has yielded large catalogs of white dwarfs, containing several hundred thousand objects, far in excess of the numbers known from all previous surveys (∼40 000). While the normal Gaia photometry (G, GBP, and GRP bands) and astrometry can be used to identify white dwarfs with high confidence, it is much more difficult to parameterize the stars and determine the white dwarf spectral type from this information alone. Observing all stars in these catalogs with follow-up spectroscopy and photometry is also a huge logistical challenge with current facilities.
Aims. The availability of the XP spectra and synthetic photometry presents an opportunity for a more detailed spectral classification and measurement of the effective temperature and surface gravity of Gaia white dwarfs.
Methods. A magnitude limit of G < 17.6 was applied to the routine production of XP spectra for Gaia sources, which would have excluded most white dwarfs. Therefore, we created a catalog of 100 000 high-quality white dwarf identifications for which XP spectra were processed, with a magnitude limit of G < 20.5. Synthetic photometry was computed for all these stars, from the XP spectra, in Johnson, SDSS, and J-PAS, published as the Gaia Synthetic Photometry Catalog – White Dwarfs (GSPC-WD). We took this catalog and applied machine learning techniques to provide a classification of all the stars from the XP spectra. We have then applied an automated spectral fitting program, with χ-squared minimization, to measure their physical parameters (effective temperature and log g) from which we could estimate the white dwarf masses and radii.
Results. We present the results of this work, demonstrating the power of being able to classify and parameterize such a large sample of ≈100 000 stars. We describe what we can learn about the white dwarf population from this dataset. We also explored the uncertainties in the process and the limitations of the dataset.
Key words: techniques: spectroscopic / stars: fundamental parameters / white dwarfs
The data that support the findings of this study are openly available. The spectroscopic and astrometric data can be accessed from the official Gaia archive servers https://gea.esac.esa.int/archive. A full description of the catalogue produced by this paper can be found in Table 3, and the catalogue can be downloaded from http://www.astro.umontreal.ca/~ovincent/catalogues, along with the u band correction bins shown in Fig. 6. The results of this paper are also available both on the MWDD website http://montrealwhitedwarfdatabase.org and at the CDS via anonymous ftp to cdsarc.cds.unistra.fr (http://cdsarc.u-strasbg.fr) or via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/682/A5
© The Authors 2024
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1. Introduction
The publication of the data from the ESA Gaia space mission (Gaia Collaboration 2016, 2018b, 2021, 2023b) has revolutionized our understanding of white dwarfs by providing precise astrometric and photometric measurements, allowing for detailed studies of their properties, distributions, and evolutionary pathways. Gaia Collaboration (2018a) has revealed a previously unseen division in the white dwarf sequence on the color-magnitude diagram, providing unprecedented detail. Specifically, the Q branch and the A-B bifurcation have been identified. The Q branch is now believed to be caused by energy released during the crystallization of the white dwarf core (Tremblay et al. 2019b), while the origin of the A-B bifurcation remained unclear until recently. The A branch primarily consists of white dwarfs with hydrogen-rich atmospheres, while the presence of the B branch has long been unexplained. El-Badry et al. (2018) attributed the existence of the B branch to a flattening in the initial-to-final–mass relation (IFMR), resulting in a secondary peak in the white dwarf mass distribution at approximately 0.8 M⊙. Soon after, Kilic et al. (2018) also suggested the presence of this secondary peak but attributed it to the occurrence of stellar mergers, while Bergeron et al. (2019) have shown that adding an invisible trace of hydrogen to the B-branch stars caused their mass to move closer to the fiducial 0.6 M⊙. Only recently have Camisassa et al. (2023) and Blouin et al. (2023) been able to explain the B branch using white dwarf models that include helium and a small amount of carbon contamination, which cannot be detected optically, showing that the mass distribution of these white dwarfs is consistent with the mass distribution observed in hydrogen-rich white dwarfs and their standard evolutionary pathways. Furthermore, Blouin et al. (2023) have shown that neither the convective mixing of residual hydrogen nor the accretion of hydrogen or metals can be the dominant drivers of the bifurcation.
Gaia has uncovered a significant number of new white dwarfs, resulting in the identification of 73 000 white dwarfs within the 100-parsec Solar neighborhood in Gaia Data Release 2 (Gaia Collaboration 2018b) by Jiménez-Esteban et al. (2019). A total of 260 000 high-probability white dwarfs were discovered by Gentile Fusillo et al. (2019), not restricting the search to the close Solar neighborhood. The utilization of Gaia Early Data Release 3 (Gaia Collaboration 2021) by Gentile Fusillo et al. (2021) further increased the number of high-confidence white dwarf candidates to 359 000. This represents roughly an order of magnitude increase compared to the previous number of known white dwarfs. As a result, several comprehensive analyses of Gaia white dwarfs belonging to various spectral classes were conducted (e.g., Tremblay et al. 2019b; Bergeron et al. 2019; Coutu et al. 2019; Caron et al. 2023).
The photometry provided by Gaia, from Data Release 1 (Gaia Collaboration 2016) to Early Data Release 3 (Gaia Collaboration 2021), utilized wide passbands (G, GBP, and GRP), which are not ideal for precise model atmosphere modeling. Consequently, additional surveys employing narrower bands have been incorporated alongside Gaia data to enable more comprehensive analyses.
In Gaia Data Release 3 (Gaia Collaboration 2023b), a significant advance was made by publishing more than 200 million low-resolution spectra from the Gaia blue and red photometers (BP and RP), covering a wavelength range of 330 nm ≤ λ ≤ 1050 nm. These spectra, referred to as XP spectra, have provided invaluable data (Gaia Collaboration 2023a). Among these XP spectra, nearly 100 000 are attributed to white dwarfs, offering a substantial resource for further analysis.
The spectra obtained from Gaia Data Release 3 can be converted to synthetic photometry in physical units for any passband within the wavelength range they cover. This allows for direct comparisons with photometric data obtained by other surveys, such as the SDSS, and facilitates the interpretation of synthetic photometry derived from Gaia through model calculations specific to these passbands. Tables of synthetic photometry were published by Gaia Collaboration (2023a); they include standardized photometry for over 200 million sources across multiple widely used photometric systems.
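The conversion from a sampled spectrum to a synthetic magnitude can be sketched as follows. This is a minimal illustration and not the pipeline used for the published tables: the function names are hypothetical, the passband is treated as a photon-counting response sampled on the same wavelength grid as the spectrum, and the flux is assumed to be in W m^-2 nm^-1 with wavelengths in nm.

```python
import math

C_NM_PER_S = 2.99792458e17   # speed of light in nm/s
AB_ZERO_FLUX = 3.631e-23     # 3631 Jy expressed in W m^-2 Hz^-1


def trapezoid(x, y):
    """Integrate y(x) with the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def synthetic_ab_magnitude(wavelength_nm, f_lambda, throughput):
    """AB magnitude of a sampled spectrum through a photon-counting passband.

    wavelength_nm : increasing wavelengths in nm
    f_lambda      : flux density in W m^-2 nm^-1 (assumed units)
    throughput    : dimensionless passband response on the same grid
    """
    # Photon-weighted mean flux density per unit frequency:
    # <f_nu> = int(f_lambda * S * lambda dlambda) / int(S * c / lambda dlambda)
    num = trapezoid(wavelength_nm,
                    [f * s * w for f, s, w in zip(f_lambda, throughput, wavelength_nm)])
    den = trapezoid(wavelength_nm,
                    [s * C_NM_PER_S / w for s, w in zip(throughput, wavelength_nm)])
    mean_f_nu = num / den
    return -2.5 * math.log10(mean_f_nu / AB_ZERO_FLUX)
```

A spectrum with constant f_nu equal to 3631 Jy returns a magnitude of zero in any passband, which is a convenient sanity check for the unit handling.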
In order to leverage the full potential of the large data sets provided by the latest surveys, statistical algorithms have become essential in the field of astronomy. Machine learning tools for the analysis of white dwarfs have recently begun emerging, such as the WDTools package for spectroscopic parameter inference of DAs (Chandra et al. 2020) and the classification pipeline of Vincent et al. (2023), which contains several neural network classifiers for WD primary spectral type classification, as well as WD candidate and WD+MS binary system identification. The identification of WD+MS systems within Gaia has been studied in greater detail by Echeverry et al. (2022), who tested the Random Forest classifier algorithm using a realistic population of white dwarfs with XP-quality synthetic spectra.
In this paper, we spectroscopically classify the white dwarfs of the GSPC sample using machine learning techniques and measure their stellar parameters. We describe the GSPC data in Sect. 2 and our classification methodology and results in Sect. 3. We outline the parameterization procedure in Sect. 4 and verify the consistency between the measured physical parameters and spectral classifications. We discuss selected results in Sect. 5 and give our concluding remarks in Sect. 6.
2. Gaia DR3 and the catalogue of synthetic photometry
One of the new outputs from the Gaia DR3 catalogue, compared to DR2 and EDR3, was the publication of flux-calibrated low-resolution spectrophotometry for ≈220 million sources (Gaia Collaboration 2023b). Gaia produces low resolution (R ≈ 50) spectra in each of its blue and red photometric channels, labeled BP and RP respectively. Dispersion is provided by a prism. The raw spectra are calibrated and merged into a single XP spectrum, as described by De Angeli et al. (2023) and Montegriffo et al. (2023). The spectra are defined as an array of coefficients to be applied to a set of basis functions. Sampled spectra and synthetic photometry can be recovered from these coefficients using the GaiaXPy Python library1. An extensive survey of the possible uses of synthetic photometry from the Gaia XP spectra is presented in a paper accompanying the Gaia DR3 data release (Gaia Collaboration 2023a).
In DR3 only mean spectra are available, the average from all valid observations included in the data processing pipeline. Furthermore, these are selected to have a reasonable number of observations (more than 15 transits) and to be sufficiently bright to ensure a good signal-to-noise ratio (S/N), generally limiting the G magnitude to G < 17.65. However, if applied across the whole catalogue, this magnitude limit would exclude large numbers of interesting classes of fainter objects. Therefore, a few samples of specific objects that could be as faint as G ≈ 21.43 were added: including about 500 sources used for the calibration of the BP/RP data, a catalogue of about 100 000 white dwarf candidates, 17 000 galaxies, about 100 000 QSOs, about 19 000 ultra-cool dwarfs, 900 objects that were considered to be representative for each of the 900 neurons of the self-organizing maps (SOMs) used by the outlier analysis (OA) module (see Sect. 9) and 19 solar analogs (De Angeli et al. 2023).
The input catalogue of white dwarfs used to select suitable sources for generating XP spectra is described in detail in Gaia Collaboration (2023a). It derives from earlier work to identify white dwarfs in the Gaia DR2 and eDR3 catalogues by Gaia Collaboration (2018a) and Gentile Fusillo et al. (2019, 2021), and from the Gaia Catalogue of Nearby Stars (Smart et al. 2021). The input sample was drawn from the eDR3 release and designed to span the complete range of magnitudes and colours of white dwarfs, defined as high-probability white dwarf candidates by their location in the H–R diagram. The following criteria were applied:
- Eqs. (1)–(9) detailed in Gentile Fusillo et al. (2019)
- astrometric_excess_noise < 5
- phot_bp_mean_flux_over_error < 20
- phot_rp_mean_flux_over_error < 20
- parallax/parallax_error < 10
- phot_g_mean_flux_over_error < 20
- log(parallax_over_error) < 1.56 (log(10^3/parallax) − 3.17) + 0.96.
This yields a sample of 102 000 white dwarfs, five times the number of known spectroscopically identified objects. Most of the objects have G < 19.5, but about 30% are fainter. The effective G magnitude cut-off is ≈20. This catalog corresponds to the list of objects included in the Gaia Synthetic Photometry Catalogue for White Dwarfs (GSPC-WD). The completeness of the catalog essentially follows the same trends as those in Gentile Fusillo et al. (2021) up to ∼50 pc, after which the number of XP objects starts diminishing. To illustrate this, we plot in Fig. 1 the number of white dwarf candidates with PWD > 0.65 from the Gentile Fusillo et al. (2021) catalogue along with the number of objects in the GSPC-WD catalogue as a function of distance. For reference, the 50 pc and 100 pc distances are highlighted with gray dashed lines. We also looked for the color-dependent drop in completeness noticed by Jiménez-Esteban et al. (2023), who analyzed the 100 pc sample of the GSPC-WD catalogue and found a gradual decrease of completeness with redder GBP − GRP colors. We plotted the ratio between the number of objects in the GSPC-WD catalogue and objects in the Gentile Fusillo et al. (2021) white dwarf candidate catalogue with PWD > 0.75 against GBP − GRP color bins (not shown here), and found a small decrease of GSPC-WD objects around GBP − GRP = 0.2 that remained nearly constant for redder colors.
Fig. 1. Number of objects in the Gentile Fusillo et al. (2021) white dwarf candidate catalogue with PWD > 0.65 (red line) and number of objects in the GSPC-WD catalogue (black line) as a function of distance. The gray dashed lines indicate distances of 50 and 100 pc. |
Figure 2 shows examples of typical XP spectra, generated from the coefficients, for the six primary white dwarf spectral types: DA, DB, DC, DO, DQ, and DZ, spanning a range of magnitudes. These objects are previously known white dwarfs with spectral types confirmed by the Sloan Digital Sky Survey (SDSS). The strongest H Balmer lines (λ6562, λ4861, λ4340) are visible in the DA spectra, while He I lines (λ5876, λ4471, λ3889) and the calcium lines (λ3934, λ3969) are identifiable in the DBs and DZs, respectively. However, the He II λ4686 line typically used for classification of DOs appears invisible at this resolution. The lowest S/N regions correspond to the extremes of the wavelength range covered by the spectra, where the XP coefficients generate a pattern of oscillations due to the noise.
Fig. 2. Gallery of XP spectra for Gaia objects with SDSS-confirmed spectroscopic type (indicated in the top right corner of each panel). Three spectra of varying G magnitudes are displayed to illustrate the difference between brighter and fainter stars. Spectral lines typically used for classification are displayed for reference (see text). The Gaia DR3 source identification and the spectroscopic type probability predicted by our classifiers (Pclass) are also shown. |
The following sections describe how we have applied machine learning to classify the GSPC-WD stars and determine their physical parameters using the synthetic colours derived from XP spectra. We then present the results we have obtained by studying this large sample of objects, which increases the number of white dwarfs of known spectral type by more than a factor of 5.
3. A machine learning approach to classification of the white dwarf XP spectra
3.1. Methodology
Spectral classification of white dwarf XP spectra in the Gaia XP sample is conducted using a “one-versus-all” approach. This method employs binary classifiers that are trained to distinguish a specific class from all other white dwarfs. To perform the classification, we utilize the Scikit-learn GradientBoostingClassifier (Pedregosa et al. 2011), which is an ensemble method that trains multiple regression trees by minimizing a differentiable loss function and then combines them into a powerful model. Further details regarding the hyperparameters employed and data preprocessing can be found in Appendix A.
Our classifiers take the 110 XP coefficients as input and provide a probability ranging from 0 to 1, indicating the likelihood of a spectrum belonging to a particular class. The spectral type with the highest probability is assigned to the object if it surpasses a minimum confidence threshold, as discussed in the subsequent section. In our study, the classifiers are trained to distinguish six primary spectral types: DA, DB, DC, DO, DQ, and DZ. It is important to note that our classification scheme solely accounts for the primary spectral type and does not attempt to identify secondary signatures. For instance, in our approach, a white dwarf spectrum exhibiting both hydrogen and neutral helium lines would be classified as DA or DB based on the relative strengths of these lines. Conversely, in the traditional system proposed by Sion et al. (1983), the spectrum would be classified as either DAB or DBA, where the spectral type order also depends on the relative strengths of the lines.
The training labels utilized in our study were obtained from the Gaia-SDSS catalogue described in Vincent et al. (2023). This catalogue provides a robust data-driven classification of primary spectral types for 27 866 unique Gaia white dwarfs, employing spectra from the SDSS Data Release 17 (Abdurro’uf et al. 2022). The catalogue assigns a probability Pclass to indicate the likelihood of a spectrum belonging to one of the following 13 classes: DA, DB, DC, DO, DQ, hotDQ, DZ, DAH, PG1159, cataclysmic variable, sdB, sdO, or sdBO. As recommended in Vincent et al. (2023), we restricted our selection to objects with a classification confidence Pclass > 0.6 to ensure reliable spectroscopic classifications for our training data. Additionally, we required objects to possess at least one SDSS spectrum with a signal-to-noise ratio above 9 to further enhance the reliability of the spectroscopic classifications. These criteria resulted in a dataset consisting of 13 743 unique Gaia objects with spectroscopic classification. Table 1 displays the number of objects for each of the six primary spectral types used to train and validate the XP classifiers.
Training examples and average precision and recall test scores of the cross-validation and top-5 classifier ensembles at a 0.6 threshold.
We note that the usage of spectroscopic classifications from the SDSS as our training labels may induce undesirable effects. The division of white dwarfs in different spectroscopic classes is strictly based on their spectral appearance and, depending on the resolution of the available spectrum, the classification can change. In this work, there is an underlying assumption that the SDSS spectra used for training (R ∼ 1800) classified in the different subclasses (DA, DB, DC, DQ, DZ) would have the same classification in the XP spectra (R ∼ 50). This obviously cannot be true for all objects. This may lead to the classifier assigning meaning to nonphysical features in the low-res spectrum, such as the shape of the continuum. We thus advise the reader to interpret the classifier outputs not as traditional spectral classifications, but as predictions of what the spectral type is likely to be, based on the XP spectra. This caveat will likely persist until methods to remove the spectral continuum from the XP coefficients and/or to transform synthetic spectra into XP coefficients, allowing creation of a sufficiently large training data set for each class, are developed.
To validate our choice of classification algorithm, we employed cross-validation. For each class, we trained 20 gradient boosting classifier models using distinct data splits. Each split comprised a random selection of 10% of the data for the test set, 10% for the validation set, and 80% for training, where the proportions are enforced for both the positive class (e.g., DA) and negative class (all other WD). We assessed the mean precision and recall scores at a classification confidence threshold of 0.6 on the test set of the 20 models, which are presented in Table 1 as P_c-v and R_c-v, respectively. The overall performance is acceptable, with every class achieving precision and recall scores of ≳60%. Although more complex classifier models might yield slightly higher scores, the limited number of objects (less than 1000) for most classes increases the risk of overfitting. Furthermore, there is no guarantee that the spectral features used to classify the objects when employing SDSS spectra remain discernible in the lower-resolution XP spectra. This situation could lead to complex models focusing excessively on learning features that do not exist. Consequently, a certain amount of error is expected in the training labels due to the invisibility of spectral features at low resolution for certain objects.
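The split and scoring described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used in the paper: the function names are hypothetical, labels are booleans marking the positive class, and an object counts as a positive prediction when its probability exceeds the threshold.

```python
import random


def stratified_split(labels, test_frac=0.1, val_frac=0.1, seed=0):
    """Return (train, val, test) index lists that keep the class proportions."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for cls in (True, False):  # positive class, then all other white dwarfs
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(test_frac * len(idx))
        n_val = round(val_frac * len(idx))
        test += idx[:n_test]
        val += idx[n_test:n_test + n_val]
        train += idx[n_test + n_val:]
    return train, val, test


def precision_recall(probs, labels, threshold=0.6):
    """Precision and recall of the positive class at a probability threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p > threshold and y)
    fp = sum(1 for p, y in zip(probs, labels) if p > threshold and not y)
    fn = sum(1 for p, y in zip(probs, labels) if p <= threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Repeating the split with 20 different seeds and averaging the resulting test scores would reproduce the structure of the cross-validation, with one classifier trained per split.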
Examining individual spectral types, the DA and DB classifiers exhibit excellent performance, with both precision and recall values exceeding 95%. This outcome can be attributed to the larger number of objects in these classes and their easily distinguishable spectral features. The DC classifiers display the poorest performance, although still acceptable, with approximately 66% precision and recall. Spectral features used to classify SDSS spectra may become invisible, causing many objects labeled as non-DC to be classified as DC at XP resolution. Conversely, objects labeled as DC are highly likely to possess correct labels, as no new features become visible when the resolution is downgraded. This assurance allows us to consider high-confidence DC classifications as genuine. The DQ and DZ classifiers demonstrated good recall performance of around 85%, but relatively low precision of approximately 60%. The lower performance for these spectral types is anticipated, as the spectral features typically used to classify them, such as Swan bands or other carbon absorption lines for DQ white dwarfs and Ca II absorption features for DZ (Kleinman et al. 2013; Coutu et al. 2019; Kepler et al. 2021), become increasingly challenging to distinguish in low-resolution and/or low signal-to-noise spectra. This difficulty is evident in the XP spectra gallery displayed in Fig. 2. DO white dwarfs also encounter this issue, with the He II λ4686 line being virtually invisible in most spectra. The DO cross-validation scores for precision and recall were both 76%, which may seem acceptable at first glance. However, these scores should be interpreted with caution due to the limited number of objects. The DO classifiers are extremely sensitive to the choice of training and testing data, resulting in individual model performance scores fluctuating between 60% and 100%.
To mitigate the biases learned by individual models, we performed ensemble learning by combining the top five performing models of each class based on their test F-scores, that is, the harmonic mean of precision and recall on the test set. We obtained the final class probability by averaging the predictions of the ensemble models. Table 1 presents the average precision and recall scores of the top-5 ensembles at a threshold of 0.6 for comparison with the cross-validation scores. Ensembling the top-5 models yielded a minor improvement of a few percentage points over the cross-validation results, except for the DO classifiers, which improved by approximately 20%. This improvement highlights the performance variance resulting from the limited number of objects. The top-5 ensembles were employed to classify the GSPC-WD sample, as discussed in the subsequent section.
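The ensembling step can be illustrated with a short sketch. It assumes the F-score is the usual F1 score (the harmonic mean of precision and recall) and that each model returns one probability per object; the function names and data layout are hypothetical.

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def top_k_ensemble(model_scores, model_probs, k=5):
    """Average the probabilities of the k models with the best test F-scores.

    model_scores : list of (precision, recall) test scores, one per model
    model_probs  : list of per-object probability lists, one per model
    """
    ranked = sorted(range(len(model_scores)),
                    key=lambda m: f_score(*model_scores[m]),
                    reverse=True)[:k]
    n_obj = len(model_probs[0])
    return [sum(model_probs[m][i] for m in ranked) / len(ranked)
            for i in range(n_obj)]
```

Averaging probabilities rather than taking a majority vote keeps the ensemble output continuous, so a single confidence threshold can still be applied afterwards.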
3.2. Classification of the GSPC-WD sample
We started by refining the selection of our initial 102 000 WD candidates following Vincent et al. (2023), which employed neural networks and 13 Gaia parameters to estimate the probability of an object being a white dwarf (PWD) for approximately 1.3 million Gaia objects described in Gentile Fusillo et al. (2021). Their machine learning approach produced superior results compared to density estimation on the Gaia H–R diagram, particularly in regions where the white dwarf locus and main sequence stars overlap. We select high-confidence white dwarf candidates with XP spectra by applying a PWD > 0.9 cut and a standard deviation limit of 0.02 on PWD, retaining a total of 100 886 objects.
To spectroscopically classify the GSPC-WD sample, we employed the top-5 classifier models for each spectral type, as described in the previous section. The mean of their predictions was used as the final classification probability. We determined the optimal confidence threshold for each ensemble by plotting the mean F-scores against threshold values as shown in Fig. 3. A threshold of approximately 0.65 maximizes the F-score for all classes. Objects with probabilities below 0.65 for any spectral type were still classified according to the most probable class but labeled as uncertain by adding a colon annotation (e.g., “DA:”). Feeding the 100 886 objects to the classifier ensembles, we obtain 89 188 high-confidence classifications, while 11 698 objects remained with uncertain classifications. The number of high-confidence classified objects per spectral type is presented in Table 2, along with the number of objects not found in the Gaia-SDSS catalogue (Nnew) for each class. The uncertain objects primarily constitute the fainter end of our sample, with an average G magnitude of 18.9. Lowering the confidence threshold to 0.5 provided classifications for an additional 4745 objects, but caution should be exercised as this may introduce a larger number of false positives. A purer but less complete sample may be selected by increasing the confidence threshold or, vice versa, a more complete but less pure sample by decreasing the threshold. The probability Pclass for an object belonging to any of the six possible classes is available in our online catalog described in Sect. 4.3.
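The labeling rule can be written compactly. The sketch below assumes a dictionary of ensemble probabilities per object, which is a hypothetical data layout and not the format of the published catalog.

```python
def assign_spectral_type(class_probs, threshold=0.65):
    """Pick the most probable class; append ':' when below the threshold."""
    best = max(class_probs, key=class_probs.get)
    return best if class_probs[best] >= threshold else best + ":"


# Usage: one confident and one uncertain object
confident = {"DA": 0.92, "DB": 0.03, "DC": 0.10, "DO": 0.01, "DQ": 0.02, "DZ": 0.04}
uncertain = {"DA": 0.20, "DB": 0.05, "DC": 0.55, "DO": 0.01, "DQ": 0.15, "DZ": 0.30}
labels = [assign_spectral_type(p) for p in (confident, uncertain)]
```

Because the six classifiers are trained independently in a one-versus-all scheme, the probabilities of a given object need not sum to one.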
Fig. 3. F-score curves of the top five performing classifier ensembles. A threshold value of 0.65 maximizes the mean F-scores for all classes. |
High-confidence white dwarfs per class in the GSPC-WD sample and in the Gaia-SDSS catalogue.
From Table 2, we see that the relative number of objects for each class is similar to what is found within the Gaia-SDSS catalogue of Vincent et al. (2023), which provides some assurance as to the global performance of the classifiers, but is not unexpected. As pointed out in Bailer-Jones et al. (2019), most classification algorithms implicitly learn a prior probability for each class based on their relative proportions in the training data. We verified this by calculating the ratio of each class (number of objects per class, including low-confidence classifications, over all XP objects) and found nearly exactly the same ratios as those obtained for each class in the training dataset (see Table 1). Although the SDSS currently provides the largest spectroscopic sample of white dwarfs, and thus the best available prior, the completeness of the XP-SDSS white dwarf sample as well as the SDSS target selection biases should be taken into consideration when doing population analyses requiring high degrees of statistical precision. Further information about the impact of the SDSS biases can be found in Appendix B, where a short analysis of the effective temperature coverage of the training data is presented.
We perform a brief sanity check to verify whether the classifications follow established trends when visualized with data other than the XP spectra themselves. To this end, we plot the high-confidence classification results on the Gaia H–R diagram in Fig. 4. The locations of classified objects are consistent with expectations: DO stars are found at the hot end of the white dwarf locus (Bédard et al. 2020), DA stars mainly populate the A branch (Bergeron et al. 2019), DB stars are located on or near the warmer section of the B branch, while DQ and DZ stars are situated on the cooler section of the B branch (Coutu et al. 2019) along with DC stars, the latter of which are also found at the faint end of the white dwarf locus. We note a small gap in the DC and DZ sequences, which we attribute to the SDSS biases in our training set described above. An alternative view of the classifications is shown in the bottom panel of Fig. 5, where we plot the synthetic SDSS (g − r) vs. (u − g) colour–colour diagram for objects with both XP spectra and SDSS photometry, overlaid with the theoretical colour tracks for pure hydrogen and pure helium at a constant mass of 0.6 M⊙. While the XP spectra and their synthetic photometry are not independent, the DA and non-DA classifications fall nicely on the appropriate tracks, providing further assurance that the appropriate atmosphere models are used to measure their stellar parameters.
Fig. 4. H–R diagram of GSPC-WD objects with high classification confidence and converged fits. The location of objects for every class is consistent with previous spectroscopic studies. For clarity, background objects (grey points) are restricted to those with G < 18 and a parallax measurement error less than 1%, and only a random selection of 25% of all DA stars is shown. |
Fig. 5. Color–color diagrams of objects with both real and synthetic SDSS photometry. For clarity, only objects with G < 17 are shown. The top panel contains the two color–color distributions before the u band correction (see text), displaying a significant color-dependent shift between the two. The middle panel shows the distributions after the u band corrections have been applied to the synthetic photometry. The bottom panel shows the corrected synthetic photometry color–color diagram with points colored by their XP spectroscopic classification. On all three panels, the cooling tracks for pure hydrogen (full black line) and pure helium (dashed black line) are shown at a constant mass of 0.6 M⊙. |
Overall, the reliability of our spectroscopic classification for the 100 886 GSPC-WD objects is supported by reasonable relative class numbers, as well as their location on the H–R and color–color diagrams. In the following section, we describe the atmosphere models and procedures used to measure the stellar parameters of this large sample.
4. Parameterization of the GSPC-WD sample
4.1. The photometric technique
We measured the physical parameters in our sample using the so-called photometric technique described in Bergeron et al. (1997). Briefly, the standardized synthetic SDSS magnitudes are converted into average fluxes using appropriate zero-points (Montegriffo et al. 2023) and conversion equations (Holberg & Bergeron 2006). Model photometry is then calculated for class-specific model grids by integrating the monochromatic Eddington fluxes over each bandpass. These model fluxes depend on the effective temperature Teff, the surface gravity log g, and the chemical composition. The observed and model fluxes are related to each other via the solid angle π(R/D)^2, where R is the stellar radius and D is the distance from Earth. Since the distance is known from Gaia parallax measurements, the radius can be measured directly and converted into stellar mass M using evolutionary models, which provide a temperature-dependent mass–radius relation. We rely on the evolutionary models described in Bédard et al. (2020) with C/O cores, q(He) ≡ M_He/M_⋆ = 10^−2 and q(H) = 10^−4, which are representative of H-atmosphere white dwarfs, and q(He) = 10^−2 and q(H) = 10^−10, which are representative of He-atmosphere white dwarfs2. A χ-squared minimization is performed between the observed and model average fluxes using the Levenberg–Marquardt method (Press et al. 1986). The fitted parameters are Teff and the solid angle π(R/D)^2, and their uncertainties are obtained directly from the covariance matrix of the fit.
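The link between the fitted solid angle and the stellar radius can be illustrated with a short sketch. It is not the fitting code of the paper. At a fixed Teff the best-fit scale factor between observed and model fluxes has a closed form, which is used here in place of the Levenberg–Marquardt iteration. The function names are hypothetical, and the model fluxes are assumed to be normalized so that the scale factor equals π(R/D)^2.

```python
import math

PC_IN_M = 3.0856775814913673e16   # one parsec in metres
RSUN_IN_M = 6.957e8               # nominal solar radius in metres


def best_fit_scale(obs_flux, obs_err, model_flux):
    """Chi-squared minimum of f_obs = scale * f_model at fixed Teff."""
    num = sum(o * m / e ** 2 for o, m, e in zip(obs_flux, model_flux, obs_err))
    den = sum(m * m / e ** 2 for m, e in zip(model_flux, obs_err))
    return num / den


def radius_from_solid_angle(solid_angle, parallax_mas):
    """Radius in solar radii from solid_angle = pi (R/D)^2 and a parallax in mas."""
    distance_m = (1000.0 / parallax_mas) * PC_IN_M   # D [pc] = 1000 / parallax [mas]
    return distance_m * math.sqrt(solid_angle / math.pi) / RSUN_IN_M
```

In a full fit, Teff and the scale factor are varied together, and the resulting radius is then mapped to a mass through the mass–radius relation of the evolutionary models.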
The synthetic SDSS magnitudes are dereddened using the 3D extinction maps and parameterisation of Gentile Fusillo et al. (2021), and we apply the parallax zero-point correction described in Lindegren et al. (2021). Moreover, it is well established that the SDSS magnitude system is not exactly on the AB magnitude system (see Bergeron et al. 2019, and references therein). We therefore apply the corrections to the u, i, and z magnitudes proposed by Eisenstein et al. (2006). Since the quoted errors on the synthetic SDSS fluxes tend to be smaller than the AB system corrections, we follow Bergeron et al. (2019) and adopt a lower limit of 0.03 mag on the uncertainty in all bandpasses.
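The effect of the 0.03 mag floor on the flux uncertainties can be sketched as follows. This is only an illustration: it assumes an ideal AB system with a 3631 Jy zero-point rather than the zero-points and conversion equations cited above, it uses first-order error propagation, and the function name is hypothetical.

```python
import math

AB_ZERO_POINT_JY = 3631.0  # flux density of a 0 mag source on the ideal AB system
MIN_MAG_ERROR = 0.03       # lower limit on the magnitude uncertainty (from the text)


def ab_mag_to_flux_jy(mag, mag_err, floor=MIN_MAG_ERROR):
    """Convert an AB magnitude to a flux density in Jy with a floored uncertainty."""
    sigma = max(mag_err, floor)                  # never trust errors below the floor
    flux = AB_ZERO_POINT_JY * 10.0 ** (-0.4 * mag)
    flux_err = 0.4 * math.log(10.0) * flux * sigma  # first-order propagation
    return flux, flux_err


# A quoted error of 4 mmag is replaced by the 0.03 mag floor,
# while a quoted error of 0.05 mag is kept as is.
flux_a, err_a = ab_mag_to_flux_jy(18.0, 0.004)
flux_b, err_b = ab_mag_to_flux_jy(18.0, 0.05)
```

With the floor in place, the relative flux uncertainty never drops below roughly 2.8%, which keeps a single very precise band from dominating the χ2.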
Ultraviolet fluxes of XP spectra, and, therefore, synthetic ultraviolet bandpasses, are known to have strong color-dependent systematic errors. We refer the readers to Montegriffo et al. (2023) and Gaia Collaboration (2023a) for a lengthy analysis of these issues. While the standardized synthetic photometry (see Sect. 4.3) tries to provisionally address this issue, it remains a blanket fix for the entire GSPC sample and is not perfectly adapted to white dwarfs. We find that a significant color-dependent shift persists in the u band and attempt to further reduce it. We crossmatch our GSPC-WD catalogue with SDSS DR18 photometry and find 21 254 objects in common. The topmost plot in Fig. 5 illustrates this shift in the u − g vs. g − r color–color diagram between real and synthetic SDSS photometry. As can be seen from the plot, synthetic u magnitudes tend to be underestimated for colors g − r ≲ −0.25 and overestimated at g − r ≳ −0.25. We found that this split roughly corresponds to the Gaia color GBP − GRP = −0.15 and use this as a separation point to calculate a u magnitude correction for blue (GBP − GRP < −0.15) and red (GBP − GRP ≥ −0.15) white dwarfs. We calculate the correction term for each color group by binning white dwarfs into synthetic u bins of 0.2 mag and computing the median difference between real and synthetic magnitudes. The median difference, along with the 67.5th percentile as error bars, is shown as a function of synthetic u magnitude in Fig. 6. We then subtract this correction term from the synthetic u magnitudes and use the 67.5th percentile as the uncertainty on the magnitude, as it is typically larger than the error provided by the XP pipeline. The effect of this correction on the u − g vs. g − r diagram is shown in the middle plot of Fig. 5, where the real and synthetic colors are now in better agreement. The correction term for each object in our sample can be found under the u_corr and u_corr_675 columns of our catalogue.
Furthermore, the correction bins shown in Fig. 6 are made available as Supplementary Material.
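The binned u-band correction can be sketched as below. This is a simplified illustration, not the pipeline used for the catalogue: the function names and the input layout are hypothetical, the offset is defined as synthetic minus real (an assumed sign convention, chosen so that subtracting it moves the synthetic magnitude toward the real one), and the percentile-based uncertainty is left out.

```python
import math
from statistics import median

COLOR_SPLIT = -0.15  # GBP - GRP boundary between blue and red groups (from the text)


def _key(u_syn, bp_rp, bin_width):
    """Colour group and magnitude-bin index of one object."""
    group = "blue" if bp_rp < COLOR_SPLIT else "red"
    return group, math.floor(u_syn / bin_width)


def build_u_correction(stars, bin_width=0.2):
    """stars: iterable of (u_syn, u_real, bp_rp) for the crossmatched objects.

    Returns {(group, bin_index): median of (u_syn - u_real)}.
    """
    offsets = {}
    for u_syn, u_real, bp_rp in stars:
        offsets.setdefault(_key(u_syn, bp_rp, bin_width), []).append(u_syn - u_real)
    return {key: median(vals) for key, vals in offsets.items()}


def apply_u_correction(u_syn, bp_rp, table, bin_width=0.2):
    """Subtract the tabulated offset; objects in empty bins are left unchanged."""
    return u_syn - table.get(_key(u_syn, bp_rp, bin_width), 0.0)


# Made-up crossmatch: three blue and two red objects in the same magnitude bin.
stars = [
    (18.05, 18.15, -0.30),
    (18.10, 18.22, -0.25),
    (18.15, 18.23, -0.40),
    (18.05, 18.00, 0.10),
    (18.12, 18.08, 0.20),
]
table = build_u_correction(stars)
u_fixed = apply_u_correction(18.10, -0.30, table)
```

Keeping the bin width as a parameter makes it easy to check how sensitive the correction is to the binning.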
Fig. 6. Median difference between real and synthetic SDSS photometry for bins of 0.2 synthetic u magnitudes. Error bars correspond to the 67.5th percentile of each bin.
In spite of the ultraviolet calibration issues, we find the SDSS synthetic photometry to be the optimal choice among the systems offered by the Gaia team. First, the SDSS synthetic photometry was standardized (see Montegriffo et al. 2023), meaning additional processing steps were performed by the Gaia team to ensure its quality and reliability. The standardization has not been applied to other photometric systems commonly used in white dwarf studies. Among the PanSTARRS filters, only the y band has been standardized. We did not include it, however, because it suffers from the same wavelength cut-off issue as the u band (although at redder wavelengths) and is not as critical as the latter for physical parameter inference. These choices are supported by Bergeron et al. (2019), who have clearly shown that excluding the u band induces systematic offsets in measured parameter values when using different photometric systems (i.e., SDSS and PanSTARRS), whereas including the u band not only appears to eliminate these offsets, but also provides values that are the most consistent with spectroscopic studies. Furthermore, Bergeron et al. (2019) have also shown that the y band generally has the worst agreement with atmosphere models (see their Fig. 6), indicating potential issues with either the calibration or the physics in that region.
To explore the impact of the u band on our results, we measure the physical parameters of our DA stars with and without the u band using the model atmospheres described in Sect. 4.2.1. We then compare Teff and log g to the values in Tremblay et al. (2019a) for objects also classified as DA in their study (not shown here). We find that the exclusion of the u band has a marginal effect for Teff < 30 000 K, at which point the UV calibration issues become noticeable, but has a significant negative impact on hotter white dwarfs. Without the u band, hotter white dwarfs have increasingly underestimated temperatures and overestimated masses when compared to the spectroscopic values obtained by Tremblay et al. (2019a). We conclude that the inclusion of the u band results in physical parameters much closer to the values obtained in more specialized spectroscopic studies and is thus justified.
4.2. Model atmospheres
In this section we outline the model atmospheres used to measure the stellar parameters of the GSPC-WD sample and briefly discuss how unusual parameters may inform us about issues with the data or erroneous classification. Objects that have an uncertain classification were also fitted using the approaches described here based on their most probable class, although they are not discussed.
4.2.1. DA white dwarfs
Our model atmospheres for DA white dwarfs are described at length in Blouin et al. (2018) for Teff ≤ 5000 K, Tremblay et al. (2011) for 5000 K < Teff < 35 000 K, and Bédard et al. (2020) for Teff ≥ 35 000 K. We assume a pure hydrogen atmospheric composition, a sound assumption for the vast majority of DA white dwarfs (Bergeron et al. 1997; Blouin et al. 2019), and fit the 77 330 DA stars with model atmosphere grids covering 3000 K ≤ Teff ≤ 150 000 K and 6.5 ≤ log g ≤ 9.5. The fitting procedure did not converge for 969 objects, nearly all located in the hot tail of the WD locus or the WD+MS region on the Gaia H–R diagram. Of these, 212 objects have a spectroscopic classification and stellar parameters available in the Montreal White Dwarf Database (MWDD; Dufour et al. 2017), including 69 DA+MS binary systems, 1 CV, 1 DC, and 1 DAZ. The remaining 140 are all confirmed DA, out of which 98 are hot (≳30 000 K).
The stellar parameters of the hot DA stars in our sample require a note of caution. We find 34 DA with extremely high temperatures (Teff > 150 000 K) and masses, 28 of which are confirmed DAO or hot DA according to the MWDD. These temperatures are obviously implausible and are likely due to a combination of the u band calibration issues discussed in Sect. 4.1 and the insensitivity of optical photometry to high effective temperatures. At Teff > 40 000 K, the spectral energy distribution of optical photometry is in the Rayleigh-Jeans regime and becomes a poor indicator of temperature. Figure 3 of Bédard et al. (2020) illustrates this well, showing that the difference in u − g colour between Teff = 50 000 K and Teff = 100 000 K for a 0.6 M⊙ white dwarf is a mere 0.06 mag. One can also see from Fig. 5 that the offset between real and synthetic SDSS u photometry is larger than 0.06 mag for objects located at the hot tail of the color–color diagram, even when corrected, and keeps increasing toward higher temperatures. Another consequence of these issues is the trend of increasing mass with temperature, starting around 40 000 K, on the DA mass–Teff diagram displayed in Fig. 7. A slight error in the photometry can cause overestimation of the temperature, which in turn is compensated by underestimation of the stellar radius, leading to overestimation of the mass and the observed diagonal pattern. Precise measurement of the stellar parameters of hot white dwarfs would require UV or spectroscopic observations and is beyond the scope of this paper.
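The weak temperature sensitivity of optical colours in this regime can be checked with a back-of-the-envelope calculation. The sketch below assumes a pure blackbody evaluated at approximate u and g effective wavelengths (both assumptions, not the model atmospheres used in this work), so it only indicates the order of magnitude of the effect.

```python
import math

HC_OVER_K = 1.438777e8  # h c / k_B in Angstrom K (second radiation constant)

# Approximate effective wavelengths of the SDSS u and g bands (assumed values).
U_EFF, G_EFF = 3551.0, 4686.0


def bb_color(lambda_blue, lambda_red, teff):
    """Blackbody colour m_blue - m_red, up to an additive zero-point constant.

    Uses B_nu, which is proportional to lambda^-3 / (exp(hc / (lambda k T)) - 1).
    """
    def b_nu(lam):
        return lam ** -3 / math.expm1(HC_OVER_K / (lam * teff))

    return -2.5 * math.log10(b_nu(lambda_blue) / b_nu(lambda_red))


# Change in u - g when the temperature doubles from 50 000 K to 100 000 K.
# The unknown zero-point constant cancels in the difference.
delta_ug = bb_color(U_EFF, G_EFF, 50_000.0) - bb_color(U_EFF, G_EFF, 100_000.0)
```

Under these assumptions the colour change is only a few hundredths of a magnitude, the same order as the value quoted from Bédard et al. (2020), so a photometric offset of comparable size is enough to shift the inferred temperature by tens of thousands of kelvin.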
Fig. 7. Mass–effective temperature diagrams for the six main spectroscopic types in the GSPC-WD sample. For clarity, the background objects (gray dots) are a fixed random selection of 50% of all GSPC-WD objects, from which objects from the class being presented are excluded, and only 20% of all DA are shown. |
Fig. 7. continued. |
We also note 65 DA with masses above 1.44 M⊙. These include the 34 objects with extremely high temperatures mentioned above, one confirmed cool DC analyzed in Caron et al. (2023), and 30 objects scattered around the H–R diagram with no other spectroscopic observations available. We suspect the latter are either erroneous classifications or mixed-type white dwarfs with visible hydrogen in their atmosphere.
4.2.2. DB white dwarfs
For the 5688 DB white dwarfs, we use the model atmospheres described in Blouin et al. (2019) for Teff ≤ 8000 K, Genest-Beaulieu & Bergeron (2019) for 8000 K < Teff < 40 000 K, and Bédard et al. (2020) for Teff ≥ 40 000 K. We assume a pure helium atmosphere with grids covering 5000 K ≤ Teff ≤ 60 000 K and 7.0 ≤ log g ≤ 9.0. The fitting procedure did not converge for 969 objects, one located in the hot tail of the WD locus and the rest in the WD+MS region of the Gaia H–R diagram. The MWDD has spectroscopic types available for two objects, indicating PG1159 for the hot object and DA+MS for the second one.
A surprising outcome of these fits is the presence of 166 DB with masses above 1 M⊙, strongly at odds with previous in-depth analyses of large DB samples suggesting that massive DB are essentially nonexistent (Genest-Beaulieu & Bergeron 2019; Bergeron et al. 2011). Our massive DB are all located under the main white dwarf sequence, where the Q-branch and magnetic white dwarfs are found. Indeed, 13 of these massive DB have spectroscopic types available in the MWDD, 6 of which are classified as magnetic white dwarfs, 5 as DQ, 1 as DBA and 1 as DB:+MS. The high masses are likely to be genuine since high masses are common among magnetic (Hardy et al. 2023a,b) and DQ white dwarfs found within the Q-branch (Cheng et al. 2019; Coutu et al. 2019). Visual inspection of the sampled XP spectra reveals that most of these objects have spectral features at the position of the He I lines at the bluest part of the spectrum, but also near the ionized carbon line λ4267. At such low resolution and signal-to-noise ratio, however, it is impossible to determine whether these features are truly single lines or magnetically distorted lines.
4.2.3. DC white dwarfs
The determination of the atmospheric composition of DC white dwarfs is difficult because, by definition, they do not have any discernible features in their spectra. Furthermore, invisible traces of hydrogen and heavy elements in helium-atmosphere DC can significantly affect the effective temperature and mass measurements when using the photometric technique (Bergeron et al. 2019; Blouin et al. 2019, 2023). The exact amount of hydrogen is impossible to determine accurately, except for cool DC for which molecular hydrogen starts to form and causes strong infrared absorption via collision-induced absorption (Bergeron et al. 2022; Caron et al. 2023). It is, however, possible to distinguish helium and hydrogen atmospheres due to the different behavior of continuum opacities, but this requires precise photometric measurements and accurate magnitude-to-flux conversion. Considering that the SDSS photometry is known to have calibration issues (Holberg & Bergeron 2006; Bergeron et al. 2019), combined with the statistical error on the synthetic magnitudes (a few mmag for griz bands, see Sect. 3.1 of Montegriffo et al. 2023) and u band calibration issues (see previous subsection), it is dangerous to let the H/He abundance ratio vary as a free parameter. We instead keep it fixed and follow the heuristic approach described below.
To measure the physical parameters of our 4082 DC stars, we use the same model atmospheres as those listed in Sect. 4.2.1 and adopt different atmospheric compositions depending on the temperature, based on the in-depth analysis of DC stars by Caron et al. (2023). We initially assume a pure helium atmosphere to obtain a rough estimate of the temperature, then fit the objects again according to this estimate. We assume a pure hydrogen atmosphere for stars under 5500 K, a mixed atmosphere with a fixed H/He = 10−5 abundance ratio for stars between 5500 and 11 000 K, and retain the pure helium atmosphere for the remaining hot objects.
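The two-pass heuristic for DC stars amounts to a small decision rule, sketched below. The function names and the `fit` callable are hypothetical placeholders for an actual photometric fitter; the upper temperature boundary is exposed as a parameter, with a default taken from the summary in Sect. 4.3.

```python
def dc_composition(teff_first_pass, cool_limit=5500.0, warm_limit=11000.0):
    """Pick the atmospheric composition for the second fit of a DC star.

    cool_limit and warm_limit are in kelvin; warm_limit defaults to the
    value quoted in the summary of Sect. 4.3.
    """
    if teff_first_pass < cool_limit:
        return "pure H"
    if teff_first_pass <= warm_limit:
        return "mixed, log H/He = -5"
    return "pure He"


def two_pass_dc_fit(photometry, fit):
    """Run the pure-He first pass, then refit with the selected composition.

    fit(photometry, composition) is a hypothetical callable returning a dict
    that contains at least the key 'teff'.
    """
    first = fit(photometry, "pure He")
    composition = dc_composition(first["teff"])
    if composition == "pure He":
        return first  # hot objects keep the first-pass solution
    return fit(photometry, composition)
```

Because the composition is chosen from the first-pass temperature, objects close to a boundary can change category if the first-pass estimate shifts slightly, which is one reason to treat parameters near 5500 K with care.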
The fitting procedure did not converge for 50 objects, which are mostly located at the end of the DC sequence on the B-branch and in the WD+MS region. 19 of these objects have spectroscopic classifications available in the MWDD, including 18 DC and one DQpec. We also note one DC, Gaia DR3 1505825635741455872, located in the so-called ultracool sequence (Kilic et al. 2020; Bergeron et al. 2022) with a mass above 1.44 M⊙. This cool DC has been previously analyzed by Caron et al. (2023) who found a high, but more reasonable, mass of 1.18 M⊙.
4.2.4. DO white dwarfs
To fit our 215 DO stars, we use the same model atmospheres described in Bédard et al. (2020). We assume a pure helium atmosphere, a suitable composition for most DO white dwarfs (Bédard et al. 2020), and use model grids covering 30 000 K ≤ Teff ≤ 150 000 K and 6.5 ≤ log g ≤ 9.5. The fitting procedure did not converge for 6 objects, 3 of which are confirmed DO(Z) according to the MWDD.
Just as for hot DA, the precise measurement of physical parameters of DO stars is impossible. To briefly summarize the explanations in Sect. 4.2.1, optical photometry is largely insensitive to effective temperature above 40 000 K and the u band is known to have large systematic errors, causing the measured physical parameters to be unreliable for hot white dwarfs. As a matter of fact, the diagonal trend caused by errors in the u band is glaringly obvious in the DO mass–Teff diagram in Fig. 7. Additionally, a vertical pattern on the mass–Teff diagram can be seen around 150 000 K, indicating that numerous objects have hit the upper temperature limit of the model grids during the fitting procedure. Given the situation, we advise against using the physical parameters we measured for DO stars as well as hot DAs for any analysis of these stars. These results should instead be interpreted as an illustration of the current limitations of the GSPC-WD sample.
Apart from the issues described above, we note three DOs with particularly dubious physical parameters. The three objects are located within the Q-branch, have very high masses (> 1 M⊙) and low temperatures (∼30 000 K–50 000 K). They can be easily discerned from the H–R diagram in Fig. 4 and DO mass–Teff diagram in Fig. 7. Only one of these objects has a spectroscopic classification in the MWDD, where it is listed as a DC. Ionized helium absorption lines only appear above ∼50 000 K and are unlikely to be distinguishable at the resolution of XP spectra. Visual inspection of the spectra shows features near the positions of He II absorption lines, though identifying the spectral type remains difficult. Just like for the massive DB (see Sect. 4.2.2), we suspect magnetic white dwarfs whose distorted features happen to be confused with DO stars at low resolution.
4.2.5. DQ white dwarfs
Recent spectroscopic analyses have clearly confirmed the existence of two distinct DQ evolutionary sequences: one with normal-mass white dwarfs and one with heavily carbon-polluted and generally more massive objects (Dufour et al. 2005; Coutu et al. 2019; Blouin et al. 2019). Since the DQ designation in our sample includes all carbon-contaminated atmosphere white dwarfs and does not differentiate between the two sequences, and since we do not have higher resolution spectroscopy to constrain the atmosphere carbon abundance, we employ a two-step fitting strategy along with empirical relations between temperature and carbon abundance in order to obtain the best possible stellar parameters.
The empirical relations between temperature and carbon abundance are derived from the data in Fig. 12 of Coutu et al. (2019). We split the objects in their figure into two groups: one including high-mass and high-temperature objects (M ≥ 0.7 M⊙, Teff ≥ 9000 K) and one including the normal-mass and lower temperature objects (M < 0.7 M⊙, Teff < 9000 K). We then fit a straight line to each group to predict the carbon abundance as a function of temperature.
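A straight-line relation of this kind can be obtained with ordinary least squares, as in the minimal sketch below. The data points are synthetic placeholders lying on an arbitrary line; they are not taken from Coutu et al. (2019), and the resulting coefficients are not the relations used in this work.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_log_c_he(teff, relation):
    """Evaluate a fitted (slope, intercept) relation at a given Teff."""
    slope, intercept = relation
    return slope * teff + intercept


# Synthetic placeholder points (NOT real measurements): Teff in K, log C/He.
teff_points = [6000.0, 7000.0, 8000.0, 8500.0]
log_c_points = [-7.0, -6.25, -5.5, -5.125]
cool_relation = fit_line(teff_points, log_c_points)
log_c_at_7500 = predict_log_c_he(7500.0, cool_relation)
```

One such relation would be fitted per group, and the relation matching the selected model grid would then be used to fix the carbon abundance in the second fit.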
Using the models of Coutu et al. (2019; see also Blouin et al. 2018, 2019 and Blouin & Dufour 2019), we create two grids, one for warmer and one for cooler temperature ranges. More precisely, the warm grid covers 7 ≤ log g ≤ 9, −5 ≤ logC/He ≤ −1, 8000 K ≤ Teff ≤ 16 000 K, and the cool grid covers 7.5 ≤ log g ≤ 9, −7.5 ≤ logC/He ≤ −5, 5000 K ≤ Teff ≤ 9000 K. We assume there is no hydrogen in the atmosphere. We first fit all DQs with both grids, assuming a fixed carbon abundance logC/He = −5 and leaving Teff and log g as free parameters. We then select the best fit between the two grids based on the χ2 values and estimate a new carbon abundance using the appropriate empirical Teff − logC/He relation. We then fit the photometry again using the best-fitting grid of the previous iteration along with the new carbon abundance as a fixed parameter.
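The control flow of this two-step strategy can be sketched as follows. The `fit_grid` callable, the grid labels, and the layout of `relations` are hypothetical stand-ins for an actual fitter and for the empirical relations; only the order of operations is meant to mirror the description above.

```python
def two_step_dq_fit(photometry, fit_grid, relations, initial_log_c_he=-5.0):
    """Two-step DQ fit: pick the better grid, update log C/He, refit.

    fit_grid(photometry, grid_name, log_c_he) is a hypothetical callable that
    returns a dict with at least 'chi2' and 'teff'.
    relations maps grid_name -> (slope, intercept) for
    log C/He = slope * Teff + intercept.
    """
    # Step 1: fit with both grids at the same fixed carbon abundance.
    first = {name: fit_grid(photometry, name, initial_log_c_he)
             for name in ("warm", "cool")}
    best_grid = min(first, key=lambda name: first[name]["chi2"])

    # Step 2: update the abundance from the empirical relation and refit
    # with the grid that performed best in step 1.
    slope, intercept = relations[best_grid]
    log_c_he = slope * first[best_grid]["teff"] + intercept
    second = fit_grid(photometry, best_grid, log_c_he)
    return best_grid, log_c_he, second
```

Whether the updated abundance should be clipped to the range covered by the chosen grid is not specified in the text, so the sketch leaves it unclipped.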
Among our 601 DQ white dwarfs, the fitting procedure did not converge for 21 stars, all located in the WD+MS region of the Gaia H–R diagram. Spectroscopic types are available on the MWDD for 4 of them, indicating 4 CVs. We note that some Hot DQ and DAH white dwarfs can be visually difficult to distinguish from one another, even for machine learning algorithms classifying SDSS spectra (Vincent et al. 2023). Extra steps should thus be taken in order to identify possible DAH misidentified as DQ at XP resolution for any detailed analysis of our DQ sample. As for the 4 CVs, we visually inspect their spectra and find that one of them looks similar to a DC, while the other 3 show a strong absorption feature near the ionized carbon line λ4267. We inspect the SDSS spectra of these three objects and find no clear carbon absorption line at that wavelength. We suspect the “absorption feature” seen on the XP spectra is not an actual line, but appears as such at low resolution due to the contrast of being surrounded by two large emission lines.
4.2.6. DZ white dwarfs
Since the training data of the DZ spectral type classifier is constructed in such a way to exclude most spectra with secondary signatures (e.g., DZA/DAZ and DZB/DBZ), we expect the majority of XP spectra classified as DZ to be white dwarfs with cool helium-rich atmospheres. This is supported by the fact that most objects classified as DZ lie on the B-branch of the Gaia H–R diagram as shown in Fig. 4. We use the model atmospheres described in Blouin et al. (2018) and Coutu et al. (2019), and assume a helium-dominated atmosphere with a fixed amount of metal pollution. Our solutions are provided in terms of the Ca abundance (logCa/He), and we assume chondritic abundance ratios with respect to Ca for other metals. The model grids cover 7 ≤ log g ≤ 9 and 4000 K ≤ Teff ≤ 16 000 K, while the calcium abundance ratio remains fixed at logCa/He = −9.5. Although calcium abundance has been shown to vary with temperature (Hollands et al. 2017; Coutu et al. 2019; Blouin & Xu 2022), the scatter of the Ca/He–Teff relationship is very large and properly constraining the abundance requires higher resolution spectroscopy. A fixed assumption of logCa/He = −9.5 is close to the mean abundance found by the previous studies and should be appropriate for the bulk of our DZ sample. We apply this procedure to our 1272 DZ stars and find 7 objects that do not converge, all located above the WD locus on the Gaia H–R diagram and one of which has a DBZ spectroscopic classification in the MWDD.
As noted by Coutu et al. (2019), the hydrogen-free atmosphere assumption results in slightly higher masses when applied to DZ stars (see their Fig. 7). In order to shift the measured masses closer to the fiducial 0.6 M⊙, they included an invisible trace of hydrogen in the atmosphere based on the visibility limit of hydrogen at a given effective temperature. Since the hydrogen abundance cannot be measured directly in most metal-polluted white dwarfs, there exists no empirical relationship from which it can be estimated. We add a fixed hydrogen abundance of logH/He = −3, thus bringing the measured DZ masses closer to 0.6 M⊙, but also inducing a temperature-dependent offset (see the mass–temperature diagram in Fig. 7).
4.3. Adopted parameters
The effective temperature, mass, surface gravity, and chemical composition (see below) of all white dwarfs in our sample are available via the online catalogue accompanying this paper. See Table 3 for the full list of columns and their description. For objects that did not converge during the fitting procedure, the physical parameters are set to −999. Also included are the probabilities of belonging to each class (Pclass), the synthetic SDSS magnitudes and their flux errors, as well as the u band correction (Δu, see Sect. 4.1). The u band correction bins shown in Fig. 6 are also available as Supplementary Material.
Columns and descriptions of our online catalogue for the GSPC-WD sample.
We briefly summarize the atmosphere assumptions for each spectral type as well as important points about the fitting procedure, if necessary. For DA stars, we fit the photometry assuming a pure hydrogen composition (He/H = 0), whereas for DB and DO stars, we assume a pure helium composition (H/He = 0). For DC white dwarfs, we initially fit all objects with a pure helium atmosphere, and fit them a second time based on the effective temperature, assuming a pure helium atmosphere above 11 000 K, a mixed atmosphere (logH/He = −5) between 11 000 K and 5500 K, and a pure hydrogen atmosphere below 5500 K. For DQ stars, we initially fit all objects with a helium-dominated atmosphere and a fixed carbon abundance (logC/He = −5), estimate a new carbon abundance based on empirical Teff − logC/He relations and fit the photometry a second time using this new abundance. Finally, for DZ stars, we assume a helium-dominated atmosphere with a fixed abundance of calcium (logCa/He = −9.5; all other metals are scaled in chondritic proportion according to this abundance).
Most objects without physical parameters appear to be WD+MS binary systems for which the fitting procedure did not converge. Many hot white dwarfs, including DA and DO, also do not have physical parameters due to large errors in the u band. More generally, the measured properties of hot white dwarfs (≳40 000 K) should be used with extreme caution, if at all.
5. Selected results
In this section, we look at the global properties of our GSPC-WD sample, beginning with the various mass distributions.
5.1. Mass distributions
We show in Fig. 8 the cumulative mass distributions for our entire sample, N versus M, for all spectral types (DA, DB, DC, DO, DQ, DZ), regardless of their effective temperature. Stars with implausible parameters (M > 1.44 M⊙ or Teff > 150 000 K) are omitted. In each panel we provide the number of stars as well as the mean mass μ of each subsample. Remarkably, all mass distributions are relatively narrow and peak near ∼0.6 M⊙, the fiducial mean mass for white dwarfs, indicated by a dashed line in each panel. Hence our overall classification scheme and fitting procedure seem to have properly captured the global properties of all spectral types.
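The selection applied before building these distributions can be sketched as a simple filter. The dictionary keys used below (`sp_type`, `mass`, `teff`) are hypothetical column names chosen for illustration; the actual column names are listed in Table 3. The −999 sentinel for non-converged fits and the two plausibility cuts follow the text.

```python
SENTINEL = -999.0     # value stored when the fit did not converge (Sect. 4.3)
MAX_MASS = 1.44       # Msun, upper plausibility cut
MAX_TEFF = 150_000.0  # K, upper plausibility cut


def is_usable(row):
    """True if the row has converged and plausible mass and Teff."""
    if row["mass"] == SENTINEL or row["teff"] == SENTINEL:
        return False
    return row["mass"] <= MAX_MASS and row["teff"] <= MAX_TEFF


def mean_mass_by_type(rows):
    """Return {spectral type: (number of usable stars, mean mass)}."""
    masses = {}
    for row in rows:
        if is_usable(row):
            masses.setdefault(row["sp_type"], []).append(row["mass"])
    return {sp: (len(m), sum(m) / len(m)) for sp, m in masses.items()}


# Made-up rows with hypothetical column names.
rows = [
    {"sp_type": "DA", "mass": 0.58, "teff": 12000.0},
    {"sp_type": "DA", "mass": 0.62, "teff": 9000.0},
    {"sp_type": "DA", "mass": 1.60, "teff": 160000.0},   # implausible, dropped
    {"sp_type": "DB", "mass": SENTINEL, "teff": SENTINEL},  # not converged
    {"sp_type": "DB", "mass": 0.61, "teff": 15000.0},
]
summary = mean_mass_by_type(rows)
```

Filtering on the sentinel first avoids the trap of averaging −999 values into a mean mass.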
Fig. 8. Cumulative mass distributions for the six spectral types in the GSPC-WD sample. The total number of stars in each histogram and the mean mass are shown at the top right corner. For DZ stars, the red and black histograms represent atmospheres with no hydrogen and a hydrogen abundance of logH/He = −3, respectively. The dashed line indicates the fiducial mean mass of 0.6 M⊙. |
For the DA stars, there is a significant excess of low-mass and high-mass objects compared to other spectral types, consistent with the fact that low-mass white dwarfs, which are most likely unresolved double degenerate binaries, as well as high-mass white dwarfs, are mostly of the DA spectral type (see, e.g., Fig. 15 of Caron et al. 2023 and Fig. 9 below). The mass distributions of DA, DB, and DC stars all have a median mass close to 0.6 M⊙. Those of the DO, DQ, and DZ stars show a more complex behavior. The DO stars, although in relatively small number in our sample, show an extended tail at high masses, and the median mass is shifted above 0.6 M⊙. This is a simple consequence of our difficulty with obtaining reliable physical parameters for these hot DO stars, as discussed above, a problem also apparent in the strong M versus Teff correlation observed in Fig. 7.
Fig. 9. Mass–temperature diagrams of the GSPC cool white dwarfs within 100 pc of the Sun. The top panel shows the diagram for DA stars, and the bottom panel shows the non-DA stars. Also shown as solid black curves are theoretical isochrones, labeled in units of Gyr, obtained from cooling sequences with C/O-core compositions, q(He) = 10−2, and q(H) = 10−4. The lower blue solid curve indicates the onset of crystallization at the centre of evolving models, and the upper one indicates the locations where 80% of the total mass has solidified. The dashed line indicates the fiducial mean mass of 0.6 M⊙. |
As expected, the median of the mass distribution for DQ stars is lower by ∼0.05 M⊙ compared to the median for other spectral types, a result also obtained earlier by Coutu et al. (2019, see their Fig. 13) and Caron et al. (2023, see their Fig. 19). As discussed in Caron et al., there are at least two possible explanations for this lower mean mass, one involving problems with the physics of DQ model atmospheres (Coutu et al. 2019), and another one recently proposed by Bédard et al. (2022), who suggested that carbon, and hence the DQ phenomenon, is preferentially detected in lower mass white dwarfs.
The mass distribution for DZ white dwarfs in Fig. 8 is shown for both an atmospheric composition assuming no hydrogen and one with a trace of hydrogen of logH/He = −3. The effect of adding a trace of hydrogen in the analysis of DZ stars using the photometric technique is to decrease the stellar masses significantly, as first noted by Dufour et al. (2007). In this case, the addition of free electrons from hydrogen changes the helium free-free opacity, resulting in lower photometric temperatures, and thus larger stellar radii and smaller masses. We thus assume a trace of hydrogen for DZ stars in the remainder of our analysis (and in our catalogue as well). We note that our photometric analysis of DC stars also includes a trace of hydrogen, otherwise the peak of the mass distribution would be shifted towards higher masses as well.
Of more significant interest is the mass distribution of the various spectral types as a function of effective temperature. Here we focus our attention on the cool end (Teff < 10 000 K) of the white dwarf sequence, and we also restrict our sample to a distance of 100 pc in order to reduce the number of objects in our plots, but also to compare our results directly with those of Caron et al. (2023), who restricted their analysis to the same distance. The M versus Teff distribution for this subsample is displayed in Fig. 9 where we split the DA (top panel) and non-DA (bottom panel) stars for clarity. This figure can be compared directly with Fig. 15 of Caron et al. (2023), who analyzed a significantly smaller sample of 2880 spectroscopically confirmed white dwarfs drawn from the MWDD, compared to the 12 569 objects in our 100 pc sample below 10 000 K. A note of caution is in order, however. Given that our classification scheme is not strictly based on spectral lines but on photometric information as well, the DA stars in the upper panel of Fig. 9 also include, at the end of the cooling sequence, non-DA stars that are better fitted with pure H atmospheres (see also Caron et al. 2023).
The most striking feature in the mass distribution for DA stars is the crystallization sequence, which is contained between the two blue solid curves in Fig. 9, where the lower blue solid curve indicates the onset of crystallization at the center of evolving models, while the upper curve indicates the locations where 80% of the total mass has solidified. This crystallization sequence evolves towards lower masses at lower Teff values, and eventually merges with the other evolving DA stars with normal masses. Note that non-DA stars can also be found within this crystallization sequence, but in significantly smaller number.
As discussed above, low-mass (M < 0.5 M⊙) white dwarfs are mostly DA stars, although low-mass non-DA stars exist as well. Caron et al. (2023) argued, based on their more refined spectro-photometric analysis, that these low-mass non-DA white dwarfs probably have hydrogen atmospheres, which would suggest that common-envelope evolution most likely produces white dwarf remnants that retain thick H layers.
For Teff < 5500 K, it was assumed that most DC white dwarfs have pure hydrogen atmospheres, following the conclusions of Caron et al. (2023); we remind the reader that a fraction of these cool stars are classified as “DA” in Fig. 9. The masses for these objects obtained under the assumption of helium atmospheres are far too low from an astrophysical point of view (see Caron et al. for a more detailed discussion). We can see that at the very cool end of the DA sequence, the masses decrease slightly, a problem that has been attributed to inaccuracies in the calculations of opacity sources, either the red wing of Lα, the H− bound-free opacity, the collision-induced opacity from molecular hydrogen, or several of the above.
The mass distributions for the DQ and DZ stars in the bottom panel of Fig. 9 need to be interpreted with caution since for these objects, we adopted an empirical relation to fix the carbon abundance in DQ stars, and a fixed abundance of hydrogen in all DZ stars, while the exact abundance of each of these trace elements should in principle be adjusted individually for each object. For instance, the trend observed here, where the masses of both spectral types are larger at lower temperatures, is most likely the result of our assumptions.
Finally, we note that one spectral type has been omitted from our analysis: the so-called ultracool white dwarfs, more accurately described as IR-faint white dwarfs, which are characterized by a strong infrared flux deficiency resulting from collision-induced absorption by molecular hydrogen (see the detailed analysis by Bergeron et al. 2022). These objects are difficult to identify in our sample due to the lack of infrared photometry in our analysis.
While the work of Caron et al. (2023) relied on a more detailed and tailored analysis of individual objects of the 100 pc sample from the MWDD, the authors concluded that they had reached the limit of human capacity to analyze individually each object in their large sample of nearly 3000 white dwarfs. They also concluded that better techniques for handling bigger data sets involving machine-learning algorithms would eventually become necessary. Given that the results presented in this section compare favorably with those of Caron et al. in terms of the mass distributions, we believe that such a goal has now nearly been achieved.
5.2. Spectral evolution
The atmospheric composition of a white dwarf star can change as it evolves along the cooling sequence, a phenomenon referred to as spectral evolution. Changes in atmospheric composition provide evidence for transport mechanisms competing with gravitational settling in determining the chemical appearance of the stars as they cool down. Here, we study the spectral evolution of both the full GSPC-WD sample and its 100 pc volume-limited subsample by looking at the ratio of non-DA stars to the total number of stars as a function of effective temperature. We obtain the ratio in bins of 1000 K by summing the number of objects weighted by 1 − PDA and dividing by the unweighted total. The error bars are estimated using the Clopper–Pearson interval method (Clopper & Pearson 1934). We restrict our analysis to stars below 30 000 K because of the u band calibration issues, described in the previous sections, that affect temperature measurements. We also exclude stars with masses below 0.45 M⊙, as these are mostly unresolved binary systems (Bergeron et al. 2019). For the 100 pc subsample, we assume that the distance of an object is D = 1/π, where π is the parallax in arcseconds, and select those with D < 100 pc, leaving a total of 14 679 objects. A completeness estimation of the Gaia white dwarfs with XP spectra within this volume has already been performed by Jiménez-Esteban et al. (2023), who estimated that about 18 800 white dwarfs should be found within this distance based on the space density obtained from the 20 pc sample of Hollands et al. (2018). Under this assumption, our sample is ∼78% complete. We can also make a rough estimate of our selection completeness by comparing the number of objects to the white dwarf candidate catalogue of Gentile Fusillo et al. (2021), who estimated their overall completeness to be between 67% and 93%. Assuming that the 16 675 objects with PWD > 0.75 within 100 pc of the Sun in the Gentile Fusillo et al. (2021) catalogue represent a 93% volume-complete sample, which is likely appropriate at this distance, our own selection would be ∼82% complete. Based on these two completeness determinations, we estimate that our 14 679 stars represent a ∼80% volume-complete sample.
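The two completeness figures quoted above follow from simple ratios. The short sketch below reproduces that arithmetic using only the numbers given in the text; the variable and function names are illustrative, and averaging the two estimates is our assumption about how the ∼80% figure is reached.

```python
# Numbers quoted in the text; all names here are illustrative only.
N_SAMPLE = 14_679         # GSPC-WD objects with D < 100 pc
N_EXPECTED = 18_800       # expected count from the 20 pc space density
N_GF21 = 16_675           # Gentile Fusillo et al. (2021), P_WD > 0.75, D < 100 pc
GF21_COMPLETENESS = 0.93  # assumed completeness of that catalogue at 100 pc


def distance_pc(parallax_arcsec):
    """Distance in pc from a parallax in arcsec (simple inversion, D = 1/pi)."""
    return 1.0 / parallax_arcsec


# Estimate 1: compare with the expected number of white dwarfs.
completeness_1 = N_SAMPLE / N_EXPECTED
# Estimate 2: compare with the reference catalogue, scaled by its completeness.
completeness_2 = N_SAMPLE / N_GF21 * GF21_COMPLETENESS
# Assumption: the adopted value is roughly the mean of the two estimates.
completeness_mean = 0.5 * (completeness_1 + completeness_2)

print(f"{completeness_1:.0%} {completeness_2:.0%} {completeness_mean:.0%}")
```

A parallax of 0.01 arcsec (10 mas) corresponds to the 100 pc boundary of the volume-limited sample.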
The spectral evolution of the GSPC-WD sample is shown in Fig. 10 for objects within 100 pc (black curve) and the full sample (red curve). The latter is only shown as a reference, since it suffers from important selection biases that are not corrected for here. The discussion below focuses on the 100 pc sample. We first notice that both curves appear to systematically overestimate the ratio of non-DA by ∼5% when compared to previous studies (see Fig. 7 of Torres et al. 2023). This offset can easily be eliminated by summing the number of objects above or below certain classification thresholds rather than weighting them by their classification probability. However, the former approach would result in the loss of crucial statistical nuance at lower temperatures, where the classification boundaries are not well defined. For consistency, we stick to the weighted sum approach and keep in mind that the ratio values might be slightly overestimated.
Fig. 10. Spectral evolution curves of the GSPC-WD sample for objects within 100 pc of the Sun (black line) and the entire sample (red line). The non-DA ratio is calculated by summing the objects in each temperature bin, each weighted by its probability of not being a DA (1 − PDA), and dividing by the total number of stars within the bin.
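As an illustration of how such a weighted ratio and its Clopper–Pearson error bars can be computed, here is a minimal standard-library sketch. It is not the code used for Fig. 10: the function names are hypothetical, the 68% confidence level (alpha = 0.32) is an assumption, and rounding the weighted sum to the nearest integer before computing the interval is one possible choice that the text does not specify.

```python
from math import exp, floor, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms evaluated in log space."""
    total = 0.0
    for i in range(k + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1.0 - p))
        total += exp(log_term)
    return min(total, 1.0)


def clopper_pearson(k, n, alpha=0.32):
    """Clopper-Pearson interval for k successes out of n trials."""
    def solve(k_cdf, target):
        # binom_cdf decreases monotonically with p, so bisect on p.
        lo, hi = 1e-12, 1.0 - 1e-12
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if binom_cdf(k_cdf, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    lower = 0.0 if k == 0 else solve(k - 1, 1.0 - alpha / 2.0)
    upper = 1.0 if k == n else solve(k, alpha / 2.0)
    return lower, upper


def non_da_ratio(teff, p_da, bin_width=1000.0):
    """Weighted non-DA ratio per Teff bin: sum(1 - P_DA) / N, with error bounds."""
    bins = {}
    for t, p in zip(teff, p_da):
        lo = bin_width * floor(t / bin_width)
        weight, count = bins.get(lo, (0.0, 0))
        bins[lo] = (weight + (1.0 - p), count + 1)
    result = {}
    for lo, (weight, count) in sorted(bins.items()):
        # Assumption: round the weighted sum to an integer count for the interval.
        result[lo] = (weight / count, *clopper_pearson(round(weight), count))
    return result
```

Each returned entry maps the lower edge of a temperature bin to a tuple (ratio, lower bound, upper bound).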
At high temperatures, we find large fluctuations in the non-DA ratio. This is likely a small-number statistics effect, as very few helium-atmosphere objects are found above 25 000 K (Genest-Beaulieu & Bergeron 2019). This can also be seen in the mass–temperature diagrams of Fig. 7, where only a few DBs reside in the 30 000–25 000 K temperature range. Increasing the volume limit to include more objects stabilizes the ratio at 15%, which connects nicely with the high-temperature spectral evolution sequence in Bédard et al. (2020) when accounting for the systematic offset mentioned above. Bédard et al. (2020) proposed that the gradual decrease of the non-DA fraction along that sequence can be explained using the float-up model, where a broad range of residual hydrogen diffuses to the surface and turns most helium-rich stars into DA before they reach Teff ∼ 30 000 K.
A bump of non-DA white dwarfs between 30 000 K and 25 000 K followed by a deficit between 25 000 K and 22 000 K was reported by Jiménez-Esteban et al. (2023) and Torres et al. (2023), who analyzed the 100 pc and 400 pc volume-limited samples, respectively, of the GSPC-WD catalogue. According to their analysis, the ratio of non-DA starts increasing around 30 000 K, reaches its peak at 27 500 K with an increase of ∼5% in non-DA stars, and returns to its previous value at 25 000 K. We do not find any evidence for such a bump or deficit in our 100 pc sample. Instead, we find the non-DA ratio to be statistically constant between 29 000 K and 18 000 K, with a value somewhere between 10% and 20%. While the spectral evolution of the full GSPC-WD sample in Fig. 10 may be suggestive of the aforementioned bump and deficit, these features do not exhibit precisely the same characteristics as those identified by Jiménez-Esteban et al. (2023) and Torres et al. (2023). Furthermore, it is essential to consider that the full GSPC-WD sample is not volume-complete, as demonstrated in Fig. 1, and we have yet to account for potential selection biases. Notably, the color-dependent calibration issues discussed in Sect. 4.2, which impact the physical parameters of white dwarfs beginning around 30 000 K (as evident in the DA and DO mass–temperature diagrams in Fig. 7), were not addressed by these authors. The implications of these calibration issues for the spectral evolution beyond 30 000 K remain uncertain; however, it is plausible that they could shift the temperature values enough to produce a subtle perturbation resembling a small bump. In addition, the classification between DA and non-DA in Jiménez-Esteban et al. (2023) and Torres et al. (2023) relies primarily on the optimal photometric fit between DA and non-DA spectra. This classification method has been shown in the past to have diminishing accuracy as the effective temperature exceeds approximately 25 000 K (see Fig. 2 in Bergeron et al. 2019). Consequently, even a minor calibration offset has the potential to introduce significant classification inaccuracies.
Moving on to cooler effective temperatures, the ratio of non-DA steadily increases between 18 000 K and 8000 K, consistent with previous studies (Ourique et al. 2020; Cunningham et al. 2020; Jiménez-Esteban et al. 2023; Torres et al. 2023). This is indeed what is expected from the convective dilution and convective mixing processes taking place within that temperature range (Rolland et al. 2018; Genest-Beaulieu & Bergeron 2019; Cunningham et al. 2020), which cause a gradual transformation of DA into non-DA stars. We note a peculiar dent at the 7000 K bin. While small, it appears statistically significant and goes against the upward trend observed in low-temperature spectral evolution studies (Blouin et al. 2019; McCleery et al. 2020). This feature is greatly accentuated when we calculate the non-DA ratio using simple sums rather than weighted sums, causing the non-DA ratio to drop rapidly between 10 000 K and 6000 K before sharply going back up at 5000 K. Given its sensitivity to how the non-DA ratio is calculated, we hesitate to affirm that this dent is physically meaningful. It does, however, persist in the spectral evolution of the full GSPC-WD sample in Fig. 10. A possible explanation for this dent could simply be that XP spectra become increasingly difficult to classify at low temperatures. We looked at the prediction confidence of each class as a function of temperature for all objects, and found that the classifiers indeed become increasingly confused at around 9000 K. Another factor that could contribute to this dent is that non-DA stars may be turning into hydrogen-rich DC stars at exactly this temperature (Kowalski & Saumon 2006; Caron et al. 2023). Of course, one should also consider the fact that low-temperature objects are fainter, and that their measured temperatures are more uncertain. Furthermore, it is well known that the physics of very cool white dwarfs is still missing important pieces below 6000 K (Saumon et al. 2022), and any physical parameter measured for the coolest white dwarfs should be interpreted with utmost care.
Finally, at the lowest temperatures, we find that the ratio of non-DA increases sharply to ∼50% at 5000 K, before dropping back down to ∼35% at 4000 K. This behavior is consistent with the expectation that white dwarfs tend to become increasingly helium-rich as they cool down (Blouin et al. 2019; McCleery et al. 2020) and then have their atmospheres turn hydrogen-rich, although we find the latter transition to happen at a lower temperature than what was found by Caron et al. (2023). We hypothesize that the DA classifier gradually starts to rely on the slope of the stellar continuum rather than on absorption lines as it tries to classify fainter objects, but only learns to do so around 4000 K, where absorption lines are certain to be absent. As a matter of fact, we find that if the non-DA ratio is calculated by simply summing all the stars classified as non-DA, the transition from helium-rich to hydrogen-rich atmospheres begins in the expected temperature range between 6000 and 5000 K (Caron et al. 2023). As noted earlier, white dwarfs at low temperatures become increasingly difficult to interpret, and the manner in which the ratio of non-DA is calculated has a significant impact on the shape of the spectral evolution.
To summarize, we studied the observed spectral evolution of the 100 pc GSPC-WD sample for stars with masses above 0.45 M⊙ and effective temperatures between 30 000 K and 5000 K. Although our non-DA ratios are systematically overestimated by about 5% compared to previous results in the literature, which can be explained by the use of weighted sums, the global trends remain the same. Of particular note are the lack of a non-DA bump and deficit between 30 000 K and 22 000 K and the presence of a small dent at 7000 K.
5.3. White dwarf luminosity function
We present here the observed 100 pc white dwarf luminosity function (WDLF) of the GSPC-WD, meaning that no corrections are applied for the incompleteness of the survey. For the same reasons outlined in the previous section, we restrict our selection to stars with measured masses above 0.45 M⊙ and temperatures below 30 000 K and above 5000 K, resulting in an approximately 80% volume-complete sample.
The WDLF is a measure of the number of stars per pc³ per unit of bolometric magnitude, which we obtain using the luminosity (L/L⊙) derived from the photometric results of white dwarfs within 100 pc provided in Table 3. The bolometric magnitudes are calculated using the relation $M_{\rm bol} = -2.5\log(L/L_\odot) + M_{\rm bol}^{\odot}$, where $M_{\rm bol}^{\odot}$ is the bolometric magnitude of the Sun. Each object in the sample is then simply added to the appropriate bolometric magnitude bin, and the overall results are divided by the volume defined by a 100 pc sphere.
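The procedure described above can be sketched in a few lines. This is a minimal illustration rather than the code used for Fig. 11. It assumes the standard relation Mbol = −2.5 log(L/L⊙) + Mbol,⊙ with the IAU 2015 nominal solar value Mbol,⊙ = 4.74, which may differ from the value adopted in this work. The 0.5 mag bin width is also an assumption, and the function names are hypothetical.

```python
from math import floor, log10, pi, sqrt

# Assumption: IAU 2015 nominal solar bolometric magnitude.
M_BOL_SUN = 4.74


def bolometric_magnitude(lum_solar):
    """Bolometric magnitude from a luminosity expressed in solar units."""
    return -2.5 * log10(lum_solar) + M_BOL_SUN


def luminosity_function(lum_solar, radius_pc=100.0, bin_width=0.5):
    """Observed WDLF: stars per pc^3 per mag in each bolometric magnitude bin.

    Returns {bin lower edge: (space density, Poisson error)}.
    """
    volume = 4.0 / 3.0 * pi * radius_pc ** 3  # volume of the sphere in pc^3
    counts = {}
    for lum in lum_solar:
        lo = bin_width * floor(bolometric_magnitude(lum) / bin_width)
        counts[lo] = counts.get(lo, 0) + 1
    return {
        lo: (n / volume / bin_width, sqrt(n) / volume / bin_width)
        for lo, n in sorted(counts.items())
    }
```

For example, a star with L/L⊙ = 10⁻⁴ has Mbol = 14.74 under these assumptions and falls in the 14.5–15.0 bin, close to the peak discussed below.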
The luminosity function for the GSPC white dwarfs within 100 pc of the Sun is presented in Fig. 11. Our results are also compared with the volume-complete spectroscopic survey of white dwarfs within 40 pc of the Sun by McCleery et al. (2020, northern survey) and O’Brien et al. (2023, southern survey). We take the photometric effective temperature and surface gravity provided in the respective papers and convert them into luminosity and bolometric magnitude using the evolutionary models described in Sect. 4.2. We also include, for reference only, the theoretical luminosity function from Fontaine et al. (2001) and Limoges et al. (2015) for a total age of 10 Gyr, normalized to our own observational results between Mbol = 14.0 and 14.5. Briefly, the theoretical luminosity functions were obtained using a constant star formation rate, a classic Salpeter initial mass function ($\phi = M^{-2.35}$), an initial-to-final mass relation given by $M_{\rm WD} = 0.4\,e^{0.125M}$, and a main sequence lifetime law given by $t_{\rm MS} = 10\,M^{-2.5}$ Gyr, where $M$ and $M_{\rm WD}$ are in solar units.
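To make the ingredients of the theoretical luminosity function concrete, the sketch below evaluates the three relations quoted above for a given progenitor mass. The function names are illustrative. The last helper simply subtracts the main sequence lifetime from the total age, which is our own simplification and not a description of the full population synthesis.

```python
from math import exp


def salpeter_imf(mass):
    """Salpeter initial mass function, phi = M^-2.35 (unnormalized)."""
    return mass ** -2.35


def final_mass(mass):
    """Initial-to-final mass relation, M_WD = 0.4 exp(0.125 M), solar units."""
    return 0.4 * exp(0.125 * mass)


def ms_lifetime_gyr(mass):
    """Main sequence lifetime, t_MS = 10 M^-2.5 Gyr."""
    return 10.0 * mass ** -2.5


def max_cooling_age_gyr(mass, total_age_gyr=10.0):
    """Longest cooling age available to a progenitor of the given mass."""
    return total_age_gyr - ms_lifetime_gyr(mass)


for m in (1.0, 2.0, 4.0):
    print(m, round(final_mass(m), 3), round(ms_lifetime_gyr(m), 3),
          round(max_cooling_age_gyr(m), 3))
```

With these relations, a 2 M⊙ progenitor leaves a white dwarf of about 0.51 M⊙ after roughly 1.77 Gyr on the main sequence. A 1 M⊙ progenitor has a main sequence lifetime equal to the full 10 Gyr, so in this simple picture only more massive stars contribute white dwarfs.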
Fig. 11. White dwarf luminosity functions of GSPC white dwarfs within 100 pc of the Sun (red line) and spectroscopically confirmed white dwarfs within 40 pc of the Sun (blue line; McCleery et al. 2020; O’Brien et al. 2023). Error bars represent the Poisson statistics of each bolometric magnitude bin. Also shown for reference is the theoretical luminosity function from Fontaine et al. (2001) for a total age of 10 Gyr (black line).
Starting with the hot end of the observed luminosity function, we find a sharp increase of the space density between bolometric magnitudes 7 and 8, which may be partially caused by our effective temperature cut. We find that including hotter objects up to 40 000 K pushes the space densities closer to what is predicted by the theoretical luminosity function of Fontaine et al. (2001), but a statistically significant dent nonetheless remains. We suspect that this feature may be explained by the large 100 pc volume, as we move beyond the thin disk and start reaching more deeply into the thick disk, where the stellar population is older and the space density diminishes. This increase between 7 and 8 mag was also found in the spectroscopic analysis of white dwarfs in the Kiso survey by Limoges & Bergeron (2010) and in the deep proper motion survey of Munn et al. (2017, see their Fig. 12); both studies combined thin and thick disk objects in their luminosity functions. In contrast, studies that focus on the thin disk find a much smoother increase of the white dwarf space density (Harris et al. 2006; De Gennaro et al. 2008; Krzesinski et al. 2009). Lending further support to this idea is the more recent study by Kilic et al. (2017), who showed that treating the thin and thick disk populations separately and then combining them, assuming a 20% thin/thick ratio, could better reproduce the brightest part of the observed luminosity function in Munn et al. (2017). One can visualize the impact of considering the two disks separately with the theoretical luminosity function in Fig. 11, which does not make the distinction between the two, and predicts a smoother increase that is appropriate for the thin disk population.
Moving on to the middle section of the white dwarf luminosity function, theoretical models usually predict a monotonic rise between 8 and 15 mag. As seen in Fig. 11, our observed luminosity function closely follows the theoretical one, apart from the small bump around Mbol ∼ 10. This bump was first noticed almost two decades ago using SDSS data (Harris et al. 2006), and has since been found in numerous other studies, including ours and the latest spectroscopic sample of white dwarfs within 40 pc of the Sun (blue line in Fig. 11). The currently accepted explanation was originally suggested by Limoges et al. (2015), who proposed that the bump can be explained by enhanced star formation around 300 Myr ago. The hypothesis was further explored by Torres & García-Berro (2016), who revisited the analysis of the 40 pc sample with a population synthesis code, as well as a newer initial-to-final mass relation and cooling tracks. They also explained the bump around Mbol = 10 mag as a burst of star formation 600 Myr ago; however, their best-fit model significantly over-predicts the number of white dwarfs near the maximum of the luminosity function. Torres & García-Berro (2016) explained this discrepancy with an initial-final mass relation that has a slope 30% larger than the observed relation for stars more massive than 4 M⊙ from Catalán et al. (2008). As pointed out by Kilic et al. (2017), there is no evidence for such a steep initial-final mass relation and this explanation is unlikely. Instead, Kilic et al. (2017) proposed the contribution of thick disk white dwarfs to the faint end of the luminosity function as a possible explanation for the overabundance of white dwarfs near the maximum of the luminosity function.
At the faint end of our observed luminosity function, we find the peak at Mbol = 15 mag, followed by the so-called dropoff (Fontaine et al. 2001). This region is the most important when it comes to measuring the age of the galactic components and is the most sensitive to selection biases, white dwarf physics, and assumptions behind the theoretical luminosity function (see García-Berro & Oswalt 2016, for a review on the subject). An in-depth analysis of the faint end of the luminosity function is beyond the scope of this paper and will be done in future work. Here, we limit our analysis to the global features and compare them with the volume-complete 40 pc spectroscopic sample and theoretical luminosity function in Fig. 11.
The most obvious difference is, perhaps, the less pronounced peak of our observed luminosity function. It is about 17% smaller in maximum amplitude than that of the 40 pc sample and clearly does not follow the trend predicted by Fontaine et al. (2001). The difference with the 40 pc sample can likely be explained by incompleteness and selection biases, which become particularly important for fainter stars. As we go down in effective temperature, the classifiers become increasingly confused and the proportion of objects that fall below the classification confidence threshold increases, thus reducing the space density we find. On the astrometric side, the combination of thin and thick disks can significantly alter the shape of the white dwarf luminosity function near its peak (Kilic et al. 2017) and may produce what we observe here. It is unsurprising, however, that our results strongly differ from the theoretical predictions of Fontaine et al. (2001) near the peak, as their physics and population synthesis were rudimentary; these have since been updated and now predict a much flatter peak, closer to what we find (see Tononi et al. 2019, and references therein). Another detail to consider is our fixed model atmosphere assumptions, which affect the measured physical parameters. For example, the fixed metal and hydrogen abundances in our DZ fits cause small systematic effects (see Sects. 4.2 and 5.1). A larger average mass would imply smaller radii and lower luminosities, thus shifting the stars to fainter magnitude bins.
Finally, one can also see from Fig. 11 that the dropoff of our observed luminosity function happens about 0.5 mag earlier than that of the 40 pc sample. This can likely be attributed to the simple fact that our WDLF includes a much larger number of objects from the thick disk, which is well known to contain an older white dwarf population (García-Berro & Oswalt 2016; Kilic et al. 2017).
To conclude, the GSPC-WD catalogue offers a nearly volume-complete sample within 100 pc of the Sun with considerable potential for the study of the WDLF. In particular, disentangling the different galactic components and measuring their relative contribution to the total white dwarf luminosity function would be very interesting given the results of Kilic et al. (2017). The faint end of the luminosity function would also make for an interesting in-depth analysis, as the GSPC-WD catalogue offers the largest sample of cool white dwarfs with spectroscopic classifications, a step forward compared to previous studies that typically assume a pure hydrogen or pure helium atmosphere for white dwarf candidates. A more detailed analysis of the GSPC WDLF is currently under way and should be published soon.
5.4. Comparison with previous studies
In this section, we follow the color calibration test performed in Cukanovaite et al. (2021) and Gentile Fusillo et al. (2021) by comparing our stellar parameters based on Gaia XP synthetic photometry with spectroscopically derived parameters from various other studies. We use the 3D-corrected SDSS sample of Tremblay et al. (2019a) for DA white dwarfs, the DB sample of Genest-Beaulieu & Bergeron (2019) with additional 3D corrections (Cukanovaite et al. 2021), and the DQ and DZ samples from Coutu et al. (2019). The DB parameters from Genest-Beaulieu & Bergeron (2019) are also corrected to put van der Waals broadening on the same scale as other samples using the procedure described in Cukanovaite et al. (2021). We also compare our parameters with those obtained from Gaia broadband photometry in the Gentile Fusillo et al. (2021) Gaia-SDSS catalogue for DA and DB white dwarfs.
We cross-match objects of each comparison sample with objects of the appropriate class from our catalogue. We then calculate the difference in Teff and log g for each successfully cross-matched object. We find cross-matches for 50–60% of objects in the Gaia-SDSS catalogue of Gentile Fusillo et al. (2021) and > 60% of objects in the spectroscopic samples. We remark that these percentages do not reflect the completeness of the GSPC-WD. Since our classification scheme only assigns a main spectral type, stars such as DZA, for example, would be classified as either a DZ or DA, depending on which lines are most visible at XP resolution. We display the results of our comparison in Fig. 12 for DA and DB and Fig. 13 for DQ and DZ white dwarfs, where the median difference is calculated in bins of 2000 K for Teff < 20 000 K and in bins of 5000 K for warmer temperatures.
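The binning scheme used for this comparison can be sketched as follows. The text does not say how objects are matched, which temperature defines the bins, or how the difference is normalized. The sketch therefore assumes matching on a shared object identifier, binning on our own Teff, and a relative difference (reference minus ours, divided by ours). All names are hypothetical.

```python
from bisect import bisect_right
from statistics import median


def make_edges(t_min=4000.0, t_split=20000.0, t_max=40000.0):
    """Bin edges: 2000 K wide below t_split and 5000 K wide above it.

    Assumes t_split - t_min is a multiple of 2000 K.
    """
    edges, t = [], t_min
    while t < t_split:
        edges.append(t)
        t += 2000.0
    while t <= t_max:
        edges.append(t)
        t += 5000.0
    return edges


def binned_median_diff(ours, ref, edges):
    """Median relative Teff difference per bin for cross-matched objects.

    ours and ref map an object identifier to Teff in K.
    """
    groups = {}
    for key in ours.keys() & ref.keys():  # keep only cross-matched objects
        i = bisect_right(edges, ours[key]) - 1
        if 0 <= i < len(edges) - 1:
            diff = (ref[key] - ours[key]) / ours[key]
            groups.setdefault(edges[i], []).append(diff)
    return {lo: median(values) for lo, values in sorted(groups.items())}
```

The same function applies to log g if absolute rather than relative differences are accumulated.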
Fig. 12. Comparison between Teff and log g values calculated in this paper and corresponding parameters from the literature (see text). The full lines indicate the medians for the various samples and the coloured areas represent the standard deviation. The median difference was calculated in bins of 2000 K for Teff < 20 000 K and in bins of 5000 K for Teff > 20 000 K.
Fig. 13. Same as Fig. 12, but for DQ and DZ white dwarfs from Coutu et al. (2019).
For DA stars, we find spectroscopic Teff to be ∼5% larger for 10 000 < Teff < 30 000 K. Cukanovaite et al. (2021) and Gentile Fusillo et al. (2021) also found these differences and attributed them to Gaia colour calibration issues. At Teff ≈ 30 000 K, the difference flips sign and spectroscopic values become increasingly cooler than those derived in our work. This reversal coincides with the ultraviolet calibration issues discussed in Sect. 4.1, which are most likely the main culprit behind these differences. We also note a slightly better agreement with the spectroscopic values of Tremblay et al. (2019a) for the hottest DA when compared to the photometric values of Gentile Fusillo et al. (2021), highlighting the importance of including the u band. There is generally good agreement in log g for DA stars cooler than Teff ≈ 30 000 K, above which the surface gravity obtained with XP synthetic photometry becomes larger than the spectroscopic values. This is not surprising because warmer objects tend to have overestimated temperatures due to the calibration issues: at a fixed distance, a hotter model needs a smaller radius to match the observed flux, which implies a larger mass.
For DB stars, we find trends similar to those of the DA stars for temperatures above 10 000 K, although the difference in log g becomes more pronounced for warmer objects starting at the 30 000 K mark. Large differences between our physical parameters and those in Genest-Beaulieu & Bergeron (2019) appear at around 10 000 K and sharply increase with decreasing temperature. This phenomenon has been previously studied by Genest-Beaulieu & Bergeron (2019) and Bergeron et al. (2019), who found that an increasingly large trace of invisible hydrogen was necessary to reproduce the observed atmospheres of the coolest DB stars. Their results indicate that omitting the hydrogen trace causes the photometric Teff and log g to both be significantly overestimated, which is exactly what we find here. Our model atmospheres for DB stars assume a pure helium composition (see Sect. 4.2.2), and thus their parameters at low effective temperatures should be interpreted with care.
Finally, for DQ and DZ stars, the overall agreement is very good, with an average difference typically smaller than 5% for Teff and 0.1 dex for log g. The difference becomes slightly larger for the DZ at the warmest temperatures, though this can be explained by our model atmospheres assuming a fixed trace of hydrogen (see Sect. 4.2.6), which may not be appropriate for all temperatures.
6. Conclusion
In this paper, we have spectroscopically classified ∼100 000 white dwarf XP spectra as one of six possible types (DA, DB, DC, DO, DQ, DZ) and measured their physical parameters using synthetic SDSS photometry, nearly tripling the number of white dwarfs with spectroscopic classification. We summarize the major results from this work below:
- We have demonstrated that XP spectra have sufficient resolution to achieve reliable classification across the six spectroscopic types listed above, for most stars. We have validated the classifications using the Gaia H-R diagram and by recovering many cornerstone features in the physical parameter distributions, the spectral evolution and the luminosity function of the GSPC white dwarfs.
- We have measured the physical parameters of the white dwarfs using class-appropriate model atmospheres. We recover the expected mass and temperature distributions obtained by higher-resolution spectroscopic studies applied to smaller numbers of objects, extending them to include a much larger sample of stars. This is a significant step beyond simply assuming pure hydrogen or mixed atmospheres for the Gaia white dwarf candidates.
- The blue region of the XP spectra currently suffers from color-dependent calibration issues, causing large systematic errors in the measurement of physical parameters of the hottest white dwarfs. We calculated a correction to the synthetic u band to alleviate this issue. However, despite this correction, we advise caution when interpreting results above Teff = 40 000 K.
- The spectral evolution of GSPC white dwarfs within 100 pc of the Sun was studied through the ratio of non-DA as a function of effective temperature. We found that the spectral evolution closely resembles what was found in previous studies.
- We have carried out an initial study of the observed GSPC white dwarf luminosity function and found evidence supporting the idea that distinctly considering the thin and thick disks could help explain some of its features. Given the high level of completeness of the GSPC-WD 100 pc sample and our atmosphere-appropriate measurement of the physical parameters of the white dwarfs, the white dwarf luminosity function we have obtained is ripe for a new and deeper volume-limited analysis, which is currently underway.
The DR3 release from the Gaia mission is already providing us with exciting opportunities to perform statistical studies of large samples of white dwarf stars. The DR4 release, planned for late 2025, will provide an even larger number of XP spectra and will increase the quality of the current sample thanks to an extended data collection period and consequently longer exposures. It will be in the interest of the community to prepare for this incoming data set by developing tools such as improved classifiers, trained on synthetic data, and methods to measure the physical parameters directly from the XP spectral coefficients rather than from byproducts, such as synthetic photometry or sampled spectra.
The models can be found here: https://www.astro.umontreal.ca/~bergeron/CoolingModels
Acknowledgments
We are grateful to the anonymous referee for a careful reading of our manuscript and for several constructive comments that helped to improve this paper significantly. The authors also thank Antoine Bédard for useful discussions on the topics of spectral evolution and hot white dwarfs. This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular, the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia archive website is https://archives.esac.esa.int/gaia. This work is supported in part by the United Kingdom Space Agency (Grants: ST/K000578/1, ST/N000978/1, ST/S001123/1, ST/W002809/1, ST/X001687/1), NSERC Canada and by the Fund FRQ-NT (Québec).
References
- Abdurro’uf, Accetta, K., Aerts, C., et al. 2022, ApJS, 259, 35
- Andrae, R., Rix, H.-W., & Chandra, V. 2023, ApJS, 267, 8
- Bailer-Jones, C. A. L., Fouesneau, M., & Andrae, R. 2019, MNRAS, 490, 5615
- Bédard, A., Bergeron, P., Brassard, P., & Fontaine, G. 2020, ApJ, 901, 93
- Bédard, A., Bergeron, P., & Brassard, P. 2022, ApJ, 930, 8
- Bergeron, P., Ruiz, M. T., & Leggett, S. K. 1997, ApJS, 108, 339
- Bergeron, P., Wesemael, F., Dufour, P., et al. 2011, ApJ, 737, 28
- Bergeron, P., Dufour, P., Fontaine, G., et al. 2019, ApJ, 876, 67
- Bergeron, P., Kilic, M., Blouin, S., et al. 2022, ApJ, 934, 36
- Blouin, S., & Dufour, P. 2019, MNRAS, 490, 4166
- Blouin, S., & Xu, S. 2022, MNRAS, 510, 1059
- Blouin, S., Dufour, P., & Allard, N. F. 2018, ApJ, 863, 184
- Blouin, S., Dufour, P., Thibeault, C., & Allard, N. F. 2019, ApJ, 878, 63
- Blouin, S., Bédard, A., & Tremblay, P.-E. 2023, MNRAS, 523, 3363
- Camisassa, M., Torres, S., Hollands, M., et al. 2023, A&A, 674, A213
- Caron, A., Bergeron, P., Blouin, S., & Leggett, S. K. 2023, MNRAS, 519, 4529
- Catalán, S., Isern, J., García-Berro, E., et al. 2008, A&A, 477, 213
- Chandra, V., Hwang, H.-C., Zakamska, N. L., & Budavári, T. 2020, MNRAS, 497, 2688
- Cheng, S., Cummings, J. D., & Ménard, B. 2019, ApJ, 886, 100
- Clopper, C. J., & Pearson, E. S. 1934, Biometrika, 26, 404
- Coutu, S., Dufour, P., Bergeron, P., et al. 2019, ApJ, 885, 74
- Cukanovaite, E., Tremblay, P.-E., Bergeron, P., et al. 2021, MNRAS, 501, 5274
- Cunningham, T., Tremblay, P.-E., Gentile Fusillo, N. P., Hollands, M., & Cukanovaite, E. 2020, MNRAS, 492, 3540
- De Angeli, F., Weiler, M., Montegriffo, P., et al. 2023, A&A, 674, A2
- De Gennaro, S., von Hippel, T., Winget, D. E., et al. 2008, AJ, 135, 1
- Dufour, P., Bergeron, P., & Fontaine, G. 2005, ApJ, 627, 404
- Dufour, P., Bergeron, P., Liebert, J., et al. 2007, ApJ, 663, 1291
- Dufour, P., Blouin, S., Coutu, S., et al. 2017, ASP Conf. Ser., 509, 3
- Echeverry, D., Torres, S., Rebassa-Mansergas, A., & Ferrer-Burjachs, A. 2022, A&A, 667, A144
- Eisenstein, D. J., Liebert, J., Harris, H. C., et al. 2006, ApJS, 167, 40
- El-Badry, K., Rix, H.-W., & Weisz, D. R. 2018, ApJ, 860, L17
- Fontaine, G., Brassard, P., & Bergeron, P. 2001, PASP, 113, 409
- Gaia Collaboration (Brown, A. G. A., et al.) 2016, A&A, 595, A2
- Gaia Collaboration (Babusiaux, C., et al.) 2018a, A&A, 616, A10
- Gaia Collaboration (Brown, A. G. A., et al.) 2018b, A&A, 616, A1
- Gaia Collaboration (Brown, A. G. A., et al.) 2021, A&A, 649, A1
- Gaia Collaboration (Montegriffo, P., et al.) 2023a, A&A, 674, A33
- Gaia Collaboration (Vallenari, A., et al.) 2023b, A&A, 674, A1
- García-Berro, E., & Oswalt, T. D. 2016, New Astron. Rev., 72, 1
- Genest-Beaulieu, C., & Bergeron, P. 2019, ApJ, 871, 169
- Gentile Fusillo, N. P., Tremblay, P.-E., Gänsicke, B. T., et al. 2019, MNRAS, 482, 4570
- Gentile Fusillo, N. P., Tremblay, P. E., Cukanovaite, E., et al. 2021, MNRAS, 508, 3877
- Hardy, F., Dufour, P., & Jordan, S. 2023a, MNRAS, 520, 6111
- Hardy, F., Dufour, P., & Jordan, S. 2023b, MNRAS, 520, 6135
- Harris, H. C., Munn, J. A., Kilic, M., et al. 2006, AJ, 131, 571
- Holberg, J. B., & Bergeron, P. 2006, AJ, 132, 1221
- Hollands, M. A., Koester, D., Alekseev, V., Herbert, E. L., & Gänsicke, B. T. 2017, MNRAS, 467, 4970
- Hollands, M. A., Tremblay, P. E., Gänsicke, B. T., Gentile-Fusillo, N. P., & Toonen, S. 2018, MNRAS, 480, 3942
- Jiménez-Esteban, F. M., Torres, S., Rebassa-Mansergas, A., et al. 2019, in The Gaia Universe, 48
- Jiménez-Esteban, F. M., Torres, S., Rebassa-Mansergas, A., et al. 2023, MNRAS, 518, 5106
- Kepler, S. O., Koester, D., Pelisoli, I., Romero, A. D., & Ourique, G. 2021, MNRAS, 507, 4646
- Kilic, M., Munn, J. A., Harris, H. C., et al. 2017, ApJ, 837, 162
- Kilic, M., Hambly, N. C., Bergeron, P., Genest-Beaulieu, C., & Rowell, N. 2018, MNRAS, 479, L113
- Kilic, M., Bergeron, P., Kosakowski, A., et al. 2020, ApJ, 898, 84
- Kleinman, S. J., Kepler, S. O., Koester, D., et al. 2013, ApJS, 204, 5
- Kowalski, P. M., & Saumon, D. 2006, ApJ, 651, L137
- Krzesinski, J., Kleinman, S. J., Nitta, A., et al. 2009, A&A, 508, 339
- Limoges, M. M., & Bergeron, P. 2010, ApJ, 714, 1037
- Limoges, M. M., Bergeron, P., & Lépine, S. 2015, ApJS, 219, 19
- Lindegren, L., Bastian, U., Biermann, M., et al. 2021, A&A, 649, A4
- McCleery, J., Tremblay, P.-E., Gentile Fusillo, N. P., et al. 2020, MNRAS, 499, 1890
- Montegriffo, P., De Angeli, F., Andrae, R., et al. 2023, A&A, 674, A3
- Munn, J. A., Harris, H. C., von Hippel, T., et al. 2017, AJ, 153, 10
- O’Brien, M. W., Tremblay, P. E., Gentile Fusillo, N. P., et al. 2023, MNRAS, 518, 3055
- Ourique, G., Kepler, S. O., Romero, A. D., Klippel, T. S., & Koester, D. 2020, MNRAS, 492, 5003
- Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, J. Mach. Learn. Res., 12, 2825
- Press, W. H., Flannery, B. P., & Teukolsky, S. A. 1986, Numerical Recipes. The Art of Scientific Computing (Cambridge: Cambridge University Press)
- Rolland, B., Bergeron, P., & Fontaine, G. 2018, ApJ, 857, 56
- Saumon, D., Blouin, S., & Tremblay, P.-E. 2022, Phys. Rep., 988, 1
- Sion, E. M., Greenstein, J. L., Landstreet, J. D., et al. 1983, ApJ, 269, 253
- Smart, R., Sarro, L., Rybizki, J., et al. 2021, The 20.5th Cambridge Workshop on Cool Stars, Stellar Systems, and the Sun, CS20.5, 81 [Google Scholar]
- Tononi, J., Torres, S., García-Berro, E., et al. 2019, A&A, 628, A52 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Torres, S., & García-Berro, E. 2016, A&A, 588, A35 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Torres, S., Cruz, P., Murillo-Ojeda, R., et al. 2023, A&A, 677, A159 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Tremblay, P. E., Bergeron, P., & Gianninas, A. 2011, ApJ, 730, 128 [Google Scholar]
- Tremblay, P. E., Cukanovaite, E., Gentile Fusillo, N. P., Cunningham, T., & Hollands, M. A. 2019a, MNRAS, 482, 5222 [Google Scholar]
- Tremblay, P.-E., Fontaine, G., Gentile Fusillo, N. P., et al. 2019b, Nature, 565, 202 [CrossRef] [Google Scholar]
- Vincent, O., Bergeron, P., & Dufour, P. 2023, MNRAS, 521, 760 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Classification algorithm details and data processing
This appendix provides the details needed to reproduce our results using the classification approach and data described in Section 3. First, the 110 XP coefficients were normalized to G = 15 following the recommendations in Andrae et al. (2023), by dividing all the coefficients of each star by $10^{(15 - G)/2.5}$. For each binary classifier, we then weighted the labels according to the number of objects in each class, for example DA versus non-DA for the DA classifier, DB versus non-DB for the DB classifier, and so on. The weight of each class ($w_{\rm class}$) is calculated using the formula $w_{\rm class} = 1/2N_{\rm class}$, where $N_{\rm class}$ is the number of objects in the class. We found that the weights for very imbalanced classes, such as DO stars, become very large and cause performance issues. We therefore bring large weights ($w_{\rm class} > 10$) back to a more reasonable value by dividing them by 10, which results in better performance.
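The normalization and class weighting described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the weight is assumed to be $N_{\rm total}/(2N_{\rm class})$, since the formula as printed gives no explicit numerator and weights above 10 are only possible if the total number of training objects enters. That reading matches the common "balanced" heuristic for two classes, but it remains an assumption.

```python
from collections import Counter


def normalize_xp_coefficients(coeffs, g_mag, g_ref=15.0):
    """Rescale the XP coefficients of one star to the flux level of G = g_ref."""
    scale = 10.0 ** ((g_ref - g_mag) / 2.5)
    return [c / scale for c in coeffs]


def class_weights(labels, cap=10.0, shrink=10.0):
    """Per-class weights for one binary classifier (e.g. DO versus non-DO).

    ASSUMPTION: w_class = N_total / (2 * N_class). The numerator is our
    reading of the text, not something the text states explicitly.
    Weights above `cap` are divided by `shrink`, as described in the text.
    """
    counts = Counter(labels)
    n_total = len(labels)
    weights = {}
    for cls, n_cls in counts.items():
        w = n_total / (2.0 * n_cls)
        if w > cap:
            w /= shrink
        weights[cls] = w
    return weights


# Toy example: a G = 10 star is 100 times brighter than a G = 15 star.
normalized = normalize_xp_coefficients([200.0, -50.0], g_mag=10.0)

# Toy example: 2 DO stars among 100 training objects.
toy_labels = ["DO"] * 2 + ["non-DO"] * 98
toy_weights = class_weights(toy_labels)
sample_weight = [toy_weights[label] for label in toy_labels]
```

In this toy case the raw DO weight would be 25, which exceeds the cap and is therefore reduced to 2.5, while the non-DO weight stays close to 0.5.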
As for the classifier hyperparameters, we use the same setup for every classifier and keep most of the default options set in the GradientBoostingClassifier object as of version 1.1.2 of Scikit-learn. The only changes in parameters were the number of estimators (n_estimators=225), the learning rate (learning_rate=0.1) and the maximum tree depth (max_depth=3).
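For reference, the quoted setup could be written along these lines. It is a sketch under stated assumptions: `GBC_PARAMS` and `make_binary_classifier` are hypothetical names, only the three quoted parameters are set, and the behavior of the remaining defaults depends on the installed scikit-learn version (the text refers to 1.1.2).

```python
# Hyperparameters quoted in the text; all other options are left at the
# scikit-learn defaults.
GBC_PARAMS = {"n_estimators": 225, "learning_rate": 0.1, "max_depth": 3}


def make_binary_classifier(**overrides):
    """Return an unfitted GradientBoostingClassifier for one spectral class."""
    # Imported lazily so the parameter table can be inspected without
    # scikit-learn being installed.
    from sklearn.ensemble import GradientBoostingClassifier

    return GradientBoostingClassifier(**{**GBC_PARAMS, **overrides})


# Hypothetical usage, with X the normalized XP coefficients, y the binary
# labels (e.g. DA versus non-DA) and w the per-object class weights:
#   clf = make_binary_classifier().fit(X, y, sample_weight=w)
#   p_class = clf.predict_proba(X_new)[:, 1]
```

One such classifier would be built per spectral class, each with the same hyperparameters.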
Appendix B: Effective temperature coverage
We present here the effective temperature coverage of the training set of each spectral class as defined in Section 3. Figure B.1 shows the normalized count of training objects for each class in bins of 500 K (2000 K for DO) using temperatures measured in this work. As a comparison point, we also show the temperature coverage obtained from the MWDD by selecting objects for which the spectral class exactly corresponds to the relevant class (e.g. we only take DA if their spectral type on the MWDD is exactly "DA"), and use the default effective temperatures provided by the database. While this is obviously an imperfect comparison, as the spectral classification could be based on wavelength regions other than optical and the effective temperatures come from various sources, it still provides a rough idea of how the SDSS biases impact our results.
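The binning behind such a coverage curve can be illustrated with a short sketch. The function name is hypothetical, and the counts are assumed to be normalized to unit sum; the text does not specify whether the curves are normalized to their sum or to their peak.

```python
def normalized_teff_histogram(teff_values, bin_width=500.0):
    """Return {bin lower edge: fraction of objects} for fixed-width Teff bins.

    ASSUMPTION: counts are divided by the total number of objects.
    """
    counts = {}
    for teff in teff_values:
        edge = bin_width * (teff // bin_width)
        counts[edge] = counts.get(edge, 0) + 1
    total = sum(counts.values())
    return {edge: n / total for edge, n in sorted(counts.items())}


# Toy temperatures in kelvin: 500 K bins for most classes, 2000 K for DO.
da_coverage = normalized_teff_histogram([10100, 10400, 10600, 25000])
do_coverage = normalized_teff_histogram([46000, 47500, 80000], bin_width=2000.0)
```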
Fig. B.1. Effective temperature coverage of our training set (red line), white dwarfs on the MWDD (black dashed line) and our full classification of the GSPC-WD sample (blue line).
All Tables
Training examples and average precision and recall test scores of the cross-validation and top-5 classifier ensembles at a 0.6 threshold.
High-confidence white dwarfs per class in the GSPC-WD sample and in the Gaia-SDSS catalogue.
All Figures
Fig. 1. Number of objects in the Gentile Fusillo et al. (2021) white dwarf candidate catalogue with PWD > 0.65 (red line) and number of objects in the GSPC-WD catalogue (black line) as a function of distance. The gray dashed lines indicate distances of 50 and 100 pc.
Fig. 2. Gallery of XP spectra for Gaia objects with SDSS-confirmed spectroscopic type (indicated in the top right corner of each panel). Three spectra of varying G magnitudes are displayed to illustrate the difference between brighter and fainter stars. Spectral lines typically used for classification are displayed for reference (see text). The Gaia DR3 source identification and the spectroscopic type probability predicted by our classifiers (Pclass) are also shown.
Fig. 3. F-score curves of the top five performing classifier ensembles. A threshold value of 0.65 maximizes the mean F-scores for all classes.
Fig. 4. H–R diagram of GSPC-WD objects with high classification confidence and converged fits. The location of objects for every class is consistent with previous spectroscopic studies. For clarity, background objects (grey points) are restricted to those with G < 18 and a parallax measurement error less than 1%, and only a random selection of 25% of all DA stars is shown.
Fig. 5. Color–color diagrams of objects with both real and synthetic SDSS photometry. For clarity, only objects with G < 17 are shown. The top panel contains the two color–color distributions before the u band correction (see text), displaying a significant color-dependent shift between the two. The middle panel shows the distributions after the u band corrections have been applied to the synthetic photometry. The bottom panel shows the corrected synthetic photometry color–color diagram with points colored by their XP spectroscopic classification. On all three panels, the cooling tracks for pure hydrogen (full black line) and pure helium (dashed black line) are shown at a constant mass of 0.6 M⊙.
Fig. 6. Median difference between real and synthetic SDSS photometry for bins of 0.2 synthetic u magnitudes. Error bars correspond to the 67.5th percentile of each bin.
Fig. 7. Mass–effective temperature diagrams for the six main spectroscopic types in the GSPC-WD sample. For clarity, the background objects (gray dots) are a fixed random selection of 50% of all GSPC-WD objects, from which objects from the class being presented are excluded, and only 20% of all DA are shown.
Fig. 7. continued.
Fig. 8. Cumulative mass distributions for the six spectral types in the GSPC-WD sample. The total number of stars in each histogram and the mean mass are shown in the top right corner. For DZ stars, the red and black histograms represent atmospheres with no hydrogen and a hydrogen abundance of log H/He = −3, respectively. The dashed line indicates the fiducial mean mass of 0.6 M⊙.
Fig. 9. Mass–temperature diagrams of the GSPC cool white dwarfs within 100 pc of the Sun. The top panel shows the diagram for DA stars, and the bottom panel shows the non-DA stars. Also shown as solid black curves are theoretical isochrones, labeled in units of Gyr, obtained from cooling sequences with C/O-core compositions, $q({\rm He}) = 10^{-2}$, and $q({\rm H}) = 10^{-4}$. The lower blue solid curve indicates the onset of crystallization at the centre of evolving models, and the upper one indicates the locations where 80% of the total mass has solidified. The dashed line indicates the fiducial mean mass of 0.6 M⊙.
Fig. 10. Spectral evolution curves of the GSPC-WD sample for objects within 100 pc of the Sun (black line) and the entire sample (red line). The non-DA ratio is calculated by summing the number of non-DA white dwarfs in each temperature bin, weighted according to their probability of being a DA, divided by the total number of stars within the bin.
Fig. 11. White dwarf luminosity functions of GSPC white dwarfs within 100 pc of the Sun (red line) and spectroscopically confirmed white dwarfs within 40 pc of the Sun (blue line; McCleery et al. 2020; O’Brien et al. 2023). Error bars represent the Poisson statistics of each bolometric magnitude bin. Also shown for reference is the theoretical luminosity function from Fontaine et al. (2001) for a total age of 10 Gyr (black line).
Fig. 12. Comparison between Teff and log g values calculated in this paper and corresponding parameters from the literature (see text). The full lines indicate the medians for the various samples and the coloured areas represent the standard deviation. The median difference was calculated in bins of 2000 K for Teff < 20 000 K and in bins of 5000 K for Teff > 20 000 K.
Fig. 13. Same as Fig. 12, but for DQ and DZ white dwarfs from Coutu et al. (2019).