Open Access
Issue
A&A
Volume 671, March 2023
Article Number A52
Number of page(s) 20
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/202244764
Published online 06 March 2023

© The Authors 2023

Licence Creative CommonsOpen Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1 Introduction

The European Space Agency’s Gaia mission (Gaia Collaboration 2016) repeatedly scans the entire sky and observes point-like or nearly point-like objects at magnitudes from approximately G = 3m to G = 20.7m. These observations include aperture-free low-resolution spectrophotometry covering a wavelength range from approximately 330 nm to 1050 nm, with a spectral resolution of about 30 to 100 (Carrasco et al. 2021). The third Gaia Data release (Gaia DR3, Gaia Collaboration 2023) includes these spectrophotometric observations for about 219 million astronomical objects (De Angeli et al. 2023). The purpose of these spectra is to lend support to the calibration process for the astrometric observations of Gaia and to characterise the observed astronomical objects (Gaia Collaboration 2016). Spectroscopy at very low-resolution is generally not well suited for the study of spectral lines, as the contrast between the line and the underlying continuum is poor, the shape of the lines is not or only partially resolved, and blending of different lines might be strong. However, the exceptional properties of the Gaia spectrophotometry, such as the all-sky coverage from a stable space environment, its large degree of completeness, down to G = 17.65m in Gaia DR3, and the wide wavelength range covered, motivates efforts for analysing spectral lines in Gaia’s low-resolution spectra as well. Eventually, Gaia’s final data release will comprise hundreds of low-resolution epoch spectra collected over several years, along with a completeness to about G = 20.7m.

As a consequence of the self-calibration strategy for Gaia spectra, they are represented by a linear combination of basis functions (Carrasco et al. 2021). This distinguishes Gaia low-resolution spectra from typical spectra, which consist of individual flux values on a grid of wavelength points. The implications of this difference in spectral representation have to be taken into account when answering questions, such as whether there are spectral lines present in the Gaia low-resolution spectrum for a given astronomical object and what the relevant level of statistical significance would be; where the lines are located in terms of wavelength and what the associated uncertainty would be; what the line strength is, whether the line is broad, and what information we might be able to gather from its shape. In the following, we work out the answers to these questions.

The structure of this work is as follows. In Sect. 2, we describe in detail the differences between typical spectra and spectra represented by a linear combination of basis functions, as is the case for Gaia. The unconventional properties of the spectra of the latter kind are worked out there. Section 3 summarises the particular properties of Gaia DR3 low-resolution spectra. In Sect. 4, we summarise the relevant mathematical properties of Hermite functions and derive mathematical relations required in this work. In Sect. 5, we address relevant aspects of the formation of Gaia spectra. In Sect. 6, we discuss the problem of inverting spectra. In the following sections, we discuss spectral lines as local extrema (Sect. 7), the cases of narrow (Sect. 8), and broad lines (Sect. 9), and the use of higher derivatives (Sect. 10). Finally, we provide some example applications to Gaia DR3 spectra in Sect. 11 and close this work with a summary and discussion in Sect. 12.

thumbnail Fig. 1

Sketch illustrating the difference between a typical spectrum and a spectrum represented by a linear combination of basis functions. Top panel: typical spectrum, consisting of sampled flux values on a wavelength grid (black symbols) derived from the spectral photon distribution (red line). Central panel: coefficients representing the same data set by a linear combination of basis functions. Black symbols show the noisy representation, red dots the true coefficients. Bottom panel: linear combination of basis functions (black line) and the 1-σ error interval (grey shaded region) as a function of wavelength.

2 Representing spectra with basis functions

Typical astronomical spectra consist of flux measurements at different wavelength points or bins. The flux measurements usually have low correlations and are significantly correlated only over very short distances in wavelength, if the line spread function (LSF; the point spread function integrated along the direction perpendicular to the dispersion direction) is oversampled. This is illustrated in the sketch in the top panel of Fig. 1. The true spectral photon distribution (SPD, red line) is sampled at certain wavelength points, with uncorrelated normally distributed random noise (black symbols) in this example. A spectral line (shown as an emission line in the sketch) is manifested by flux points deviating from an assumed continuum and the significance of the line can, in principle, be derived from how many standard deviations its data points deviate from the continuum.

If the same data set is instead represented by a linear combination of basis functions, then the coefficients of the development tend to have low correlations over small distances in coefficient indices. This is shown as a sketch in the central panel of Fig. 1, for the same set of data points. From these coefficients, the spectrum can be constructed by computing the actual linear combination on a continuous wavelength axis. This is shown in the bottom panel of Fig. 1. Here, noise manifests not as scatter of points around a mean, as is the case for the typical spectrum in the top panel, but in continuous, wavy structures. The shaded region in the bottom panel of Fig. 1 gives the formal 1-σ error interval around the mean spectrum. It is not trivial to judge visually, which wavy pattern is noise, and which actually is a spectral line. In this work, we derive a formalism to detect lines, ideally from the coefficients describing the spectrum directly, as well as a statistics to quantify the significance of wavy structures in Gaia’s low-resolution spectra.

3 Gaia low resolution spectra

We summarise in this section the properties of Gaia low resolution spectra as far as they are relevant in the context of this work. This summary is necessarily brief and incomplete and it is a compilation of information provided by Carrasco et al. (2021) for the principles of the calibration of inner-instrumental effects, by De Angeli et al. (2023) for the actual processing and validation of Gaia DR3 data, and by Montegriffo et al. (2023) for the absolute flux and wavelength calibration. We refer to these works for more in-depth discussions.

The Gaia low resolution spectrophotometer consists of two parts, producing independent spectra for each observed celestial object on the wavelength intervals of approximately 330–680 nm and 640–1050 nm, respectively. The spectra on the short wavelength range are referred to as ‘blue photometer’ (BP) spectra, and the spectra for the longer wavelength range as ‘red photometer’ (RP) spectra. We use the term XP spectra to refer to both types of Gaia spectra. From observed XP spectra of a particular source, transiting Gaia’s fields of view repeatedly, the ‘mean’ BP and RP spectra can be constructed. In this process, referred to as the ‘internal calibration’, any inner-instrumental effects – for instance, variations in response or line spread functions with time and position in the focal plane – are suppressed. The mean BP and RP spectra for an astronomical object are represented by a linear combination of 55 Hermite functions in BP and RP, respectively, in Gaia DR3. The coefficients of these representations, together with the corresponding errors and correlations, are part of Gaia DR3. The argument of the Hermite functions is a linear function of pseudo-wavelength, u, which is a coordinate along the wavelength dimension in the focal plane common to all BP and RP spectra, respectively, and which corresponds to a mean CCD pixel scale. The dispersion function links the pseudo-wavelength to the actual wavelength, λ. The value of the linear combination of Hermite functions is the flux of the source, in photo-electrons per second, per unit of pseudo-wavelength and within the Gaia aperture of 0.7872 m2. The coefficients of the linear combination refer not to Hermite functions directly, but to an orthogonal transformation of Hermite functions. The orthogonal transformations are different for BP and RP, and the transformation matrices are available with Gaia DR3.

For the 55 orthogonal transformed Hermite functions, corresponding basis functions in absolute flux and wavelength units are provided with Gaia DR3. The same coefficients that represent the internally calibrated BP or RP spectrum also represent the spectral energy distribution when using the corresponding external basis functions. The transformation of internal spectra to external spectra however is intrinsically an ill-conditioned problem (Weiler et al. 2020) and thus sensitive to systematic errors introduced to what essentially is a deconvolution process. For the analysis of spectral lines, the use of internally calibrated spectra is therefore preferable. Furthermore, the Hermite functions used in the representation of internally calibrated BP and RP spectra provide very useful mathematical properties that we can exploit in the analysis of spectral lines.

Table 1

List of notations used throughout this work.

4 Some mathematical properties of linear combinations of Hermite functions

In this section, we compile and derive the mathematical properties and relationships involving Hermite functions that we need in the context of this work. We separate the mathematical aspects in this section from the discussion of the instrumental and astro-physical aspects. In doing so, we simplify the structure of this work, by not interrupting physical and astronomical discussions later in the work with the introduction of mathematical properties related to Hermite functions. For easy reference, Table 1 provides an overview of the nomenclature that is used throughout this paper.

4.1 Hermite functions and linear combinations thereof

Hermite functions appear in a number of mathematical and scientific contexts. In physics, they own their relevance widely to being the eigenfunctions of the quantum mechanical harmonic oscillator. A detailed discussion of Hermite functions and their properties may therefore be found in most basic text books on quantum mechanics (e.g. Magnasco 2013). The nth Hermite function we denote with φn(x), with n = 0, 1, 2,… The index n we refer to as the ‘order’ of the Hermite function. In the context of this work, we need two basic recurrence relations for Hermite functions φn(x), n = 0, 1, 2, etc. The first recurrence relation is: xφn(x)=n2φn1(x)+n+12φn+1(x),$x\,{\varphi _n}\left( x \right) = \sqrt {{n \over 2}} {\varphi _{n - 1}}\left( x \right) + \sqrt {{{n + 1} \over 2}} {\varphi _{n + 1}}\left( x \right),$(1)

together with φ0(x)=π1/4e12x2,φ1(x)=π1/42xe12x2.$\matrix{ {{\varphi _0}\left( x \right) = {\pi ^{ - 1/4}}{{\rm{e}}^{ - {1 \over 2}{x^2}}},} \hfill \cr {{\varphi _1}\left( x \right) = {\pi ^{ - 1/4}}\sqrt 2 \,x\,{{\rm{e}}^{ - {1 \over 2}{x^2}}}.} \hfill \cr } $

From this relation between Hermite functions of different order n and the first two functions given explicitly, any Hermite function can easily be computed for any argument, x.

The second recurrence relation connects the first derivative of a Hermite function with Hermite functions. In this work, we denote the kth derivative of a function with the superscript “(k)”. For Hermite functions, we have the following relation: φn(1)(x)=n2φn1(x)n+12φn+1(x).$\varphi _n^{\left( 1 \right)}\left( x \right) = \sqrt {{n \over 2}} {\varphi _{n - 1}}\left( x \right) - \sqrt {{{n + 1} \over 2}} {\varphi _{n + 1}}\left( x \right).$(2)

The Hermite functions are a set of orthonormal functions on ℝ, i.e. the integrals over the product of two Hermite functions over the entire real axis obeys the orthonormality relation φi(x)φj(x)dx=δij,$\int\limits_{ - \infty }^\infty {{\varphi _i}\left( x \right)} \cdot {\varphi _j}\left( x \right){\rm{d}}x = {\delta _{i\,j}},$(3)

with δij the Kronecker delta. Furthermore, we consider linear combinations of Hermite functions, for instance, functions of the form: ϕ(x)=n=0Ncnφn(x),$\phi \left( x \right) = \sum\limits_{n = 0}^N {{c_n}} {\varphi _n}\left( x \right),$(4)

with the cn the coefficients of the linear combination. We may write the N + 1 coefficients as an N + 1 element vector c. With the orthonormality expressed in Eq. (3), the coefficients cn in Eq. (4) become cn=ϕ(x)φn(x)dx.${c_n} = \int\limits_{ - \infty }^\infty {\phi \left( x \right)} \cdot {\varphi _n}\left( x \right){\rm{d}}x.$(5)

4.2 Integrals over linear combinations of Hermite functions

The application of Eq. (3) turns the computation of integrals over the real axis over products of Hermite functions into a trivial problem. Using the relationships of Eq. (2) also makes the computation of integrals over finite intervals [a, b] on the real axis simple. We get: abφn(1)(x)dx=n2abφn1(x)dxn+12abφn+1(x)dx,$\int\limits_a^b {\varphi _n^{\left( 1 \right)}} \left( x \right){\rm{d}}x = \sqrt {{n \over 2}} \int\limits_a^b {{\varphi _{n - 1}}} \left( x \right){\rm{d}}x - \sqrt {{{n + 1} \over 2}} \int\limits_a^b {{\varphi _{n + 1}}} \left( x \right){\rm{d}}x,$(6)

and therefore abφn+2(x)dx=n+1n+2abφn(x)dx        +2n+1[ φn+1(a)φn+1(b) ].$\matrix{ {\int\limits_a^b {{\varphi _{n + 2}}} \left( x \right){\rm{d}}x = \sqrt {{{n + 1} \over {n + 2}}} \int\limits_a^b {{\varphi _n}} \left( x \right){\rm{d}}x} \hfill \cr {\quad \quad \quad \quad \quad \,\,\, + \sqrt {{2 \over {n + 1}}} \left[ {{\varphi _{n + 1}}\left( a \right) - {\varphi _{n + 1}}\left( b \right)} \right].} \hfill \cr } $(7)

Together with the indefinite integrals for the first two Hermite functions (Brychkov et al. 1989), φ0(x)dx=12π14erf(x2)$\int {{\varphi _0}\left( x \right){\rm{d}}x} = {1 \over {\sqrt 2 }}{\pi ^{{1 \over 4}}}{\rm{erf}}\left( {{x \over {\sqrt 2 }}} \right)$(8) φ1(x)dx=2π14ex22,$\int {{\varphi _1}\left( x \right){\rm{d}}x} = - \sqrt 2 \,{\pi ^{ - {1 \over 4}}}{{\rm{e}}^{ - {{{x^2}} \over 2}}},$(9)

the integrals over any interval on the real axis can be computed iteratively for all orders. In particular, for integrations of the entire real axis, that is, a = −∞ and b = ∞, we obtain φ2n+1(x)dx=0$\int\limits_{ - \infty }^\infty {{\varphi _2}_{n + 1}} \left( x \right){\rm{d}}x = 0$(10)

and the integrals over the even order Hermite functions can be computed from φ0(x)dx=π142,$\int\limits_{ - \infty }^\infty {{\varphi _0}} \left( x \right){\rm{d}}x = {\pi ^{{1 \over 4}}}\sqrt 2 ,$(11) φ2(n+1)(x)dx=n+12n+1φ2n(x)dx.$\int\limits_{ - \infty }^\infty {{\varphi _{2\left( {n + 1} \right)}}} \left( x \right){\rm{d}}x = \sqrt {{{n + {1 \over 2}} \over {n + 1}}} \int\limits_{ - \infty }^\infty {{\varphi _{2n}}} \left( x \right){\rm{d}}x.$(12)

When arranging the integrals over the interval [a, b] for the first N Hermite functions as an N-element vector i, then the integral over the linear combination of Hermite functions ϕ(x) from Eq. (4) over the real axis is ϕ(x)dx=ic.$\int\limits_{ - \infty }^\infty {\phi \left( x \right){\rm{d}}x = {{\bf{i}}^ \top }{\bf{c}}.} $(13)

Since the vector i only depends on the Hermite functions, but not on the particular linear combination thereof, the integral over any linear combination of Hermite functions can eventually be computed by the cost of one vector multiplication.

4.3 Derivatives of linear combinations of Hermite functions

The recurrence relation given by Eq. (2) has immediate important consequences, as the derivative of an Hermite function is a linear combination of two other Hermite functions, with orders increased and reduced by one. And since differentiation is a linear operation, it follows that the derivative of any linear combination of Hermite functions is another linear combination of Hermite functions. Thus, the kth derivative of a linear combination of Hermite functions with maximal order N is a linear combination of Hermite functions of maximal order N + k.

For linear combinations of Hermite functions, the differentiation is therefore conveniently expressed by a matrix operation. We define the (N + 1) × N matrix D1 as having zero entries except for the two sub-diagonals, whose entries are: Di,i+11=i2=Di+1,i1.${\bf{D}}_{i,i + 1}^1 = \sqrt {{i \over 2}} = - {\bf{D}}_{i + 1,i}^1.$(14)

Then the derivative of a linear combination of the first N Hermite functions, with coefficient vector c, is the linear combination of the first N + 1 Hermite functions, with the coefficient vector c(1) given by c(1) = D1 c.

We may generalise the notation to the kth derivative, if we expand the size of the matrix D1 to (N + k) × (N + k), and assume zero-padding for c, that is, assuming c being of length N + k, with all entries N < nN + k being zero. Thus, we obtain: c(k)=Dkc,${{\bf{c}}^{\left( k \right)}} = {{\bf{D}}^k}{\bf{c}},$(15)

and the kth derivative matrix Dk , we obtain from k multiplications of D1; that is, Dk is the kth power of D1. For D2 = D1 D1 we obtain a matrix whose elements are as follows: Di,j2={ 12ifor i=j12i(i+1)for ij=212j(j+1)for ij=20else. ${\bf{D}}_{i,j}^2 = \left\{ {\matrix{ {{1 \over 2} - i} \hfill &amp; {{\rm{for}}\,i = j} \hfill \cr {{1 \over 2}\sqrt {i\left( {i + 1} \right)} } \hfill &amp; {{\rm{for}}\,i - j = - 2} \hfill \cr {{1 \over 2}\sqrt {j\left( {j + 1} \right)} } \hfill &amp; {{\rm{for}}\,i - j = 2} \hfill \cr 0 \hfill &amp; {{\rm{else}}{\rm{.}}} \hfill \cr } } \right.$(16)

For illustration, the first 5 × 5 elements of D1 and D2 are D1=(012000120220002203200032042000420)${{\bf{D}}^1} = \left( {\matrix{ 0 \hfill &amp; {\sqrt {{1 \over 2}} } \hfill &amp; 0 \hfill &amp; 0 \hfill &amp; 0 \hfill \cr { - \sqrt {{1 \over 2}} } \hfill &amp; 0 \hfill &amp; {\sqrt {{2 \over 2}} } \hfill &amp; 0 \hfill &amp; 0 \hfill \cr 0 \hfill &amp; { - \sqrt {{2 \over 2}} } \hfill &amp; 0 \hfill &amp; {\sqrt {{3 \over 2}} } \hfill &amp; 0 \hfill \cr 0 \hfill &amp; 0 \hfill &amp; { - \sqrt {{3 \over 2}} } \hfill &amp; 0 \hfill &amp; {\sqrt {{4 \over 2}} } \hfill \cr 0 \hfill &amp; 0 \hfill &amp; 0 \hfill &amp; { - \sqrt {{4 \over 2}} } \hfill &amp; 0 \hfill \cr } } \right)$(17)

and D2=(1201200032032012052030320720003092).${{\bf{D}}^2} = \left( {\matrix{ { - {1 \over 2}} \hfill &amp; 0 \hfill &amp; {\sqrt {{1 \over 2}} } \hfill &amp; 0 \hfill &amp; 0 \hfill \cr 0 \hfill &amp; { - {3 \over 2}} \hfill &amp; 0 \hfill &amp; {\sqrt {{3 \over 2}} } \hfill &amp; 0 \hfill \cr {\sqrt {{1 \over 2}} } \hfill &amp; 0 \hfill &amp; { - {5 \over 2}} \hfill &amp; 0 \hfill &amp; {\sqrt 3 } \hfill \cr 0 \hfill &amp; {\sqrt {{3 \over 2}} } \hfill &amp; 0 \hfill &amp; { - {7 \over 2}} \hfill &amp; 0 \hfill \cr 0 \hfill &amp; 0 \hfill &amp; {\sqrt 3 } \hfill &amp; 0 \hfill &amp; { - {9 \over 2}} \hfill \cr } } \right).$(18)

The condition number of D1 is larger than one if its size is larger than 2 × 2, and the condition number increases with the size of the matrix. For a 56 × 56-sized matrix D1 , as required for the computation of the first derivative of Gaia DR3 spectra from 55 coefficients, the condition number is 66.2. As any derivative larger than one is a power of D1 , the condition numbers of the derivative matrices increase with the power of the number of derivative. If the coefficients of a linear combination of Hermite functions are affected by noise, the relative noise on the coefficients for higher derivatives consequently increases strongly. It is therefore advisable to work with derivatives of as low an order as possible in the case of noisy data. The highest derivatives we use in the course of this work is the fourth. With the kth derivative of a linear combination of Hermite functions being a linear combination of Hermite functions, the coefficients of which can be easily computed, we now proceed to the computation of the roots of a linear combination of Hermite functions.

4.4 Roots of linear combinations of Hermite functions and their uncertainties

From the relations in Eq. (1) we see that the nth Hermite function is actually a polynomial of a degree n, multiplied by a Gaussian function exp {−x2/2}. Since the Gaussian function is a strictly positive function, it does not affect the position of the root of the Hermite function. The roots of the Hermite functions coincide with the roots of the polynomial. Thus, a linear combination of Hermite functions with maximum order N has exactly N roots, which may be real and complex. For the computation of the actual roots, we may also follow the lines of root finding for polynomials. For polynomials in monomial representation, a standard approach to this problem is interpreting the polynomial as the characteristic polynomial of some matrix, called the companion matrix of the polynomial. With finding this matrix, the problem of root finding for a linear combination of monomials is transformed into an eigenvalue problem (Press et al. 2007a), which can take advantage of the variety of fast and robust numerical eigenvalue techniques. For non-monomial representations of polynomials, an elegant generalisation of this approach has been developed for orthogonal polynomials. This approach makes use of the recurrence relationship for orthogonal polynomials, rather than the polynomial properties directly, and it is therefore immediately transferable to Hermite functions, satisfying a suitable recurrence relationship expressed in Eq. (1). Here, we follow the line discussed by Day & Romero (2005), with the trivial transfer from orthogonal polynomials to Hermite functions. We only present the required results here and avoid the repetition of the not too complicated proofs and derivations, as it saves space and as they have already been presented by Day & Romero (2005) in detail.

Here, we consider a linear combination of Hermite functions according to Eq. (4). With ΦN(x), we denote the N–element vector of functions, containing the first N Hermite functions, with indices from 0 to N – 1. Furthermore, with cN we denote the vector of the first N entries of the coefficient vector c, and with eN the N element vector with a value of one as the Nth entry and zero else; that is, we have: ΦN(x):=[ φ0(x),φ1(x),,φN1(x) ],${{\bf{\Phi }}_N}\left( x \right): = {\left[ {{\varphi _0}\left( x \right),{\varphi _1}\left( x \right), \ldots ,{\varphi _{N - 1}}\left( x \right)} \right]^ \top },$(19) cN:=[ c0,c1,,cN1 ],${{\bf{c}}_N}: = {\left[ {{c_0},{c_1}, \ldots ,{c_{N - 1}}} \right]^ \top },$(20) eN:=[ 0,0,,0,1 ].${{\bf{e}}_N}: = {\left[ {0,0, \ldots ,0,1} \right]^ \top }.$(21)

We assume cN ≠ 0 here, which can be ensured by trimming the length of c such that the highest order entry is non-zero if necessary. With these conventions, the function ϕ(x) from Eq. (4) is written as: ϕ(x)=ΦN(x)cN+cNφN(x).$\phi \left( x \right) = {\bf{\Phi }}_N^ \top \left( x \right){{\bf{c}}_N} + {c_N}{\varphi _N}\left( x \right).$(22)

For the moment, it is not necessary to assume that the functions φn(x) are actually the Hermite functions and we may only assume that these functions are satisfying a general recurrence relation of the form: xφn1(x)=i=0nφi(x)hi,n1.$x\,{\varphi _{n - 1}}\left( x \right) = \sum\limits_{i = 0}^n {{\varphi _i}\left( x \right){h_{i,n - 1}}} .$(23)

Then we can form the N × N matrix H, whose elements are the hi, j from the recurrence relation above, with i and j running from 0 to N − 1. We define the N × N matrix B, denoted the nonstandard companion matrix, by B:=HhN,N1cNcNeN.${\bf{B}}: = {\bf{H}} - {{{h_{N,N - 1}}} \over {{c_N}}}{{\bf{c}}_N}{\bf{e}}_N^ \top .$(24)

The value x0 is a root of the linear combination of the first N + 1 functions φn(x) with coefficient vector c – if and only if x0 is an eigenvalue of the matrix B. Thus, the problem of root finding is again transformed into an eigenvalue problem for linear combinations of functions satisfying a suitable recurrence relation.

For Hermite functions, the general recurrence relation in Eq. (23) takes the form given by Eq. (1), and we find hi,j={ i+12for j=i+1i2for j=i10otherwise. ${h_{i,j}} = \left\{ {\matrix{ {\sqrt {{{i + 1} \over 2}} } \hfill &amp; {{\rm{for}}\,j = i + 1} \hfill \cr {\sqrt {{i \over 2}} } \hfill &amp; {{\rm{for}}\,j = i - 1} \hfill \cr 0 \hfill &amp; {{\rm{otherwise}}{\rm{.}}} \hfill \cr } } \right.$(25)

As an example, for the case of N = 5, matrix B takes the following form: B=(0120052c0c512022052c1c502203252c2c5003204252c3c50004252c4c5).${\bf{B}} = \left( {\matrix{ 0 \hfill &amp; {\sqrt {{1 \over 2}} } \hfill &amp; 0 \hfill &amp; 0 \hfill &amp; { - \sqrt {{5 \over 2}} {{{c_0}} \over {{c_5}}}} \hfill \cr {\sqrt {{1 \over 2}} } \hfill &amp; 0 \hfill &amp; {\sqrt {{2 \over 2}} } \hfill &amp; 0 \hfill &amp; { - \sqrt {{5 \over 2}} {{{c_1}} \over {{c_5}}}} \hfill \cr 0 \hfill &amp; {\sqrt {{2 \over 2}} } \hfill &amp; 0 \hfill &amp; {\sqrt {{3 \over 2}} } \hfill &amp; { - \sqrt {{5 \over 2}} {{{c_2}} \over {{c_5}}}} \hfill \cr 0 \hfill &amp; 0 \hfill &amp; {\sqrt {{3 \over 2}} } \hfill &amp; 0 \hfill &amp; {\sqrt {{4 \over 2}} - \sqrt {{5 \over 2}} {{{c_3}} \over {{c_5}}}} \hfill \cr 0 \hfill &amp; 0 \hfill &amp; 0 \hfill &amp; {\sqrt {{4 \over 2}} } \hfill &amp; { - \sqrt {{5 \over 2}} {{{c_4}} \over {{c_5}}}} \hfill \cr } } \right).$(26)

The non-standard companion matrix B is thus easily generated for any given coefficient vector c. The matrix B is of upper Hessenberg form and the computations of the eigenvalues of a real Hessenberg matrix is particularly fast and simple (see, e.g. Chap. 7.4 in Golub & Van Loan 2013, or Press et al. 2007b).

We may, however, consider the case in which the coefficient vector, c, is not exact, but comes with an error, expressed in a covariance matrix, Σc. For finding an estimate on the uncertainties of the roots of the linear combination of Hermite functions, we again may apply the approach developed for polynomial roots. Here, we follow the formulations in Sect. 3.2.2 of Stetter (2004) – again with a trivial transfer of argument from polynomials to Hermite functions.

The root of a linear combination ϕ(x) as in Eq. (4) is a nonlinear function of the coefficients in the vector c. A linearisation of the problem is thus required for an efficient computation of the uncertainties of the roots. The roots of ϕ(x) we denote as zi, i = 1,…, N, and we consider the zi functions of the coefficient vector c, as we do for the linear combination ϕ(x). We thus write: ϕ(zi(c),c)=0,$\phi \left( {{z_i}\left( {\bf{c}} \right),{\bf{c}}} \right) = 0,$(27)

and compute the derivative of this expression with respect to c, resulting in ϕ(1)(zi(c),c)d z(c)dc+ϕ(zi(c),c)c=0.${\phi ^{\left( 1 \right)}}\left( {{z_i}\left( {\bf{c}} \right),{\bf{c}}} \right) \cdot {{{\rm{d}}\,z\left( {\bf{c}} \right)} \over {{\rm{d}}{\bf{c}}}} + {{\partial \phi \left( {{z_i}\left( {\bf{c}} \right),{\bf{c}}} \right)} \over {\partial {\bf{c}}}} = 0.$(28)

We thus obtain for each component cj of c, the derivative: zicj=φj(zi)ϕ(1)(zi),${{\partial {z_i}} \over {\partial {c_j}}} = {{{\varphi _j}\left( {{z_i}} \right)} \over {{\phi ^{\left( 1 \right)}}\left( {{z_i}} \right)}},$(29)

which are the elements i, j of the Jacobian of the roots with respect to the coefficients of the linear combination, which we denote as J. In a linear approximation, we obtain the covariance matrix of the roots, Σz, as Σz=JΣcJ.${\Sigma _z} = {\bf{J}}{\Sigma _{\rm{c}}}{{\bf{J}}^ \top }.$(30)

thumbnail Fig. 2

Empirical cumulative distribution functions for the absolute value of the derivative, normalised to its error, at the roots of linear combinations of Hermite functions, for N = 5 (orange), 55 (red), and 200 (green). The black curve shows the cumulative distribution function for a chi distribution with two degrees of freedom.

4.5 Significance of local extrema

When considering a linear combination of Hermite functions, with coefficients affected by normally distributed random noise, we want to find out whether a given local extremum is statistically significant or whether it is a noise-induced feature that can be removed within some boundaries given by the covariance matrix on the coefficients. A local extremum is removed if the second derivative at the extremum is zero. In this case, the extremum is transformed to an inflection point. We might therefore consider the second derivative at the local extremum, normalised to its error, with the probability of this value being zero. This probability might tell us about the statistical significance of the local extremum.

To investigate this approach quantitatively, we used Monte Carlo simulations. We generated random covariance matrices and for each of the random covariance matrices, we drew a large number of sets of random coefficients with zero mean. We then computed the roots of the corresponding random linear combination of Hermite functions, and the values of the derivative divided by the error of the value of the derivative, at the roots. We then investigated the distribution function. We carried out this procedure for a variety of different numbers of coefficients. Figure 2 shows resulting empirical cumulative distribution functions as examples, for different numbers N, and 5 × 105 random realisations. These results only include roots within the outermost extrema of the highest order Hermite function. With increasing N, the distribution approaches a chi distribution with two degrees of freedom.

Given the cumulative distribution function of a chi distribution with two degrees of freedom, we can therefore estimate the significance of a local extremum of a linear combination of Hermite functions at zi by attributing an approximate p-value given by: p(zi)1exp{ 12(ϕ(2)(zi)σϕ(2)(zi))2 }.$p\left( {{z_i}} \right) \approx 1 - \exp \left\{ { - {1 \over 2}\left( {{{{\phi ^{\left( 2 \right)}}\left( {{z_i}} \right)} \over {{\sigma _{{\phi ^{\left( 2 \right)}}\left( {{z_i}} \right)}}}}} \right)} \right\}.$(31)

4.6 Linear substitutions

We close this section on the mathematical tools with a remark on the substitution x → (u – Δθ)/Θ in the argument of the Hermite functions. Such a transformation has been applied to Gaia DR3 low-resolution spectra to adjust the widths of the Hermite functions to the widths of the spectra (De Angeli et al. 2023). In this case, φn(uΔθΘ)φm(uΔθΘ)du=Θδnm.$\int\limits_{ - \infty }^\infty {{\varphi _n}} \left( {{{u - {\rm{\Delta }}\theta } \over \Theta }} \right){\varphi _m}\left( {{{u - {\rm{\Delta }}\theta } \over \Theta }} \right){\rm{d}}u = \Theta {\delta _{nm}}.$(32)

To maintain orthonormality, the Hermite functions after linear substitution need thus to by multiplied by 1/Θ.${1 \mathord{\left/ {\vphantom {1 {\sqrt {\Theta .} }}} \right. \kern-\nulldelimiterspace} {\sqrt {\Theta .} }}$. The derivative of a Hermite function after substitution is dduφn(uΔθΘ)=1Θddxφn(x).${{\rm{d}} \over {{\rm{d}}u}}{\varphi _n}\left( {{{u - {\rm{\Delta }}\theta } \over \Theta }} \right) = {1 \over \Theta }{{\rm{d}} \over {{\rm{d}}x}}{\varphi _n}\left( x \right).$(33)

For the kth derivative with respect to u, an additional factor Θk needs to be applied to the result.

Taking the additional factors depending on Θ into account when differentiating and integrating Hermite functions with respect to u, we can use all results derived for linear combinations of Hermite functions also for Hermite functions after a linear substitution of its argument has been applied. For general substitutions in the form f(x) → f(y), we write f(y) for f(y(x)) dy/dx.

5 Elements of the formation of spectra

Before entering into the analysis of spectral lines, we first resumed certain aspects of the formation of the spectra in this section to establish the relationship between the low-resolution observational Gaia spectrum and the underlying SPD. In particular, we worked out the role of the LSF and the advantages resulting from their efficient parameterisation.

An observational spectrum f(u) is often related to the SPD s(λ) by an integral transformation of the form: f(u)=0L(u,λ)R(λ)s(λ)dλ,$f\left( u \right) = \int\limits_0^\infty {L\left( {u,\lambda } \right)R\left( \lambda \right)s\left( \lambda \right){\rm{d}}\lambda } ,$(34)

or a modified formulation thereof. Here, L(u, λ) denotes the LSF, including the dispersion, and R(λ) is the response function. In optical theory, the LSF is the inverse Fourier transform of the optical transfer function, integrated over the spatial direction (Goodman 1996). For Gaia DR3 low-resolution spectra, this concept of an LSF is however not appropriate. The BP and RP spectra are obtained as a finite linear combination of Hermite functions constructed in the internal self-calibration process (Carrasco et al. 2021) and are not directly related to an optical transfer function describing the actual spectroscopic instrument. However, the XP spectra can still be linked to the SPD via an integral transformation of the general form of Eq. (34) if the function L(u, λ) is given a more specific form. Setting L(u,λ):=n=0Nln(λ)φn(u)$L\left( {u,\lambda } \right): = \sum\limits_{n = 0}^N {{l_n}} \left( \lambda \right){\varphi _n}\left( u \right)$(35)

and inserting it into Eq. (34), and also using Eq. (5), we obtain the following for Gaia DR3 low-resolution spectra: f(u)n=0Ncnφn(u)=0L(u,λ)R(λ)s(λ)dλ$f\left( u \right) \equiv \sum\limits_{n = 0}^N {{c_n}} {\varphi _n}\left( u \right) = \int\limits_0^\infty {L\left( {u,\lambda } \right)R\left( \lambda \right)s\left( \lambda \right){\rm{d}}\lambda } $(36)

and cn=0ln(λ)R(λ)s(λ)dλ.${c_n} = \int\limits_0^\infty {{l_n}\left( \lambda \right)R\left( \lambda \right)s\left( \lambda \right){\rm{d}}\lambda } .$(37)

Thus, the function L(u, λ) is taking the role of the LSF, in the sense that it represents the instrument’s response to a monochromatic signal. It is however not directly linked to the optical transfer function. The most important consequence of this is that L(u, λ) is not necessarily a positive function. The response to a monochromatic input, when represented by a finite linear combination of Hermite functions, might show oscillational behaviour and be partly negative as well. We refer to L(u, λ) as the ‘pseudo-LSF’ hereafter for convenience. The vector for all ln(λ), n = 0,…, N, we denote as l(λ).

In carrying out this work, it appears convenient to allow for a fast computation of the vector l(λ). Therefore, we considered developing L(u, λ), not only in Hermite functions in the dependency on u, but also developing the wavelength dependency in basis functions. Doing so requires a suitable choice for the basis functions describing the wavelength dependency. In searching for a good choice for the basis in wavelength, we note that we can write: ln(λ)=L(u,λ)φn(u)du.${l_n}\left( \lambda \right) = \int\limits_{ - \infty }^\infty {L\left( {u,\lambda } \right){\varphi _n}\left( u \right){\rm{d}}u} .$(38)

This is essentially equivalent to applying the smoothing effect of the pseudo-LSF L(u, λ) on the Hermite function φn(u). If the dispersion relation linking u and λ is applied to substitute λ with u, then the resulting function ln(u) might be very similar to the nth Hermite function, particularly for small n. We therefore expect an efficient development of the function L(λ, u) when substituting the wavelength with the pseudo-wavelength, which we denote u′ to distinguish it from u; and we use Hermite functions for the development in u′ as well. We thus assume as a good approximation: L(u,u)n=0Nm=0MΛn,mφn(u)φm(u),$L\left( {u,u'} \right) \approx \sum\limits_{n = 0}^N {\sum\limits_{m = 0}^M {{{\rm{\Lambda }}_{n,m}}{\varphi _n}\left( u \right){\varphi _m}\left( {u'} \right)} } ,$(39)

with M only moderately larger than N. With Λ as the matrix whose elements are the Λn, m, we thus obtain: I(u)=ΛΦ(u),${\bf{I}}\left( {u'} \right) = {\rm{\Lambda }}{\bf{\Phi }}\left( {u'} \right),$(40)

with Φ(u):=[ φ0(u),φl(u),,φM(u) ].${\bf{\Phi }}\left( {u'} \right): = {\left[ {{\varphi _0}\left( {u'} \right),{\varphi _{\rm{l}}}\left( {u'} \right), \ldots ,{\varphi _M}\left( {u'} \right)} \right]^ \top }.$(41)

We use Eq. (40) in Sect. 6 of this work. As further preparation for Sect. 6, we consider the computation of some integrals involving the pseudo-LSF over u′, namely, integrals of the form: k,l(u0,ul):=L(k)(ul,u)(uu0)ldu,${{\cal L}_{k,l}}\left( {{u_0},{u_{\rm{l}}}} \right): = \int\limits_{ - \infty }^\infty {{L^{\left( k \right)}}} \left( {{u_{\rm{l}}},u'} \right) \cdot {\left( {u' - {u_0}} \right)^l}{\rm{d}}u',$(42)

for k, l = 0, 1, 2,… Since the pseudo-LSF has an exponential decrease, these integrals exist. The choice of Hermite functions for the description of the dependency on both u and u′ allows us to make use of results from Sect. 4 to find a simple solution to these integrals. For the case of l = 0, from combining Eqs. (13), (15), and (40), we immediately obtain: k,0(u0,ul)=iDkΛΦ(ul).${{\cal L}_{k,0}}\left( {{u_0},{u_{\rm{l}}}} \right) = {{\bf{i}}^ \top }{{\bf{D}}^k}{{\bf{\Lambda }}^ \top }{\bf{\Phi }}\left( {{u_{\rm{l}}}} \right).$(43)

For the case of l = 1, we use the recurrence relation from Eq. (1) to develop the linear combination of Hermite functions multiplied by (u′ – u0) in a linear combination of Hermite functions. The coefficients of this linear combination are obtained by multiplication with a tridiagonal matrix: P(u0), with the elements Pi,j(u0)={ u0for i=ji2for ij=1j2for ij=10otherwise. ${{\bf{P}}_{i,j}}\left( {{u_0}} \right) = \left\{ {\matrix{ { - {u_0}} \hfill &amp; {{\rm{for}}\,i = j} \hfill \cr {\sqrt {{i \over 2}} } \hfill &amp; {{\rm{for}}\,i - j = - 1} \hfill \cr {\sqrt {{j \over 2}} } \hfill &amp; {{\rm{for}}\,i - j = 1} \hfill \cr 0 \hfill &amp; {{\rm{otherwise}}{\rm{.}}} \hfill \cr } } \right.$(44)

For the case of l > 1, we obtain the result by multiplying P(u0) l times with itself, which we express as: Pl(u0). For the general case, we get: k,l(u0,ul)=iPl(u0)DkΛΦ(ul),${{\cal L}_{k,l}}\left( {{u_0},{u_{\rm{l}}}} \right) = {{\bf{i}}^ \top }{{\bf{P}}^l}\left( {{u_0}} \right){{\bf{D}}^k}{{\bf{\Lambda }}^ \top }{\bf{\Phi }}\left( {{u_{\rm{l}}}} \right),$(45)

we use this equation in Sect. 6. It is an apt demonstration for how the recurrence relations expressed in Eqs. (1) and (2) allow for the efficient computation of integrals, as in Eq. (42), by making use of matrix algebra.

We close this section with the simple observation that the linearity of the integration implies that if s(λ) can be expressed as the sum of two SPDs, say s(λ) = sc(λ) + sl(λ), the corresponding observational spectrum is the sum of the observational spectra of sc(λ) and sl(λ), f(u)=0L(u,λ)R(λ)[ sc(λ)+sl(λ) ]   dλ$f\left( u \right) = \int\limits_0^\infty {L\left( {u,\lambda } \right)R\left( \lambda \right)\left[ {{s_{\rm{c}}}\left( \lambda \right) + {s_{\rm{l}}}\left( \lambda \right)} \right]} \,\,\,{\rm{d}}\lambda $(46) =fc(u)+fl(u).$ = {f_{\rm{c}}}\left( u \right) + {f_{\rm{l}}}\left( u \right).$(47)

If sc(λ) and sl(λ) separate the SPD onto a contribution from a continuum SPD and a SPD of a spectral line, respectively, we can thus separate also the observational spectra into a continuum contribution and a line contribution. For a convenient notation in later sections, we introduce the abbreviation: t(λ):=R(λ)s(λ).$t\left( \lambda \right): = R\left( \lambda \right)s\left( \lambda \right).$(48)

6 Inversion of observational spectra

After having derived the relevant mathematical properties of linear combinations of Hermite functions in Sect. 4 and having combined the results of this section with the observational spectra in Sect. 5, we must then finally link all results to the external spectrum, that is, we have to consider the inversion of internally calibrated Gaia spectra to externally calibrated SPDs. This inversion is required to reduce the smearing effect of the pseudo-LSF when quantifying the line strengths, as discussed later. The inversion of observational spectra is intrinsically a complex problem. However, as far as this work is concerned, we may make some simplifying assumptions that will allow us to derive a simple and efficient formalism. First, for the study of a spectral line, we require a reliable inversion of the observational spectrum only in the vicinity of the spectral line. We do not need a reliable inversion of the entire observational spectrum. Far from the position of the line, the inversion might be arbitrarily wrong for our purposes. Second, we are mainly interested in the inversion of the continuum spectrum, where the spectral lines are removed. For the continuum, we may well assume smoothness on the inverted spectrum. We are thus left with the problem of finding a local inversion to a smooth external spectrum, which is considerably easier than the problem of finding a global inversion to a general external spectrum.

A smooth local approximation to the inverted spectrum calls for a polynomial approximation in the vicinity of the line position. Naturally, the use of a Taylor expansion around the line position follows. This approach we line out in the following as the first of two approaches to the local smooth inversion.

We consider the product of the response curve and the continuum flux at some value u0. To obtain sc(u0)R(u0) = fc(u0), we use Eq. (34), with Eq. (48) and transformed from wavelength to pseudo-wavelength, which is: f(u0)=L(u0,u)tc(u)du.$f\left( {{u_0}} \right) = \int\limits_{ - \infty }^\infty {L\left( {{u_0},u'} \right){t_{\rm{c}}}} \left( {u'} \right){\rm{d}}u'.$(49)

We introduce a Taylor expansion of t(u′) centred on u0, resulting in: f(u0)=L(u0,u)[ l=0tc(l)(u0)l!(uu0)l ]du$f\left( {{u_0}} \right) = \int\limits_{ - \infty }^\infty {L\left( {{u_0},u'} \right)} \left[ {\sum\limits_{l = 0}^\infty {{{t_{\rm{c}}^{\left( l \right)}\left( {{u_0}} \right)} \over {l!}}{{\left( {u' - {u_0}} \right)}^l}} } \right]{\rm{d}}u'$(50) =l=0tc(l)(u0)l!L(u0,u)(uu0)ldu$$ = \sum\limits_{l = 0}^\infty {{{t_{\rm{c}}^{\left( l \right)}\left( {{u_0}} \right)} \over {l!}}} \int\limits_{ - \infty }^\infty {L\left( {{u_0},u'} \right){{\left( {u' - {u_0}} \right)}^l}{\rm{d}}u'} $$(51) =l=0tc(l)(u0)l!0,l(u0,u0),$$ = \sum\limits_{l = 0}^\infty {{{t_{\rm{c}}^{\left( l \right)}\left( {{u_0}} \right)} \over {l!}}} {{\cal L}_{0,l}}\left( {{u_0},{u_0}} \right),$$(52)

with k, l(u0, u0) as defined in Eq. (42). If we wish to approximate tc(u0) in a higher order approximation, we need to impose additional conditions. We can derive these conditions from higher derivatives at the same point, u0.

We compute the kth derivative of f(u) with respect to u, and we obtain an analogous expression, f(k)(u0)=l=0tc(l)(u0)l!k,l(u0,u0).$${f^{\left( k \right)}}\left( {{u_0}} \right) = \sum\limits_{l = 0}^\infty {{{t_{\rm{c}}^{\left( l \right)}\left( {{u_0}} \right)} \over {l!}}} {{\cal L}_{k,l}}\left( {{u_0},{u_0}} \right).$$(53)

If we truncate the Taylor expansion at the Kth term, we obtain a matrix equation: 𝔣(u0)=𝔏(u0)t(u0),$${\bf{\ffr}}\left( {{u_0}} \right) = {\bf{\Lfr}}\left( {{u_0}} \right){\bf{t}}\left( {{u_0}} \right),$$(54)

with 𝔣(u0)=[ f(0)(u0)f(1)(u0)f(2)(u0)    f(K)(u0) ],$${\bf{\ffr}}\left( {{u_0}} \right) = {\left[ {{f^{\left( 0 \right)}}\left( {{u_0}} \right){f^{\left( 1 \right)}}\left( {{u_0}} \right){f^{\left( 2 \right)}}\left( {{u_0}} \right)\,\, \ldots \,\,{f^{\left( K \right)}}\left( {{u_0}} \right)} \right]^ \top },$$(55) t(u0)=[ tc(0)(u0)tc(1)(u0)tc(2)(u0)    tc(K)(u0) ],$${\bf{t}}\left( {{u_0}} \right) = {\left[ {t_{\rm{c}}^{\left( 0 \right)}\left( {{u_0}} \right)t_{\rm{c}}^{\left( 1 \right)}\left( {{u_0}} \right)t_{\rm{c}}^{\left( 2 \right)}\left( {{u_0}} \right)\,\, \ldots \,\,t_{\rm{c}}^{\left( K \right)}\left( {{u_0}} \right)} \right]^ \top },$$(56)

and the (K + 1) × (K + 1) matrix 𝔏(u0), whose elements are 𝔏i,j(u0)=(i1),(j1)(u0,u0)(j1)!.$${{\bf{\Lfr}}_{i,j}}\left( {{u_0}} \right) = {{{{\cal L}_{\left( {i - 1} \right),\left( {j - 1} \right)}}\left( {{u_0},{u_0}} \right)} \over {\left( {j - 1} \right)!}}.$$(57)

Equation (54) is a formulation for the inversion of the observational spectrum by polynomial approximations to the product of the response function and the SPD.

The matrix 𝔏(u0) takes a simple form, namely, 𝔏i,j(u0)=0 for i>j${{\bf{\Lfr}}_{i,j}}\left( {{u_0}} \right) = 0\,{\rm{for}}\,i > j$(58) 𝔏i,i+1(u0)=𝔏i+1,i+2(u0) for i0.${{\bf{L}}_{i,i + 1}}\left( {{u_0}} \right) = {{\bf{L}}_{i + 1,i + 2}}\left( {{u_0}} \right)\,{\rm{for}}\,i \ge 0.$(59)

Equation (58) states that the matrix 𝔏 is of upper triangular type. Equation (59) states that the entries of the diagonal and the secondary diagonals are all identical, respectively. This implies that the (K + 1) × (K + 1) matrix 𝔏 has only K + 1 independent elements that need to be computed. This allows for a fast computation of the matrix and its inverse and makes the usage computationally simple.

For the computation of the formal error on tc(u0) which we can use to judge the quality of the inversion of Eq. (54), it is convenient to write the vector 𝔣(u0) as: 𝔣(u0)=H(u0)c,${\bf{\ffr}}\left( {{u_0}} \right) = {\bf{H}}\left( {{u_0}} \right){\bf{c}},$(60)

with the (K + 1) × (N + K) matrix H(u0) whose lth row is Φ(u0) D(l–1).Then, t(u0)=𝔏1H(u0)c,${\bf{t}}\left( {{u_0}} \right) = {{\bf{\Lfr}}^{ - 1}}{\bf{H}}\left( {{u_0}} \right){\bf{c}},$(61)

and the covariance matrix for t(u0) is Σt=𝔏1H(u0)ΣcH(u0)(𝔏1).${\Sigma _{\rm{t}}} = {{\bf{L}}^{ - 1}}{\bf{H}}\left( {{u_0}} \right){\Sigma _{\rm{c}}}{{\bf{H}}^ \top }\left( {{u_0}} \right){\left( {{{\bf{L}}^{ - 1}}} \right)^ \top }.$(62)

The disadvantage of this method comes from the dependency on higher order derivatives of the spectrum at u0. As discussed in the previous section, the signal-to-noise ratio (S/N) of higher derivatives decreases strongly with increasing derivative and local inversion becomes dominated by noise easily. In practice, only K = 1 or K = 2 might result in good estimates for tc(u0).

Here, it appears useful to replace the conditions on higher derivative by conditions on the zeroth derivative at different pseudo-wavelength positions in order to become less sensitive to random noise in the inversion. The change is straight forward, replacing the derivative conditions in Eq. (53) by conditions of f(0)(ui) at different points, ui, in the vicinity of u0. The results obtained with this approach however depend strongly on the choice of the ui and are not necessarily better than the results with derivative conditions. A more suitable approach results from adapting a technique for the global inversion from Weiler et al. (2020) to the problem of a local smooth inversion.

Thus, we selected a set of basis functions for the SPD, such as Legendre polynomials and, assuming sufficient knowledge of the instrument, we computed the observations spectra corresponding to the basis functions using Eq. (37). Locally approximating the observations spectrum around a pseudo-wavelength u0 with these observational basis functions corresponds to approximating the SPD around a wavelength, λ(u0), by Legendre polynomials. To avoid conditioning issues, we may adjust the argument of the Legendre polynomials to the range where these functions are orthonormal, namely, in the range of [−1, 1]. This approach is the same as used for the Hermite functions describing the Gaia DR3 XP spectra. For sake of simplicity, we refer to this linear substitution for the Legendre polynomials Pi(x) as x = g(λ). For the approximation of the observational spectrum, we do not have to ensure an exact solution, but do a regression instead and reduce the influence of random noise. We chose an interval Δ in pseudo-wavelength, centred on u0, and then chose a grid of M equidistant points on this interval. We arranged the grid point as a vector u. Since the errors of the observational spectrum at the set of M points ui, i = 1,…, M are likely to be correlated, we have to find a linear combination of observational spectra for the first K Legendre polynomials, the coefficients of which are arranged as a vector b, that minimise the squared Mahalanobis distance: Dm2=(f(u)Φ(u)Lb)Σf1(f(u)Φ(u)Lb).$D_m^2 = {\left( {f\left( {\bf{u}} \right) - {{\bf{\Phi }}^ \top }\left( u \right){\bf{L}}\,{\bf{b}}} \right)^ \top }\Sigma _f^{ - 1}\left( {f\left( {\bf{u}} \right) - {\bf{\Phi }}\left( {\bf{u}} \right){\bf{L}}\,{\bf{b}}} \right).$(63)

Here, L is the N × K matrix of coefficients of the observational spectra corresponding to the first K Legendre polynomials, developed in Hermite functions. This is a generalised least squares problem with the solution (Kariya & Kurata 2004): b=((Φ(u)L)Σf1(Φ(u)L))1(Φ(u)L)Σf1f(u).${\bf{b}} = {\left( {{{\left( {{{\bf{\Phi }}^ \top }\left( {\bf{u}} \right){\bf{L}}} \right)}^ \top }\Sigma _f^{ - 1}\left( {{{\bf{\Phi }}^ \top }\left( {\bf{u}} \right){\bf{L}}} \right)} \right)^{ - 1}}{\left( {{{\bf{\Phi }}^ \top }\left( {\bf{u}} \right){\bf{L}}} \right)^ \top }\Sigma _f^{ - 1}f\left( {\bf{u}} \right).$(64)

The linear combination: s(λ0)=i=0K1biPi(g(λ0))PL(λ0)b$s\left( {{\lambda _0}} \right) = \sum\limits_{i = 0}^{K - 1} {{b_i}} \cdot {P_i}\left( {g\left( {{\lambda _0}} \right)} \right) \equiv {{\bf{P}}_L}\left( {{\lambda _0}} \right){\bf{b}}$(65)

is then an approximation to the SPD at wavelength λ0. The results of the inversion of the observational spectrum with this approach depend in the choice of Δ, K, and M. A good guideline for the choices is M > K, and the conditioning of the matrix to be inverted in Eq. (64) being small enough to be well off any numerical problems. For the tests carried out in this work, we used K = 5, Δ around 6 and M around 15.

Both methods for local smooth inversion of observational spectra have their advantages and disadvantages. The great advantage of the second approach, namely, the inversion using Legendre polynomials is the lower sensitivity to random noise. This approach, however, requires the explicit use of the response functions and systematic errors in the response function immediately affect the inversion. This is not the case of the first approach, which solves for the product of the external spectrum and the response function directly. As we show in Sect. 8, this is an advantage in the treatment of narrow spectral lines. So, generally, the second approach might be more advisable. In cases of strong lines and good S/N on the observational spectrum, potential systematic errors on the response function might become larger than the random noise and the first approach might be more suitable. We re-use the second approach in Sect. 9.2 – this time not for the purposes of inversion, but for a continuum estimation.

7 Spectral lines as local extrema

A spectral line might result in a local extremum of the observational spectrum f(u). In particular, we assume that an emission line results in a local maximum and an absorption line in a local minimum. A necessary condition for a local extremum of f(u) is a root of f(1)(u). The position of such a root of the first derivative we denote with uE. For f(2)(uE) < 0, the observational spectrum has a local maximum at uE, and for f(2)(uE) > 0, a local minimum. We omit higher order derivative tests for simplicity here.

A local extremum is not, however, necessarily caused by a spectral line, as noise in the coefficients representing the spectra can be manifested in wavy structures with many local extrema. As a local extremum requires a root in the derivative, noise is more likely to result in local extrema in regions of an observational spectrum where the first derivative of the true, noise-free spectrum is small, that is, where the gradient of the spectrum is low. With decreasing S/N, more and more regions with increasing gradient of the spectrum become likely to show local extrema due to noise. We therefore need an estimator quantifying the statistical significance of a local extremum, to distinguish between noise patterns and local extrema that actually are produced by the presence of a spectral line. For this purpose, we can use the value of the second derivative, normalised by their error, as discussed in Sect. 4.5. For noise-induced local extrema, this normalised value of the second derivative at the position of an extremum follows approximately a chi-distribution with two degrees of freedom, from which we can compute a p-value for a local extremum.

The noise-induced local extrema are illustrated for simulated BP spectra in Fig. 3. Here, the value of the second derivative at the root of the first derivative, normalised to its error, is shown for simulated spectra. The simulations in this work are done with the simple Gaia XP simulator described in Weiler et al. (2020). The simulated spectra are composed of a black body continuum with 2 × 104 K, and a narrow Hß line with a random chosen equivalent width, W, which we define as W:=0s(λ)sc(λ)sc(λ)dλ.$W: = \int\limits_0^\infty {{{s\left( \lambda \right) - {s_{\rm{c}}}\left( \lambda \right)} \over {{s_{\rm{c}}}\left( \lambda \right)}}{\rm{d}}\lambda } .$(66)

This definition of W has the opposite sign than in usually used definitions (e.g. Carroll & Ostlie 2017). We use this definition here to apply it to absorption and emission lines alike; we prefer emission lines to have a positive equivalent width and absorption lines to have a negative one. The result for 2 × 105 simulated BP spectra for G-band magnitudes for the continuum of 12, 15, and 18, respectively, are shown. The colour scale shows the random value of W in each of the 105 spectra.

With increasing magnitude, and thus decreasing S/N, the number of local extrema increases. For the lowest magnitude, these local extrema caused by noise form predominantly in parts of the spectrum where the gradient is small. With increasing magnitude, the noise-related local extrema spread out. For G = 18m, the only part in the BP spectrum free of noise-related local extrema is a narrow region around the wavelength cut-off position, where the gradient of the spectrum is largest. The simulated Hß line shows very clearly against the noise background of local extrema. And the larger the equivalent width of the simulated Hß line, the smaller the value of the second derivative at the root of the first derivative, normalised to its error.

If the spectral line is weak, namely, it has a small absolute value of W, and the slope of the spectrum is sufficiently large, an emission or absorption line might not result in a local extremum of the spectrum, however. A line might manifest itself in this case as a deformation of the spectrum. To detect such lines which are weak compared to the slope of the spectrum, we might consider higher order derivatives. The deformation of the spectrum results in a local extremum in curvature of the spectrum, namely, in f(2)(u). In this work, we first discuss the case of a spectral line as a local extremum in f(u) in detail. The following sections analyse this case for narrow and broad lines. Having worked out the necessary formalism for this case, it is easy to transfer the results to extrema in the curvature of the spectrum by simply increasing all derivatives involved by two. This latter case is analysed in detail in Sect. 10.

thumbnail Fig. 3

Error-normalised value of the second derivative of the BP spectrum as a function of uE for random spectra consisting of a 2 × 104 K black body continuum and a narrow Hß line with random equivalent width W. The three panels correspond to G-magnitudes of 12m, 15m, and 18m, from top to bottom.

8 The limit of narrow lines

In this section, we consider the case of narrow spectral lines, where we understand ‘narrow’ in the sense that the intrinsic width of the spectral line in the SPD (measured e.g. by its FWHM) is small compared to the width of the LSF, expressed in wavelength. In this case we can neglect the width of the spectral line in the SPD, and write it as the sum of a continuum SPD and a weighted Dirac delta distribution: s(λ)=sc(λ)+sl(λ)$s\left( \lambda \right) = {s_{\rm{c}}}\left( \lambda \right) + {s_{\rm{l}}}\left( \lambda \right)$(67) sc(λ)+αδ(λλL).$ \approx {s_{\rm{c}}}\left( \lambda \right) + \alpha \delta \left( {\lambda - {\lambda _{\rm{L}}}} \right).$(68)

Here, sc(λ) is the continuum SPD, sl(λ) the line SPD, and λL is the wavelength of the line. Using Eq. (34) and using the substitution of u′ for the wavelength λ, we obtain in this case for the observed spectrum: f(u)=L(u,u)R(u)[ sc(u)+αδ(uuL) ]  du$f\left( u \right) = \int\limits_{ - \infty }^\infty {L\left( {u,u'} \right)R\left( {u'} \right)\left[ {{s_{\rm{c}}}\left( {u'} \right) + \alpha \,\delta \left( {u' - {u_{\rm{L}}}} \right)} \right]} \,\,{\rm{d}}u'$(69) =fc(u)+αR(uL)L(u,uL)$ = {f_{\rm{c}}}\left( u \right) + \alpha \,R\left( {{u_{\rm{L}}}} \right)L\left( {u,{u_{\rm{L}}}} \right)$(70) =fc(u)+fl(u).$ = {f_{\rm{c}}}\left( u \right) + {f_{\rm{l}}}\left( u \right).$(71)

The observed spectrum in the limit of a narrow spectral line is thus the sum of the observed continuum spectrum fc(u) and a weighted LSF as the observed line fl(u).

8.1 Systematic errors in line positions

As we are considering spectral lines as local extrema in the internally calibrated spectrum, we have to analyse the relationship between the position of the spectral line, uL, and the position at which the corresponding local extremum is occurring, uE. Using Eq. (70), we obtain, for a local extremum, the necessary condition: f(1)(uE)+αR(uL)L(1)(uE,uL)=0.${f^{\left( 1 \right)}}\left( {{u_{\rm{E}}}} \right) + \alpha \,R\left( {{u_{\rm{L}}}} \right){L^{\left( 1 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right) = 0.$(72)

We assume α + 0, and for convenience we introduce the abbreviation ζ(u):=fc(1)(u)R(uL)sc(uL).$\zeta \left( u \right): = - {{f_c^{\left( 1 \right)}\left( u \right)} \over {R\left( {{u_{\rm{L}}}} \right){s_{\rm{c}}}\left( {{u_{\rm{L}}}} \right)}}.$(73)

At this point, it appears convenient to express the strength of the line with respect to the underlying continuum and use the equivalent width as defined in Eq. (66). In the approximation of a narrow line, and using Eq. (68), the equivalent width becomes: W=αsc(λL).$W = {\alpha \over {{s_{\rm{c}}}\left( {{\lambda _{\rm{L}}}} \right)}}.$(74)

Substituting with uL, we obtain L(1)(uE,uL)=ζ(uE)W.${L^{\left( 1 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right) = {{\zeta \left( {{u_{\rm{E}}}} \right)} \over W}.$(75)

This is a strongly non-linear equation for the position of the local extremum, uE, given the line position, uL. Although this equation is not useful for practical computations in relating uE and uL, it nevertheless allows for a graphical illustration. In t0he limit of an extremely strong line, namely, for the limit W → ∞, the right hand side of Eq. (75) becomes zero and the position of the extremum coincides with the position of the line. In the limit of negligible line strength, i.e. W → 0, the right hand side becomes arbitrarily large and no more solution to Eq. (75) may exist. In this case, the line is not introducing a local extremum in the observational spectrum anymore. Between these two extreme cases, the position of the local extremum is shifted with respect to the line position.

This is illustrated with a simulated spectrum, again consisting of a black body continuum of 2 × 104 K and narrow Hα (in RP) and Hß (in BP) lines. The functions L(1)(u, uL) and ζ(u)/W are shown in Fig. 4. A local extremum forms where the two functions intersect, this position is indicated uE in the figure. A good range to look for extrema associated with a line is the range between the first local extrema of L(1)(u, uL). The shift Au is towards higher values in the underlying continuum spectrum for emission lines and towards lower values for absorption lines. As a criterion for associating a local extremum with a line at known wavelengths, and thus with known uL, we might therefore require a shift Δu which is within the extrema of the first derivative of L(u, uL); and towards lower values for a local minimum and towards larger values for a local maximum.

For Hα, the situation is more complicated. For a faint line, no local extremum within the expected range of [umin, umax] forms. Instead, two local extrema, that is, a wavy-like structure, forms around the line position. This is a consequence of the more complex structure of the response curve close to the wavelength cut-on position of RP. This filter cut-on is realised by an interference filter on the RP prism, and interference filters tend to have a wavy transmission pattern in particular close to strong gradients as a cut-on flank. This pattern, which is present in the nominal response curves for the Gaia instruments (Jordi et al. 2010), interfere with the Hα line. Even if the nominal pre-launch response curves might not be fully representative for the calibrated instrument, such patterns (as they are based on physical properties of the instrument) might still occur and complicate the analysis of faint Hα lines.

In practice, the function ζ(uE) is unknown, and with it the shift Δu between the position of the extremum and of the line position. However, we may expect the shift Δu to be with the local extrema of the first derivative of the pseudo-LSF, at umin and umax. With an assumption for the pseudo-LSF, we can compute these positions and consider a local extremum in agreement with a line position if umin < uE < umax.

thumbnail Fig. 4

Illustration of the shift between the line position, uL, and the position of local extrema, uE, for a 2 × 104 K black body background and an Hβ line (top panel) and an Hα line (bottom panel). The green lines show L(1) (uuL, uL) and the blue lines show ζ (uuL) divided by W. We refer to the text for more details.

8.2 Continuum model for narrow lines

Having f(u) available, we have to separate the continuum spectrum from the line contribution. This requires the construction of a continuum spectrum which is to be subtracted from the observational spectrum f(u). To simplify this procedure, we use a simple approximation. We assume that the continuum spectrum is characterised by fc(2)(uE)=0,$f_c^{\left( 2 \right)}\left( {{u_{\rm{E}}}} \right) = 0,$(76)

that is, we assume that the curvature of the observed continuum spectrum is zero at the maximum intensity of the spectral line. Differentiating Eq. (70) twice with respect to u results in f(2)(u)=fc(2)(u)+αR(uL)L(2)(u,uL).${f^{\left( 2 \right)}}\left( u \right) = f_c^{\left( 2 \right)}\left( u \right) + \alpha \,R\left( {{u_{\rm{L}}}} \right){L^{\left( 2 \right)}}\left( {u,{u_{\rm{L}}}} \right).$(77)

With Eq. (76), this becomes αR(uL)f(2)(uE)L(2)(uE,uL)$\alpha \,R\left( {{u_{\rm{L}}}} \right) \approx {{{f^{\left( 2 \right)}}\left( {{u_{\rm{E}}}} \right)} \over {{L^{\left( 2 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right)}}$(78)

and fc(u)f(u)f(2)(uE)L(2)(uE,uL)L(u,uL).${f_{\rm{c}}}\left( u \right) \approx f\left( u \right) - {{{f^{\left( 2 \right)}}\left( {{u_{\rm{E}}}} \right)} \over {{L^{\left( 2 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right)}}L\left( {u,{u_{\rm{L}}}} \right).$(79)

With Eq. (76) as a condition for estimating the continuum spectrum, we can thus estimate the line spectrum from the ratio of the curvature of the observed spectrum at the position where the line is strongest in the observational spectrum and the curvature of the LSF at the same position. In case the true line position, uL is not known, we may use the approximation uLuE.

Equation (79) translates, with Eq. (40), into an equation for the vector of continuum coefficients, cc, cccf(2)(uE)L(2)(uE,uL)l(uL).${{\bf{c}}_c} \approx {\bf{c}} - {{{f^{\left( 2 \right)}}\left( {{u_{\rm{E}}}} \right)} \over {{L^{\left( 2 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right)}}{\bf{l}}\left( {{u_{\rm{L}}}} \right).$(80)

8.3 Equivalent widths for narrow lines

For a spectral line, we are not particularly interested in the strength of the line scaled with the response function, α R(u0), as provided by Eq. (78), but, rather, we look for the equivalent width W. To obtain an expression for W, we use Eq. (74), with a substitution of uL, and we combine this equation with Eq. (78), to obtain W=f(2)(uE)L(2)(uE,uL)1sc(uL)R(uL)f(2)(uE)L(2)(uE,uL)1tc(uL).$W = {{{f^{\left( 2 \right)}}\left( {{u_{\rm{E}}}} \right)} \over {{L^{\left( 2 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right)}}{1 \over {{s_{\rm{c}}}\left( {{u_{\rm{L}}}} \right)R\left( {{u_{\rm{L}}}} \right)}} \equiv {{{f^{\left( 2 \right)}}\left( {{u_{\rm{E}}}} \right)} \over {{L^{\left( 2 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right)}}{1 \over {{t_{\rm{c}}}\left( {{u_{\rm{L}}}} \right)}}.$(81)

For the determination of the equivalent width from the observational spectrum, we thus need to derive sc(u0) R(u0) from the continuum spectrum provided by Eq. (79). Here, we can apply the inversion approaches as discussed in Sect. 6. Equation (54) or (65), together with a response function, can be combined with Eq. (81) to obtain the equivalent width of a spectral line.

We can compute the uncertainty on the equivalent width in linear approximation using the Jacobian of W with respect to the coefficients of the XP spectrum. The elements of this are: Wci=1L(2)(uE,uL)[ 1tc(uE)cif(2)(uE)f(2)(uE)tc2(uL)citc(uL) ].${{\partial W} \over {\partial {c_i}}} = {1 \over {{L^{\left( 2 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right)}}\left[ {{1 \over {{t_{\rm{c}}}\left( {{u_{\rm{E}}}} \right)}}{\partial \over {\partial {c_i}}}{f^{\left( 2 \right)}}\left( {{u_{\rm{E}}}} \right) - {{{f^{\left( 2 \right)}}\left( {{u_{\rm{E}}}} \right)} \over {t_{\rm{c}}^2\left( {{u_{\rm{L}}}} \right)}}{\partial \over {\partial {c_i}}}{t_{\rm{c}}}\left( {{u_{\rm{L}}}} \right)} \right].$(82)

The derivative of f(2)(ue) with respect to ci is the ith element of the vector ΦT(uE)D2. The expression for the derivative of tc(uL), with respect to ci depends on the approach chosen for inversion. Using Eq. (61), the derivative is the ith element of the first row of the matrix, 𝔏–1 H𝔏1 T(ue)D2 l(ul). Using Eqs. (64) and (65), the derivative of tc(uL) with respect to ci is the ith entry of R(uL)PL(u)((Φ(u)L)Σf1(Φ(u)L))1(Φ(u)L)Σf1Φ(u)[ 1D2l(uL) ].$\matrix{ \hfill {R\left( {{u_{\rm{L}}}} \right){{\bf{P}}_{\rm{L}}}\left( {\bf{u}} \right){{\left( {{{\left( {{{\bf{\Phi }}^ \top }\left( {\bf{u}} \right){\bf{L}}} \right)}^ \top }\Sigma _f^{ - 1}\left( {{{\bf{\Phi }}^ \top }\left( {\bf{u}} \right){\bf{L}}} \right)} \right)}^{ - 1}}{{\left( {{{\bf{\Phi }}^ \top }\left( {\bf{u}} \right){\bf{L}}} \right)}^ \top }\Sigma _f^{ - 1}{\bf{\Phi }}\left( {\bf{u}} \right)} \cr \hfill {\left[ {1 - {{\bf{D}}^2}{\bf{l}}\left( {{u_{\rm{L}}}} \right)} \right].} \cr } $(83)

A simple example computation for the illustration of the approach for narrow lines is shown in panel A of Fig. 5. As previously, simulations for a black body continuum for a temperature of 2 × 104 K and a G-magnitude of 12m was used, and a narrow Hβ line was added, with a randomly chosen equivalent width. Then, the BP spectrum was computed and the coefficients of the Hermite representation computed from ten transits. The equivalent width was computed from these coefficients, and from it the relative difference to the true equivalent width. This was repeated for 5 × 105 random realisations. The distribution of the relative difference for these simulations is shown as a function of the true equivalent width. In this computation, no a priori information on the line position was used, namely, the approximation uL = uE was used. Even with this approximation, a very good determination of W is possible, with systematic errors on the level of a percent.

thumbnail Fig. 5

Distributions of the relative difference between the computed equivalent width (W) and the true equivalent width (Wtrue) of Hβ lines with a 2 × 104 K black body continuum for different assumptions on the line shape (narrow lines, Gaussian, Voigt, and Lorentz profiles) and different computation techniques. Each panel shows the result from 5 × 105 random realisations of a spectrum for G = 12m and ten transits. Positive values in the horizontal axis correspond to Hβ emission, negative values to Hβ absorption. For details see text in Sects. 8.3, 9.3, and 9.4.

8.4 Blended narrow lines

In the following, we consider the case that two or more narrow lines are not sufficiently separated in pseudo-wavelength to neglect the overlap between them. We refer to this case as a blending of lines. The approach applied here is very similar to the case of unblended narrow lines, we can still use the condition expressed by Eq. (76) in this case. However, we have to modify Eq. (77) in such a way that it includes the contributions from other lines to a particular line under consideration. Taking ni lines into account, with the line fluxes αi, i = 1,…, ni and a pseudo-wavelength position uL,i for the ith line, we obtain: f(2)(u)=i=1nlαiR(uL,i)L(2)(u,uL,i).${f^{\left( 2 \right)}}\left( u \right) = \sum\limits_{i = 1}^{{n_l}} {{\alpha _i}R\left( {{u_{L,i}}} \right){L^{\left( 2 \right)}}\left( {u,{u_{L,i}}} \right)} .$(84)

Evaluating this expression at the line positions uL,j and writing it as a matrix equation, we get f(2)(u)=L(2)r(u),${{\bf{f}}^{\left( 2 \right)}}\left( {\bf{u}} \right) = {{\bf{L}}^{\left( 2 \right)}}{\bf{r}}\left( {\bf{u}} \right),$(85)

with L(2) as the matrix with the elements Li,j(2)=L(uL,i,uL,j)${\bf{L}}_{i,j}^{\left( 2 \right)} = L\left( {{u_{L,i}},{u_{L,j}}} \right)$, f(2)(u) as the vector containing the elements f(2) (uL,i), and r (u) as the vector containing the elements αiR(uL,i), i = 1,…, ni in both cases. Equation (85) is then solved for r(u), and the computation of the equivalent widths for the nl lines continues in the same way as discussed for the unblended case previously.

8.5 Upper limits for narrow lines

In the case of a non-detection of a particular spectral line, it might be of interest to derive an upper limit for the equivalent width of the non-detected line from the coefficients of the Gaia DR3 spectrum. Given the covariance matrix for the coefficients of the spectrum, Σc, and a statistical level of significance, p, we can compute what line, given by α R(uL), can be added to the spectrum within the level of significance. With the coefficients for the pseudo-LSF at the position of the line, l(uL) given by Eq. (40), the distance between a spectrum and a spectrum with a narrow line added, expressed in numbers of standard deviations, is given by the Mahalanobis distance: DM=| αR(uL) |l(uL)Σc1l(uL).${D_{\rm{M}}} = \left| {\alpha R\left( {{u_{\rm{L}}}} \right)} \right|\sqrt {{{\bf{l}}^ \top }\left( {u{\rm{L}}} \right)\Sigma _{\rm{c}}^{ - 1}{\bf{l}}\left( {{u_{\rm{L}}}} \right)} .$(86)

This statistics follows a chi-distribution with n degrees of freedom. With Q(p, n) the quantile function of the chi-distribution with n degrees of freedom and the coefficients for the pseudo-LSF at the position of the line, l(uL) given by Eq. (40), we obtain | αlimitR(uL) |=Q(p,n)l(uL)Σc1l(uL).$\left| {{\alpha _{{\rm{limit}}}}R\left( {{u_{\rm{L}}}} \right)} \right| = {{Q\left( {p,n} \right)} \over {\sqrt {{{\bf{l}}^ \top }\left( {{u_{\rm{L}}}} \right)\Sigma _{\rm{c}}^{ - 1}{\bf{l}}\left( {{u_{\rm{L}}}} \right)} }}.$(87)

This result can be used with Eq. (81) to derive an upper limit on the absolute value of the equivalent width of a non-detected line: Wlimit=| αlimitR(uL) |t(uL).${W_{{\rm{limit}}}} = {{\left| {{\alpha _{{\rm{limit}}}}R\left( {{u_{\rm{L}}}} \right)} \right|} \over {t\left( {{u_{\rm{L}}}} \right)}}.$(88)

With the cumulative distribution function for the chi distribution, P(n/2, p2/2), with P(k, x) the incomplete gamma function (Press et al. 2007a), the 0.6827 quantile is approximately 7.7094, while the 0.95 and the 0.99 quantiles are approximately 8.5622 and 9.0715, respectively.

9 Broad lines

9.1 Detection of broad lines

We now consider broad lines, with which we mean that the approximation of the line by a Delta function is no longer suitable. Instead, the intrinsic width of the line is so large that it cannot be neglected even for the low spectral resolution of the Gaia spectrophotometers. For the distinction between narrow and broad lines we require a measure for the line width. We can base this measure on roots of derivatives of the spectrum, if we consider inflection points that are related to a local extremum. A local extremum has an inflection point on either side, which we denote ui, 1 and ui, 1, and base the estimator for the line width on the distance between these two points. As inflection points are roots of the second derivative of the observational spectrum, for a narrow line the distance between the inflection points is in a first-order approximation the distance between the innermost inflection points of the pseudo-LSF. With increasing intrinsic line width, the distance between the inflection points increases with respect to the corresponding distance for the pseudo-LSF. We therefore normalise the distance between the innermost two inflection points of the pseudo-LSF, and use this dimensionless quantity, which we denote as ω, ω=ui,2ui,1ui,2[ LSF ]ui,1[ LSF ],$\omega = {{{u_{i,2}} - {u_{i,1}}} \over {u_{i,2}^{\left[ {LS\,F} \right]} - u_{i,1}^{\left[ {LS\,F} \right]}}},$(89)

as the measure for line widths. If this quantity is significantly larger than one, we may consider the line broad. For the uncertainty on ω , we have to take the correlations between the errors on ui, 1 and ui, 2 into account, which we obtain from the corresponding elements of the covariance matrix on the roots of the second derivative, σω=Σ11+Σ222Σ12ui,2[ LSF ]ui,1[ LSF ].${\sigma _\omega } = {{\sqrt {{\Sigma _{11}} + {\Sigma _{22}} - 2{\Sigma _{12}}} } \over {u_{i,2}^{\left[ {LS\,F} \right]} - u_{i,1}^{\left[ {LS\,F} \right]}}}.$(90)

Figure 6 illustrates ω for Voigt profiles, as a function of the full width at half maximum (FWHM) of the Gaussian and Lorentz contributions. On the horizontal axis is the FWHM of the Gaussian, FG, on the vertical axis is the FWHM of the Lorentz profile, FL. The FWHM of the resulting Voigt profile, FV is in good approximation (Olivero & Longbothum 1977) FV=0.5356FL+0.2166FL2+FG2.${F_{\rm{V}}} = 0.5356 \cdot {F_{\rm{L}}} + \sqrt {0.2166 \cdot F_{\rm{L}}^2 + F_{\rm{G}}^2} .$(91)

The pseudo-LSF used here is based on the optical LSF for Gaia, blurred with a Gaussian with a standard deviation of 0.64, which is the LSF model used in Sect. 11 of this work. The LSF changes with wavelength, and for this example we use the wavelength of Hβ. For lines with FWHM below about 20 nm, the increase in ω is only a few ten per cent, showing the intuitive result that low resolution spectra are therefore not particularly sensitive to line widths.

thumbnail Fig. 6

Line width parameter, ω, for the Voigt profiles, as a function of the FWHM of the Gaussian contribution (on the horizontal axis) and the Lorentz contribution (on the vertical axis). The dotted lines indicate iso-contours in FWHM of 10, 20, 30, and 40 nm. See text for details.

9.2 Continuum model for broad lines

In the case of broad lines, estimating the continuum underlying a spectral line requires some form of interpolation. We would like to do the interpolation on internally calibrated spectra, where we detect a line, but at the same time, we would like to ensure that the interpolation on the internally calibrated spectrum corresponds to a smooth interpolation on the externally calibrated spectrum as well. To achieve this, we may adapt the approach already discussed in Sect. 6 and choose smooth interpolation functions in the external regime, then compute the observational spectra corresponding to these interpolation functions. We can then use these artificial observational spectra of the interpolation functions to find an interpolation over some pseudo-wavelength interval for an XP spectrum. For the interpolation functions, we may choose again the first M Legendre polynomials over a wavelength interval that essentially covers the entire range of non-zero response in BP and RP, respectively.

With the interpolation functions at hand, we have to decide about the interpolation interval [ulower, uupper] under the line. The choice of this interval is crucial for a good estimate of the continuum. Choosing this interval too narrow will result in an under-subtraction of the continuum, while choosing it too wide is likely to make the interpolation unreliable. We decided to tie the interval to the widths of the line, as expressed by the distance between the two inflection points on either side of the local extremum, ui, 2 – as discussed in the previous subsection. We chose the interpolation interval symmetric around the local extremum, with a width corresponding to a factor of x times Δ. In experimenting with simulated spectra with different black body continua and Voigt profiles for the lines, we found a value around x ≈ 5.5 is a good choice.

For the interpolation, we again have to impose conditions on the function at the boundaries of the interpolation interval and different points of the function itself on either side if the interval. So we selected a small number of m points of either side of the interval, separated by a value δ, such that we can obtain a sequence of 2m + 2 points u := umδ, u – (m – 1)δ,…, ulower, uupper, u + δ,…, u + . We might not force an exact solution for the interpolation, but chose 2m + 2 > M, and do a regression, as it reduces the noise. The noise on f (u) however is likely to be correlated, with a covariance matrix Σf = ΦT(u) Σc Φ(u). We thus have to determine the coefficient vector for the interpolation functions, b, such that it minimised the squared Mahalanobis distance as a generalised least squares problem, as already discussed in Sect. 6. The product, Lb, then provides the coefficients for the continuum approximation on the interval [uiower, uupper].

9.3 Equivalent widths for broad lines

Having separated the line and the continuum, the computation of the equivalent width requires the inversion of both components of the spectrum. The inversion of the continuum is trivial, as the coefficients, b, have been constructed from the observational spectra corresponding to Legendre polynomials in the regime of the absolute calibration. However, the inversion of the line signal is intrinsically an ill-posed problem and any inversion might be subject to significant systematic errors. We therefore consider a simplification and compute an estimate for the equivalent width without inversion of the internally calibrated spectrum, namely, using the approximation Wu1u2f(u)p(b,u)p(b,u)dλdudu,$W \approx \int\limits_{{u_1}}^{{u_2}} {{{f\left( u \right) - p\left( {{\bf{b}},u} \right)} \over {p\left( {{\bf{b}},u} \right)}}} {{{\rm{d}}\lambda } \over {{\rm{d}}u}}{\rm{d}}u,$(92)

with p(b, u) the continuum approximation. The approximate computation of W from the internally calibrated spectrum instead of the SPD has been tested using simulations for narrow lines and Voigt profiles with different underlying continua. The resulting systematic errors were in the order of some percent. In all test cases, the systematic errors from continuum interpolation were at least three times larger than the error resulting from the approximate computation. This approximation might be sufficiently good for general broad lines.

Example computations for the determination of W with this approach are shown in Fig. 5. The simulated case is analogous to the case considered for narrow lines, a black body continuum for 2 × 104 K and an Hβ line are used, for G = 12m and ten transits, and 5 × 105 random realisations of the BP spectrum with randomly selected values for W. Panel B shows the distribution of the relative difference between the computed and the true equivalent width as a function of true equivalent width. The performance of the broad line approach is similar to the use of the narrow line approach, shown in panel A. The random noise is slightly higher, however.

Panel C of the same figure shows the distribution for an line with a Gaussian profile and a FWHM of 15 nm. In this case, the Legendre interpolation was done without taking correlations into account. A systematic error in W can be seen, resulting mainly from a systematic underestimation of the continuum in the fit. Panel D shows the same case with taking correlations into account. The random noise is suppressed as compared to the case without including correlations, but the systematic errors from the continuum interpolation remain. Panel F of Fig. 5 shows the results for a Voigt profile for the Hβ line, with a FWHM of 15 nm and equal FWHM for the Gaussian and Lorentz contributions to the Voigt profile. The systematic errors from the continuum interpolation increase as compared to a Gaussian line profile. Panel H shows the distribution of relative difference in W for a Lorentz profile with a FWHM of 15 nm. Here, the systematic errors from continuum interpolation increase even more. The reason for the increasing systematic errors from Gaussian to Voigt to Lorentz profile are the increasingly flat wings of the spectral line, which make the choice of a suitable interpolation interval for the continuum increasingly difficult. As a result, the fit is likely to over- or under-subtract the line from the continuum. This problem can be strongly reduced if the shape of the line can be assumed known a-priori, a case we discuss in the following section.

9.4 Broad lines with known line shapes

In cases where the position and shape of a broad line can be assumed to be known a priori, the analysis of the line in Gaia DR3 XP spectra is simplified. In this case, the interpolation step for the continuum can be replaced by an approach more similar to the case of narrow lines. Assuming a line profile p(X), normalised such that its integral is unity, we compute the observational spectrum corresponding to this profile: P(u):=0L(u,λ)R(λ)p(λ)dλ.$P\left( u \right): = \int\limits_0^\infty {L\left( {u,\lambda } \right)R\left( \lambda \right)p\left( \lambda \right){\rm{d}}\lambda } .$(93)

Analogously to the approach in Sect. 8.2, we then use the second derivative at the line position to determine the scaling factor α that describes the line strength: α=f(2)(uL)P(2)(uL).$\alpha = {{{f^{\left( 2 \right)}}\left( {{u_{\rm{L}}}} \right)} \over {{P^{\left( 2 \right)}}\left( {{u_{\rm{L}}}} \right)}}.$(94)

The approximation for the continuum spectrum is: fc(u)f(u)αP(u).${f_{\rm{c}}}\left( u \right) \approx f\left( u \right) - \alpha \,P\left( u \right).$(95)

As the line profile is assumed to be known, no numerical inversion of the profile is required. An inversion of the continuum over the pseudo-wavelength range covered by the line is, however, required. As the continuum can be assumed smooth, the inversion for tc(u) using the Taylor approximation as lined out in Sect. 6 might allow for a good smooth approximation on a grid of points in the interval of interest. With knowledge of the response function, we can thus obtain an estimate for sc(X) and compute the equivalent with by numerical integration of the expression: W=αλ1λ2p(λ)sc(λ)dλ.$W = \alpha \int\limits_{{\lambda _1}}^{{\lambda _2}} {{{p\left( \lambda \right)} \over {{s_{\rm{c}}}\left( \lambda \right)}}{\rm{d}}\lambda } .$(96)

The inversion of the continuum point-wise is increasing the computational effort, but in exchange, it is possible to remove systematic errors resulting from the interpolation of the continuum. Illustrations are shown in panels E, G, and I of Fig. 5, for a Gaussian, a Voigt, and a Lorentz line profile, respectively. The equivalent width of an Hβ line with a black body continuum of 2 × 104 K and a FWHM of 15 nm is computed assuming the line position and profile known. The distribution of the relative difference between the computed and true W is shown for 5 × 105 random generations for G = 12m and ten transits. The systematic error in W that results when interpolating the continuum without using a priori knowledge about the line, a case shown in panels D, F, and H of the same figure, is strongly reduced. The reduction in systematic error is strongest for a Gaussian line profile and lowest for the Lorentz profile. The differences in the remaining systematic error mainly arise from the interval of integration. From Gaussian and Voigt to Lorentz profile, the wings of the line become wider, and the systematic errors from an integration over a fixed interval become larger. The reduction in systematic error however comes with an increase in the random error. If the position and shape of the line is known a priori, it is possible to compute an upper limit for a non-detected line, following the approach discussed for narrow lines in Sect. 8.5. Furthermore, we may also treat the problem of blended lines analogously to the case of narrow lines described in Sect. 8.4.

10 Lines as extrema in higher derivatives

As we have seen in Sect. 8.1, no local extremum in the observational spectrum is formed by a line if the function ζ(u)/W is not intersecting with the function L(1)(u, uL), the first derivative of the LSF or pseudo-LSF. As can be seen from Eq. (73), large values for ζ(u)/W are a result of an overly large ratio between the slope of the continuum spectrum, fc(1)(u)$f_c^{\left( 1 \right)}\left( u \right)$, and the line’s equivalent width, W. Thus, a large slope and a weak line, namely, a line with a small absolute value of W, might prevent the formation of a local extremum. However, a line might still be able to produce a local extremum in higher derivatives. We are therefore considering higher derivatives of Eq. (75), stating the condition for a local extremum.

The function L(2)(u, uL) has no root at uL, and a line might show as two local extrema in f(1)(u) on both sides of uL. These two points correspond to inflection points of the observational spectrum. This situation is illustrated in the top panel of Fig. 7, for a simulated Hβ line with a black body continuum of 6000K. The shapes of the higher derivatives of the pseudo-LSF and ζ(u)/W become more complex, and the positions of uE might vary strongly. A good indicator for inflection points being related to a line is whether the values of uE are within the first local extrema of L(2)(uL), indicated by umin and umax in Fig. 7.

The function L(3)(u, uL) has a root at uL, as the curvature of the pseudo-LSF is maximal at its centre. We might therefore consider the case of local extrema in f(2)(u). This case is illustrated in the bottom panel of Fig. 7. For the determination of the equivalent width of a narrow line, we might therefore use the approach of Eq. (81), with the orders of the involved derivatives increased by two, W=f(4)(uE)L(4)(uE,uL)1tc(uL).$W = {{{f^{\left( 4 \right)}}\left( {{u_{\rm{E}}}} \right)} \over {{L^{\left( 4 \right)}}\left( {{u_{\rm{E}}},{u_{\rm{L}}}} \right)}}{1 \over {{t_{\rm{c}}}\left( {{u_{\rm{L}}}} \right)}}.$(97)

As an example, we again consider a simulated Hβ line in BP, with an underlying black body continuum for a temperature of 6000 K. This simple test case does not result in a local extremum for –1.1 nm ≤ W ≤ 1.1 nm. We derive the equivalent width from the extremum in the second derivative of f (u). The resulting distribution of differences between the true and the computed W, as a function of true W, is shown in Fig. 8. This figure shows the result from 2 × 105 random realisations of an object, with G = 12m and ten transits. Unbiased recovery of the equivalent width is possible, with some tens of percent uncertainty, also from the second derivative of a spectrum. As expected for faint lines and the use of higher derivatives, the random noise is considerably larger than for the case of deriving equivalent widths from local extrema. However, only if |W| falls below about 0.5 nm, the noise becomes so large that no meaningful interpretation of the spectrum around the Hβ line becomes possible in the simulated case.

thumbnail Fig. 7

Illustration of the positions of local extrema in the first derivative (top panel) and second derivative (bottom panel) for a simulated Hβ line with a 6000K black body continuum. See text for details.

thumbnail Fig. 8

Distribution of the difference between true and derived equivalent width for simulated weak and narrow Hβ lines. See Sect. 10 for details.

11 Example applications to Gaia DR3 spectra

In the following, we apply the methods derived in the previous sections to some example cases among the Gaia DR3 low-resolution spectra. These examples were not selected for their scientific output, but serve solely for illustrating the computational method and illustrating its potential. To cover a wide range of applications, the examples in this section include the analysis of lines in the zero and second derivative, absorption and emission lines, narrow and broad lines, and blended and non-blended cases, for hydrogen Balmer lines, He I lines, and broad interstellar absorption bands.

For the instrument model, we used the dispersion functions and response curves from the configuration files in the Gaia software tool GaiaXPy1. For easy computation, we represent the tabulated dispersion functions by spline functions. The configuration files contain no model for the (pseudo-)LSF. To construct a simple substitute, we measure the widths ω of Hα and Hβ lines in a large set of sources known for strong hydrogen absorption and emission with respect to the optical LSF of Gaia, broadened with pixel sampling and TDI effects (cf. Weiler et al. 2020). We then applied a Gaussian blur to this LSF until the width of the resulting blurred LSF matched the observed widths for Hα and Hβ. The standard deviation of the Gaussian blur applied is 0.64, independent of wavelength. From this simple LSF model, we compute the pseudo-LSF by development in Hermite functions. In the following, we show that this simplistic approach is already sufficient to obtain good results. However, the results presented should be understood as preliminary, as they are likely to improve with the use of a realistic full instrument model when available in the future.

The analysis of the Hα line of a large set of sources with strong Ha emission showed a systematic difference between the pseudo-wavelength at which the RP dispersion function predicts this line and the true line position by about 0.3 in pseudo-wavelength. We corrected for this systematic effect by artificially shifting the value of uL for Hα by +0.3 with respect to uL predicted by the dispersion relation used. A similar effect seems to apply at long wavelengths for the BP instrument, and we applied a correction of +0.35 for the He I line at 587.7nm as well.

11.1 Example: Balmer emission and absorption lines

As a first example, we considered the star Gaia DR3 505171240162594560, IRAS 02058+5719, an object with G ≈ 14.187m and GBPGRP ≈ 0.495m. This object has been selected from the IGAPS data set (Monguió et al. 2020), where Hα emission is reported. The analysis of this object is illustrated with the BP and RP spectra as a function of pseudo-wavelength in Fig. 9. The expected positions of the Hα to Hδ lines are indicated by dashed lines in this figure. The intervals [umin, umax], within which an extremum is consistent with resulting from the Balmer lines, are highlighted. The intervals are based on the two inflection points of the LSF on both sides of the maximum position of the LSF, respectively, as discussed in Sect. 8.1. For this star, we find a local maximum in RP consistent with Hα, along with three local extrema in the second derivative, consistent with Hβ, Hγ, and Hδ. All four local extrema have p-values larger than 0.999 and it is thus very unlikely that these are noise patterns. The red lines show the continuum model for narrow lines according to Eq. (79). We use Eq. (81) for the computation of the equivalent widths of the lines, using the inversion according to Eq. (54). The results for the four Balmer lines in the XP spectra of this source are summarised in Table 2. The difference between the wavelength corresponding to the local extrema, λe, and the wavelengths of the corresponding Balmer lines is well within the range of expected systematic deviations. The estimated line widths, ω, are in a good agreement with the assumption of narrow lines. Only the Hγ line shows a deviation above a 1-σ level, but mildly so.

thumbnail Fig. 9

Example for the analysis of Hβ to Hδ in BP (top panel) and Ha in RP (bottom panel) for source Gaia DR3 505171240162594560. The nominal line positions are indicated by dashed lines, the shaded regions indicate the range in pseudo-wavelength within which a local extremum is considered in agreement with being a hydrogen Balmer line. For BP, the wavelength increase from right to left (with decreasing pseudo-wavelength) for RP from left to right (with increasing pseudo-wavelength). Black curves show the XP spectra, the red curves show the continuum model. Black dots indicate the positions of local extrema in the second derivative (Hβ to Hδ) and zero derivative (Hα).

11.2 Example: Hα absorption for CALSPEC stars

In order to test the accuracy of the equivalent widths derived from Gaia XP spectra with the method presented in this work, we compared the values of W with values derived from higher resolution spectra of the same sources. For our set of stars with high-quality spectra with higher resolution, we used the CALSPEC data (Bohlin et al. 2014, Bohlin et al. 2020) and we considered the Ha line. In the visible wavelength range, the CALSPEC spectra are obtained with the Hubble Space Telescope/Space Telescope Imaging Spectrograph (HST/STIS) and the resolving power at the Hα line is approximately 673. The data set provides 44 stars with XP spectra available in DR3 and a detection of Hα in the RP spectrum with a p-value larger than 0.8.

In the CALSPEC spectrum of each star, a third-degree polynomial was fitted to continuum intervals on both sides of the line and then used as an approximation of the continuum. With the continuum approximation, the FWHM of the line is derived and the equivalent width computed by applying Eq. (66). The 44 stars cover a wide range of parameters, with equivalent widths between –0.2 nm and –4.4 nm, FWHM of the lines ranging from 1.5 nm to 7.8 nm, G band magnitudes from 14.7m to 4.3m, and GBPGrp colours ranging from −0.54 to 0.91.

A comparison between the equivalent widths derived from RP spectra and those from the CALSPEC spectra is shown in Fig. 10. The upper panel of this figure shows the equivalent widths derived with the narrow line approximation as described in Sect. 8.3. While this approximation is providing a good agreement with the reference values from CALSPEC spectra for weak absorption lines, systematic deviations are obvious for stronger absorption lines. Already for equivalent widths around –1.5 nm the systematic error is clearly visible.

The bottom panel of Fig. 10 shows the comparison for equivalent widths computed by assuming a Lorentz line shape for the Ha line for all lines with a value of ω larger than one and using the approach described in Sect. 9.4. The width of the Lorentz profile was chosen such that the value of ω derived for each source from the RP spectrum is met by the model Lorentz line. With this approach, the systematic error is strongly reduced, while the random error, which has been estimated by using bootstrapping, is increased. This is consistent with the results obtained for simulated spectra in Sect. 9.4. Only the strongest and broadest lines in Fig. 10 might be affected by systematics. However, besides the systematic errors resulting from the computational approach used, also the simplistic instrument model employed in this work might contribute to systematic errors for the broad and strong absorption lines.

Table 2

Line, XP instrument (BP or RP), position in pseudo-wavelength (uE), and shift with respect to expected line position (Δu), wavelength (λe) for the local extrema, line width parameter, ω, estimated equivalent width (W), and p-value for the lines in the XP spectra of individual sources analysed in this work.

11.3 Example: Hα and Hβ absorption in open clusters

As an example for the application of the method presented in this work to a larger set of sources, we selected open clusters and determined the Hα and Hβ equivalent widths. For the test, we selected the three open clusters NGC 3114, Mel 20 (α Persei), and NGC 3532, with the cluster members taken from Cantat-Gaudin & Anders (2020). For the test, we applied strict quality filters to the data, including only sources with fractions of blended and contaminated transits below 0.1, a small renormalised unit weight error below 1.4, a small number of multi-peaked transits or transits with complex windows, and an absolute value of the corrected flux excess below 1.5σC* according to Riello et al. (2021). This filtering resulted in 851, 552, and 820 sources for the three open clusters, respectively. Figure 11 shows the equivalent widths as a function of effective temperature for the three open clusters. The effective temperatures are the field teff_gspphot from the DR3 gaia_source and astrophysical parameters table (Creevey et al. 2023; Fouesneau et al. 2023). The p-value of the hydrogen lines is shown as a colour scale, and lines which are local extrema in the XP spectra are shown as circles, the ones which are extrema in the second derivative of the XP spectrum as triangles. The generally expected behaviour can be seen in the results for all three clusters, and for Hα and Hβ. Weaker lines tend to be detected in the second derivative, and the statistical significance of the lines, expressed by the p-value, increases as the lines are getting stronger. Also, the errors on the equivalent widths tend to become smaller if the line strength increases. Also as expected, a maximum hydrogen absorption around an effective temperature of 9000 K can be seen in all three clusters and for Hα and Hβ.

The open cluster NGC 3114 shows one prominent outlier from the general trend of Hα with effective temperature, which is Gaia DR3 5256686653442846336 (HD 87471). This star with an effective temperature of about 104 K has no significant Hα line, with W = (0.087 ± 0.100) nm, and a p-value of 0.414. The Hβ absorption for this star is also somewhat lower for this effective temperature, with W = (–0.87 ± 0.23) nm, with a p-value of p = 0.9999. No spectral peculiarities have been reported for this star. It is classified as a sub-giant (Houk & Cowley 1975), and this classification is consistent with the fact that the star is approximately 1.5 magnitudes brighter than the main sequence stars of comparable colour in the same cluster. A lower hydrogen absorption as compared to the other stars in the cluster might thus result at least partially from a lower surface gravity of a sub-giant.

thumbnail Fig. 10

Comparison of the equivalent widths of the Hα line for CALSPEC stars, derived from Gaia RP spectra, WRP, and from CALSPEC spectra, WCALSPEC. Top panel: Results obtained with the narrow line approximation. Bottom panel: Results for assuming a Lorentz line profile. The colour scale gives the FWHM of the lines, as measured in CALSPEC spectra, and it is the same for both panels.

11.4 Example: Blended narrow lines

As an example for the treatment of blended narrow lines we chose the cataclysmic variable HT Cassiopeiae, Gaia DR3 426306363477869696. This star, with a G ≈ 16.313m and a GBPGRP ≈ 0.664m, is known to show strong emissions from hydrogen and neutral helium (Marsh 1990). We therefore studied the hydrogen Balmer lines and three strong He I lines within the wavelength range covered by BP and RP, at wavelengths of 447.3 nm, 587.7 nm, and 706.7nm, which are the vacuum Ritz wavelengths from the National Institute of Standards and Technology Atomic Spectra Data Base (Kramida et al. 2021). The XP spectra for this source as a function of pseudo-wavelength are shown in Fig. 12 for illustration. The derived line parameters are presented in Table 2. All hydrogen Balmer lines for Hα to Hϵ are clearly detected as emission lines, with p-values above 0.95 in all cases. There are local extrema in the XP spectra or its second derivative that are consistent with being He I emissions. However, the statistical significance of these extrema is rather low. The p-values for the lines at 706.7 nm and 587.7 nm are 0.420 and 0.270, respectively, and the extrema are thus well in agreement with the status of noise. The extremum consistent with the 447.3 nm He I line has a somewhat larger p-value of 0.700, and might be a tentative detection of He I.

The lines of Hβ, He I at 447.3 nm, and Hγ to He are blended, and the equivalent widths of these lines are computed using the approach presented in Sect. 8.4. With five blended lines, the continuum model is constructed over a wide range of pseudo-wavelengths, shown as the red line in Fig. 12. The continuum model has a wavy structure, partly because it is affected by the noise. Additionally, the assumptions on the LSF made in this work are simplified and, in particular, the outer parts of the pseudo-LSF are affected by these simplifications. When combining the outer parts of the pseudo-LSF in case of blended lines, these simplifications might therefore additionally affect the resulting solution. An improved LSF model might therefore allow for improvements in the future.

thumbnail Fig. 11

Ha and Hβ equivalent widths against the effective temperature for the three open clusters NGC 3114, Mel 20, and NGC 3532. The colour scale shows the p-value and is common to all panels. Circles correspond to extrema in the zero derivative. Triangles correspond to the extrema in the second derivative.

11.5 Example: Broad interstellar absorption line

As an example for a broad line, we considered the interstellar band at approximately 770 nm in RP. Maíz Apellániz et al. (2021) discovered this band in reddened OB stars and provides a number of stars with equivalent widths for this band. The band shape is Gaussian, with a central wavelength of 769.9 nm and a FWHM of 17.66 nm. We use this information on the line shape to apply the approach discussed in Sect. 9.4. The internally calibrated RP spectrum for one example, Gaia DR3 2067888218857234304, with G ≈ 9.246m and GBPGRP ≈ 2.008m, is shown in Fig. 13. Maíz Apellániz et al. (2021) reported an equivalent width of the 770nm band of (−1.49 ± 0.20) nm for this star. An extremum in the second derivative, consistent with the pseudo-wavelength position of the 770 nm band, is detected, with a high p-value of 0.9996. The continuum model is shown as a red line in Fig. 13, derived for a Gaussian band profile. The resulting equivalent width of the band of W = (−1.38 ± 0.55)nm is well in agreement with the result by Maíz Apellániz et al. (2021). The line width parameter ω however is only 1.15 ± 0.16, and the band is thus still in agreement with being narrow. It is nevertheless consistent with the model for the pseudo-LSF used, which has a FWMH of already 22.7 nm at a wavelength of 770 nm. The expected value of ω is 1.21, slightly larger than unity only, and also in good agreement with the measured value of ω. The line parameters for the 770 nm band are listed in Table 2 as well.

thumbnail Fig. 12

Analysis of hydrogen Balmer and He I lines for Gaia DR3 426306363477869696. As in Fig. 9, the wavelengths increase from right to left for BP (upper panel) and from left to right for RP (bottom panel).

12 Summary and discussion

We derived a methodology for the analysis of spectral emission and absorption lines in Gaia DR3 low resolution spectra. These spectra are represented by linear combinations of Her-mite functions in their form of internally calibrated spectra and we exploited the useful mathematical properties that come with this form of representation. These properties include that the derivative of a linear combination of Hermite functions is another linear combination of Hermite functions, that can be derived by a matrix-vector multiplication. Furthermore, all roots of a linear combination of Hermite functions can be computed as eigenvalues of a simple non-standard companion matrix. These combined properties allow for a convenient and efficient determination of local extrema and inflection points in Gaia DR3 XP spectra. We used this to derive expression for the position, error on position, statistical significance, and equivalent widths of spectral lines, both for narrow and broad lines. The resulting computations are very fast and robust and therefore can be applied to search for spectral lines and extract their parameters even for large numbers of objects with modest computational resources. Neither machine learning nor human interaction is required.

We applied the derived methods to Gaia DR3 low resolution spectra with hydrogen Balmer lines, He I lines, and a broad interstellar band. The line positions, widths, and equivalent widths, as well as the statistical significance of the corresponding local extrema in the spectra were computed as a demonstration. The presented methodology based on the derivatives of Hermite functions might help to make use of the interesting and in its form unique new Gaia data product that are the low-resolution spectra. Despite their low spectral resolution, Gaia XP spectra provide opportunities for the analysis of absorption and emission lines from the near-ultraviolet to the near-infrared part of the electromagnetic spectrum. Exploiting the mathematical properties of the unique representation of the spectra as linear combinations of Hermite functions allows for a detailed analysis of the spectral lines.

thumbnail Fig. 13

Example for the analysis of the broad interstellar band at 770 nm for source Gaia DR3 2067888218857234304. The nominal band position is indicated by dashed lines, the shaded regions indicate the range in pseudo-wavelength within which a local extremum is considered to be in agreement with the status of the band. The black curve shows the XP spectra and the red curve shows the continuum model. The black dot indicates the positions of local extrema in the second derivative.

Acknowledgements

This work was funded by the Spanish MICIN/ AEI/10.13039/501100011033 and by “ERDF A way of making Europe” by the European Union through grant RTI2018-095076-B-C21, and the Institute of Cosmos Sciences University of Barcelona (ICCUB, Unidad de Excelencia ‘María de Maeztu’) through grant CEX2019-000918-M. This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.

References

  1. Bohlin, R.C., Gordon, K.D., & Tremblay, P.E. 2014, PASP, 126, 711 [NASA ADS] [Google Scholar]
  2. Bohlin, R.C., Hubeny, I., & Rauch, T. 2020, AJ, 160, 21 [NASA ADS] [CrossRef] [Google Scholar]
  3. Brychkov, Y.A., Marichev, O.I., & Prudnikov, A.P. 1989, Tables of Indefinite Integrals (Gordon & Breach Science Publishers) [Google Scholar]
  4. Cantat-Gaudin, T., & Anders, F. 2020, A&A, 633, A99 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  5. Carrasco, J.M., Weiler, M., Jordi, C., et al. 2021, A&A, 652, A86 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  6. Carroll, B.W., & Ostlie, D.A. 2017, An Introduction to Modern Astrophysics, 2nd edn. (Cambridge University Press) [CrossRef] [Google Scholar]
  7. Creevey, O.L., Sordo, R., Pailler, F., et al. 2023, A&A, in press, https://doi.org/10.1051/0004-6361/202243688 [Google Scholar]
  8. Day, D., & Romero, L. 2005, SIAM J. Numer. Anal., 43, 1969 [CrossRef] [Google Scholar]
  9. De Angeli, F., Weiler, M., Montegriffo, P., et al. 2023, A&A, in press, https://doi.org/10.1051/0004-6361/202243680 [Google Scholar]
  10. Fouesneau, M., Fremat, Y., Andrae, R., et al. 2023, A&A, in press, https://doi.org/10.1051/0004-6361/202243919 [Google Scholar]
  11. Gaia Collaboration (Prusti, T., et al.) 2016, A&A, 595, A1 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  12. Gaia Collaboration (Vallenari, A., et al.) 2023, A&A, in press, https://doi.org/10.1051/0004-6361/202243940 [Google Scholar]
  13. Golub, G.H., & Van Loan, C.F. 2013, Matrix Computations, 4th edn. (John Hopkins University Press) [CrossRef] [Google Scholar]
  14. Goodman, J.W. 1996, Introduction to Fourier Optics, 2nd edn. (McGraw-Hill) [Google Scholar]
  15. Houk, N., & Cowley, A.P. 1975, University of Michigan Catalogue of Twodimensional Spectral Types for the HD Stars. I. Declinations_90_to_53_f0 [Google Scholar]
  16. Jordi, C., Gebran, M., Carrasco, J.M., et al. 2010, A&A, 523, A48 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  17. Kariya, T., & Kurata, H. 2004, Generalized Least Squares (John Wiley & Sons) [CrossRef] [Google Scholar]
  18. Kramida, A., Yu. Ralchenko, Reader, J., & NIST ASD Team. 2021, NIST Atomic Spectra Database (ver. 5.9), [Online]. Available: https://physics.nist.gov/asd [2022, July 20], National Institute of Standards and Technology, Gaithersburg, MD [Google Scholar]
  19. Magnasco, V. 2013, Elementary Molecular Quantum Mechanics: Mathematical Methods and Applications, 2nd edn. (Elsevier) [Google Scholar]
  20. Maíz Apellániz, J., Barbá, R.H., Caballero, J.A., Bohlin, R.C., & Fariña, C. 2021, MNRAS, 501, 2487 [CrossRef] [Google Scholar]
  21. Marsh, T.R. 1990, ApJ, 357, 621 [NASA ADS] [CrossRef] [Google Scholar]
  22. Monguió, M., Greimel, R., Drew, J.E., et al. 2020, A&A, 638, A18 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  23. Montegriffo, P., De Angeli, F., Andrae, R., et al. 2023, A&A, in press, https://doi.org/10.1051/0004-6361/202243880 [Google Scholar]
  24. Olivero, J.J., & Longbothum, R.L. 1977, JQSRT, 17, 233 [NASA ADS] [CrossRef] [Google Scholar]
  25. Press, W.H., Teukolsky, S.A., Vetterling, W.T., & Flannery, B.P. 2007a, Numerical Recipes, 3rd edn. (Cambridge University Press) [Google Scholar]
  26. Press, W.H., Teukolsky, S.A., Vetterling, W.T., & Flannery, B.P. 2007b, Numerical Recipes, Webnote No. 17, Rev. 1 [Google Scholar]
  27. Riello, M., De Angeli, F., Evans, D.W., et al. 2021, A&A, 649, A3 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  28. Stetter, H.J. 2004, Numerical Polynomial Algebra (SIAM) [CrossRef] [Google Scholar]
  29. Weiler, M., Carrasco, J.M., Fabricius, C., & Jordi, C. 2020, A&A, 637, A85 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

All Tables

Table 1

List of notations used throughout this work.

Table 2

Line, XP instrument (BP or RP), position in pseudo-wavelength (uE), and shift with respect to expected line position (Δu), wavelength (λe) for the local extrema, line width parameter, ω, estimated equivalent width (W), and p-value for the lines in the XP spectra of individual sources analysed in this work.

All Figures

thumbnail Fig. 1

Sketch illustrating the difference between a typical spectrum and a spectrum represented by a linear combination of basis functions. Top panel: typical spectrum, consisting of sampled flux values on a wavelength grid (black symbols) derived from the spectral photon distribution (red line). Central panel: coefficients representing the same data set by a linear combination of basis functions. Black symbols show the noisy representation, red dots the true coefficients. Bottom panel: linear combination of basis functions (black line) and the 1-σ error interval (grey shaded region) as a function of wavelength.

In the text
thumbnail Fig. 2

Empirical cumulative distribution functions for the absolute value of the derivative, normalised to its error, at the roots of linear combinations of Hermite functions, for N = 5 (orange), 55 (red), and 200 (green). The black curve shows the cumulative distribution function for a chi distribution with two degrees of freedom.

In the text
thumbnail Fig. 3

Error-normalised value of the second derivative of the BP spectrum as a function of uE for random spectra consisting of a 2 × 104 K black body continuum and a narrow Hß line with random equivalent width W. The three panels correspond to G-magnitudes of 12m, 15m, and 18m, from top to bottom.

In the text
thumbnail Fig. 4

Illustration of the shift between the line position, uL, and the position of local extrema, uE, for a 2 × 104 K black body background and an Hβ line (top panel) and an Hα line (bottom panel). The green lines show L(1) (uuL, uL) and the blue lines show ζ (uuL) divided by W. We refer to the text for more details.

In the text
thumbnail Fig. 5

Distributions of the relative difference between the computed equivalent width (W) and the true equivalent width (Wtrue) of Hβ lines with a 2 × 104 K black body continuum for different assumptions on the line shape (narrow lines, Gaussian, Voigt, and Lorentz profiles) and different computation techniques. Each panel shows the result from 5 × 105 random realisations of a spectrum for G = 12m and ten transits. Positive values in the horizontal axis correspond to Hβ emission, negative values to Hβ absorption. For details see text in Sects. 8.3, 9.3, and 9.4.

In the text
thumbnail Fig. 6

Line width parameter, ω, for the Voigt profiles, as a function of the FWHM of the Gaussian contribution (on the horizontal axis) and the Lorentz contribution (on the vertical axis). The dotted lines indicate iso-contours in FWHM of 10, 20, 30, and 40 nm. See text for details.

In the text
thumbnail Fig. 7

Illustration of the positions of local extrema in the first derivative (top panel) and second derivative (bottom panel) for a simulated Hβ line with a 6000K black body continuum. See text for details.

In the text
thumbnail Fig. 8

Distribution of the difference between true and derived equivalent width for simulated weak and narrow Hβ lines. See Sect. 10 for details.

In the text
thumbnail Fig. 9

Example for the analysis of Hβ to Hδ in BP (top panel) and Ha in RP (bottom panel) for source Gaia DR3 505171240162594560. The nominal line positions are indicated by dashed lines, the shaded regions indicate the range in pseudo-wavelength within which a local extremum is considered in agreement with being a hydrogen Balmer line. For BP, the wavelength increase from right to left (with decreasing pseudo-wavelength) for RP from left to right (with increasing pseudo-wavelength). Black curves show the XP spectra, the red curves show the continuum model. Black dots indicate the positions of local extrema in the second derivative (Hβ to Hδ) and zero derivative (Hα).

In the text
thumbnail Fig. 10

Comparison of the equivalent widths of the Hα line for CALSPEC stars, derived from Gaia RP spectra, WRP, and from CALSPEC spectra, WCALSPEC. Top panel: Results obtained with the narrow line approximation. Bottom panel: Results for assuming a Lorentz line profile. The colour scale gives the FWHM of the lines, as measured in CALSPEC spectra, and it is the same for both panels.

In the text
thumbnail Fig. 11

Ha and Hβ equivalent widths against the effective temperature for the three open clusters NGC 3114, Mel 20, and NGC 3532. The colour scale shows the p-value and is common to all panels. Circles correspond to extrema in the zero derivative. Triangles correspond to the extrema in the second derivative.

In the text
thumbnail Fig. 12

Analysis of hydrogen Balmer and He I lines for Gaia DR3 426306363477869696. As in Fig. 9, the wavelengths increase from right to left for BP (upper panel) and from left to right for RP (bottom panel).

In the text
thumbnail Fig. 13

Example for the analysis of the broad interstellar band at 770 nm for source Gaia DR3 2067888218857234304. The nominal band position is indicated by dashed lines, the shaded regions indicate the range in pseudo-wavelength within which a local extremum is considered to be in agreement with the status of the band. The black curve shows the XP spectra and the red curve shows the continuum model. The black dot indicates the positions of local extrema in the second derivative.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.