Free Access
Issue
A&A
Volume 561, January 2014
Article Number A59
Number of page(s) 8
Section Astronomical instrumentation
DOI https://doi.org/10.1051/0004-6361/201322746
Published online 23 December 2013

© ESO, 2013

1. Introduction

Spectra of astronomical objects provide a wealth of information, and the increasing need for higher precision has led to the development of very stable and fibre-fed spectrographs. A prospering example is the search for exoplanets with Doppler spectroscopy. Cross-dispersed, high-resolution echelle spectrographs are usually employed to measure radial velocities at a precision of 1 m/s, which corresponds to about 1/1000 of a pixel. Therefore careful calibration, as well as high mechanical, thermal, and pressure stability, is essential.

If aiming for high precision, not only is the hardware important, but also the software algorithms to process the images. Typically, a reduction of echelle spectra consists of several steps: bias subtraction, dark subtraction, scattered light subtraction, flat-fielding, extraction to 1D, deblazing, wavelength calibration, order merging, flux normalisation (e.g. Baranne et al. 1996). In this work we focus on extraction, which is a crucial step in image processing. A widely used method is the so-called optimal extraction (Horne 1986) and its variants (e.g. Marsh 1989; Piskunov & Valenti 2002, Table 1), which basically scales 1D cross-sectional profiles to the imaged spectrum, and the scaling factor is the best flux estimate. Additionally, most algorithms try to model and reconstruct the spatial profile/slit function with polynomial, Gaussian, or other smooth functions. However, for stabilised spectrographs, the order profiles and positions are object- and time-independent, which simplifies the extraction; in particular, there is not necessarily any need to model the spatial profile with empirical functions. We exploit this circumstance, and we derive and test our concept of flat-relative optimal extraction.

Table 1

Optimal extraction algorithms.

2. Principle of flat-relative optimal extraction (FOX)

In the following, we assume that the image processing steps (e.g., bias, dark, and background subtraction, non-linearity correction) preceding extraction are properly done and that, in the image, the main dispersion is oriented in a more or less horizontal direction (x) and the cross-dispersion (echelle orders) in a vertical direction (y).

A spectrograph consists of dispersive elements and a camera that images the slit or the fibre exit to wavelength-dependent positions and shapes. The observed light distribution S(x,y) is a convolution of the input spectrum s(λ) with a wavelength-dependent instrumental point spread function (iPSF) of the spectrograph Ψ(x,y,λ). Therefore a model Ŝ(x,y) for the observed spectrum can be formulated as (x,y)=Ψ(x,y,λ)s(λ)\begin{eqnarray} \hat{S}(x,y)=\Psi(x,y,\lambda)\otimes s(\lambda)\label{eq:2Dconvmodel} \end{eqnarray}(1)where wavelength λ and the positions x and y in the detector plane are continuous variables (the hat indicates the model or best estimate, while without the hat it indicates observations with noise: S(x,y) = Ŝ(x,y) + σ(x,y)). Furthermore, the spectrum is recorded and binned by detector pixels. Therefore it is convenient to use an effective point spread function (ePSF, Anderson & King 2000) and a discrete version of Eq. (1) as in Bolton & Schlegel (2010)x,y=λΨx,y,λsλ\begin{eqnarray} \hat{S}_{x,y}=\sum_{\lambda}\Psi_{x,y,\lambda}s_{\lambda}\label{eq:2dmodel} \end{eqnarray}(2)where x and y now correspond to pixel indices, and a finite integration over wavelength λ still has to be done. The calibration matrix Ψx,y,λ tabulates the effective response function of the spectrograph and detector. In Bolton & Schlegel (2010), Eq. (2) serves as model for “perfect” extraction.

Optimal extraction uses the column numbers as extraction grid (λ → x) and basically assumes 1D slit functions; i.e., any input wavelength λ that corresponds to the pixel x is imaged only to that column, meaning only pixels in the spatial direction y are affected, but no neighbouring columns. Under these circumstances the calibration matrix can be separated as Ψx,y,λ=δx,λψλ,y\begin{eqnarray} \Psi_{x,y,\lambda}=\delta_{x,\lambda}\psi_{\lambda,y}\label{eq:OXTassumption} \end{eqnarray}(3)where δx,λ is the Kronecker delta1 and ψλ,y the wavelength-dependent cross-section. Then the model image in Eq. (2) simplifies to x,y=ψx,ysx.\begin{eqnarray} \hat{S}_{x,y}=\psi_{x,y}s_{x}.\label{eq:FOXmodel} \end{eqnarray}(4)The response ψx,y could be measured directly with a uniform input spectrum sx = 1, but it is much more challenging to determine Ψx,y,λ. In practice, exposures of flat lamps Fx,y are usually taken as part of regular calibration sets. Those flat exposures have high signal-to-noise ratios (S/Ns), and it can be further increased in master flats by coadding many flat exposures. This means that the errors are negligible compared to science exposures and that we can set \hbox{$\hat{F}_{x,y}\simeq F_{x,y}$}. The spectrum of a flat lamp fx is generally not uniform overall, but continuous and featureless, and it varies slowly with wavelength. If fx is known, we measure the response as ψx,y=Fx,yfx\hbox{$\psi_{x,y}=\frac{F_{x,y}}{f_{x}}$} directly. However, fx is usually not known in advance and actually it cannot be measured from the flat exposure alone. It should be characterised externally, either in advance by another flux-calibrated instrument or afterwards by observations of standard stars (with known spectra) with the same instrument as part of the flux calibration step. Since both might not be available, we extract the science spectrum sx relative to the flat spectrum fx and write for the model x,y=Fx,ysxfx·\begin{eqnarray} \hat{S}_{x,y}=F_{x,y}\frac{s_{x}}{f_{x}}\cdot\label{eq:FOXmodel_used} \end{eqnarray}(5)We obtain the spectrum sxfx\hbox{$\frac{s_{x}}{f_{x}}$} by minimising the residuals between observations Sx,y and model Ŝx,y, i.e. solving the linear least-square problem χ2x,ywx,y[Sx,yFx,ysxfx]2=minimum\begin{eqnarray} \chi^{2}\approx\sum_{x,y}w_{x,y}\left[S_{x,y}-F_{x,y}\frac{s_{x}}{f_{x}}\right]^{2}=\mathrm{minimum}\label{eq:FOXchisqr} \end{eqnarray}(6)where wx,y are the weights for each pixel using a noise model σx,y (e.g. photon and readout noise; see Sect. 3). These weights may also include a map Mx,y masking bad pixels and pixels outside the extraction aperture (wx,y=Mx,yσx,y2\hbox{$w_{x,y}=\frac{M_{x,y}}{\sigma_{x,y}^{2}}$}). Setting the derivative of χ2 with respect to sxfx\hbox{$\frac{s_{x}}{f_{x}}$} equal to zero, we get a set of decoupled equations with the solution at each position rxsxfx=ywx,yFx,ySx,yywx,yFx,yFx,y,\begin{eqnarray} r_{x}\equiv\frac{s_{x}}{f_{x}}=\frac{\sum_{y}w_{x,y}F_{x,y}S_{x,y}}{\sum_{y}w_{x,y}F_{x,y}F_{x,y}}\,,\label{eq:FOX} \end{eqnarray}(7)where sxfx\hbox{$\frac{s_{x}}{_{fx}}$} is the best fitting amplitude at each spectral bin x. Figure 1 illustrates the principle of flat-relative optimal extraction (FOX).

Equation (7) is quite similar to the well known optimal extraction equation (e.g. Horne 1986). The difference is that we extract the spectrum sx relative to the flat spectrum fx, and Fx,y is not normalised and includes the natural spatial profile and all flat-field effects.

thumbnail Fig. 1

Principle of flat-relative optimal extraction (FOX). Each of the two upper panels shows a 400 pix × 13 pix section of the reddest HARPS order (fibre A). Pixel-to-pixel variations, fringing, order tilt/curvature, and the spatial profile are present in both the master flat Fx,y (top) and the stellar raw spectrum Sx,y which has additional (telluric) absorption lines (middle). The extracted spectrum sx / fx (bottom) is the scaling factor between both spectra for each column within the extraction mask (red lines).

3. Noise model and error estimation

The pixels on the detector (and therefore the extracted values sx / fx) are affected by noise σx,y. This consists mainly of photon noise σph2=x,y\hbox{$\sigma_{\mathrm{ph}}^{2}=\hat{S}_{_{x,y}}$} (if both σph and Ŝx,y are in units of photon counts)2 and detector noise (e.g. the readout noise σrn2\hbox{$\sigma_{\mathrm{rn}}^{2}$} of a CCD). Therefore, a simple noise model for each pixel is, for instance (now all units in ADU), σx,y2=σrn2+gx,y\begin{eqnarray} \sigma_{x,y}^{2}=\sigma_{\mathrm{rn}}^{2}+g\hat{S}_{_{x,y}}\label{eq:noisemodel} \end{eqnarray}(8)where constant gain g (which converts the number of photoelectrons to ADU)3 and constant readout noise σrn2\hbox{$\sigma_{\mathrm{rn}}^{2}$} are assumed.

With a given noise model, one can predict the extraction uncertainties through error propagation of Eq. (7) var(rx)=ϵrx2=y(rxSx,y)2σx,y2=y(Fx,y/σx,y2ywx,yFx,y2)2σx,y2=\begin{eqnarray} \mathrm{var}(r_{x})=\epsilon_{r_{x}}^{2} & =&\sum_{y}\left(\frac{\partial r_{x}}{\partial S_{x,y}}\right)^{2}\sigma_{x,y}^{2}=\sum_{y}\left(\frac{F_{x,y}/\sigma_{x,y}^{2}}{\sum_{y'}w_{x,y'}F_{x,y'}^{2}}\right)^{2}\sigma_{x,y}^{2}\nonumber \\ & =&\frac{1}{\left(\sum_{y}F_{x,y}^{2}/\sigma_{x,y}^{2}\right)^{2}}\sum_{y}F_{x,y}^{2}/\sigma_{x,y}^{2}\!=\!\frac{1}{\sum_{y}F_{x,y}^{2}/\sigma_{x,y}^{2}}\label{eq:error_abs} \end{eqnarray}(9)(similar to Horne 1986). We also see that the relative error, ϵrxrx=ywx,yFx,y2ywx,yFx,ySx,y,\begin{eqnarray} \frac{\epsilon_{r_{x}}}{r_{x}}=\frac{\sqrt{\sum_{y}w_{x,y}F_{x,y}^{2}}}{\sum_{y}w_{x,y}F_{x,y}S_{x,y}}\,,\label{eq:error_rel} \end{eqnarray}(10)is smaller in regions with high flux Sx,y and is scaling invariant with respect to the flat image Fx,y, because multiplying Fx,y by a constant (as happens for a longer flat exposure time) will cancel out in the ratio in Eq. (10).

The a priori predicted uncertainty in Eq. (9) does not account for the goodness of fit and thus does not indicate profile or noise model mismatches. For this reason, the error estimates from the χ2 modelling are often rescaled by χred2\hbox{$\chi_{\mathrm{red}}^{2}$}(the reduced χ2) to provide a posteriori estimated errors σsx/fx=χred·ϵsx/fx\begin{eqnarray} \sigma_{s_{x}/f_{x}}=\chi_{\mathrm{red}}\cdot\epsilon_{s_{x}/f_{x}}\label{eq:error-post} \end{eqnarray}(11)where χred=1Nνχ2\hbox{$\chi_{\mathrm{red}}=\sqrt{\frac{1}{N-\nu}\,\chi^{2}}$} with χ2 from the global fit in Eq. (6). The number of degrees of freedom N − ν in the denominator is given by the number of unmasked pixels N =  ∑ x,yMx,y within the extraction aperture and the number of fitted parameters ν (extracted spectral values).

One can also consider another, second choice for the scaling factor, namely χred,x to be taken from the individual cross-section fits, i.e. χred,x2=1Nx1χx2\hbox{$\chi_{\mathrm{red},x}^{2}=\frac{1}{N_{x}-1}\,\chi_{x}^{2}$} where only one free parameter is left (ν = 1), χx2\hbox{$\chi_{x}^{2}$} is the weighted sum of the residuals only in column x and Nx =  ∑ yMx,y is the number of unmasked pixels in that column. However, for fibre-fed spectrographs, the extraction aperture is only a few pixels (Nx ~ 10) wide. This gives low number statistics making χred,x itself very uncertain (in contrast to χred2\hbox{$\chi_{\mathrm{red}}^{2}$} which is approximately the average of all χred,x2\hbox{$\chi_{\mathrm{red},x}^{2}$})4. In particular, a considerable number of χred,x will occur with values much less than one, when fitting the profiles along the dispersion axis. However, the predicted errors should not be smaller than the fundamental limit (e.g. photon noise).

As discussed in Horne (1986), cosmic ray hits can be efficiently detected and removed with optimal extraction. Those cosmics distort the profile and can be identified as significant outliers by means of the noise model, e.g. by setting an upper threshold Sx,yx,y>κvar(Sx,yx,y)=κ·(σx,y2Fx,y2ϵsx/fx2),\begin{eqnarray} S_{x,y}-\hat{S}_{x,y}>\kappa\sqrt{\mathrm{var}(S_{x,y}-\hat{S}_{x,y})}=\kappa \cdot \left(\sigma_{x,y}^{2}-F_{x,y}^{2}\epsilon_{s_{x}/f_{x}}^{2}\right),\label{eq:kapsigclip} \end{eqnarray}(12)(see Appendix A for a derivation of this equation), where the clipping value is typically κ ~ 3 − 5 to reject cosmics. A lower threshold could be set as well, e.g. to detected unmasked cold pixels. Moreover, other or additional criteria are also used (Baranne et al. 1996). The outliers are masked and the extraction process is repeated.

We note that the variance in the residual image Sx,y − Ŝx,y is generally not just the pixel variance σx,y2\hbox{$\sigma_{x,y}^{2}$}. In contrast to Horne (1986) the correction term Fx,y2ϵsx/fx2\hbox{$F_{x,y}^{2}\epsilon_{s_{x}/f_{x}}^{2}$} should be applied in Eq. (12). Figuratively, the observed dispersion in the residuals will be noticeably smaller than σx,y2\hbox{$\sigma_{x,y}^{2}$}, because the number of pixels within the aperture is, as already mentioned, small and the profile is usually not uniform, but concentrates most flux in even fewer pixels (~3–4 pixels in fibre-fed spectrographs). Imagine a concentrated profile, where one pixel has a very high weight. This pixel dominates the fit, thus forcing its own residual to zero. As an example, in Fig. 1 the pixel in the profile centre contains about 25% of the total profile flux, and the observed dispersion in this pixel will be smaller than σx,y by a factor of 0.56 in case of readout dominated noise (Eq. (A.16)) and by 0.75 in case of pure photon noise (Eq. (A.17)).

Table 2

Efficiency components of typical echelle spectrographs visible in flat-fields.

4. Conceptual comparison with other optimal extraction implementations

Numerous optimal extraction algorithms (hereafter OXT) exists that are listed in Table 1. Usually, they have to assume a slowly varying spatial profile and differ in the reconstruction method for this profile. Using the spatial profiles of many columns, a mean high S/N model is created. Since OXT aims to estimate “absolute” counts5, the profile functions px,y are normalised to unity (∑ ypx,y = 1).

If there is no or slow order tilt, the profile might be reconstructed by averaging (Hewett et al. 1985) or fitting low-order polynomials in the dispersion direction (Horne 1986), respectively. For echelle spectrographs, this situation occurs only in small sections of the orders, but not in general. For larger tilts, the order trace (which describes the vertical position of the profile along the x direction) appears more or less explicitly in the profile modelling, and sub-pixel grids are introduced. Then neighbouring polynomials in dispersion direction are coupled (Marsh 1989), or recentred cross-sections are fitted by smoothing (spline-like) functions (Piskunov & Valenti 2002; Bolton & Burles 2007). Those algorithms need a good description of the trace to properly position the profile model. In particular, for order regions grazing the detector borders, the order tracing and the application of OXT algorithms is difficult.

An advantage of OXT is that the profile reconstruction can be done on the object spectrum itself. This is crucial for slit spectrographs where the shape and position of the object on the slit and/or detector can be different in the next exposure or change even during an exposure, depending on the seeing, guiding, and temperature. The noise in the mean profiles is then reduced by about the square root of the number of used columns compared to individual profiles (assuming an absorption spectrum; however, this approach may have problems with emission line spectra that have profile information only in a few columns).

The concept of optimal extraction was originally developed for slit spectrographs, and the application to fibre-fed spectrographs seemed natural. If the image shape becomes insensitive to seeing and guiding (as for fibre-fed instruments), one can save the effort of the profile modelling on the object itself (which can become computationally extensive as in Piskunov & Valenti 2002) and a less noisy reference profile model might be taken from a reference image or a flat (Marsh 1989; Baranne et al. 1996). Still, a recentring of the model might be needed to account for sub-pixel shifts (e.g. a nightly shift of 0.1 pixels was reported by Baranne et al. for ELODIE).

If additionally the position of the image is fixed, as for stabilised instruments without mechanical and thermal flexure, then recentring also becomes redundant. In this case there is even no longer any need to model the profile that now can be taken directly from the flat, therefore FOX does not require any choices for model parameters such as the polynomial degree, the subpixel grid size, or the amount of smoothing.

Another benefit of the fixed format is that flat-field effects can be included in the extraction mask, and in this sense FOX does the flat-fielding simultaneously. Also from Eq. (7), we note that there is no need for a pixel-wise division by a flat (an operation one likes to avoid, especially when small numbers occur as in the wings of the cross-sections).

The classical data reduction with OXT requires an additional pre- and/or post-flat-fielding, i.e. before and/or after extraction (Baranne et al. 1996). Flat-fielding has the task of correcting for various multiplicative efficiency effects that have various causes and scales. Locally, there are, e.g., pixel-to-pixel variations owing to different sizes and quantum efficiencies of the detector pixels. Fringe patterns are interference effects that become serious in the red orders and depend on the properties of CCD (layer thickness) and the incident light (wavelength, angle, position). The blaze efficiency along and across the orders depends on the echelle grating and the cross-disperser. Finally, there is also a wavelength-dependent efficiency of the detector and optical elements (e.g. fibre transmission). We may summarise these effects as εx,y = εpixel,x,y εfringe,x,y εblaze,x ελ,x.

Basically, εblaze and ελ are only functions of x and they could be corrected after extraction. However, εpixel and εfringe have spatial components and should be corrected before (or, as in FOX, during) the extraction. If not, the profile mismatches result in less correct flux estimates, and in particular, residing fringe patterns lead to strong, rapid, and systematic profile variations violating the above assumption as needed for the profile modelling.

Data reduction pipelines using OXT create a normalised flat-field image. Those might be obtained as part of the decomposition of flat exposure into a normalised flat image, a normalised profile/order shape, and the blaze function (Piskunov & Valenti 2002). Another technical approach is to use, if existing, wider or moving fibres to get a uniform illumination (similar to a long-slit, Baranne et al. 1996), but fringe patterns are likely not properly captured.

5. FOX in action

thumbnail Fig. 2

Spectra of HD 60753 and τ Cet extracted with FOX. The orders (alternating colours) are not merged. The spectrum is not flux-corrected, but is relative to the spectrum of the flat lamp. The inset shows the overlap between two orders.

thumbnail Fig. 3

Comparison of the extraction quality for two HARPS spectra of the B3IV star HD 60753 taken in the same night (S / N ~ 170). Top: extraction with FOX. Second panel: spectrum extracted with the HARPS DRS pipeline (not blaze corrected). Third panel: ratio of both exposures for FOX (red dots) and DRS (blue dots). Bottom: histogram of the ratio values. The black line is a Gaussian fit to the FOX histogram having a centre at 0.914 and a width of 0.00905.

To demonstrate the performance of FOX, we extract spectra taken with HARPS, an echelle spectrograph at the 3.6 m ESO telescope in La Silla (Chile), located in a pressure- and temperature-stabilised vacuum tank (Mayor et al. 2003). It is fed by two fibres from the Cassegrain focus, a science fibre (A) and a calibration fibre (B) to simultaneously monitor wavelength drift or sky background. HARPS has a resolving power of R = 110 000 and covers the wavelength range 3800–6900 Å with 72 orders (fibre A) on a 4 k × 4 k CCD. Figure 1 shows a section of a HARPS raw spectrum.

We also choose HARPS observations, because there is an elaborated pipeline for HARPS called data reduction software (DRS, version 3.5)6 that already has demonstrated high performance and allows comparison to our extraction results. The DRS employs the Horne (1986) algorithm and since version 3.5 uses coadded flat-fields to define the extraction profile (C. Lovis, Observatoire de Genève, priv. comm.). It is here supposed to represent optimal extraction. Both the raw and reduced spectra are publicly available7.

We extracted spectra for a standard star (HD 60753, B3IV) and a solar-like star (τ Cet, G8V). We used the REDUCE package of Piskunov & Valenti (2002) for the preprocessing: the bias frames were averaged to a master bias; five flat exposures were bias-subtracted, averaged to a master flat and the scattered light was subtracted; one order-location frame (a flat lamp illuminating only fibre A) was used to define the order traces of fibre A; from the science images, the master bias and the scattered light were subtracted (no pre-flat-fielding was performed). Then we extracted the science spectra with our FOX algorithm with a spatial extraction width of ten pixels (five whole pixels below and five above the order trace), σrn = 5 ADU, and g=0.70ADU/eph\hbox{$g=0.70\,\mathrm{ADU}/\mathrm{e}_{\mathrm{ph}}^{-}$}. The wavelength solutions were taken from the DRS pipeline.

Figure 2 shows the result of the FOX extraction for two observations. As can be seen, the extracted orders match in the overlapping regions and seem to be ready to be merged directly. Order merging, however, is not the topic of this work and is not necessary for our RV computations below. Merging also needs some care, since the different resolution and sampling will lead to mismatches in the absorption and emission features, and in this way some information is also lost.

5.1. S/N measurement for the standard star

To measure the extraction quality, we use two observations of the standard star HD 60753, taken three hours apart in the night 2007-03-26; exposures times 9.5 min, and various methods of estimating S/Ns. Figure 3 displays the extracted aperture number 68 (6613–6687 Å) of the standard star (the prominent absorption line is He I).

Using the uncertainties derived from the noise model (Eq. (9)), we estimate for FOX a quadratic mean signal-to-noise of S/N=1nxrx2ϵrx2=169.6\hbox{${\it S{/}N}=\sqrt{\frac{1}{n}\sum_{x}\frac{r_{x}^{2}}{\epsilon_{r_{x}}^{2}}}=169.6$} (exposure 2: 176.9) per extracted pixel around the central n = 100 pixels. Since no uncertainties are provided for DRS spectra, we assume pure photon noise (ϵx=sx\hbox{$\epsilon_{x}=\sqrt{s_{x}}$}). Then the count level in the DRS spectrum implies a mean photon S/N of sx=170.4\hbox{$\sqrt{\left\langle s_{x}\right\rangle }=170.4$} (exposure 2: 178.6)8. These S/N values are similar and provide fundamental limits. In this respect, we note that we have χred = 1.08 for this order.

Another way to measure the S/N independently of a priori error estimates is to analyse the scatter in a continuum region. (This is the reason for choosing a standard star for the comparison.) For the same central region, the S/N per extracted pixel derived as the ratio of the intensity mean and the standard deviation (S/N=sx(sxsx)2\hbox{${\it S{/}N}=\frac{\left\langle s_{x}\right\rangle }{\left\langle (s_{x}-\left\langle s_{x}\right\rangle )^{2}\right\rangle }$}) is 185 (exposure 2: 163) for FOX and 177 (exposure 2: 141) for DRS. These numbers already indicate that the extraction qualities are similar.

In a further comparison (also independent of a priori error estimates), we take the ratio of the two spectra of the standard star and analyse the scatter. Taking the ratio of both spectra cancels out the remaining flat spectrum in the case of FOX and the blaze function in the case of DRS, therefore allowing for a more direct comparison over a wider range and even pixel-wise. We see in the third panel of Fig. 3 that the ratio values of FOX (qx = rx,exp1 / rx,exp2) and DRS (qx = sx,exp1 / sx,exp2) have a similar mean (~0.914, constant over the full order) and are correlated. For both extractions, the scatter appears similar and barely distinguishable by eye. In both cases the scatter increases towards the order edges (since the flux decreases due to the blaze, see second panel of Fig. 3). As before, we measured a S/N from the mean and the standard deviation of the ratio values for the central 100 pixels and find a slightly higher S/N for FOX (S / Nq,FOX = 133 and S / Nq,DRS = 129).

In contrast to the previous method, we can extend this ratio method over the full order and also use regions with stellar lines (assuming that the stellar lines are static, i.e. do not vary in shape and position). A slight, relative shift between both spectra is present owing to the difference in their barycentric radial velocities (212 m/s, ~0.25  pix). The slightly increased scatter noticeable at the position of the strong stellar line is due to this shift as well as to the lower flux level (lower S/N).

The last panel of Fig. 3 shows a histogram for the 4096 ratio values. We see that the ratio values have nearly a Gaussian distribution, and we now measure the mean and dispersion more robustly from a Gaussian fit to the histograms. Using S/Nq=μσ\hbox{${\rm S{/}N}_{q}=\frac{\mu}{\sigma}$}, we find values of 100.77 (FOX) and 100.87 (DRS) for order 68. The same procedure was applied to the other orders and the results plotted in Fig. 4. The overall course of the S / Nq values mostly reflects the instrument efficiency (times the stellar energy distribution) along the order. Both FOX and DRS extraction deliver similar S / Nq with quotients close to unity and deviation of ≲1%, whereas in this example FOX provided slightly higher S/N in the blue orders.

thumbnail Fig. 4

Signal-to-noise S / Nq measured in the spectrum ratio qx of two standard star exposures for several orders from the extraction FOX (red plus) and DRS (blue crosses). The quotient of the S/N values between FOX and DRS (black filled circles) refer to the right axis and are close to unity.

5.2. RV performance for a Sun-like star

As a further, less direct, but probably more relevant proxy for the extraction quality we use radial velocity (RV) measurements. The RV precision depends on the S/N in the observation and the RV information content of the stellar spectral lines (the number and amount of gradients; Bouchy et al. 2001). In particular this means that we are probing the extraction quality in the flanks of spectral lines (rather than in continuum regions as before).

We have chosen an asteroseismology run from the night 2004-10-02 for the star Tau Cet, which is a known RV standard with a dispersion at the 1 m/s level over years (Pepe et al. 2011). This night provides the large number (438) of spectra needed to visualise the close performance of FOX and DRS (Fig. 5). Since all data were taken during the same night, we can defer discussions about the systematics due to the wavelength calibration, which is another crucial step in data reduction.

For the RV measurements we use the method of least-square template matching (Anglada-Escudé & Butler 2012). The RVs are measured over several orders, the ten bluest orders, as well as regions contaminated by telluric absorptions features have been excluded. The top panel in Fig. 5 shows that RV differences resulting from the FOX and DRS extraction are small and at or below the 1 m/s level. The rms is 1.50 m/s with FOX and 1.49 m/s with DRS extraction. Since the DRS pipeline also delivers high-quality RVs measured by cross-correlation with a binary template, we provide a comparison in the lower panel of Fig. 5 showing the impact of using the same extraction method (DRS) but a different RV computation.

thumbnail Fig. 5

Radial velocities of τ Cet in the night of 2004-10-02 computed with the least square method for FOX and DRS extraction (top) and for comparison the radial velocities from cross-correlation for DRS extraction (bottom).

The two examples presented in this section demonstrate that the concept of FOX can work in practice and indicate that FOX has a similar and comparable performance in terms of S/N and RV precision compared to standard optimal extraction.

6. Limits of optimal extraction

FOX and other optimal extraction algorithms assume 1D slit functions that are aligned with the detector columns. However, in practice the slit or fibre image is usually resolved and sampled by a few pixels in the spatial and dispersion direction, and the injection of a monochromatic wavelength causes a 2D PSF that can be seen in emission line spectra. We investigate this effect in data from HARPS.

Figure 6 shows the sharp features of a spectrum from a laser frequency comb (LFC; Murphy et al. 2007; Wilken et al. 2010) observed with HARPS (order 44). The residuals of the extraction (Sx,y − Ŝx,y) are shown in the lower panel, where Ŝx,y is computed with the extracted spectrum rx and Eq. (5)). Of course, the residuals are larger in regions with larger flux (more photon noise). However, a systematic (not random) pattern remains: the residuals are always too low in the peak centre (by ~10%!) and too high up left, as well as down right from the peak centres (this pattern is also similar to the simulation of Bolton & Schlegel 2010; see their Fig. 1). The reason for this pattern is the mismatch between the individual 2D PSF cross-sections and flat-field cross-profiles (which could be thought of as the 2D PSF cross-sections integrated along the disperion axis). Only if the individual 2D PSF cross-sections were self-similar would the mismatch diminish.

We visualise some cross-sections in Fig. 7 at the positions indicated in Fig. 6. This region was selected because the order tilt is small and the LFC peak distances are close to integer values (multiple of 12.0 pixels), i.e. the LFC peaks have only small pixel phase shifts here and samples similar phases of the effective PSF. Each of the 18 profiles (3 cuts for each of the 6 peaks) is normalised to unit area. Obviously the profiles in the flanks have a different form (more peaky and shifted) with respect to the peak centres. The mismatches can be at the 10% level for sharp features (while they are not visible in the flat-field cross-sections taken at the same positions). This will lead to high χred-values in optimal extraction and biased extracted spectrum values in sharp, high S/N features.

The visible, systematic residual tilts in Fig. 6 from bottom left to top right results from an asymmetry in the PSF, although the PSF appears like a symmetric 2D Gaussian at first glance. We presume that the physical explanation for this asymmetry (present everywhere on the detector) lies in the quasi Littrow mode configuration, i.e. the usage of R4 echelle grating with an off-plane angle of 1.5° that amplifies by a factor of ≈8 to a spectral line tilt or shear (Hearnshaw 2009) and deforms the circularly symmetric image of the fibre exit.

The shortcoming of “optimal” extraction might be solved with “perfect” extraction that involves a 2D PSF as outlined in Bolton & Schlegel (2010). It can theoretically also deal with stray light and ghost features. However, besides the increased computational effort, the main problem in practice is to obtain the calibration matrix Ψx,y,λ.

thumbnail Fig. 6

Section of a HARPS laser frequency spectrum Sx,y (top, logarithmic intensity scale) and residuals Sx,y − Ŝx,y after the FOX extraction (bottom, linear scale). Coloured ticks indicate positions of cross-section cuts (see Fig. 7).

thumbnail Fig. 7

Cross-sections at different positions in the laser frequency comb spectrum and flat-field image. The cut positions are at the LFC peak maximum (black asterisks), two pixels before (blue crosses), and two pixels after (red plus), as indicated by the corresponding colored ticks in Fig. 6; the coloured lines connect the corresponding profile means. The profiles are normalised to unit area.

7. Conclusion

We have introduced a method extracting 1D spectra from 2D raw data using flat-field exposures as a measure of the instrumental profile. This method is similar to standard optimum extraction. It does not make any assumptions about the instrumental profile but requires its temporal stability between flat-field and science exposures. The method is well suited to stabilised fibre-fed spectrographs optimised for high-precision radial velocity work.

One of the main advantages of FOX is that the reconstruction of the instrumental PSF becomes unnecessary. Modelling of the PSF is a time-consuming step, and in regions of low signal, the PSF is often ill-defined in science exposures. It is therefore a strong advantage to take the PSF from well-defined flat-field exposures, which is the main idea of FOX. Following this scheme, extraction, masking, flat-fielding, and blaze correction are all carried out in one step without any need for data fitting when the calibration matrix is constructed. The decrease in required CPU time is significant, which is particularly relevant for large surveys like HARPS and CARMENES (Quirrenbach et al. 2012). Furthermore, FOX has no requirements concerning the spectral format (such as slowly varying spatial profiles), relaxes the need for accurate localisation of spectral orders, and does not involve any numerical unstable operations like division by the flat-field.

We compared the performance of FOX to standard optimal extraction using HARPS data and the HARPS data reduction system. The results are very similar with insignificant differences in the S/N. The spectra of individual orders extracted with FOX match well in the overlap regions showing that the inherent blaze correction works. We computed the RV series from HARPS spectra extracted with FOX and DRS and find them indistinguishable in terms of their rms scatter. We conclude that FOX is a highly efficient and very robust method for extracting astronomical spectroscopic data observed with stabilised fibre-fed spectrographs. FOX cannot overcome the limitations caused by tilted PSFs, stray light, or ghost features, but it can significantly improve the robustness and time efficiency of existing and future data reduction procedures.


1

δ ij = { if i = j else . \hbox{$\delta_{ij}=\begin{cases} 1 & \mathrm{if}\, i=j\\ 0 & \mathrm{else}. \end{cases}$}

2

If a sky- or stray light background was subtracted from the image before, its contribution to photon noise should be also taken into account.

3

There are different definitions of the gain in the literature. Here we use S [ ADU ]  = g [ ADU / e ] ·S [ e ]. The photon noise in units of ADU is σph[ADU]=g[ADU/e]·σph[e]=g[ADU/e]·S[e]=g[ADU/e]·S[ADU]\hbox{$\sigma_{\mathrm{ph}}[\mathrm{ADU]}=g[\mathrm{ADU}/{\rm e}^{-}]\cdot\sigma_{\mathrm{ph}}[{\rm e}^{-}]=g[\mathrm{ADU}/{\rm e}^{-}]\cdot\sqrt{S[{\rm e}^{-}]}=\sqrt{g[\mathrm{ADU}/{\rm e}^{-}] \cdot S[\mathrm{ADU}]}$}.

4

χred,x2=1νx1Nx1χx21ν1Nx1xχx2=1νNxνχ2=χred2\hbox{$\left\langle \chi_{\mathrm{red},x}^{2}\right\rangle =\frac{1}{\nu}\sum_{x}\frac{1}{N_{x}\,-\,1}\,\chi_{x}^{2}\approx\frac{1}{\nu}\frac{1}{\left\langle N_{x}\right\rangle \,-\,1}\sum_{x}\chi_{x}^{2}=\frac{1}{\nu\left\langle N_{x}\,\right\rangle -\,\nu}\,\chi^{2}=\chi_{\mathrm{red}}^{2}$}. The approximation becomes an equality if the aperture width is the same for all columns, and no pixels are rejected (Nx = const., N ≈ νNx⟩).

5

Therefore the blaze function and wavelength-dependent efficiencies are not corrected at this stage, and usually sx·εblaze,x·ελ,x is extracted (instead of sx).

8

Accounting for readout noise from, say, 5 spatial pixels decreases the S/N values by about 1(sx+5σrn2/g)/sx~0.3%\hbox{$1-\sqrt{(s_{x}+5\sigma_{\mathrm{rn}}^{2}/g)/s_{x}}\sim0.3\%$}.

Acknowledgments

M.Z. acknowledges support by the European Research Council under the FP7 Starting Grant agreement number 279347. We thank the referee, N. Piskunov, for many useful comments. We are also thankful to C. Lovis and G. Lo Curto for helpful discussions about the HARPS instrument and F. Bauer for reading the manuscript.

References

  1. Anderson, J., & King, I. R. 2000, PASP, 112, 1360 [NASA ADS] [CrossRef] [Google Scholar]
  2. Anglada-Escudé, G., & Butler, R. P. 2012, ApJS, 200, 15 [NASA ADS] [CrossRef] [Google Scholar]
  3. Baranne, A., Queloz, D., Mayor, M., et al. 1996, A&AS, 119, 373 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  4. Bolton, A. S., & Burles, S. 2007, New J. Phys., 9, 443 [NASA ADS] [CrossRef] [Google Scholar]
  5. Bolton, A. S., & Schlegel, D. J. 2010, PASP, 122, 248 [NASA ADS] [Google Scholar]
  6. Bouchy, F., Pepe, F., & Queloz, D. 2001, A&A, 374, 733 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  7. Hearnshaw, J. 2009, Astronomical Spectrographs and their History (Cambridge University Press) [Google Scholar]
  8. Hewett, P. C., Irwin, M. J., Bunclark, P., et al. 1985, MNRAS, 213, 971 [NASA ADS] [CrossRef] [Google Scholar]
  9. Horne, K. 1986, PASP, 98, 609 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  10. Marsh, T. R. 1989, PASP, 101, 1032 [NASA ADS] [CrossRef] [Google Scholar]
  11. Mayor, M., Pepe, F., Queloz, D., et al. 2003, The Messenger, 114, 20 [NASA ADS] [Google Scholar]
  12. Murphy, M. T., Udem, T., Holzwarth, R., et al. 2007, MNRAS, 380, 839 [NASA ADS] [CrossRef] [Google Scholar]
  13. Pepe, F., Lovis, C., Ségransan, D., et al. 2011, A&A, 534, A58 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  14. Piskunov, N. E., & Valenti, J. A. 2002, A&A, 385, 1095 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  15. Quirrenbach, A., Amado, P. J., Seifert, W., et al. 2012, in Proc. SPIE, 8446, 33 [Google Scholar]
  16. Robertson, J. G. 1986, PASP, 98, 1220 [NASA ADS] [CrossRef] [Google Scholar]
  17. Urry, M., & Reichert, G. 1988, NASA IUE Newsletter, 34, 96 [Google Scholar]
  18. Wilken, T., Lovis, C., Manescau, A., et al. 2010, in Proc. SPIE, 7735, 28 [Google Scholar]

Appendix A: Appendix A

To estimate for each pixel the residual variance, var(Sx,yx,y)=var(Sx,y)+var(x,y)2cov(Sx,y,x,y),\appendix \setcounter{section}{1} \begin{eqnarray} \mathrm{var}(S_{x,y}-\hat{S}_{x,y})=\mathrm{var}(S_{x,y})+\mathrm{var}(\hat{S}_{x,y})-2\mathrm{cov}(S_{x,y},\hat{S}_{x,y}),\label{eq:res_var_def} \end{eqnarray}(A.13)we need the pixel variance, which is var(Sx,y)=σx,y2=1wx.y\hbox{$\mathrm{var}(S_{x,y})=\sigma_{x,y}^{2}=\frac{1}{w_{x.y}}$}, and an expression for the model Ŝx,y. Inserting Eqs. (7) and (9) into Eq. (5) gives x,y=Fx,yrx=Fx,yϵrx2ywx,yFx,ySx,y.\appendix \setcounter{section}{1} \begin{eqnarray} \hat{S}_{x,y}=F_{x,y}r_{x}=F_{x,y}\epsilon_{r_{x}}^{2}\sum_{y}w_{x,y}F_{x,y}S_{x,y}\,.\label{eq:Sxy-estimate} \end{eqnarray}(A.14)We see that Sx,y itself contributes to Ŝx,y and a covariance term will persist. Assuming that the pixels are otherwise independent,

the variance of the residuals is given by var(Sx,yx,y)=σx,y2+Fx,y2var(rx)2Fx,y2ϵrx2wx,yσx,y2=\appendix \setcounter{section}{1} \begin{eqnarray} \mathrm{var}(S_{x,y}-\hat{S}_{x,y}) & =&\sigma_{x,y}^{2}+F_{x,y}^{2}\mathrm{var}(r_{x})-2F_{x,y}^{2}\epsilon_{r_{x}}^{2}w_{x,y}\sigma_{x,y}^{2}\nonumber \\ & =&\sigma_{x,y}^{2}-F_{x,y}^{2}\epsilon_{r_{x}}^{2}.\label{eq:res_var} \end{eqnarray}(A.15)This equation can be simplified for three special cases.

  • Case 1. For pixelswith relatively low profile weights(Fx,y2ϵrx2σx,y2\hbox{$F_{_{x,y}}^{2}\epsilon_{r_{x}}^{2}\ll\sigma_{x,y}^{2}$}), e.g. in the wings of spatial profiles, the residual variance just becomes σx,y2\hbox{$\sigma_{x,y}^{2}$}.

  • Case 2. When the pixel variance is the same for all pixels σx,y = σ0, e.g. readout noise dominates, we have (ϵrx2=σ021Fx,y2\hbox{$\epsilon_{r_{x}}^{2}=\sigma_{0}^{2}\frac{1}{\sum F_{x,y}^{2}}$}) var(Sx,yx,y)|σx,y=σ0=[1Fx,y2Fx,y2]σ2·\appendix \setcounter{section}{1} \begin{eqnarray} \left.\mathrm{var}(S_{x,y}-\hat{S}_{x,y})\right|_{\sigma_{x,y}=\sigma_{0}} & =\left[1-\frac{F_{x,y}^{2}}{\sum F_{x,y}^{2}}\right]\sigma^{2}\cdot\label{eq:res_var_readout} \end{eqnarray}(A.16)Moreover, for a uniform profile (Fx,y = 1) the factor in the bracket becomes N1N\hbox{$\frac{N-1}{N}$} (a well known correction factor for the unbiased variance).

  • Case 3. Assuming σx,y2=gx,y=grxFx,y\hbox{$\sigma_{x,y}^{2}=g\hat{S}_{x,y}=gr_{x}F_{x,y}$}, i.e. photon noise dominates, we find (ϵrx2=1yFx,ygrx=1yFx,yσx,y2Fx,y)\hbox{$\left(\epsilon_{r_{x}}^{2}=\frac{1}{\sum_{y}F_{x,y}}gr_{x}=\frac{1}{\sum_{y}F_{x,y}}\frac{\sigma_{x,y}^{2}}{F_{x,y}}\right)$}var(Sx,yx,y)|σx,y2=gx,y=[1Fx,yFx,y]σx,y2.\appendix \setcounter{section}{1} \begin{eqnarray} \left.\mathrm{var}(S_{x,y}-\hat{S}_{x,y})\right|_{\sigma_{x,y}^{2}=g\hat{S}_{x,y}} & =\left[1-\frac{F_{x,y}}{\sum F_{x,y}}\right]\sigma_{x,y}^{2}\,.\label{eq:res_var_photon} \end{eqnarray}(A.17)Again for a uniform profile the prefactor becomes N 1 N · \hbox{$\frac{N\,-\,1}{N}\cdot$}

All Tables

Table 1

Optimal extraction algorithms.

Table 2

Efficiency components of typical echelle spectrographs visible in flat-fields.

All Figures

thumbnail Fig. 1

Principle of flat-relative optimal extraction (FOX). Each of the two upper panels shows a 400 pix × 13 pix section of the reddest HARPS order (fibre A). Pixel-to-pixel variations, fringing, order tilt/curvature, and the spatial profile are present in both the master flat Fx,y (top) and the stellar raw spectrum Sx,y which has additional (telluric) absorption lines (middle). The extracted spectrum sx / fx (bottom) is the scaling factor between both spectra for each column within the extraction mask (red lines).

In the text
thumbnail Fig. 2

Spectra of HD 60753 and τ Cet extracted with FOX. The orders (alternating colours) are not merged. The spectrum is not flux-corrected, but is relative to the spectrum of the flat lamp. The inset shows the overlap between two orders.

In the text
thumbnail Fig. 3

Comparison of the extraction quality for two HARPS spectra of the B3IV star HD 60753 taken in the same night (S / N ~ 170). Top: extraction with FOX. Second panel: spectrum extracted with the HARPS DRS pipeline (not blaze corrected). Third panel: ratio of both exposures for FOX (red dots) and DRS (blue dots). Bottom: histogram of the ratio values. The black line is a Gaussian fit to the FOX histogram having a centre at 0.914 and a width of 0.00905.

In the text
thumbnail Fig. 4

Signal-to-noise S / Nq measured in the spectrum ratio qx of two standard star exposures for several orders from the extraction FOX (red plus) and DRS (blue crosses). The quotient of the S/N values between FOX and DRS (black filled circles) refer to the right axis and are close to unity.

In the text
thumbnail Fig. 5

Radial velocities of τ Cet in the night of 2004-10-02 computed with the least square method for FOX and DRS extraction (top) and for comparison the radial velocities from cross-correlation for DRS extraction (bottom).

In the text
thumbnail Fig. 6

Section of a HARPS laser frequency spectrum Sx,y (top, logarithmic intensity scale) and residuals Sx,y − Ŝx,y after the FOX extraction (bottom, linear scale). Coloured ticks indicate positions of cross-section cuts (see Fig. 7).

In the text
thumbnail Fig. 7

Cross-sections at different positions in the laser frequency comb spectrum and flat-field image. The cut positions are at the LFC peak maximum (black asterisks), two pixels before (blue crosses), and two pixels after (red plus), as indicated by the corresponding colored ticks in Fig. 6; the coloured lines connect the corresponding profile means. The profiles are normalised to unit area.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.