Consistent radial velocities of classical Cepheids from the cross-correlation technique

Accurate radial velocities ($v_{\rm rad}$) of Cepheids are mandatory within the context of distance measurements via the Baade-Wesselink technique. The most common $v_{\rm rad}$ derivation method consists in cross-correlating the observed spectrum with a binary template and measuring a velocity on the resulting profile. Yet for Cepheids, the spectral lines selected within the template as well as the way of fitting the cross-correlation function (CCF) have a significant impact on the measured $v_{\rm rad}$. We detail the steps to compute consistent Cepheid CCFs and $v_{\rm rad}$, and we characterise the impact of Cepheid spectral properties and $v_{\rm rad}$ computation method on the resulting line profiles. We collected more than 3900 high-resolution spectra from seven different spectrographs of 64 classical Cepheids. These spectra were standardised through a single process on pre-defined wavelength ranges. We built six correlation templates selecting un-blended lines of different depths from a synthetic Cepheid spectrum, on three different wavelength ranges from 390 to 800 nm. Each spectrum was cross-correlated with these templates to build the corresponding CCFs. We derived a set of line profile observables as well as three different $v_{\rm rad}$ measurements from each CCF. This study confirms that both the template wavelength range, its mean line depth and width, and the $v_{\rm rad}$ computation method significantly impact the $v_{\rm rad}$. Deriving more robust Cepheid $v_{\rm rad}$ time series require to minimise the asymmetry of the line profile and its impact on the $v_{\rm rad}$. Centroid $v_{\rm rad}$, that exhibit slightly smaller amplitudes but significantly smaller scatter than Gaussian or biGaussian $v_{\rm rad}$, should thus be favoured. Stronger lines are also less asymmetric and lead to more robust $v_{\rm rad}$ than weaker lines.


Introduction
Cepheids are essential extra-galactic distance standards as their period of pulsation (P) correlates tightly with their absolute luminosity, a (mostly empirical) relationship known for more than a century as the Leavitt law, or the Cepheid Period-Luminosity (hereafter P-L) relationship (Leavitt & Pickering 1912). They have thus vastly contributed to precision cosmology, and especially to the measurement of the Hubble constant H 0 (Freedman & Madore 2010;Riess et al. 2011Riess et al. , 2016Riess 2018). Their high-amplitude pulsations, that generally follow a relatively clear pattern, also make them targets of choice for spectroscopic studies looking for a better understanding of the stellar structure.
Radial velocities (v rad ) of Cepheids have thus long been of high interest. First, they allow to detect and characterise Cepheid stellar companions (i.e. single-line spectroscopic binaries or SB1s), that have revealed themselves to be widespread (see e.g. Evans et al. 2015;Anderson et al. 2015Anderson et al. , 2016Gallenne et al. 2013Gallenne et al. , 2015Gallenne et al. , 2019Gallenne et al. , 2018. The most common application of Cepheid v rad is their use for the measurement of the Cepheid distances through the Parallax-of-Pulsation method, also known as the Baade-Wesselink (BW) technique (Lindemann 1918;Baade 1926;Wesselink 1946). The BW method allows to derive directly the distance in a quasi-geometrical way through the ratio of the Cepheid linear radius variation (∆R, measured spectroscopically) over the angular diameter variation ∆θ (see recent examples in e.g. Kervella et al. 2004;Gallenne et al. 2012Gallenne et al. , 2017Mérand et al. 2015;. The linear radius variation ∆R is assumed to be proportional to the v rad curve integrated over the pulsation phase. To understand how, it is first necessary to remind what exactly represents the radial velocity (see also Lindegren & Dravins 2003, on the definition) in the context of a pulsating star and especially a Classical Cepheid. Cepheids are radial pulsators with a given photospheric pulsation velocity (v puls ) at a given phase of the pulsation period. v puls is a true physical quantity, whereas we measure the resulting Doppler shift of the received stellar light integrated over the stellar disk and projected on our line of sight. This Doppler shift is measured either on a single spectral line or on the cross-correlation function (CCF) of the full spectrum with a correlation template. Each region of the stellar disk is more or less Doppler-shifted and contributes more or less to the total spectrum depending on its position on the disk and on the limb-darkening. As a consequence, the Doppler shift of the spectrum integrated over the stellar disc (and the corresponding v rad ) is mitigated by a certain amount compared to the physical v puls . This amount is called the projection factor (or p-factor) and it rounds up all the various sources of physical and spectral variability (Burki et al. 1982;Nardetto et al. 2007). It is also the key parameter of classical BW methods as it is fully degenerate with the distance .
Another consequence of the radial projection effect is that each line (and thus the global line profile) of the total Cepheid spectrum is intrinsically asymmetric (Sabbey et al. 1995;Nardetto et al. 2006). This has a strong and variable effect on the measurement of the Doppler shift of the line, depending on the measurement method. Furthermore, it is now common knowledge that Cepheid spectral lines are shifted differently depending on their formation region or depth, i.e. that an atmospheric velocity gradient is present. Thus, any Cepheid v rad measurement is affected, depending on which spectral lines are used (Nardetto et al. 2007;Anderson 2016). In short, to each v rad measurement corresponds a different p-factor (and thus potentially a different distance estimation). Finally, at a deeper precision level, additional phenomena might introduce uncertainty on the v rad and p-factors: shockwaves , convective blueshift (Nardetto et al. 2008), or cycle-to-cycle variability (Anderson 2016).
Cepheid studies do not always clarify how the v rad were computed. Furthermore and most often, the template used to crosscorrelate the spectra is not described or provided, except in very few cases (Brahm et al. 2017). However, given the context developed above, authors should always make clear what spectral lines were used to derive the v rad and how the v rad were computed, as stressed by Anderson (2018).
We present here a consistent spectroscopic survey of 64 Classical Milky Way (MW) Cepheids. It includes for each target the CCFs, various line profile observables, and several v rad time series built from different correlation templates and computation methods. In Sect. 2, we describe our Cepheid sample, the observations performed and the data used in this study. We introduce 10 0 10 1 10 2 P (day) the principles of our framework and the main outputs in Sect. 3. We then apply our method to derive a consistent set of CCFs, observables and v rad for our full Cepheid sample. We discuss and characterise the results of the survey in Sect. 4. We finally conclude on the perspectives and possible applications of this survey in Sect. 5.

Sample
Our sample is made up of 64 Classical Galactic Cepheids with pulsation periods in the range ∼2 to ∼68 days. We provide the full detail of our sample in Table A.1 (Appendix A). We selected our targets based on the number of available high-resolution spectra (and the number of corresponding distinct observation epochs) that we could recover either from new observations or from spectroscopic archives (see below). We selected only targets with a good enough sampling and coverage (i.e. 20%) of the pulsation phase. We show the pulsation period distribution of our sample in Fig. 1. Most of our targets have a period between a few and 20 days, but we also include a small number of Cepheids with periods up to several tens of days in order to scan as well as possible the whole Cepheid period range.

Data
We gathered a total of 3919 high-resolution (40 000 R = λ/dλ 115 000) spectra acquired with seven different echelle spectrographs. These instruments are located in both hemispheres and allow us to cover a wide wavelength range (from the near ultraviolet to the near infrared), depending on their respective characteristics (Table 1).

New observations
We detail here our new observations that were not published before. Our observational strategy was adapted to maximise the pulsation phase coverage for each target.
SOPHIE -From June 2013 to August 2018, we observed 30 northern Cepheids with the fibre-fed SOPHIE spectrograph Table 1. Spectrographs implemented in this study. Column 3 gives the spectral resolution R and col. 4 the instrumental (systematic) v rad uncertainty. Col. 5 gives the wavelength range(s) ∆λ on which we standardised and reduced the spectra (see text). Columns 6 and 7 give the number of observed Cepheids N and the total number of acquired spectra N sp for each instrument.

Archive data
For other spectrographs, we collected spectra that we already used in previous studies or that we retrieved from the ESO public archive 2 .
HARPS -We collected more than 400 archive spectra obtained with the High-Accuracy Radial-velocity Planet Searcher spectrograph mounted on the 3.6m telescope at La Silla observatory 1 http://atlas.obs-hp.fr/sophie/. 2 http://archive.eso.org/wdb/wdb/adp/phase3_main/ form (HARPS, Pepe et al. 2002). The HARPS spectrograph has the highest spectral resolution (R ∼115 000) and the best v rad precision among all spectrographs currently implemented in our framework (Table 1). This highlights the importance of these data. We already presented most of these spectra in previous studies (Nardetto et al. 2006(Nardetto et al. , 2007(Nardetto et al. , 2009. We retrieved the remaining spectra from the ESO database. CORALIE -We also gathered almost 500 spectra that we acquired with the CORALIE spectrograph (Queloz et al. 2001b) at La Silla Observatory from 2013 to 2017. We already used some of these spectra in previous studies (see e.g. Gallenne et al. 2019).
HARPS-North -We used 103 high-resolution, high-S/N ( 250) HARPS-North spectra of δ Cep. The HARPS-North spectrograph is mounted on the TNG telescope at La Palma observatory (see Cosentino et al. 2012, and Table 1). We already presented these spectra in a previous study that highlighted their quality (Nardetto et al. 2017).

FEROS -
We finally retrieved almost one hundred FEROS spectra from the ESO public database. The FEROS instrument is a fibre-fed spectrograph mounted on the 2.2m MPG telescope at La Silla Observatory with a spectral resolution of R ∼48 000 (Kaufer & Pasquini 1998). The FEROS data allowed us mostly to add three medium-to long-period Cepheids to our sample (UZ Cet, AV Sgr and V340 Ara, see Table A.1).

Principle: the CCF as the spectrum proxy
Because of the physical characteristics of Cepheids, their location on the Hertzprung-Russell diagram, and their moderate rotation rates, Cepheid spectra typically exhibit hundreds or even thousands of narrow absorption lines. It is then possible to measure the Doppler shift (and hence, the v rad ) for specific single lines of different depths (see e.g. Nardetto et al. 2006). However, single-line v rad may differ from one line to another, due to velocity gradients or spectral peculiarities. The single-line v rad precision also depends on the spectrum S/N, which might limit its use to high-or very high-S/N spectra (typically with S/N above 75 to 100; see e.g. Nardetto et al. 2006;Meunier et al. 2017).
An efficient approach to derive accurate v rad that are representative of the full spectrum is to use a proxy for the spectrum instead of the spectrum itself. In other words, it consists in building a single global line profile that combines the useful information from all the spectral lines (Doppler shift, depth, width and asymmetry). This description corresponds well to the CCF. Building the CCF consists in cross-correlating the spectrum with a pre-defined template that is successively Doppler shifted (Baranne et al. 1979;Queloz 1995). This template can be a synthetic spectrum (e.g. for the Gaia Radial Velocity Spectrograph, see Katz et al. 2019), a reference built from all spectra (Galland et al. 2005), or, more generally, an adapted binary template (also named a binary mask, see e.g. Queloz 1995;Pepe et al. 2002). The binary designation refers to the fact that the template is equal to 1 (or > 0) at the wavelengths of the selected spectral lines and is equal to 0 everywhere else (spectrum continuum or rejected lines). Hence, the CCF includes the contribution of all the spectral lines included within the correlation template. It then provides a much higher S/N (and a better v rad accuracy) compared to a single line (Pepe et al. 2002;Anderson 2018). Alternative techniques or mathematical functions such as the spectral broadening function (Rucinski 1992(Rucinski , 1999, the auto-correlation function (Borra & Deschatelets 2017), or the least square deconvolution method (Britavskiy et al. 2016), have been proposed to characterise Cepheid line profiles. Still, the CCF remains both the easiest and most widely used method to study line profiles of Cepheids and other stellar pulsators (see e.g. Nardetto et al. 2009;Anderson 2018).
Specifically, the CCF built from a binary correlation template allows us to select which spectral lines to take into account in the global line profile and thus in the v rad computation. This is a key factor to consider when trying to build homogeneous Cepheid v rad time series. Given its relative simplicity, its widespread use, and its high interest in terms of line selection and template customisation, we thus decided to use CCF built from tailored correlation templates within our framework.

Standardising the spectra
The main inputs of our framework are high-resolution spectra, pre-processed using the dedicated instrument pipelines. They are in a one-dimensional (1D) format (i.e. wavelength versus flux; typically the s1d format produced by ESO spectrograph data reduction systems or DRS), with the spectral orders already reconnected and re-sampled in wavelength. We chose the 1D format because it is a default product of the DRS of all the instruments included in this study, contrary to the 2D (e2ds) format which is not produced for UVES and FEROS to our knowledge. Furthermore, using the spectrum 1D format instead of the 2D format allows us not to have to correct for the instrumental response by spectral order (i.e. the blaze function), which would otherwise oblige us to introduce additional processing steps that might be less consistent from instrument to instrument. Given that classical Cepheid spectra typically exhibit thousands of narrow absorption lines, the order re-connection does not have any significant visual impact on the re-connected spectrum (Fig. 2, top). The wavelength re-sampling typically leads to wavelength steps slightly smaller than the original CCD sampling (typically The red solid line is normalised to unity. On both plots, the insert is a zoom on a 100-Å slice of the spectrum (in blue).
0.01 -0.02 Å), meaning that the resulting 1D spectra are slightly over-sampled 3 , which should not introduce any significant uncertainties. With the same view of achieving the greatest consistency of our CCF computation process, we consider that standardising our input spectra is a necessary step given the variety of the included spectrographs. First, we define three wavelength ranges (∆λ) on which the input spectra are cross-correlated. These ranges are fixed and defined to cover as much as possible the wavelength domains of the implemented spectrographs ( Second, we extract the continuum from each spectrum and normalise the spectrum following the same process for all spectrographs. Briefly, we build a continuum function or envelope by interpolating between a certain number of points that depends both on the spectrograph's resolution R and on the absorption line density along the spectrum (Fig. 2). Each interpolation point is carefully computed to be representative of the spectrum's continuum on the corresponding wavelength slice, by considering the flux value at 99% of the highest intensity over the considered slice while excluding possible cosmics. A similar normalisation process was used and validated by Meunier et al. (2017). Broad deep lines (Hydrogen Balmer and Paschen series, as well as Calcium H and K lines) are excluded from the continuum function building. We simultaneously correct for the remaining cosmics, if any. The spectrum is then normalised by dividing it by this continuum function or envelope. Next, we correct the spectrum from the barycentric Earth radial velocity (or BERV), if necessary. Finally, we select the spectra to cross-correlate based on their S/N (taken at 5500 Å in the case of the green wavelength range). We empirically put our S/N threshold to 30 to ensure a reasonably good CCF computation.

Building tailored correlation templates
This step forms the focal point of our approach. Indeed, we use our custom correlation templates to cross-correlate all observed spectra on a given wavelength range regardless of the target and of the spectrograph. We illustrate our template building process in Fig. 3 and we display the main properties of our templates in Table 2. To build our templates, we first generated a reference Cepheid synthetic spectrum over our three wavelength ranges. To do so, we used the radiative transfer PHOENIX code . PHOENIX is a non-Local Thermodynamic Equilibrium (nLTE) atmosphere model code, that uses spherically symmetric radiative transfer in the case of giant stars such as Cepheids . In terms of stellar physics, we adopted solar metallicities, which are suitable for Classical MW Cepheids with typical spectral types in the F8 to G5 range. We adopted a model with T eff = 5250 K and log g = 1. This corresponds to a Cepheid somewhat colder and with a log g slightly smaller than the average of our sample (i.e. it roughly corresponds to the average T eff and log g of a P ∼15to ∼20-day Cepheid, see e.g. Kovtyukh et al. 2005). We chose these T eff and log g values as a compromise between obtaining a reference synthetic Cepheid spectrum with as much unblended spectral lines of different depths as possible and staying close to the average properties of our sample. As a consequence, our adopted synthetic spectrum corresponds to a slightly later spectral type than the G2 templates classically used in the literature. We generated our synthetic reference spectrum over our three wavelength ranges with R = 115000, corresponding to our spectrograph with the highest resolution (HARPS, see Table 1). From our reference spectrum, we selected the spectral lines to be included within our correlation templates following an approach similar to Hindsley & Bell (1986). First, we considered only lines stronger than 0.1 in relative depth (i.e. with a minimum normalised flux below 0.9), as an arbitrary limit between meaningful lines and the continuum of the spectrum. Then, we selected the lines to include based on their relative depth, in order to probe lines forming at different optical depths. We considered three line depth ranges (i.e. shallow, intermediate-depth and deep lines) and we built three corresponding templates on the green wavelength range (weak, medium and deep templates). We selected only un-blended lines separated from adjacent lines by more than 0.4 Å for deep and intermediate lines, and by more than 0.7 Å for shallow lines. Finally, we excluded any lines within wavelength ranges around broad non-metallic lines such as the Hydrogen Balmer and Paschen series (Anderson 2018), as well as the Calcium II H and K lines and the Calcium II near-infrared triplet. We also excluded lines within wavelength ranges corresponding to strong telluric lines, based on inputs from the ESO MOLECFIT tool (Smette et al. 2015) and on high- Table 2. Main characteristics of our correlation templates. T stands for the template transparency according to Baranne et al. (1979), i.e. it represents the template t(λ) transmission weighted by the covered wavelength range (∆λ): T = 1/∆λ × ∆λ t(λ) dλ. N stands for the number of lines included in the template and σ stands for the template mean line width (in Å). The characteristics of the HARPS DRS G2 template considered over our green range (last line) are showed for comparison (see text).  (Table 2). We tailored these three depth-specific templates (weak, medium, and deep) to have a number of lines of the same order of magnitude, average wavelengths as close as possible and average line depths as different as possible to investigate the specific impact of the template average line depth on the v rad (Sect. 4.6).
Next, we built a fourth template (all) including all the lines selected within the three previous templates on the green range. This time we weighted the template lines proportionally to their relative depth within the PHOENIX spectrum (Fig. 3, bottom). Such weighting is typical of the default templates used within the DRS (Pepe et al. 2002). We built this all template: i) to have reference v rad time series based on a template with a higher number of lines (Sect. 4.1); ii) for comparison with default weighted DRS templates (Sect. 4.5); and iii) to have a reference for the comparison of our depth-dependent templates (Sect. 4.6). Weighting the lines means that stronger lines have more weight (more impact) than weaker lines within the CCF and v rad computation. Thus, the average line depth of the all template is 0.63 (weighted lines) instead of 0.58 without line weighting, while the other template characteristics (mean wavelength, mean line width) do not significantly change. This also allows us to have a fourth template more distinct in terms of average depth from the medium template (average depth ∼0.56, Table 2).
Because of our strict spectral line selection, each of our three depth-specific templates includes ∼150 to ∼220 lines only. In contrast, the default DRS templates with a spectral type closest to classical Cepheids (i.e. typically G2-type templates adapted to Main-Sequence dwarfs) include thousands of often blended lines (Anderson 2016). As an example, the HARPS DRS default G2 template considered over our green range includes more than 1700 lines (see Table 2 and Fig. 3, bottom). We also note that such DRS templates have very narrow lines (typical width  Table 2) vs. wavelength for our green λ range. The line density of our reference spectrum is plotted in black and the transparency of our four correlation templates in red, blue, green, and purple (all, weak, medium, and deep templates, respectively). The template transparency is equivalent to the selected line density within the template (see text).
σ < 0.1 Å). In contrast, we fixed our template line width by considering the selected line width at 90% of the continuum within our PHOENIX synthetic Cepheid spectrum, i.e. average line widths larger by a factor 4 to 6 compared to the G2 template. We discuss in more details the rationale for this choice and its impact in Sect. 4.5. We finally note that our correlation templates are sampled with a fine wavelength step of ∼0.02 Å, meaning that our template lines typically cover 15 to 25 wavelength pixels.
The distribution of selected lines within our templates generally follows the spectral line density of our reference spectrum (i.e. the line distribution as a function of wavelength), with dips corresponding to telluric or broad line exclusion ranges (Fig. 4)  . From left to right and top to bottom: CCF, normalised CCF, normalised CCF Gaussian fit, normalised CCF biGaussian fit. On all plots the CCF is displayed as a black solid line, the CCF continuum as a solid straight red line and the v rad at the CCF minimum as a vertical dotted black line. On the top left plot, the CCF core v rad range (i.e. ∆ core ) is highlighted in green. On the top right plot, the area covered by the CCF core and used for the EW and RV cc−c integration is highlighted in green.
CT and D designate the contrast and depth of the CCF core, respectively (as defined in the text). On the bottom right plot, the FWHMs corresponding to the blue and red parts of our biGaussian model are displayed as a blue or red arrow, respectively. The inserts are zooms in v rad on the CCF core.
induced by the increased line density (i.e. a higher line blending) in the bluer part of our reference spectrum. In addition, we built two other templates including intermediate-depth lines but covering this time our blue and red wavelength ranges. The number of included lines is relatively small, due to either the reduced wavelength span and strong line blending (in the case of the blue range) or the importance of the telluric ranges (in the case of the red range). However, our three medium templates have nearly similar average depths and average wavelengths separated by more than 1000 Å from each other ( Table 2), to investigate the potential impact of the template wavelength range on the v rad (Sect. 4.4).

Characterising the CCF
We cross-correlate each of our reduced spectra with our tailored templates, depending on the spectrograph and the covered wavelength range. We aim at extracting as much information as pos-sible from the CCF, not only the selected Doppler shift or radial velocity. To do so, we use and derive different estimators of the shape of the CCF profile, that we discuss here. We specifically discuss the v rad measurements later in Sect. 3.5. We give more technical details and formulae for the observable derivation in Appendix B. We illustrate our various observables in Fig. 5 and summarise them in Table 3.
CCF core, wings, and continuum -Cepheid spectra may exhibit strongly asymmetric lines induced by the line-of-sight projection of the pulsation velocity. Given that we exclude most blended lines from our correlation templates (i.e. lines that usually smooth the CCF if taken into account), the typical shape of our CCFs may deviate significantly from that of a Gaussian (Fig. 5, and see e.g. Queloz 1995). First, our CCFs typically exhibit two significant bumps or shoulders on both sides of the CCF core; second, the CCF shape outside of the CCF core is not completely flat. These effects are more noticeable for our CCFs than for CCFs computed with typical DRS templates because we use templates with a relatively small number of lines of variable depth, and because we reject most blended lines. Most often, studies that present CCF profiles show only the CCF core and not the CCF wings. We consider necessary to take into account both the core and the wings of the CCF for a proper CCF characterisation within our study. We compute our CCFs on an extended v rad grid ranging from -200 to 200 km s −1 in order to sample adequately the full CCF profile. The respective height of the two CCF shoulders depends on the CCF asymmetry and the direction of the Doppler shift, i.e. the CCF left shoulder is higher when the spectrum is blue-shifted (Fig. 5) and reciprocally. We define the CCF core as the area centred on the CCF main peak and below the CCF lower shoulder for practicality. Then the CCF wings include the whole v rad ranges outside of the CCF shoulders, and the CCF pseudo-continuum (C o ) is defined as the average value of the CCF wings. The CCF can be normalised by dividing it by the value of C o , as done for the input spectra.
Modelling the CCF -We derive some of our line profile observables through fitting the CCF core by parameterised models. We use here both a classical four-parameter Gaussian model (offset, depth, width and Doppler shift) and a biGaussian model (offset, depth, asymmetry and Doppler shift) where we distinguish between the blue and red parts of the CCF core (see more details in Appendix B).
CCF depth and width -We define the CCF core depth (D) as the difference between the maximum and the minimum of the CCF core, that we either measure directly on the CCF or through the Gaussian or biGaussian models. We distinguish this CCF depth from another separate observable that we name here the CCF contrast (CT ) and that we define as the difference between our CCF continuum and the minimum of the CCF core (Fig. 5, top right). Our main observable for the CCF width is the Full Width at Half Maximum (FWHM) of the CCF core Gaussian model (Fig. 5, bottom left). Finally, we compute the Equivalent Width (EW) of the CCF core in a similar way as what is commonly done for single spectral lines (see e.g. Kovtyukh et al. 2005). The EW is a mixed proxy of the CCF core depth and width. The variability of the depth and width of the CCF profile is directly related to the various quantities that have a broadening effect on the spectral lines (i.e. the pulsation, but also and mostly the effective temperature T eff , the turbulence and the Cepheid rotation rate). For example, the FWHM has been used as an es-timator of the micro-turbulence velocity (Borra & Deschatelets 2017).
CCF asymmetry -We derive the bisector of the CCF core as a classical way to estimate its asymmetry. Our proxy is the Bisector Inverse Span (BIS, Queloz et al. 2001a), i.e. the v rad difference between the top and the bottom of the CCF core bisector (Appendix B). According to authors (Anderson 2016;Britavskiy et al. 2018;Anderson 2019), the BIS is a good estimator of the line profile asymmetry of stellar pulsators such as Cepheids.
Another asymmetry proxy is the line asymmetry estimator defined by Nardetto et al. (2006) based on the biGaussian model of single spectral lines. By analogy, we derive a similar asymmetry proxy from our CCF biGaussian model through the comparison of the width of the blue and red parts of the CCF core (Fig. 5, bottom right, and see Appendix B).
CCF quality -Depending on the number and the strength of the lines selected within, the correlation template has a direct impact on the resulting CCF, its global shape and its depth. In particular, selecting lines of reduced strength (i.e. shallower lines) reduces the amount of signal contained by the CCF core with respect to the CCF wings. Thus, the reliability and accuracy of the derived v rad and line profile observables will be impacted. Here, we find it necessary to assess the quality of our derived CCF, in order to figure how much confidence we can put on our observables. We define the two following criteria (see formulae in Appendix B): 1. a CCF quality factor Q, defined as the ratio of the CCF contrast CT over the CCF short-range v rad variability, i.e. the standard deviation of the difference between the original CCF and the CCF smoothed over a given v rad window (CCF smth in Fig. 5 top left); 2. a CCF signal-to-noise ratio (SNR CCF ), defined as the ratio of the CCF core depth D over the standard deviation of the CCF wings.
The Q criterion estimates how much noisy or dispersed is the whole CCF and how well the CCF core can be distinguished from the CCF wings. This is important with respect to the convergence and reliability of our CCF Gaussian or biGaussian models and our observable automatic computation. In the following, we adopt an arbitrary minimal threshold of Q = 4 for good-quality CCFs (more details in Sect. 4). Baranne et al. (1979) also defined a CCF quality factor, but it was directly dependent on the spectrum exposure time and its photon noise.
Here our Q criterion is purely CCF-specific. Defining such a quality criterion helped us to build our line depth-specific cor- Based on δ Cep HARPS-North spectra cross-correlated with our all template on our green range. On the bottom left plot, δ RV denotes the difference between our Gaussian (RV cc−g ) and biGaussian (RV cc−2g ) v rad . Measurements are displayed as coloured dots and interpolated spline curves as red solid lines, if any. relation templates. Our second criterion SNR CCF estimates the amount of signal within the CCF core, weighted by the CCF wing global dispersion. We used SNR CCF to derive the uncertainties on some of our CCF-based proxies (see Appendix B).

Computing the radial velocities
We emphasise here again that any Cepheid (or pulsating star) v rad measurement is somewhat biased with respect to the pulsation-induced asymmetry of the line or CCF profile. We de-cided here to merely implement three different (and well-known) ways to compute v rad measurements. We did not try to definitely assess which method is to be preferred. Generally speaking, we consider that several different v rad computation methods should always be considered for Cepheids and pulsating stars (Burki et al. 1982).
Centroid  Combined v rad of FF Aql. Top: centroid v rad (RV cc−c ) obtained with the all template on the green range for four spectrographs, phased (φ) along the Cepheid pulsation period and not binary-corrected. Bottom: the same, but the v rad have been corrected from the Keplerian orbit of the binary companion. Our best spline curve is displayed as a red solid line. The v rad are corrected only from the Keplerian orbit (no offset correction), based on the orbital parameters recently computed by Gallenne et al. (2019). In the red box is one of our few CORALIE v rad outliers (see text), that we did not take into account to build the spline curve.
Appendix B). In the same way as done by Nardetto et al. (2006) for single spectral lines, the CCF RV cc−c corresponds to the first moment of the CCF core profile (see Hindsley & Bell 1986, and Fig. 5, top right). Studies of the Cepheid p-factor decomposition have favoured single-line centroid v rad compared to other singleline v rad computation methods as they are independent from rotational and turbulent broadening (Burki et al. 1982;Nardetto et al. 2006). However, they require a high enough S/N.
Gaussian v rad -The most classical and most widely used way to compute the v rad is to fit the CCF profile with a Gaussian model (Fig. 5, bottom left). The derived Gaussian radial velocity (hereafter RV cc−g ) is less sensitive to scatter in the spectrum than the CCF first moment, and is thus more stable and less dependent on the S/N of the spectrum (Anderson 2018). However, RV cc−g is potentially biased for Cepheids and other stellar pulsators as it accounts badly for the CCF profile asymmetry at high pulsation velocities (Nardetto et al. 2006). res. Fig. 8. Comparison of v rad computation methods for FM Aql SOPHIE v rad (green range, medium template). The x-axis corresponds to RV cc−c . RV cc−g and RV cc−2g are displayed as blue diamonds and red dots, respectively. We point out that the uncertainties on all v rad are displayed but are not necessarily visible. The best linear regressions are displayed as cyan and orange solid lines (along with their 1σ uncertainty in same-colour shades), respectively. The black dashed line corresponds to a slope of 1. The bottom insert corresponds to the residuals of the two linear regressions (same colour code).
biGaussian v rad -A solution to reproduce more closely the CCF asymmetry is to fit the CCF core with a biGaussian model instead of a simple Gaussian (as first done by Nardetto et al. 2006, for single-line v rad computation), i.e. by fitting separately the blue and right parts of the CCF core profile (Fig. 5, bottom right). This gives us a third, biGaussian, v rad value (hereafter RV cc−2g ).

A consistent catalogue
We computed the CCFs, corresponding line profile observable and quality proxy time series, and corresponding v rad time series for our whole 64-Cepheid sample, using our six correlation templates and the input spectra standardised on our three wavelength ranges. This makes up a large, homogeneous catalogue of MW Cepheid CCF and v rad . The full catalogue is to be published on-line, i.e. the CCFs, various time series as well as our correlation templates. We display a (small) example of our data (e.g. CCFs, v rad and line profile observables deduced from δ Cep HARPS-North spectra cross-correlated on the green range with the all template) in Fig. 6. The displayed data look robust and behave as expected along the pulsation phase. The CCF asymmetry proxies (BIS and biGaussian asymmetry) show the same behaviour and are correlated to the difference between bi-Gaussian and Gaussian v rad (in agreement with the results of Anderson 2016). In this example, our CCF quality factor Q is high and nearly constant at all phases, which we will show latter to be a sign of good CCF quality (Sect. 4). Our CCF SNR CCF behaves in correlation with the CCF depth, as expected: it is the highest when the CCF is the deepest, i.e. when the Cepheid reaches it largest radius. On the opposite, the CCF FWHM is the largest at the end of the contraction phase (Nardetto et al. 2006). We provide the v rad time series as processed, i.e. with the corresponding observation Modified Julian Day (MJD) but without any correction done on the v rad , except for the correction from the BERV done on the input spectra themselves if necessary (Sect. 3.2). Thus we provide the v rad without correcting for a potential binary companion and without removing the absolute star v rad (or systemic velocity). For some of our targets (most of them being also spectroscopic binaries, see below), we have enough data to sample the pulsation phase adequately with several spectrographs. However, the data are not enough for us to clearly estimate instrumental v rad offsets from spectrograph to spectrograph because such offsets are typically of the order of 100 m s −1 or below, i.e. very small compared to the Cepheid v rad variability (pulsation-and binary-induced). Some of these instrumental spectrograph-to-spectrograph v rad offsets have been previously measured with a great accuracy (Soubiran et al. 2013;Gallenne et al. 2018). In terms of outliers, we have only to note a very few v rad problematic data (i.e. that exhibit unexpected offsets for all templates and all methods). Those concern a few CORALIE spectra acquired between MJD 57196 and MJD 57206 (i.e. in mid-June 2015, see Fig. 7, bottom plot).

Spectroscopic binaries
Most of MW Classical Cepheids have been revealed to be components of binary or even multiple systems . Here, we are mainly concerned with single-lined spectroscopic binaries (SB1s), i.e. for which the companion signature is noticeable in the derived primary Cepheid v rad . We detected a number of 18 unambiguous SB1s within our Cepheid sample, with 11 other Cepheids exhibiting v rad scatter that hint towards a companion (Table A.1). For some of these binaries, new derivations of the companion orbital parameters were recently performed based on the data presented here in published studies (see Gallenne et al. 2019, 2018, and Fig. 7). We redirect the interested reader towards the latter studies, since Cepheid SB1s are not the focus of the present study. Nonetheless, we propose in Appendix C a new estimation of the companion orbital pa-rameters for two of our SB1 targets (SU Cyg and V496 Aql), to highlight the interest of our new v rad data in combination with previous v rad data. In the following, we did not find it necessary to correct the v rad data for the companion signatures, as we mainly compare the same v rad time series computed in different ways.

v rad computation method
Here, we investigate the specific impact of the v rad computation method on the Cepheid v rad measurements. We compare the three methods that we implemented here, i.e. the centroid, Gaussian and biGaussian v rad (RV cc−c , RV cc−g , and RV cc−2g , respectively). For this comparison, we limit ourselves to v rad measurements made with a given correlation template on the green range. We select Cepheids with a good sampling of the pulsation curve within these constraints only, i.e. 47 targets. For each of these targets, we compute two linear regressions: (1) of RV cc−g versus RV cc−c measurements and (2) of RV cc−2g versus RV cc−c measurements (Fig. 8). We then study the distribution of the values of the slope of these regressions. We consider that such an approach is safer and more reliable than comparing the peak-to-peak amplitudes of the two v rad time series phased along the Cepheid pulsation period (as done by e.g. Nardetto et al. 2007). First, it allows to take into account all the v rad measurements (and not only the extremal v rad values). Second, it should make the comparison of the two v rad time series less prone to biases induced by e.g. shockwaves  or cycle-to-cycle v rad variability (Anderson 2014(Anderson , 2016, that affect especially the extremal v rad measurements. We use the same approach in the following (Sects. 4.4, 4.5, 4.6).
Overall, we find Gaussian v rad to have slightly larger amplitudes than centroid v rad , and biGaussian v rad to have significantly larger amplitudes than both other methods. For the all template, we estimate RV cc−g to be larger than RV cc−c by ∼1% overall, and RV cc−2g to be larger than RV cc−c by ∼3-4% overall (Fig. 9, left). This agrees with biGaussian v rad being more sensitive to the CCF asymmetry, as found in previous studies (Nardetto et al. 2006;Anderson 2016). This trend seems to be more pronounced: 1. for shallower lines: with our intermediate-line template, we find RV cc−g to be ∼1-2% larger than RV cc−c and RV cc−2g to be ∼5% larger than RV cc−c overall; 2. for shorter pulsation periods (Fig. 9, right).
Both results agree with shallower lines and corresponding CCFs being more asymmetric than deeper ones (Anderson 2016). We also note that the dispersion of the linear regression slope values is much higher for biGaussian v rad than for Gaussian v rad . The Gaussian model is probably more robust than the biGaussian model because it has one less fitting parameter. If confirmed, it would also mean that a robust p-factor distribution over a large Cepheid sample would be more difficult to obtain with biGaussian v rad than with centroid or Gaussian v rad . To conclude here, we show and confirm the significant impact of the v rad computation method on the deduced CCF v rad time series.

Template wavelength range
Here, we investigate the impact of the wavelength range (∆λ) on which we compute our CCFs on the v rad measurements. To do so, we use the v rad measurements that we obtained based on our three medium templates spanning the three pre-defined wavelength ranges (blue, green, and red). First, to compare green and red CCFs and v rad , we use our three targets observed extensively with FEROS (UZ Sct, V340 Ara and AV Sgr), as well as our three HERMES targets (V1334 Cyg, FF Aql and W Sgr). For these instruments and targets, the same acquired spectra cover both the green and red wavelength ranges, which allows us to make a direct comparison between the corresponding green and red data. Second, to compare our blue and red wavelength ranges, we use our UVES data (24 targets in all, Table A.1), as the corresponding blue and red spectra were acquired at the same observation epochs (hence allowing for a direct comparison). For each target and each UVES arm, we average the successive CCFs obtained at each epoch of observation (see Sect. 2) and we derive the corresponding observables, in order to have the same number of measurements for the blue and red arms.
We look first at the general CCF quality for each wavelength range (Fig. 10). We note that our green and red (intermediateline template) CCFs exhibit a nearly constant average quality factor < Q > (around 4.5-5). On the contrary, our blue CCFs show a decreased < Q > that is much more variable (between -1 and 2, Fig. 10 middle plot). The green and red CCFs exhibit a well-defined and relatively deep core, while the blue CCFs are much more noisy and exhibit a shallower core (Fig. 10, left plot). Overall, we note that our good-quality CCFs have a nearly constant Q above ∼4 (see also Fig. 6) independently of their depth. On the contrary, when the ratio between the contrast of the CCF core and the dispersion of the CCF continuum is decreased enough, the CCF Q factor starts to decrease (below 4) and becomes significantly variable. This led us to define a CCF quality threshold of Q = 4. When looking at our other proxy SNR CCF , it is more variable: red CCFs exhibit the highest SNR CCF values (between 70 and 130 for intermediatedepth lines, Fig. 10 right plot), while green CCFs have somewhat lower SNR CCF values (in agreement with the respective CCF depths). Finally, blue CCFs show understandably much smaller SNR CCF values (below 40).
This quality difference between blue and red UVES CCFs does not originate in the number of lines included within the respective correlation templates (Table 3). Rather, it originates in the much increased spectral line density on the blue range compared to the red range. Between 3900 and ∼5000 Å, typ-ical Cepheid spectra are crowded with spectral lines that are often blended with each other. Even if we tried to select only un-blended lines from our synthetic Cepheid spectrum, this line density has nonetheless an impact on our CCFs. Meanwhile, there are much less spectral lines on the red range, that are much more separated from each other. This explains the increased quality of the red CCFs.
Second, we look at the blue and red v rad themselves. For each of the selected targets, we compute the linear regression of the red versus green v rad (FEROS and HERMES targets) or red versus blue v rad (UVES targets). Overall, it is difficult to detect a definitive trend. Yet, the average slope of the linear regression seems to be slightly below 1 (∼0.97-0.98, see Fig. 11). If confirmed, this would mean that Cepheid v rad amplitudes of variation decrease at larger wavelengths. This would agree with the findings of Nardetto et al. (2009), who reported a linear decrease of the Cepheid v rad peak-to-peak amplitude, based on CCFs computed order-by-order with the HARPS DRS and its classical G2 template. Yet, we do not find it clear whether this trend depends on the Cepheid pulsation period. We emphasise that our result remains to be confirmed given the relatively few number of v rad measurements used to perform the linear regression for each target. In the context of this study, we also consider this wavelength effect (between our different λ ranges) to be small enough to be neglected within a given λ range. Such studies would need to be extended to infrared (IR) wavelengths in order to be confirmed. Nardetto (2018) did not find a significant difference between the v rad curve amplitude of Car measured in the optical and on an IR line, respectively.
If considering the zero-point, there is no clear result for our green versus red v rad regressions given that the zero-point value seems to depend on the spectrograph (Fig. 11, top left). When looking at our UVES targets, there seems to be a consistent offset of ∼1.5-2 km s −1 between the v rad measured with the blue and red arms of UVES. It is not clear whether this is related to the wavelength or to technical differences in the acquisition of the UVES blue and red spectra (Molaro et al. 2008). Finally, we find the BIS amplitude over the Cepheid pulsation period to be slightly increased over our red range compared to our blue or green ranges (Fig. 11, bottom right).

Template line width
Here, we investigate the impact of the average line width of a given correlation template (σ ) on the resulting CCFs and v rad .

Impact of variable line width
We first consider the impact of a variable σ for one of our tailored correlation templates, i.e. the all template over the green range. We cross-correlated the 103 HARPS-North spectra of δ Cep with the all template for different line widths ranging from ∼5% to 140% of the default all template line width, i.e. σ ranging from ∼0.03 to ∼0.6 Å (Fig. 12). In terms of CCF shape, broadening the template lines has two results: first, it leads to a broadening of the CCF core (i.e. a shallower and wider CCF core); and second, it strongly reduces the noise or dispersion of the CCF wings. Both effects were already predicted by i.e. Queloz (1995). The decrease in CCF depth leads to a slight decrease of our CCF Q-factor averaged over the pulsation phase (yet still well over Q = 4). In contrast, the decrease of the CCF wing variability in the same time leads to a strong increase of our SNR CCF criterion up to a maximum for σ in the 0.35- "Red" vs "blue" v rad zero-point (km/s) Fig. 11. CCF v rad vs. wavelength range. Top: red vs. green linear regression for our six targets (see text): the regression slope is plotted vs. the regression zero-point with green diamonds for HERMES targets and green dots for FEROS targets (left). Histogram of red vs. blue v rad slope for our UVES targets (middle). The same, for the v rad zero-point (right). Bottom: amplitude of variation over P of the CCF BIS, plotted for our red range vs. our blue or green range, for the same targets as above (left). Distribution of the red vs. blue v rad slope with P for our UVES targets (middle). The same, for the v rad regression zero-point (right).
0.45 Å range (Fig. 12, top right). This is the main reason for our choice of having wider lines in our tailored templates (Sect. 3.3): broadening the template lines allows to make the CCF core more distinguishable from the CCF wings, even if the CCF core is made shallower. The decrease of the CCF wing variability with an increasing σ also allows for a more robust CCF Gaussian or biGaussian modelling by making the CCF shape closer to that of a Gaussian (Queloz 1995). We observe the same CCF behaviour as a function of σ for our other templates (deep, medium, and weak).
Increasing σ (i.e. broadening the CCF) also reduces the CCF asymmetry variability and leads to smaller v rad amplitudes (Fig. 12, middle and bottom right). We consider it a positive result as our focus is on increasing the consistency of Cepheid v rad : reducing the CCF asymmetry leads to less asymmetrydependent i.e. less variable and more consistent v rad time series (and thus less variable p-factors). On the contrary, Cepheid studies that focus on specific items such as the atmospheric velocity gradient should then use narrow-line templates to exacerbate the CCF asymmetry and enhance its impact on the v rad .  (Table 2) is showed in red. The CCF corresponding to the same δ Cep spectrum and the G2 HARPS DRS template is showed as a dashed black curve and is shifted vertically for clarity. The four right plots show the behaviour of different CCF observables vs. σ (from left to right and top to bottom; in red circles): i) the CCF Q-factor averaged over the 103 δ Cep spectra; ii) the same for the averaged SNR CCF criterion; iii) the slope of the linear regression to the δ Cep BIS time series at a given σ vs. the BIS time series at σ ∼ 0.43; and iv) the same for the RV cc−c time series. On the four plots, the red diamond corresponds to the default σ used in this study for the all template and the black square corresponds to the G2 HARPS DRS template.

Comparison with the G2 HARPS DRS default template
We then compare our all-template δ Cep CCFs with the CCFs computed from the same spectra but with the classical G2 template from the HARPS DRS. We retrieved the G2 template from the data made available by Brahm et al. (2017) and adapted it to our specifications (i.e. the green wavelength domain and telluric exclusion ranges, Sect. 3.3). Over our green range, the G2 template includes more than 1700 very narrow lines (σ ∼0.08 Å, Table 2). It gives significantly different CCFs with a narrower and deeper core and very flat wings (Fig. 12). The G2 CCF narrow core is induced by the small σ , while the CCF wing flatness is induced by the many blended lines and the occasional G2 template line mismatches (Queloz 1995, and see Fig. 3). This leads to both a high CCF Q-factor and a high SNR CCF criterion. Given that our templates are specifically tailored for Cepheids (i.e. much less line mismatches) and that our constraints on the line selection are much more stringent (no blended lines), our CCF wings are inevitably more noisy, justifying our use of wider template lines. On another hand, the G2 CCFs are much more asymmetric than the CCFs built from our all template and exhibit larger v rad amplitudes (Fig. 12).

Template line depth
Here, we investigate the impact of the average line depth of the correlation template on the resulting Cepheid CCFs and v rad . We looked at our data obtained on our green range only, with the four corresponding templates (weak, medium, deep, and all templates), i.e. considering a sub-sample of 50 targets.

CCF quality
We look at the CCF quality for each of our four line depthdependent correlation templates. As expected, the derived CCFs have a deeper core for templates corresponding to deeper lines (i.e. templates with a higher average line depth in Table 2), see Fig. 13 (left). For three of our four templates, we generally find our CCF quality factor Q to be nearly constant during the pulsation phase, with a value of ∼4-5 (Figs. 13 and 14). On the contrary, for the weak template, the Q proxy is much smaller (between 0 and 3, Fig. 14 left) and significantly variable during the pulsation (Fig. 13, middle). We consider this as a criterion defining the good quality of a CCF and the reliability of the derived observables. As we already explained in Sect. 4.4, we empirically put a Q = 4 threshold to distinguish between high-quality (Q ≥ 4) and mixed-quality (Q < 4) CCFs and data. Then, almost all our data derived from the medium, deep, and all templates meet our empirical quality threshold, while the data derived from the weak template are below this criterion. We note that even if using narrower lines for the weak template (as detailed in Sect. 4.5 for the all template), the derived Q-factors would still be below our Q = 4 empirical criterion. To obtain better-quality CCFs from such weak-line templates, we consider that the best solution would be to include more shallow lines, i.e. by alleviating our constraints on the line selection as described in Sect. 3.3. Finally, we find our SNR CCF parameter to be variable both dur-  Fig. 13. HARPS Car CCFs computed with our four depth-dependent correlation templates on the green λ range. Left: example of the four CCFs corresponding to a same spectrum and our four respective templates. Middle: CCF quality factor Q of Car vs. pulsation phase (φ) for CCFs built based on our four templates (same colour code). Right: the same, for our CCF signal estimator SNR CCF . 14. CCF quality and asymmetry vs. correlation template. Left: histograms of the CCF Q factor averaged over the pulsation phase for each target, for our four line depth-dependent templates. Middle: the same, for our CCF SNR CCF proxy (same colour code). Right: amplitude of the BIS during the pulsation phase vs. averaged BIS for each target, for our four templates (same colour code).
ing the pulsation phase (Fig. 13, right) and as a function of the correlation template used (Fig. 14, middle). Understandably, the SNR CCF values increase both for deeper lines and for templates including more lines.

CCF asymmetry
We used our BIS variable to study the impact of the line depth on the CCF asymmetry. We do not find any clear pattern in terms of BIS amplitude over the pulsation phase φ. However, we find the BIS averaged over φ to be higher (in absolute values) for the weak template compared to our three other templates (Fig. 14,  right). This would agree with shallower lines being more sensitive to the line asymmetry as reported by Anderson (2016).

Radial velocities
We finally look at the impact of the average line depth of the correlation template on the v rad . To do so, we compute for each of our selected targets three linear regressions: weak versus all RV cc−c , medium versus all RV cc−c , and deep versus all RV cc−c , respectively (on the green wavelength range). We selected the centroid v rad method as we derive it directly from the CCF and not from a CCF fit (as for Gaussian and biGaussian v rad that are thus less robust). We display our results in Fig. 15.
We find that the zero-point of the linear regressions significantly change when going from one template to another. The median offset between deep and all v rad time series is marginal (0.07 ± 0.12 km s −1 ), but the median offsets between medium and all v rad and between weak and all v rad are significant (−0.43 ± 0.31 km s −1 and 0.87 ± 0.46 km s −1 , respectively). Such v rad offsets between correlation templates based on lines of different depth could be expected and agree with the findings of e.g. Nardetto et al. (2008Nardetto et al. ( , 2009Vasilyev et al. (2017) on the dependency of Cepheid γ-velocities on the spectral lines and their depth. This would need a more in-depth analysis beyond the scope of this paper.
In contrast, the slopes of the linear regressions exhibit a significant variability but no clear trend from one template to another. The median slopes are comparable at a 1σ level (1.001 ± 0.028, 1 ± 0.016, and 0.998 ± 0.006 for the weak versus all, medium versus all, and deep versus all regressions, respectively). Yet, the so-called stellar velocity gradient (between spectral lines of different depths) is expected to have a significant impact on the Cepheid v rad amplitudes, as detailed by e.g. Nardetto et al. (2006Nardetto et al. ( , 2007 for single lines. Significant differences in terms of CCF asymmetry and v rad were also reported by Anderson (2016) when cross-correlating spectra of Car with two correlation templates (a weak-line one and a strong-line one, respectively, both having narrow lines). These authors reported an enhanced asymmetry for weak lines as well as an increased sensitivity or vari- ability of their v rad . We consider that the present lack of significant trend (i.e. of the regression slope as a function of the average template line depth) is at least partially caused by our use of broader template lines (Sect. 4.5), that reduce the asymmetry of the resulting CCFs and decrease the sensitivity of the v rad time series. Yet, the slope distribution for the weak versus all regressions exhibits a much larger dispersion than for the deep versus all regressions. This would agree with strong lines showing less asymmetry and leading to more robust v rad .

Conclusions
We carried out a large spectroscopic survey of Classical MW Cepheids based on several thousands of high-resolution spectra from seven spectrographs. We detailed the framework that we implemented to derive and characterise Cepheid line profiles and v rad time series as much consistent as possible based on the crosscorrelation method. Briefly, the main steps of our formalism are the following: 1. normalising and standardising in wavelength all the spectra in the same way; 2. using pre-defined correlation templates with an emphasis on the selected lines, their average depth, and their width; 3. characterising not only the CCF Doppler shift, but its shape, depth, width, asymmetry and amount of extractable signal; 4. deriving the v rad while accounting for the intrinsic CCF asymmetry.
We show that each parameter and each step of the process has a significant impact on the derived Cepheid v rad : the wavelength range on which the spectrum is considered, the correlation template used for the cross-correlation (both in terms of line depth and template line width), and the way of computing the radial velocity have a significant impact on the derived v rad . For Baade-Wesselink studies, it basically means that significantly different projection factors and distances could be obtained for the same Cepheid, depending on how the former items are treated. Hence, giving at least a minimum of details on the v rad computation process should be a pre-requisite for such studies. We also emphasise the importance of fully characterising the cross-correlation profile (CCF) through various estimators of its shape, width, depth, asymmetry and the amount of signal within it. We publish on-line both our tailored correlation templates, the derived CCFs, and the various v rad , line profile proxy, and CCF quality proxy time series computed from the CCFs. As we showed in this study, deriving fully consistent Cepheid v rad from the crosscorrelation method is not an easy task. Yet, it seems that the way towards more robust Cepheid v rad (and thus more robust p-factors) goes through minimising the asymmetry of the line profile (here, the CCF) and reducing the sensitivity of the resulting v rad to this asymmetry: e.g. using centroid v rad , favouring stronger lines in cross-correlation templates, or even using templates with somewhat broader lines than usual.
High-resolution spectroscopy of Cepheids has a lot of different applications. Our next objective is to take a fresh look at the computation of the Cepheid effective temperature T eff from spectra. Cepheid T eff exhibit a large variability (of the order of hundreds of Kelvin) over the pulsation phase, that has a significant impact on spectra. We aim at using our Cepheid sample to measure accurately both the absolute average Cepheid T eff and its variation over the pulsation cycle by rehashing pre-existing methods based on the ratio of the depths or EWs of selected line pairs (see e.g. Kovtyukh & Gorlova 2000;Sousa et al. 2010, respectively). Other stellar parameters can be easily derived from Cepheid spectra, such as e.g. metallic abundances (Luck 2018).
Another possibility is to use again the CCF as the spectrum's proxy to deduce other parameters than the v rad . An interesting point is to build a model grid of Cepheid CCF, in the same way as done by Britavskiy et al. (2018) with their grid of synthetic bisectors. In the case of non-pulsating stars, CCF profiles can already be used (instead of the spectra themselves) to derive e.g. the T eff , log g and metallicities (Malavolta et al. 2017).
Finally, it is interesting to see if and how the Cepheid CCF formalism that we detailed here can be applied to the data of the Gaia Radial Velocity Spectrograph (RVS, Cropper et al. 2018). The RVS spectra are centred on the Calcium II near-infrared triplet, on a narrow wavelength band if compared to the spectrographs implemented within this study. Yet, they are expected to produce transit v rad measurements with a precision well below 1 km s −1 , i.e. enough to scan the Cepheid v rad pulsation amplitude Katz et al. 2019). This would thus be a great opportunity given that Gaia will observe several thousands of MW Cepheids. Table A.1. Detail of our Cepheid sample. Columns 2 to 4 give the pulsation period P, the yearly period drift dP/dT (if known) and the adopted reference epoch for period phasing T 0 . Column 5 indicates if the target is a clear (singlelined) spectroscopic binary based on its v rad (y) or if there is some v rad scatter (s) that make it a possible SB1 (based on the present data only; other Cepheids in the list are known binaries but with no significant effect on our present v rad time series). In case of a SB1, column 6 gives the most recent reference providing the best companion orbital parameters, if any. Columns 7 to 9 give the total number of spectra N sp used in this study (all spectrographs included), the corresponding number of distinct observation epochs N epoch and the deduced pulsation phase coverage (φ cov.). Column 10 give the detail of the spectrum distribution per instrument for each target.
Cepheid P dP dT Instr. (2)   (1) G18a, G18b stand for Gallenne et al. (2019) and Gallenne et al. (2018). (2)  We use the crosscorrRV package from the Python AstroLib library 4 to cross-correlate each observed spectrum with a given correlation template. We compute the CCF on a default v rad grid ranging from −200 to +200 km s −1 with a 1 km s −1 step. Such a step is in the order of what is commonly done in the DRS of the spectrographs considered in this study (see e.g. Queloz 1995; Baranne et al. 1996), and it is enough to provide precise v rad measurements (e.g. through a CCF Gaussian or biGaussian fit). The crosscorrRV package is an easy-to-use and timetested framework for cross-correlating 1D spectra with templates. The correlation template is successively Doppler-shifted in wavelength, and then linearly interpolated at the spectrum wavelength points for the correlation. Given that our correlation templates are sampled with a ∼0.02 Å step (roughly corresponding to a ∼1 km s −1 step in terms of v rad and similar to the wavelength step of our input spectra), we do not expect this linear interpolation to modify our template lines and to impact the derived observables.

B.2. Observable computation
CCF continuum and depth -We denote the v rad extension of the CCF core (i.e. the area of the CCF peak located below the CCF lower shoulder, see Fig. 5) as ∆ core and we denote the reunion of the CCF wing v rad ranges (i.e. the v rad ranges both left and right of the two CCF shoulders) as ∆ wings . We thus define the CCF continuum C o as The integration is done on a finer v rad grid (v rad step of 50 m s −1 ) than the CCF computation (see above). The CCF EW is equivalent to the width (in km s −1 ) of a theoretical rectangle with a height equal to 1 (considering the normalised CCF) and with a surface equal to the area covered by the CCF core.
CCF first moment -To derive the first moment of the CCF core RV cc−c , we compute the cumulated integral of the CCF profile over ∆ core , using the same v rad grid as for the CCF EW.
Integrating the CCF core including the small area above the lower shoulder and below the higher shoulder does not lead to significant changes to the CCF EW and RV cc−c . The EW values are marginally higher (by ∼1-2% or typically 100 m s −1 , and the RV cc−c amplitudes increase only marginally, by a few tens of m s −1 (i.e. by a value lower than the typical RV cc−c uncertainties). We adopt the same nomenclature as of Nardetto et al. (2006Nardetto et al. ( , 2009 for the different v rad computation methods applied to the CCF: RV cc−c for the CCF first moment (by analogy with RV c for single-line first moments), RV cc−g for the CCF Gaussian (by analogy with RV g single-line Gaussian models), etc.
CCF bisector -We compute the CCF bisector by dividing the CCF core into 100 horizontal slices (between the minimum and the maximum of the CCF peak) and by computing the mean v rad for each slice. We then compute the corresponding Bisector Inverse Span (BIS), defined as the v rad span between a top and a bottom domain of the bisector (Queloz et al. 2001a). If denoting V top and V btm as the mean v rad of these top and bottom BIS domains, we have BIS = V top − V btm . We use the top and bottom bisector region definition given by Galland et al. (2005): i.e. a top region extending from 15 to 46% of the CCF depth D and a bottom region from 57 to 85% of D.
CCF quality proxies -We compute our CCF quality factor Q as where CT represents the CCF contrast, σ denotes the standard deviation, and CCF smth represents the CCF smoothed over a v rad range equal to 2.355 × ∆ core . Then we compute our CCF S/N estimator as As shown in Sect. 4, Q does not necessarily depend linearly on the CCF contrast CT . It stays on a plateau at ∼4-5 for all goodquality CCFs but starts to significantly decrease when the CCF wings become noisy and the CCF core very shallow.
CCF Gaussian fit -We model the profile of the CCF core or main peak (considered over its v rad extension ∆ core ) with a fourparameter Gaussian function (G): where C o g refers to the offset of the Gaussian function, D g to its (normalised) depth, F to its FWHM, and RV cc−g to the Gaussian v rad .
CCF biGaussian fit -As done with the Gaussian model, we fit the CCF core with a five-parameter biGaussian function (B):  to the Gaussian ones. We point out that our biGaussian model is slightly different from the one used by Nardetto et al. (2006), as the radial velocity RV cc−2g is one of the free parameters. For both the Gaussian and biGaussian fits, we use a non-linear Least-Square method as implemented in the curve_fit function of the scipy.optimize Python package.

B.3. Uncertainties
While choosing which v rad computation method(s) to use for Cepheids is an important question, computing reliable and realistic uncertainties on the CCF v rad and other line profile observables is also important. Typically, the DRS of high-resolution spectrographs include two main sources of uncertainty on the v rad : (1) the photon noise (proportional to the inverse of the spec-trum's S/N); and (2) the instrumental noise (or read-out noise), corresponding to the instrumental v rad precision and the instrumental stability (Baranne et al. 1996;Pepe et al. 2002). For the best spectrographs (e.g. HARPS), the increased instrumental stability (Table 1) and easily reachable high spectrum S/N mean that v rad uncertainties lower than 1 m s −1 on average can be routinely achieved (Pepe et al. 2018). Such v rad precision makes sense e.g. for exoplanet surveys around (generally) nonpulsating stars, for which the CCF can be properly fitted with a Gaussian model. However, Cepheids (and other radially pulsating stars) pose a different kind of challenge, due to the Cepheid v rad accuracy being strongly dependent on the correlation template and the v rad computation method used.
We decided here to compute our v rad (and other line profile observable) uncertainties based on our CCFs and their character- For each plot, the data that we obtain from the cross-correlation of the δ Cep HARPS-North (1D) spectra with the G2 template adapted to our green wavelength range (see Sect. 4.5.2) are plotted against the corresponding observables retrieved from the available DRS data (automatically produced from the 2D spectra and the default G2 template). From top to bottom and left to right: Gaussian v rad , Gaussian FWHM, CCF BIS, and CCF depth.
istics to remain consistent within our CCF-based framework, instead of using the spectrum's photon noise. We use SNR CCF (see above) to estimate the uncertainty on the centroid radial velocity RV cc−c : where W denotes the width of the CCF core at half its depth D. Such a formula is analogous in terms of dimensions to the formulae provided by e.g. Queloz (1995); Baranne et al. (1996). It makes the RV cc−c uncertainty directly dependent on the correlation template used to compute the CCF and on the observational pulsation phase. Next, to derive the BIS uncertainty ( BIS ), we first compute the uncertainties on V top and V btm as classical errors on the mean (i.e. as the ratio of the bisector v rad dispersion in the defined top and bottom regions over the square root of the number of v rad points in the two respective bisector slices). Then we add quadratically these two uncertainties to obtain BIS . For all the observables (v rad , Gaussian FWHM, biGaussian asymmetry, etc) derived from a Gaussian or a biGaussian CCF model, we use the uncertainties derived within the fit, i.e. the 1σ uncertainties computed from the square root of the covariance matrix diagonal. These uncertainties describe how closely the Gaussian or biGaussian fit agrees with the modeled CCF. We thus consider it valid to use them within our CCF-based formalism. We compare the derived uncertainties on RV cc−c ( cc−c , from our formula), RV cc−g and RV cc−2g ( cc−g and cc−2g from the fits) in Fig. B.1. Overall, < cc−2g > is twice larger than < cc−g > (∼20 m s −1 against ∼10 m s −1 , respectively, for the all correlation template), which agrees with the biGaussian model being more sensitive to the asymmetry and somewhat less robust than the Gaussian model. Meanwhile, < cc−c > is about one order of magnitude larger than < cc−g >. We cannot infer anything from this given that our computation formula for < cc−c > is arbitrary (see above), but we note that the shape of the distribution is similar for < cc−c > and < cc−g >, which gives us confidence in our formula. Finally, going from a template with more and deeper lines to a template with less and shallower lines leads understandably to an increase of the v rad uncertainties (Fig. B.1).
We also display in Fig. B.2 the v rad and BIS uncertainties versus the pulsation phase for one of our targets (δ Cep). The Gaussian and biGaussian v rad uncertainties are slightly variable with the phase, while the centroid v rad uncertainties ( cc−c ) are significantly variable (being an order of magnitude larger than the other uncertainties). The cc−c uncertainties do not exhibit the same pattern of variation as cc−g and cc−2g . The BIS uncertainties are significantly variable with φ, with BIS being the highest for the largest CCF asymmetry (see Fig. 6). In average, BIS has the same order of magnitude as cc−g or cc−2g . We note that our BIS estimation may be a conservative one as the BIS is a v rad differential measurement (Anderson 2019).

B.4. Comparison with the DRS-based CCF computation
Here, we compare our s1d-based CCF computation and derived observables with the e2ds-based CCF computation performed automatically by the spectrograph DRS. We consider on the one hand the 103 HARPS-North δ Cep (1D) spectra cross-correlated with the DRS G2 template adapted to our green wavelength range (as done in Sect. 4.5.2); and on the other hand the data produced by the HARPS-North DRS based on the original G2 template and the corresponding 2D spectra of δ Cep. The DRS cross-correlates each order of the 2D spectrum with the default G2 template before summing the CCFs over all the orders to obtain the average CCF, which is then fitted by a Gaussian. We display the results in Fig. B.3. For the four compared observables (Gaussian v rad , Gaussian FWHM, BIS and CCF depth), we obtain a Pearson correlation coefficient between 0.99 and 1, showing the robustness of our s1d-based approach. A linear regression of the two Gaussian v rad time series leads to a slope very close to 1 (= 1.006 ± 0.002) and a non-significant zero-point. The differences in terms of CCF depth and FWHM amplitudes remain small (by <2 and <4%, respectively) and probably originate in the fact that the DRS CCF computation is done on a wider wavelength interval than our green range (encompassing all orders, i.e. from ∼3900 to ∼6900 Å). The larger differences in terms of BIS amplitudes can be explained by the different BIS definitions between this study and the DRS. Overall, this confirms that using the s1d spectra with a S/N lower threshold (Sect. 3.2) is a valid approach.