USuRPER: Unit-sphere representation periodogram for full spectra

A. Binnenfeld; S. Shahaf; S. Zucker

doi:10.1051/0004-6361/202039001

Issue: A&A Volume 642, October 2020
Article Number: A146
Number of page(s): 7
Section: Numerical methods and codes
DOI: https://doi.org/10.1051/0004-6361/202039001
Published online: 13 October 2020

A&A 642, A146 (2020)

A. Binnenfeld¹, S. Shahaf² and S. Zucker¹

¹ Porter School of the Environment and Earth Sciences, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
² School of Physics and Astronomy, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 6997801, Israel

Received: 22 July 2020
Accepted: 24 August 2020

Abstract

We introduce an extension of the periodogram concept to time-resolved spectroscopy. USuRPER, the unit-sphere representation periodogram, is a novel technique that opens new horizons in the analysis of astronomical spectra. It can be used to detect a wide range of periodic variability of the spectrum shape. Essentially, the technique is based on representing spectra as unit vectors in a multidimensional hyperspace, hence its name. It is an extension of the phase-distance correlation periodogram we had introduced in previous papers, to very high-dimensional data such as spectra. USuRPER takes the overall shape of the spectrum into account, which means that it does not need to be reduced into a single quantity such as radial velocity or temperature. Through simulations, we demonstrate its performance in various types of spectroscopic variability: single-lined and double-lined spectroscopic binary stars, and pulsating stars. We also show its performance on actual data of a rapidly oscillating Ap star. USuRPER is a new tool to explore large time-resolved spectroscopic databases such as APOGEE, LAMOST, and the RVS spectra of Gaia. We have made a public GitHub repository with a Python implementation of USuRPER available to the community, to experiment with it and apply it to a wide range of spectroscopic time series.

Key words: methods: data analysis / methods: statistical / techniques: spectroscopic / binaries: spectroscopic / stars: oscillations / stars: individual: HD 115226

© ESO 2020

1. Introduction

In two previous papers (Zucker 2018, 2019), we have introduced the phase-distance correlation (PDC) periodogram as a new method to detect non-sinusoidal periodicities in unevenly sampled time-series data. Essentially, for each trial period, PDC quantifies the statistical dependence between the measured quantity and the phase (according to the trial period), using the recently introduced distance correlation. Székely et al. (2007) introduced distance correlation as a measure of statistical dependence between two quantities. The calculations involved in estimating the sample distance correlation somewhat resemble those involved in estimating the Pearson correlation, hence its name. However, it is important to note that unlike the Pearson correlation, the distance correlation is not a measure of linear dependence, but rather of general dependence.

In order to quantify the dependence on the phase, which is a circular (i.e. cyclic) variable, we have modified the original expression of Székely et al. (2007), following their original derivation, but for circular variables. As we have shown (Zucker 2018), the newly introduced periodogram outperforms other methods in cases of sawtooth-like variability shapes, including radial velocity (RV) curves of eccentric single-lined spectroscopic binary (SB1) stars.

We have later extended the PDC periodogram to two-dimensional data (Zucker 2019), and specifically, to two-dimensional astrometric data, so as to improve the detection of eccentric astrometric orbits. This generalisation demonstrated an important advantage of distance correlation over the classic Pearson correlation. The Pearson correlation involves products of the sample values of the two examined variables, which means that both of them have to be real numbers. Instead, the distance correlation involves element-wise products of two matrices that are based on the distance matrices of the two variables. As long as distances can be calculated in each of the two examined spaces, no requirement regarding the dimensionality of the two variables is therefore made. They can even be of different dimensions, as long as distance matrices can be computed.

Lyons (2013) further extended the applicability of distance correlation by showing that it can be applied to variables in general metric spaces, as long as the two metrics involved are both of “strong negative type”. It is beyond the scope of this paper to delve into the definition and subtleties of strong negative-type metric spaces (first introduced in Zinger et al. 1992), but it is still important to note that Euclidean spaces are of strong negative type (Lyons 2013).

In this paper we introduce an extension of the PDC periodogram to a new domain: we propose to use it to detect general periodic variability of astronomical spectra. Perhaps the most obvious periodic variability of a stellar spectrum is that of SB1s, in which the spectra exhibit periodic Doppler shifts. The usual way to study SB1s is to cross-correlate the spectra against a template spectrum (either synthetic or observed), derive an estimate of the Doppler shifts from the location of the cross-correlation peaks (e.g. Tonry & Davis 1979), and then analyse the Doppler shifts in search for periodicity using conventional techniques, such as the generalized Lomb-Scargle (GLS) periodogram (Ferraz-Mello 1981; Zechmeister & Kürster 2009).

Double-lined spectroscopic binaries (SB2s) exhibit a more complicated periodicity pattern because each observed spectrum is essentially a superposition of two spectra, each shifted by a different Doppler shift, and both undergo opposite RV changes. Occasionally, the cross-correlation of the spectrum against a template shows two peaks, but sometimes the two peaks blend, requiring the use of techniques to disentangle the two Doppler shifts, such as TODCOR (Zucker & Mazeh 1994). Simon & Sturm (1994) proposed a disentangling technique that did not require prior knowledge of the component spectra. However, their approach is still tailored only to SB2s.

Periodic variability of the spectrum need not necessarily be related to Doppler shifts in binary stars. Various types of stellar pulsations bring about many types of periodic variations of the spectrum, ranging from periodic temperature changes such as in Cepheids (e.g. Andrievsky et al. 2005) to line-profile variations in non-radially pulsating stars (e.g. Aerts et al. 1992).

In the next section we introduce the details of the calculations involved in producing the USuRPER periodogram. To demonstrate the capabilities of USuRPER, we show in Sect. 3 some test cases, both simulated and actually observed. We finally conclude in Sect. 4 with a short summary and some insights regarding applicability.

2. Unit-sphere representation periodogram

2.1. Fundamentals

We assume that we have time-resolved spectroscopy data of an astronomical object, comprising N spectra obtained at times $\{t_i\}_{i=1}^{N}$. We further assume that each spectrum is essentially an array of L intensities, each corresponding to a specific wavelength. For simplicity, we assume at this stage that all spectra are calibrated to the same wavelength grid, and are all measured at the same rest frame. These assumptions can later be easily relaxed by calibration and interpolation procedures that are routinely performed in astronomical spectroscopy and RV studies.

Because we are interested only in the variability of the spectrum shape (rather than the total flux), we subtract the mean value of each spectrum and normalise it by dividing by its standard deviation. As a result, the spectra, $\{\hat{\boldsymbol{f}}_i\}_{i=1}^{N}$, can now be considered unit vectors in an L-dimensional Euclidean space, that is, points on the unit (L − 1)-sphere. If a periodic variability of the spectrum shape were to take place, it would therefore be manifested in a periodic motion on this unit sphere.
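As a minimal sketch of this normalisation step (not the authors' implementation), the NumPy snippet below maps each row of a toy array of spectra to a zero-mean unit vector. The function name `to_unit_vectors` and the random toy data are assumptions made for illustration. Dividing by the standard deviation alone gives vectors of length √L; the sketch divides by the Euclidean norm instead, which differs only by that constant factor and leaves the correlations between spectra unchanged.

```python
import numpy as np

def to_unit_vectors(spectra):
    """Map each spectrum (one row per epoch) to a zero-mean unit vector."""
    spectra = np.asarray(spectra, dtype=float)
    # Remove the mean flux of each spectrum: only the shape matters.
    centred = spectra - spectra.mean(axis=1, keepdims=True)
    # Scale each centred spectrum to unit Euclidean length.
    norms = np.linalg.norm(centred, axis=1, keepdims=True)
    return centred / norms

# Toy data (assumption): 5 noisy, flat spectra with L = 200 pixels each.
rng = np.random.default_rng(0)
spectra = 1.0 + 0.1 * rng.standard_normal((5, 200))
unit = to_unit_vectors(spectra)
```

With this representation, the scalar product of two rows of `unit` equals the normalised correlation between the corresponding original spectra.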

We introduce here a novel kind of periodogram to look for this unit-sphere periodicity. Following our previous papers, we can construct such a periodogram by quantifying for each trial period the distance correlation between the location on the unit sphere and the phase (according to the trial period). To do this, we need to have a distance function (metric) on the unit sphere that will be of strong negative type. Such a metric can be defined by the length of the chord connecting two points on the sphere: the chord-length metric. This metric is of strong negative type because it is induced by the Euclidean metric of the L-dimensional space in which the unit sphere is embedded (Lyons 2013). As we now show, this metric is very easy to compute.

Let $\hat{\boldsymbol{f}}_i$ and $\hat{\boldsymbol{f}}_j$ be two members of the sequence of unit vectors introduced above. We denote by θ_ij the angle between these two unit vectors. By simple geometry, we can immediately see that the chord length between the two corresponding unit-sphere locations is given by

$$ \begin{aligned} d(\hat{\boldsymbol{f}}_i,\hat{\boldsymbol{f}}_j) = 2\sin(\theta_{ij}/2) = 2\sqrt{\frac{1-\cos\theta_{ij}}{2}}. \end{aligned} $$ (1)

Because $\hat{\boldsymbol{f}}_i$ and $\hat{\boldsymbol{f}}_j$ are unit vectors, cos θ_ij is in fact the scalar product between them. In other words, it is actually the normalised correlation between the two original spectra, henceforth C_ij.
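For completeness, the geometric step behind Eq. (1) can be written out explicitly. The chord is the Euclidean distance between the two unit vectors, so, using only the unit norms and the half-angle identity $1-\cos\theta = 2\sin^2(\theta/2)$:

$$ \begin{aligned} d^2 = |\hat{\boldsymbol{f}}_i - \hat{\boldsymbol{f}}_j|^2 = |\hat{\boldsymbol{f}}_i|^2 + |\hat{\boldsymbol{f}}_j|^2 - 2\,\hat{\boldsymbol{f}}_i\cdot\hat{\boldsymbol{f}}_j = 2 - 2\cos\theta_{ij} = 4\sin^2(\theta_{ij}/2). \end{aligned} $$

Taking the square root recovers Eq. (1), and substituting the normalised correlation for cos θ_ij gives $d = \sqrt{2}\,\sqrt{1 - C_{ij}}$.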

Now that we have defined a distance function, it might appear that we can calculate the two required distance matrices, following Zucker (2018, 2019). However, the space on which our distance function (Eq. (1)) is defined is extremely high dimensional, and as Székely & Rizzo (2013) showed, a naive computation of the distance correlation in this case would introduce a very strong bias. They proposed instead to use an unbiased estimate of the distance correlation, which we introduce in the next paragraphs.

2.2. Computation

Similarly to Zucker (2018, 2019), we define a distance matrix based on the metric we have introduced in Eq. (1). For each pair of spectra (i and j), the entry in the distance matrix is

$$ \begin{aligned} a_{ij} = \sqrt{1-C_{ij}}. \end{aligned} $$ (2)

We can safely remove the multiplicative factors appearing in Eq. (1) because they would later cancel out in the normalisation.
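A minimal sketch of Eq. (2), assuming the spectra already share a common wavelength grid, is given below. The function name `spectral_distance_matrix` and the three toy spectra are illustrative assumptions; `np.corrcoef` supplies the normalised correlations C_ij.

```python
import numpy as np

def spectral_distance_matrix(spectra):
    """Chord-length distances a_ij = sqrt(1 - C_ij) between spectra (rows)."""
    spectra = np.asarray(spectra, dtype=float)
    corr = np.corrcoef(spectra)  # C_ij: correlation between rows i and j
    # Clip tiny negative values caused by floating-point rounding.
    a = np.sqrt(np.clip(1.0 - corr, 0.0, None))
    np.fill_diagonal(a, 0.0)  # a spectrum is at zero distance from itself
    return a

# Toy spectra (assumption): the second is a scaled copy of the first,
# the third is perfectly anticorrelated with the first.
toy = np.array([[1.0, 2.0, 3.0, 4.0],
                [2.0, 4.0, 6.0, 8.0],
                [4.0, 3.0, 2.0, 1.0]])
a = spectral_distance_matrix(toy)
```

The scaled copy lands at (numerically) zero distance, illustrating that only the shape of the spectrum enters the metric, while the anticorrelated spectrum sits at the largest possible distance, √2 in these units.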

For each trial period P we define a phase-distance matrix, similarly to that in previous papers:

$$ \begin{aligned} &\phi_{ij} = (t_i - t_j) \bmod P, \\ &b_{ij} = \phi_{ij}(P-\phi_{ij}). \end{aligned} $$ (3)
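Eq. (3) translates almost directly into NumPy broadcasting. The sketch below is illustrative only; the function name `phase_distance_matrix` and the sample times are assumptions.

```python
import numpy as np

def phase_distance_matrix(times, period):
    """Phase-distance matrix b_ij = phi_ij * (P - phi_ij) of Eq. (3)."""
    times = np.asarray(times, dtype=float)
    # phi_ij = (t_i - t_j) mod P, always in [0, P) for a positive period.
    phi = np.mod(times[:, None] - times[None, :], period)
    return phi * (period - phi)

# Toy epochs (assumption), in days, folded at a trial period of 7 days.
times = np.array([0.0, 1.0, 3.5, 7.0])
b = phase_distance_matrix(times, 7.0)
```

The matrix is symmetric even though ϕ_ij itself is not, because swapping i and j maps ϕ to P − ϕ. Epochs separated by a whole number of periods get zero distance, and epochs half a period apart get the maximum, P²/4.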

Now, instead of the zero-centring used in the previous papers, which leads to a biased estimator of the distance correlation, we apply 𝒰-centring, introduced in Székely & Rizzo (2014) in order to mitigate the bias:

$$ \begin{aligned} A_{ij} = \begin{cases} a_{ij} - \dfrac{1}{N-2}\sum\limits_{k=1}^{N} a_{ik} - \dfrac{1}{N-2}\sum\limits_{k=1}^{N} a_{kj} + \dfrac{1}{(N-1)(N-2)}\sum\limits_{k,l=1}^{N} a_{kl} & \mathrm{if}\ i \ne j, \\ 0 & \mathrm{if}\ i = j. \end{cases} \end{aligned} $$ (4)
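The 𝒰-centring of Eq. (4) can be sketched in a few lines of NumPy. The function name `u_center` and the toy one-dimensional distance matrix are assumptions for illustration; the formula requires at least three samples.

```python
import numpy as np

def u_center(d):
    """U-centre a symmetric distance matrix with zero diagonal, as in Eq. (4)."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    if n < 3:
        raise ValueError("U-centring needs at least 3 samples")
    row = d.sum(axis=1, keepdims=True)  # sum over k of a_ik
    col = d.sum(axis=0, keepdims=True)  # sum over k of a_kj
    total = d.sum()                     # sum over k, l of a_kl
    centred = d - row / (n - 2) - col / (n - 2) + total / ((n - 1) * (n - 2))
    np.fill_diagonal(centred, 0.0)      # A_ii = 0 by definition
    return centred

# Toy distance matrix (assumption): absolute differences of four points on a line.
x = np.array([0.0, 1.0, 3.0, 6.0])
d = np.abs(x[:, None] - x[None, :])
A = u_center(d)
```

A quick sanity check on the result is that every row (and column) of the 𝒰-centred matrix sums to zero when the input is symmetric with a zero diagonal.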

A similar procedure is applied to obtain the matrix B_ij from b_ij. When the 𝒰-centred matrices are used, the unbiased estimator of the distance correlation can be computed by the expression

$$ \begin{aligned} D = \frac{\sum\limits_{ij} A_{ij}B_{ij}}{\sqrt{\left(\sum\limits_{ij} A_{ij}^{2}\right)\left(\sum\limits_{ij} B_{ij}^{2}\right)}}. \end{aligned} $$ (5)
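Putting Eqs. (2)-(5) together, a periodogram can be sketched as below. This is a simplified illustration under stated assumptions, not the SPARTA implementation: the function names, the toy spectra (five Gaussian absorption lines Doppler-shifted by a 7-day, 10 km s⁻¹ sinusoid, with white noise) and the frequency grid are all made up for the example.

```python
import numpy as np

def u_center(d):
    """U-centring of Eq. (4) for a symmetric matrix with zero diagonal."""
    n = d.shape[0]
    out = (d - d.sum(axis=1, keepdims=True) / (n - 2)
             - d.sum(axis=0, keepdims=True) / (n - 2)
             + d.sum() / ((n - 1) * (n - 2)))
    np.fill_diagonal(out, 0.0)
    return out

def usurper_sketch(times, spectra, freqs):
    """Distance correlation D (Eq. 5) between spectra and phase, per trial frequency."""
    a = np.sqrt(np.clip(1.0 - np.corrcoef(spectra), 0.0, None))  # Eq. (2)
    np.fill_diagonal(a, 0.0)
    A = u_center(a)                       # computed once for all frequencies
    sum_A2 = np.sum(A * A)
    dt = times[:, None] - times[None, :]
    power = np.empty(len(freqs))
    for k, f in enumerate(freqs):
        period = 1.0 / f
        phi = np.mod(dt, period)          # Eq. (3)
        B = u_center(phi * (period - phi))
        power[k] = np.sum(A * B) / np.sqrt(sum_A2 * np.sum(B * B))
    return power

# Toy SB1-like data (assumption): 50 random epochs over 100 days.
rng = np.random.default_rng(1)
times = np.sort(rng.uniform(0.0, 100.0, 50))
wave = np.linspace(4900.0, 5100.0, 2000)
rv = 10.0 * np.sin(2.0 * np.pi * times / 7.0)      # km/s
c_kms = 299792.458
spectra = np.ones((times.size, wave.size))
for i, v in enumerate(rv):
    for w0 in (4920.0, 4960.0, 5000.0, 5040.0, 5080.0):
        centre = w0 * (1.0 + v / c_kms)             # Doppler-shifted line centre
        spectra[i] -= 0.5 * np.exp(-0.5 * ((wave - centre) / 0.3) ** 2)
spectra += rng.normal(0.0, 0.01, spectra.shape)     # white Gaussian noise

freqs = np.arange(0.02, 0.5, 0.002)                 # trial frequencies in 1/day
power = usurper_sketch(times, spectra, freqs)
best_freq = freqs[np.argmax(power)]
```

Note that the spectral matrix A is built once, outside the loop over trial frequencies, whereas B has to be rebuilt for every trial period.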

If prominent peaks appear in the resulting periodogram, their significance can be assessed by a permutation test. Every spectrum would then be allocated a random phase, drawn uniformly, and D would be recalculated for this random allocation of phases. There would be no need to recalculate the distance matrix among the spectra, as the original phase dependence would already have been ruined by randomising the phases. By repeating the randomisation for a prescribed number of times, the sample of D values can be used to obtain a threshold value corresponding to a desired level of the false-alarm probability (FAP).
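One way to sketch such a permutation test at a single trial period is shown below. It is an illustration under assumptions, not the authors' procedure: instead of drawing uniform random phases, it shuffles which spectrum belongs to which epoch (the variant described in the conclusions), which amounts to permuting the rows and columns of A jointly while keeping B fixed. The function name `permutation_threshold` and the toy matrices standing in for 𝒰-centred A and B are hypothetical.

```python
import numpy as np

def permutation_threshold(A, B, fap=1e-3, n_perm=2000, seed=0):
    """Threshold on D for a given FAP, from random re-pairings of spectra and epochs."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # The normalisation of Eq. (5) is invariant under permutations.
    norm = np.sqrt(np.sum(A * A) * np.sum(B * B))
    null = np.empty(n_perm)
    for m in range(n_perm):
        perm = rng.permutation(n)
        # Re-pair spectra with epochs; A itself is never recomputed.
        null[m] = np.sum(A[np.ix_(perm, perm)] * B) / norm
    return np.quantile(null, 1.0 - fap)

# Toy stand-ins (assumption) for two U-centred matrices with perfect dependence.
rng = np.random.default_rng(3)
M = rng.standard_normal((30, 30))
B = M + M.T
np.fill_diagonal(B, 0.0)
A = B.copy()
observed = np.sum(A * B) / np.sqrt(np.sum(A * A) * np.sum(B * B))
threshold = permutation_threshold(A, B)
```

The number of permutations limits the smallest FAP that can be resolved: with 2000 permutations, a 10⁻³ quantile rests on only a couple of samples in the tail, so a real application would likely use many more.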

2.3. Run-time complexity

The matrix A_ij should be calculated only once. If the spectra are all calibrated to the same wavelength grid, each C_ij is a simple correlation coefficient, requiring 𝒪(L) operations. However, because cross-correlation functions (CCFs) are routinely computed, especially in the context of RV studies, C_ij can also be extracted from the CCF, taking into account conversion to the rest-frame velocity. CCFs usually require 𝒪(L log L) operations (using fast convolution algorithms), which we henceforth use as a worst-case estimate. Therefore, calculating the matrix a_ij and converting it into the 𝒰-centred matrix A_ij involves 𝒪(N² L log L) operations. The matrix b_ij has to be calculated separately for each frequency, and then used to calculate the distance correlation (Eq. (5)), amounting to a total of 𝒪(N²K) operations, where K is the number of trial frequencies (periods). The total time complexity is therefore max[𝒪(N² L log L), 𝒪(N²K)], and which of the two terms dominates depends on the specific implementation. In either case, the dependence on the number of spectra is quadratic. In future applications, this quadratic dependence may be reduced to 𝒪(N log N) by using fast techniques to compute distance correlation that are now emerging (e.g. Huo & Székely 2016; Chaudhuri & Hu 2019).

3. Examples

3.1. Simulated SB1

In order to simulate spectra of an SB1, we used a synthetic solar-like spectrum (T_eff = 5800 K, [Fe/H] = 0.0, log g = 4.5) from the spectral library PHOENIX (Husser et al. 2013), at a spectral resolution of R = 10 000. We have simulated a simple sinusoidal RV curve (i.e. corresponding to a circular orbit), with a semi-amplitude of K = 10 km s⁻¹ and a period of seven days. We randomly drew 50 epochs from a uniform distribution on an interval of 100 days, and after shifting the spectrum according to the required RVs, we added to the spectra white Gaussian noise, at a signal-to-noise ratio (S/N) of 100¹. The wavelength range we have used for our simulations was 4900−5100 Å.
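The simulation recipe can be sketched as follows, with a single Gaussian absorption line standing in for the PHOENIX template (an assumption, as are the function names and the narrow wavelength window). The Doppler shift is applied non-relativistically by linear interpolation, and the noise level follows the S/N definition given in the footnote: continuum level, estimated by the 98th percentile of the flux, divided by the noise standard deviation.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def doppler_shift(wave, flux, rv_kms):
    """Resample a rest-frame spectrum as observed at radial velocity rv_kms."""
    # Flux observed at wavelength w originates from rest wavelength w / (1 + v/c).
    return np.interp(wave / (1.0 + rv_kms / C_KMS), wave, flux)

def add_noise(flux, snr, rng):
    """Add white Gaussian noise at the requested continuum S/N."""
    continuum = np.percentile(flux, 98)  # continuum estimate, as in the footnote
    return flux + rng.normal(0.0, continuum / snr, flux.shape)

rng = np.random.default_rng(42)
wave = np.linspace(4990.0, 5010.0, 2001)  # toy grid, 0.01 Angstrom steps
template = 1.0 - 0.6 * np.exp(-0.5 * ((wave - 5000.0) / 0.2) ** 2)

times = np.sort(rng.uniform(0.0, 100.0, 50))    # 50 random epochs over 100 days
rvs = 10.0 * np.sin(2.0 * np.pi * times / 7.0)  # circular orbit, K = 10 km/s, P = 7 d
spectra = np.array([add_noise(doppler_shift(wave, template, v), 100.0, rng)
                    for v in rvs])
```

Linear interpolation is the simplest choice here; a real pipeline would typically use a higher-order or flux-conserving resampling scheme.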

The common approach to analysing such data is to cross-correlate each observed spectrum against an assumed template and estimate the location of the cross-correlation peak. Figure 1 shows the resulting RV estimates thus obtained (using the PHOENIX spectrum as template). As is clearly evident from the figure, the high S/N we used in the simulation led to what seems to be a very smooth sinusoidal RV curve, with negligible scatter around the sinusoid. This very well-defined periodicity, combined with the relatively large number of samples, is also manifested in a very sharp and prominent peak in the GLS periodogram at a frequency of 1/7 d⁻¹ (Fig. 2, lower panel). Because the GLS is tailored to sinusoidal periodicities, we do not expect any other kind of periodogram to outperform the GLS in this case. Moreover, when we search for periodicity in the RV data, it means that we have already assumed that the spectroscopic variability is a Doppler-shift periodicity, and not, for example, line-profile variation.

Fig. 1.

Upper panel: estimated RV time series based on the simulated spectra of an SB1. Lower panel: RV time series phase-folded by the known seven-day period.

Nevertheless, it is illuminating to compare GLS to our newly introduced periodogram. The upper panel of Fig. 2 shows the resulting USuRPER periodogram. We recall that we did not extract RVs in order to obtain this periodogram, therefore it is very encouraging that USuRPER produced such a sharp peak at the correct period. The dashed line in the plot shows the threshold value corresponding to an FAP of 10⁻³, leaving little doubt concerning the significance of the detected periodicity.

Fig. 2.

GLS (lower panel) and USuRPER (upper panel) periodograms of the simulated SB1 whose RVs are presented in Fig. 1. The GLS power and the distance correlation values of USuRPER are both normalised and therefore unitless. The dashed line in the upper panel corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

This example is a very simple case, with many samples and a high S/N. It still serves as a kind of sanity check, and proves that this novel approach can indeed identify at least simple periodicities.

3.2. Simulated SB2

The case of SB2 is more challenging because the spectroscopic variability is not merely a simple Doppler shift. We have simulated SB2 data using two PHOENIX spectra. We used the same solar-like spectrum as in the SB1 above as the spectrum of the primary component of the binary. For the secondary we used a spectrum corresponding to T_eff = 5500 K, log g = 4.5, and [Fe/H] = 0.0. We shifted and blended the spectra, assuming a moderately eccentric (e = 0.3) seven-day Keplerian orbit. The orbital orientation was determined so that the maximum RV separation (K₁ + K₂) would be 10 km s⁻¹. In order to determine the individual semi-amplitudes K₁ and K₂, as well as the intensity ratio for combining the spectra, we used the masses and radii listed in PHOENIX, assuming the two stars are main-sequence stars. In total, we sampled the simulated orbit at 20 epochs, with an S/N of 30. Figure 3 presents the simulated primary and secondary RVs.

Fig. 3.

Upper panel: RV time series used in the SB2 simulation. Filled circles mark the primary RV, and empty triangles the RV of the secondary. Lower panel: same RV time-series phase-folded by the known seven-day period.

Figure 4 demonstrates the challenge in this specific SB2 case. We show in the figure two of the 20 spectra, at the largest and smallest RV separation. The figure focuses on the wavelength range 4955−4980 Å, which includes the Fraunhofer iron c-line, at 4959.0 Å (note that we did not convert PHOENIX spectra from vacuum to air wavelengths). The figure shows both components (with dashed blue and dotted red lines), and the composite noised spectrum (solid black line). For clarity we have introduced vertical offsets among the three spectra in each panel. The challenge is obvious: at a resolution of 10 000 and an S/N of 30, it is practically impossible to distinguish the two components. The main effect of the varying RV separation seems to be a minute change in the width and depth of the composite spectral lines.

Fig. 4.

Selected segment from two simulated SB2 spectra. The dashed blue lines represent the primary PHOENIX spectrum, and the dotted red lines represent the secondary. The solid black line is the combined and noised spectrum, with an S/N of 30, used for the simulation. The spectra are normalised to a continuum level of 1. For clarity, a vertical offset of 0.2 was introduced to separate the spectra. The upper panel shows the spectrum with the maximum RV separation, and the lower panel shows the spectrum with the smallest separation.

Figure 5 presents the resulting USuRPER periodogram. In spite of the challenge posed by the low resolution, relatively low S/N and small RV separation, the maximum is obtained at a clear peak around the correct frequency, safely above the 10⁻³-FAP threshold. The new periodogram appears to perform reasonably well in this quite challenging case as well.

Fig. 5.

USuRPER periodogram plot for the simulated SB2 case. The distance-correlation values of USuRPER are normalised and therefore unitless. The dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

3.3. Periodic temperature variability

In addition to the examples above, we wished to test whether USuRPER is indeed also sensitive to other types of spectroscopic periodicities, not merely those related to periodic Doppler shifts. The periodic expansion and contraction phases of pulsating stars cause periodic Doppler shifts, but are also accompanied by cooling and heating. We therefore decided to simulate such periodic temperature changes, using the PHOENIX library, without the Doppler shift, so that the spectral features that change periodically would not be easily describable in a simple manner like Doppler shifts.

We simulated a saw-tooth effective-temperature variability, with T_eff varying between 5000 K and 6000 K, and a period of seven days by a simple linear interpolation over the PHOENIX temperature grid. This is a rough approximation to typical T_eff variability of classical Cepheids (e.g. Andrievsky et al. 2005). We simulated 15 random epochs, again over an interval of 100 days, with an S/N of 30 (Fig. 6). Figure 7 focuses on a narrow wavelength range of 4952−4967 Å around the iron c-line and shows how the spectrum changes as a result of the variable effective temperature (without the added noise). The dashed yellow line represents the spectrum of the lowest temperature simulated (5043 K) and the dotted red line shows the highest temperature (5936 K). A spectrum of a temperature in the middle (5486 K) is also plotted with a blue line. The range of simulated temperatures is shaded in grey. The minute changes in the equivalent widths of the lines caused by the varying effective temperature are visible, without any bulk Doppler shift. Moreover, different lines behave quite differently, and might even exhibit different trends in equivalent width, as the temperature varies.
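The temperature curve and the grid interpolation can be sketched as below. Several details are assumptions, since the text does not specify them: the sawtooth is taken to rise linearly and then drop, the grid is taken to be spaced by 100 K, and toy arrays stand in for the PHOENIX spectra. The function names are hypothetical.

```python
import numpy as np

def sawtooth_teff(times, period, t_min=5000.0, t_max=6000.0):
    """Rising sawtooth T_eff curve (assumed shape) with the given period."""
    phase = np.mod(times, period) / period
    return t_min + (t_max - t_min) * phase

def interpolate_spectrum(teff, grid_teff, grid_flux):
    """Linearly interpolate between the two neighbouring grid spectra in T_eff."""
    hi = int(np.clip(np.searchsorted(grid_teff, teff), 1, len(grid_teff) - 1))
    lo = hi - 1
    w = (teff - grid_teff[lo]) / (grid_teff[hi] - grid_teff[lo])
    return (1.0 - w) * grid_flux[lo] + w * grid_flux[hi]

# Toy grid (assumption): 11 nodes from 5000 K to 6000 K, 3 pixels per "spectrum".
grid_teff = np.arange(5000.0, 6001.0, 100.0)
grid_flux = np.outer(grid_teff / 1000.0, np.ones(3))

rng = np.random.default_rng(7)
times = np.sort(rng.uniform(0.0, 100.0, 15))  # 15 random epochs over 100 days
teffs = sawtooth_teff(times, 7.0)
spectra = np.array([interpolate_spectrum(t, grid_teff, grid_flux) for t in teffs])
```

Because the epochs are drawn at random, the sampled temperatures generally fall short of the nominal extremes, which is consistent with the simulated range quoted in the text being slightly narrower than 5000−6000 K.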

Fig. 6.

Upper panel: effective-temperature time series used in the periodic temperature variability simulation. Lower panel: same effective-temperature time series phase-folded by the known seven-day period.

Fig. 7.

Selected segment of the simulated spectra with periodic T_eff variability, before adding noise. The dashed yellow line corresponds to the spectrum with the lowest temperature (5043 K) and the dotted red line to the highest temperature (5936 K). The solid blue line represents a temperature in the middle (5486 K). The shaded area represents the range between the spectra with the extreme temperatures.

Figure 8 shows the result of the USuRPER periodogram applied to this dataset. Despite the less favourable conditions (fewer samples than in the previous examples and a modest S/N), the peak at the correct period is evident and much higher than the 10⁻³-FAP threshold. This confirms that our novel periodogram also performs well when the periodicity is very different from simple Doppler shifts.

Fig. 8.

USuRPER periodogram plot for the simulated temperature periodicity case. The distance-correlation values of USuRPER are normalised and therefore unitless. The dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

It should be noted that in this specific simulation, a few lower spurious peaks appear to marginally cross the detection threshold as well. Their frequencies seem to be around half-harmonics of the simulated frequency. This might be a random finding, but it should be further explored. In any case, the correct peak is definitely much more significant.

3.4. Composite periodicity

After we have demonstrated that USuRPER is sensitive to both RV and temperature periodic variability, it is interesting to test how it performs when presented with a composite type of periodicity, such as periodic RV variability combined with periodic temperature variability, with different periods. To this end, we again simulated a set of 50 spectra. The simulated temperature variability resembled the one in Sect. 3.3, but with a period of five days, whereas the RV variability was a sinusoidal variability similar to that in Sect. 3.1, with a period of three days. White Gaussian noise was added at a level corresponding to an S/N of 100.

The two corresponding peaks, at frequencies 1/3 and 1/5 d⁻¹, are clearly seen in the USuRPER periodogram of these data, in Fig. 9. They are both safely higher than the 10⁻³ significance threshold, but they are not of the same prominence. This probably reflects the fact that the effects of temperature and RV periodicities, at the simulated amplitudes, do not have the same impact on the overall variability of the spectrum. Nevertheless, the presence of both peaks in the periodogram shows that they did not interfere destructively in a way that would make them disappear. This serves to show that USuRPER can also be used for cases of multiple periodicities. The case of a temperature periodicity combined with an RV periodicity of a different period can be encountered in cases of Cepheids in spectroscopic binary stars (Szabados et al. 2013), for instance.

Fig. 9.

USuRPER periodogram plot for the simulated composite temperature and RV periodicity case. The distance-correlation values of USuRPER are normalised and therefore unitless. The dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

3.5. HD 115226

In the previous examples we have applied USuRPER on sequences of simulated spectra, which were obviously far better behaved than real-life data. We therefore looked for a publicly available real-life time-resolved spectroscopy dataset exhibiting spectral variability, preferably of a different type from those of the previous examples. We finally decided to test USuRPER on observed spectra of a known rapidly oscillating Ap (roAp) star.

Broadly speaking, roAp stars are stars that exhibit very short-period photometric or RV variations, with periods of the order of minutes (e.g. Kurtz 1990). Ryabchikova et al. (2007) have further characterised the spectral variability of roAp stars by showing that absorption lines of some of the heavier chemical species (rare-element ions) perform periodic Doppler shifts, usually all with the same period, but not with the same amplitude or phase. This means that the overall spectrum shape changes periodically with a rather complicated pattern, which renders analysis by cross-correlation ineffective. Instead, the common approach is to analyse each individual line separately, measure its Doppler shift, and analyse its periodicity.

Kochukhov et al. (2008) have observed the roAp star HD 115226 using HARPS (Mayor et al. 2003). They obtained time-series spectroscopy of HD 115226 including 102 spectra during a time interval of 4.3 h, and performed a meticulous RV analysis of various absorption lines. The analysis yielded an estimated oscillation period of 10.87 ± 0.01 min.

We downloaded the 102 HARPS spectra, and applied USuRPER on this dataset. Based on Table 3 of Kochukhov et al. (2008), we restricted the wavelength range we analysed to 4900−5150 Å, where a few important Nd III lines are located. A wider wavelength range would have diluted the periodicity information because most of the spectral features in other wavelengths do not exhibit periodicity. Knowing that we searched for a phenomenon with a typical period of a few minutes, we ran USuRPER on a frequency range of 50−250 d⁻¹, corresponding to a period range of 5.76−28.8 min. Figure 10 shows the resulting periodogram.

Fig. 10.

USuRPER periodogram plot for the HARPS spectra of HD 115226. The vertical dashed line represents the known period of 10.87 min (Kochukhov et al. 2008). The distance-correlation values of USuRPER are normalised and therefore unitless. The horizontal dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

The obvious maximum is at a frequency of 132.3 d⁻¹, corresponding to a period of 10.88 min, in agreement with the period Kochukhov et al. had obtained by their individual-line analysis. A conventional error estimate for the period is difficult to obtain because it requires some modelling of the periodicity (e.g. Baliunas et al. 1994). However, some confidence interval can be estimated using the frequencies around the peak where the periodogram crosses the 10⁻³-FAP threshold. The resulting estimate of the period is 10.88 ± 0.28 min. The uncertainty is larger than the uncertainty reported by Kochukhov et al., but this is expected because they used a specific known model for the periodicity, and also included more absorption lines in their analysis. In any case, the two period estimates agree within their error bars. The peak is also well above the 10⁻³-FAP significance threshold. Interestingly, two additional twin peaks are clearly seen around the frequency 190 d⁻¹, but they barely reach the 10⁻³-FAP threshold, and are probably spurious.
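The text does not spell out the exact recipe for this interval; one possible reading is to take the contiguous range of trial frequencies around the peak that stays above the threshold, and convert its edges to periods. The sketch below illustrates that reading on a made-up Gaussian-shaped peak; the function name `peak_interval`, the toy periodogram and the toy threshold are assumptions and do not reproduce the HD 115226 numbers.

```python
import numpy as np

def peak_interval(freqs, power, threshold):
    """Frequencies (low, peak, high) of the contiguous above-threshold region at the maximum."""
    k = int(np.argmax(power))
    if power[k] <= threshold:
        raise ValueError("highest peak does not exceed the threshold")
    lo = k
    while lo > 0 and power[lo - 1] > threshold:
        lo -= 1
    hi = k
    while hi < len(freqs) - 1 and power[hi + 1] > threshold:
        hi += 1
    return freqs[lo], freqs[k], freqs[hi]

# Toy periodogram (assumption): one Gaussian-shaped peak on a 50-250 1/day grid.
freqs = np.linspace(50.0, 250.0, 2001)
power = np.exp(-0.5 * ((freqs - 132.3) / 3.0) ** 2)
f_lo, f_peak, f_hi = peak_interval(freqs, power, threshold=0.5)

MIN_PER_DAY = 1440.0
period_peak = MIN_PER_DAY / f_peak                       # minutes
period_range = (MIN_PER_DAY / f_hi, MIN_PER_DAY / f_lo)  # minutes
```

Because period is the reciprocal of frequency, the resulting interval is slightly asymmetric about the peak period; quoting a symmetric ± value is an approximation.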

4. Conclusion

The examples we have shown above attest to the wide potential of the USuRPER periodogram. We have shown that it performs well in cases of RV periodicities, composite SB2 spectra, and even complicated spectrum-shape patterns such as periodic temperature changes. We also demonstrated its performance in real-life cases of exotic variability such as roAp stars. We provide our Python implementation of USuRPER in the form of a public GitHub repository².

In order to estimate the significance of peaks in the USuRPER periodogram, simple bootstrap-like permutation tests can be performed in which the time stamps of the individual spectra would be repeatedly randomly shuffled, in order to obtain the null distribution of the distance-correlation values under the assumption of no dependence.

Because USuRPER does not provide any further information about the nature of the periodicity, except for the period and its significance, it is essentially useful as an exploratory tool. Once a prominent peak appears in the periodogram, further analysis is required in order to tell whether the observed object is a binary star (or exoplanet), a pulsating star, or maybe some other type of periodicity we did not encounter before.

An important application of USuRPER can be, for example, to use it in the analysis of the RVS, BP and RP spectra of Gaia (Gaia Collaboration 2016), or other large spectroscopic surveys with potentially multiple visits per object, for instance, APOGEE (Majewski et al. 2017) or LAMOST (Cui et al. 2012). Another interesting application might be the study of periodic stellar variability patterns that might interfere with the detection of exoplanets through minute Keplerian RV variations (Boisse et al. 2011).

The USuRPER periodogram offers a completely new approach to the study of astronomical spectra, one that may very well pave the way to new discoveries and insights, potentially ones that cannot be reached in any other way.

¹ The S/N definition we used was the ratio between the continuum flux level and the noise standard deviation. We estimated the continuum flux level by the 98th percentile of the flux values in the spectrum.
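For concreteness, this S/N definition can be turned into a noise level as in the short sketch below. The helper names `continuum_level` and `add_noise_for_snr`, the use of white Gaussian noise, and the toy spectrum are assumptions made for illustration; they are not taken from the simulation code.

```python
import numpy as np


def continuum_level(flux, percentile=98.0):
    """Continuum estimate: the 98th percentile of the flux values."""
    return float(np.percentile(flux, percentile))


def add_noise_for_snr(flux, snr, rng=None):
    """Add white Gaussian noise (an assumption) so that
    continuum level / noise standard deviation equals snr."""
    rng = np.random.default_rng() if rng is None else rng
    flux = np.asarray(flux, dtype=float)
    sigma = continuum_level(flux) / snr
    return flux + rng.normal(0.0, sigma, size=flux.shape), sigma


# Toy spectrum: flat continuum at 1.0 with one box-shaped absorption feature.
flux = np.ones(2000)
flux[900:1100] -= 0.5
noisy, sigma = add_noise_for_snr(flux, snr=50.0, rng=np.random.default_rng(1))
print(f"continuum = {continuum_level(flux):.2f}, noise sigma = {sigma:.3f}")
```

Using a high percentile rather than the maximum keeps the continuum estimate insensitive to a few outlying pixels while still ignoring the absorption lines.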

² USuRPER is available as part of the SPARTA package, at https://github.com/SPARTA-dev/SPARTA

Acknowledgments

We thank the anonymous referee for their wise comments that helped to improve the manuscript. We are grateful to Aviad Panahi for patiently examining our USuRPER code implementation, and to Dolev Bashi for reviewing and commenting on an early version of the manuscript. This research was supported by the ISRAEL SCIENCE FOUNDATION (grant No. 848/16). We also acknowledge partial support by the Ministry of Science, Technology and Space, Israel. The research is partly based on observations collected at the European Southern Observatory, La Silla, Chile (ESO program 079.D-0118). The analyses done for this paper made use of the code packages: Astropy (Astropy Collaboration 2013, 2018), NumPy (van der Walt et al. 2011), SciPy (Virtanen et al. 2020), and PyAstronomy (Czesla et al. 2019).

References

Aerts, C., De Pauw, M., & Waelkens, C. 1992, A&A, 266, 294
Andrievsky, S. M., Luck, R. E., & Kovtyukh, V. V. 2005, AJ, 130, 1880
Astropy Collaboration (Robitaille, T. P., et al.) 2013, A&A, 558, A33
Astropy Collaboration (Price-Whelan, A. M., et al.) 2018, AJ, 156, 123
Baliunas, S. L., Donahue, R. A., Soon, W. H., et al. 1994, ApJ, 438, 269
Boisse, I., Bouchy, F., Hébrard, G., et al. 2011, A&A, 528, A4
Chaudhuri, A., & Hu, W. 2019, Comput. Stat. Data Anal., 135, 15
Cui, X.-Q., Zhao, Y.-H., Chu, Y.-Q., et al. 2012, Res. Astron. Astrophys., 12, 1197
Czesla, S., Schröter, S., Schneider, C. P., et al. 2019, Astrophysics Source Code Library [record ascl:1906.010]
Ferraz-Mello, S. 1981, AJ, 86, 619
Gaia Collaboration (Brown, A. G. A., et al.) 2016, A&A, 595, A1
Huo, X., & Székely, G. J. 2016, Technometrics, 58, 435
Husser, T.-O., Wende-von Berg, S., Dreizler, S., et al. 2013, A&A, 553, A6
Kochukhov, O., Ryabchikova, T., Bagnulo, S., & Lo Curto, G. 2008, A&A, 479, L29
Kurtz, D. W. 1990, ARA&A, 28, 607
Lyons, R. 2013, Ann. Probab., 41, 3284
Majewski, S. R., Schiavon, R. P., Frinchaboy, P. M., et al. 2017, AJ, 154, 94
Mayor, M., Pepe, F., Queloz, D., et al. 2003, The Messenger, 114, 207
Ryabchikova, T., Sachkov, M., Kochukov, M., & Lyashko, D. 2007, A&A, 473, 907
Simon, K. P., & Sturm, E. 1994, A&A, 281, 286
Szabados, L., Derekas, A., Kiss, L. L., et al. 2013, MNRAS, 430, 2018
Székely, G. J., & Rizzo, M. L. 2013, J. Multivar. Anal., 117, 193
Székely, G. J., & Rizzo, M. L. 2014, Ann. Stat., 42, 2382
Székely, G. J., Rizzo, M. L., & Bakirov, N. K. 2007, Ann. Stat., 35, 2769
Tonry, J., & Davis, M. 1979, AJ, 84, 1511
van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, Comput. Sci. Eng., 13, 22
Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, Nat. Meth., 17, 261
Zechmeister, M., & Kürster, M. 2009, A&A, 496, 577
Zinger, A. A., Kakosyan, A. V., & Klebanov, L. B. 1992, J. Sov. Math., 59, 914
Zucker, S. 2018, MNRAS, 474, L86
Zucker, S. 2019, MNRAS, 484, L14
Zucker, S., & Mazeh, T. 1994, ApJ, 420, 806

All Figures

Fig. 1. Upper panel: estimated RV time series based on the simulated spectra of an SB1. Lower panel: RV time series phase-folded by the known seven-day period.

Fig. 2. GLS (lower panel) and USuRPER (upper panel) periodograms of the simulated SB1 whose RVs are presented in Fig. 1. The GLS power and the distance-correlation values of USuRPER are both normalised and therefore unitless. The dashed line in the upper panel corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

Fig. 3. Upper panel: RV time series used in the SB2 simulation. Filled circles mark the primary RV, and empty triangles the RV of the secondary. Lower panel: same RV time series phase-folded by the known seven-day period.

Fig. 4.

Fig. 5. USuRPER periodogram plot for the simulated SB2 case. The distance-correlation values of USuRPER are normalised and therefore unitless. The dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

Fig. 6. Upper panel: effective-temperature time series used in the periodic temperature variability simulation. Lower panel: same effective-temperature time series phase-folded by the known seven-day period.

Fig. 7.

Fig. 8. USuRPER periodogram plot for the simulated temperature periodicity case. The distance-correlation values of USuRPER are normalised and therefore unitless. The dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

Fig. 9. USuRPER periodogram plot for the simulated composite temperature and RV periodicity case. The distance-correlation values of USuRPER are normalised and therefore unitless. The dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

Fig. 10. USuRPER periodogram plot for the HARPS spectra of HD 115226. The vertical dashed line represents the known period of 10.87 min (Kochukhov et al. 2008). The distance-correlation values of USuRPER are normalised and therefore unitless. The horizontal dashed line corresponds to an FAP level of 10⁻³, obtained by the permutation test procedure.

