A&A 386, 763-774 (2002)
DOI: 10.1051/0004-6361:20020258

String/Rope length methods using the Lafler-Kinman statistic

D. Clarke

Department of Physics and Astronomy, University of Glasgow, Glasgow G12 8QQ, Scotland, UK

Received 17 December 2001 / Accepted 18 February 2002

The original Lafler-Kinman statistic for exploring any auto-correlation within a set of measurements relative to their underlying variance has been regularized so that its determination is independent of the data sample size. In its new form, the power of its application to String-Length period searches (SLLK) has been assessed in terms of establishing confidence levels to point value detections within any generated periodogram and to confidence levels of not missing the detection when an underlying period is present. These estimations depend only on the amplitude of the variation relative to the measurement noise and are independent of the signal-to-noise ratio of the measurements and of their number. Examples of the behaviour of periodograms based on SLLK as produced from computer generated data and real data are discussed. It is also demonstrated that the principle can be readily extended to multivariate data in the form of Rope-Length period searches (RLLK), with the measurements of each parameter not necessarily being taken simultaneously nor with equal number. Using simulated data it is shown that the power of period detection improves slightly if the underlying modulations in each parameter are out of phase with each other. Examples of the RLLK principle are given for computer simulated data and for stellar multi-colour photometric and polarimetric measurements.

Key words: methods: statistical

1 Introduction

The Lafler-Kinman (hereafter LK) statistic was originally presented in connection with the determination of periods of RR Lyrae stars from small samples of single passband magnitude measurements (see Lafler & Kinman 1965). Reference to its later use in general stellar photometric variability may be found in a range of papers including IBVS notes (see, for example, Rosenweig 1976; Bell et al. 1983; Waugh 1984; Boyd et al. 1985). The statistic has quite general application beyond simple photometry and has been applied, for example, by Walborn & Nichols (1994) to UV line variations associated with stellar wind variability.

The LK period search statistic may be classed as a non-parametric method. For each trial period, $\,P$, (or frequency, $\,\nu$) taken from a grid, $P_1,\Delta P, P_2$ (or $\nu_1, \Delta \nu, \nu_2$), the original data, $m_1\dots m_j\dots m_N$, are assigned phases, $\phi_1\dots \phi_j\dots \phi_N$, which are then re-ordered in ascending sequence. The re-ordered data, $m_1\dots m_i\dots m_N$, (note the change in subscript from "$\,j\,$'' to "$\,i\,$'') may be examined by inspection across the full phase interval, 0.0 to 1.0. For each period, the LK statistic performs a "String-Length'' (S-L) summation of the squares of the differences between the consecutive phase re-ordered measurement values. The variation of S-L with period (or frequency) may be considered as providing a periodogram, SLLK(P), from which any well defined minimum can be considered as corresponding to the underlying period. Alternative S-L methods include those of Renson (1978) and Dworetsky (1983).

Inspection of the literature citing use of the LK principle generally shows that periods are promoted without any indication of the certainty of detection or with confidence intervals (errors) assigned to the determined period. Some references suggest that a variant of the LK statistic has been applied but without its description. For example, a basic period search may be undertaken by calculating the statistic's numerator (see Eq. (1)), without incorporating the denominator's normalizing factor.

In this work, the LK method is more firmly established with a clear recipe for its application. Its form is revised so that the estimation of the statistic is freed from data sample size bias. The power of the statistic, in terms of its abilities to detect a period and for any period not to be overlooked, is explored according to the number of available measurements, $\,N$, and their signal-to-noise ratio. The LK statistic is also considered as a means of investigating the nature of the temporal assembly of data which, without the time element, may simply appear to be distributed Normally.

The statistic is developed to provide search algorithms for application to multivariate data. This latter exercise involves the combination of S-L calculations and might be described as a "Rope-Length'' (R-L) method giving rise to periodograms assigned the abbreviation RLLK(Z,P), where $\,Z$ corresponds to the number of measured parameters. In the first instance the R-L methods are considered for data involving simultaneously measured parameters. The method is then generalized to show that it can be applied to multivariate data sets with parameters measured at independent times, so illustrating how SLLK periodograms obtained from observations of different parameters are readily combined.

2 The SLLK statistic

The original LK statistic, $\,\Theta(P)\,$, is based on the determination of the sum of the squares of the vector lengths (S-Ls) required to connect re-ordered measurements, $m_i$, in phase sequence, $\,\phi_i$, for each of the trial periods in the prescribed grid. The essence of the original formulation may be expressed in the form of a statistic written as:

\begin{displaymath}\Theta(P) = {\sum\limits_{i=1}^N\bigl[\bigl(m_{i+1}-m_i\bigr)^2\bigr]\over{\sum\limits_{i=1}^N\bigl[\bigl(m_i-\overline m\bigr)^2\bigr]}}
\end{displaymath} (1)

where $\,\overline m\,$ is the mean value of the measurements.

Although not mentioned by LK, it may be noted that full utilization of the data is made by including the vector length between the last and first measurement of the re-ordered sequence by letting $\,m_{N+1}=m_1\,$, in the above summation. By using the squares of the vector lengths in the summation, both the upswings and downswings between the adjacent re-ordered data make a positive contribution to the statistic so that it does not converge to zero as $\,N$ increases. As expressed in Eq. (1), the normalizing denominator is $\,N\times\,$the variance of the measurements. By applying this factor the statistic's value becomes independent of the measurement noise. Although it has no consequence on the determination of the minima positions in the S-L periodogram, this factor produces regularisation of the periodogram continuum levels with a scale allowing standard confidence levels of detection to be applied to any suspected period, no matter the signal-to-noise ratio (S/N) of the data; Horne & Baliunas (1986) showed the importance of doing this in their refinement of a period searching method involving Fourier analysis.

By expanding both the numerator and denominator of the statistic, it is readily shown that

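\begin{displaymath}\Theta(P) = {2\times\Bigl(\sum\limits_{i=1}^N\bigl[\bigl(m_i\bigr)^2\bigr] - \sum\limits_{i=1}^N\bigl[\bigl(m_{i}m_{i+1}\bigr)\bigr]\Bigr)\over{\sum\limits_{i=1}^N\bigl[\bigl(m_i\bigr)^2\bigr] - N\overline{m}^2}} \ \cdot
\end{displaymath} (2)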

Again, it may be noted that the first element of the numerator, $\,\sum\limits_{i=1}^N \bigl[\bigl(m_i\bigr)^2\bigr]\,$, is independent of $\,P$ and needs to be calculated just once for the exercise, and that only $\,\sum\limits_{i=1}^N \bigl[\bigl(m_{i}m_{i+1}\bigr)\bigr]\,$ requires determination for each of the periods in the grid. It is this element in the statistic that explores a kind of correlation value or co-variance between adjacent measurement values in the successively re-ordered data. If oscillatory behaviour is not present, none of the examined periods will provide a correlation, and the summation of the products of the phase adjacent measurements converges to $\,N\overline{m}^2\,$ for large $\,N$. Thus, in the limit, the values of $\,\Theta(P)\,$ are equal to 2.0 (see Eq. (2)). As $\,N\,$ is not infinite, any periodogram based on $\Theta(P)$ will fluctuate about a mean value $\approx$2.0.
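As an illustrative check that is not part of the original analysis, the equivalence of the squared-difference form and the expanded form of $\,\Theta(P)$ can be verified numerically. The Python sketch below uses hypothetical function names (`theta_eq1`, `theta_eq2`) and assumes that the measurements are already in phase order, with the closing link $\,m_{N+1}=m_1\,$ included. It also indicates how the $\,P$-independent sum of squares might be computed once and re-used for every trial period.

```python
import random


def theta_eq1(ordered):
    """Original LK statistic from squared differences, as in Eq. (1)."""
    n = len(ordered)
    mean = sum(ordered) / n
    # (i + 1) % n closes the loop, i.e. m_{N+1} = m_1.
    numerator = sum((ordered[(i + 1) % n] - ordered[i]) ** 2 for i in range(n))
    denominator = sum((m - mean) ** 2 for m in ordered)
    return numerator / denominator


def theta_eq2(ordered, sum_squares):
    """Expanded form, as in Eq. (2); sum_squares does not depend on the order."""
    n = len(ordered)
    mean = sum(ordered) / n
    # Only this cross-product sum changes when the data are re-ordered.
    cross = sum(ordered[i] * ordered[(i + 1) % n] for i in range(n))
    return 2.0 * (sum_squares - cross) / (sum_squares - n * mean ** 2)


rng = random.Random(1)  # fixed seed, for a repeatable illustration only
data = [rng.gauss(10.0, 1.0) for _ in range(30)]
sum_squares = sum(m * m for m in data)  # computed once for all trial periods
print(theta_eq1(data), theta_eq2(data, sum_squares))
```

The two printed values should agree to within rounding error; for data with a mean level very large compared with the scatter, the expanded form may lose precision through cancellation, so the squared-difference form is probably the safer choice in practice.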

Trial applications of the LK statistic, as calculated by Eq. (1), show that the mean periodogram levels increase slightly to be above 2.0 as $\,N\,$ reduces, resulting from the fact that the denominator relates to the `true' variance of the data. This problem is addressed by the introduction of a term as in the definition of the "sample'' variance. By scaling the LK statistic by the factor $\,(N-1)/N\,$, sample-size bias is removed. In addition, to generate periodograms with continuum levels of unity, a further normalizing factor of 1/2 needs to be applied. A regularized formula for applying the LK principle leading to SLLK(P) periodograms may thus best be written as

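\begin{displaymath}T(P) = {\sum\limits_{i=1}^N\bigl[\bigl(m_{i+1}-m_i\bigr)^2\bigr]\over{\sum\limits_{i=1}^N\bigl[\bigl(m_i-\overline m\bigr)^2\bigr]}}\times{(N-1)\over{2N}} \ \cdot
\end{displaymath} (3)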

If the data contain periodicity, $\,T(P)\,$ should achieve minimum value at the underlying period, $\,P_0\,$, within the fluctuations across the periodogram with mean level of $\approx$1.0. For $\,P_0\,$, the path through the component vectors in the corresponding phase/measurement diagram will display a relatively smooth undulating pattern, with the cycle occupying the full phase window. For any suspected period, it is always sensible practice to construct this diagram for inspection.
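The recipe for a single trial period can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the code used for this work: the function name `sllk` and the test data are hypothetical, phases are obtained by folding the observation times on the trial period, and the loop is closed by connecting the last re-ordered measurement to the first.

```python
import math
import random


def sllk(times, values, period):
    """Regularized LK string-length statistic T(P), as in Eq. (3)."""
    n = len(values)
    # Fold the observation times on the trial period: phases in [0, 1).
    phases = [(t / period) % 1.0 for t in times]
    # Re-order the measurements in ascending phase.
    ordered = [m for _, m in sorted(zip(phases, values))]
    mean = sum(ordered) / n
    # Squared differences between phase-adjacent values, with m_{N+1} = m_1.
    numerator = sum((ordered[(i + 1) % n] - ordered[i]) ** 2 for i in range(n))
    denominator = sum((m - mean) ** 2 for m in ordered)
    return numerator / denominator * (n - 1) / (2.0 * n)


rng = random.Random(2)  # fixed seed, for a repeatable illustration only
true_period = 2.5       # hypothetical period of a noise-free test signal
times = [rng.uniform(0.0, 50.0) for _ in range(40)]
values = [math.sin(2.0 * math.pi * t / true_period) for t in times]
print(sllk(times, values, true_period), sllk(times, values, 3.1))
```

For this noise-free example the value at the true period should be much smaller than the value at the mismatched period, which is expected to lie near unity.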

3 The behaviour of SLLK(P)

3.1 The statistics of point values

Various exercises were established to explore the behaviour of periodograms produced by the T(P) statistic and to examine its power to detect periodicity. For the simulations, data collection was mimicked by computer generation of N values from an underlying sinusoidal signal, the simulated measurements being represented by

 \begin{displaymath}m_j=A + B\sin(2\pi t_j/P_0) + \eta_j
\end{displaymath} (4)

where A is the mean level of the signal, B the amplitude of the variation, $t_j$ the time of measurement and $\,\eta_j$ a noise value which may equally well be positive or negative. By using a NAG (Numerical Algorithms Group) routine, the $\,N\,$ values of $\,t_j/P_0\,$ were generated randomly within the range 0 to 1, corresponding to the signal's phase, $\,\phi_j$, so that the argument of the sine function was selected randomly from a uniform distribution between 0 and $2\pi$. This routine was seeded at onset so that the selection was non-repeatable for each running of the program. For each generated phase value, the corresponding $\,m_j$ was calculated according to the input values of $\,A$, $\,B\,$ and the function chosen to mimic noise. Each basic measurement was passed to a second NAG routine providing a Gaussian distribution of values with mean value $A + B\sin(2\pi\phi_j)$ and variance associated with photon counting statistics (i.e., $\,\sigma^2=A + B\sin(2\pi\phi_j)$); random selection of a single value from this distribution provided the value of $\,m_j$. Although the overall procedure does not exactly replicate the real situation whereby data are collected in bunches of unequal number, spaced by integral intervals of 24 hours, over a total time window of many periods, the scheme is sufficient to explore the statistical behaviour of $\,T(P)$, under the notion that any data are considered as providing random phase values over the window 0 to 1 when folded on any trial period.
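A data generator of this kind may be sketched as follows. Python's standard `random` module stands in here for the NAG routines, which is an assumption of the sketch; the function name `simulate_measurements` is hypothetical, and a fixed seed is used so that the illustration is repeatable, unlike the non-repeatable seeding described above. The sketch requires $\,A>B\,$ so that the photon-counting variance remains positive.

```python
import math
import random


def simulate_measurements(n, mean_level, amplitude, rng):
    """Return (phases, values) for n noisy samples of a sinusoid, as in Eq. (4)."""
    phases, values = [], []
    for _ in range(n):
        phi = rng.random()  # random phase t_j / P_0 in [0, 1)
        signal = mean_level + amplitude * math.sin(2.0 * math.pi * phi)
        sigma = math.sqrt(signal)  # photon-counting noise: variance = signal
        phases.append(phi)
        values.append(rng.gauss(signal, sigma))
    return phases, values


rng = random.Random(0)  # fixed seed, for a repeatable illustration only
# A = 10000 and B = 300 correspond to B / sqrt(A) = 3.
phases, values = simulate_measurements(20, 10000.0, 300.0, rng)
print(phases[:3], values[:3])
```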

In this analysis, homogeneous data are assumed with the values carrying the same level of noise, i.e., the S/N ratio of the measurements is constant. As the $\,T(P)\,$ statistic is normalized with respect to the sample variance of measurements, its power of period detection may be simply explored according to the ratio, $\,X$, of the underlying amplitude, $\,B\,$, to the measurement noise. For data limited in accuracy by photon counting statistics, the measurement noise may be simply expressed as being the square root of the signal. Hence the value of $\,X$ may be expressed as

 \begin{displaymath}X={B\over{[A + B\sin(2\pi t_j/P_0)]^{1/2}}} \approx {B\over{A^{1/2}}}
\end{displaymath} (5)

when $\,A\gg B\,$.

The range of mimicked data sets investigated covered values of N from 5 to 100 with values of X from 0.5 to 10. For any given value of X and N, checks showed that the numerical behaviour of $\,T(P)$ is independent of the signal level and of the S/N ratio of the basic measurements, so confirming the efficacy of the normalizing procedure defining $\,T(P)$. The statistical behaviour of $\,T(P)$ when no periodicity is present was investigated simply by letting $\,B=0.0$. For this case the mean level through the periodograms, $\overline{T(P)}\,$, was found to be $\approx$1.0 for all N, as expected. For data involving small $\,X\,$, the levels of all periodograms were close to unity, a very small departure resulting from the signal oscillation affecting an otherwise Normal distribution of measurements.

From the simulated data produced in the way above, two values for the S-L were determined. Firstly, calculations of $T(P)\,$ were obtained directly, with the various phase values generated in random order, these effectively being representative of any trial mismatched period. For each combination of N and X, the procedure was performed 2000 times. By re-ordering the results in ascending sequence, again by a NAG routine, a normalized cumulative distribution function (CDF) for T(P) was generated. Figure 1 provides examples of such CDFs for three different values of N, for simulated data with B=0.0. It can be seen that the spread in values of T(P) decreases with N, i.e., the noise of the periodogram reduces with N, as might be expected. Corresponding to the 10th, 50th and 100th points in the CDF, the values of $\,T(P)\,$ at the lower 1%, 5% and 10% quantiles may be read, so providing boundaries which any spot value must fall below for a period detection at confidence levels of 99%, 95%, 90% respectively, these being written as $\,T^{[99]}(P)$, $\,T^{[95]}(P)$ and $\,T^{[90]}(P)$.

Figure 1: The three symmetric normalized CDFs to the right correspond to data with no oscillatory signal present (X=0); the crossovers for all $\,N$ occur at the value of T(P)=1.0. The spread in the T(P) values at the low and high tails describes the noise of the associated periodogram which, as expected, decreases with $\,N$. The $\,T^{[90]}(P)$ and $\,T^{[95]}(P)$ levels are marked and their values may be determined for any CDF. The three asymmetric curves in the left of the diagram [smaller values of T(P)] provide examples of $\,T(P_0)$ CDFs for $N=10,\,20\ {\rm and}\ 50$ with X=1.0; the upper tail percentile levels of $\,T^{[95]}(P_0)$ and $\,T^{[90]}(P_0)$ are marked.

Secondly, for each of the $\,N,X\,$ combinations, the data were ordered in ascending phase from 0 to 1, the determinations of $\,T(P)\,$ now taking minimum values as though $\,P\,$ is selected as $\,P_0\,$. Again 2000 determinations from each cycle were re-ordered in ascending sequence to provide CDFs for $\,T(P_0)\,$ (see Fig. 1). In the assessment of the power not to miss a period detection as a result of the way a particular data sample has been assembled, the behaviour of the upper tail of the distribution of $\,T(P_0)\,$ is important and the 90%, 95% and 99% percentiles were selected in this zone, these corresponding to the 1901st, 1951st and 1991st points and being written as $\,T^{[90]}(P_0)$, $\,T^{[95]}(P_0)$ and $\,T^{[99]}(P_0)$.

The whole procedure above was undertaken 30 times to confirm the stability of all the determined elements from the CDFs of $\,T(P)\,$ and $\,T(P_0)$; overall means were obtained for $\,\overline{T(P)}\,$ and $\,\overline {T(P_0)}\,$ and for the various defined percentiles.
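The first of the two exercises, i.e., the behaviour of $\,T(P)$ for mismatched periods with $\,B=0.0$, can be imitated with a short Monte Carlo sketch. The function names are hypothetical and unit-variance Gaussian noise is assumed; this is permissible for $\,B=0.0$ because $\,T(P)$ is unchanged by a shift or re-scaling of the measurements. The indexing used to read off a quantile is a choice made for the sketch and is not taken from the original programs.

```python
import random


def sllk_ordered(ordered):
    """T(P) of Eq. (3) for measurements already arranged in phase order."""
    n = len(ordered)
    mean = sum(ordered) / n
    numerator = sum((ordered[(i + 1) % n] - ordered[i]) ** 2 for i in range(n))
    denominator = sum((m - mean) ** 2 for m in ordered)
    return numerator / denominator * (n - 1) / (2.0 * n)


def null_distribution(n, trials, rng):
    """Sorted T(P) values for pure-noise data taken in random order."""
    results = [
        sllk_ordered([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]
    results.sort()  # ascending order gives the empirical CDF
    return results


def quantile(sorted_values, fraction):
    """Value lying the given fraction of the way through the sorted sample."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


rng = random.Random(42)  # fixed seed, for a repeatable illustration only
for n in (10, 20, 50):
    cdf = null_distribution(n, 2000, rng)
    print(n, quantile(cdf, 0.01), quantile(cdf, 0.05), quantile(cdf, 0.50))
```

The printed lower quantiles should move towards 1.0 as $\,N$ grows, reflecting the reduction of the periodogram noise.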

The determined mean values $\,\overline {T(P_0)}\,$ according to the $\,N, X$ values correspond to the S-Ls obtained by connecting the data when the trial period matches the underlying value. Their behaviour and the associated distributions from which they are determined also offer information on the expectation to detect any underlying period. As mentioned above, if a trial value of $\,P\,$ provides an S-L smaller than those associated with $\,T^{[90]}(P)$, $\ T^{[95]}(P)$ and $\,T^{[99]}(P)$, then this period may be considered as being detected at the corresponding confidence level. Requirements of a data sample in terms of the number of measurements needed for detection of a period at a selected confidence level may be readily assessed by plotting $\,\overline {T(P_0)}\,$ against $\,N\,$ together with various percentile values $\,T^{[\%]}(P)$ and noting the crossover positions (see the example in Fig. 2).

Figure 2: An example of the investigation of the behaviour of $\,T(P_0)$ according to the number of measurements, $\,N$, is presented for the situation of an underlying sinusoidal variation with $\,X=3.0\,$. The behaviour of $\,\overline {T(P_0)}$ displays a smooth fall with $\,N$, as do the curves for $\,T^{[95]}(P_0)$ and $\,T^{[99]}(P_0)$. For reference, the behaviour of $\,T(P)$ with $\,B=0.0$ is displayed. The value of $\,\overline {T(P)}$ is 1.0, independent of $\,N$. Curves are drawn for the lower percentile values $\,T^{[99]}(P)$ and $\,T^{[95]}(P)$, these displaying minima at some low value of $\,N$ (<10). As examples of the interpretation of the crossovers of curves within the figure, it can be seen that $\sim $12 data points are required to obtain detection with a 99% confidence level with a 50% chance of making the detection; $\sim $24 measurements are required to obtain the same level of confidence but with a 99% certainty of making the detection.


Table 1: For a range of $\,X$ values from 0.5 to 10 [column 1] and for each of the lower percentiles $\,T^{[99]}(P)\,$, $\,T^{[95]}(P)\,$ and $\,T^{[90]}(P)\,$ of the noise associated with a periodogram, the values of $\,N\,$ and $\,\overline {T(P_0)}\,$ at the cross-over points are listed [columns 2:3, 6:7 and 10:11]. These provide guides in terms of the number of measurements required for the detection of a period according to the value of $\,X$. Similar cross-over values of $\,N\,$ and $\,T^{[99]}(P_0)$ are also provided [columns 4:5, 8:9, 12:13], these providing guides as to the measurement number requirements for a period not to be missed from any data sample collected at random.

     | $T^{[99]}(P)$ (columns 2:5)                         | $T^{[95]}(P)$ (columns 6:9)                         | $T^{[90]}(P)$ (columns 10:13)
X    | N    | $\overline{T(P_0)}$ | N    | $T^{[99]}(P_0)$ | N    | $\overline{T(P_0)}$ | N    | $T^{[99]}(P_0)$ | N    | $\overline{T(P_0)}$ | N    | $T^{[99]}(P_0)$
0.5  | >100 | --   | >100 | --   | >100 | --   | >100 | --   | >100 | --   | >100 | --
1.0  | 59   | 0.68 | >100 | --   | 35   | 0.69 | >100 | --   | 26   | 0.70 | >100 | --
2.0  | 16   | 0.44 | 44   | 0.63 | 12   | 0.51 | 36   | 0.69 | 10   | 0.57 | 30   | 0.72
2.5  | 14   | 0.41 | 30   | 0.56 | 10   | 0.48 | 24   | 0.62 | 9    | 0.55 | 21   | 0.66
3.0  | 12   | 0.38 | 24   | 0.52 | 10   | 0.48 | 19   | 0.59 | 8    | 0.53 | 17   | 0.63
4.0  | 11   | 0.36 | 18   | 0.45 | 9    | 0.46 | 15   | 0.54 | 8    | 0.52 | 14   | 0.60
5.0  | 11   | 0.35 | 16   | 0.44 | 9    | 0.45 | 14   | 0.53 | 8    | 0.51 | 13   | 0.59
10.0 | 10   | 0.34 | 14   | 0.39 | 8    | 0.44 | 12   | 0.50 | 7    | 0.50 | 11   | 0.56

Any S-L for a period $\,P_0$ using $\,N\,$ data points, may be considered to have arisen from a distribution of values with a mean of $\,\overline {T(P_0)}\,$. It will therefore be appreciated that any calculated S-L value matching the above defined crossover points corresponds to a $50\%$ probability of detecting the underlying period from random data sampling patterns. Even if a period of $\,P_0\,$ is present, the sampling for some particular data sets may produce an S-L value at $\,P_0\,$ which happens to be larger than the mean of its underlying distribution. With such a higher value, it might be embedded in the noise of the periodogram and the period would not be detected. Thus the power of the estimator not to miss detection of a period might be more realistically assessed from the crossovers of the low end tail of the distribution of $\,T(P)\,$ with those of the high end tail of the distribution of $\,T(P_0)\,$. It was for this reason that the 99%, 95% and 90% percentiles of the upper tail of the CDF of $\,T(P_0)\,$ were high-lighted in the exercise. By determining $\,N\,$ for which the various selected percentile values associated with the larger values of $\,T(P_0)\,$ are smaller than those of the selected percentiles of the smaller values of $\,T(P)$, the probability of not missing an underlying period and detecting it at some particular level of confidence may be estimated according to the data sample size.
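The probability of not missing a period can be estimated directly by counting how often $\,T(P_0)$ falls below a chosen lower percentile of $\,T(P)$. The following is a minimal sketch, assuming constant unit-variance Gaussian noise (the $\,A\gg B\,$ limit), so that the sinusoidal amplitude is numerically equal to $\,X$; the function names and the way the threshold index is chosen are hypothetical.

```python
import math
import random


def sllk_ordered(ordered):
    """T(P) of Eq. (3) for measurements already arranged in phase order."""
    n = len(ordered)
    mean = sum(ordered) / n
    numerator = sum((ordered[(i + 1) % n] - ordered[i]) ** 2 for i in range(n))
    denominator = sum((m - mean) ** 2 for m in ordered)
    return numerator / denominator * (n - 1) / (2.0 * n)


def detection_power(n, x, confidence, trials, rng):
    """Return (noise threshold, fraction of T(P_0) values falling below it)."""
    # Lower-tail threshold from pure-noise data taken in random order (X = 0).
    null = sorted(
        sllk_ordered([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    threshold = null[int((1.0 - confidence) * trials)]
    hits = 0
    for _ in range(trials):
        # Data folded on the true period: sorted random phases, unit noise.
        phases = sorted(rng.random() for _ in range(n))
        values = [
            x * math.sin(2.0 * math.pi * p) + rng.gauss(0.0, 1.0) for p in phases
        ]
        if sllk_ordered(values) < threshold:
            hits += 1
    return threshold, hits / trials


rng = random.Random(7)  # fixed seed, for a repeatable illustration only
for n in (10, 15, 25):
    print(n, detection_power(n, 3.0, 0.99, 1000, rng))
```

With $\,X=3.0$, the estimated fraction should rise towards unity as $\,N$ increases, in line with the crossover behaviour described in the text.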

An example of the interpretation of an investigation of the behaviour of $\,T(P)\,[N,\,X=0.0]$, $\,T(P_0)\,[N,\,X=3.0]$ is presented in Fig. 2. As expected, the mean values of un-ordered data, $\overline{T(P)}\,$, forming any periodogram are constant with a value of 1.0. The figure shows the behaviour of the $\,T^{[95]}(P)$ and $\,T^{[99]}(P)$ values, corresponding to lower excursions within the generated periodogram, with minima between N=5 and 8; for larger data samples, the curves converge smoothly towards the level of 1.0, illustrating the reduction of the periodogram noise as $\,N\,$ increases. The curves reflecting the behaviour of $\,\overline {T(P_0)}$, $\ T^{[95]}(P_0)$ and $\,T^{[99]}(P_0)$ all exhibit a near exponential fall, indicating the improved period detectivity as $\,N\,$ increases.

From Fig. 2, inspection of $\,T^{[99]}(P)\,[N,X=3.0]$ shows that for any period to be detected with a confidence level of 99%, the value of $\,T(P)\,$ must be smaller than 0.35 for $\,N=10$ and smaller than 0.53 for $\,N=25$. The variation of the mean values $\,\overline {T(P_0)}$ shows a crossover with $\,T^{[99]}(P)$ at $\,N\sim 12$, this giving a criterion for the number of data points required for a 50% chance of detection at a 99% confidence level. Inspection of the crossover of $\,T^{[99]}(P)$ with the curves corresponding to $\,T^{[95]}(P_0)$ and $\,T^{[99]}(P_0)$ shows that for a detection with 99% confidence, the 95% and 99% chances of succeeding require $\,N$ to be >19 and 24 respectively.

Similar diagrams for other values of $\,X\,$ show the curves for $\,T^{[95]}(P_0)\,[N]$ and $\,T^{[99]}(P_0)\,[N]$ are sensitive to the ratio, $\,X$, of the sinusoidal amplitude to the noise of the basic measurements. When $\,X$ is increased to 5, the number of points required for detection reduces to $\sim $11 at the 50% probability level of detection and to 16 with a 99% probability. A summary of the cross-over points of $\,T(P)\,[N,\,X=0.0]$ at the 90%, 95% and 99% lower levels with $\,\overline {T(P_0)}$ for $\,X\,$ values from 0.5 to 10 is presented in Table 1. Also provided are the number of required measurements to ensure at the listed 99% confidence levels that any period would not be missed from any random data collection.

The above study of $\,T(P)$ provides information on the statistical behaviour of point values in the periodogram. Although it does not offer a definitive recipe for fully assessing the behaviour over the interval containing an identifiable period, the results give insight on confidence levels of the detection. If no period is detected, a point value assessment can be applied to estimate the amplitude level that would have been confidently detected from the $\,N\,$ values of the data set.

Point values may, however, be applied directly to study the way any data have been assembled. For example, in the first instance, some data, without the time element, may suggest that they are part of some Normal distribution. By applying the SLLK statistic, the value of $\,T\,$ should be close to unity. If the value is smaller, its departure from 1.0 may be tested for its statistical significance to explore whether the data form a correlated time sequence and are not simply representative of measurements taken at random from the underlying Normal distribution. Such an exercise was recently applied by Oskinova et al. (2001) to X-ray studies of hot stellar winds.

3.2 The behaviour of periodograms

The analysis above concentrates on the statistical behaviour of spot values of $\,T(P)$ making up the periodograms. Perhaps of more relevance is the behaviour of the periodogram itself, particularly over the zone which includes any underlying period.

Again, the same computer programs were used to produce artificial data with the addition that the phase values were ascribed as $j + \phi_j$, i.e., to each of the generated random phase values an integer was added in succession from 1 to $\,N\,$. In this way the value of the period is effectively normalized to be unity, with a sampling routine which provides one value per cycle with random phase. SLLK periodograms obtained from the exercise show that when X=0.0, the mean values of T(P) are unity and that the noise behaves according to the earlier derived CDF for the given $\,N\,$ measurements. Various periodograms for the $\,X\,$ values listed in Table 1 were generated covering the range $\,P=0.9\ {\rm to}\ 1.1$, with selected periods differing by 0.001, the latter being approximately the limit of period resolution, i.e., there are no zones through T(P) with flat sections over which the measurement order does not change for successive trial periods.
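The sampling scheme and period grid just described may be sketched as below. The sketch assumes unit-variance Gaussian noise with the amplitude set equal to $\,X$, and the names `sllk` and `scan_periods` are hypothetical; it simply reports the grid period at which $\,T(P)$ is smallest, without any interpolation.

```python
import math
import random


def sllk(times, values, period):
    """Regularized LK string-length statistic T(P), as in Eq. (3)."""
    n = len(values)
    phases = [(t / period) % 1.0 for t in times]
    ordered = [m for _, m in sorted(zip(phases, values))]
    mean = sum(ordered) / n
    numerator = sum((ordered[(i + 1) % n] - ordered[i]) ** 2 for i in range(n))
    denominator = sum((m - mean) ** 2 for m in ordered)
    return numerator / denominator * (n - 1) / (2.0 * n)


def scan_periods(times, values, p_min, p_max, step):
    """Evaluate T(P) on the grid p_min, p_min + step, ..., p_max."""
    n_steps = int(round((p_max - p_min) / step))
    periods = [p_min + k * step for k in range(n_steps + 1)]
    return periods, [sllk(times, values, p) for p in periods]


rng = random.Random(3)  # fixed seed, for a repeatable illustration only
n, x = 50, 3.0
# One measurement per cycle at random phase: t_j = j + phi_j, with P_0 = 1.
times = [j + rng.random() for j in range(1, n + 1)]
values = [x * math.sin(2.0 * math.pi * t) + rng.gauss(0.0, 1.0) for t in times]
periods, t_values = scan_periods(times, values, 0.9, 1.1, 0.001)
best = periods[t_values.index(min(t_values))]
print(best, min(t_values))
```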

Figure 3 provides three examples of the behaviour of $\,T(P)\,$ around the value of $\,P_0\,$ for $\,N=15\,$ and $\,X=3.0\,$. Rather than displaying a point minimum at the value of P=1.0, a typical periodogram displays a noisy descent from unity to a minimum followed by a noisy return. It can be seen that the minima all lie below the 99% noise level associated with extreme low point values within a periodogram based on the same number of measurements but with $\,X=0.0\,$. The indication that a period is present is better assessed, therefore, by considering zones over which the values in the depression fall below some selected noise value for a periodogram rather than just considering any isolated point value.

The means of 30 repeated runs for $\,N\!=\!15\,$, 25 and 50 with $\,X=3.0\,$ are displayed in Fig. 4, together with the appropriate 99% noise levels. It can be seen that the half-widths of the minima reduce as $\,N\,$ grows. In addition to the slight increases in depth of the minima with $\,N\,$, it can also be seen from the 99% noise levels how the power of detection increases dramatically with $\,N\,$ and how the determined period value becomes better defined.

Figure 3: Three sample periodograms from artificial data comprising 15 randomly phased measurements with $\,X=3.0\,$. It may be noted that the minima around $\,P=1.0$ lie below the marked 99% level of the noise of the periodogram.

Figure 4: Mean periodograms based on 30 generated data sets with N=15, 25 and 50 and X=3.0 demonstrate the narrowing and deepening of the minimum associated with the underlying period, so indicating how period detection improves with the number of measurements. The $\,T(P)$ values associated with the 99% noise level of the periodograms are indicated, these giving a better indication of how the power of detection increases with $\,N$.

For any SLLK periodogram providing a well defined minimum with T(P) values below some assigned value associated with an acceptable confidence for a period detection, the progression through the minimum is unlikely to be smooth and may display steps because of a slight over-sampling. In order to determine the best period, the method proposed by Fernie (1989) may be applied to provide an interpolated value (based on the algorithm of Kwee & van Woerden 1956), together with an error estimate. For the periodograms displayed in Fig. 4, this method provided values of $1.00186\pm 0.00095$, $0.99960\pm0.00012$ and $1.00027\pm0.00006$ for $\,N=15, 25$ and $\,50$ respectively, the error estimates reducing significantly as $\,N$ increases. The accuracy of the period may also be assessed by progressively decreasing the sampling interval of the trial periods until the $\,T(P)$ minimum displays a flat section; at this stage, the accuracy of the determined period is of the same order as the periodogram resolution.

Figure 5: The $\,T(P)$ periodogram for the 40 V-band measurements by Moffett & Barnes (1984) of the Cepheid variable star, AL Vir, displays a deep minimum with oscillations resulting from windowing effects. The period grid was 8.0 d / 0.005 d / 12.0 d with the deepest minimum occurring at 10.310 d and 10.315 d. Using the procedure advocated by Fernie (1989), the best period is $10.3154\pm0.0007$ d, comparing well with the period listed by Moffett & Barnes (loc. cit.) of 10.302323 d.

An example of a $\,T(P)$ periodogram obtained from real photometric data is given in Fig. 5 for 40 V-band measurements from Moffett & Barnes (1984) (hereafter referred to as M&B) for the Cepheid variable star, AL Vir; a deep minimum is clearly seen. The method of Fernie (1989) provides a value of $10.3154\pm0.0007$ d, comparing with the value of 10.302323 d listed by M&B. The oscillatory nature of the periodogram over the displayed region results from sampling and windowing effects associated with the collection of these data. The periodogram obtained from matching B-band data is almost identical, confirming this conclusion. The fact that the level of the periodogram is generally less than unity is a result of this particular data set. The depth of the minimum indicates very clearly how well the periodicity has been detected for this star with measurements of high $\,X$ value.

4 Multi-parameter data

4.1 General behaviour

For some astronomical investigations, several parameters may be measured very closely in time or even simultaneously. Examples in stellar photometry are observations made in multi-colour systems such as UBV or UBVIR. If periodicity is investigated, it is quite usual to examine the data of one passband or parameter only, perhaps that carrying the best measurement signal-to-noise ratio (S/N). Alternatively, the data sets for each colour may be investigated in turn, with the eventual determination of a weighted mean period. Although it is feasible to extend most period search techniques to multi-dimension by simultaneous analysis of measurements of several parameters under a single comprehensive programme, this kind of approach is not normally effected.

Figure 6: Photometric data from Moffett & Barnes (1984) for the Cepheid variable AL Vir made in the V and B bands show in a) that the measurements lie on a near elliptical locus, indicating a strong correlation between the variability in the two colours but with a varying phase difference. The data are connected in order of their collection in b), indicating the large R-L required to join the measurements in the VB-plane. In c) the connection has been re-ordered according to the period of 10.3095 d, this giving a minimum R-L. If data point connections were to be made on the original period of 10.302323 d (see Moffett & Barnes 1984), the R-L is less clean and retraces itself at locations of maximum and minimum light. The execution of the locus with time follows a clockwise progression.

In this section it will be demonstrated that the principle of the S-L method as embraced by the SLLK algorithm is readily extendable to time-series analyses of multivariate data. The development involves calculation of statistics comprising the combination or "twining'' of "strings'' associated with each of the measured parameters to provide "Rope-Lengths'' (R-Ls). The concept is readily appreciated both in visual and physical terms for two-parameter data, with pairs of measurements obtained at identical times.

As an illustrative example, consider the V-band data of AL Vir, used in the production of Fig. 5, in combination with the complementary B-band measurements. The brightness changes in both bands are very significant relative to the measurement noise. The simultaneous V, B magnitude values are plotted against each other in Fig. 6a and the obvious correlation of behaviour of the two parameters is revealed with a near linear relationship between B and V. It may be noted that more points appear at the extreme ends of the plot, as would be expected if a sinusoidal variation is sampled at random.

In more detail, the data distribution follows a locus more like that of a neo-ellipse, suggesting that the two-colour measurements exhibit similar variations but with a phase difference. The path through the points is akin to a Lissajous figure produced by compounding measurements of orthogonal oscillatory variations of differing amplitude and phase. A strictly linear path indicates that the two oscillations are permanently in phase with the gradient determined by their relative amplitudes. Open patterns reveal the presence of phase differences in their behaviour. Although not normally depicted in this way, such behaviour is well known in colour photometry of cepheids and it can be seen here that at AL Vir's maximum light (upper right of Fig. 6) the phase difference between B and V is small, whereas, at light minimum, the phase difference is very significant.

The concept of examining data in this way may obviously be extended to more than two simultaneous measurements with complicated figures being executed in multi-parameter space.

4.2 The basic RL algorithm

With the measurements mapped as in Fig. 6a, it is readily appreciated that the data may be linked by "rope'' with various connection paths. For the case involving $\,N\,$ measured pairs of values, $\,m[1]_j, m[2]_j\,$, made simultaneously at times, $\,t_j\,$, the immediate R-L value may be written as

\begin{displaymath}RL=\sum\limits_{j=1}^N\Biggl[\Bigl(\bigl(m[1]_{j+1}-m[1]_{j}\bigr)^2
+ \bigl(m[2]_{j+1}-m[2]_{j}\bigr)^2\Bigr)^{1/2}\Biggr]
\end{displaymath} (5)

where the summation is completed round the full cycle by letting $\,m[1]_{N+1}=m[1]_1\,$ and $\,m[2]_{N+1}=m[2]_1$. If this were done according to the original data collection order, the locus connecting the measurements would generally involve many forward and backward movements, requiring a long R-L to complete the task (see Fig. 6b).

As in the case of the earlier analysis of single parameter data, a grid of periods may be explored such that for each trial, the measurement pairs are assigned a phase, $\,\phi_j$, between 0 and 1, according to their measurement times. By re-assembling the data according to ascending phase values, they may be re-labelled $\,m[1]_i\,,\,m[2]_i$ with the change of subscript indicating their new order. A periodogram based on R-L values may then be produced by repeating the phase ordering exercise for each trial period, $\,P\,$, and determining the appropriate R-L value. Thus, in its basic form, a two-parameter R-L periodogram may be represented by

\begin{displaymath}RL(P)=\sum\limits_{i=1}^N\Biggl[\Bigl(\bigl(m[1]_{i+1}-m[1]_{i}\bigr)^2
+ \bigl(m[2]_{i+1}-m[2]_{i}\bigr)^2\Bigr)^{1/2}\Biggr]\ \cdot
\end{displaymath} (6)

For the period matching any cyclic variation in the data, RL(P) will be a minimum with the locus moving through the plotted data from one point to another fairly adjacent one; the complete run would move through about half of the data points in an upward direction followed by the other half cycle of downward movements (see Fig. 6c). The process of searching for the period which minimises the R-L is applicable to any waveform shape containing only two turning points (i.e. one maximum and one minimum) within the cycle. The progression rate through re-phased data point connections simply depends on the shape of the waveform; comment has already been made about the bunching of randomly made measurements at the extremes of a sinusoidal variation.

If the data set comprises several simultaneously measured parameters, the summation of the vectors joining the points in multi-parameter space when phased according to a trial period is simply

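\begin{displaymath}RL(Z,P)=\sum\limits_{i=1}^N\Biggl[\Bigl(\sum\limits_{k=1}^Z\bigl(m[k]_{i+1}-m[k]_{i}\bigr)^2\Bigr)^{1/2}\Biggr]
\end{displaymath} (7)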

where $\,Z$ is the number of parameters.
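As an illustration of Eqs. (6) and (7), the R-L for a trial period may be sketched in Python with NumPy. The function names `rope_length` and `rope_periodogram`, and the storage of the simultaneous measurements as an array with $\,N\,$ rows and $\,Z\,$ columns, are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np

def rope_length(t, m, period):
    """R-L of Eq. (7) for one trial period (Eq. (6) is the case Z = 2).

    t : (N,) measurement times; m : (N, Z) simultaneous measurements.
    """
    t = np.asarray(t, dtype=float)
    m = np.asarray(m, dtype=float).reshape(len(t), -1)
    phase = (t / period) % 1.0             # phase in [0, 1)
    ms = m[np.argsort(phase)]              # re-order by ascending phase
    steps = np.roll(ms, -1, axis=0) - ms   # joins i -> i+1, last point -> first
    return float(np.sum(np.sqrt(np.sum(steps**2, axis=1))))

def rope_periodogram(t, m, periods):
    """R-L evaluated on a grid of trial periods."""
    return np.array([rope_length(t, m, p) for p in periods])

# Illustrative, noise-free data: two parameters with period 1.0 and a
# quarter-cycle phase difference, sampled at random times.
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 30.0, 40)
m = np.column_stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])
periods = np.linspace(0.5, 1.5, 201)
rl = rope_periodogram(t, m, periods)
best_period = periods[np.argmin(rl)]
```

For these artificial data the points lie on a unit circle, so that at the true period the rope follows an inscribed polygon and its length cannot exceed $2\pi$, whereas an incorrect trial period scrambles the order and lengthens the rope.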

5 The RLLK (Z, P) algorithm

5.1 The general T(Z, P) statistic

Rather than calculating the "true'' R-Ls in multivariate space as in Eq.(7), the R-L may be determined without taking the square root of each of the contributing vectors. This may be written simply as

\begin{displaymath}RL(Z,P)=\sum\limits_{i=1}^N\Biggl(\sum\limits_{k=1}^Z
\Bigl(m[k]_{i+1}-m[k]_{i}\Bigr)^2\Biggr)\ .
\end{displaymath} (8)

Because the square root values of the joining vectors are now not involved, the order of performing the summations may be relaxed and the R-L may also be rewritten as

\begin{displaymath}RL(Z,P)=\sum\limits_{k=1}^Z\Biggl[\sum\limits_{i=1}^N
\bigl(m[k]_{i+1}-m[k]_{i}\bigr)^2\Biggr]\ .
\end{displaymath} (9)

Thus, it can now be seen that each contributing summation to the total sum is the kernel of the original LK S-L statistic. By normalizing these as in Eq.(3), the statistic underpinning the RLLK(Z,P) periodogram may be simply written as

\begin{displaymath}T(Z,P)\ =\ \sum\limits_{k=1}^Z T(k,P)
\end{displaymath} (10)

where the contribution of the kth parameter is calculated by

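\begin{displaymath}T(k,P)={{N[k]-1}\over{2N[k]}}\ {{\sum\limits_{i=1}^{N[k]}\Bigl(m[k]_{i+1}-m[k]_{i}\Bigr)^2}\over{\sum\limits_{i=1}^{N[k]}\Bigl(m[k]_{i}-\overline
{m[k]}\Bigr)^2}}\ \cdot
\end{displaymath} (11)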

Thus, as demonstrated in Sect. 2, each of the contributing S-Ls based on Eq.(11) is independent of the number of contributing measurements, $\,N[k]\,$, and their basic measurement S/N ratio, with a mean value of each $\,T(k,P)\,$ through the periodogram continuum equalling unity. In order for the RLLK value to be independent of $\,Z\,$, it may be rewritten as

 \begin{displaymath}{T(Z,P)={1\over Z}\sum\limits_{k=1}^{Z}T(k,P)}
\end{displaymath} (12)

this again having an expected mean value of 1.0 in the continuum of the periodogram. Combining the individual S-Ls in this way corresponds to a determination of their mean with each parameter being ascribed equal weight. Alternative RLLK determinations may also be considered with the calculation of a weighted mean according to the estimated merit of each SLLK component.

Since the SLLK for each parameter is independent of the number of contributing measurements, a very important result from Eq.(12) is that a periodogram can be obtained from combination of multi-parameter data sets comprising differing numbers of measurements with records not necessarily obtained at identical times. One advantage of this is that all the data from a study may be utilized even if there are recording gaps for some of the parameters. Combination of such data may reduce sampling and windowing effects that may be apparent if the reduction is simply limited to those measurements of the parameters taken at identical times.
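A minimal sketch of how Eqs. (11) and (12) might be coded for parameters with differing numbers of measurements and differing sampling times is given below. The function names `t_single` and `t_rope`, the representation of the data as a list of (times, values) pairs, and the optional `weights` argument are all assumptions introduced for illustration; the weighted form simply normalizes by the sum of the weights.

```python
import numpy as np

def t_single(t, m, period):
    """Regularized statistic T(k,P) of Eq. (11) for one parameter."""
    t = np.asarray(t, dtype=float)
    m = np.asarray(m, dtype=float)
    ms = m[np.argsort((t / period) % 1.0)]   # order by ascending phase
    steps = np.roll(ms, -1) - ms             # cyclic: last point joins the first
    n = ms.size
    return (n - 1) / (2.0 * n) * np.sum(steps**2) / np.sum((ms - ms.mean())**2)

def t_rope(series, period, weights=None):
    """T(Z,P) of Eq. (12): equal-weight (or weighted) mean of the T(k,P).

    series : list of (times, values) pairs, one per parameter; the pairs
             may differ in length and in their sampling times.
    """
    tk = np.array([t_single(t, m, period) for t, m in series])
    return float(np.average(tk, weights=weights))

# Two parameters sharing a period of 3.0 but recorded at different times
rng = np.random.default_rng(2)
t1 = rng.uniform(0.0, 50.0, 30)
t2 = rng.uniform(0.0, 50.0, 45)
series = [(t1, np.sin(2 * np.pi * t1 / 3.0) + 0.1 * rng.normal(size=30)),
          (t2, np.cos(2 * np.pi * t2 / 3.0) + 0.1 * rng.normal(size=45))]
periods = np.linspace(1.0, 5.0, 401)
periodogram = np.array([t_rope(series, p) for p in periods])
```

Because each parameter is phased and ordered on its own, nothing in the calculation requires the records to be paired in time.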

In the following section, the behaviour of spot values from R-L periodograms based on T(Z,P) as defined by Eq.(12) is investigated by computer simulation in a similar fashion as for SLLK above.

5.2 The behaviour of T(Z, P)

A complete investigation of the behaviour of $\,T(Z,P)\,$ would allow the various parameters to have equal numbers of simultaneous measurements, or non-equal numbers with partial simultaneity, or even with independent measurement times; in addition, the parameters could carry differing $\,X$ ratios. For the study here, however, computer generated data sets were established with the simplification of using identical values of $\,X\,$ for each parameter with equal numbers of measurements, carrying identical times. Multivariate data have been considered with $\,Z=$ 2, 3 and 5.

As for the single parameter exercise described in Sect. 3, and following a parallel nomenclature, distributions for $\,T(Z,P)\,$ and $\,T(Z,P_0)\,$ were established for each $\,Z,\ X$ and $\,N\,$ combination. Mean values, $\,\overline{T(Z,P)}\,$ and $\,\overline{T(Z,P_0)}\,$ were calculated, together with the lower percentiles $\,T^{[90]}(Z,P)$, $\,T^{[95]}(Z,P)$, $\,T^{[99]}(Z,P)$ and upper percentiles $\,T^{[90]}(Z,P_0)$, $\,T^{[95]}(Z,P_0)$, $\,T^{[99]}(Z,P_0)$.

Again, following the same arguments as for the single parameter investigation, the way in which the S/N ratio of the periodogram itself improves according to $\,N$ together with the confidence values of a period detection and of a period not being overlooked, can readily be assessed by production of diagrams similar to that of Fig. 2, although these are not presented here. A summary of the information is, however, presented in Table 2. Comparison with Table 1 shows the general improvement of sensitivity and the reduced periodogram noise that multi-parameter data offer.

5.3 Out of phase parameters

Although all the parameters of any multivariate data may carry the same underlying period, the oscillations may be out of phase with respect to each other. In some cases, the periodic behaviour may not be sinusoidal and the phase differences between the parameters may change over the period. In order to see the effect that phase difference, $\,\theta\,$, has on $\,T(Z,P)$, artificial data were generated for two measurement parameters (${\rm Z}=2$) with identical values for $X\,(=$3), but allowing for a constant phase value to be present between the underlying variations of measurement pairs. The results of the exercise are also summarized within Table 2.

It can be seen that the power of $\,T(2,P)\,$ increases significantly with the phase difference, reaching its maximum when $\,\theta=\pi/2$, with the data producing a circular locus in the two-dimensional data plane. Beyond this phase value, the behaviour of the power shows symmetry, with $\,\theta =0$ being equivalent to $\,\theta=\pi$. Thus, in terms of detection of periodicity in small amplitude signals, there is positive advantage in using $\,T(Z,P)\,$ as a means of period detection.
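The kind of simulation described in Sects. 5.2 and 5.3 may be sketched as follows. Several details are assumptions made here and are not specified in this section: $\,X\,$ is taken as the semi-amplitude of a sinusoid in units of the noise standard deviation, the measurement times are drawn uniformly over 20 cycles and shared by all parameters, and the $k$th parameter is offset in phase by $k\theta$. The function names are hypothetical and no attempt is made to reproduce the entries of Table 2.

```python
import numpy as np

def t_single(t, m, period):
    """Regularized LK statistic for one parameter at one trial period."""
    ms = m[np.argsort((t / period) % 1.0)]
    steps = np.roll(ms, -1) - ms               # cyclic differences
    n = ms.size
    return (n - 1) / (2.0 * n) * np.sum(steps**2) / np.sum((ms - ms.mean())**2)

def simulate(n, x, z=2, theta=0.0, trials=2000, seed=0):
    """Spot values of T(Z,P) for pure noise and of T(Z,P0) for signal plus noise."""
    rng = np.random.default_rng(seed)
    p0 = 1.0                                    # normalized underlying period
    noise_t = np.empty(trials)
    signal_t = np.empty(trials)
    for j in range(trials):
        t = rng.uniform(0.0, 20.0 * p0, n)      # assumed random sampling times
        tn, ts = [], []
        for k in range(z):
            eps = rng.normal(size=n)            # unit-variance measurement noise
            sig = x * np.sin(2 * np.pi * t / p0 + k * theta)
            tn.append(t_single(t, eps, p0))
            ts.append(t_single(t, sig + eps, p0))
        noise_t[j] = np.mean(tn)                # Eq. (12) with equal weights
        signal_t[j] = np.mean(ts)
    return noise_t, signal_t

noise_t, signal_t = simulate(n=20, x=3.0, z=2, theta=np.pi / 2)
t99_noise = np.percentile(noise_t, 1)     # only 1% of noise values fall below this
t99_signal = np.percentile(signal_t, 99)  # 99% of T(Z,P0) values fall below this
mean_signal = signal_t.mean()
```

Repeating the call over a range of `n`, and noting where `mean_signal` or `t99_signal` crosses `t99_noise`, gives cross-over values of the kind tabulated, subject to the assumptions listed above.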

6 The behaviour of periodograms

6.1 Computer simulations

Simulated data with a normalized period of 1 $.\!\!^{\rm d}$0 were generated for N=15, 20 and 50 with X=3.0 in similar fashion as in Sect. 3.2 but with Z values of 2, 3 and 5. Individual $\,T(Z,P)$ periodograms were very similar to the investigations of $\,T(P)$ in Sect. 3.

Mean periodograms based on 30 repetitions for each situation were similar in outcome to Fig. 4, but with the noticeable improvement of a deepening minimum as $\,Z$ increases.

6.2 Application of T(Z,P) to real data

Again the data of M&B for the cepheid star, AL Vir, provide an example for the outcome of the R-L principle. Trial periodograms based on $\,T(2,P)\,$, using the B,V measurements with a range of grids, show that the finest useful period resolution is $\sim $0 $.\!\!^{\rm d}$0005; the R-L minimum (flat) occurs for periods of 10 $.\!\!^{\rm d}$3085 to 10 $.\!\!^{\rm d}$3160, with the best value being taken as 10 $.\!\!^{\rm d}$3120. Using this period, the data points are linked in Fig. 6c according to phase progression and show a near-to-perfect connection around an open locus. The path through the data along the connections executes a clockwise cycle with time. Using the period of 10 $.\!\!^{\rm d}$302323 given by M&B, the linking is less satisfactory around the maximum and minimum values of the light curve. Inspection of the various trial connections around 10 $.\!\!^{\rm d}$3 shows the importance of obtaining accurate data around light maximum and minimum. It also confirms the notion that the last two decimal places in the value listed by M&B carry no significance.

Finally, an analysis of observations of M&B for RV Sco serves as an example of R-L combinations of parameters for data sets with unequal numbers. Their original table of measurements shows that 25 simultaneously recorded BVRI values are available with 32 additional BV values. A period of 6 $.\!\!^{\rm d}$061388 is also ascribed. Figures 7a and 7b display the periodograms over the range 5 $.\!\!^{\rm d}$0 to 7 $.\!\!^{\rm d}$0 with grid spacing of 0 $.\!\!^{\rm d}$005 for the 4-colour and additional 2-colour data respectively. Although the periodogram in Fig. 7a is noisy, the presence of the period is clearly seen. Again the period is seen in Fig. 7b but it is obvious that the data sampling here is not as good as for Fig. 7a. As a consequence, when the overall summation of $\,T(4,P)\,$ is effected, it is sensible to weight the contributions. For the example here, this has been done in the ratio of 2:1 respectively for the periodograms of Figs. 7a and b, with the resulting periodogram displayed in Fig. 7c where it can be seen that the noise has been reduced relative to that of Fig. 7a. Further trial analyses showed that the finest sensible resolution for the period grid is $\sim $0 $.\!\!^{\rm d}$0001 and the best deduced period is $6\hbox{$.\!\!^{\rm d}$ }0608\pm0\hbox{$.\!\!^{\rm d}$ }0002$, which is comparable with that of M&B but again reveals their exuberance in quoting an excessive number of decimal places.
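The weighted combination of two periodograms evaluated on a common period grid may be sketched as below. The function name `combine_periodograms` is hypothetical, the division by the sum of the weights is an assumption made so that the continuum level stays at unity, and the numerical values are invented placeholders, not values read from Fig. 7.

```python
import numpy as np

def combine_periodograms(t_list, weights):
    """Weighted mean of periodograms evaluated on the same period grid."""
    t_arr = np.asarray(t_list, dtype=float)   # shape (n_sets, n_periods)
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * t_arr).sum(axis=0) / w.sum()

t_a = np.array([1.0, 0.4, 1.1])   # invented values standing in for Fig. 7a
t_b = np.array([1.0, 0.7, 0.8])   # invented values standing in for Fig. 7b
t_c = combine_periodograms([t_a, t_b], weights=[2, 1])   # 2:1 weighting
```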


Table 2: The improvement in power of the T(Z,P) algorithm is shown according to the number of measured parameters $\,Z\ (\!=2,\,3\ {\rm and}\, 5)$; simultaneous measurements were considered for each parameter with identical $\,X$ values of 0.5 to 10 (Col. 1). For each $Z\ {\rm and}\ X$ combination, the value of $\,N$ and of $\,\overline {T(P_0)}\,$ at the cross-over of the latter with the lower percentiles $\,T^{[99]}(Z,P)\,$, $\,T^{[95]}(Z,P)\,$ and $\,T^{[90]}(Z,P)\,$ of the noise associated with a periodogram are listed (Cols. 2:3, 6:7 and 10:11). These provide guides in terms of the number of measurements required for the detection of a period according to the value of $\,X$. Similar cross-over values of $\,N\,$ and $\,T^{[99]}(Z,P_0)$ are also provided (Cols. 4:5, 8:9, 12:13), these providing guides as to the number of measurements required for a period not to be missed from any data sample collected at random. The lowest block of the table provides an example (X=3.0) of how the power of T(2,P) improves if there is a phase difference, $\,\theta $ (Col. 1), between the behaviour of the two measured parameters.
Z=2 T[99](2,P) T[95](2,P) T[90](2,P)
X N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0)
0.5 >100 -- >100 -- >100 -- >100 -- >100 -- >100 --
1.0 33 0.69 >100 -- 21 0.71 >100 -- 16 0.73 90 0.87
2.0 13 0.48 30 0.63 10 0.56 23 0.68 9 0.62 21 0.72
2.5 12 0.45 22 0.55 9 0.52 19 0.63 8 0.58 16 0.67
3.0 11 0.41 19 0.51 9 0.51 16 0.60 8 0.57 14 0.64
4.0 10 0.38 16 0.45 9 0.49 14 0.55 8 0.54 13 0.60
5.0 10 0.36 15 0.44 8 0.46 13 0.54 7 0.53 12 0.60
10.0 10 0.34 14 0.40 8 0.44 12 0.51 7 0.51 11 0.57
Z=3 T[99](3,P) T[95](3,P) T[90](3,P)
X N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0)
0.5 >100 -- >100 -- >100 -- >100 -- 77 0.89 >100 --
1.0 26 0.70 94 0.83 17 0.73 74 0.86 13 0.75 64 0.87
2.0 12 0.51 25 0.62 10 0.59 20 0.68 8 0.63 18 0.72
2.5 11 0.45 20 0.56 9 0.55 17 0.63 8 0.60 15 0.68
3.0 11 0.44 18 0.52 9 0.52 15 0.60 8 0.58 14 0.65
4.0 10 0.39 16 0.47 8 0.48 13 0.55 8 0.55 12 0.61
5.0 10 0.37 15 0.44 8 0.47 13 0.54 7 0.53 12 0.60
10.0 10 0.34 14 0.44 8 0.44 12 0.51 7 0.51 11 0.57
Z=5 T[99](5,P) T[95](5,P) T[90](5,P)
X N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0)
0.5 >100 -- >100 -- 66 0.89 >100 -- 48 0.89 >100 --
1.0 20 0.72 64 0.81 13 0.75 48 0.86 11 0.78 43 0.87
2.0 12 0.53 22 0.62 9 0.60 18 0.69 8 0.65 16 0.72
2.5 11 0.47 18 0.56 9 0.56 15 0.63 8 0.61 14 0.68
3.0 10 0.43 16 0.51 8 0.53 14 0.60 8 0.59 13 0.65
4.0 10 0.40 15 0.46 8 0.49 13 0.56 7 0.55 12 0.62
5.0 10 0.38 14 0.43 8 0.47 12 0.53 7 0.54 11 0.59
10.0 10 0.34 14 0.40 8 0.45 12 0.50 7 0.51 11 0.57
The effect of a phase difference between the parameters with X=3.0
Z=2 T[99](2,P) T[95](2,P) T[90](2,P)
$\theta$ N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0) N $\overline{T(P_0)}$ N T[99](P0)
0 11 0.41 19 0.51 9 0.51 16 0.60 8 0.57 14 0.64
$\pi/16$ 11 0.41 19 0.52 9 0.51 16 0.60 8 0.57 14 0.64
$\pi/8$ 11 0.43 19 0.52 9 0.53 16 0.61 8 0.58 14 0.65
$3\pi/16$ 10 0.43 18 0.53 8 0.53 15 0.61 7 0.59 14 0.66
$\pi/4$ 10 0.46 17 0.54 8 0.55 14 0.62 7 0.61 13 0.67
$5\pi/16$ 10 0.47 16 0.56 8 0.57 14 0.64 7 0.63 13 0.69
$3\pi/8$ 9 0.49 16 0.58 8 0.59 13 0.65 7 0.65 12 0.69
$7\pi/16$ 9 0.51 15 0.58 7 0.60 13 0.66 7 0.66 12 0.70
$\pi/2$ 9 0.51 15 0.59 7 0.61 13 0.66 7 0.67 12 0.71

\includegraphics[width=5.6cm,clip]{MS2208f7c.eps}\end{figure} Figure 7: In a) part of the $\,T_{LK}(4,P)\,$ periodogram is displayed based on the 25 BVRI measurements of RV Sco made by Moffett & Barnes (1984). The periodogram covering the same period interval for the other 32 BV measurements is displayed in b), revealing the effects of a poorer data sampling pattern relative to that associated with the data used to engender a). By combining the R-L values with a weighting of 2:1 with respect to a) and b) respectively, a less noisy periodogram is produced and displayed in c), using all of the data in the best way. From exercises with higher resolution than displayed above, the best period for RV Sco is $6\hbox{$.\!\!^{\rm d}$ }0608\pm0\hbox{$.\!\!^{\rm d}$ }0002$.

7 Application to polarimetry

Measurements of linear polarization automatically constitute two-dimensional data involving the degree of polarization, $\,p$, and the position angle of vibration, $\,\alpha$. When plotted in Cartesian space, the measurements may be expressed in terms of normalized Stokes parameters, $\,q,\,u$, such that

\begin{displaymath}q=p\cos2\alpha \qquad{\rm and}\qquad u=p\sin2\alpha\ .
\end{displaymath} (13)

There are many astrophysical situations which engender periodic variations of $\,p\,$ and $\,\alpha\,$. Usually the behaviour of $\,q\,$ and $\,u\,$ involves a fundamental period and the first harmonic, both with phase differences, so producing complicated loci in the Stokes parameter diagram. The data are ideal for undertaking period searches using $\,T(2,P)$ or even with dimensions of $\,2\times Z\,$, if $\,Z\,$ bands of multi-colour data are available. It may be noted that Robert et al. (1989) made mention of an analysis of Wolf-Rayet star data based on the Lafler & Kinman (1965) method but there is no reference as to whether S-Ls were employed for calculations on the individual Stokes parameters or if some form of R-L was used on both $\,q\,$ and $\,u\,$ simultaneously.
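The conversion of Eq. (13), and the assembly of $\,q\,$ and $\,u\,$ as two parameters for a $\,T(2,P)\,$ search, may be sketched as follows. The function names `stokes_qu` and `qu_series` are hypothetical, and the position angle is assumed to be supplied in radians.

```python
import numpy as np

def stokes_qu(p, alpha):
    """Normalized Stokes parameters of Eq. (13); alpha in radians."""
    p = np.asarray(p, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return p * np.cos(2.0 * alpha), p * np.sin(2.0 * alpha)

def qu_series(times, p, alpha):
    """Return (times, q) and (times, u) as two parameter series."""
    q, u = stokes_qu(p, alpha)
    t = np.asarray(times, dtype=float)
    return [(t, q), (t, u)]

# Position angles given in degrees must be converted first
q, u = stokes_qu([1.0, 1.0, 2.0], np.radians([0.0, 45.0, 90.0]))
```

For $\,Z\,$ colour bands the same conversion applied band by band provides the $\,2\times Z\,$ parameter series mentioned above.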

\end{figure} Figure 8: In a) the U band data for U Mon in the Stokes parameter plane are displayed with the $\,q, u\,$ points (measured in %) connected in order of their collection. The typical quality of the data is depicted by a point (not part of the data) at the left of the diagram carrying $1\sigma \,$ error bars. The periodogram based on $\,T(2,P)\,$ for these data is displayed in c) and shows minima at $\sim $90 days and 178 days. By phasing the measurements on the best period of $89\hbox{$.\!\!^{\rm d}$ }68$, re-ordered connections are made in b); although a smooth locus is not apparent, the process removes the forward and backward movements across the central part of the data distribution.

There are many examples in the literature of polarimetric data which provide clean loci in the $\,q,\,u$ plane with a cyclic path (see, for example, Drissen et al. 1986), for which $\,T(2,P)\,$ would have obvious success in determining the period. To serve as an example here of the effectiveness of the algorithm, data have been taken from Serkowski (1970) for the RV Tauri star, U Mon. In discussions of polarimetry, it has always been assumed that the period held in these data is the same as that established from photometry and spectroscopy. The data comprise 37 U band values, 51 for the B band and 39 for the V band. Figure 8a displays the U band measurements in the form of a $q\,,u$ plot.

The measurements for each colour were analyzed by $\,T(2,P)\,$ in turn with a period grid of 0 $.\!\!^{\rm d}$02, providing minimum R-Ls at 89 $.\!\!^{\rm d}$68, 91 $.\!\!^{\rm d}$08 and 92 $.\!\!^{\rm d}$78 in the U, B and V bands respectively. The U band periodogram, with a grid of 0 $.\!\!^{\rm d}$5, is shown in Fig. 8c and displays the presence of an additional minimum at close to 2$\times$ the fundamental ($\sim $178 days). This may result from the intrinsic noise of the star; Serkowski (1970) commented that the amount of intrinsic polarization of U Mon changes considerably from cycle to cycle, whereas the angle of the direction of vibration behaves similarly in each cycle. Alternatively, it might be related to the double periodicity behaviour seen in the photometry of these stars, with light curves exhibiting alternate deep and shallow minima.

When the colour data were lumped together and the exercise applied to the 6 parameters simultaneously without weighting, the determined period was 92 $.\!\!^{\rm d}$78. The photometric period suggested by Preston et al. (1963) was 92 $.\!\!^{\rm d}$23 but, based on additional photometry conducted at the same time as the polarimetry, Serkowski (1970) suggested a period of 91 $.\!\!^{\rm d}$3. The variations in the quoted period probably reflect the presence of intrinsic fluctuations superimposed on the basic period.

Figure 8b displays the data point connections when their order is re-adjusted according to the ascribed phase defined by the period. Although the connections do not follow a smooth locus, it may be noted that the frequency of crossing the central part of the data distribution has been reduced by the procedure, as would be expected from data following a near elliptical locus with a noisy perimeter. The overall behaviour of the analysis is consistent with the star exhibiting fluctuations in $\,p\,$ as the position angle of the polarization sweeps around from 0 to $2\pi$, the whole pattern being offset from the $\,q\,,u\,$ plane origin by a constant interstellar component.

8 Conclusions

By applying a normalizing factor of (N-1)/2N to the original Lafler & Kinman (1965) statistic, it has been demonstrated that the "String-Length'' method of LK has been regularized. As well as being independent of the S/N ratio of the basic measurements, SLLK is now independent of the number of measurements in any examined data set. Any periodogram, $\,T(P)$, based on its evaluation at each trial period, should have a mean level of 1.0.
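The value of the normalizing factor can be checked with a short calculation, given here as a heuristic sketch rather than as the derivation used in the paper. It assumes $\,N\,$ uncorrelated measurements, $\,m_i\,$, of common mean and variance $\,\sigma^2$, joined cyclically in an arbitrary order, and replaces the expectation of the ratio by the ratio of the expectations. Each of the $\,N\,$ squared differences has an expected value of $2\sigma^2$, so that

\begin{displaymath}E\Bigl[\sum\limits_{i=1}^{N}\bigl(m_{i+1}-m_{i}\bigr)^2\Bigr]=2N\sigma^2
\qquad{\rm and}\qquad
E\Bigl[\sum\limits_{i=1}^{N}\bigl(m_{i}-\overline{m}\bigr)^2\Bigr]=(N-1)\sigma^2\ ,
\end{displaymath}

giving

\begin{displaymath}{{N-1}\over{2N}}\times{{2N\sigma^2}\over{(N-1)\sigma^2}}=1\ .
\end{displaymath}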

If periodicity is present, the depth of the associated minimum in the periodogram for a given number of measurements depends only on the amplitude-to-noise ratio of the measurements. Such an attribute makes the determination of confidence levels on any period detection straightforward. This might be done with reference to artificial data according to the exercises outlined in the paper or the behaviour of the periodogram may be examined by replacing the measurements for each timed record with computer generated values which simply carry noise or a sinusoidal signal with some given amplitude-to-noise ratio; repetitive exercises of this kind allow confidence levels to be assigned to any outcome.
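The second approach, in which the values at the actual recording times are replaced by computer generated noise, may be sketched as follows. The function name `noise_levels`, the use of Gaussian noise, the number of trials and the default percentile levels are assumptions made for illustration only.

```python
import numpy as np

def t_single(t, m, period):
    """Regularized LK statistic T(P) for one parameter at one trial period."""
    ms = m[np.argsort((t / period) % 1.0)]
    steps = np.roll(ms, -1) - ms               # cyclic differences
    n = ms.size
    return (n - 1) / (2.0 * n) * np.sum(steps**2) / np.sum((ms - ms.mean())**2)

def noise_levels(t, periods, levels=(1.0, 5.0, 10.0), trials=500, seed=0):
    """Lower percentiles of T(P) when the records at times t carry only noise.

    Returns an array of shape (len(levels), len(periods)).
    """
    rng = np.random.default_rng(seed)
    t = np.asarray(t, dtype=float)
    sims = np.array([[t_single(t, eps, p) for p in periods]
                     for eps in rng.normal(size=(trials, t.size))])
    return np.percentile(sims, levels, axis=0)

# Stand-in for a real list of recording times
t_obs = np.sort(np.random.default_rng(5).uniform(0.0, 40.0, 25))
periods = np.linspace(2.0, 6.0, 81)
thresholds = noise_levels(t_obs, periods)
```

An observed spot value of $\,T(P)\,$ lying below the first row of `thresholds` at its trial period would then occur by chance in roughly 1% of noise-only data sets with the same sampling; adding a sinusoid of chosen amplitude-to-noise ratio to the generated values gives the corresponding test for a period not being missed.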

It has also been demonstrated that the regularized SLLK algorithm is applicable to examining multivariate data for which the parameters may, or may not, be measured simultaneously, so extending the "String-Length'' principle to the notion of a "Rope-Length''. Combination periodograms based on measurements of several parameters may be constructed by weighting the contributions from the different parameters according to their estimated importance. Such RLLK combinations, with or without weighting, should improve the overall periodogram by reducing the effects of sampling that will be apparent on each measured parameter. It is also interesting to note that the RLLK principle can be applied very effectively when there are phase differences between the underlying behaviour of the different parameters. Again, with reference to exercises involving artificial data, the regularized form of $\,T(Z,P)$ is readily amenable to the determination of confidence levels associated with a detected period or with a null outcome.

The efficacy of $\,T(Z,P)$ with respect to simultaneous 2-colour measurements of a cepheid star, to multi-colour measurements of a cepheid with data sets of unequal size, and to an RV Tau star displaying periodic polarization variations, has been clearly demonstrated. In summary, $\,T(Z,P)\,$ has obvious applications to the analysis of multi-colour photometry and polarimetry. It may be noted that in a study of the polarimetric behaviour of O-type stars, a joint period analysis involving spectral line equivalent width data and broad-band polarimetry has been undertaken by Clarke et al. (2002). As an extreme parameter combination, although an example was not provided here, the method could be used to investigate periodicity in data from, say, X-ray, optical and radio measurements obtained contemporaneously but not necessarily simultaneously.

The exploration of S-L and R-L methods has provided several student projects at Glasgow University for the development of computing skills. In proving out the algorithms, the discussions and feedback were most useful and I particularly wish to thank Brian Hamilton, Hrobjartur Thorsteinsson and Kris Wojciechowski, the latter two being instrumental in recognizing the step from Eq. (7) to Eq. (10), so widening the concepts and usefulness of RLLK.



Copyright ESO 2002