A&A 386, 763-774 (2002)
DOI: 10.1051/0004-6361:20020258
D. Clarke
Department of Physics and Astronomy, University of Glasgow, Glasgow G12 8QQ, Scotland, UK
Received 17 December 2001 / Accepted 18 February 2002
Abstract
The original Lafler-Kinman statistic for exploring any
auto-correlation within a set of measurements relative to their
underlying variance has been regularized so that its determination
is independent of the data sample size. In its new form, the power
of its application to String-Length period searches (SLLK) has
been assessed in terms of establishing confidence levels to point
value detections within any generated periodogram and to
confidence levels of not missing the detection when an underlying
period is present. These estimations depend only on the amplitude
of the variation relative to the measurement noise and are
independent of the signal-to-noise ratio of the measurements and
of their number. Examples of the behaviour of periodograms based
on SLLK as produced from computer generated data and real data are
discussed. It is also demonstrated that the principle can be
readily extended to multivariate data in the form of Rope-Length
period searches (RLLK), with the measurements of each parameter
not necessarily being taken simultaneously nor with equal number.
Using simulated data it is shown that the power of period
detection improves slightly if the underlying modulations in each
parameter are out of phase with each other. Examples of the RLLK
principle are given for computer simulated data and for stellar
multi-colour photometric and polarimetric measurements.
Key words: methods: statistical
The LK period search statistic may be classed as a non-parametric method. For each trial period, P (or frequency, f), taken from a grid, the original data are assigned phases, which are then re-ordered in ascending sequence. The re-ordered data, relabelled with a new subscript to mark their new order, may be examined by inspection across the full phase interval, 0.0 to 1.0. For each period, the LK statistic performs a "String-Length" (S-L) summation of the squares of the differences between the consecutive phase re-ordered measurement values. The variation of S-L with period (or frequency) may be considered as providing a periodogram, SLLK(P), from which any well defined minimum can be considered as corresponding to the underlying period. Alternative S-L methods include those of Renson (1978) and Dworetsky (1983).
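The phase-folding and string-length steps described above can be sketched as follows. This is a minimal illustration, not the author's code: the function names, the test signal and the two trial periods are assumptions chosen for the example, and only the unnormalized sum of squared differences is computed.

```python
import numpy as np

def fold_and_order(times, values, period):
    """Assign phases for one trial period and return (phases, values) in ascending phase order."""
    phases = np.mod(np.asarray(times, dtype=float) / period, 1.0)
    order = np.argsort(phases, kind="stable")
    return phases[order], np.asarray(values, dtype=float)[order]

def raw_string_length(ordered_values):
    """Sum of squared differences between consecutive phase-ordered values."""
    return float(np.sum(np.diff(ordered_values) ** 2))

# Illustrative data only: a noise-free sinusoid of period 2.5 sampled at irregular times.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 50.0, size=40))
m = np.sin(2.0 * np.pi * t / 2.5)

_, m_true = fold_and_order(t, m, 2.5)    # trial period equal to the underlying period
_, m_wrong = fold_and_order(t, m, 3.1)   # mismatched trial period
sl_true = raw_string_length(m_true)
sl_wrong = raw_string_length(m_wrong)
```

At the matching period the re-ordered values trace a smooth curve, so `sl_true` is much smaller than `sl_wrong`; repeating the calculation over a grid of trial periods yields the periodogram.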
Inspection of the literature citing use of the LK principle generally shows that periods are promoted without any indication of the certainty of detection and without confidence intervals (errors) assigned to the determined period. Some references suggest that a variant of the LK statistic has been applied but without its description. For example, a basic period search may be undertaken by calculating the statistic's numerator (see Eq. (1)), without incorporating the denominator's normalizing factor.
In this work, the LK method is more firmly established with a clear recipe for its application. Its form is revised so that the estimation of the statistic is freed from data sample size bias. The power of the statistic, in terms of its abilities to detect a period and for any period not to be overlooked, is explored according to the number of available measurements, N, and their signal-to-noise ratio. The LK statistic is also considered as a means of investigating the nature of the temporal assembly of data which, without the time element, may simply appear to be distributed Normally.
The statistic is developed to provide search algorithms for application to multivariate data. This latter exercise involves the combination of S-L calculations and might be described as a "Rope-Length" (R-L) method giving rise to periodograms assigned the abbreviation RLLK(Z,P), where Z corresponds to the number of measured parameters. In the first instance the R-L methods are considered for data involving simultaneously measured parameters. The method is then generalized to show that it can be applied to multivariate data sets with parameters measured at independent times, so illustrating how SLLK periodograms obtained from observations of different parameters are readily combined.
The original LK statistic, \Theta(P), is based on the determination of the sum of the squares of the vector lengths (S-Ls) required to connect re-ordered measurements, m_{i}, in phase sequence, \phi_{i}, for each of the trial periods in the prescribed grid. The essence of the original formulation may be expressed in the form of a statistic written as:

\Theta(P) = \frac{\sum_{i=1}^{N} (m_{i+1} - m_{i})^{2}}{\sum_{i=1}^{N} (m_{i} - \bar{m})^{2}} \qquad (1)
Although not mentioned by LK, it may be noted that full utilization of the data is made by including the vector length between the last and first measurement of the re-ordered sequence by letting m_{N+1} = m_{1} in the above summation. By using the squares of the vector lengths in the summation, both the upswings and downswings between the adjacent re-ordered data make a positive contribution to the statistic so that it does not converge to zero as N increases. As expressed in Eq. (1), the normalizing denominator is the variance of the measurements. By applying this factor the statistic's value becomes independent of the measurement noise. Although it has no consequence on the determination of the minima positions in the S-L periodogram, this factor produces regularisation of the periodogram continuum levels, with a scale allowing standard confidence levels of detection to be applied to any suspected period, no matter the signal-to-noise ratio (S/N) of the data. Horne & Baliunas (1986) showed the importance of doing this in their refinement of a period searching method involving Fourier analysis.
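By expanding both the numerator and denominator of the statistic, with m_{N+1} = m_{1}, it is readily shown that

\frac{\sum_{i=1}^{N} (m_{i+1} - m_{i})^{2}}{\sum_{i=1}^{N} (m_{i} - \bar{m})^{2}} = 2 \left[ 1 - \frac{\sum_{i=1}^{N} (m_{i} - \bar{m})(m_{i+1} - \bar{m})}{\sum_{i=1}^{N} (m_{i} - \bar{m})^{2}} \right] \qquad (2)

so that the statistic is close to 2 when consecutive re-ordered measurements are uncorrelated. Trial applications of the LK statistic, as calculated by Eq. (1), show that the mean periodogram levels increase slightly to be above 2.0 as N reduces, resulting from the fact that the denominator relates to the `true' variance of the data. This problem is addressed by the introduction of a term as in the definition of the "sample" variance. By scaling the LK statistic by the factor (N-1)/N, sample-size bias is removed. In addition, to generate periodograms with continuum levels of unity, a further normalizing factor of 1/2 needs to be applied. A regularized formula for applying the LK principle leading to SLLK(P) periodograms may thus best be written as

T(P) = \frac{\sum_{i=1}^{N} (m_{i+1} - m_{i})^{2}}{\sum_{i=1}^{N} (m_{i} - \bar{m})^{2}} \times \frac{(N-1)}{2N} \qquad (3)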
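A minimal sketch of the regularized statistic may help to show the effect of the two scaling factors. It assumes the cyclic closure m_{N+1} = m_{1} and the scaling (N-1)/(2N) described in the text; the function name `sllk` and the random test data are illustrative choices, not part of the original work.

```python
import numpy as np

def sllk(ordered_values):
    """Regularized string-length statistic for values already placed in phase order.

    The loop is closed (last value joined to the first) and the ratio of sums
    is scaled by (N - 1) / (2 N).
    """
    m = np.asarray(ordered_values, dtype=float)
    n = m.size
    closed = np.append(m, m[0])                      # m_{N+1} = m_1
    numerator = np.sum(np.diff(closed) ** 2)
    denominator = np.sum((m - m.mean()) ** 2)
    return float(numerator / denominator * (n - 1) / (2.0 * n))

# With no ordering information (random permutations of a small sample),
# the statistic should average to unity whatever the sample size.
rng = np.random.default_rng(1)
data = rng.normal(size=12)
mean_level = np.mean([sllk(rng.permutation(data)) for _ in range(5000)])
```

Averaged over random orderings of a fixed sample, `mean_level` settles near 1.0 even for N as small as 12, which is the sample-size independence the scaling is intended to provide.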
Various exercises were established to explore the behaviour of periodograms produced by the T(P) statistic and to examine its power to detect periodicity. For the simulations, data collection was mimicked by computer generation of N values from an underlying sinusoidal signal, the simulated measurements being represented by a constant signal level plus a sinusoid of amplitude B evaluated at each measurement phase, with added measurement noise.
In this analysis, homogeneous data are assumed with the values carrying the same level of noise, i.e., the S/N ratio of the measurements is constant. As the T(P) statistic is normalized with respect to the sample variance of measurements, its power of period detection may be simply explored according to the ratio, X, of the underlying amplitude, B, to the S/N ratio of the measurements. For data limited in accuracy by photon counting statistics, the measurement noise may be simply expressed as being the square root of the signal. Hence the value of X may be expressed as the amplitude, B, divided by the square root of the signal level.
The range of mimicked data sets investigated covered values of N from 5 to 100 with values of X from 0.5 to 10. For any given value of X and N, checks showed that the numerical behaviour of T(P) is independent of the signal level and of the S/N ratio of the basic measurements, so confirming the efficacy of the normalizing procedure defining T(P). The statistical behaviour of T(P) when no periodicity is present was investigated simply by letting B=0. For this case the mean level through the periodograms, \bar{T}(P), was found to be 1.0 for all N, as expected. For data involving small X, the levels of all periodograms were close to unity, a very small departure resulting from the signal oscillation affecting an otherwise Normal distribution of measurements.
From the simulated data produced in the way above, two values for the S-L were determined. Firstly, calculations of T(P) were obtained directly, with the various phase values generated in random order, these effectively being representative of any trial mismatched period. For each combination of N and X, the procedure was performed 2000 times. By re-ordering the results in ascending sequence, again by a NAG routine, a normalized cumulative distribution function (CDF) for T(P) is generated. Figure 1 provides examples of such CDFs for three different values of N, for simulated data with B=0.0. It can be seen that the spread in values of T(P) decreases with N, i.e., the noise of the periodogram reduces with N, as might be expected. Corresponding to the 10th, 50th and 100th points in the CDF, the values of T(P) at the lower 1%, 5% and 10% quantiles may be read, so providing boundaries which any spot value must fall below for a period detection at confidence levels of 99%, 95%, 90% respectively, these being written as T^{[99]}(P), T^{[95]}(P) and T^{[90]}(P).
Figure 1: The three symmetric normalized CDFs to the right correspond to data with no oscillatory signal present (X=0); the crossovers for all of them occur at the value of T(P)=1.0. The spread in the T(P) values at the low and high tails describes the noise of the associated periodogram which, as expected, decreases with N. The T^{[90]}(P) and T^{[95]}(P) levels are marked and their values may be determined for any CDF. The three asymmetric curves in the left of the diagram [smaller values of T(P)] provide examples of T(P_{0}) CDFs for data with X=1.0; the upper tail percentile levels of T^{[95]}(P_{0}) and T^{[90]}(P_{0}) are marked.
Secondly, for each of the N, X combinations, the data were ordered in ascending phase from 0 to 1, the determinations of T(P) now taking minimum values as though P is selected as P_{0}. Again 2000 determinations from each cycle were re-ordered in ascending sequence to provide CDFs for T(P_{0}) (see Fig. 1). In the assessment of the power not to miss a period detection as a result of the way a particular data sample has been assembled, the behaviour of the upper tail of the distribution of T(P_{0}) is important and the 90%, 95% and 99% percentiles were selected in this zone, these corresponding to the 1901st, 1951st and 1991st points and being written as T^{[90]}(P_{0}), T^{[95]}(P_{0}) and T^{[99]}(P_{0}).
The whole procedure above was undertaken 30 times to confirm the stability of all the determined elements from the CDFs of T(P) and T(P_{0}); overall means were obtained for \bar{T}(P) and \bar{T}(P_{0}) and for the various defined percentiles.
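The two Monte Carlo exercises can be sketched together as follows. This is an illustration under stated assumptions rather than a reproduction of the original programs: the noise is Gaussian with unit standard deviation (so that the amplitude equals X), the mean signal level is omitted because the statistic does not depend on it, and quantiles are taken with `numpy.quantile` instead of reading ranked points from the ordered list.

```python
import numpy as np

def sllk(ordered_values):
    """Regularized string-length statistic with cyclic closure and (N-1)/(2N) scaling."""
    m = np.asarray(ordered_values, dtype=float)
    n = m.size
    numerator = np.sum(np.diff(np.append(m, m[0])) ** 2)
    return float(numerator / np.sum((m - m.mean()) ** 2) * (n - 1) / (2.0 * n))

def simulate_levels(n, x, trials=2000, seed=0):
    """Return mean T(P), mean T(P0), lower quantiles of T(P) and upper percentiles of T(P0)."""
    rng = np.random.default_rng(seed)
    t_random = np.empty(trials)
    t_matched = np.empty(trials)
    for k in range(trials):
        phases = rng.uniform(0.0, 1.0, size=n)
        values = x * np.sin(2.0 * np.pi * phases) + rng.normal(size=n)
        t_random[k] = sllk(values)                        # random order: mismatched trial period
        t_matched[k] = sllk(values[np.argsort(phases)])   # phase order: trial period equals P0
    lower = np.quantile(t_random, [0.01, 0.05, 0.10])     # levels for 99%, 95%, 90% confidence
    upper = np.quantile(t_matched, [0.90, 0.95, 0.99])    # upper tail of the matched distribution
    return t_random.mean(), t_matched.mean(), lower, upper

mean_random, mean_matched, lower, upper = simulate_levels(20, 3.0)
```

For N=20 and X=3.0 the mean of the matched-period values lies well below the lower 1% level of the mismatched values, while the mean of the mismatched values stays close to unity.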
The determined mean T(P_{0}) values, according to the N values, correspond to the S-Ls obtained by connecting the data when the trial period matches the underlying value. Their behaviour and the associated distributions from which they are determined also offer information on the expectation to detect any underlying period. As mentioned above, if a trial value of P provides an S-L smaller than those associated with T^{[99]}(P), T^{[95]}(P) and T^{[90]}(P), then this period may be considered as being detected at the corresponding confidence level. Requirements of a data sample in terms of the number of measurements needed for detection of a period at a selected confidence level may be readily assessed by plotting \bar{T}(P_{0}) against N together with various percentile values and noting the crossover positions (see the example in Fig. 2).
Figure 2: An example of the investigation of the behaviour of T(P_{0}) according to the number of measurements, N, is presented for the situation of an underlying sinusoidal variation with X=3.0. The behaviour of \bar{T}(P_{0}) displays a smooth fall with N, as do the curves for T^{[95]}(P_{0}) and T^{[99]}(P_{0}). For reference, the behaviour of \bar{T}(P) with N is displayed; its value is 1.0, independent of N. Curves are also drawn for two lower percentile levels of T(P), these displaying minima at some low value of N (<10). As examples of the interpretation of the crossovers of curves within the figure, it can be seen that 12 data points are required to obtain detection with a 99% confidence level with a 50% chance of making the detection; 24 measurements are required to obtain the same level of confidence but with a 99% certainty of making the detection.
Table 1: Crossover points of the lower T^{[99]}(P), T^{[95]}(P) and T^{[90]}(P) levels with \bar{T}(P_{0}) and with T^{[99]}(P_{0}), for X from 0.5 to 10. Each pair of columns gives the number of measurements, N, at the crossover and the value of the statistic there.
 | T^{[99]}(P) | | | | T^{[95]}(P) | | | | T^{[90]}(P) | | | |
X | N | \bar{T}(P_{0}) | N | T^{[99]}(P_{0}) | N | \bar{T}(P_{0}) | N | T^{[99]}(P_{0}) | N | \bar{T}(P_{0}) | N | T^{[99]}(P_{0}) |
0.5 | >100 | -- | >100 | -- | >100 | -- | >100 | -- | >100 | -- | >100 | -- |
1.0 | 59 | 0.68 | >100 | -- | 35 | 0.69 | >100 | -- | 26 | 0.70 | >100 | -- |
2.0 | 16 | 0.44 | 44 | 0.63 | 12 | 0.51 | 36 | 0.69 | 10 | 0.57 | 30 | 0.72 |
2.5 | 14 | 0.41 | 30 | 0.56 | 10 | 0.48 | 24 | 0.62 | 9 | 0.55 | 21 | 0.66 |
3.0 | 12 | 0.38 | 24 | 0.52 | 10 | 0.48 | 19 | 0.59 | 8 | 0.53 | 17 | 0.63 |
4.0 | 11 | 0.36 | 18 | 0.45 | 9 | 0.46 | 15 | 0.54 | 8 | 0.52 | 14 | 0.60 |
5.0 | 11 | 0.35 | 16 | 0.44 | 9 | 0.45 | 14 | 0.53 | 8 | 0.51 | 13 | 0.59 |
10.0 | 10 | 0.34 | 14 | 0.39 | 8 | 0.44 | 12 | 0.50 | 7 | 0.50 | 11 | 0.56 |
Any S-L for a period P_{0} using N data points may be considered to have arisen from a distribution of values with a mean of \bar{T}(P_{0}). It will therefore be appreciated that any calculated S-L value matching the above defined crossover points corresponds to a 50% probability of detecting the underlying period from random data sampling patterns. Even if a period of P_{0} is present, the sampling for some particular data sets may produce an S-L value at P_{0} which happens to be larger than the mean of its underlying distribution. With such a higher value, it might be embedded in the noise of the periodogram and the period would not be detected. Thus the power of the estimator not to miss detection of a period might be more realistically assessed from the crossovers of the low end tail of the distribution of T(P) with those of the high end tail of the distribution of T(P_{0}). It was for this reason that the 99%, 95% and 90% percentiles of the upper tail of the CDF of T(P_{0}) were highlighted in the exercise. By determining the N for which the various selected percentile values associated with the larger values of T(P_{0}) are smaller than those of the selected percentiles of the smaller values of T(P), the probability of not missing an underlying period and detecting it at some particular level of confidence may be estimated according to the data sample size.
An example of the interpretation of an investigation of the behaviour of T(P_{0}) is presented in Fig. 2. As expected, the mean values of un-ordered data, \bar{T}(P), forming any periodogram are constant with a value of 1.0. The figure shows the behaviour of two lower percentile levels of T(P), corresponding to lower excursions within the generated periodogram, with minima between N=5 and 8; for larger data samples, the curves converge smoothly towards the level of 1.0, illustrating the reduction of the periodogram noise as N increases. The curves reflecting the behaviour of \bar{T}(P_{0}), T^{[95]}(P_{0}) and T^{[99]}(P_{0}) all exhibit a near exponential fall, indicating the improved period detectivity as N increases.
From Fig. 2, inspection of T^{[99]}(P) shows that for any period to be detected with a confidence level of 99%, the value of T(P) must fall below a threshold that depends on N, the two values read from the figure being 0.35 and 0.53. The variation of the mean values, \bar{T}(P_{0}), shows a crossover with T^{[99]}(P) at N=12, this giving a criterion for the number of data points required for a 50% chance of detection at a 99% confidence level. Inspection of the crossover of T^{[99]}(P) with the curves corresponding to T^{[95]}(P_{0}) and T^{[99]}(P_{0}) shows that for a detection with 99% confidence, the 95% and 99% chances of succeeding require N to be >19 and 24 respectively.
Similar diagrams for other values of X show that the curves for the T(P_{0}) distribution are sensitive to the ratio of the sinusoidal amplitude to the S/N ratio of the basic measurements. When X is increased to 5, the number of points required for detection reduces to 11 at the 50% probability level of detection and to 16 with a 99% probability. A summary of the cross-over points of T(P) at the 90%, 95% and 99% lower levels with \bar{T}(P_{0}) for X values from 0.5 to 10 is presented in Table 1. Also provided are the number of required measurements to ensure at the listed 99% confidence levels that any period would not be missed from any random data collection.
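Reading a crossover from tabulated curves can be sketched with a small helper. The function name and the four-point curves below are hypothetical and chosen only to illustrate the logic; they are not values taken from the tables or figures.

```python
import numpy as np

def first_crossover(n_values, detection_level, matched_curve):
    """Smallest N at which the matched-period curve lies below the detection
    level and stays below it for every larger N in the grid; None otherwise."""
    n_values = np.asarray(n_values)
    below = np.asarray(matched_curve, dtype=float) < np.asarray(detection_level, dtype=float)
    for i in range(below.size):
        if below[i:].all():
            return int(n_values[i])
    return None

# Hypothetical curves on a coarse grid of sample sizes (illustrative numbers only).
n_grid = [8, 10, 12, 14]
level_99 = [0.33, 0.34, 0.38, 0.41]       # lower 1% level of the mismatched statistic
mean_matched = [0.50, 0.43, 0.37, 0.33]   # mean of the matched-period statistic
n_required = first_crossover(n_grid, level_99, mean_matched)
```

Passing an upper percentile curve of the matched-period distribution in place of its mean gives, in the same way, the sample size needed for a stated chance of not missing the period.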
The above study of T(P) provides information on the statistical behaviour of point values in the periodogram. Although it does not offer a definitive recipe for fully assessing the behaviour over the interval containing an identifiable period, the results give insight on confidence levels of the detection. If no period is detected, a point value assessment can be applied to estimate the amplitude level that would have been confidently detected from the N values of the data set.
Point values may, however, be applied directly to study the way any data have been assembled. For example, in the first instance, some data, without the time element, may suggest that they are part of some Normal distribution. By applying the SLLK statistic to the data in their order of collection, the value of the statistic should be close to unity. If the value is smaller, its departure from 1.0 may be tested for its statistical significance to explore whether the data form a correlated time sequence and are not simply representative of measurements taken at random from the underlying Normal distribution. Such an exercise was recently applied by Oskinova et al. (2001) to X-ray studies of hot stellar winds.
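One way such a significance test might be sketched is with a permutation test, in which the observed value is compared with values from shuffled copies of the same data. This is a swapped-in technique used here for illustration; the text instead refers the point value to the simulated CDFs. The function names and the smooth test sequence are assumptions.

```python
import numpy as np

def sllk(ordered_values):
    """Regularized string-length statistic with cyclic closure and (N-1)/(2N) scaling."""
    m = np.asarray(ordered_values, dtype=float)
    n = m.size
    numerator = np.sum(np.diff(np.append(m, m[0])) ** 2)
    return float(numerator / np.sum((m - m.mean()) ** 2) * (n - 1) / (2.0 * n))

def sequence_test(values, n_shuffles=5000, seed=0):
    """Return the statistic for the data in time order and the fraction of
    shuffled sequences giving a value at least as small."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = sllk(values)
    shuffled = np.array([sllk(rng.permutation(values)) for _ in range(n_shuffles)])
    return observed, float(np.mean(shuffled <= observed))

# Illustrative sequence: a slow drift sampled 40 times in order of collection.
drift = np.sin(np.linspace(0.0, 3.0, 40))
observed, p_value = sequence_test(drift)
```

A small fraction indicates that the time ordering carries information, i.e., that the sequence is correlated and is not a random draw from a single distribution.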
Again, the same computer programs were used to produce artificial data with the addition that the phase values were ascribed as measurement times, i.e., to each of the generated random phase values an integer was added in succession from 1 to N. In this way the value of the period is effectively normalized to be unity, with a sampling routine which provides one value per cycle with random phase. SLLK periodograms obtained from the exercise show that when X=0.0, the mean values of T(P) are unity and that the noise behaves according to the earlier derived CDF for the given N measurements. Various periodograms for the X values listed in Table 1 were generated covering a range of trial periods, with selected periods differing by 0.001, the latter being approximately the limit of period resolution, i.e., there are no zones through T(P) with flat sections over which the measurement order does not change for successive trial periods.
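The periodogram exercise can be sketched as below, assuming Gaussian noise of unit standard deviation (so that the amplitude equals X). The trial-period range of 0.9 to 1.1 and the values N=50, X=10 are illustrative assumptions; only the grid spacing of 0.001 is taken from the text.

```python
import numpy as np

def sllk(ordered_values):
    """Regularized string-length statistic with cyclic closure and (N-1)/(2N) scaling."""
    m = np.asarray(ordered_values, dtype=float)
    n = m.size
    numerator = np.sum(np.diff(np.append(m, m[0])) ** 2)
    return float(numerator / np.sum((m - m.mean()) ** 2) * (n - 1) / (2.0 * n))

def sllk_periodogram(times, values, trial_periods):
    """Evaluate T(P) over a grid of trial periods."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    result = np.empty(len(trial_periods))
    for k, period in enumerate(trial_periods):
        order = np.argsort(np.mod(times / period, 1.0))   # re-order by phase
        result[k] = sllk(values[order])
    return result

rng = np.random.default_rng(3)
n, x = 50, 10.0
times = np.arange(1, n + 1) + rng.uniform(0.0, 1.0, size=n)   # one sample per cycle, random phase
values = x * np.sin(2.0 * np.pi * times) + rng.normal(size=n)  # underlying period of unity
trial_periods = np.arange(0.9, 1.1, 0.001)                     # assumed search range
t_p = sllk_periodogram(times, values, trial_periods)
best_period = trial_periods[np.argmin(t_p)]
```

Away from the underlying period the values of `t_p` scatter about unity; near P=1.0 they fall to a deep minimum.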
Figure 3 provides three examples of the behaviour of T(P) around the value of P=1.0 for N=15 at a fixed value of X. Rather than displaying a point minimum at the value of P=1.0, a typical periodogram displays a noisy descent from unity to a minimum followed by a noisy return. It can be seen that the minima all lie below the 99% noise level associated with extreme low point values within a periodogram based on the same number of measurements but with X=0. The indication that a period is present is better assessed, therefore, by considering zones over which the values in the depression fall below some selected noise value for a periodogram rather than just considering any isolated point value.
The means of 30 repeated runs for N=15, 25 and 50 with X=3.0 are displayed in Fig. 4, together with the appropriate 99% noise levels. It can be seen that the half-widths of the minima reduce as N grows. In addition to the slight increases in depth of the minima with N, it can also be seen from the 99% noise levels how the power of detection increases dramatically with N and how the determined period value becomes better defined.
Figure 3: Three sample periodograms from artificial data comprising 15 randomly phased measurements. It may be noted that the minima around P=1.0 lie below the marked 99% level of the noise of the periodogram.
Figure 4: Mean periodograms based on 30 generated data sets with N=15, 25 and 50 and X=3.0 demonstrate the narrowing and deepening of the minimum associated with the underlying period, so indicating how period detection improves with the number of measurements. The values associated with the 99% noise level of the periodograms are indicated, these giving a better indication of how the power of detection increases with N.
For any SLLK periodogram providing a well defined minimum with T(P) values below some assigned value associated with an acceptable confidence for a period detection, the progression through the minimum is unlikely to be smooth and may display steps because of a slight over-sampling. In order to determine the best period, the method proposed by Fernie (1989) may be applied to provide an interpolated value (based on the algorithm of Kwee & van Woerden 1956), together with an error estimate. For the periodograms displayed in Fig. 4, this method provided period values for N=15, 25 and 50, the error estimates reducing significantly as N increases. The accuracy of the period may also be assessed by progressively decreasing the sampling interval of the trial periods until the minimum displays a flat section; at this stage, the accuracy of the determined period is of the same order as the periodogram resolution.
Figure 5: The SLLK periodogram for the 40 V-band measurements by Moffett & Barnes (1984) of the Cepheid variable star, AL Vir, displays a deep minimum with oscillations resulting from windowing effects. The best period obtained using the procedure advocated by Fernie (1989) compares well with the period of 10.302323 d listed by Moffett & Barnes (loc. cit.).
An example of a periodogram obtained from real photometric data is given in Fig. 5 for 40 V-band measurements from Moffett & Barnes (1984) (hereafter referred to as M&B) for the Cepheid variable star, AL Vir; a deep minimum is clearly seen. The method of Fernie (1989) provides a best period that may be compared with the value of 10.302323 d listed by M&B. The oscillatory nature of the periodogram over the displayed region results from sampling and windowing effects associated with the collection of these data. The periodogram obtained from matching B-band data is almost identical, confirming this conclusion. The fact that the level of the periodogram is generally less than unity is a result of this particular data set. The depth of the minimum indicates very clearly how well the periodicity has been detected for this star with measurements of high X value.
Figure 6: Photometric data from Moffett & Barnes (1984) for the Cepheid variable AL Vir made in the V and B bands show in a) that the measurements lie on a near elliptical locus, indicating a strong correlation between the variability in the two colours but with a varying phase difference. The data are connected in order of their collection in b), indicating the large R-L required to join the measurements in the VB-plane. In c) the connection has been re-ordered according to the period of 10.3095 d, this giving a minimum R-L. If data point connections were to be made on the original period of 10.302323 d (see Moffett & Barnes 1984), the R-L is less clean and retraces itself at locations of maximum and minimum light. The execution of the locus with time follows a clockwise progression.
In this section it will be demonstrated that the principle of the S-L method as embraced by the SLLK algorithm is readily extendable to time-series analyses of multivariate data. The development involves calculation of statistics comprising the combination or "twining" of "strings" associated with each of the measured parameters to provide "Rope-Lengths" (R-Ls). The concept is readily appreciated both in visual and physical terms for two-parameter data, with pairs of measurements obtained at identical times.
As an illustrative example, consider the V-band data of AL Vir, used in the production of Fig. 5, in combination with the complementary B-band measurements. The brightness changes in both bands are very significant relative to the measurement noise. The simultaneous V, B magnitude values are plotted against each other in Fig. 6a and the obvious correlation of behaviour of the two parameters is revealed with a near linear relationship between B and V. It may be noted that more points appear at the extreme ends of the plot, as would be expected if a sinusoidal variation is sampled at random.
In more detail, the data distribution follows a locus more like that of a near-ellipse, suggesting that the two-colour measurements exhibit similar variations but with a phase difference. The path through the points is akin to a Lissajous figure produced by compounding measurements of orthogonal oscillatory variations of differing amplitude and phase. A strictly linear path indicates that the two oscillations are permanently in phase with the gradient determined by their relative amplitudes. Open patterns reveal the presence of phase differences in their behaviour. Although not normally depicted in this way, such behaviour is well known in colour photometry of Cepheids and it can be seen here that at AL Vir's maximum light (upper right of Fig. 6) the phase difference between B and V is small, whereas, at light minimum, the phase difference is very significant.
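The link between the phase difference and the shape of the two-colour locus can be illustrated with idealized sinusoids. The sketch below is not based on the AL Vir data; the amplitude ratio of 1.5 and the function names are assumptions. For two sinusoids of the same period sampled evenly over one cycle, the correlation coefficient of the pair equals the cosine of their phase difference.

```python
import numpy as np

def two_colour_locus(phase_difference, n=200, amplitude_ratio=1.5):
    """Return idealized (v, b) variations of equal period, offset by a phase difference in radians."""
    phase = np.linspace(0.0, 1.0, n, endpoint=False)
    v = np.sin(2.0 * np.pi * phase)
    b = amplitude_ratio * np.sin(2.0 * np.pi * phase + phase_difference)
    return v, b

def locus_correlation(phase_difference):
    """Correlation coefficient between the two variations over one full cycle."""
    v, b = two_colour_locus(phase_difference)
    return float(np.corrcoef(v, b)[0, 1])

in_phase = locus_correlation(0.0)               # straight-line locus
quarter_cycle = locus_correlation(np.pi / 2.0)  # fully open locus
```

A correlation of unity corresponds to the strictly linear path, with gradient set by the amplitude ratio, while a value near zero corresponds to the most open pattern.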
The concept of examining data in this way may obviously be extended to more than two simultaneous measurements with complicated figures being executed in multi-parameter space.
As in the case of the earlier analysis of single parameter data, a grid of periods may be explored such that for each trial, the measurement pairs are assigned a phase, \phi, between 0 and 1, according to their measurement times. By re-assembling the data according to ascending phase values, they may be re-labelled, with the change of subscript indicating their new order. A periodogram based on R-L values may then be produced by repeating the phase ordering exercise for each trial period, P, and determining the appropriate R-L value. Thus, in its basic form, a two-parameter R-L periodogram may be represented by the sum of the lengths of the vectors joining consecutive phase-ordered points in the two-parameter plane.
If the data set comprises several simultaneously measured parameters, the summation of the vectors joining the points in multi-parameter space when phased according to a trial period is simply the same sum taken over all of the measured parameters [Eq. (7)].
Rather than calculating the "true" R-Ls in multivariate space as in Eq. (7), the R-L may be determined without taking the square root of each of the contributing vectors, i.e., as the sum of the squared vector lengths.
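In a notation assumed here for illustration (x_{i}, y_{i} for the phase-ordered two-parameter values, m_{z,i} for the i-th phase-ordered value of parameter z, and cyclic closure of the sums), the three R-L forms described above may be sketched as follows. The symbols L_{2}, L_{Z} and L'_{Z} are labels introduced for this sketch only.

```latex
% Two-parameter "true" rope length: sum of vector lengths in the plane
L_{2}(P) = \sum_{i=1}^{N} \sqrt{(x_{i+1}-x_{i})^{2} + (y_{i+1}-y_{i})^{2}}

% Z simultaneously measured parameters
L_{Z}(P) = \sum_{i=1}^{N} \sqrt{\sum_{z=1}^{Z} (m_{z,i+1}-m_{z,i})^{2}}

% Without the square root, the double sum separates into one string length per parameter
L'_{Z}(P) = \sum_{i=1}^{N} \sum_{z=1}^{Z} (m_{z,i+1}-m_{z,i})^{2}
          = \sum_{z=1}^{Z} \left[ \sum_{i=1}^{N} (m_{z,i+1}-m_{z,i})^{2} \right]
```

The last line shows why dropping the square root is convenient: the rope length becomes a sum of independent string lengths, one per parameter, each of which can be normalized separately.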
Since the SLLK for each parameter is independent of the number of contributing measurements, a very important result from Eq.(12) is that a periodogram can be obtained from combination of multi-parameter data sets comprising differing numbers of measurements with records not necessarily obtained at identical times. One advantage of this is that all the data from a study may be utilized even if there are recording gaps for some of the parameters. Combination of such data may reduce sampling and windowing effects that may be apparent if the reduction is simply limited to those measurements of the parameters taken at identical times.
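A combination of per-parameter periodograms from data sets of unequal size might be sketched as below. The exact form of the combined statistic is not reproduced here; the sketch assumes a weighted mean of the individual T(P) curves, and the function names, the simulated data, the phase offset and the weights are all illustrative assumptions.

```python
import numpy as np

def sllk(ordered_values):
    """Regularized string-length statistic with cyclic closure and (N-1)/(2N) scaling."""
    m = np.asarray(ordered_values, dtype=float)
    n = m.size
    numerator = np.sum(np.diff(np.append(m, m[0])) ** 2)
    return float(numerator / np.sum((m - m.mean()) ** 2) * (n - 1) / (2.0 * n))

def sllk_periodogram(times, values, trial_periods):
    """Evaluate T(P) for one parameter over a grid of trial periods."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    return np.array([sllk(values[np.argsort(np.mod(times / p, 1.0))])
                     for p in trial_periods])

def rllk_periodogram(datasets, trial_periods, weights=None):
    """Assumed combination: weighted mean of the per-parameter periodograms.
    `datasets` is a list of (times, values) pairs, which may differ in length."""
    curves = np.array([sllk_periodogram(t, m, trial_periods) for t, m in datasets])
    return np.average(curves, axis=0, weights=weights)

# Two parameters sharing a period of 2.0 but observed at different times (simulated).
rng = np.random.default_rng(4)
t_a = np.sort(rng.uniform(0.0, 30.0, size=25))
t_b = np.sort(rng.uniform(0.0, 30.0, size=32))
a = 5.0 * np.sin(2.0 * np.pi * t_a / 2.0) + rng.normal(size=25)
b = 5.0 * np.sin(2.0 * np.pi * t_b / 2.0 + 1.0) + rng.normal(size=32)
trial_periods = np.arange(1.5, 2.5, 0.005)
combined = rllk_periodogram([(t_a, a), (t_b, b)], trial_periods, weights=[2.0, 1.0])
best_period = trial_periods[np.argmin(combined)]
```

Because each T(P) curve has a continuum level of unity regardless of its number of measurements, the curves can be averaged directly, with weights reflecting the quality of each data set.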
In the following section, the behaviour of spot values from R-L periodograms based on T(Z,P) as defined by Eq. (12) is investigated by computer simulation in a similar fashion as for SLLK above.
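As for the single parameter exercise described in Sect. 3, and following a parallel nomenclature, distributions for T(Z,P) and T(Z,P_{0}) were established for each combination of the simulation parameters. Mean values, \bar{T}(Z,P) and \bar{T}(Z,P_{0}), were calculated, together with the lower percentiles T^{[99]}(Z,P), T^{[95]}(Z,P), T^{[90]}(Z,P) and upper percentiles T^{[90]}(Z,P_{0}), T^{[95]}(Z,P_{0}), T^{[99]}(Z,P_{0}).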
Again, following the same arguments as for the single parameter investigation, the way in which the S/N ratio of the periodogram itself improves, together with the confidence values of a period detection and of a period not being overlooked, can readily be assessed by production of diagrams similar to that of Fig. 2, although these are not presented here. A summary of the information is, however, presented in Table 2. Comparison with Table 1 shows the general improvement of sensitivity and the reduced periodogram noise that multi-parameter data offer.
Although all the parameters of any multivariate data may carry the same underlying period, the oscillations may be out of phase with respect to each other. In some cases, the periodic behaviour may not be sinusoidal and the phase differences between the parameters may change over the period. In order to see the effect that a phase difference has on T(2,P), artificial data were generated for two measurement parameters (Z=2) with identical values for X (=3), but allowing for a constant phase difference to be present between the underlying variations of measurement pairs. The results of the exercise are also summarized within Table 2.
It can be seen that the power of T(2,P) increases significantly with the phase difference, reaching its maximum at a phase difference of a quarter of a cycle, with the data producing a circular locus in the two-dimensional data plane. Beyond this phase value, the behaviour of the power shows symmetry, with phase differences equally displaced on either side of the quarter cycle being equivalent. Thus, in terms of detection of periodicity in small amplitude signals, there is positive advantage in using T(Z,P) as a means of period detection.
Simulated data with a normalized period of 1.0 were generated for N=15, 20 and 50 with X=3.0 in similar fashion as in Sect. 3.2 but with Z values of 2, 3 and 5. Individual periodograms were very similar to the investigations of T(P) in Sect. 3.
Mean periodograms based on 30 repetitions for each situation were similar in outcome to Fig. 4, but with the noticeable improvement of a deepening minimum as Z increases.
Finally, an analysis of observations of M&B for RV Sco serves as an example of R-L combinations of parameters for data sets with unequal numbers. Their original table of measurements shows that 25 simultaneously recorded BVRI values are available with 32 additional BV values. A period of 6.061388 d is also ascribed. Figures 7a and 7b display the periodograms over the range 5.0 d to 7.0 d with grid spacing of 0.005 d for the 4-colour and additional 2-colour data respectively. Although the periodogram in Fig. 7a is noisy, the presence of the period is clearly seen. Again the period is seen in Fig. 7b but it is obvious that the data sampling here is not as good as for Fig. 7a. As a consequence, when the overall summation of T(Z,P) is effected, it is sensible to weight the contributions. For the example here, this has been done in the ratio of 2:1 respectively for the periodograms of Figs. 7a and b, with the resulting periodogram displayed in Fig. 7c, where it can be seen that the noise has been reduced relative to that of Fig. 7a. Further trial analyses showed that the finest sensible resolution for the period grid is 0.0001 d; the best deduced period compares with that of M&B, but again reveals their exuberance in quoting an excessive number of decimal places.
Z=2 | T^{[99]}(2,P) | T^{[95]}(2,P) | T^{[90]}(2,P) | |||||||||
X | N | N | T^{[99]}(P_{0}) | N | N | T^{[99]}(P_{0}) | N | N | T^{[99]}(P_{0}) | |||
0.5 | >100 | -- | >100 | -- | >100 | -- | >100 | -- | >100 | -- | >100 | -- |
1.0 | 33 | 0.69 | >100 | -- | 21 | 0.71 | >100 | -- | 16 | 0.73 | 90 | 0.87 |
2.0 | 13 | 0.48 | 30 | 0.63 | 10 | 0.56 | 23 | 0.68 | 9 | 0.62 | 21 | 0.72 |
2.5 | 12 | 0.45 | 22 | 0.55 | 9 | 0.52 | 19 | 0.63 | 8 | 0.58 | 16 | 0.67 |
3.0 | 11 | 0.41 | 19 | 0.51 | 9 | 0.51 | 16 | 0.60 | 8 | 0.57 | 14 | 0.64 |
4.0 | 10 | 0.38 | 16 | 0.45 | 9 | 0.49 | 14 | 0.55 | 8 | 0.54 | 13 | 0.60 |
5.0 | 10 | 0.36 | 15 | 0.44 | 8 | 0.46 | 13 | 0.54 | 7 | 0.53 | 12 | 0.60 |
10.0 | 10 | 0.34 | 14 | 0.40 | 8 | 0.44 | 12 | 0.51 | 7 | 0.51 | 11 | 0.57 |
Z=3 | T^{[99]}(3,P) | T^{[95]}(3,P) | T^{[90]}(3 ,P) | |||||||||
X | N | N | T^{[99]}(P_{0}) | N | N | T^{[99]}(P_{0}) | N | N | T^{[99]}(P_{0}) | |||
0.5 | >100 | -- | >100 | -- | >100 | -- | >100 | -- | 77 | 0.89 | >100 | -- |
1.0 | 26 | 0.70 | 94 | 0.83 | 17 | 0.73 | 74 | 0.86 | 13 | 0.75 | 64 | 0.87 |
2.0 | 12 | 0.51 | 25 | 0.62 | 10 | 0.59 | 20 | 0.68 | 8 | 0.63 | 18 | 0.72 |
2.5 | 11 | 0.45 | 20 | 0.56 | 9 | 0.55 | 17 | 0.63 | 8 | 0.60 | 15 | 0.68 |
3.0 | 11 | 0.44 | 18 | 0.52 | 9 | 0.52 | 15 | 0.60 | 8 | 0.58 | 14 | 0.65 |
4.0 | 10 | 0.39 | 16 | 0.47 | 8 | 0.48 | 13 | 0.55 | 8 | 0.55 | 12 | 0.61 |
5.0 | 10 | 0.37 | 15 | 0.44 | 8 | 0.47 | 13 | 0.54 | 7 | 0.53 | 12 | 0.60 |
10.0 | 10 | 0.34 | 14 | 0.44 | 8 | 0.44 | 12 | 0.51 | 7 | 0.51 | 11 | 0.57 |
Z=5 | T^{[99]}(5,P) | | | | T^{[95]}(5,P) | | | | T^{[90]}(5,P) | | | |
X | N | | N | T^{[99]}(P_{0}) | N | | N | T^{[99]}(P_{0}) | N | | N | T^{[99]}(P_{0}) |
0.5 | >100 | -- | >100 | -- | 66 | 0.89 | >100 | -- | 48 | 0.89 | >100 | -- |
1.0 | 20 | 0.72 | 64 | 0.81 | 13 | 0.75 | 48 | 0.86 | 11 | 0.78 | 43 | 0.87 |
2.0 | 12 | 0.53 | 22 | 0.62 | 9 | 0.60 | 18 | 0.69 | 8 | 0.65 | 16 | 0.72 |
2.5 | 11 | 0.47 | 18 | 0.56 | 9 | 0.56 | 15 | 0.63 | 8 | 0.61 | 14 | 0.68 |
3.0 | 10 | 0.43 | 16 | 0.51 | 8 | 0.53 | 14 | 0.60 | 8 | 0.59 | 13 | 0.65 |
4.0 | 10 | 0.40 | 15 | 0.46 | 8 | 0.49 | 13 | 0.56 | 7 | 0.55 | 12 | 0.62 |
5.0 | 10 | 0.38 | 14 | 0.43 | 8 | 0.47 | 12 | 0.53 | 7 | 0.54 | 11 | 0.59 |
10.0 | 10 | 0.34 | 14 | 0.40 | 8 | 0.45 | 12 | 0.50 | 7 | 0.51 | 11 | 0.57 |
The effect of a phase difference between the parameters with X=3.0 | ||||||||||||
Z=2 | T^{[99]}(2,P) | T^{[95]}(2,P) | T^{[90]}(2,P) | |||||||||
N | N | T^{[99]}(P_{0}) | N | N | T^{[99]}(P_{0}) | N | N | T^{[99]}(P_{0}) | ||||
0 | 11 | 0.41 | 9 | 0.51 | 8 | 0.57 | 19 | 0.51 | 16 | 0.60 | 14 | 0.64 |
| 11 | 0.41 | 9 | 0.51 | 8 | 0.57 | 19 | 0.52 | 16 | 0.60 | 14 | 0.64 |
| 11 | 0.43 | 9 | 0.53 | 8 | 0.58 | 19 | 0.52 | 16 | 0.61 | 14 | 0.65 |
| 10 | 0.43 | 8 | 0.53 | 7 | 0.59 | 18 | 0.53 | 15 | 0.61 | 14 | 0.66 |
| 10 | 0.46 | 8 | 0.55 | 7 | 0.61 | 17 | 0.54 | 14 | 0.62 | 13 | 0.67 |
| 10 | 0.47 | 8 | 0.57 | 7 | 0.63 | 16 | 0.56 | 14 | 0.64 | 13 | 0.69 |
| 9 | 0.49 | 8 | 0.59 | 7 | 0.65 | 16 | 0.58 | 13 | 0.65 | 12 | 0.69 |
| 9 | 0.51 | 7 | 0.60 | 7 | 0.66 | 15 | 0.58 | 13 | 0.66 | 12 | 0.70 |
| 9 | 0.51 | 7 | 0.61 | 7 | 0.67 | 15 | 0.59 | 13 | 0.66 | 12 | 0.71 |
Figure 7: In a) part of the periodogram is displayed based on the 25 BVRI measurements of RV Sco made by Moffett & Barnes (1984). The periodogram covering the same period interval for the other 32 BV measurements is displayed in b), revealing the effects of a poorer data sampling pattern relative to that of the data used to engender a). By combining the R-L values with a weighting of 2:1 with respect to a) and b) respectively, a less noisy periodogram is produced and displayed in c), using all of the data in the best way. The best period for RV Sco was obtained from exercises with higher resolution than displayed above.
(13) |
Figure 8: In a) the U band data for U Mon in the Stokes parameter plane are displayed with the points (measured in %) connected in order of their collection. The typical quality of the data is depicted by a point (not part of the data) at the left of the diagram carrying error bars. The periodogram based on RLLK for these data is displayed in c) and shows minima at 90 days and 178 days. By phasing the measurements on the best period, re-ordered connections are made in b); although a smooth locus is not apparent, the process removes the forward and backward movements across the central part of the data distribution.
There are many examples in the literature of polarimetric data which provide clean loci in the Stokes parameter plane with a cyclic path (see, for example, Drissen et al. 1986), for which RLLK would have obvious success in determining the period. To serve as an example here of the effectiveness of the algorithm, data have been taken from Serkowski (1970) for the RV Tauri star, U Mon. In discussions of polarimetry, it has always been assumed that the period held in these data is the same as that established from photometry and spectroscopy. The data comprise 37 U band values, 51 for the B band and 39 for the V band. Figure 8a displays the U band measurements in the form of a Stokes parameter plot.
The measurements for each colour were analyzed by RLLK in turn with a period grid of 0.02 d, providing minimum R-Ls at 89.68 d, 91.08 d and 92.78 d in the U, B and V bands respectively. The U band periodogram, with a grid of 0.5 d, is shown in Fig. 8c and displays the presence of an additional minimum close to twice the fundamental (178 days). This may result from the intrinsic noise of the star; Serkowski (1970) commented that the amount of intrinsic polarization of U Mon changes considerably from cycle to cycle, whereas the angle of the direction of vibration behaves similarly in each cycle. Alternatively, it might be related to the double periodicity behaviour seen in the photometry of these stars, with light curves exhibiting alternate deep and shallow minima.
When the colour data were lumped together and the exercise applied to the 6 parameters simultaneously without weighting, the determined period was 92.78 d. The photometric period suggested by Preston et al. (1963) was 92.23 d but, based on additional photometry conducted at the same time as the polarimetry, Serkowski (1970) suggested a period of 91.3 d. The variations in the quoted period probably reflect the presence of intrinsic fluctuations superimposed on the basic period.
Figure 8b displays the data point connections when their order is re-adjusted according to the ascribed phase defined by the period. Although the connections do not follow a smooth locus, it may be noted that the frequency of crossing the central part of the data distribution has been reduced by the procedure, as would be expected from data following a near elliptical locus with a noisy perimeter. The overall behaviour of the analysis is consistent with the star exhibiting fluctuations in the degree of polarization as the position angle of the polarization sweeps around through its full range, the whole pattern being offset from the origin of the Stokes parameter plane by a constant interstellar component.
By applying a normalizing factor of (N-1)/2N to the original Lafler & Kinman (1965) statistic, it has been demonstrated that the "String-Length" method of LK has been regularized. As well as being independent of the S/N ratio of the basic measurements, SLLK is now independent of the number of measurements in any examined data set. Any periodogram, SLLK(P), based on its evaluation at each trial period, should have a mean level of 1.0.
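The regularized statistic can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used for the paper: the function names `sllk` and `periodogram` are hypothetical, the string is assumed to be closed by joining the last phase-ordered point back to the first (which gives N squared differences, consistent with the factor (N-1)/2N), and the data are synthetic.

```python
import math
import random


def sllk(times, values, period):
    """Regularized Lafler-Kinman string-length for one trial period."""
    n = len(values)
    # Assign a phase in [0, 1) to every measurement and sort by phase.
    phases = [(t / period) % 1.0 for t in times]
    ordered = [v for _, v in sorted(zip(phases, values))]
    mean = sum(ordered) / n
    spread = sum((v - mean) ** 2 for v in ordered)
    # Squared differences of consecutive phase-ordered values; the last point
    # is joined back to the first (assumption), giving N terms in the sum.
    string = sum((ordered[(i + 1) % n] - ordered[i]) ** 2 for i in range(n))
    return (n - 1) / (2.0 * n) * string / spread


def periodogram(times, values, periods):
    """Evaluate SLLK at every trial period of a grid."""
    return [sllk(times, values, p) for p in periods]


# Synthetic demonstration: 200 irregularly timed points of pure noise.
rng = random.Random(1)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(200))
noise = [rng.gauss(0.0, 1.0) for _ in times]
periods = [4.0 + 0.01 * k for k in range(201)]
mean_level = sum(periodogram(times, noise, periods)) / len(periods)

# The same times carrying a sinusoid of period 5.0, amplitude 3x the noise.
signal = [3.0 * math.sin(2.0 * math.pi * t / 5.0) + e
          for t, e in zip(times, noise)]
values = periodogram(times, signal, periods)
best_period = periods[values.index(min(values))]
```

For the noise-only series `mean_level` should lie close to 1.0 whatever the number of points, while for the sinusoidal series the minimum of `values` should fall near the injected period.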
If periodicity is present, the depth of the associated minimum in the periodogram for a given number of measurements depends only on the amplitude-to-noise ratio of the measurements. Such an attribute makes the determination of confidence levels on any period detection straightforward. This might be done with reference to artificial data according to the exercises outlined in the paper. Alternatively, the behaviour of the periodogram may be examined by replacing the measurements for each timed record with computer generated values which simply carry noise, or a sinusoidal signal with some given amplitude-to-noise ratio; repetitive exercises of this kind allow confidence levels to be assigned to any outcome.
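The repetitive exercise with noise-only values might be sketched as follows. The names `sllk` and `noise_level`, the cyclic closure of the string, the pooling of point values over all trial periods, the number of trials and the observing times are all assumptions made for illustration; the paper's own procedure may differ in detail.

```python
import random


def sllk(times, values, period):
    """Regularized Lafler-Kinman string-length (cyclic closure assumed)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: (times[i] / period) % 1.0)
    ordered = [values[i] for i in order]
    mean = sum(ordered) / n
    spread = sum((v - mean) ** 2 for v in ordered)
    string = sum((ordered[(i + 1) % n] - ordered[i]) ** 2 for i in range(n))
    return (n - 1) / (2.0 * n) * string / spread


def noise_level(times, periods, confidence, n_trials=200, seed=0):
    """Level below which a noise-only SLLK value falls with probability
    (1 - confidence), estimated by repeated simulation at the real times."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_trials):
        # Replace each timed record by a pure Gaussian noise value.
        noise = [rng.gauss(0.0, 1.0) for _ in times]
        samples.extend(sllk(times, noise, p) for p in periods)
    samples.sort()
    return samples[int((1.0 - confidence) * len(samples))]


# Hypothetical observing times: 30 irregularly spaced records.
rng = random.Random(42)
times = sorted(rng.uniform(0.0, 60.0) for _ in range(30))
periods = [5.0 + 0.1 * k for k in range(21)]
level_99 = noise_level(times, periods, 0.99)
level_90 = noise_level(times, periods, 0.90)
```

Under this reading, a point value in the periodogram of the real data that falls below `level_99` would be ascribed a confidence of 99%. Replacing the noise by a sinusoid of given amplitude-to-noise ratio in the same loop would give the complementary confidence of not missing a real period.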
It has also been demonstrated that the regularized SLLK algorithm is applicable to examining multivariate data for which the parameters may, or may not, be measured simultaneously, so extending the "String-Length" principle to the notion of a "Rope-Length". Combination periodograms based on measurements of several parameters may be constructed by weighting the contributions from the different parameters according to their estimated importance. Such RLLK combinations, with or without weighting, should improve the overall periodogram by reducing the effects of sampling that will be apparent on each measured parameter. It is also interesting to note that the RLLK principle can be applied very effectively when there are phase differences between the underlying behaviour of the different parameters. Again, with reference to exercises involving artificial data, the regularized form of RLLK is readily amenable to the determination of confidence levels associated with a detected period or with a null outcome.
The efficacy of RLLK with respect to simultaneous 2-colour measurements of a cepheid star, to multi-colour measurements of a cepheid with data sets of unequal size, and to an RV Tau star displaying periodic polarization variations, has been clearly demonstrated. In summary, RLLK has obvious applications to the analysis of multi-colour photometry and polarimetry. It may be noted that, in a study of the polarimetric behaviour of O-type stars, a joint period analysis involving spectral line equivalent width data and broad-band polarimetry has been undertaken by Clarke et al. (2002). As an extreme parameter combination, although an example was not provided here, the method could be used to investigate periodicity in data from, say, X-ray, optical and radio measurements obtained contemporaneously but not necessarily simultaneously.
Acknowledgements
The exploration of S-L and R-L methods has provided several student projects at Glasgow University for the development of computing skills. From proving out the algorithms, the discussions and feedback were most useful and particularly I wish to thank Brian Hamilton, Hrobjartur Thorsteinsson and Kris Wojciechowski, the latter two being instrumental in recognizing the step from Eq. (7) to Eq. (10), so widening the concepts and usefulness of RLLK.