A&A 400, 1173-1181 (2003)
DOI: 10.1051/0004-6361:20030025
G. Perrin
LESIA, FRE 2461, Observatoire de Paris, section de Meudon, 5 place Jules Janssen, 92190 Meudon, France
Received 4 September 2002 / Accepted 19 December 2002
Abstract
I present in this paper a method to calibrate data
obtained with optical and infrared interferometers. I show that
correlated noise and errors must be taken into account both to
obtain reliable individual error bars and, when fitting models
to the data, to derive meaningful model parameters whose
accuracies are not overestimated. I also show that under
conditions of high correlated noise, faint structures of the source
can be detected. This point is important to define calibration
strategies for difficult programs such as exoplanet detection.
The limits of validity of the assumptions on the noise statistics
are discussed.
Key words: techniques: interferometric - methods: data reduction
With optical-infrared interferometry becoming more mature, the quality of visibility measurements has become an issue. Single-mode interferometers (see Sect. 2.3) allow one to eliminate non-stationary effects by filtering out the spatial modes of turbulence. The response of interferometers is therefore very stable and the issue of estimating the accuracies of non-biased data is raised. The final visibility estimate is a complicated quantity, as it is a non-linear mix of noisy measurements and of parameter estimates with their own uncertainties. Estimating the stability of the instrument, a crucial point for calibration, and the final error on visibilities is therefore non-trivial and must be considered with caution. Moreover, data analysis mainly consists of fitting models to the final visibilities, and the matter of their potential correlations becomes important, especially if very faint structures are searched for, as is the case in extra-solar planet detection.
In this paper I propose a method to meet these challenges. The method has been developed and tested with the FLUOR interferometer, the first single-mode interferometer. It was first published in Perrin (1996) and used in Perrin et al. (1998). It is updated and improved here by accounting for correlations.
Figure 1: Examples of squared coherence factor histograms obtained with FLUOR in one of the interferometric channels. About 100 interferograms have been recorded for each object. The mean and rms of individual measurements are given for this channel. The correlation factor r measures the noise correlation between the two interferometric channels. The amount of atmospheric piston is decreasing from the left to the right.
In this section the general scheme of data reduction is reviewed to introduce the vocabulary and notations. Two main steps are to be considered. In the first one (Sect. 2.1), the fringe processing, fringe contrasts are derived from raw signals. Because of contrast losses, fringe contrasts are calibrated in a second step (Sect. 2.2) to provide the visibilities directly linked to the spatial intensity distribution of the source.
In the following we will distinguish between the fringe contrast
obtained on a source and the visibility of this source. The fringe
contrast measured from a single exposure or scan is called the
coherence factor and is noted $\mu$,
whereas the visibility is noted V.
Whatever the beamcombining technique, $\mu$
being the modulus of a
complex number, unbiased estimators are only obtained for squared
quantities from which biases due to additive noise can be subtracted.
In the future, phase referencing techniques may allow one to directly
measure complex visibilities (real and imaginary parts) but this is
not the case yet and I will only consider measurements of fringe
contrast moduli. The result of the processing of a series of scans on
a single source is a series of realizations of the $\mu^2$
estimator,
or an average value of the realizations with a $1\sigma$
error bar
if their statistical distribution can be trusted to be Gaussian.
The average of the squared coherence factors
is not directly an estimator of the squared
modulus of the visibility of the source because some physical
phenomena degrade the coherence factor. Among these phenomena,
polarization mismatches between the interferometric beams are the most
common. Without perfect beam cleaning by a fiber, atmospheric
turbulence also degrades the fringe contrast. It is necessary to
estimate the loss of coherence on a calibrator source for which the
visibility is known. A transfer function T is obtained by computing
the ratio of the measured coherence factor $\mu$
to the
expected visibility $V_{\rm exp}$.
With squared quantities this
yields:
$$T^2 = \frac{\mu^2}{V_{\rm exp}^2}$$
Figure 2: Examples of squared co-transfer functions measured with FLUOR. The two curves for each night correspond to the two interferometric channels of the coaxial interferometer. The full circles are the squared co-transfer functions measured on calibrators whereas the open circles are the values interpolated at the time when the science targets were observed.
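As an illustration of the transfer function estimate described before Fig. 2, here is a minimal Python sketch. It is not the FLUOR reduction code: the function name, the numerical values and the first-order error propagation (which assumes that the measured coherence factor and the expected visibility are independent) are assumptions made for this example only.

```python
import math

def squared_transfer_function(mu2, sigma_mu2, v2_exp, sigma_v2_exp):
    """Return (T^2, sigma_T^2) for one calibrator observation.

    mu2, sigma_mu2       : mean squared coherence factor and its 1-sigma error
    v2_exp, sigma_v2_exp : expected squared visibility and its 1-sigma error
    The error is a first-order propagation for a ratio of two independent
    quantities (an assumption of this sketch, not a formula from the paper).
    """
    t2 = mu2 / v2_exp
    relative_variance = (sigma_mu2 / mu2) ** 2 + (sigma_v2_exp / v2_exp) ** 2
    return t2, t2 * math.sqrt(relative_variance)

# Hypothetical calibrator: mu^2 = 0.60 +/- 0.006, expected V^2 = 0.80 +/- 0.004
t2, sigma_t2 = squared_transfer_function(0.60, 0.006, 0.80, 0.004)
print(f"T^2 = {t2:.4f} +/- {sigma_t2:.4f}")
```

The relative errors add in quadrature here, so the uncertainty on the calibrator diameter enters the transfer function on the same footing as the measurement noise.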
The calibration process may fail if the assumption that the transfer function is stable is wrong. This usually happens if non-stationary processes like atmospheric phase turbulence play a role in the fringe formation process. In a perfect single-mode interferometer in which single-mode fibers are used to filter the phase aberrations produced by atmospheric turbulence (except for the piston mode) these non-stationary effects are eliminated. In order to avoid instabilities due to the piston mode, the fringes are scanned at a frequency far above the characteristic frequency of piston. In interferometers where the piston is stabilized by a fringe tracking servo loop, this issue is solved. The remaining main sources of variation of the transfer function are temperature drifts and differential polarization effects due to the change of beam inclination on the first mirrors with changing positions of the sources in the sky. In both cases the transfer function drifts are very slow and a good estimate of the transfer function can be obtained by interpolating two estimates bracketing the source to be calibrated. This has been demonstrated with the FLUOR beamcombiner, as will be shown in Sect. 7.2.
In the following we will therefore consider that the efficiency of the interferometer is continuously assessed by observing calibrators before and after science sources. We will not consider the case where the transfer function is derived by averaging individual transfer functions over a long time span, as this is not required for a single-mode interferometer. Such averaging does not allow one to assess the quality of the calibration in detail. Nevertheless, should the transfer function estimates be statistically compatible with a constant transfer function, it would be legitimate to use this value to calibrate a whole night. The method to evaluate correlations should then be adapted accordingly.
This section focuses on estimating the statistical properties of fringe contrasts. I will not describe the method to compute coherence factors from single exposures and I will refer the reader to appropriate articles in the next paragraphs.
In a multiaxial interferometer, distant parallel beams feed a focusing optic. The beams are recombined in the focal plane where they overlap at the focus locus. The modulation is spatial as the fringe phase varies across the diffraction pattern. A method to derive fringe contrasts has been published by Mourard et al. (1994) in the case of GI2T. The method has been adapted to AMBER which is a single-mode multiaxial interferometer (Chelli et al. 2000).
Thanks to the filtering of the non-stationary modes of turbulence, the
statistics of the squared coherence factor $\mu^2$
can be well approximated by a Gaussian
distribution. This will be demonstrated in the case of the data
obtained with FLUOR in Sect. 7.1. The estimate of the squared
coherence factor is therefore the mean of the distribution of the
$N$ realizations $\mu_i^2$, denoted $\overline{\mu^2}$.
An unbiased estimate of
the variance of individual measurements is:
$$\sigma^2_{\mu^2} = \frac{1}{N-1} \sum_{i=1}^{N} \left(\mu_i^2 - \overline{\mu^2}\right)^2 \qquad (3)$$
![]() |
(4) |
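The statistics of a series of squared coherence factors can be sketched in Python as follows. This is a minimal illustration under the Gaussian assumption stated above; the function name and the sample values are hypothetical, and the standard error of the mean is the textbook result for independent realizations rather than a formula quoted from this paper.

```python
import math
import statistics

def coherence_statistics(mu2_samples):
    """Return (mean, unbiased variance of individual values, error on the mean)."""
    n = len(mu2_samples)
    mean = statistics.fmean(mu2_samples)
    # statistics.variance divides by N - 1, i.e. it is the unbiased estimate
    var_individual = statistics.variance(mu2_samples)
    # Standard error of the mean, valid for independent realizations
    sigma_mean = math.sqrt(var_individual / n)
    return mean, var_individual, sigma_mean

# Hypothetical squared coherence factors from five scans
samples = [0.41, 0.39, 0.40, 0.42, 0.38]
mean, var_individual, sigma_mean = coherence_statistics(samples)
print(f"mean = {mean:.4f}, variance = {var_individual:.2e}, sigma_mean = {sigma_mean:.4f}")
```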
In a coaxial interferometer, beams are superimposed in position and in direction. This can be realized with a beamsplitter or with a fiber coupler. A relative phase between the beams is introduced by setting an optical path difference. This is achieved with a moving mirror in one of the two beams, hence the temporal modulation of the phase. A method to compute fringe contrasts for this type of interferometer is described in Coudé du Foresto et al. (1997). A more recent method based on wavelets analysis has been proposed by Segransan et al. (1999). The method to obtain an estimate of the coherence factor without the photon noise bias is explained in Perrin (2003). A prototype instrument for this kind of interferometer is the FLUOR beamcombiner.
Figure 3: Example of visibility fit. The source is SW Vir and the model is a uniform disk. Errors are
The difference with the multiaxial interferometer of Sect. 3.1 is that
a coaxial interferometer produces two interferometric beams and therefore two sets of
coherence factor estimates. The statistics of each set can be well
approximated by Gaussian statistics, as will be shown in
Sect. 7. The photon noises of the two
interferometric signals are uncorrelated. The read-out noises are
generally considered uncorrelated but some correlation may occur as
different pixels share the same read-out electronics. In addition the
two beams suffer from the same turbulence effects (residual piston and
photometric beam fluctuations) which generate some noise in the
measurements. Part of the noise is therefore common to the two
signals and the coherence factor estimates are correlated. The
correlation factor r is directly estimated from the
distributions:
![]() |
(5) |
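A correlation factor between the two channels can be estimated with the usual sample (Pearson) correlation coefficient, sketched below in Python. I assume here that this standard estimator is what is meant by estimating r directly from the distributions; the function name and the data are hypothetical.

```python
import math

def correlation_factor(channel_1, channel_2):
    """Sample correlation coefficient between paired squared coherence factors."""
    if len(channel_1) != len(channel_2) or len(channel_1) < 2:
        raise ValueError("need two series of equal length with at least 2 scans")
    n = len(channel_1)
    mean_1 = sum(channel_1) / n
    mean_2 = sum(channel_2) / n
    # Unbiased covariance and variances (N - 1 normalization)
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(channel_1, channel_2)) / (n - 1)
    var_1 = sum((a - mean_1) ** 2 for a in channel_1) / (n - 1)
    var_2 = sum((b - mean_2) ** 2 for b in channel_2) / (n - 1)
    return cov / math.sqrt(var_1 * var_2)

# Hypothetical scan-by-scan values recorded simultaneously in the two outputs
chan_1 = [0.40, 0.42, 0.38, 0.41, 0.39]
chan_2 = [0.30, 0.33, 0.29, 0.30, 0.28]
r = correlation_factor(chan_1, chan_2)
print(f"r = {r:.3f}")
```

The pairing matters: each value of the first channel must come from the same scan as the corresponding value of the second channel.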
It is assumed that the transfer function is a slowly varying function
which is rapidly sampled. This property will be illustrated with real
data in Sect. 7. It is then legitimate to linearly
interpolate the squared transfer function at the time when the science
source was observed. Because the variances of products of random
variables are more easily calculated than those of ratios, the
reciprocal of the squared transfer function, the squared co-transfer
function, is interpolated instead of the squared transfer function.
The use of one or the other is equivalent. In order to be general,
two interferometric outputs are always considered. The particular
case of the multiaxial interferometer will be considered in
discussions. The expression of the interpolated co-transfer functions
in the two outputs of the instrument is:
![]() |
(7) |
![]() |
(8) |
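The linear interpolation of the squared co-transfer function between two bracketing calibrator observations can be sketched as follows. This is a simplified illustration for a single channel: the function and variable names are hypothetical, and the error bar assumes that the two bracketing estimates are independent, which is precisely the kind of assumption the method of this paper is meant to check when the same calibrator is used twice.

```python
import math

def interpolate_cotransfer(t, t_before, c_before, sigma_before,
                           t_after, c_after, sigma_after):
    """Linearly interpolate the squared co-transfer function at time t.

    Returns (value, sigma). The two bracketing estimates are treated as
    independent (an assumption of this sketch).
    """
    if t_before == t_after or not (t_before <= t <= t_after):
        raise ValueError("t must lie between two distinct calibrator times")
    w_after = (t - t_before) / (t_after - t_before)
    w_before = 1.0 - w_after
    value = w_before * c_before + w_after * c_after
    variance = (w_before * sigma_before) ** 2 + (w_after * sigma_after) ** 2
    return value, math.sqrt(variance)

# Hypothetical calibrator measurements at 0.5 h and 1.5 h, science target at 1.0 h
c_interp, sigma_interp = interpolate_cotransfer(1.0, 0.5, 1.30, 0.01, 1.5, 1.34, 0.01)
print(f"co-transfer = {c_interp:.4f} +/- {sigma_interp:.4f}")
```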
Figure 4: Examples of fits of visibility data with a uniform disk model. All correlations are taken into account. Errors are
The single-channel squared visibility in the case of a multiaxial
interferometer or the two-channel squared visibilities of a coaxial
interferometer are simply the product of the science target squared
coherence
factors and squared co-transfer functions of Eq. (6)
yielding:
The above equations define the uncertainties on the channel estimates
of the visibilities. In the case of a multiaxial interferometer, this
is the final estimate of the visibility. In the case of coaxial
interferometers, the two estimates of the visibility need to be
averaged at this stage. For this, it is necessary to assess the
correlation factor between the two estimates. By definition, the
correlation factor is equal to:
![]() |
(14) |
![]() |
(15) |
It is also interesting to analyze the propagation of noises in the
visibility estimates. For example, if the noise on the measurements
is negligible, it is possible to evaluate the amount of variance due
to the uncertainty on the calibrator visibilities (or diameters).
Dropping channel indices I obtain:
![]() |
(16) |
The final squared visibility $V^2$ is estimated from the two
squared visibilities $V_1^2$ and $V_2^2$ obtained from each output
of the interferometer and their respective variances (or
equivalently uncertainties $\sigma_{V_1^2}$ and $\sigma_{V_2^2}$).
I define the final estimate $V^2$ as the least squares fit estimator
of the squared visibility, as this is an optimal estimator for Gaussian
random variables. In this fit, the model is linear and has only one
parameter: $V^2$. Let us call C the covariance matrix of
$V_1^2$ and $V_2^2$:
![]() |
(17) |
![]() |
(18) |
![]() |
(19) |
![]() |
(20) |
![]() |
(21) |
![]() |
(23) |
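The combination of the two channel estimates can be illustrated with the standard generalized least squares solution for a one-parameter constant model and a 2x2 covariance matrix. The Python sketch below uses the textbook closed-form expressions, which I assume correspond to the least squares estimator defined above; the function name, the symbol rho for the correlation factor and the numerical values are hypothetical.

```python
import math

def combine_two_channels(v1, sigma1, v2, sigma2, rho):
    """Least squares estimate of V^2 from two correlated channel estimates.

    Returns (estimate, sigma, chi2), where chi2 has one degree of freedom.
    """
    cov12 = rho * sigma1 * sigma2
    # Variance of the difference v1 - v2; zero means the fit is degenerate
    var_diff = sigma1 ** 2 + sigma2 ** 2 - 2.0 * cov12
    if var_diff <= 0.0:
        raise ValueError("degenerate covariance matrix")
    # Weights proportional to the row sums of the inverse covariance matrix
    w1 = (sigma2 ** 2 - cov12) / var_diff
    w2 = (sigma1 ** 2 - cov12) / var_diff
    estimate = w1 * v1 + w2 * v2
    variance = (sigma1 ** 2 * sigma2 ** 2 - cov12 ** 2) / var_diff
    chi2 = (v1 - v2) ** 2 / var_diff
    return estimate, math.sqrt(variance), chi2

# Hypothetical channel estimates with equal errors and a correlation of 0.5
v2_final, sigma_final, chi2 = combine_two_channels(0.50, 0.01, 0.52, 0.01, 0.5)
print(f"V^2 = {v2_final:.4f} +/- {sigma_final:.4f}, chi2 = {chi2:.2f}")
```

With a positive correlation the final error bar is larger than the one obtained by wrongly treating the channels as independent, while the chi2 of the difference increases because the common noise cancels in v1 - v2.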
The quality of the fit is expressed by the $\chi^2$:
Coherence factors recorded simultaneously on different baselines with
telescopes in common may also be correlated. This correlation should
be taken into account and saved with the reduced data in the form of a
correlation matrix. The correlations may be as high as the
correlations between the two channels of a coaxial interferometer as
all calibrators are common to all baselines. The method used in
Sect. 5.2 should be applied. A correlation matrix for
the
should be computed first. The final
correlation factors for the final visibility estimates are then
computed with a Monte-Carlo method.
Visibilities obtained on different baselines or on different days are usually considered independent. In the last paragraph, we focused on the possible correlations of visibilities recorded simultaneously on baselines with telescopes in common. In this section, we will consider the correlation due to common uncertainties in the calibration process for independent baselines or for visibilities measured at different times. The calibration of the transfer function may have required the same calibrators, hence the same diameter estimates. The errors on the visibilities are therefore not independent. The purpose of this paragraph is to establish a method to compute this correlation and, more importantly, to keep track of it so that it can be computed a posteriori, long after the data were taken at the telescopes.
Let $S_1$ and $S_2$ be two spatial frequencies at which squared
visibilities $V^2(S_1)$ and $V^2(S_2)$ have been measured.
The visibility estimates of Eqs. (10) and (22) can take the form:
![]() |
(25) |
The correlation between two expected squared visibilities at two
different baselines is not easy to evaluate analytically. Besides, it
may depend upon the model of the calibrator. A computation can be
performed which shows that the correlation is indeed equal to 1 with
an excellent accuracy as long as no baseline is equal to 0. This can
also be shown by expanding the visibility function. Thus, the
expected visibilities derived from a uniform disk model of the same
calibrator at two different baselines are fully correlated to
first order. This holds as long as the second derivative of the model
is small (which is true for the uniform disk model except
close to its zeros) and provided that none of the
baselines is very close to zero. In practice, the error on the
diameter being usually small (less than 5%), the first order
approximation is valid and the two expected squared visibilities can
therefore be considered fully correlated. This is true down to very
short baselines as for example for a diameter of
mas the
correlation starts to decrease for a baseline below 5 cm.
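The kind of computation mentioned above can be reproduced with a small Monte-Carlo experiment, sketched here in Python. The diameter (3 mas with a 5% error), the two baselines (20 m and 35 m) and the wavelength (2.2 microns) are hypothetical values chosen for the example, and the Bessel function is evaluated with a truncated power series to keep the sketch free of external libraries.

```python
import math
import random

MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)

def bessel_j1(x, terms=30):
    """Bessel function J1 from its power series (adequate for moderate |x|)."""
    half = x / 2.0
    term = half          # k = 0 term: (x/2) / (0! 1!)
    total = 0.0
    for k in range(terms):
        total += term
        term *= -half * half / ((k + 1) * (k + 2))
    return total

def uniform_disk_v2(diameter_mas, baseline_m, wavelength_m):
    """Expected squared visibility of a uniform disk."""
    x = math.pi * diameter_mas * MAS_TO_RAD * baseline_m / wavelength_m
    if x == 0.0:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2

def pearson(a, b):
    """Sample correlation coefficient of two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Draw calibrator diameters around 3 mas with a 5% Gaussian error
rng = random.Random(0)
diameters = [rng.gauss(3.0, 0.15) for _ in range(2000)]
v2_short = [uniform_disk_v2(d, 20.0, 2.2e-6) for d in diameters]
v2_long = [uniform_disk_v2(d, 35.0, 2.2e-6) for d in diameters]
r_expected = pearson(v2_short, v2_long)
print(f"correlation between expected V^2 at 20 m and 35 m: {r_expected:.6f}")
```

Both baselines sample the first lobe of the model in this example, where the expected squared visibility is a smooth monotonic function of the diameter, so the printed correlation is very close to 1.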
For practical use, Eq. (26) can be simplified as the
correlations between expected visibilities are either 0 or 1 when the
calibrators are respectively different or alike. The only requirement
to compute this correlation is therefore that the variances of the
expected visibilities and the coefficients
and
be
saved with the reduced data. These correlations will have to be
computed to model fit the data. The generalization of the
Levenberg-Marquardt method with correlated data is given in the
Appendix at the end of the paper.
Examples of data reduction results and calibrations are presented. The quantities introduced in the previous sections are discussed in practical situations and general comments on observing strategies are expressed.
I have plotted in Fig. 1 three examples of
distributions. In the case of V636 Her, the fringe speed puts the
fringe frequency far above the turbulence piston spectrum. The piston
is almost frozen during each scan and the amount of correlated noise
is small. In the case of 71 UMa, the fringe speed is lower and the
measurements are more sensitive to piston hence the higher correlated
noise.
Sge is an intermediate case. In all three examples,
the distributions of the squared coherence factors $\mu^2$ are compatible with Gaussian
distributions hence validating the basic assumption on the statistics
of the $\mu^2$.
An important fact is that the amount of correlated
noise is not negligible and must be taken into account. However, a
test on distributions is performed to detect deviations from Gaussian
statistics. Deviations are not common and are always due to
instrumental problems. In such cases, depending on the required level
of data quality, data may be eliminated.
Figure 2 presents two examples of squared co-transfer functions. Full circles are measurements on calibrators whereas open circles are interpolations for science targets. The co-transfer function is visibly not always stable and may experience variations. In some cases, like on May 15, 2000 at 8:07, a calibration error may have happened as the co-transfer functions jump by a few percent. Yet, in most cases, the transfer function variations are slow on time scales of a few hours and can be well approximated to first order. Data collected on May 22, 2000 show that this is still the case when the calibrator diameters are known with very good precision.
Before presenting examples let us summarize the different levels of correlations we have encountered so far:
It will be interesting to assess the level of correlations of
visibilities measured with multiple beam interferometers. It can be
anticipated that it will not be negligible and will be of the same
level as
.
The importance of correlation between visibilities recorded separately
is illustrated in Fig. 3. The data have been reduced
in two different ways. Data plotted with open circles and fitted by a
dashed-line uniform disk model are reduced without taking
correlations into account. Data plotted with full circles and fitted
by the continuous line were reduced with the method of this paper. In
the first case, the fit is of very good quality with a $\chi^2$
smaller than 1. Yet, all visibilities have been calibrated with the
same source, hence a strong correlation between visibility values as
the
correlation matrix shows:
![]() |
(27) |
Other examples of model fitting are presented in
Fig. 4. The BK Vir data were calibrated with the
same calibrator (the correlation matrix is similar to the matrix
above) as SW Vir but the visibilities are very consistent with each
other. Data for the three other sources are either totally
independent or slightly correlated. Only the first lobe data were
used for the fit of G Her. These four examples show very good fits
and consistency of data. In particular, this sets the best absolute
accuracy of the calibration of visibilities with FLUOR to 0.004
(equivalent to an accuracy of 0.004 on $V^2$ with $V=0.5$, as
$\sigma_{V^2} = 2V\sigma_V$ to first order).
It is important to adapt the strategy of calibration to the type of
astrophysical studies addressed with optical interferometers. For
most studies where visibility accuracies of a few percent are
acceptable, the repeated use of a single or of a few calibrators is
possible. For difficult programs like exoplanet detection, a very
high level of accuracy is required and the strategy needs to be well
prepared. Two cases may arise depending on whether the required
calibration of visibilities is absolute or relative. If absolute
accuracies better than 0.001 have to be obtained on visibilities then
it is very likely that no calibrator can be used twice, unless the
error on the expected visibility of this calibrator is less than the
level of accuracy required. This would suppose that the visibility
model of the calibrator be measured first. Another possibility is
relative detection. As illustrated by the example of SW Vir, if the
same calibrator is systematically used, the fit is sensitive to very
low levels as the correlated noise does not contribute to the value of
the $\chi^2$.
In this example, a departure from the uniform disk
model or a calibration error may have been detected to a level much
lower than the error bars. For very faint detail detection, this can
work if the visibility curve of the calibrator is smooth and without
wiggles of similar amplitude as the ones searched for on the science
target.
In any case, the observing strategy should be prepared in advance and should take the problem of data correlations into account.
I have proposed in this paper a method to calibrate visibility data obtained with single-mode interferometers. The single-mode character is required for the assumption that the statistics of coherence factor data are Gaussian and stationary to be valid. It is possible to derive reliable error bars if all correlations are considered in the derivation of all estimators. Correlations also need to be taken into account when fitting models to the data. The validity of the method has been demonstrated on real interferometric data recorded with FLUOR. An important conclusion of this work is that the strategy of calibration has to be adapted for specific programs requiring high standards of calibration.
Algorithms for model fitting are well known. One of the most commonly used is the Levenberg-Marquardt algorithm. An example of practical implementation is given in Press et al. (1988) in the case where data are not correlated. Here, I give a generalization of this algorithm to the case of correlated data.
Let
be a series of squared visibility
measurements of an astronomical source. Let C be the matrix of
variances-covariances of these measurements:
![]() |
(A.1) |
The best estimates of the parameters in the sense of the highest
likelihood are obtained by maximizing the likelihood function of the
vector which leads to minimizing the following
functional:
![]() |
(A.2) |
![]() |
(A.3) |
![]() |
(A.4) |
Close to minimum, S can be expanded to the second order in
:
![]() |
(A.5) |
![]() |
(A.6) | ||
![]() |
|||
![]() |
(A.7) |
![]() |
(A.8) |
![]() |
(A.9) | ||
![]() |
![]() |
(A.10) |
![]() |
(A.11) | ||
![]() |
![]() |
(A.12) |
![]() |
(A.13) |
| (A.14) |
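The role of the covariance matrix in the quantity minimized in this appendix can be illustrated with a short Python sketch. It evaluates the usual correlated-data chi-square, the residual vector multiplied on both sides by the inverse of C, which I assume is the functional meant above for Gaussian data. It is not a Levenberg-Marquardt implementation; the function names and the numerical values are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def chi2_correlated(data, model, cov):
    """Chi-square of the residuals for a full covariance matrix cov."""
    residuals = [d - m for d, m in zip(data, model)]
    weighted = solve_linear_system(cov, residuals)  # C^-1 applied to residuals
    return sum(r * w for r, w in zip(residuals, weighted))

# Two squared visibilities with sigma = 0.02 and a correlation of 0.5
model = [0.50, 0.50]
cov_diag = [[4e-4, 0.0], [0.0, 4e-4]]
cov_corr = [[4e-4, 2e-4], [2e-4, 4e-4]]
common_shift = [0.52, 0.52]    # both points offset in the same direction
opposite_shift = [0.52, 0.48]  # points offset in opposite directions
chi2_common = chi2_correlated(common_shift, model, cov_corr)
chi2_opposite = chi2_correlated(opposite_shift, model, cov_corr)
print(f"common shift: {chi2_common:.3f}, opposite shift: {chi2_opposite:.3f}")
```

With positively correlated errors, a shift common to both points is penalized less than it would be with a diagonal matrix, whereas a point-to-point discrepancy is penalized more. This is the behaviour invoked in the discussion of the SW Vir fit.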