Issue |
A&A
Volume 541, May 2012
|
|
---|---|---|
Article Number | A81 | |
Number of page(s) | 17 | |
Section | Astronomical instrumentation | |
DOI | https://doi.org/10.1051/0004-6361/201218932 | |
Published online | 03 May 2012 |
Kalman-filter control schemes for fringe tracking
Development and application to VLTI/GRAVITY
1
LESIA, Observatoire de Paris, CNRS, UPMC, Université Paris
Diderot, Paris Sciences et Lettres,
5 place Jules Janssen, 92195
Meudon,
France
2
Instituut voor Sterrenkunde, KU Leuven, Celestijnenlaan 200D,
3001
Leuven,
Belgium
e-mail: jonathan.menu@ster.kuleuven.be
3 Groupement d’Intérêt Scientifique PHASE (Partenariat Haute
résolution Angulaire Sol Espace) between ONERA, Observatoire de Paris, CNRS and Université
Paris Diderot, France
Received:
31
January
2012
Accepted:
28
February
2012
Context. The implementation of fringe tracking for optical interferometers is inevitable when optimal exploitation of the instrumental capacities is desired. Fringe tracking allows continuous fringe observation, considerably increasing the sensitivity of the interferometric system. In addition to the correction of atmospheric path-length differences, a decent control algorithm should correct for disturbances introduced by instrumental vibrations, and deal with other errors propagating in the optical trains.
Aims. In an effort to improve upon existing fringe-tracking control, especially with respect to vibrations, we attempt to construct control schemes based on Kalman filters. Kalman filtering is an optimal data processing algorithm for tracking and correcting a system on which observations are performed. As a direct application, control schemes are designed for GRAVITY, a future four-telescope near-infrared beam combiner for the Very Large Telescope Interferometer (VLTI).
Methods. We base our study on recent work in adaptive-optics control. The technique is to describe perturbations of fringe phases in terms of an a priori model. The model allows us to optimize the tracking of fringes, in that it is adapted to the prevailing perturbations. Since the model is of a parametric nature, a parameter identification needs to be included. Different possibilities exist to generalize to the four-telescope fringe tracking that is useful for GRAVITY.
Results. On the basis of a two-telescope Kalman-filtering control algorithm, a set of two properly working control algorithms for four-telescope fringe tracking is constructed. The control schemes are designed to take into account flux problems and low-signal baselines. First simulations of the fringe-tracking process indicate that the defined schemes meet the requirements for GRAVITY and allow us to distinguish in performance. In a future paper, we will compare the performances of classical fringe tracking to our Kalman-filter control.
Conclusions. Kalman-filter based control schemes will likely become the next standard for fringe tracking, providing the possibility to maximize the tracking performance. The results of the present study are currently being incorporated into the final design of GRAVITY.
Key words: techniques: interferometric / techniques: high angular resolution
© ESO, 2012
1. Introduction
Ever since the early days of optical interferometry, the motion of fringes due to atmospheric turbulence has been a major limitation to this observation technique. One possible way to avoid the blurring of fringes on the detector is to use short fringe exposures, since short integration times temporally “freeze” atmospheric motion. In combination with the low optical throughput of interferometric systems, however, the sensitivity associated with short integrations is low, even on the largest-telescope systems.
Tests with prototype interferometers (Shao & Staelin 1980; for a further overview, see Monnier 2003) demonstrated the use of the control technique fringe tracking to increase the sensitivity of interferometric observations. Correcting for the atmospherically introduced optical path difference (OPD) between different interferometer arms indeed allows the continuous observation of the fringes. In a fringe tracker, a phase sensor estimates the OPD corresponding to the current fringe exposure. On the basis of this value, a command is calculated by the OPD controller, which is then sent to an OPD-correcting actuator system. The result is that fringes in the scientific beam combiner are stabilized to atmospheric motion, and can be integrated much longer than the typical atmospheric coherence time. In addition, the synchronous tracking of the fringes on a bright source allows the interferometric observation of much fainter sources.
Apart from turbulence, an important limitation of current real-time wavefront correction or tracking is instrumental vibrations. Initial on-sky tests of NAOS, the first adaptive-optics (AO) system on the Very Large Telescope (VLT), attributed a loss in Strehl ratio of up to 15% to mechanical vibrations (Rousset et al. 2003). In the case of the VLT Interferometer (VLTI), longitudinal vibrations on the level of the baselines cause fringes to move onto the detector. Power spectra of sequences of OPD values estimated during a run on the VLTI/PRIMA instrument (Delplancke 2008) indicate the presence of a discrete number of vibration components during fringe observations (Sahlmann et al. 2009). In an effort to reduce the vibration level of the VLTI system, a campaign to identify the contribution of different VLT subsystems has been initiated (Haguenauer et al. 2008; Poupar et al. 2010). The efforts to isolate noisy instruments has led to significant improvements.
Active correction techniques can also be applied to reduce vibration contribution. On the one hand, a vibration-sensing accelerometer system operates on the first three mirrors of the four 8.2-m unit telescopes (UTs) and effectively corrects vibrations in the range 10–30 Hz (Haguenauer et al. 2008). On the other hand, the control algorithm of the OPD controller can be extended with additional vibration blocks that correct for a defined set of vibration components. The Vibration TracKing algorithm (VTK; Di Lieto et al. 2008) is such an example that operates as a narrow-band filter. For the Keck Interferometer, a similar narrow-band control system has been designed (Colavita et al. 2010).
In recent years, there has been an effort in AO to design new control schemes. Of special importance are the techniques based on Kalman filtering (Kalman 1960), which is a technique designed for general data processing. It is used to track and correct observed systems in a statistically optimal way, in that it minimizes the norm of the residual signal (for linear systems with Gaussian noise statistics). Le Roux et al. (2004) designed an AO control scheme based on Kalman filtering. One of the important characteristics of this scheme is that it can be extended (Petit 2006) to coherently treat turbulence and vibrations, rather than add extra vibration control blocks to an unsuitable classical controller. As usual for Kalman filters, it is based on an a priori model for the controlled system, which here is referred to as turbulence and vibration perturbations.
The purpose of this work is the design of Kalman-filtering control schemes for fringe tracking, starting from the work that has been done for AO control. In particular, we consider the case of four-telescope interferometry, and develop control schemes that will be applicable to the fringe-tracking subsystem of GRAVITY (Eisenhauer et al. 2011, 2005), which is a four-telescope beam combiner to be installed on the VLTI. The main purpose of this second-generation instrument is high-precision narrow-angle astrometry and phase-referenced interferometric imaging in the near-infrared K band. The main scientific target of GRAVITY is Sgr A∗, which is currently the most well-studied and likely supermassive black hole, located at the Galactic center. On-sky operation is expected in 2014.
Conceptually, the idea of using Kalman filtering to correct for atmospheric fringe motion dates back to Reasenberg (1990), but has never been implemented under that form. A simple Kalman-filter based strategy for single-baseline fringe tracking was effectively used at the Palomar Testbed Interferometer (PTI, Colavita et al. 1999), which to our knowledge, is the only example of on-sky fringe tracking using Kalman-filter techniques. Lozi et al. (2010) demonstrated the operation of Kalman filtering for vibration correction on a laboratory prototype for single-baseline space-based nulling interferometry.
The use of Kalman filtering for (ground-based) fringe tracking will have several advantages compared to classical control strategies (e.g. integrator control):
-
Kalman filtering allows us to control turbulence and longitudinalvibrations in a coherent and equivalent way: the contribution ofdifferent perturbation components is decomposed;
-
all information on disturbances is used via an a priori model. This makes Kalman filtering a statistically optimal control method when considering linear systems with Gaussian white noise statistics;
-
the use of Kalman filters may solve issues with existing active vibration control at VLTI (e.g. artifacts in frequency spectra of residual VTK-corrected data; see Haguenauer et al. 2008);
-
the predictive nature of the filter allows us to deal with the loss of observables due to measurement noise.
In the case of the latter point, fringe-tracking control schemes need to be robust against flux losses owing to imperfect beam combination. Failures in tip-tilt correction, for instance, can lead to the short-time absence of fringe signal. The prediction capability of the Kalman filter controller is leveraged for this purpose.
The outline of this work is the following. In Sect. 2, we present the AO Kalman-filter control scheme, translated into a form applied to two-telescope fringe tracking. The Kalman filter is based on a parametric model and we present a technique for parameter identification in Sect. 3. Since an aim of this work is to develop fringe-tracker control schemes applicable to GRAVITY, we extend the control in Sect. 2 to four-telescope systems (Sect. 4). After considering some specific improvements in Sect. 5, we introduce the complete control schemes in Sect. 6. Before concluding this work, the results of the first simulations of four-telescope Kalman-filter fringe tracking for GRAVITY are presented (Sect. 7).
2. Basic case: two-telescope Kalman-filter control
The input information provided for fringe-tracking control are the OPD values estimated by the phase sensor. The aim of the OPD controller is to give an appropriate output command to the piston actuators (piezo actuators and delay lines). This section introduces the control law that is appropriate for a two-telescope interferometer. The approach used is based on a formalism developed for AO modeling, hence we can directly apply it here to fringe tracking. For a more complete introduction to this model, we refer to Le Roux et al. (2004) and Petit (2006), which are the papers on which the current section is based. The notation we use is (mainly) that of Meimon et al. (2010).
2.1. Modeling the fringe-tracking control
We have already mentioned before that Kalman filters need a model for the controlled system. We successively introduce the model for the fringe tracking and the evolution of the disturbances (turbulence and vibrations).
2.1.1. Description of the fringe-tracking process
Turbulence and longitudinal vibrations add up to introduce OPD perturbations. The OPD
in a two-telescope interferometer can be related to a phase value ϕ as
(1)In reality, the
quantity y observed by a fringe tracker is not exactly
ϕ, but contains two additional components. The first of these is an
additive measurement noise w. Secondly, a fringe tracker in operation
will apply correction phases, or commands, u using the
piston actuators. We can therefore express the measured value of the fringe tracker as
(2)In what follows, we
refer to y as the residual OPD. For simplicity, we ignore conversion
factors, e.g. 2π/λ, and assume that all quantities
are expressed in the same units (say, microns)1.
To get into the problem of discretization, we denote by T the elementary integration time of the fringe tracker. The discretized variables are then defined in the following way:
-
is defined as the average phase of the disturbance component i during the time interval [(n − 1)T,nT] :
(3)The sum of all these phases is referred to as \begin{lxirformule}$\varphi_n$\end{lxirformule}.
-
The command is a fixed value during one integration (assuming that the response time of the actuators is much shorter than T). We therefore define un as the command applied at time step n, i.e. applied during the time interval [nT,(n + 1)T] .
-
The final parameter, the noisy component wn at time step n, could in principle be defined in a similar way as
. In this model, however, w will be considered as simple zero-mean white Gaussian noise that adds to y. The standard deviation is fixed to σw, which makes the values wn independent of other variables.
The assumption that the data processing after integration amounts to one time step
T leads to the following discretized version of Eq. (2): (4)The quantity
yn is thus the measurement that is
available at time step n. We include a schematic representation of the
fringe measurement model in Fig. 1.
![]() |
Fig. 1 Time diagram of the fringe tracking process. |
A final note to this model for the fringe-tracking process concerns the notion of a fixed measurement error σw. Further on in this work (Sect. 5), we introduce an adaptive weighting according to the (variable) signal-to-noise ratio (S/N) conditions. The current assumption of a fixed σw is only for clarity reasons and not for the actual implementation.
2.1.2. Description of the OPD evolution
The formalism of Kalman filtering that is used to estimate the commands, requires an a priori model not only for the measurement process, but also for the evolution of the observed system, that is, of the turbulence and the longitudinal vibrations. In essence, this requires the description of new states of the system in terms of previous realizations.
The OPD spectra obtained at the VLTI typically contain a number of peaks (e.g. Sahlmann et al. 2009), which are associated with
vibration modes excited by a variety of processes. In the following, we can assume each
mode to be independent. The simplest way is then to describe the dynamics of such a
vibration mode i is to consider it as a discrete damped oscillator,
excited by a Gaussian white noise v. It can be shown (Petit 2006) that the one-dimensional equation of
motion for a vibration mode i of natural frequency
and damping
coefficient ki can be discretized into the
recursive equation
(5)where
Equation (5) has a form that is better known as an
autoregressive model of order two, AR(2). We note that the conversion from continuous to
discrete involves a Taylor approximation.
The modeling of longitudinal vibrations as randomly-excited damped oscillators leads in a natural way to an AR(2) description. One may ask whether the contribution of turbulence can also be described as an AR(p) process. Le Roux et al. (2004) use an AR(1) turbulence model, based on temporal evolution characteristics of the phenomenon and basic parameters such as wind speed and the telescope diameter. A slightly more advanced first-order temporal model is discussed in Looze (2007).
When taking a damping coefficient k > 1, the previous AR(2)
vibration model can also be used to describe non-resonating signals, as in the case of
turbulence2. This property led Meimon et al. (2010) to use an AR(2) model for the
turbulence description. We follow their argumentation and write (8)
2.2. State-space description of the system control
The state-space representation is a general framework to describe problems involving an evolving system on which measurements are performed (for a general introduction, see Durban 2004). Under the assumption of linearity and time invariance (LTI state-space models), the model has the form of two recursive (constant) matrix equations: one describes the evolving system and the other the observational process on that system.
At least qualitatively, we have good indications that the considered problem can be
described as an LTI problem. On the one hand, the states of turbulence and vibrations can
be considered as realizations of a time-invariant system (for reasonable timescales). On
the other hand, the OPD measurements are clearly a form of observation of that system.
Using standard state-space notation, we can cast Eqs. (4), (5), and (8) into the form (Meimon et al. 2010) where
(for the sake of the example, we assume two vibration components, vib 1 and vib 2)
and
(14)The vector
x is called the state vector of the system. In addition,
we denote by Σv and
the
covariance matrices of the system noise (v) and measurement
noise (w), respectively. We note that for the current example,
Σv is of the form
(15)As usual in state-space
representation, the state variable
xn is a hidden quantity: only
yn is observed. It is important to
highlight the block-diagonal form of the matrix A: every perturbation component
(turbulence and vibrations) contributes a single AR(2) block.
2.3. Control: the (asymptotic) Kalman filter
Kalman filtering (Kalman 1960) is a technique designed for general data processing. It is used to track and correct observed systems in a statistically optimal way, in that it minimizes the residual signal (Chui & Chen 2009). Kalman filters start from a linear state-space description of the system and provide estimations of the system’s (hidden) state vector derived from the observations.
We denote by
the estimation of the state vector at time step n, based on all
observations up to time step n′ (i.e.,
y0,...,yn′).
Given the state-space model in Eqs. (9) and
(10), the Kalman filter equations3 take the form
The
former of these equations updates the information about the disturbance state
xn, taking into account the
measurement yn that has become available at
time step n. The latter then predicts the state at the subsequent moment
in time.
The matrix G∞ in Eq. (16) is
the (asymptotic) Kalman gain matrix, calculated as (18)where
Σ∞ is the solution of the matrix (Riccati) equation
(19)The gain matrix
determines the weight by which the new measurement
yn is taken into account for updating our
estimate of the state vector xn.
The mathematical form of G∞ can be derived by requiring that the expectation
value
is minimal (e.g. Chui & Chen 2009). We note that the gain matrix
contains the noise characteristics
(Σv,Σw), making it optimally
adapted to the considered system.
The final piece of information that is needed to design a two-telescope control scheme is
an equation to determine commands. An optimal command will minimize the observed residual
OPD yn, which was given by Eq. (10). The command u needs to
compensate the x-part (w is the noisy part,
which cannot be reduced by u). It is not difficult to see that the most
appropriate choice of u is (20)where (still for
the two-vibration example)
(21)This result forms the
final equation needed for control.
3. Parameter identification
The state-space description that we need to model the fringe-tracking process contains a
possibly large number of parameters. Reconsidering the state-space model in Eqs. (9) and (10), we see that the parameters to identify are
(22)Once the parameters are
obtained, we can “fill” the system matrices, and apply the Kalman filter.
We assume that OPD measurements from the phase sensor are the only available piece of information about the disturbances. In other words, it is assumed that no other a priori information about the disturbance components is available (e.g. from other observation techniques).
3.1. The spectral method
The method we use to calculate model parameters is based on the one proposed by Meimon et al. (2010, hereafter the MPFK method). The MPFK method is a power-spectrum-based routine designed for the parameter identification of the Kalman-filter AO control scheme of Le Roux et al. (2004) and Petit (2006), the scheme that lies at the basis of Sect. 2. We refer to the original MPFK paper for details of the method, and give here a brief summary and some information on our modifications.
The operation of the MPFK method is based on modeling the (estimated) power spectrum of
an observed perturbation-phase sequence. In a time interval before identification, a
sequence of closed-loop data is obtained and corrected for the applied commands
(pseudo-open loop data, POL). In the notation of the state-space model, a sequence of data
(23)is collected. The
periodogram of this sequence contains the imprint of the different perturbation components
that contribute to the phase signal (Fig. 2). In the
first fitting step, the measurement noise is estimated from the flat tail of the
periodogram. By stepwise adding AR(2)-model spectra to the estimated noise spectrum, first
for the turbulence and then for the vibration components, the global power spectrum is
assembled until a stopping criterion is satisfied. The iterative fitting itself is based
on the maximization of the likelihood function associated with each model periodogram. We
note that the MPFK method is designed to operate in non-real time, i.e. outside the
real-time tracking loop.
![]() |
Fig. 2 A general perturbation power spectrum consists of a turbulence spectrum, a flat (measurement) noise spectrum and vibration peaks. |
In the standard MPFK method, a vibration-peak fitting step makes use of a large parameter
grid for the three parameters associated with a single vibration component
(). The method
uses no a priori information about the vibration components, but simply maximizes in each
iteration the likelihood criterion corresponding to the given grid. In other words, the
fitting of vibration peaks is done in a “blind” way on a large grid. To limit the
computation time, we decided to add a simple subroutine that selects peaks from the power
spectrum. In every iteration, we flag the points that differ significantly from the
estimated power spectrum resulting from the previous iteration. Picking each time the
highest peak allows us then to use a much smaller grid around a first guess for the peak
parameters, which is made by estimating the maximum, width, and central frequency of the
peak. In addition to the gain in computation time, peak picking limits the risk that
multiple peaks are being fit with a single large peak, degrading the quality of the fit.
We note that our peak-picking method selects peaks based on their height, rather than
their energy.
3.2. Practical application
In the MPFK paper, it is shown that N = 2000 is a proper choice for the length of the POL-data acquisition. Since POL data is constructed from closed-loop data, the acquisition of data sequences does not require the loop to be (re)opened. This property is advantageous for the operation of the fringe tracker and opens perspectives to perform updates of the model parameters in closed loop. There is however a fundamental difference between the case of AO, for which the MPFK method was designed originally, and interferometry. In the former case, the initialization of the Kalman filter can be based on a sequence of open-loop data (no POL data is available at initialization). In the case of fringe tracking, however, it is impossible to obtain a sequence of open-loop data without taking the risk that fringes are lost. Unlike the case of AO, where wavefront observables remain well-defined at the level of the wave-front sensor (e.g. microlens images for a Shack-Hartmann wavefront sensor), uncorrected fringes have a typical root mean square (rms) motion that largely surpasses typical detection ranges.
The initialization of the Kalman-filter controller for fringe tracking thus inevitably requires a different strategy than open-loop measurements. A straightforward solution is to close the loop temporally using a classical control algorithm. POL data obtained in this way can then be fitted, and consequently we switch from the classical to the Kalman-filter controller. We note that this technique assumes that we take into account the difference in measurement error: the measurement noise σw for a classical controller, extracted from the POL-data fit, will be higher than for the Kalman filter. The measurement error for the latter can be estimated based on Eq. (59), which is introduced in Sect. 7.
3.3. Other methods
The spectral identification method is based on a maximum-likelihood method to model periodograms of measurement sequences. Alternative methods exist to identify parameters in state-space models (see also Meimon et al. 2010).
The advantage of the spectral method is that it is rather transparent, and gives satisfying results (see Sect. 7). The reason is that the imprint of the different perturbation components can be largely decoupled in frequency space, which makes it possible to separate the different disturbances. Future tests on real data will have to justify the use of the method for fringe-tracking purpose, and allow us to optimize the parameter-grid construction. A disadvantage of the method, however, is that it is difficult in general to make an iterative fitting routine robust against outliers and non-standard situations. A good alternative to the spectral-fit method could be a method of a purely algebraic nature, e.g. similar to the one proposed by Hinnen et al. (2005). We are currently checking the possibility of using methods of this nature for parameter identification, as an alternative to the spectral method introduced here. Tests of identification methods based on the reconstruction of the measured time sequences (Shumway & Stoffer 1982; Ziskand & Hertz 1993) were not successful in the current case, basically owing to the large number of components that need to be characterized.
4. Four-telescope fringe-tracking control
In the two previous sections, we have presented a general and complete control scheme for two-telescope fringe tracking. It is generally known that two-telescope interferometers provide a rather limited amount of information per observation (one point in the uv-plane per wavelength bin, no direct phase information). Optical interferometry needs to be extended to include more apertures, especially when considering image reconstruction.
The purpose of the current section is to find how two-telescope control can be extended to more telescopes. We chose to consider the case of four-telescope fringe tracking, in the framework of the second-generation near-infrared beam combiner GRAVITY for VLTI. GRAVITY will combine the beams of the four UTs (or four 1.8-m Auxiliary Telescopes, ATs) in a completely redundant way, i.e. on six baselines. Per elementary fringe exposure, the instrument therefore measures six OPDs, rather than one OPD for a two-telescope beam combiner. The OPDs are corrected by a system of four actuators, i.e. one per telescope. We now consider how the six observables can be transformed into proper commands for the actuator system. We note that this problem is absent in the case of a two-telescope beam combiner: one OPD measurement corresponds to one command. Our four-telescope approach can easily be generalized to n-telescope fringe tracking.
4.1. General modal-based control
Piston disturbances acting on a four-telescope interferometer can be considered as
vectors in a four-dimensional space. In this space, an arbitrary disturbance
P can be expressed as
(24)The actual piston
disturbances P are not directly observable: only piston
differences or OPDs are measured. In a similar way to above, we can define OPD vectors as
(25)where
OPDij ≡ Pj − Pi.
For simplicity, we have assumed that pistons and OPDs are expressed in the same units, say
microns. The conversion
P → OPD
is then given by the matrix equation
(26)where
(27)It is easy to show that
the system in Eq. (26) is not invertible.
Mathematically, the matrix M is of rank three, rather than four. Physically, this
corresponds to it being impossible to recover the global piston
Ptot ≡ P0 + P1 + P2 + P3
from OPD observations (in a similar way to the piston not being observable with a
single-dish telescope). A four-telescope fringe tracker therefore only needs to correct
for three net observables.
Rather than using a description that is based on pistons, we perform a matrix
transformation, based on a singular value decomposition of the matrix M, to construct a
set of variables that directly isolates the invisible global piston. We define the
mode four-vector Q
(Qi for i = 0,1,2,3) as
(28)where
(29)This orthogonal
transformation isolates (a multiple of) Ptot into
Q3. The Q-modes form an equivalent to the Zernike modes in
AO, and their relation to the pistons is graphically represented in Fig. 3. The conversion between vectors
OPD
and Q is then given by
(30)where
(31)We note that the matrix H
automatically puts Q3 to 0, indicating again that we have no
information about Ptot from vectors
OPD.
The three components Q0, Q1, and
Q2 now form a proper description of the observable system
with three degrees of freedom. A general control scheme based on modes consists then of
three steps:
-
1.
we calculate the residual mode vector yQ from the measured residual OPD 6-vector y using the transformation in Eq. (30)
(32)
-
2.
using three controllers (e.g. integrator controllers), we calculate a proper Q-command uQ to correct the Q-modes. We fix
, that is, the global piston is fixed (by convention);
-
3.
transform uQ into the four-component piston command u using the inverse transformation of Eq. (28): u = V uQ. These four commands are sent to the actuators for piston correction.
![]() |
Fig. 3 Graphical representation of the four pistons (top) and the four modes (bottom). The four telescopes are represented as four pixels per square. Grey zones denote the zero-positions, black zones denote positive displacement, and white zones equal negative displacement. The modes could for example be labeled as tilt, twist, tip, and global piston. They form an equivalent to Zernike modes in AO, but for a four-telescope interferometer. |
4.2. Modal Kalman-filter control
Above, we introduced a properly defined control scheme based on a modal approach. We now apply a similar philosophy to a Kalman-filter based control model, and find the first appropriate control law for four-telescope fringe tracking.
4.2.1. Modal state-space model
Starting from the state-space description in Sect. 2.3, we derived (see Appendix A) a full modal state-space model for a four-telescope fringe-tracking system, which can be easily generalized to n telescopes. Individual OPD measurements are assumed to be uncoupled in this model. Additionally, all piston perturbations are considered as being independent. We note that this condition is imposed on two scales. First, on the scale of one telescope, we ignore the coupling between different vibration components. Secondly, on the scale of all telescopes, we assume neither that coupling exists between different telescope vibrations, nor that the turbulence perturbation on the pistons is coupled4.
The result of this analysis is the modal state-space model (compare with Eqs. (9) and (10)) given by The
vector quantities have the following definitions:
-
xQ:
4-block modal state vector;
-
yQ:
modal measurement 4-vector (
);
-
uQ:
modal command 4-vector (
);
-
vQ:
4-block modal system noise vector, of covariance matrix ΣQ,v;
-
wQ:
measurement noise 4-vector, of covariance; ΣQ,w note that
.
On the other hand, the matrices and
are
(35)where the blocks A and
C have the same structure as the corresponding system matrices in the two-telescope
state-space model. All scalar quantities
(yn,
un,
wn) have now become four-vectors, and
the vector/matrix quantities
(xn, A, etc.) have obtained
a four-block structure (xQ,n,
,
etc.). This is all because there are four modes. There are several interesting
properties about the above result, which we discuss now. We note that we first neglect
the noise terms.
First, all four terms in Eq. (34) have
a last component that is immediately defined to be 0 (see also matrix
). In
other words, the global piston is automatically defined as not being observable. Thus,
although it is still a real evolving quantity described by the last block component of
the vectors in Eq. (33), we can neglect
it for the analysis.
Secondly, when looking at the structure of the state-space model and of Eq. (35), we see that all modes have a decoupled
evolution (still neglecting the noise). What is more is that the sub-matrices of
are
the same, and the same observation holds for
.
Neglecting thus the noise, we see that the evolution is described by four identical
copies of a two-telescope state-space model (with the modification that for the last
copy the observation equation is 0). Modes therefore essentially behave in a similar way
to the quantity ϕ of the previous sections5.
Considering the noise terms.
A few things change when we also take into account the noise terms. Since modes are calculated as combinations of pistons, Eq. (28), or combinations of OPDs, Eq. (32), the covariance matrices are in general no longer diagonal (given our assumption that piston perturbations as well as OPD measurements are independent).
Since we have individual OPD measurements at our disposal, it is possible to estimate
the measurement errors
associated with the measurements of OPDij. This allows us
to calculate the mode measurement error covariance matrix
ΣQ,w as
(36)where
(37)(we
refer to Appendix A). The evaluation of the
covariance matrix ΣQ,v, however, is a more fundamental
problem. The matrix ΣQ,v depends on quantities that
require direct piston information, which is inaccessible (owing to the unobservability
of the absolute pistons). We neglect the off-diagonal entries; under this
approximation, it can be shown that ΣQ,v has a diagonal
built of four equal diagonal parts (e.g. like the structure of
). In
this model, the evolution of the four modes by Eq. (33) is thus genuinely independent and equal.
Convention.
Up to now, we have carried the global piston mode as a superfluous burden in our
matrix equations. Since we assume that the modes are fully decoupled, we can drop all
matrix and vector blocks corresponding to Q3. Under this
convention, the fundamental matrices V and H become
(38)It is not difficult
to generalize this operation to the other quantities. For example, the
measurement-error covariance matrix ΣQ,w loses its last
column and line and becomes a 3 × 3 matrix6.
4.2.2. Modal Kalman filter
The non-diagonal nature of the covariance matrix ΣQ,w shows
that the modes are not independently measured. It follows that the control scheme has to
involve one global Kalman filter for the three modes, rather than one per mode. The full
Kalman-filter equations then take the form In
this equation, the Kalman gain G∞ is calculated as in Eqs. (18) and (19), with the changes
. Finally, the
optimal mode command is calculated as
(41)where the matrix
is
defined in the same way as
in
Eq. (35), with blocks K defined as in
Sect. 2.3.
4.2.3. Practical considerations
The modal Kalman-filter control is an interesting control scheme: it is completely symmetric, its conversion matrices are simple, its input and output have the same dimensions (three), and (under the above approximations) each mode has exactly the same state-space model. An additional advantage of the modal scheme is presented in Sect. 4.4, where we consider the identification of the parameters in the state-space model described in Eqs. (33) and (34).
The symmetry involved in the definition of the modes implies that the control will work most effectively when all other involved quantities are symmetric. It is easy to see, however, that this is not the case. The design and implementation of the GRAVITY system are largely symmetric, whereas actual observation conditions (e.g. wind shake, vibrations) and baseline configurations are not. Whenever a single component in the system fails, for example one telescope or one baseline, at least two of the three observables are spoiled in the current control scheme (each OPD contributes to two modes). A priori, the loss of observables might be expected to have a negative impact on the control. We therefore propose a second four-telescope control scheme, which is based on the tracking of the individual OPDs.
4.3. OPD Kalman-filter control
An alternative to a global Kalman filter that tracks all modes is to use six Kalman filters to control each individual OPD. In this (redundant) scheme, each telescope pair is considered as a single two-telescope interferometer.
One might wonder why it would be more interesting to track twice the number of variables (three modes versus six OPDs). The main interest of the modal control is that it was defined in a perfectly symmetric way to transform the six observables and four pistons into three quantities that are tracked. Here, the conversion to and from modes is done using fixed matrices. In the OPD control, however, the redundancy of the conversion to and from OPDs allows us to choose different ways of defining commands. In particular, one can consider non-symmetric combinations of observables or – even better – adaptively weight observables with respect to the physical conditions of observations. The latter possibility is our main point of interest: whenever observing conditions (low flux, low-signal baselines, etc.) make certain OPDs unfavorable, we can lower their weight in the command calculation. Additionally, it even allows us to let the interferometer work with only three or two telescopes.
We postpone a deeper discussion of the above problem to Sect. 5, where we also modify the modal scheme to a more robust version. For
completeness, we already give an appropriate method for command-vector calculation. By
xOPD, we denote the six-block OPD state
vector, i.e. the state vector corresponding to each OPD organized into a column vector.
The four-component piston command
un is then calculated as
(42)where
is the
six-block version of the previously introduced matrix K, in order to operate on the six
substate vectors in
. The matrix , on the
other hand, is a weighted generalized inverse to the matrix M (Eq. (27)), calculated as
(43)In
this equation, the 6 × 6 weighting matrix W, defined further on in Eq. (47), distributes the weight among the
different OPD substates for the command calculation.
4.4. Four-telescope parameter identification
The MPFK method (including the peak-pick extension) can be directly applied to
two-telescope fringe tracking, for the simple reason that it is exactly designed for the
corresponding state-space model. In the four-telescope case, the sequences of
(Eq. (23)) are replaced by equivalent sequences of
six-dimensional vectors
, i.e. the POL OPD
measurement six-vectors.
In the modal state-space model, the three observed modes are decomposed into the
elementary disturbances. Therefore, the most logical option is to apply the spectral fit
to the time sequences (44)where
are the
three-component POL mode measurements (we use Eq. (32)). Under the convention of neglecting the off-diagonal entries of
the covariance matrix ΣQ,v, we have shown in Sect. 4.2.1 that the state-space model is the same for the
three modes. The result of this approximation is not only that we can apply the spectral
fit to each of the three non-trivial modes, but also that we can average the spectra of
the three modes and perform one global fit. Indeed, assuming three independent modes, the
three periodograms are independent realizations of the same state-space model. Averaging
periodograms for identification translates into a modification of the periodogram
statistics, and thus a new likelihood function for the identification method (see Appendix
B for some details). The important advantage is
that the periodogram is a more accurate estimation of the power spectrum, and that even
small perturbation components become significantly discernible from statistical noise.
The OPD Kalman-filter control scheme, on the other hand (Sect. 4.3), is based on the redundant tracking of all OPDs by six two-telescope Kalman filters. In this case, it thus suffices to apply the spectral fit to each of the six OPD measurement sequences. The disadvantage is, of course, that we have to apply the identification method six times, instead of once in the modal scheme. Additionally, since we have only one sequence per OPD, we cannot apply the advantageous averaging of periodograms.
5. Fine-tuning the fringe-tracking control
Our aim to find appropriate Kalman-filter control schemes led us to consider two options: tracking three observable modes in a three-component Kalman filter and tracking six OPDs in six individual Kalman filters. We now consider some specific cases encountered in real observations, and present a way of dealing with them in our control schemes.
5.1. Low S/N baselines
Different processes can lead to lower-quality OPD measurements on a given baseline. Examples are non-central fringe measurements in the fringe envelope and variable splitting ratios of the beams. Whenever the fringe amplitude on a certain baseline is low with respect to the noise level, the control strategy should be designed to maintain the fringe tracking at as optimal a level as possible.
An advantage of the interferometer concept considered above is that its redundant architecture enables it to take into account low fringe-amplitude baselines, in which different OPD combinations can compensate for other OPD measurements (e.g. OPD01 = OPD02 − OPD12). The key step will be to take a weighted recombination of the six baselines, in order to mutually improve the raw measurements.
In essence, we can write the process of weighting the residual OPD values as
(45)with
an associated measurement-error covariance matrix
(46)The 6 × 6 matrix
IW combines in a weighted way the OPD measurements, which
allows us to mutually improve each OPD value. The weights
Wij attributed to each
OPDij are organized in a diagonal weighting matrix W,
which is used to calculate the previously introduced matrix (Eq. (43)). In accordance with the purpose of
weighting, a proper definition of W is
(47)which is the
inverse of the (diagonal) measurement-error covariance matrix. In other words, we take
.
We note that either global weights W, i.e. fixed weights during operation, or
instantaneous weights Wn, i.e. defined for each OPD
measurement, can be considered. The latter is possible owing to an instantaneous
measurement error that is provided by the phase sensor (we refer to Eq. (59) in Sect. 7).
In the modal Kalman-filter control scheme, we now simply apply the standard OPD-to-mode
transformation in Eq. (32) to the weighted
modes yW, and accordingly use a
weighted measurement covariance matrix (48)For the OPD scheme, the
six Kalman filters are applied to the six weighted OPDs. We note that in this scheme, the
weighting is assumed not to introduce coupling into the OPDs. In other words, we ignore
off-diagonal terms in the covariance matrix Σw,W in the OPD
scheme. This allows us to keep using the 6 independent Kalman filters, and will prove
useful for the adaptive-gain definition below.
A final note on this OPD weighting is that it is only useful for non-resolved fringe-tracking targets. When the target is (significantly) resolved, the intrinsic visibility phases may be non-zero and hence phase closures cannot be assumed to be 0 (e.g. OPD01 + OPD12 − OPD02 ≠ 0, in general). In any case, fringe tracking of significantly resolved objects is disadvantageous owing to the lower fringe contrasts.
5.2. Flux dropouts
OPD weighting resolves the issue of low S/N’s on individual baselines. A second problem occurs when the flux acquired at a telescope is low, i.e. the so-called flux dropouts. The notion of flux dropouts contains the different processes by which the number of photons coming from a certain telescope drops. These include:
-
atmosphere-related effects, where scintillation and otherturbulence effects can lead to a failure of the AO system. Theresulting Strehl ratio can be insufficient for proper fringeformation and detection. In addition, cirrus may lower the fluxarriving at the focal plane of a telescope;
-
errors in the fiber injection, where mechanical problems (e.g. vibrations) and other propagating tip-tilt errors in the optical train can lead to problems in the injection of the beams into the fibers, coupled to the beam combiner.
As far as the second point is concerned, GRAVITY will have a fiber coupler subsystem to perform an internal tip-tilt correction (Pfuhl et al. 2010). Owing to the residual errors, however, a specially developed fringe-tracking strategy is still required.
It should be clear that the problem of flux dropouts is very different from the problem introduced in the previous section. When the flux drops, it is impossible to compensate for the low signals on the baselines associated with that telescope: all these baselines are affected in a similar way. For non-predictive control schemes, no other solution exists apart from waiting until the lost signal re-emerges, while keeping the command for that telescope fixed. The advantage of using a predictive method, such as a Kalman filter, is that it allows us to continue operating based on the state-space model.
In Appendix C, we argue for the necessity to modify the gain to take into account flux problems at the level of a telescope. In essence, we modify the constant Kalman gain and allow it to decrease when observing conditions on associated telescopes degrade. The result is that the estimation of the next state is based more on the model, i.e. the observed (noisy) residual OPD is not taken into account as effectively.
To introduce the instantaneous-gain definitions, we first define the instantaneous
versions ΣQ,w,W,n and Σw,W,n of
the previously introduced noise-covariance matrices (Eqs. (46) and (48)). The
instantaneous covariances are recalculated at each time step using instantaneous weighting
matrices Wn (rather than a constant W). The discussion in
Appendix C then leads us to the instantaneous-gain
definitions (49)(“tr”
denotes taking the trace) and
(50)(the index
ij to the parentheses indicates that we take the matrix diagonal
element corresponding to the considered OPDij; for instance,
for ij = 01 and ij = 12, we take the zeroth and third
diagonal element, respectively).
The advantage of the OPD scheme is now clear. Here we consider the extreme case when we lose all information provided by one telescope. In the modal scheme, the loss of information immediately means that we lose all mode observables, which implies that we cannot take into account the observational information at that step (the instantaneous gain matrix becomes zero). On the other hand, losing a telescope in the OPD scheme means we still have three OPDs that are properly measured, and the gain for those Kalman filters can stay fixed. This implies that the command is still based on half the number of usual observables (versus none for the modal scheme). The cost we have to pay is that we need to track six variables instead of three.
5.3. Piston isolation
For some specific reasons, it might be interesting to completely decouple one or more telescopes from the command estimate. This can occur when a telescope (or its AO system) temporally becomes very unreliable during an observing run. The flexibility of the OPD scheme allows us to track blindly on one/two telescope(s) without affecting the tracking of the others, after which we can switch back to the usual scheme.
To temporally decouple telescope i, we attribute a weight zero to all
associated baselines to telescope i. Doing so, however, results in
putting the command on telescope i to 0. This can be disadvantageous to
the stability of the system, as it generally involves a large jump in all actuator
positions (the average actuator position is always 0, hence all actuator positions change
when one command is put to 0). The basic idea is then to add an extra component to the
command vector, (51)where
the components of
that
correspond to i are blindly estimated (gain 0). The newly-introduced
matrix Li, on the other hand, has a specific structure that
adds a blindly estimated command to i, and subtracts an equal value from
all other commands such that the sum of the commands is zero. For instance, for
i = 0 (i.e. to isolate P0) and
i = { 1,2 } (i.e. to isolate both P1 and
P2), we have
(52)respectively. As we
subtract an equal command from each non-decoupled actuator, these commands on the
decoupled telescope(s) do not affect the non-decoupled telescopes.
6. Full Kalman-filter operation
We have introduced the basic theory, developed several control schemes, and considered strategies that allow us to minimize the problems associated with imperfections of the system/control schemes. It is a good point now to define the full Kalman-filter operation by comparing the two schemes. The operation is split into an identification phase and a (real-time) tracking phase, the former being needed to find the parameters.
Identification phase
-
1.
Pseudo-open loop (POL) data acquisition: a set of data pointsand estimated measurement errors is obtained in closed loop andcorrected for the applied commands to get a POL sequence.
-
2.
Weighting: we estimate a global measurement noise for each POL OPD sequence by taking the median of the measurement errors. We apply the weighting in Eq. (45) to get the weighted OPD sequence. In addition, for the modes, apply the OPD-to-mode transformation in Eq. (32) to calculate the weighted mode sequence. We also calculate the weighted covariance matrices using Eq. (46) or (48).
-
3.
Spectral fit: the spectral-fit method is applied to either the averaged mode periodograms (modal scheme) or the individual weighted OPD periodograms (OPD scheme) to get the fit parameters.
-
4.
Filling of the Kalman matrices: the fitted parameters are used to define the system matrices, and to calculate the gain G∞.
Tracking phase
For each elementary fringe exposure:
-
1.
Weighting: the residual OPD estimates for the six baselines,obtained from the phase sensor, are combined in a weighted wayto give optimal residual mode estimates (modal scheme) or OPDestimates (OPD scheme). The weighting factors are calculatedfrom instantaneous estimates of the measurement errors. Wenote that this step requires a singular-value decomposition tocalculate the weighted generalized inverse of M. We calculate the associated instantaneous measurement covariances ΣQ,w,W,n or Σw,W,n (Eq. (46) or (48) with instantaneous weights).
-
2.
Kalman step: we apply the Kalman filter, using the instantaneous gain definitions (Eq. (49) or (50)).
-
3.
Command calculation: we calculate the actuator commands.
7. Simulations for VLTI/GRAVITY
To test the performance of our Kalman-filter control, we construct a simple simulator of the control process in a fringe tracker, which also allows us to distinguish between the different control schemes. The simulations are performed in function of the key science case of the four-telescope beam combiner GRAVITY: the astrometric observation of the Galactic-center neighborhood. As described in Gillessen et al. (2010), stable fringe tracking of a magnitude K = 10 star is required, with a specification of 300 nm for the residual OPD.
More complete simulations of the fringe tracking process for GRAVITY, including frame-by-frame acquisition, are postponed to a future paper (Choquet et al., in prep.). In the same paper, a comparative study with classical control algorithms will be presented and loop parameters will be optimized.
![]() |
Fig. 4 Power spectra used to simulate the perturbations on the four pistons (four colors). The turbulence-only spectrum is indicated by the dashed line. |
7.1. Disturbance simulation
Data is simulated based on model power spectra for the associated phenomena. Since turbulence and vibration components are assumed to be mutually independent, we can sum individual perturbation sequences to assemble complete data sequences.
For each data sequence, we take a number of vibration components under the form of randomly excited damped oscillators. The frequencies chosen are inspired by OPD power spectra obtained with the VLTI/PRIMA fringe tracker, shown in Sahlmann et al. (2009). The total rms deviation per piston is chosen between 200 and 240 nm, which is a realistic estimate of the current vibration level at VLTI (Poupar et al. 2010). The vibration profiles used for each telescope are shown in Fig. 4 (superimposed on the turbulence profile), with some specifications in Table 1.
Considering the turbulence contribution, Conan et al.
(1995) derive the asymptotic laws for single turbulence-layer OPD spectra
(53)(pure
Kolmogorov turbulence, i.e. with infinite outer scale), where v is the
wind speed parallel to the baseline, B is the baseline, and
D is the diameter of the telescopes. The first two power laws in
Eq. (53) are clearly observable in OPD
spectra obtained with the Mark III interferometer (Colavita
et al. 1987). In this case, the steep f−17/3
cut-off is not observed because of to the small telescope diameters (high cut-off
frequency). Nevertheless, the f−17/3 law, which is also
expected for mono-layer tip-tilt spectra, is also in practice not observed on large
telescopes. The tip-tilt power spectra of the AO-system VLT/NAOS, used by Meimon et al. (2010) to verify the spectral
identification method (MPFK method, Sect. 4.3), are
well-modeled by an AR(2)-model spectrum (lacking a very steep cut-off). In addition, OPD
spectra obtained with the VLTI/PRIMA fringe tracker using the ATs
(D = 1.8 m) are compatible with an
f − 8/3 power law up to 200 Hz, where a flat noise tail
takes over (Sahlmann et al. 2009). We thus conclude
that, in real-life situations (multi-layer turbulence, integrated
-profiles,
multi-directional wind), the f−17/3 law is not
observable/observed.
We therefore decide to use the OPD-spectrum model in Eq. (53), but discard the f−17/3 cut-off. A
very similar model is used by Gitton (2010) in the
context of VLTI instrumentation. In Table 1, we
present the values assigned to the parameters, which are representative of the observing
conditions at the VLTI. Each piston sequence is scaled to an rms deviation of
10 μm, such that the resulting OPD sequence has an rms value of
~14 μm (m). The
theoretical turbulence profile is shown in Fig. 4.
7.2. Flux dropouts and measurement noise
In addition, we need to introduce the phenomenon of varying flux in the simulations,
which manifests itself as variable measurement noise. This means that, in parallel to the
simulated OPD sequences, we need to simulate a time sequence of instantaneous noise
characteristics, and add a random noise-realization sequence to the noise-free OPD
sequence. The measurement error on a
baseline ij is inversely proportional to the mean observed visibility
Vij and the S/N
ρij on that baseline, where
(54)and
(55)Here, RON is the read-out
noise per pixel and Ni is the number of
coherent photons arriving from telescope i for a single phase
measurement. The factor 4, on the other hand, is due to the phase sensing for GRAVITY
being based on 4-pixel measurements (ABCD algorithm, see Shao & Staelin 1977). For simplicity, we only consider OPD estimation
using phase-delay estimation. Phase-unwrapping techniques for resolving the
2π ambiguity in measured fringe phases, for example using a combination
of phase-delay and group-delay algorithms (e.g. Blind
et al. 2011), are postponed to future work
In terms of the number of coherent photons N arriving per exposure on
one telescope, we have (56)where
is the total coherent throughput (AO-corrected atmospheric turbulence, delay-line system
and GRAVITY). The factor
comes
from the splitting of arriving photons on the three baselines corresponding to the
telescope, while
refers
to the dispersion of fringes on five spectroscopic channels (needed for the group-delay
estimation, not considered in the current simulations).
The variability in the coherent throughput
is simulated based on a model for the propagating tip-tilt errors in the AO system, thus
simulating improper fiber injection. Basically, we assume that
(57)The quantity
φ is the relative offset of proper fringe injection (the ratio of
tip-tilt offset to the fiber mode field radius). On the basis of tip-tilt measurements at
the VLTI, the temporal power spectrum for φ is derived to be
(58)(Gitton,
priv. comm.). The total rms tip-tilt deviation is scaled to 14.6 mas, of which 5 mas is
present in the vibration component. A deeper discussion of this power spectrum would lead
us too far for current purposes, and will be presented in future work (Choquet et al., in
prep.). An extract of a simulated normalized throughput sequence is shown in Fig. 5.
![]() |
Fig. 5 Ten-second extract of a normalized throughput sequence for one of the pistons. |
Since phase delays estimated on the five spectroscopic channels can be combined, the
estimator for the measurement error is derived to be7(59)(Houairi et al. 2008). The parameters in this and the
above equations are specified in Table 1, and are
representative of the astrometric observation of the Galactic center, which is the
principal science case of GRAVITY. At maximal throughput
(
),
we derive a coherent photon number per exposure of
Ni = 20 and a corresponding measurement
error of
nm.
Simulation parameters, representative for the Galactic-center observation of GRAVITY.
7.3. Turbulence-only simulations
Before considering simulations with the complete set of expected OPD perturbations (turbulence and longitudinal vibrations), we perform a number of fringe-tracking runs including only the effect of turbulence (but with flux dropouts). The simulation process is started by generating a sequence of 2000 POL-data points. To keep the focus of our simulations on investigating the principle and performance of Kalman-filter fringe tracking, we chose to construct POL data in a simple way by adding measurement errors (Eq. (59)) to a turbulence OPD sequence. In complete and fully realistic simulations of the fringe tracking, the simulated POL data should be acquired in a frame-by-frame simulation with the phase-sensing algorithm. As mentioned before, this problem will be addressed in our future work (Choquet et al., in prep.).
Following the operation scheme defined in Sect. 6, we simulate 200 full fringe-tracking runs of 100 s for both the modal and the OPD Kalman filters. To test our definition of the instantaneous Kalman gains when handling flux dropouts, we distinguish between simulations without modifying the gains (F, fixed) and with instantaneous gains (I).
Once the fit parameters are obtained and the Kalman matrices are generated, both
recombination schemes can be tested. The fringe tracking is initiated with an empty state
vector (). The
convergence of the Kalman filter to the actual state of the system typically takes a few
to a few tens of tracking points. We reject this transition region when calculating the
residual OPD.
The results of the turbulence-only simulations are shown in Fig. 6. In this plot, all residual rms OPD values (6 per run) are grouped in
blocks of 2 nm, giving us a measure of the distribution of residual OPD values. We found
that the performance of both control schemes (modal and OPD) in both gain settings (fixed
and instantaneous) is similar. The residuals are distributed with some degree of asymmetry
around a nominal residual value in the range of 125–145 nm. The total tracking residuals
of 100-s simulations are thus only about twice as large as the measurement error
at maximum
throughput for a single OPD observation (68 nm, see previous section). This indicates
that the tracking performance of the Kalman-filter schemes is good. For both filters
(modal versus OPD), the gain modification slightly reduces the residual OPDs, by about
3–4 nm (about 5–6% in energy).
![]() |
Fig. 6 Distribution of residual OPD values resulting from 200 simulations with turbulence as the only OPD disturbance. The plots show the number of residuals per block of 2 nm. Both Kalman-filter schemes are considered, with instantaneous (I) and fixed (F) gains. |
7.4. Full perturbation simulations
We now turn to simulations based on the full power spectrum in Fig. 4. We put a maximum of 40 on the number of vibrations that can be identified during the mode identification, and 20 per baseline for the OPD identification. These numbers correspond to a maximum matrix size of 240 × 240 for Σ∞.
As in the case of the turbulence-only simulations, we used 2000-point data sequences for the identification. The result of a typical spectral fit is shown in Fig. 7. By definition, the average mode spectrum contains all vibration components. This is clearly apparent in the upper plot, where the number of components to identify is about twice as high (a single baseline has the vibrations of only two telescopes). Another thing to note is that the statistical spread of the average modal periodogram is indeed lower than that of the OPD periodogram.
![]() |
Fig. 7 Result of applying the spectral-fit method to an average mode periodogram (top) and one of the six OPD periodograms (bottom). The statistical spread in the periodogram points is clearly smaller than in the average periodogram, in other words the power spectrum is more accurately estimated. |
A total of 200 full simulations of identification and 100-s fringe-tracking run were performed. The corresponding distributions of OPD residuals are shown in Fig. 8, organized into blocks of 10 nm. The addition of vibration perturbations has introduced a considerable spread in the residual OPD values. In addition, unlike in the turbulence-only simulations, a clear difference in performance between the modal and the OPD scheme is found. The residual rms value for the modal scheme is about 290 nm, on average, where about 30% of the measurements are above 300 nm. The OPD scheme, on the other hand, has an average of about 240 nm for the rms OPD value, and only 6% of the residuals surpass 300 nm. When choosing between fixed or instantaneous gains, there is no apparent significant difference. We describe these observations in greater detail in Sect. 7.5.
Since the vibration perturbations considered are on the level of 200–240 nm, the
corresponding OPD vibration perturbations are roughly 300 nm
(e.g. nm). Comparing
the smallest residuals of turbulence-only simulations (~125 nm) to the best
residuals of the current simulations (~200 nm), our simulations indicate that
as much as 75% of the vibration energy can be filtered out by our Kalman filters (we
calculate that (2002 − 1252)/3112 = 25% of the
vibration energy is left).
7.5. Discussion
The above simulations give a first quantitative result for the Kalman-filter schemes developed in this work. We have illustrated the operation of the spectral fit on realistic perturbation profiles, and have found the performance one would expect from the modal and OPD control schemes.
When considering our definition of instantaneous gains, it is clear that no significant differences between fixed and instantaneous gains are seen, for the considered flux perturbation sequences. One conclusion that might be drawn is that the instantaneous gains do not improve the fringe-tracker performance. To verify that this would be the case, we performed simulations without the flux dropouts, for which we added the results to Fig. 8. Surprisingly, although there is some apparent difference, the global pattern in the distribution of OPD residuals seems to be rather unaffected by the flux dropouts. At this point, it might be expected that the Kalman-filter based control scheme is not limited by the expected fluctuations in flux. Future comparison to other control strategies would be required to determine whether this property is exclusive to Kalman filtering. Although the instantaneous-gain definitions do not change the Kalman-filter performance for the considered flux perturbations, we assume that the gains are still useful. When poorer observing conditions prevail, causing longer flux dropouts for instance, the gain can still be expected to limit the risk of improper tracking. In this respect, using instantaneous gains might be very advantageous to the fringe tracking on the ATs, which lack a higher-order AO correcting system. Additionally, when lower fluxes are assumed for the observed object, the gain modification becomes useful.
![]() |
Fig. 8 Distribution of residual OPD values resulting from 200 simulations with full perturbation contribution. The plots show the number of residuals per block of 10 nm. Both Kalman-filter schemes are considered, with instantaneous (I) and fixed (F) gains. In black dashed and dotted lines, we overplot the corresponding distributions for simulations without flux dropouts. |
The main observation of our simulations is that the OPD Kalman filter provides the best results. This is somewhat unexpected, since the modal scheme has the most reliably estimated power spectrum to model, and has the optimal number of variables to track. One of the main disadvantages of the modal scheme, however, is that all vibrations are fitted at once. As a result, the periodogram fit is more difficult:
-
A higher fraction of the periodogram points are part of a vibrationpeak, thus fewer points are available to estimate the turbulencecomponent. Since the latter component is the most importantperturbation, a good estimation of its associated power spectrumis essential.
-
Vibration peaks place a much stronger bias on the estimation of the turbulence component. The reason is that the statistics of an average mode periodogram are “narrower” (i.e. have a smaller spread, see the discussion in Appendix B), hence vibration peaks have a stronger influence on the turbulence estimation than in the case of an OPD-periodogram fit.
-
The more peaks that are present in the periodogram, the higher the probability that the peaks will overlap. Owing to the insufficient resolution of the periodograms, overlapping peaks will be fitted as a single broader peak, which is more difficult to correct (more transient behavior).
All of these properties might lower the quality of the power-spectrum estimation for the modal scheme, hence lower the Kalman filter performance.
Another property that might have been incorporated to explain the performance differences between modal and OPD control is that flux dropouts have different implications for the schemes. We recall that a low level of flux at one telescope can significantly affect the quality of all modes, whereas three of the six OPDs are well-measured in the OPD scheme. However, since simulations with flux dropouts give similar results to simulations without, the performance difference cannot be attributed to flux dropouts. Temporally losing information about all mode observables thus does not significantly influence the tracking performance of the Kalman filter. In contrast, for non-predictive control algorithms applied to the modes, losing observables can be expected to be a critical limitation.
When considering the actual shape of the distributions in Fig. 8, we find a quite large spread in the residual OPD values. Not taking into account local statistical jitter, the distributions have a clear global maximum with some “leaking” to higher residuals (~10–15%). We ascribe these outliers to improper estimation of the power spectra of the perturbation components. Improvements to the identification method could indeed be made to make it more robust (e.g. optimal grid construction, reconsidering the number of identification points).
As a final note, it cannot be ruled out that the performance difference is related to the conceptual difference between the Kalman filters. For a vibration acting on one telescope, the three baselines that are not associated remain unaffected from this vibration. In the modal scheme, however, the vibration needs to be filtered in all modes. It might be this property of indirectly affecting the other telescopes (via the modal filtering) that causes the lower average performance of the modal scheme. More tests are needed to verify this reasoning.
7.6. Performance with respect to GRAVITY
The main conclusion of these simulations is that the average performance of the Kalman-filter control schemes is compatible with the performance required for GRAVITY, specified at 300 nm (Gillessen et al. 2010). For the OPD scheme, at least 90% of the performed tracks should attain this stability level, and a nominal performance of λ/10 is within reach. This makes the latter scheme clearly the most suitable of the two.
The most important working point to reach this stability level for our Kalman-filter schemes is optimizing the operation of the identification method. In this respect, a decent implementation should limit both the risk of low-quality identification and the calculation time. For the latter point, an optimal adaptation of the parameter grid to the typical prevailing disturbances at the VLTI should be defined (Chouquet et al., in prep.).
Compared to the simulations done in Choquet et al. (2010), based on a classical integrator controller, our simulations show that the nominal Kalman-filter performance might be compatible or better. In any case, we should immediately nuance this comparison, since the assumed parameters and perturbation profiles are rather different than the ones used here. For example, the simulated turbulence profiles used here exclude the unobserved f−17/3 part and a higher and more realistic vibration contribution is assumed. Our future work will present a firmer base for comparing Kalman-filter fringe tracking to classical control, something that is beyond the reach of the current work.
8. Summary
In the framework of new-generation interferometric instruments, we have designed and analyzed a Kalman control scheme for fringe tracking. The starting point of this work was the work done in AO control, in particular a control scheme proposed by Le Roux et al. (2004) based on Kalman filtering, which is an optimal data processing algorithm. Apart from the usual correction for turbulence disturbance, a Kalman filter allows us to correct for vibration disturbances, present in VLTI observations under the form of longitudinal vibrations.
We started by formulating a control scheme for two-telescope fringe tracking, based on a parametric model for the fringe-tracking disturbances. To estimate the involved parameters, we included an identification method based on the characterization of perturbation power spectra (Meimon et al. 2010). As a direct application to multi-telescope fringe tracking, we extended the fringe-tracking formalism to four-telescope control schemes. We note that our approach can easily be generalized to n-telescope fringe tracking.
Two Kalman-filter control schemes are considered in the four-telescope case: a modal-based scheme, where three modes (linear combinations of pistons) are tracked in a single three-component Kalman filter, and an OPD-based scheme, where six OPDs are tracked using six Kalman filters. The advantage of the former is that only one spectral fit needs to be performed: the three modal periodograms can be averaged. To ensure that the control schemes are unaffected by low S/N baselines and low fluxes at the level of the telescopes (flux dropouts), we included a weighting step and a modification to the gain.
Simulations performed in the context of the four-telescope beam combiner VLTI/GRAVITY allowed to test the two control schemes and the concept of instantaneous gains. Two important observations that follow are:
-
1.
Despite being operationally heavier, the OPD control schemeprovides the best results. We (partly) attribute the lowerperformance of the modal control to the more difficultidentification (all vibration parameters need to be extracted atonce). There might also be a conceptual performance limitationrelated to Kalman filtering of the modes, in that all perturbationsneed to be corrected in all modes.
-
2.
For the considered flux perturbations, the instantaneous gains do not significantly improve the fringe-tracking performance.
When considering the latter point, comparison to simulations without the flux dropouts have indicated that Kalman-filter control seems to be rather unaffected by the considered flux perturbations. Yet, the instantaneous gain might prove useful when worse observing conditions prevail (e.g. fainter tracking source, longer flux dropouts).
As far as the science case of VLTI/GRAVITY is concerned, both schemes show nominal performances that meet the specified performances. The results open perspectives for attaining even higher quality residuals (λ/10 performance). The most important working point for the future is optimizing the identification method.
The complex identities
and cos(jθ) = coshθ are used to compute
a1 in Eq. (7).
Note that Eqs. (16) and (17) represent the asymptotic Kalman equations. The non-asymptotic equations can be found e.g. in Chui & Chen (2009).
Using Eq. (36), it can be shown that the last column and row were zeros anyway, exactly owing to their unobservability.
Note that, during operation, the number of photons per aperture can be directly estimated from the six ABCD measurements (ABCD photometry, see Benisty et al. 2009).
The horizontal bar indicates that C− is a row vector built of row vectors, and allows us to distinguish from the previously introduced C matrix (Eq. (A.5)). In the principal text, we drop this bar, for notational simplicity.
Acknowledgments
We would like to thank Pierre Fedou for useful interactions on the design of VLTI/GRAVITY. We are grateful to Frank Eisenhauer, the editor (Thierry Forveille) and the anonymous referee for comments and suggestions that helped improving this manuscript.
References
- Benisty, M., Berger, J.-P., Jocou, L., et al. 2009, A&A, 498, 601 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Blind, N., Absil, O., LeBouquin, J.-B., Berger, J.-P., & Chelli, A. 2011, A&A, 530, A121 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Choquet, E., Cassaing, F., Perrin, G., et al. 2010, in Proc. SPIE, 7734 [Google Scholar]
- Chui, C. K., & Chen, G. 2009, Kalman Filtering: with Real-Time Applications (Springer) [Google Scholar]
- Colavita, M. M., Shao, M., & Staelin, D. H. 1987, Appl. Opt., 26, 4106 [NASA ADS] [CrossRef] [Google Scholar]
- Colavita, M. M., Wallace, J. K., Hines, B. E., et al. 1999, ApJ, 510, 505 [NASA ADS] [CrossRef] [Google Scholar]
- Colavita, M. M., Booth, A. J., Garcia-Gathright, J. I., et al. 2010, PASP, 122, 795 [NASA ADS] [CrossRef] [Google Scholar]
- Conan, J.-M., Rousset, G., & Madec, P.-Y. 1995, J. Opt. Soc. Am. A, 12, 1559 [NASA ADS] [CrossRef] [Google Scholar]
- Delplancke, F. 2008, New Astron. Rev., 52, 199 [NASA ADS] [CrossRef] [Google Scholar]
- Di Lieto, N., Haguenauer, P., Sahlmann, J., & Vasisht, G. 2008, in Proc. SPIE, 7013 [Google Scholar]
- Durban, J. 2004, Introduction to state space time series analysis (Cambridge: University Press), ed. A. Harvey, S. J. Koopman, & N. Shephard [Google Scholar]
- Eisenhauer, F., Perrin, G., Brandner, W., et al. 2011, The Messenger, 143, 16 [NASA ADS] [Google Scholar]
- Eisenhauer, F., Perrin, G., Rabien, S., et al. 2005, AN, 326, 561 [NASA ADS] [Google Scholar]
- Gillessen, S., Eisenhauer, F., Perrin, G., et al. 2010, in Proc. SPIE, 7734 [Google Scholar]
- Gitton, P. 2010, Interface Control Document between VLTI and its Instruments. Part II, Tech. Rep. VLT-ICD-ESO-15000-4809, ESO [Google Scholar]
- Haguenauer, P., Abuter, R., Alonso, J., et al. 2008, in Proc. SPIE, 7013 [Google Scholar]
- Hinnen, K., Verhaegen, M., & Doelman, N. 2005, in Proc. SPIE 5903, ed. R. K. Tyson, & M. Lloyd-Hart, 75 [Google Scholar]
- Houairi, K., Cassaing, F., Perrin, G., et al. 2008, in Proc. SPIE, 7013 [Google Scholar]
- Kalman, R. E. 1960, Trans. ASME J. Basic Eng., 82, 35 [Google Scholar]
- Le Roux, B., Conan, J.-M., Kulcsár, C., et al. 2004, J. Opt. Soc. Am. A, 21, 1261 [NASA ADS] [CrossRef] [Google Scholar]
- Looze, D. P. 2007, J. Opt. Soc. Am. A, 24, 2850 [NASA ADS] [CrossRef] [Google Scholar]
- Lozi, J., Cassaing, F., Le Duigou, J. M., et al. 2010, in Proc. SPIE, 7734 [Google Scholar]
- Meimon, S., Petit, S., Fusco, T., & Kulcsár, C. 2010, J. Opt. Soc. Am. A, 27, 122 [Google Scholar]
- Monnier, J. D. 2003, Rep. Prog. Phys., 66, 789 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
- Petit, C. 2006, Ph.D. Thesis, Université Paris 13 [Google Scholar]
- Pfuhl, O., Eisenhauer, F., Haug, M., et al. 2010, in Proc. SPIE, 7734 [Google Scholar]
- Poupar, S., Haguenauer, P., Merand, A., et al. 2010, in Proc. SPIE, 7734 [Google Scholar]
- Reasenberg, R. D. 1990, in Proc. SPIE, 1237, ed. J. B. Breckinridge, 172 [Google Scholar]
- Rousset, G., Lacombe, F., Puget, P., et al. 2003, in Proc. SPIE, 4839, ed. P. L. Wizinowich, & D. Bonaccini, 140 [Google Scholar]
- Sahlmann, J., Ménardi, S., Abuter, R., et al. 2009, A&A, 507, 1739 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Shao, M., & Staelin, D. H. 1977, J. Opt. Soc. Am. (1917–1983), 67, 81 [Google Scholar]
- Shao, M., & Staelin, D. H. 1980, Appl. Opt., 19, 1519 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Shumway, R. H., & Stoffer, D. S. 1982, J. Time Ser. Anal., 3, 253 [CrossRef] [Google Scholar]
- Ziskand, I., & Hertz, D. 1993, IEEE Trans. Signal Proc., 41, 3202 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Derivation of the modal state-space model
The derivation below is based on the state-space model in Sect. 2. We use a natural intermediate state-space model to develop the modal model.
A.1. Canonical state-space model
Before going to the complete set of four telescopes, we recast the problem of a two-telescope interferometer (Sect. 2) in a more useful form. The result allows a direct generalization to an n-telescope interferometer.
A.1.1. Two-telescope interferometer in a piston scheme
In two dimensions, piston disturbances and associated OPD measurements can be written
like
P = (P0 P1)⊤
and (trivially)
OPD = (OPD01).
Finally, the matrix M that governs the conversion
P → OPD
is (A.1)We define a new
state-space model in terms of fundamental piston disturbances
P (rather than the phase disturbances
ϕ in Sect. 2). For instance, a
vibration on the level of piston Pk
(k = 0,1) can be written as
(A.2)where we note
that we simply include one extra index, k, to refer to the considered
piston. In this state-space model, new state vectors x
are defined as
(A.3)where, taking
e.g. two vibrations on piston 0 and one on piston 1,
The
full equation of state can then be written in block form as
(A.4)where block
diagonal matrices Ai are defined exactly as before (for
each fundamental piston). We assume that the system noise
v is composed of independent white Gaussian noise
components, hence the covariance matrix Σv is still
diagonal.
Given that we consider the same two-telescope interferometer as before, the
observable is still the same: the residual OPD (note
that we give the index “01” to refer to the baseline on which we work). The new state
vector requires an updated definition of the system matrix C, which we denote
. An
appropriate definition is
(A.5)where C0
and C1 are row matrices defined as before. For example, the corresponding
C0 and C1 to Eq. (A.3) are
(A.6)In accordance with
the definition of the matrix A in Eq. (A.4), we have defined C in Eq. (A.5) as the block-diagonal matrix of matrices
Ci. Considering the piston control, we wish to calculate
the commands to be applied directly to the pistons. Command vectors are hence of
length two (we consider two actuators). The full measurement equation in its new form
is then
(A.7)
A.1.2. Four-telescope interferometer in piston scheme
The equation of state in Eq. (A.4) in
now trivially generalized to the four-double equivalent (A.8)We still
assume that the covariance matrix Σv is diagonal, i.e. the
vectors
(k = 0,1,2,3) have independent white Gaussian noise components. The
fact that fringes are tracked on six baselines leads to the use of residual OPD
vectors y and measurement noise vectors
w, where
(A.9)and
analogously for w. The system matrix C and the vectors
un also have their
four-dimensional equivalent, and the full measurement equation finally reads
(A.10)As we did with
Σv, the measurement error covariance matrix
Σw is assumed to be diagonal.
A.1.3. Problems with the state-space model
The combination of Eqs. (A.8) and (A.10) forms a very natural full state-space model of a four-telescope six-baseline fringe tracking system. On the one hand, it describes in a natural way the disturbances on the level of the four apertures, under the reasonable assumption that these are decoupled. On the other hand, the measurement equation describes how the six measured residual OPDs are the result of the contributions of the four fundamental pistons, and of the correction that is applied by the four actuator systems (piezos + delay lines). Owing to the very natural description given by this model, we can refer to it as the canonical state-space model. It seems that we have found an appropriate input model to build a Kalman filter.
However, there is a serious difficulty of the canonical model: it is impossible to identify its parameters. In Sect. 3, the parameter identification method that is introduced is based on applying a spectral fit to sequences of (pseudo-)open loop measurements. Here, however, the matrix M couples all information about the evolution of pistons, in a non-trivial way. In essence, the problem is that absolute piston information cannot be unambiguously recovered from observations. Difficulties with the system identification for a piston-based state-space model could hence have been expected. However, the state-space model defined above now forms a perfect starting point for the construction of a modal-based state-space model.
A.2. Modal state-space model
We now consider how the canonical state-space model can be put into a modal form, in the spirit of what was done in Sect. 4.1. This direct derivation of a modal state-space model is more reliable than simply applying one Kalman-filter controller per mode. In particular, although the modes are orthogonally defined, they have coupling terms that are related to there being linear combinations of pistons/OPDs. The power of the Kalman filter is that it is constructed to deal with vector quantities and covariance matrices which describe coupling between different terms.
Modal equation of state.
For the new equation of state, we introduce a piece of new notation. We use the
tilde sign (~) to denote extended vector and matrix quantities with respect
to the previously introduced versions. First we introduce the extended state-space
vectors as
(A.11)where e.g.
(A.12)In words, the
components
consist of four zero-columns of which the kth zero-column has the
same length as the vector
(k = 0,1,2,3), but the ith zero-column is replaced
by
itself. It
follows that, by definition, all
have
the same length (unlike the vectors
), and we define
(A.13)In exactly the same
way as in Eq. (A.11), we define the
vectors
as
extensions to vn, and the
corresponding covariance matrix is written as
.
Finally, we define the matrix
as
(A.14)i.e. a block diagonal
matrix with the square system matrix A on the diagonal. A completely equivalent
equation of state to the one in Eq. (A.8) is then given by
(A.15)To pass to a
modal equation of state, we follow an approach similar to that used in the
Q-mode definition
Q = V⊤ P.
We define a (still unitary) block-matrix version of the matrix V
(A.16)where
“1” denotes a unit r × r matrix. The modal state
vector is then defined as
(A.17)Multiplying
Eq. (A.15) by and using the equality
, the modal equation of state finally reads
(A.18)where we have
defined the mode system noise
(A.19)The
covariance matrix of vQ is .
Using some numerical examples, it can be shown that ΣQ,v
has a quadruple diagonal structure, i.e. four equal diagonal blocks, but also
possesses off-diagonal terms.
Modal observation equation.
We multiply the observation equation by H (Eq. (31)), which leads to (A.20)where
R = diag(1,1,1,0). We note that we used the property H M = R V⊤. We
verify that this equation indeed defines the last component of
yQ,n to be zero, in
accordance with our convention that Q3 ≡ 0. The defined
transformation gives the observing equation in terms of the residual
Q-mode observables
yQ. Next, we need to
pass from the piston state vector
xn to the equivalent
xQ,n for the modes. For
this, we define an extended version of the matrix C as8
(A.21)It takes some pen and
paper work to show that one can rewrite the first term in the right hand side of
Eq.~(\ref{eq:oemodified}) as
(A.22)Two final
definitions are the modal command and the mode measurement noise
(A.23)(for
both, the last component is 0). The full modal equation is finally given by
(A.24)We note that
the covariance matrix of wQ
is
ΣQ,w = H Σw H⊤.
Appendix B: Details of four-telescope identification
We denote by Py the periodogram of a
POL-data sequence on a single baseline, i.e. as in Eq. (23). In addition, we define S as the power spectrum
associated with the periodogram Py. Under the
assumption of Gaussian white noise excitation, it can be shown that, for fixed frequencies
f, the distribution of
Py(f) asymptotically
approaches an exponential distribution of expectation value
S(f). In other words, for long identification
sequences, the likelihood function L of Py is
close to (B.1)where
λ denotes all model parameters of S.
Averaging three exponential distributions leads to a Gamma-Erlang distribution of order
three. Hence, denoting by the
averaged periodogram, the periodogram likelihood function in Eq. (B.1) is modified into
(B.2)At a fixed frequency
f, the expectation value of the averaged periodogram remains the same,
whereas the standard deviation is reduced by a factor
.
The smaller standard deviation is the major advantage of periodogram averaging: the power
spectrum is more accurately estimated, and even low vibration peaks become significant.
The spectral-fit method then requires the following three modifications:
-
1.
The maximum-likelihood criterion needs to be adapted to thelikelihood function in Eq. (B.2).
-
2.
The threshold for vibration-peak detection (i.e. peak-picking criterion) can be lowered, since lower peaks become significantly detectable. More concretely, we choose the detection criterion Py(f) > 4 S(f) to detect vibration peaks (probability of 0.05% per point for false detection), rather than the detection criterion Py(f) > 7 S(f) for non-averaged periodograms (probability of 0.1% per point for false detection).
-
3.
A more subtle point is the characterization of the measurement-noise covariance matrix ΣQ,w (see Eq. (36)). In the standard MPFK method, applicable to single-baseline data, the measurement error is determined from the horizontal tail of the POL-data periodogram. In modal periodograms, the information of the different baselines is mixed. As a solution, we can first extract the measurement errors of the 6 OPD periodograms. Alternatively, we can simply combine the measurement errors estimated by the phase sensor (see Eq. (59)) into a global measurement error (one for each baseline). We note that the measurement noise of the average modal periodogram is calculated by summing the diagonal of ΣQ,w and dividing by three (i.e. averaging over the three modes).
Appendix C: Instantaneous-gain definition
The calculation of an arbitrary Kalman gain matrix G∞ (e.g. as in Eq. (18)) is of the form (C.1)where
Σ∞ is calculated by solving a Riccati equation. It is clear that the gain
depends on a measurement covariance Σw. Every Kalman step
should ideally involve the calculation of a new gain, to compensate for the variable
measurement noise (flux dropouts). As this would be a heavy computational burden, we
propose adding an artificial term to the gain.
It is clear from Eq. (C.1) that the gain
is inversely related to the Σw (neglecting that it is also
used to calculate Σ∞). It then makes sense to define an instantaneous gain
G∞,n as a quantity of the form (C.2)where
f is a real function. The matrix Σw,n is
defined as the instantaneous measurement covariance at time step n. For
flux of the usual level of noise, the definition of G∞,n
implies that we are tracking using the optimal gain G∞ (the multiplicative term
is 1). Whenever the instantaneous measurement noise increases, the multiplicative factor
must ensure that the gain is decreased. In the extreme case when the measurements are very
noisy and hence unusable, f can be chosen such that the gain approaches
0. In that case, the Kalman filter is based completely on the deterministic part of the
evolution.
For the modal and the OPD Kalman filter, we take the instantaneous-gain definitions as in Sect. 5.2. It is important to understand the difference between the modal and the OPD scheme that led to Eqs. (49) and (50). In the modal scheme, each mode estimate is limited by the noise on the worst OPD triplet (by triplet, we mean three baselines that are connected to the same telescope). This is because these modes are defined as mixes of OPDs associated with all telescopes. A first order multiplicative correction of the gain thus preferably contains a measure for all error terms, hence we consider the trace of the covariance matrix, neglecting the off-diagonal terms. For the OPD scheme, however, each Kalman filter tracks a variable in an independent way. The gain of the different Kalman filters can therefore be adapted purely as a function of the noise on the corresponding variable. Again neglecting the coupling terms, the gain associated with a baseline ij is modified by the corresponding term in the error covariance matrix. In this way, we avoid that the flux problems for one telescope propagating to the tracking on baselines not associated with that telescope.
We remark that the discussion that led us to Eqs. (49) and (50) is probably not the end of the story. It is possible that more elegant “instantaneous gain” definitions exist. Yet, these definitions would likely have a similar form to our choice.
All Tables
Simulation parameters, representative for the Galactic-center observation of GRAVITY.
All Figures
![]() |
Fig. 1 Time diagram of the fringe tracking process. |
In the text |
![]() |
Fig. 2 A general perturbation power spectrum consists of a turbulence spectrum, a flat (measurement) noise spectrum and vibration peaks. |
In the text |
![]() |
Fig. 3 Graphical representation of the four pistons (top) and the four modes (bottom). The four telescopes are represented as four pixels per square. Grey zones denote the zero-positions, black zones denote positive displacement, and white zones equal negative displacement. The modes could for example be labeled as tilt, twist, tip, and global piston. They form an equivalent to Zernike modes in AO, but for a four-telescope interferometer. |
In the text |
![]() |
Fig. 4 Power spectra used to simulate the perturbations on the four pistons (four colors). The turbulence-only spectrum is indicated by the dashed line. |
In the text |
![]() |
Fig. 5 Ten-second extract of a normalized throughput sequence for one of the pistons. |
In the text |
![]() |
Fig. 6 Distribution of residual OPD values resulting from 200 simulations with turbulence as the only OPD disturbance. The plots show the number of residuals per block of 2 nm. Both Kalman-filter schemes are considered, with instantaneous (I) and fixed (F) gains. |
In the text |
![]() |
Fig. 7 Result of applying the spectral-fit method to an average mode periodogram (top) and one of the six OPD periodograms (bottom). The statistical spread in the periodogram points is clearly smaller than in the average periodogram, in other words the power spectrum is more accurately estimated. |
In the text |
![]() |
Fig. 8 Distribution of residual OPD values resulting from 200 simulations with full perturbation contribution. The plots show the number of residuals per block of 10 nm. Both Kalman-filter schemes are considered, with instantaneous (I) and fixed (F) gains. In black dashed and dotted lines, we overplot the corresponding distributions for simulations without flux dropouts. |
In the text |
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.