A&A 378, 316-326 (2001)
DOI: 10.1051/0004-6361:20011167
T. A. Carroll - J. Staude
Astrophysikalisches Institut Potsdam, Telegrafenberg, Sonnenobservatorium Einsteinturm, 14473 Potsdam, Germany
Received 2 July 2001 / Accepted 10 August 2001
Abstract
We investigate the application of artificial neural networks (ANNs) for the
interpretation of Stokes profiles. We have employed ANNs to approximate the
nonlinear inverse mapping between the Stokes profiles and some of the underlying
atmospheric parameters. This approximate model is then used
to carry out a fast non-iterative inversion of synthetic Stokes profiles. We have used
synthetic Stokes profiles of the photospheric infrared line Fe I 15648 Å to
demonstrate that the ANNs are capable of yielding accurate single-valued estimates of the
complete magnetic field vector, line-of-sight (LOS) velocity, microturbulence,
macroturbulence and the filling factor with exceptional speed. For a stratified atmosphere
we also demonstrate that these single-valued parameters represent very good averaged
values of the input stratification.
To retrieve some of the temperature information encoded in the Stokes profiles
we modeled a neural network classifier on the basis of several
semi-empirical model atmospheres (i.e. temperature and pressure stratification).
With this classifier we are able to determine the probability that a given
Stokes profile originates from a particular temperature stratification of a
semi-empirical model.
Key words: line: formation - line: profiles - Sun: photosphere - Sun: magnetic fields - method: data analysis
The reliable determination of magnetic fields and the thermodynamic structure of the solar atmosphere is still crucial for our understanding of many complex processes in and outside magnetic active regions. The inference of atmospheric parameters from Stokes profiles is a nonlinear inverse problem. Typically, such problems are solved by linearizing an appropriate forward model, computing the sensitivities and then iteratively solving a regularized optimization problem. In this sense most of the inversion procedures for Stokes profiles are based on a nonlinear least-squares minimization (e.g. the Levenberg-Marquardt algorithm) (Auer et al. 1977; Landolfi et al. 1984; Skumanich & Lites 1987). All of these use a simplified forward model (the Milne-Eddington model) in order to obtain an analytical solution for the fast evaluation of the required derivatives. The most advanced Stokes profile inversion so far was introduced by Ruiz Cobo & del Toro Iniesta (1992). This inversion code is based on response functions and does not rely on the analytical Milne-Eddington solution; it is therefore able to retrieve height-dependent information within a reasonable time.
However, the improving techniques in spectro-polarimetry and the improving spatial resolution, especially of future projects like GREGOR (von der Luehe et al. 1999), SOLIS (Keller 1998) or SolarB (Lites 2000), provide us with huge amounts of polarimetric data. This emphasizes the need for a fast and stable automated analysis and interpretation of Stokes profiles. In this sense the new inversion method described by Socas-Navarro et al. (2001), which operates in the low-dimensional eigenfeature space produced by a principal component analysis (PCA) of Stokes profiles, bypasses the iterative behaviour and seems to be a very promising alternative and supplement to existing inversion methods. In this paper we introduce another alternative to an iterative solution: a direct inversion based on the approximation of the nonlinear inverse mapping between the Stokes parameters and some of the underlying atmospheric parameters. The inversion based on this approximate inverse mapping proved to be stable, accurate and extremely fast.
Artificial neural networks have already been successfully applied in many different astrophysical fields, such as the classification of stellar spectra (Bailer-Jones et al. 1998; Weaver & Torres-Dodgen 1997; Gulati et al. 1994), the retrieval of stellar parameters from low-resolution spectra (Bailer-Jones 2000), the time series analysis of solar active regions (Calvo et al. 1995) or the prediction of the sunspot index (Fessant et al. 1996).
We have used one of the most popular types of ANNs, the Multi-Layer Perceptron (MLP),
which is a general model for approximating nonlinear multivariate functions. To adapt
(i.e. train) our MLP to the inverse problem we take advantage of the fact that we have
a good knowledge of the forward problem (i.e. the polarized radiative transfer) in
the case of magnetically sensitive absorption lines formed under the conditions of local
thermodynamic equilibrium (LTE). Thus we are able to generate a sufficiently large database
of synthetic Stokes profiles for the training process of the MLP.
We have applied this model in the following work to simulated observations of the
infrared (IR) line Fe I 15648 Å.
Once the MLP is trained and has found a good approximation of the
inverse mapping it can recover the complete magnetic field vector, the line of sight
velocity, the microturbulence, the macroturbulence and the filling factor as well as an
estimate of the temperature stratification with exceptional speed.
This paper is organized as follows: in Sect. 2 we describe the general structure and properties of a Multi-Layer Perceptron. In Sect. 3 we investigate the capabilities of the MLP to retrieve some of the temperature information encoded in the Stokes profiles. We consider whether an MLP can recognize a specific signature in a given Stokes profile in order to distinguish between the temperature structures of different semi-empirical model atmospheres. Extensive statistical tests are made to evaluate the performance of the MLP. Section 4 deals with the inference of various atmospheric parameters from the Stokes parameters. The training of the MLPs to approximate the inverse relation and the statistical tests of the trained MLPs are described. Because our estimates are single-valued we consider, in Sect. 5, the influence of magnetic field strength and LOS velocity gradients on the MLP calculations. To assess the results we use the concept of heights of formation (HOFs) defined by Sanchez Almeida et al. (1996) to demonstrate that the calculations of the MLP represent very good averaged values of the stratified atmosphere. In Sect. 6 we employ the MLP model to retrieve estimates of the filling or stray light factor. We simulate a simple two-component atmosphere to demonstrate that a trained MLP can yield reliable estimates of the fraction of the magnetic component. In Sect. 7 we give a short comment on the remarkable speed of the inversion. Finally, in Sect. 8, we summarize the main conclusions of this work and give an outlook on some future applications.
The adaptive model for approximating the inverse relationship between the Stokes profiles
and the physical parameters is a fully connected two-layer MLP
which is a practical framework for approximating arbitrary nonlinear
multivariate functions.
The general adaptive function, which is represented by a fully connected two-layer MLP,
is given by
[Equation (1)]
[Equation (2)]
[Equation (3)]
Once the network is successfully trained in terms of minimizing the error function (3) the network weights are frozen and the network, which represents a nonlinear function, can be applied to unknown data (application mode).
In the following work we make use of two theoretical concepts for the Multi-Layer Perceptron. It can be shown that the MLP network function (1) can in principle approximate any continuous functional mapping to arbitrary accuracy (Funahashi 1989; Hecht-Nielsen 1989; Hornik et al. 1989). In practice an arbitrarily good approximation of the intrinsic mapping is never reached, due to the limited size (i.e. degrees of freedom) of the network function in terms of weights, the limited size of our training database and the possible inability of the optimization algorithm to find the global minimum within the high-dimensional model parameter space of the network function (1). Nevertheless, in many cases we can get a satisfying result and a good approximation of the intrinsic mapping for a wide range of possible input parameters. The other remarkable theoretical result is that an MLP trained as a classifier is able to approximate the Bayesian posterior probability (Richard & Lippmann 1991), which makes it a powerful tool in classification problems.
Both parameter estimation (regression) and classification can be seen as particular cases of function approximation (Richard & Lippmann 1991). In a regression problem (i.e. parameter estimation) the task is to find an approximation of an unknown mapping on the basis of a continuous input and target relation. In a classification problem the task is to assign inputs to a number of discrete and possibly mutually exclusive classes or categories, to yield an approximation of the underlying discriminant function.
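As an illustration of the network function and error measure referred to above, the following is a minimal NumPy sketch of a fully connected two-layer MLP. The tanh hidden units, the linear output units, the sum-of-squares error and the hidden-layer size of 20 are assumptions made for this sketch; they are not taken from Eqs. (1)-(3), and the function names are hypothetical.

```python
import numpy as np


def init_mlp(n_in, n_hidden, n_out, seed=0):
    """Create randomly initialised weights for a two-layer MLP (illustrative only)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }


def mlp_forward(params, x):
    """Map input patterns x (n_patterns, n_in) to outputs (n_patterns, n_out)."""
    # Assumed hidden activation: tanh; assumed output activation: linear.
    hidden = np.tanh(x @ params["W1"].T + params["b1"])
    return hidden @ params["W2"].T + params["b2"]


def sum_of_squares_error(outputs, targets):
    """Assumed error measure: half the summed squared deviation from the targets."""
    return 0.5 * np.sum((outputs - targets) ** 2)


# 162 inputs (two Stokes parameters at 81 wavelength points) and 5 outputs.
params = init_mlp(n_in=162, n_hidden=20, n_out=5)
outputs = mlp_forward(params, np.zeros((3, 162)))
```

Once training has fixed the weights, applying the network to new data is just the call to `mlp_forward`, which is why the application mode is fast.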
The temperature is the dominating factor in the radiative transfer process; it affects
the appearance of the resulting Stokes spectra in various ways. The paramount role
of the temperature can be demonstrated in terms of a first-order perturbation of the
radiative transfer equation (e.g. response functions). The temperature response of the
Stokes parameters, Fig. 1, is at least an order of magnitude higher than the responses to other parameters.
The resulting variation of the Stokes vector, for each wavelength, due to a perturbation
of different parameters in optical depth can be expressed to first order as
[Equation (4)]
[Equation (5)]
To avoid possible problems for the convergence and accuracy of the MLP during the training phase, caused by the dominating effect of the temperature, we decided to separate the problem of estimating the temperature information from the inference of the other atmospheric parameters. To retrieve some of the information about the underlying temperature we analyse, in a first step, the Stokes profiles with a neural network classifier. The classifier determines possible similarities to a set of semi-empirical model atmospheres. The main source of information for a classification of the underlying temperature structure is the variation of the complete intensity spectrum; Fig. 1 shows the temperature response for the Stokes I parameter. Since the response functions for the line cores have the slowest trend to zero with height, the line depth variation relative to the continuum also indicates a relative variation of the temperature gradient to the classifier (Figs. 1 and 2). To retain most of the information of the intensity variation we did not normalize our spectra by the local continuum; instead we have used the HSRA quiet-sun atmosphere from Gingerich et al. (1971) as an independent reference model.
Figure 1: The temperature response function of Stokes I for the Fe I
Figure 2: The line center temperature response function of Stokes I for the Fe I
In this section we describe the basics of the training process for the classifier. To yield an estimate of the temperature structure from a given Stokes profile we train our MLP model on the basis of nine semi-empirical model atmospheres (temperature and pressure stratification). We have used three basic semi-empirical models: the umbral model M from Maltby et al. (1986), the penumbral model from Ding & Fang (1989) and the quiet-sun model from Holweger & Mueller (1974). From each basic model we have calculated two variations in hydrostatic equilibrium with a uniform temperature offset of plus 200 and minus 200 Kelvin, respectively. The temperature stratifications of the resulting nine model atmospheres are shown in Fig. 3. With this set of atmospheres we get a coarse grid of possible temperature structures. Our classifier should be able to find similarities of a given profile to a set of trained profiles which were calculated on the basis of one of the nine semi-empirical model atmospheres. For each model the similarities are expressed in terms of a likelihood that this particular model has been responsible for producing the observed profile.
To create the database for the classification network we have used the DIAMAG synthesis
code (i.e. the forward model) from Grossmann-Doerth (1994), which implements a numerical
integration of the radiative transfer equation (RTE) by means of the diagonal element lambda
operator (DELO) method (Rees et al. 1989).
For the photospheric Fe I 15648.5 Å line we calculated the Stokes
parameters in a wavelength range of -160 pm to +160 pm around the line center, with a
sampling of 4 pm, such that for each discrete Stokes parameter there are 81 wavelength
points available. For the classification of the temperature model we have used the
Stokes I and V parameters only because we did not expect any further information
about the temperature variation from Q and U. That means, we have used 162 data
points as an input vector to our classification network. The simulated profiles are assumed
to be normalized to the HSRA quiet-sun atmosphere from Gingerich et al. (1971).
On the basis of the nine semi-empirical model
atmospheres we produced a data set of 9000 Stokes profiles for the training set (1000 for each model
to account for an equal prior probability) and 9000 Stokes profiles for a validation set to check
the performance of the network with an independent data set during the training process.
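The sampling arithmetic above can be checked with a short sketch: a grid from -160 pm to +160 pm in steps of 4 pm has 81 points, and concatenating Stokes I and V gives the 162-element input vector. The function name and the flat placeholder profiles are hypothetical and serve only to show the shapes.

```python
import numpy as np

# Wavelength offsets from line centre in pm: -160, -156, ..., +160.
wavelengths_pm = np.arange(-160, 160 + 4, 4)
n_points = wavelengths_pm.size


def build_input_vector(stokes_i, stokes_v):
    """Concatenate the sampled Stokes I and V profiles into one input vector."""
    return np.concatenate([stokes_i, stokes_v])


# Placeholder profiles (not synthesized spectra), used only to check the shapes.
stokes_i = np.ones(n_points)
stokes_v = np.zeros(n_points)
input_vector = build_input_vector(stokes_i, stokes_v)
```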
Figure 3: Temperature stratification of the semi-empirical models: Maltby, Maltby+200, Maltby-200 (solid line), DingFang, DingFang+200, DingFang-200 (dotted line), Holweger, Holweger+200, Holweger-200 (dashed line).
To take into account the full variability of all the other free atmospheric parameters, a random generator first determined the atmospheric values of a single-valued atmosphere for the magnetic field strength, inclination, azimuth, LOS velocity, microturbulence and macroturbulence. The values are taken from the intervals given in Table 1. In a second step another random generator picks out the temperature and pressure stratifications from one of the nine semi-empirical models. With this procedure we calculated the entire database for the training and the validation samples. To meet the demands of a more realistic inversion we added white noise to the synthetic Stokes profiles with a signal-to-noise ratio of 1000 (relative to the continuum). We applied a linear transformation (zero mean and unit standard deviation for each wavelength point) with respect to the training database. For the output variables of the MLP we adopt a typical 1-of-c coding scheme for the classification networks. The number c of output elements of the MLP corresponds to the number of categories we want to distinguish. We have used nine semi-empirical temperature models which means that each of our MLPs has nine output elements and the training and validation samples are also associated with nine target values. If a given pair of I and V profiles is formed under a specific temperature model, the corresponding target value in the training set is set to 1 (i.e. 100% probability) and the other eight are set to zero.
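The preprocessing steps described above (added noise, standardization of each wavelength point with respect to the training database, and 1-of-c target coding) can be sketched as follows. Gaussian noise with a standard deviation of continuum/SNR is an assumption of this sketch, as are all function names; the random array stands in for real synthetic profiles.

```python
import numpy as np


def add_white_noise(profiles, rng, continuum=1.0, snr=1000.0):
    """Add noise with standard deviation continuum / snr (Gaussian noise is assumed)."""
    return profiles + rng.normal(0.0, continuum / snr, profiles.shape)


def fit_standardizer(train_profiles):
    """Mean and standard deviation of each input element over the training set."""
    mean = train_profiles.mean(axis=0)
    std = train_profiles.std(axis=0)
    std[std == 0.0] = 1.0  # guard against constant input elements
    return mean, std


def standardize(profiles, mean, std):
    """Apply the linear transformation fitted on the training database."""
    return (profiles - mean) / std


def one_of_c(labels, n_classes):
    """1-of-c coding: target 1 for the true class, 0 for all other classes."""
    targets = np.zeros((labels.size, n_classes))
    targets[np.arange(labels.size), labels] = 1.0
    return targets


rng = np.random.default_rng(0)
train = add_white_noise(rng.random((1000, 162)), rng)  # placeholder profiles
mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
targets = one_of_c(np.array([0, 4, 8]), n_classes=9)
```

Validation and test profiles would be transformed with the same `mean` and `std`, since the transformation is defined with respect to the training database.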
| Atmospheric Parameter | Range |
| Mag. Field Strength | |
| Inclination | |
| Azimuth | |
| LOS Velocity | |
| Microturbulence | |
| Macroturbulence | |
The training data set is then used to train different network architectures (with different numbers of hidden layers and elements in these layers) with a standard error Back-propagation algorithm (Rumelhart et al. 1986), together with a regularized conjugate gradient algorithm.
If the MLP has found a good approximation of the
underlying discriminant function of the classification problem, the activations of the output units
should provide a direct estimate of the posterior probabilities for a particular
class (i.e. model atmosphere) C_k given the Stokes profile (Richard & Lippmann 1991; Bishop 1995).
Since our classes are not mutually exclusive, which means that an observed profile may carry attributes from
temperature stratifications of different semi-empirical models, the MLP classifies multiple independent attributes.
Each single output activation (i.e. class activation) represents the probability that the attributes
of a particular temperature model are present.
The network which shows the best performance in the training
session in terms of the error function (3) was taken to statistically determine the performance of
the network. The free model parameters (i.e. weights) of the MLP are frozen and the MLP is now in
application mode.
In order to test the performance of the MLP in identifying the most probable model atmosphere for
a given Stokes profile we have calculated a test database of 18000 Stokes profiles (2000
for each semi-empirical model). The interpretation of the network results follows the
Bayesian approach for classification, where minimizing the probability of misclassification is
achieved by the selection of the largest posterior probability (Bishop 1995).
In this way we interpret the largest output activation as a decision for the model
which is represented by that particular output unit.
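The decision rule above amounts to picking the output unit with the largest activation. A minimal sketch follows; the ordering of the output units and the activation values in the example are hypothetical.

```python
import numpy as np

# Assumed ordering of the nine output units (hypothetical).
MODEL_NAMES = [
    "Maltby-200", "Maltby", "Maltby+200",
    "DingFang-200", "DingFang", "DingFang+200",
    "Holweger-200", "Holweger", "Holweger+200",
]


def classify(activations):
    """Return the model with the largest output activation and that activation."""
    index = int(np.argmax(activations))
    return MODEL_NAMES[index], float(activations[index])


# Hypothetical output activations for a single pair of Stokes I and V profiles.
example = np.array([0.02, 0.90, 0.08, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
decision = classify(example)
```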
By interpreting the network results in this way the MLP could identify the correct semi-empirical model in all 18000 cases. For every single pair of Stokes I and V in the test database the neural network classifier was able to determine the correct underlying temperature structure. Figure 4 shows the distribution of the probabilities (output activation) for the corresponding correct model. The summed probability of all other output elements (classes) is in all cases less than 5%, which indicates the high certainty of the MLP decisions. The MLP has found distinct characteristics for each temperature model in the Stokes profiles, even though the other free parameters (magnetic field strength, inclination, azimuth, LOS velocity, microturbulence and macroturbulence) are randomly changed.
Figure 4: Calculated distribution of the network output activity for the true semi-empirical model.
Figure 5: Temperature stratification of the Kollatschny model (solid line) and the three trained umbral models, Maltby-200, Maltby, Maltby+200 (all dashed lines).
In a realistic application, none of the nine temperature models will exactly match the original temperature stratification of a given observed Stokes profile. As an example of how the MLP classifier reacts in the presence of a profile which originates from an unknown temperature stratification, we consider a set of Stokes profiles which were calculated on the basis of the umbral model of Kollatschny et al. (1980). The temperature stratification of the Kollatschny model is shown in Fig. 5 (solid line) together with the three trained umbral models Maltby, Maltby+200, Maltby-200 (dashed lines). Since the temperature stratification of the Kollatschny model agrees to a varying degree with these three umbral models, depending on the optical depth, we expect that profiles calculated for the Kollatschny model comprise attributes from all three umbral models. We calculated a set of 2000 Stokes I and V profiles for the Kollatschny model; the atmospheric parameters for each calculation are again chosen by a random generator from the parameter intervals given in Table 1.
| Model | Average Output |
| Maltby-200 | 0.011 |
| Maltby | 0.868 |
| Maltby+200 | 0.123 |
| DingFang-200 | 0.000 |
| DingFang | 0.000 |
| DingFang+200 | 0.000 |
| Holweger-200 | 0.000 |
| Holweger | 0.000 |
| Holweger+200 | 0.000 |
Again we get a very impressive result: in all 2000 cases only the output units which represent
the three umbral models showed activation; all the others remained silent. In all 2000 test cases the
MLP classifier clearly favours the original Maltby atmosphere. In Table 2 we list the
average activation of each output unit for all 2000 test profiles. The original Maltby
model has an average activation of 0.868 (or 86.8% probability
that attributes from that temperature model are present), the Maltby+200 shows an average activation
of 0.123 (12.3%) and the Maltby-200 has an activation of 0.011 (1.1%). This clearly shows
that the classifier has found most similarities with Stokes profiles that were calculated under the
original Maltby model.
From Fig. 2, the line center response of Stokes I, we can see that the classifier does indeed provide
reasonable results: the Stokes I line center response has its highest values in a depth range of
about
to
right where the temperature stratification of the
Kollatschny model runs just slightly above the temperature stratification of the original Maltby model
(Fig. 5).
Therefore the original Maltby temperature model can be used as an appropriate approximation of the
Kollatschny model. Note that this is only valid for the considered Fe I 15648 Å line; a
different line, with a different temperature sensitivity in optical depth, might be better approximated by a
different temperature model.
Although the grid of available semi-empirical models is very coarse and should be increased in a realistic application to better constrain the real temperature stratification, the simple neural classifier has demonstrated that it can yield fast and reliable classifications of the underlying temperature model.
One of the major goals of the Stokes profile inversion is the inference of the magnetic
field and dynamic structure of the solar atmosphere.
The Stokes parameter formalism provides a description of the Zeeman-induced splitting
and polarization properties of magnetically sensitive absorption lines. A formal solution of the polarized
RTE given by Landi Degl'Innocenti & Landi Degl'Innocenti (1985) represents the forward model
from which we want to infer some of the underlying atmospheric parameters. The emergent
(observed) Stokes vector
for a particular wavelength at,
is given by
[Equation (6)]
For each semi-empirical model atmosphere we have trained an MLP as a regression network. We again
calculated a training database and a validation set for each network with the DIAMAG synthesis code
for the Fe I 15648 Å line. The training set and validation set consist of 8000 patterns (Stokes profiles to physical parameters).
The Stokes profiles are calculated
in a wavelength range of -160 to +160 pm around the line center, with a sampling
of 4 pm, to make 81 wavelength points available for each discrete Stokes parameter.
In order to constrain the number of free parameters of the MLPs we
only make use of Stokes I and V for the determination of the magnetic field strength,
inclination, LOS velocity, microturbulence and macroturbulence and have trained an
extra network for the determination of the field azimuth which uses the Stokes Q and U parameters.
Thus each regression network has an input vector of 162 elements. The networks for the determination
of the azimuth angle have one single output while the others have five, one for each parameter.
To account for an appropriate representation of the inverse mapping, for each calculated Stokes profile
a random generator determines the values for the different free atmospheric parameters from the intervals
given in Table 1.
Once more noise with a signal-to-noise
ratio of 1000 was added and a linear transformation of the input values (zero mean and unit standard deviation
for each wavelength point) with respect to the training database was applied.
The training data set is used to train different network architectures with a
standard error Back-propagation algorithm together with a regularized conjugate gradient
algorithm.
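The random selection of atmospheric parameters for each training pattern can be sketched as below. Uniform sampling is an assumption of this sketch. The field-strength interval is the trained range of 500 G to 3500 G quoted with the results; all other intervals are hypothetical placeholders and are not the values of Table 1.

```python
import numpy as np

# Only the field-strength interval is taken from the text; every other interval
# below is a hypothetical placeholder, not a value from Table 1.
PARAMETER_RANGES = {
    "field_strength_G": (500.0, 3500.0),
    "inclination_deg": (0.0, 180.0),         # hypothetical
    "los_velocity_m_s": (-3000.0, 3000.0),   # hypothetical
    "microturbulence_m_s": (0.0, 2000.0),    # hypothetical
    "macroturbulence_m_s": (0.0, 3000.0),    # hypothetical
}


def sample_parameters(n_profiles, ranges, rng):
    """Draw each parameter independently and uniformly from its interval."""
    names = list(ranges)
    low = np.array([ranges[name][0] for name in names])
    high = np.array([ranges[name][1] for name in names])
    return names, rng.uniform(low, high, size=(n_profiles, len(names)))


rng = np.random.default_rng(1)
names, targets = sample_parameters(8000, PARAMETER_RANGES, rng)
```

Each row of `targets` would be passed to the forward synthesis code to produce one Stokes profile, giving one training pattern per row.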
To keep the description of the results concise, we demonstrate the good performance by presenting the results of the two MLPs which were trained on the basis of the original Maltby atmosphere. The MLPs trained on the other semi-empirical temperature models show equally good results in terms of root-mean-square (rms) error and error distribution. Therefore we consider in the following two MLPs, one for the estimation of the magnetic field strength, inclination, LOS velocity, microturbulence and macroturbulence (MLP-M-5 hereafter) and another one which only retrieves an estimate of the azimuthal angle of the magnetic field (MLP-M-1 hereafter). As for the classification network, we have chosen the network which shows the best performance in the training session in terms of the error function (3) to statistically determine the performance of the inversion with an independent test data set.
For the independent test database we have calculated 10000 Stokes profiles under the same conditions as the training data set. Since we are now in the application mode all the free parameters of the network (i.e. weights) are frozen and the MLP represents a simple and fast to evaluate nonlinear function. The presentation of a Stokes I and V profile to the MLP-M-5 network yields a simultaneous calculation of all 5 atmospheric parameters within a fraction of a second.
For the complete test data set of 10000 Stokes profiles we obtain an rms error for the magnetic field strength of 24.02 Gauss. The distribution of the error and a plot of the calculated field strength values versus the true (known) values are shown in Fig. 6. These very good results demonstrate that the MLP could recognize the characteristics of the magnetic field strength within the Stokes I and V profiles. The mapping of the profiles to the field strength values was performed with high accuracy (see Fig. 6, left). From the right plot in Fig. 6 we see that over the whole trained field strength range (500 G to 3500 G) the estimated values lie very close to the ideal line, which indicates that the approximate mapping for the field strength has a very low variance over the entire trained parameter range.
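The rms error quoted here and in the following paragraphs is the root of the mean squared deviation between the MLP estimates and the known true values of the test set. A minimal sketch, with a hypothetical function name and hypothetical example values:

```python
import numpy as np


def rms_error(estimated, true):
    """Root-mean-square deviation of the estimates from the true values."""
    estimated = np.asarray(estimated, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean((estimated - true) ** 2)))


# Hypothetical field strengths in Gauss, used only to exercise the function.
example_rms = rms_error([1030.0, 1970.0], [1000.0, 2000.0])
```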
For the inclination angle we obtain an rms error of 2.01°.
The distribution of the error
for all 10000 test profiles is shown in Fig. 7. The variation of Stokes I and V due to a change
of the inclination angle is recognized and quantified remarkably well by the MLP. Over the entire
parameter range the inclination angles are very accurately estimated by the MLP.
Figure 6: Distribution of the deviation from the true field strength (left) and a plot of MLP calculated versus the true field strength (right). The calculated values are marked by a "+" sign.
Figure 7: Distribution of the deviation from the true inclination (left) and a plot of MLP calculated versus the true inclination angle (right).
The LOS velocity was also very accurately estimated by the MLP; the relative line shifts of the Stokes I and V profiles are easily detected by the MLP (Fig. 8). The determination of the LOS velocity for all 10000 Stokes I and V profiles yields an rms error of only 28.81 m/s.
Figure 8: Distribution of the deviation from the true velocities (left) and a plot of MLP calculated versus the true LOS velocities (right).
Another remarkable result is the fact that the MLP is able to disentangle the non-thermal broadening effects of microturbulence and macroturbulence. Although a small perturbation of either effect produces only a small variation in the observed Stokes profiles (i.e. small response functions), the MLP has learned to identify even fairly small variations. This is expressed in the fact that the rms error of the retrieved microturbulence, for all 10000 test profiles, has a value of only 103.74 m/s. The macroturbulence could even be determined with an rms error of 57.62 m/s. Figures 9 and 10 again show the distributions of the error.
Figure 9: Distribution of the deviation from the true microturbulences (left) and a plot of MLP calculated versus the true microturbulences (right).
Figure 10: Distribution of the deviation from the true macroturbulences (left) and a plot of MLP calculated versus the true macroturbulences (right).
For the test of the MLP-M-1 network, which determines the azimuthal angle, we have generated another
test database of 10000 Stokes Q and U profiles. The 6 free atmospheric parameters are again chosen from the
intervals shown in Table 1, with the exception that we have shortened the range of possible inclination angles.
The inclination angles are now chosen from
to
to
provide a sufficient signal in the Stokes Q and U components.
In Fig. 11 the error distributions for the complete test data set are shown. The determination of
the azimuth could not reach the accuracy of the inclination, but taking into account the presence
of the added artificial noise and the relative weakness of the Stokes Q and U signals compared to
Stokes I and V, the retrieval of the azimuthal angle for all test profiles is still remarkable.
The rms error of the azimuth for all 10000 test profiles is
.
Figure 11: Distribution of the deviation from the true azimuth (left) and a plot of MLP calculated versus the true azimuthal angle (right).
Since our MLP models for the inversion of Stokes parameters are trained on the basis of homogeneous
single-valued atmospheres for the magnetic field vector, LOS velocity, microturbulence and macroturbulence,
we are interested to see how the nonlinear approximation of the MLP reacts in the presence of a
profile which has originated from a stratified atmosphere.
To test this behaviour we calculated a set of 1000 Stokes profiles with different gradients in the
magnetic field strength and LOS velocity. We have adopted various linear dependencies of the field
strength and the velocities on log(τ). The gradients for the magnetic field strength are varied in
a range from 50 G/log(τ) to 500 G/log(τ). For the line forming region of the Fe I 15648 Å
line in the Maltby atmosphere this corresponds to a gradient of about 0.5 G/km to 5 G/km.
The gradients for the LOS velocity are varied in
a range from 20 m/s/log(τ) to 200 m/s/log(τ).
Apart from any physical meaning we adopted a positive and a negative gradient for both parameters. A variable
offset in
was also provided.
Since our expectation is that the network, trained on the basis of single-valued parameters, should provide
us with a kind of averaged value for the stratified atmosphere, we need a method which yields an
appropriate estimate of these averaged values.
For this reason, we follow the definition of heights of formation (HOFs) for a measurement from Sanchez Almeida
et al. (1996). They consistently relate the determination of the HOFs to the physical parameter under
consideration, the Stokes parameter and the underlying atmosphere.
This relation is expressed in terms of the response functions R_ij for the corresponding
Stokes parameter i and the considered physical parameter j.
To take into account both Stokes I and V and the whole wavelength range, +p to -p,
we calculated the mean depth, given in log(τ), with:
[Equation (7)]
We found extremely good agreement between the MLP calculations and the determined averaged field strength and velocity parameters. To illustrate how good this agreement is, Fig. 12 (left) depicts the deviations of the magnetic field strength retrieved by the MLP-M-5 from the calculated averaged field strength. Figure 13 (left) shows these deviations for the retrieved LOS velocities. For a better comparison of how these deviations are related to a difference in optical depth, for each retrieved field strength and velocity of the MLP-M-5 we determined the position in optical depth by an interpolation from the original field strength and velocity stratification. With this procedure we have calculated the deviation, in optical depth, of each retrieved field strength and velocity from the related HOFs. Figure 12 (right) and Fig. 13 (right) illustrate these deviations.
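The interpolation step described above can be sketched as follows: given a linear stratification in log(τ), find the depth at which the stratification takes the value retrieved by the MLP, then compare that depth with the HOF. The function name, the depth grid and all numerical values are hypothetical; a monotonic stratification is assumed, which holds for the linear gradients used here.

```python
import numpy as np


def depth_of_value(value, log_tau, stratification):
    """Interpolate the log(tau) at which a monotonic stratification equals value."""
    # np.interp needs increasing abscissae, so sort by the stratified quantity;
    # this also handles negative gradients.
    order = np.argsort(stratification)
    return float(np.interp(value, stratification[order], log_tau[order]))


log_tau = np.linspace(-3.0, 1.0, 41)       # hypothetical depth grid
field = 2000.0 + 300.0 * log_tau           # gradient of 300 G per log(tau)
retrieved_field = 2150.0                   # hypothetical MLP estimate in Gauss
hof_log_tau = 0.4                          # hypothetical height of formation

depth = depth_of_value(retrieved_field, log_tau, field)
deviation_from_hof = depth - hof_log_tau
```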
Figure 12: Distribution of the deviation of the MLP-M-5 results from the averaged magnetic field strength (left). Distribution of deviations from the HOFs (right).
Most magnetic structures outside of sunspots are represented by small unresolved
magnetic elements (i.e. magnetic flux tubes; e.g. Stenflo 1989). Therefore the estimation of
the fraction of the magnetic structure provides a valuable parameter and puts additional physical constraints
on the inference of the other atmospheric parameters. We have adopted a simple two-component atmosphere
which has a magnetic and a non-magnetic component. This simple two-component model is also a valid representation
of the contribution of unpolarized stray light. The observed Stokes vector is then given by
![]() |
(8) |
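The two-component description can be sketched as a linear mixture of the two Stokes vectors weighted by the filling factor. The snippet below is a minimal illustration under that assumption; the function name `mix_stokes` and the flat-list representation of a Stokes profile are hypothetical choices made for this sketch.

```python
def mix_stokes(filling_factor, stokes_magnetic, stokes_nonmagnetic):
    """Combine two Stokes profiles into one observed profile.

    filling_factor     : fraction of the magnetic component, in [0, 1]
    stokes_magnetic    : list of values from the magnetic component
    stokes_nonmagnetic : list of values from the non-magnetic
                         (or stray light) component
    Assumption: the observed signal is the area-weighted sum of both.
    """
    if not 0.0 <= filling_factor <= 1.0:
        raise ValueError("filling factor must lie between 0 and 1")
    if len(stokes_magnetic) != len(stokes_nonmagnetic):
        raise ValueError("both profiles need the same sampling")
    return [
        filling_factor * magnetic + (1.0 - filling_factor) * nonmagnetic
        for magnetic, nonmagnetic in zip(stokes_magnetic, stokes_nonmagnetic)
    ]


# Toy example: a continuum-like Stokes I point and one Stokes V point.
# The non-magnetic component contributes no Stokes V signal.
observed = mix_stokes(0.25, [1.0, 0.4], [1.0, 0.0])
```

Because the non-magnetic component carries no polarization signal, the observed Stokes V amplitude scales directly with the filling factor in this sketch, while Stokes I is a blend of both components.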
For the calculation of the training and validation data sets we assumed the temperature structure of the
quiet-sun model of Holweger & Mueller (1974) for the non-magnetic or stray light component, and for the
magnetic component we assumed the temperature structure of the Maltby et al. (1986) atmosphere.
The Stokes parameters for the different components are calculated on the basis of the intervals given in
Table 1. This means we have 6 free parameters for the magnetic component and 3 free parameters for
the non-magnetic component. For the resulting Stokes vector we have one additional free parameter, the filling
factor of the magnetic component, which is varied between 0.05 (5%) and 1.0 (100%).
Since we are now interested only in the fraction
of the magnetic or non-magnetic component, our MLP has just one output element, which represents the filling
factor.
The free parameters are again determined by a random generator to produce a training and
a validation set of 5000 synthetic Stokes profiles. As input we have used the Stokes I and V parameters.
Therefore we have excluded inclination values within a range of
to
to
the LOS to retain a sufficient Stokes V signal.
Figure 13: Distribution of the deviation of the MLP-M-5 results from the averaged LOS velocity (left). Distribution of deviations from the HOFs (right).
The training data set is then used to train different network architectures (with different numbers of hidden layers and elements in these layers) with a standard error back-propagation algorithm (Rumelhart et al. 1986), together with a conjugate gradient algorithm. After successful training, the network with the best performance in the training session is used to determine statistically the behaviour and accuracy of the estimates. For this purpose we calculated another test set of 5000 synthetic Stokes profiles under the same conditions as the training data set.
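To make the training step more concrete, the following is a minimal sketch of error back-propagation for a network with one hidden layer and a single output element. It is not the training code used here: it swaps in plain stochastic gradient descent for the conjugate gradient algorithm, and the network size, learning rate, toy data and the name `train_mlp` are all assumptions chosen for illustration.

```python
import math
import random


def train_mlp(samples, n_hidden=4, learning_rate=0.05, epochs=200, seed=0):
    """Train a one-hidden-layer MLP (tanh hidden units, linear output).

    samples : list of (input_list, target_value) pairs
    Returns the trained parameters and the mean squared error per epoch.
    Note: plain gradient descent, NOT conjugate gradient.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, target in samples:
            # Forward pass through hidden layer and output element.
            hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                      for row, b in zip(w1, b1)]
            output = sum(w * h for w, h in zip(w2, hidden)) + b2
            error = output - target
            squared_error += error * error
            # Backward pass: propagate the output error to every weight.
            for j in range(n_hidden):
                # Hidden-unit error uses the output weight before its update.
                delta_hidden = error * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= learning_rate * error * hidden[j]
                for i in range(n_in):
                    w1[j][i] -= learning_rate * delta_hidden * x[i]
                b1[j] -= learning_rate * delta_hidden
            b2 -= learning_rate * error
        history.append(squared_error / len(samples))
    return (w1, b1, w2, b2), history


# Toy regression problem standing in for (profile, parameter) pairs.
toy_samples = [([k / 10.0], 0.5 * k / 10.0) for k in range(-10, 11)]
parameters, error_history = train_mlp(toy_samples)
```

In a real application the inputs would be the sampled Stokes profiles and the target the atmospheric parameter of interest, with a separate validation set used to select the architecture.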
The whole test set of 5000 profiles was statistically investigated and does once more demonstrate the remarkable abilities of the MLP. The filling factor could be determined by the trained MLP with an rms error of only 0.021 (2.1%). Figure 14 shows again the distribution of the deviations from the true values (left) and a plot of the MLP estimates over the true filling factors (right) for all 5000 test profiles. Despite the variation of all the other free atmospheric parameters and the superimposed noise the trained MLP is able to determine the fraction of the magnetic component or non-magnetic component, respectively, with a very good accuracy.
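The rms error quoted above is the root of the mean squared deviation between the MLP estimates and the true values over the test set. A minimal sketch of that statistic follows; the function names and the toy numbers are illustrative assumptions and are not taken from the actual test set.

```python
import math


def deviations(estimates, truths):
    """Element-wise deviation of the estimates from the true values."""
    if len(estimates) != len(truths):
        raise ValueError("estimates and truths need the same length")
    return [estimate - truth for estimate, truth in zip(estimates, truths)]


def rms_error(estimates, truths):
    """Root-mean-square of the deviations over the whole test set."""
    residuals = deviations(estimates, truths)
    if not residuals:
        raise ValueError("empty test set")
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Toy example with two filling-factor estimates.
toy_rms = rms_error([0.5, 0.5], [0.4, 0.6])
```

The list returned by `deviations` is also what a histogram such as the left panel of Fig. 14 would be built from.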
Figure 14: Distribution of the deviation from the true filling factor (left). A plot of MLP calculated versus the true filling factors (right).
One of the striking features of the MLP is that, once it is adapted to the problem, the network parameters are frozen (application mode) and the network simply represents a mathematical function, see Eq. (1), which allows a fast evaluation either as a software program or as a hardware (VLSI) implementation. We implemented a software simulation of our neural networks (written in C++) which runs on a fairly ordinary Pentium III processor under the Linux operating system. The CPU time needed for the software-simulated MLPs to invert 10000 Stokes profiles (I and V or Q and U) is less than 10 s, which is many orders of magnitude faster than any conventional inversion routine based on an iterative optimization and still significantly faster than the PCA-based inversion of Stokes profiles proposed by Socas-Navarro et al. (2001).
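The reason for this speed is that a frozen network reduces to a fixed sequence of weighted sums and activation functions, with no iteration. Inverting 10000 profiles in less than 10 s corresponds to an average of under 1 ms per profile. The sketch below illustrates such a forward evaluation. It is written in Python rather than C++, and the layer layout, the tanh activation and the weight values are assumptions for illustration only.

```python
import math


def identity(x):
    """Linear activation for the output layer (assumed here)."""
    return x


def mlp_forward(inputs, layers):
    """Evaluate a frozen MLP on one input vector.

    layers : list of (weights, biases, activation) tuples, where
             weights[j] holds the input weights of unit j in that layer
    """
    signal = list(inputs)
    for weights, biases, activation in layers:
        signal = [
            activation(sum(w * x for w, x in zip(row, signal)) + bias)
            for row, bias in zip(weights, biases)
        ]
    return signal


# Hypothetical frozen parameters: 2 inputs, 2 hidden units, 1 output.
frozen_layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.1], identity),
]

estimate = mlp_forward([1.0, 1.0], frozen_layers)
```

Each profile costs one pass through this loop, so the total time grows only linearly with the number of profiles.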
We have investigated the ability of Multi-Layer Perceptrons to invert
Stokes profiles. We have used synthetic observations of the Fe I 15648 line. The use
of IR lines offers valuable advantages for the diagnostics of magnetic fields (Solanki 1992), but
the application of ANNs is not limited to IR lines: the application to visible lines
like Fe I 6302.5 and Fe I 5250.2 (Carroll et al. 2000)
gave results very similar to those presented in this paper.
The classification of Stokes profiles for the estimation of the underlying temperature structure on the basis of nine semi-empirical model atmospheres has demonstrated the remarkable classification properties of the MLP. Although the IR line has a rather small temperature sensitivity compared to visible lines, the neural classifier was able to find distinct features within the profiles and to determine the correct underlying atmosphere or, failing that, the most probable one. For a real application of that classifier the number of available semi-empirical models should be increased, but this is straightforward and poses no problem in principle for the MLP classifier.
For the inference of atmospheric parameters from the Stokes profiles we have trained an MLP on the basis of an appropriate numerical forward model. Despite the presence of artificial noise corresponding to a signal-to-noise ratio of 1000, the trained MLPs found a good approximation of the inverse mapping of the problem. For all free parameters (magnetic field vector, LOS velocity, microturbulence and macroturbulence) the estimates retrieved from the Stokes profiles are very close to the original true values.
Since the MLPs are trained on a single-valued atmosphere for the considered free parameters we investigated the results of the inversion in the presence of moderate gradients for the magnetic field strength and LOS velocity. These investigations reveal that the estimated single-valued parameters of the MLP do represent very accurate averaged values for the stratified atmosphere.
For the inference of an unresolved magnetic component or the presence of unpolarized stray light, an MLP could be trained on a two-component model and retrieved remarkably good estimates of the underlying fraction of the magnetic or non-magnetic component, respectively.
Although our investigations are based on synthetic observations, it was demonstrated that the Multi-Layer Perceptron, as one well-known type of artificial neural network, is a suitable model for the inversion of Stokes profiles. Because of their speed, noise tolerance and nonlinear adaptation abilities, MLPs and other supervised ANN models provide us with a powerful statistical tool to tackle the problem of an automated and stable inversion of large amounts of polarimetric data.
ANNs offer a variety of possible directions for the analysis and interpretation of Stokes profiles. In forthcoming work we will extend the application to a multi-line inversion to improve the diagnostic capabilities and to allow a possible inference of gradients for a stratified atmosphere. The coupling of ANNs with a conventional inversion procedure based on response functions is already in progress. For more complicated atmospheric structures (e.g. multi-component or interlaced atmospheres), which might contain inherent ambiguities, ANN-based modeling with a mixture of experts (Jacobs et al. 1991) or a mixture of models (Bishop 1995) offers a promising approach to deal in a probabilistic way with multi-valued inverse problems. Further prospects are the forward modeling of the polarized RTE with ANNs and a fast approximation of response functions. Another interesting application might be the auto-associative MLP (Kramer 1991) to perform a nonlinear dimensionality reduction; this offers a nonlinear extension of the linear decomposition of Stokes profiles (Rees et al. 2000), which is performed by a principal component analysis.
Acknowledgements
The authors gratefully acknowledge financial support of the present work by the German Federal Ministry of Education and Research through the German Space Research Center (DLR) under grant No. 50 QL 0003.