Issue: A&A, Volume 646, February 2021
Article Number: A41
Number of page(s): 12
Section: The Sun and the Heliosphere
DOI: https://doi.org/10.1051/0004-6361/202038617
Published online: 04 February 2021
A nonlinear solar magnetic field calibration method for the filter-based magnetograph by the residual network
1 Key Laboratory of Solar Activity, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, PR China
e-mail: guojingjing@bao.ac.cn
2 Yunnan Observatories, Chinese Academy of Sciences, Kunming 650216, PR China
3 Big Bear Solar Observatory, 40386 North Shore Lane, Big Bear City, CA 92314-9672, USA
e-mail: jkf@ynao.ac.cn
4 University of Chinese Academy of Sciences, Beijing 100049, PR China
Received: 9 June 2020
Accepted: 6 October 2020
Context. The solar magnetic field calibration for a filter-based magnetograph is normally carried out with the linear calibration method under the weak-field approximation, which cannot reproduce strong magnetic field regions well because of the magnetic saturation effect.
Aims. We try to provide a new method to carry out the nonlinear magnetic calibration with the help of neural networks to obtain more accurate magnetic fields.
Methods. We employed the data from Hinode/SP to construct a training, validation and test dataset. The narrow-band Stokes I, Q, U, and V maps at one wavelength point were selected from all the 112 wavelength points observed by SP so as to simulate the single-wavelength observations of the filter-based magnetograph. We used the residual network to model the nonlinear relationship between the Stokes maps and the vector magnetic fields.
Results. After an extensive performance analysis, it is found that the trained models could infer the longitudinal magnetic flux density, the transverse magnetic flux density, and the azimuth angle from the narrow-band Stokes maps with a precision comparable to the inversion results using 112 wavelength points. Moreover, the maps that were produced are much cleaner than the inversion results. The method can effectively overcome the magnetic saturation effect and infer the strong magnetic region much better than the linear calibration method. The residual errors of test samples to standard data are mostly about 50 G for both the longitudinal and transverse magnetic flux density. The values are about 100 G with our previous method of multilayer perceptron, indicating that the new method is more accurate in magnetic calibration.
Key words: magnetic fields / Sun: magnetic fields
© ESO 2021
1. Introduction
At present, the solar magnetic field is generally measured indirectly by means of the Zeeman effect (Hale 1908) of the magnetically sensitive solar spectral lines. According to the focal plane equipment, optical magnetographs can be classified as grating spectrograph systems (SGs) and filtergraph systems (FGs; Lin 2001; Iglesias & Feller 2019). Additionally, integral field solutions are being developed to address the inability of SGs and FGs to cover the desired spatial and spectral fields of view simultaneously (Iglesias & Feller 2019).
The advantage of the spectral magnetograph is its capability to obtain a primary spectrum, so the magnetic field is generally inferred by inversion techniques, which have been the most useful tool since the early 1970s (Skumanich & Lites 1987; Ruiz Cobo & del Toro Iniesta 1992; Socas-Navarro 2001; Asensio Ramos & de la Cruz Rodríguez 2015; Li et al. 2019). The inversion techniques fit the observed Stokes profiles with synthetic ones calculated from the solution of the radiative transfer equation for appropriate solar atmospheric parameters, such as the magnetic field and temperature (del Toro Iniesta & Ruiz Cobo 2016). The method estimates a highly accurate magnetic field at the expense of considerable computing resources, since the Stokes inversion of each pixel takes substantial computing time. Benefiting from improved computer capabilities, deep learning has developed greatly in recent decades (Goodfellow et al. 2016). Novel methods and techniques have therefore been tested on magnetic field inversion and have achieved some advanced results. Socas-Navarro et al. first utilized principal component analysis (PCA) to infer the magnetic field strength and inclination angle from Stokes profiles, which is extremely fast and stable (Rees et al. 2000; Socas-Navarro et al. 2001; Socas-Navarro 2001). Meanwhile, T. A. Carroll and J. Staude applied artificial neural networks (ANNs) to learn the mapping between Stokes profiles and solar atmospheric parameters and retrieved a good estimate of the magnetic field vector (Carroll & Staude 2001). Statistical machine-learning techniques based on Mercer’s kernel were proposed by Teng (2015) to invert the vector magnetic field. Convolutional neural networks (CNNs) were applied to Stokes inversion by Asensio Ramos & Díaz Baso (2019), Liu et al. (2020), and Milic & Gafeira (2020).
Those methods mostly employed numerical magnetohydrodynamic (MHD) simulation data for training and validation, and all of them worked on multi-wavelength polarimetric signals.
The traditional filter-based magnetograph commonly employs a single wavelength point in regular magnetic field observations to obtain high temporal resolution. One way to carry out magnetic calibration is via the linear calibration method based on a weak-field approximation by comparing results with the Stokes inversion from multi-wavelength points (Bai et al. 2013, 2014; Su & Zhang 2004a). The purpose of magnetic calibration is to obtain the appropriate coefficients between the Stokes parameters and the magnetic vector, that is, the magnetic flux density in the longitudinal direction (Bl), the magnetic flux density in the transverse direction (Bt), and the azimuth angle (φ). For a filter-based magnetograph, these quantities can be obtained by Eqs. (1)–(3) under the assumption of a weak field (Stenflo 1994). Here Cl and Ct are the linear calibration coefficients for Bl and Bt, respectively, that is:
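The equations referred to above are not reproduced in this text. As a reading aid, the form commonly used for filter-based magnetographs under the weak-field approximation is sketched below; the exact notation of the original Eqs. (1)–(3) is an assumption here.

```latex
\begin{align}
B_l &= C_l \,\frac{V}{I}, \tag{1}\\
B_t &= C_t \left[\left(\frac{Q}{I}\right)^{2} + \left(\frac{U}{I}\right)^{2}\right]^{1/4}, \tag{2}\\
\varphi &= \frac{1}{2}\arctan\!\left(\frac{U}{Q}\right). \tag{3}
\end{align}
```

In this form, Bl depends linearly on the circular polarization signal, which is why a single coefficient Cl cannot follow the field once the signal saturates.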
For the strong magnetic field region, the weak-field approximation is not satisfied due to the magnetic saturation effect. In this case, the linear calibration method has a large error when recovering the strong magnetic field. The Narrowband Filter Imager onboard Hinode (Tsuneta et al. 2008) fits separate linear calibration coefficients for different solar structures so as to avoid magnetic saturation in the strong magnetic field region (Chae et al. 2007). Prior calibration studies (Ai & Hu 1986; Zhang et al. 2007) for the Solar Magnetic Field Telescope (SMFT) installed at the Huairou Solar Observing Station (HSOS) also verified this conclusion. ANNs (Carroll & Staude 2001; Socas-Navarro 2003, 2005) and deep learning (DL) are now widely used (Díaz Baso & Asensio Ramos 2018; Kim et al. 2019; Zhao et al. 2019; Liu et al. 2019; Park et al. 2020). Along these lines, Guo et al. (2020) attempted to carry out magnetic field calibration for a filter-based magnetograph using a multilayer perceptron (MLP), which builds models to infer magnetic fields without considering the spatial correlation. In this paper, we try to use the state-of-the-art CNN method, which takes spatial correlation into account, to calibrate the magnetic field for a filter-based magnetograph.
The Full-disk MagnetoGraph (FMG; Deng et al. 2019), which is a filter-based magnetograph working on Fe I 532.42 nm onboard the Advanced Space-based Solar Observatory (ASO-S; Gan et al. 2019), is being designed by the National Astronomical Observatories, Chinese Academy of Sciences and is scheduled to be launched around 2022 to measure full-disk vector magnetic fields. The preliminary method for data reduction and calibration of the FMG has been studied by Su et al. (2019). This work is therefore helpful for the nonlinear magnetic field calibration of the FMG.
The paper is organized as follows. In Sect. 2, the process of obtaining a dataset is introduced. In Sect. 3, we describe the neural network structure and training strategies. In Sect. 4, we evaluate the accuracy and performance of the trained models using the test dataset. Finally, the conclusions are given in Sect. 5 with a discussion on the properties and the future work.
2. Datasets
The purpose of the study is to improve the quality and accuracy of magnetic field calibration for actual observations with a filter-based magnetograph working at a single wavelength point. The dataset is selected from Hinode/SP (Tsuneta et al. 2008) and the detailed data acquisition process is described in Guo et al. (2020). For a better comparison with the MagMLP models of Guo et al. (2020), we employed the same datasets to train the CNN models. The datasets contain 176 frames of active regions including level 1 data and level 2 data (Lites & Ichimoto 2013). Level 1 data include the Stokes I(λ), Q(λ), U(λ), and V(λ) polarized maps with 112 wavelength points. Level 2 data are the magnetic parameters coming from the Stokes inversion of the High Altitude Observatory (HAO) Merlin inversion code, which was developed by the Community Spectro-polarimetric Analysis Center. A brief description of the training set, validation set, and test set is listed in Table 1. The wavelength point was taken from the spectropolarimeter profiles at −0.063 Å from the line center of Fe I 6301 Å to simulate the observation of the filter-based magnetograph. That is because the first derivative of the spectral profile is relatively large at this wavelength, which makes it the point most sensitive to magnetic signals. Furthermore, in actual observations with a filter-based magnetograph, the observation wavelength is usually chosen according to this principle. The Stokes I, Q, U, and V maps at this wavelength are then taken as the input parameters. A sample of an active region in NOAA AR 12158 in the training set is presented in Fig. 1.
Table 1. Training set, validation set, and test set.
Fig. 1. Stokes I, Q, U, and V at −0.063 Å from the line center of Fe I 6301 Å as the input parameters. The data were observed at 10:47 UT on 2014 September 11 in NOAA AR 12158. These maps of Stokes parameters are the quantities normalized by Eq. (6) after data preprocessing.
From the level 2 data, we can directly obtain the magnetic field (B) by taking the filling factor (α), the inclination angle (θ), and the azimuth angle (φ) into account. A filter-based magnetograph is not suitable for diagnosing the filling factor (α) because of its limited spectral information. For example, the filling factor is not included in the data products of either HMI (Borrero et al. 2011) or SO/PHI (Mueller et al. 2019). We therefore used the longitudinal magnetic flux density (Bl) and the transverse magnetic flux density (Bt), without considering α or φ, as the target parameters. The corresponding Bl and Bt can then be obtained by Eqs. (4) and (5):
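Eqs. (4) and (5) are not reproduced in this text. A sketch of the projection they describe, assuming that B is the field strength, θ is the inclination to the line of sight, and the filling factor is left out as stated above:

```latex
\begin{align}
B_l &= B \cos\theta, \tag{4}\\
B_t &= B \sin\theta. \tag{5}
\end{align}
```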
A sample of them in NOAA AR 12158, inferred by the inversion code, is shown in Fig. 2.
Fig. 2. Bl and Bt (the main parameters we are concerned with in our study) without considering α or φ (azimuth angle) as the target parameters.
On the basis of an analysis of the influencing factors, the preprocessing of the input and target datasets includes data cleaning and unification of the input data. Certain data processing methods have been applied to select strong magnetic field regions and improve the signal-to-noise ratio (S/N). The Stokes I intensity was normalized by dividing by the median value of the quiet region itself. The components Q, U, and V were indirectly divided by the Stokes I to unify data. The equations are presented as
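The normalization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the preprocessing code used in this work: the function name `normalize_stokes`, the `quiet_mask` input, and the choice to divide Q, U, and V by the unnormalized Stokes I pixel by pixel are all hypothetical.

```python
from statistics import median


def normalize_stokes(I, Q, U, V, quiet_mask):
    """Normalize 2D Stokes maps given as lists of rows.

    quiet_mask marks the pixels treated as quiet Sun. It is a hypothetical
    input: how the quiet region is selected is not specified here.
    """
    quiet_values = [I[r][c]
                    for r in range(len(I))
                    for c in range(len(I[0]))
                    if quiet_mask[r][c]]
    i_quiet = median(quiet_values)

    def ratio(S):
        # Assumption: each polarization map is divided by the local Stokes I.
        return [[s / i for s, i in zip(s_row, i_row)]
                for s_row, i_row in zip(S, I)]

    # Stokes I is scaled by the median quiet-region intensity.
    I_norm = [[i / i_quiet for i in row] for row in I]
    return I_norm, ratio(Q), ratio(U), ratio(V)
```

With this scaling, quiet-Sun intensity sits near 1 and the polarization maps become dimensionless fractions, so frames taken under different conditions share one input range.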
In this study, Stokes I, Q, U, and V were used as input parameters, and Bl, Bt, and φ were used as the output parameters. The data samples differ in size, and training the models on maps of different sizes makes training more difficult. The maps of the training set and validation set were therefore cut into blocks of the same shape. If the blocks are cut too small, they contain fewer features. If they are cut too large, the graphics-card memory required for training increases. Training neural networks also needs a large number of samples. For one epoch of the CNN training, we randomly selected a block of 65 pixel × 65 pixel from each of the 121 sample images. We repeated this selection and training process for tens of thousands of epochs to augment the number of training samples. We further increased the size of the training set with geometric transformations, rotating the blocks by 0°, 90°, 180°, or 270° or, alternatively, mirroring them.
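The block selection and geometric augmentation described above can be sketched as follows. The helper names and the equal chance of mirroring versus rotating are assumptions for illustration only. The sketch cuts the same window from every map of a sample (inputs and target) and applies one shared transformation, so that the pixels stay aligned.

```python
import random


def crop(image, r0, c0, size):
    """Cut a size x size block starting at row r0, column c0."""
    return [row[c0:c0 + size] for row in image[r0:r0 + size]]


def rotate90(block):
    """Rotate a block by 90 degrees clockwise."""
    return [list(col) for col in zip(*block[::-1])]


def mirror(block):
    """Mirror a block left to right."""
    return [row[::-1] for row in block]


def sample_blocks(maps, size=65, rng=random):
    """Cut one random window from all maps and apply one shared transform."""
    rows, cols = len(maps[0]), len(maps[0][0])
    r0 = rng.randint(0, rows - size)  # randint includes both end points
    c0 = rng.randint(0, cols - size)
    blocks = [crop(m, r0, c0, size) for m in maps]
    if rng.random() < 0.5:  # assumed probability of mirroring
        return [mirror(b) for b in blocks]
    for _ in range(rng.randrange(4)):  # 0, 90, 180, or 270 degrees
        blocks = [rotate90(b) for b in blocks]
    return blocks
```

Drawing a fresh window in every epoch means the network rarely sees exactly the same block twice, which is how repeated sampling enlarges the effective training set.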
3. Method
In our previous work, the multilayer perceptron was a pixel-to-pixel neural network that did not consider the spatial correlation of images. In this paper, we utilize the CNN (Le 1989) architecture to build the neural network model. The residual network (ResNet; He et al. 2016) is a well-known CNN architecture, which was first developed and published in 2016. Its major improvement is reformulating the layers as learning residual functions, thus making the network easier to optimize and allowing it to gain accuracy from a considerably increased depth. Similar architectures have been used by many authors (e.g., Asensio Ramos et al. 2017; Asensio Ramos & Díaz Baso 2019) with excellent performance. Based on ResNet, the CNN models for magnetic calibration are built and then trained. In this case, each output parameter has a different unit and S/N, so each parameter would need a different weight in the loss function. We have not found a perfect solution for assigning weights to each output parameter. In order to obtain more accurate models, we adopted the approach of building one model for each magnetic parameter.
The design of the network structure is vital for its performance. According to our experiments, residual blocks combined with basic convolution layers calibrate the magnetic field best. The structure of our model is shown in Fig. 3. Every convolution layer contains many trainable parameters and generates many feature maps, each produced by a convolution filter that extracts a feature (Lecun et al. 1998). In our model, the first convolution layer has 64 filter kernels with a size of 3 pixel × 3 pixel and produces feature maps from the initial input parameters. All other filter kernels in the model also have a size of 3 pixel × 3 pixel. The kernel size is crucial for model performance: a kernel that is too large would over-smooth the training result, while a kernel that is too small could reduce the spatial correlation. Then, we turn the linear relation into a nonlinear one through a rectified linear unit (ReLU) activation function, which performs well and is extensively applied in DL (Nair & Hinton 2010), defined as:
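The defining equation is not reproduced in this text. The standard definition of the ReLU function, which is presumably what is meant here, is:

```latex
\begin{equation}
f(x) = \max(0, x). \tag{7}
\end{equation}
```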
Fig. 3. Schematic architecture of the ResNet used in this paper.
We then set a ReLU activation function to provide the nonlinear properties. These steps were done to fuse the input maps by extracting more feature maps before the residual blocks.
There are 11 residual blocks in this model. Each residual block contains a residual part and an identity mapping of the input x. The residual part is made of five functional parts, as shown in the left panel of Fig. 3. There are two convolution layers with the same kernel configuration as the initial convolution layer. A ReLU layer is set as the middle layer to provide nonlinear mapping. The input x goes through the residual part to give the residual F(x). The output of the residual block is the sum x + F(x) passed through a ReLU layer. We named the network model MagRes.
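The data flow through one residual block can be illustrated with a toy, single-channel sketch in plain Python. It is not the MagRes implementation: the real blocks work on 64 feature maps in Keras, and this sketch keeps only the convolution, ReLU, convolution sequence and the skip connection, leaving out the other functional parts of the residual path. All function names are hypothetical.

```python
def relu(x):
    """Element-wise max(0, v) on a 2D map given as a list of rows."""
    return [[max(0.0, v) for v in row] for row in x]


def conv3x3(x, kernel, bias=0.0):
    """Single-channel 3x3 convolution with zero padding (same output size)."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = bias
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[dr + 1][dc + 1] * x[rr][cc]
            out[r][c] = total
    return out


def residual_block(x, kernel1, kernel2):
    """Return ReLU(x + F(x)) with F = conv -> ReLU -> conv."""
    f = conv3x3(relu(conv3x3(x, kernel1)), kernel2)
    summed = [[a + b for a, b in zip(x_row, f_row)]
              for x_row, f_row in zip(x, f)]
    return relu(summed)
```

If both kernels are zero, F(x) vanishes and the block reduces to ReLU(x). The block therefore only has to learn a correction to its input, which is what makes deep stacks of such blocks easier to optimize.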
The MagRes network was trained with a training batch size of 28 and a validation batch size of 18. There are about 9.26 thousand free parameters in the network model. It takes, on average, 21 s per epoch to train. The mean squared error (MSE) is used as the loss function and the Adam algorithm (Kingma & Ba 2014) as the optimizer, with an initial learning rate of 1e−4. The MSE is a common loss function for regression networks, which is defined as follows:
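The defining equation is not reproduced in this text. The standard form of the MSE, written with the target value y_i and the predicted value ŷ_i over n pixels (symbol roles assumed from the surrounding text), is:

```latex
\begin{equation}
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}. \tag{8}
\end{equation}
```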
In the measurement of the vector magnetic field, there is an inherent 180° ambiguity in the field perpendicular to the line-of-sight, as inferred from observations of linear polarization in magnetically sensitive spectral lines (Metcalf et al. 2006). As a result, φ is equivalent to φ + 180° in the azimuth angle. When the same training process is used as for the other parameter models, the values around 0° or 180° are inferred as a discrete distribution in the range [0, 180]°, and the scatter plot of predicted values against target data looks like a reversed N shape. Therefore, we continued to train the converged model with a modified loss function to correct this, as can be seen in Eq. (9). Because the azimuth angle is a circular quantity with a period of 180°, the angular difference between the predicted value and the target value should not be more than 90°. So if |yi−ŷi| > 90°, we used (180°−|yi−ŷi|) instead, thus:
$$\mathrm{MSE}_{\varphi} = \frac{1}{n}\sum_{i=1}^{n}\delta_i^{2}, \tag{9}$$
where
$$\delta_i = \begin{cases} |y_i-\hat{y}_i|, & |y_i-\hat{y}_i| \le 90^{\circ},\\ 180^{\circ}-|y_i-\hat{y}_i|, & |y_i-\hat{y}_i| > 90^{\circ}. \end{cases} \tag{10}$$
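The wrapped angular difference described in this paragraph can be sketched in plain Python. The function name is hypothetical, and the modulo operation is an added guard (an assumption) for inputs that fall outside [0, 180]°. The loss in this work was implemented in Keras, which is not shown here.

```python
def azimuth_mse(y_true, y_pred):
    """Mean squared error for angles in degrees that repeat every 180."""
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = abs(y - y_hat) % 180.0  # guard for inputs outside [0, 180]
        if diff > 90.0:
            diff = 180.0 - diff  # take the shorter way around the circle
        total += diff ** 2
    return total / len(y_true)
```

For example, a target of 179° and a prediction of 1° differ by only 2° on the 180° circle, so the wrapped loss is 4 rather than 178² = 31 684. This removes the penalty that pushes predictions near 0° and 180° apart.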
This work uses Keras 2.2.2, with TensorFlow 1.5.0 as the computing backend, as the deep learning modeling environment. To accelerate the training process, GPUs are extensively leveraged in DL, especially for CNNs. Our study mainly ran on an NVIDIA Quadro P2000 (installed in our Dell T3620 graphics workstation) with 1024 CUDA cores and 5 GB of GDDR5 on-board memory, and on an NVIDIA Tesla P100 (provided by Ali cloud) with more than 21 teraFLOPS of 16-bit floating-point performance.
4. Results and testing the network availability
The MagRes models were trained to convergence, giving three models that infer Bl (longitudinal magnetic flux density), Bt (transverse magnetic flux density), and φ (azimuth angle). It takes 1.11 s for a model to predict a map of 384 pixel × 684 pixel. The MSE losses on the training set for the Bl, Bt, and φ models are 3522 G², 3313 G², and 525 deg² (including noise). This means that the corresponding RMS residual errors of the training results with respect to the target data are 59.3 G, 57.6 G, and 22.9°. In our training process, the input images can be of any size. The data of the test sets can therefore be processed to produce the physical quantities of the magnetic field without changing their size. Stokes I, Q, U, and V only need to be normalized by Eq. (6) before testing.
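The residual errors quoted above follow from the loss values by taking the square root, since the MSE is a squared quantity. As a quick check of the arithmetic:

```latex
\begin{align}
\sqrt{3522} &\approx 59.3\ \mathrm{G}, &
\sqrt{3313} &\approx 57.6\ \mathrm{G}, &
\sqrt{525} &\approx 22.9^{\circ}.
\end{align}
```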
4.1. Longitudinal magnetic flux density
We employed Hinode/SP vector magnetic field data in NOAA AR 12738 as a test set to evaluate our model. The data were observed at 10:26 UT on 2019 April 13 with an image scale of 0.317″ per pixel. Figure 4 presents the ResNet result for Bl. The ResNet Bl is largely similar to the inverted Bl, while it exhibits a cleaner appearance. This may be because the convolution operation smooths adjacent pixels and thereby denoises the maps. The residuals between the ResNet Bl and the inverted Bl are mostly less than 300 G. In the sample, only 284 of the 446 976 pixels in the residual map exceed 300 G. The proportion of errors reaching 300 G in part of the active region is 0.002.
Fig. 4. MagRes calibrations for longitudinal magnetic flux density. Upper panels (from left to right): inverted result, MagRes result, and their residual difference, respectively. Lower left panel: histogram of the residual error with its Gaussian kernel density curve. Lower right panel: scatter diagram, which identifies the density of the inversion results with the testing results.
The lower left panel is the histogram of the residual error. We used the root mean square (RMS) of the residual errors, which is defined as
$$\mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^{2}}, \tag{11}$$
as the effective value of the residual errors to evaluate the performance of the models. We used the mean absolute error (MAE) of the residual errors, which is defined as
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|, \tag{12}$$
as the mean value to evaluate the models. The RMS and MAE for Bl are 35 G and 17 G. The ratio of MAE to RMS, which indicates whether there are large but uncommon errors, is 0.48 here. The smaller this ratio is, the more the RMS is dominated by a few unusually large residual errors. The lower right panel is the scatter diagram, which identifies the density of the inversion results with the testing results. In statistics, the coefficient of determination, R2, is interpreted as the proportion of the variance in the dependent variable that is predictable from the independent variable, thus:
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}}, \tag{13}$$
where ȳ is the mean of the yi.
R2 represents the accuracy of a fit. An R2 of 1 means the dependent variable can be predicted without error from the independent variable. In this case, R2 is 0.99.
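The three evaluation quantities used in this section can be computed with a few lines of plain Python. This is a minimal sketch using the standard definitions; the function names are hypothetical, and taking the inversion result as the reference `y_true` in R2 is an assumption.

```python
import math


def rms(y_true, y_pred):
    """Root mean square of the residuals."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute value of the residuals."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination with y_true as the reference."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot
```

The MAE can never exceed the RMS. For Gaussian residuals with zero mean the ratio MAE/RMS is about 0.8, so values well below that, such as the 0.48 found for Bl, point to a tail of rare large errors.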
4.2. Transverse magnetic flux density
Figure 5 presents the MagRes result for transverse magnetic flux density. The testing result of Bt in the upper middle panel has a very similar appearance to the inversion result in the upper left panel. The MagRes result in the upper middle panel is also cleaner than the inverted result. The residual error map shows that the values of a few pixels, shown in dark purple or dark green, exceed 300 G, while most of them are in the range of [−300, 300] G. In the sample, only 85 pixels in the residual map exceed 300 G. The lower left panel is the histogram of the residual errors. The RMS and MAE of the residual errors are 34 G and 22 G. The ratio of MAE to RMS is 0.65 for transverse magnetic flux density, which means that there are fewer large, uncommon residual errors. The lower right panel is the scatter diagram, identifying the density of the inversion results with the testing results. The coefficient of determination R2 for the scatter diagram shown in the lower right panel is 0.98.
Fig. 5. Upper left panel: inversion result for transverse magnetic flux density. Upper middle panel: testing result. Upper right panel: residual error of the inversion result and the testing result. Lower left panel: histogram of the residual error. Lower right panel: scatter diagram, identifying the density of the inversion results with the testing results.
4.3. Azimuth angle
The final results for φ are shown in Fig. 6, with the same layout as in Fig. 4. There is much noise in the quiet region in the inversion results, as shown in the upper left panel. In the sunspot region, the MagRes calibration has a similar appearance to the inversion results, and it ignores the boundary area. The map of residual error in the upper right panel shows that most of the sunspot region is in the range of [−20, 20]°. To further analyze the performance, we used clean data that exclude most of the quiet area by extracting the pixels whose transverse magnetic flux density is greater than 300 G (shown in lighter colors). The lower left panel is the histogram of the residual errors. The RMS and MAE of the residual errors are 6° and 3°. The ratio of MAE to RMS is about 0.5. The lower right panel is the scatter diagram, which identifies the density of the inversion results with the testing results. The coefficient of determination R2 for the scatter diagram shown in the lower right panel is about 0.99.
Fig. 6. Upper left panel: inversion result for the azimuth angle. Upper middle panel: testing result. Upper right panel: residual error of the inversion result and the testing result. Lower left panel: histogram of the residual error. Lower right panel: scatter diagram, identifying the density of the inversion results with the testing results.
In conclusion, the new approach infers the magnetic field parameters with a precision comparable with that of the inversion technique. Furthermore, it produced cleaner maps with better noise suppression.
5. Comparison and analysis of results
5.1. Comparison with the linear calibration on Bl and Bt
One of our main aims for this study is to improve the calibration in regions with a strong magnetic field, where the linear calibration method fails because of the magnetic saturation effect. Here we present the ResNet results in the magnetic saturation regions and compare the results of ResNet and the linear calibration method for Bl and Bt in the strong magnetic field.
The MagMLP results showed that neural networks are able to overcome magnetic saturation. This is also the case for the CNN method, MagRes. To demonstrate this, we use the data of active region AR 12192 from the test set, which has a more complicated structure and was observed with Hinode/SP at 23:41 UT on 24 October 2014. Based on the weak-field approximation, the linear calibrations for Bl and Bt are obtained by fitting straight lines between the Stokes parameters and the inversion results using the least-squares method. The relationships for linear calibrations are presented in Eqs. (14) and (15):
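The fitted coefficients of Eqs. (14) and (15) are not reproduced in this text. The fitting step itself can be sketched as a least-squares line through the origin between a polarization signal and the inverted field. This is a minimal illustration in which the absence of an intercept and all function names are assumptions, not the procedure actually used.

```python
def fit_linear_coefficient(signal, field):
    """Least-squares slope C for field ~ C * signal (no intercept assumed)."""
    num = sum(s * b for s, b in zip(signal, field))
    den = sum(s * s for s in signal)
    return num / den


def transverse_signal(q_over_i, u_over_i):
    """Weak-field proxy for B_t: ((Q/I)^2 + (U/I)^2) to the power 1/4."""
    return [(q * q + u * u) ** 0.25 for q, u in zip(q_over_i, u_over_i)]


def apply_linear(signal, coefficient):
    """Linear calibration: field = coefficient * signal."""
    return [coefficient * s for s in signal]
```

Here `fit_linear_coefficient(v_over_i, bl_inverted)` would play the role of Cl, and the same function applied to `transverse_signal(...)` and the inverted Bt would play the role of Ct. Because one slope is fitted to all pixels, pixels where the polarization signal has saturated are necessarily underestimated.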
The results of MagRes, the linear calibration, and the Stokes inversion for Bl and Bt are shown in Fig. 7. As marked by the red rectangles, the MagRes results in the middle panels are very similar in appearance, structure, and shape to those from the inversion in the left panels. However, the results of linear calibration show pronounced magnetic saturation.
Fig. 7. Results of the inversion, MagRes, and linear calibration for Bl and Bt. Top row: Bl. Bottom row: Bt.
Using the RMS of the residual errors between the target data and the network results, we compare and analyze the performance of the two neural network methods. The RMS values of the residual errors for the MagRes models are less than half of those for MagMLP. The data from 2018 and 2019 in the test set serve as samples, as can be seen in Table 2, indicating that the MagRes models perform better than the MagMLP models based on the quantitative evaluation of the RMS.
Table 2. RMS values of the residual errors between the target data and the network results for the data from 2018 and 2019 in the test set.
5.2. Comparison and analysis on results of azimuth angle
The azimuth angle φ could be directly inferred by Eq. (3). However, the effects of Faraday rotation exist in filter-based magnetograph data, especially for data taken near the line center (Hagyard et al. 2000; Su & Zhang 2004b). So we compare φ as obtained by ResNet and by Eq. (3).
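For reference, the direct estimate of φ from the linear polarization signals can be sketched as below, assuming that Eq. (3) has the usual form φ = ½ arctan(U/Q). Using `atan2` to resolve the quadrant and folding the result into [0°, 180°) are choices made for this illustration, and the function name is hypothetical.

```python
import math


def azimuth_deg(q, u):
    """Half the polar angle of (Q, U), in degrees, folded into [0, 180)."""
    return (0.5 * math.degrees(math.atan2(u, q))) % 180.0
```

With this convention, pure +Q gives 0°, pure +U gives 45°, pure −Q gives 90°, and pure −U gives 135°.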
The azimuth angle φ is the direction of the magnetic field projected on the sky plane, and there is no magnetic saturation effect on φ. Figure 8 compares the results of MagRes, the linear calibration, and Stokes inversions for φ in AR 12192. It shows that the results of both ResNet and linear calibration are similar to the inversion results. According to the scatter plots in the bottom panels of Fig. 8, the ResNet results are closer to the inversion results than those of linear calibration. Additionally, the RMS of the residual error with respect to the inversion result is smaller for ResNet than for linear calibration: 16° and 21°, respectively. This indicates that ResNet could improve the accuracy of φ.
Fig. 8. Results of the inversion, MagRes, and linear calibration for φ. Lower left panel: scatter graph of the results of ResNet and the inversion method. Lower right panel: scatter graph of the results of linear calibration and the inversion method.
5.3. Analysis on the inclination angle
Meanwhile, the θ from the level 2 data of Hinode/SP has many “bright spots” and “dark spots” at 180 and 0 degrees, respectively. This problem was discussed in our previous paper (Guo et al. 2020). Here we carry out a comparison experiment with the previous work.
From Eqs. (4) and (5), we can infer that the θ can be represented as:
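The expression is not reproduced in this text. Assuming that Eqs. (4) and (5) have the form Bl = B cos θ and Bt = B sin θ, dividing one by the other gives the following sketch:

```latex
\begin{equation}
\theta = \arctan\!\left(\frac{B_t}{B_l}\right), \tag{16}
\end{equation}
```

Since Bt is non-negative, the branch of the arctangent has to be chosen so that θ lies in [0°, 180°], with θ > 90° where Bl is negative.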
The results for θ are shown in Fig. 9. The testing result of θ is similar to the inversion result in the sunspot region, but it presents significant differences in the quiet region. In our prior study, MagMLP models eliminate “bright spots” and “dark spots” from the inverted inclination angle map. However, these small-scale features are reconstructed by MagRes. The map of the training result is cleaner. To further evaluate the model performance, we extracted the pixels where the transverse magnetic flux density was above 150 G. The map of residual error in the upper right panel illustrates that most of the active regions are in the range of [−20, 20]° and that there are many pixels in the quiet region outside of this range. The lower left panel is the histogram of the residual errors. The distribution is asymmetric, which may be because there are more pixels with an inclination above 90° than below it. The RMS and MAE of the residual errors are 5° and 3°. The ratio of MAE to RMS, which can help us understand whether there are large but uncommon errors, is about 0.6. This illustrates that there are a few uncommon residual errors which are larger. The lower right panel is the scatter diagram, which identifies the density of the inversion results with the testing results. The coefficient of determination R2 for the scatter diagram shown in the lower right panel is 0.98.
Fig. 9. Upper left panel: inversion result for the inclination angle. Upper middle panel: testing result. Upper right panel: residual error of the inversion result and the testing result. Lower left panel: histogram of the residual error. Lower right panel: scatter diagram, identifying the density of the inversion results with the testing results.
6. Discussion, conclusion, and future work
We have developed a new approach for the magnetic field calibration for a filter-based magnetograph from a selected wavelength point of the Stokes profiles of Hinode/SP based on ResNet. A series of experiments were carried out on Bl, Bt, φ, and θ. We collected 176 frames of data samples, including 121 frames for a training set, 18 frames for a validating set, and 37 frames for a test set. When training the network models, we considered the influential factors on the models’ performance, such as data cleaning, data normalization, and α (filling factor). The main effect of α is in the quiet region. Considering it in the magnetic parameters will increase the prediction error of the models, especially in the weak magnetic field regions.
Different input parameters would generate results with a different accuracy. We have attempted to use Stokes V and I as input parameters to train the model of Bl and use Stokes Q, U, and I as input parameters to train the model of Bt. We note that Bl and Bt can also be inferred from these models, but the accuracy is lower. Additionally, we attempted to select another wavelength point at this line profile to train the models. From the results of our experiments, these training results of models did not have a higher accuracy. The accuracy increased at a different degree, but these are just some of our attempts, and the research value for actual observations is not clear. We focused on the research of the wavelength position at −0.063 Å from the center of line Fe I 6301 Å since this is the closest to the routine observations of filter-based magnetograph.
Thereafter, the trained models were used to infer the vector magnetic fields from the samples of the test set. Firstly, the image data from AR 12738, which were collected on 13 April 2019, were utilized to test the availability of the models. Secondly, the image data from NOAA AR 12192, which were collected on 24 October 2014 and have a more complicated magnetic structure, were used for the comparison with the results of inversion and linear calibration. Thirdly, 12 frames of the image data from the test set were used to compare the performance of ResNet with MLP. Based on these experiments, we obtain consistent findings, which can be summarized as follows:
(1) Our new method could infer Bl (longitudinal magnetic flux density), Bt (transverse magnetic flux density), and φ (azimuth angle) well using narrow-band Stokes I, Q, U, and V maps. The ResNet method produces cleaner magnetic maps with less noise compared with the inversion method.
(2) The results inferred by the new approach are extremely close to the Stokes inversion results with the RMS values of residual errors within 100 G for Bl and Bt, where the RMS can be within 50 G for the simple AR images data. For φ (azimuth angle), the RMS reached 12° for the non-noisy simple structure data. The ResNet-inferred results are highly correlated to the inversion inferred results with the coefficient of determination R2 values being closer to 1.
(3) Compared with the linear calibration, our new approach could infer the magnetic fields better without the magnetic saturation.
(4) Based on the analysis of the RMS values of residual errors between the inversion inferred results and the MLP-produced or the ResNet-produced results, the ResNet-produced results are mostly below 50 G, which is much better than those of our previous multilayer perceptron (MLP) method, which were mostly above 100G. This may be because the new method could be able to utilize the spatial relationship between adjacent pixels on the input parameters.
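The residual statistics quoted in items (2) and (4) can be computed along the following lines. This is a sketch under stated assumptions: the function name `residual_metrics` is hypothetical, the RMS is taken as the root mean square of the pixel-wise residual, and R² is the standard coefficient of determination with the inversion map as the reference. The paper may define these quantities slightly differently.

```python
import numpy as np

def residual_metrics(target, predicted):
    """RMS, MAE, and R^2 of predicted maps against target (inversion) maps."""
    target = np.asarray(target, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    residual = predicted - target

    rms = np.sqrt(np.mean(residual ** 2))           # root mean square residual
    mae = np.mean(np.abs(residual))                 # mean absolute error
    ss_res = np.sum(residual ** 2)                  # residual sum of squares
    ss_tot = np.sum((target - target.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"rms": rms, "mae": mae, "r2": r2}

# Toy values in gauss, for illustration only (not data from the paper).
inversion = np.array([0.0, 100.0, 200.0, 300.0])
network = np.array([10.0, 90.0, 210.0, 290.0])
metrics = residual_metrics(inversion, network)
print(metrics)
```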
The proposed method also has its shortcomings. Firstly, the final accuracy of the model is limited by the accuracy of the inversion method used to build the dataset. Secondly, a large error occurs if the data to be predicted by these models lie beyond the data range of the training set. Thirdly, the S/N of actual observation data may also affect the accuracy of the recovered field. All of these effects increase the difficulties involved in selecting samples as well as in cleaning the data and testing the network models. One should keep the above effects in mind when using our method.
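The second shortcoming suggests checking new inputs against the range covered by the training set before trusting a prediction. A minimal sketch of such a check is given below; the function name, the `(n_frames, 4, ny, nx)` layout, and the use of a simple per-channel minimum and maximum are assumptions for illustration, not part of the published method.

```python
import numpy as np

def fraction_outside_training_range(train_inputs, new_inputs):
    """Per-channel fraction of pixels in new_inputs outside the training range.

    train_inputs : array of shape (n_frames, n_channels, ny, nx)
    new_inputs   : array of shape (n_channels, ny, nx)
    """
    lo = train_inputs.min(axis=(0, 2, 3))[:, None, None]  # per-channel minimum
    hi = train_inputs.max(axis=(0, 2, 3))[:, None, None]  # per-channel maximum
    outside = (new_inputs < lo) | (new_inputs > hi)
    return outside.mean(axis=(1, 2))

# Toy data: every training channel spans [-1, 1].
train = np.zeros((2, 4, 2, 2))
train[0] = -1.0
train[1] = 1.0

new = np.zeros((4, 2, 2))
new[3, 0, 0] = 2.0    # above the training maximum
new[3, 1, 1] = -3.0   # below the training minimum

fractions = fraction_outside_training_range(train, new)
print(fractions)
```

A large fraction for any Stokes channel would indicate that the frame lies outside the conditions the model has seen.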
In conclusion, our attempts based on ResNet can be understood as an alternative, efficient solution to the problem of linear calibration for the filter-based magnetograph. The study is just a start, and more tests are needed to ensure that the magnetic field recovered from ResNet can be used for scientific analysis. We propose applying the ResNet method to FMG magnetic data calibration. Therefore, some experiments are being conducted on full-disk data using observations from the Helioseismic and Magnetic Imager onboard the Solar Dynamics Observatory (Schou et al. 2012), considering the influence of the orbital velocity. Meanwhile, we are also trying to use other neural networks to train models, such as the Dense Convolutional Network (DenseNet; Huang et al. 2016). In addition, neural networks are not well designed to work with circular quantities, such as the azimuth angle. Whether the function approximation can take such physical requirements into account (for example, by integrating physical constraints into the traditional loss function) is also a question worth studying. This is not only an important issue for this project, but will also affect the degree to which solar physicists trust and use these data processing results. We also want to use this research to explore ways of overcoming the cognitive barriers to applying machine learning methods in astronomy, so as to better serve astronomy research in the future.
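One common way to handle a circular quantity such as the azimuth angle is to regress on a sine and cosine pair and to measure errors with a wrapped difference. The sketch below illustrates this idea; it is not the approach used in the paper. The function names are hypothetical, and the 180° period is an assumption reflecting the usual ambiguity of the transverse field direction.

```python
import numpy as np

def encode_angle(phi_deg, period=180.0):
    """Map an angle to a (cos, sin) pair that is continuous across the wrap."""
    ang = 2.0 * np.pi * np.asarray(phi_deg, dtype=float) / period
    return np.cos(ang), np.sin(ang)

def decode_angle(c, s, period=180.0):
    """Invert encode_angle, returning an angle in [0, period)."""
    ang = np.arctan2(s, c)
    return np.mod(ang * period / (2.0 * np.pi), period)

def angular_residual(a_deg, b_deg, period=180.0):
    """Smallest signed difference a - b, wrapped into [-period/2, period/2)."""
    diff = np.asarray(a_deg, dtype=float) - np.asarray(b_deg, dtype=float)
    return np.mod(diff + period / 2.0, period) - period / 2.0

# 179 deg and 1 deg are only 2 deg apart once the wrap is taken into account.
wrapped = angular_residual(179.0, 1.0)
roundtrip = decode_angle(*encode_angle(170.0))
print(wrapped, roundtrip)
```

With a plain difference, the first pair would count as a 178° error, which penalizes a network for predictions that are physically almost correct.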
On the code’s webpage https://www2.hao.ucar.edu/csac/csac-data/sp-data-description, we found that “The Hinode/SP inversions solve for the fill factor α, the observed Stokes I profile Iobs is fitted with α * Imag + (1 − α)*Iscatt, where Imag is the magnetized component and Iscatt is the scattered light profile”. Here we do not follow this concept strictly. Please be careful if you use our method; we will elaborate on this in Appendix A.
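The two-component model quoted above can be written as a short forward calculation. This is only an illustration of the quoted formula with toy numbers; the function name `mix_intensity` is hypothetical.

```python
import numpy as np

def mix_intensity(alpha, i_mag, i_scatt):
    """Observed Stokes I as alpha * Imag + (1 - alpha) * Iscatt."""
    i_mag = np.asarray(i_mag, dtype=float)
    i_scatt = np.asarray(i_scatt, dtype=float)
    return alpha * i_mag + (1.0 - alpha) * i_scatt

# Toy two-point profiles, for illustration only.
i_obs = mix_intensity(0.25, i_mag=[1.0, 0.5], i_scatt=[1.0, 0.9])
print(i_obs)
```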
Acknowledgments
We are very grateful to our referee for providing much valuable feedback. We thank the Astronomical Big Data Joint Research Center, co-founded by the National Astronomical Observatories, the Chinese Academy of Sciences and the Alibaba Cloud. We thank the Community Spectropolarimetric Analysis Center (http://www2.hao.ucar.edu/csac) for providing Hinode/SP data. This project has received funding from the Strategic Priority Research Program on Space Science, the Chinese Academy of Sciences under No. XDA15320300, XDA15320302, XDA15052200, XDA15010800, the National Natural Science Foundation of China (NSFC) under No. 12073077, 11873027, 11773072, 11427803, 11427901, 11773040, 11573012, 11833010, 11973056, 11873062, 11773038, 11703042, and Beijing Municipal Science and Technology under No. Z181100002918004. We also thank the NVIDIA Corporation for the donation of the Quadro P2000, one of the GPUs used in this work. We acknowledge the community effort devoted to the development of the following open-source packages used in the research: Keras (https://keras.io/), TensorFlow (https://www.tensorflow.org/), Matplotlib (matplotlib.org), Numpy (numpy.org), and Astropy (astropy.org). We are grateful to Prof. Hongqi Zhang and Dr. Junfeng Hou of HSOS for helpful discussions.
References
- Ai, G. X., & Hu, Y. F. 1986, Acta Astron. Sin., 27, 173
- Asensio Ramos, A., & de la Cruz Rodríguez, J. 2015, in Polarimetry, eds. K. N. Nagendra, S. Bagnulo, R. Centeno, & M. Jesús Martínez González, IAU Symp., 305, 225
- Asensio Ramos, A., Requerey, I. S., & Vitas, N. 2017, A&A, 604, A11
- Asensio Ramos, A., & Díaz Baso, C. J. 2019, A&A, 626, A102
- Bai, X., Deng, Y., & Su, J. 2013, Sol. Phys., 282, 405
- Bai, X. Y., Deng, Y. Y., Teng, F., et al. 2014, MNRAS, 445, 49
- Bellot Rubio, L., & Orozco Suárez, D. 2019, Liv. Rev. Sol. Phys., 16, 1
- Borrero, J. M., Tomczyk, S., Kubo, M., et al. 2011, Sol. Phys., 273, 267
- Carroll, T., & Staude, J. 2001, A&A, 378, 316
- Chae, J., Moon, Y.-J., Park, Y.-D., et al. 2007, PASJ, 59, S619
- del Toro Iniesta, J. C., & Ruiz Cobo, B. 2016, Liv. Rev. Sol. Phys., 13, 4
- Deng, Y.-Y., Zhang, H.-Y., Yang, J.-F., et al. 2019, Res. Astron. Astrophys., 19, 157
- Díaz Baso, C. J., & Asensio Ramos, A. 2018, A&A, 614, A5
- Gan, W.-Q., Zhu, C., Deng, Y.-Y., et al. 2019, Res. Astron. Astrophys., 19, 156
- Goodfellow, I., Bengio, Y., & Courville, A. 2016, Deep Learning (MIT press)
- Guo, J., Bai, X., Deng, Y., et al. 2020, Sol. Phys., 295
- Hagyard, M. J., Adams, M. L., Smith, J. E., & West, E. A. 2000, Sol. Phys., 191, 309
- Hale, G. E. 1908, ApJ, 28, 315
- He, K., Zhang, X., Ren, S., & Jian, S. 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. Q. 2016, ArXiv e-prints [arXiv:1608.06993]
- Iglesias, F. A., & Feller, A. 2019, Opt. Eng., 58, 082417
- Kim, T., Park, E., Lee, H., et al. 2019, Nat. Astron., 3, 397
- Kingma, D. P., & Ba, J. 2014, ArXiv e-prints [arXiv:1412.6980]
- Le, Y. 1989, in Steels (North-holland, Bv: Elsevier Science Publishers)
- Lecun, Y., Bottou, L., Bengio, Y., & Haffner, P. 1998, Proc. IEEE, 86, 2278
- Li, H., Xu, Z., Qu, Z., & Sun, L. 2019, ApJ, 875, 127
- Lin, Y. 2001, Introduction to Solar Physics
- Lites, B. W., & Ichimoto, K. 2013, Sol. Phys., 283, 601
- Liu, H., Ji, K., & Jin, Z. 2019, Chin. Sci. Phys. Mech. Astron., 105
- Liu, H., Xu, Y., Wang, J., et al. 2020, ApJ, 894, 70
- Metcalf, T. R., Leka, K. D., Barnes, G., et al. 2006, Sol. Phys., 237, 267
- Milic, I., & Gafeira, R. 2020, A&A, 644, A129
- Mueller, D., Solanki, S. K., & del Toro Iniesta, J. C. 2019, AGU Fall Meeting Abstracts, 2019, SH21D-3292
- Nair, V., & Hinton, G. E. 2010, Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21–24, 2010, Haifa, Israel
- Park, E., Moon, Y.-J., Lim, D., & Lee, H. 2020, ApJ, 891, L4
- Rees, D. E., López Ariste, A., Thatcher, J., & Semel, M. 2000, A&A, 355, 759
- Ruiz Cobo, B., & del Toro Iniesta, J. C. 1992, ApJ, 398, 375
- Schou, J., Scherrer, P. H., Bush, R. I., et al. 2012, Sol. Phys., 275, 229
- Skumanich, A., & Lites, B. W. 1987, ApJ, 322, 473
- Socas-Navarro, H. 2001, in Stokes Inversion Techniques: Recent Achievements and Future Horizons, ed. M. Sigwarth, ASP Conf. Ser., 236, 487
- Socas-Navarro, H. 2003, Neural Net., 16, 355
- Socas-Navarro, H. 2005, ApJ, 621, 545
- Socas-Navarro, H., López Ariste, A., & Lites, B. W. 2001, ApJ, 553, 949
- Stenflo, J. 1994, Solar Magnetic Fields, Vol. 189
- Su, J.-T., & Zhang, H.-Q. 2004a, Chin. J. Astron. Astrophys., 4, 365
- Su, J., & Zhang, H. 2004b, Sol. Phys., 222, 17
- Su, J.-T., Bai, X.-Y., Chen, J., et al. 2019, Res. Astron. Astrophys., 19, 161
- Teng, F. 2015, Sol. Phys., 290, 2693
- Tsuneta, S., Ichimoto, K., Katsukawa, Y., et al. 2008, Sol. Phys., 249, 167
- Zhang, H.-Q., Wang, D.-G., Deng, Y.-Y., et al. 2007, Chin. J. Astron. Astrophys., 7, 281
- Zhao, D., Xu, L., Chen, L., Yan, Y., & Duan, L.-Y. 2019, Adv. Astron., 2019, 5343254
Appendix A: Transverse magnetic flux density and filling factor
As mentioned in footnote 1, the observed Stokes I profile Iobs is fitted with αImag + (1 − α)Iscatt, which implies that Q = αQmag, U = αUmag, and V = αVmag. In the weak field regime, Stokes Q and U are proportional to the squared transverse magnetic flux density (Bellot Rubio & Orozco Suárez 2019). In other words, Btf, without considering the α, follows Eq. (17) based on the weak field regime:
This is because α acts on the signals and not on the magnetic field itself. If an approximation such as the weak field is used, the derived quantity has an impact on the output; that is, the inferred magnetic field does not generate the observed signals in areas with a small filling factor and a large field. It could work inside the umbra, where α is high and the field is large, but not in a plage or faculae, where the field is high and the filling factor is small (due to the saturation effect). If one follows the theory and Eq. (16), one cannot use Eq. (15) to calculate the inclination; one would need the information on α to do that.
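The scaling argument above can be sketched as follows. The proportionality constants C_t and C_l are hypothetical symbols introduced only for this illustration, and reading the superscript f as "weighted by the filling factor" is an assumption about the notation rather than something stated explicitly in this appendix.

```latex
\begin{align}
\sqrt{Q^2 + U^2} &= \alpha \sqrt{Q_{\rm mag}^2 + U_{\rm mag}^2}
                  \approx \alpha\, C_t\, B_t^2
  \;\;\Rightarrow\;\;
  B_t^{f} \equiv \left(\frac{\sqrt{Q^2 + U^2}}{C_t}\right)^{1/2}
          \approx \sqrt{\alpha}\, B_t , \\
V &= \alpha V_{\rm mag} \approx \alpha\, C_l\, B_l
  \;\;\Rightarrow\;\;
  B_l^{f} \equiv \frac{V}{C_l} \approx \alpha\, B_l , \\
\tan\theta &= \frac{B_t}{B_l} = \sqrt{\alpha}\, \frac{B_t^{f}}{B_l^{f}} .
\end{align}
```

Under these assumptions the transverse and longitudinal quantities scale with different powers of α, so their ratio does not give the inclination unless α is known, which is consistent with the remark above.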
We also trained a ResNet for Btf, the results of which can be seen in Fig. A.1. The results of ResNet appear to be consistent with the inversion results. The RMS and MAE of the residual errors are 48 G and 31 G, respectively. The coefficient of determination R² for the scatter diagram shown in the lower right panel is 0.96. These indicators are poorer than those for Bt. In the quiet region, more dots are distributed in the upper right panel than for Bt, which is shown in Fig. 5.
Fig. A.1. Upper left panel: inversion result for Btf. Upper middle panel: testing result. Upper right panel: residual error of the inversion result and the testing result. Lower left panel: histogram of the residual error. Lower right panel: scatter diagram, identifying the density of the inversion results with the testing results.
All Tables
RMS values of the residual errors between the target data and the testing results of the networks, for the 2018 and 2019 data in the test set.
All Figures
Fig. 1. Stokes I, Q, U, and V at −0.063 Å from the line center of Fe I 6301 Å as the input parameters. The data were observed at 10:47 UT on 2014 September 11 in NOAA AR 12158. These maps of Stokes parameters were normalized by Eq. (6) after data preprocessing.
Fig. 2. Bl and Bt (the main parameters we are concerned with in our study) without considering α, and φ (azimuth angle), as the target parameters.
Fig. 3. Schematic architecture of the ResNet used in this paper.
Fig. 4. MagRes calibrations for longitudinal magnetic flux density. Upper panels (from left to right): inverted result, MagRes result, and their residual difference, respectively. Lower left panel: histogram of the residual error with its Gaussian kernel density curve. Lower right panel: scatter diagram, which identifies the density of the inversion results with the testing results.
Fig. 5. Upper left panel: inversion result for transverse magnetic flux density. Upper middle panel: testing result. Upper right panel: residual error of the inversion result and the testing result. Lower left panel: histogram of the residual error. Lower right panel: scatter diagram, identifying the density of the inversion results with the testing results.
Fig. 6. Upper left panel: inversion result for the azimuth angle. Upper middle panel: testing result. Upper right panel: residual error of the inversion result and the testing result. Lower left panel: histogram of the residual error. Lower right panel: scatter diagram, identifying the density of the inversion results with the testing results.
Fig. 7. Results of the inversion, MagRes, and linear calibration for Bl and Bt. Top row: Bl. Bottom row: Bt.
Fig. 8. Results of the inversion, MagRes, and linear calibration for φ. Lower left panel: scatter graph of the results of ResNet and the inversion method. Lower right panel: scatter graph of the results of linear calibration and the inversion method.
Fig. 9. Upper left panel: inversion result for the inclination angle. Upper middle panel: testing result. Upper right panel: residual error of the inversion result and the testing result. Lower left panel: histogram of the residual error. Lower right panel: scatter diagram, identifying the density of the inversion results with the testing results.