A&A 470, 1185-1191 (2007)
DOI: 10.1051/0004-6361:20065231
Y. Li^{1,2,5} - Y. Yan^{2} - M. Devel^{3} - R. Langlet^{4} - G. Song^{1}
1 - School of Science, Xidian University, Xi'an, Shaanxi, PR China
2 - National Astronomical Observatories, Chinese Academy of Sciences, Beijing, PR China
3 - Institut UTINAM, UMR CNRS 6213, 16 route de Gray, 25030 Besançon Cedex, France
4 - Laboratoire de Physique du Solide, FUNDP, Rue de Bruxelles 61, 5000 Namur, Belgium
5 - School of Applied Science, Taiyuan University of Science and Technology, Taiyuan, Shanxi, PR China
Received 18 March 2006 / Accepted 17 April 2007
Abstract
Context. Since the 1990s, Yan and colleagues have formulated boundary integral representations of the linear and non-linear solar force-free magnetic fields with finite energy in semi-infinite space, and have developed a computational procedure based on the boundary element method (BEM) to extrapolate the magnetic fields above the photosphere.
Aims. In this paper, the generalized minimal residual method (GMRES) is introduced into the BEM extrapolation of the solar force-free magnetic fields, in order to efficiently solve the associated BEM system of linear equations, which was previously solved by the Gauss elimination method with full pivoting.
Methods. Being a modern iterative method for non-symmetric linear systems, the GMRES method reduces the computational cost of solving the BEM system from O(N^{3}) to O(N^{2}), where N is the number of unknowns in the linear system. Intensive numerical experiments are conducted on a well-known analytical model of the force-free magnetic field to reveal the convergence behaviour of the GMRES method applied to the BEM systems. The impact of the relevant parameters on the convergence speed is investigated in detail.
Results. Taking the Krylov dimension to be 50 and the relative residual bound to be 10^{-6} (or 10^{-2}), the GMRES method is at least 1000 (or 9000) times faster than the full pivoting Gauss elimination method used in the original BEM extrapolation code, when N is greater than 12 321, according to the CPU timing information measured on a common desktop computer (CPU speed 2.8 GHz; RAM 1 GB) for the model problem.
Key words: Sun: magnetic fields - Sun: corona - methods: numerical
Since the 1990s, Yan and colleagues have formulated boundary integral representations for the linear and non-linear solar force-free magnetic fields with finite energy content in the semi-infinite space above the Sun, and applied a well-established numerical method, the boundary element method (BEM), to compute the magnetic field above the active regions on the Sun (Yan et al. 1991; Yan 1995; Yan & Sakurai 1997, 2000). The BEM extrapolation has found wide application in the study of the coronal magnetic fields since its advent (see, e.g., Yan & Wang 1995; Yan et al. 1995, 2001a,b; Wang et al. 2001, 2002; Liu et al. 2002; Zhang & Wang 2002).
In the BEM extrapolation, however, one needs to compute the normal derivative of the magnetic field at the boundary region as an intermediate step, which amounts to solving a non-symmetric, dense linear system of equations. In the original BEM extrapolation code, a Gauss solver with full pivoting, found in Liu et al. (1980), was used for the solution of the BEM system, which is a fairly time-consuming process for a large scale problem. This well known difficulty has made the routine use of the BEM extrapolation code prohibitive on common computers. Although the difficulty can be reduced by exploiting a supercomputer, it is more desirable to seek faster substitutive algorithms in the long run.
In this paper, we use the generalized minimal residual method (usually called the GMRES method), originally presented by Saad & Schultz (1986), to efficiently solve the linear systems arising from the BEM extrapolation. Compared with other iterative algorithms for non-symmetric linear systems, such as the generalized conjugate residual method, the GMRES method is more economical in memory requirements and arithmetic operations, and terminates in at most N steps (where N is the number of unknowns of the linear system under consideration), which is desirable for large-scale problems (Huang & Vorst 1989, 1990). There are several practical variants of the original GMRES algorithm, among which the restarted GMRES algorithm, called GMRES(m) for short, and the incomplete GMRES algorithm are the most popular (Saad & Wu 1996; Quarteroni & Valli 1997). Recent applications of the GMRES method and its variants can be found, e.g., in Patton & Holloway (2002), Ravindra & Muddun (2004), Spyropoulos et al. (2004), and Ronald (2005). Like other iterative methods for linear systems, the efficiency of the GMRES method depends on the nature of the coefficient matrix of the linear system under consideration. Fortunately, the GMRES solver we used converges very quickly to the solution of our problem with the desired accuracy. This is understandable, since the linear systems arising from integral equations are usually well conditioned.
The paper is organized as follows. In Sect. 2, we briefly recall the boundary integral representation of the constant-α force-free magnetic field proposed by Yan (1995) and fix some notation conventions. In Sect. 3, we introduce the main ideas of the GMRES method, while the numerical results are reported in Sect. 4. Finally, we discuss some important points and draw conclusions in Sect. 5.
We recall the boundary integral representation of the constant-α force-free magnetic field proposed by Yan (1995) as follows. Let Ω = {(x, y, z) : z > 0} be the outer space above the Sun, Γ = {(x, y, z) : z = 0} the photospheric surface, and Γ_{0} ⊂ Γ the magnetogram area. In Ω, the constant-α force-free magnetic field B satisfies the governing equations

∇ × B = αB,  (1)
∇ · B = 0,  (2)

together with the boundary condition B = B_{0} on Γ (supplied by the magnetogram data on Γ_{0}) and the condition that B decays fast enough at infinity for the field energy to be finite. By applying Green's second identity to the above exterior boundary value problem, one can obtain an integral representation of the constant-α force-free magnetic field as follows:

B(x_{i}) = ∮_{Γ} ( Y ∂B/∂n - B ∂Y/∂n ) dΓ,  (5)

where Y = cos(λr)/(4πr) is the reference function of Yan (1995), r being the distance between the field point x_{i} and the boundary point, and λ determined by α.
One may observe from (5) that the value of B at any point in Ω can be determined by the boundary values of B and Y together with the boundary values of their normal derivatives. However, the normal derivative of the magnetic field at the boundary, i.e. ∂B/∂n, is unknown.
To evaluate the normal derivative of the magnetic field at the boundary, we let the field point x_{i} tend to the boundary plane Γ. Then one can verify that ∂Y/∂n = 0 always holds at the boundary, which makes (5) take the form of an integral equation of the first kind for the unknown normal derivative:

B(x_{i}) = ∮_{Γ} Y (∂B/∂n) dΓ,  x_{i} ∈ Γ.
Consider the linear system of equations Ax = b obtained by discretizing this integral equation with the BEM, where A is a non-symmetric, dense coefficient matrix of order N and b is assembled from the known boundary data.
In the GMRES algorithm, one needs to store the orthonormal basis matrix V_{l} and the associated upper Hessenberg matrix in memory. Since the Krylov dimension l keeps increasing with the iterations until an approximate solution of the prescribed accuracy is obtained (or until l reaches N), the GMRES algorithm may encounter storage problems for large-scale problems, in case the accuracy criterion cannot be satisfied in a small number of iterations for the specific problem. To overcome this difficulty, a variant algorithm called GMRES(m), also proposed by Saad & Schultz (1986), is widely used in practical applications. Here, the Krylov subspace dimension m is a fixed number that should be properly determined to obtain good performance (the algorithm is restarted every m steps, hence the name). Numerical results for m = 2, 3, 4, 5, 20 have been reported in Saad & Schultz (1986) on a purely mathematical test problem, but usually no theoretical conclusion is available on the optimal choice of m for a specific problem at hand (Barrett et al. 1994). For this reason, experimental efforts to determine a proper m for the given problem are in principle needed (see, e.g., Spyropoulos et al. 2004).
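The restarted scheme just described can be sketched in a few dozen lines. The following toy implementation (an illustrative sketch, not the SPARSKIT routine used in our code) builds the Krylov basis with the Arnoldi process and restarts from the current iterate every m steps:

```python
import numpy as np

def gmres_m(A, b, m=50, tol=1e-6, max_restarts=100):
    """Restarted GMRES: minimise ||b - A x|| over the Krylov subspace
    K_m(A, r0), restarting from the current iterate every m steps."""
    n = b.size
    x = np.zeros(n)
    bnorm = np.linalg.norm(b)
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta / bnorm < tol:          # relative residual bound reached
            return x
        # Arnoldi process: orthonormal basis V and Hessenberg matrix H
        V = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        V[:, 0] = r / beta
        for j in range(m):
            w = A @ V[:, j]
            for i in range(j + 1):      # modified Gram-Schmidt step
                H[i, j] = V[:, i] @ w
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:     # "happy breakdown": exact solution found
                m_eff = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        else:
            m_eff = m
        # small least-squares problem: min || beta*e1 - H y ||
        e1 = np.zeros(m_eff + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:m_eff + 1, :m_eff], e1, rcond=None)
        x = x + V[:, :m_eff] @ y        # update the iterate, then restart
    return x
```

Here the small least-squares problem on the Hessenberg matrix is solved directly with `numpy.linalg.lstsq` for clarity; production codes such as SPARSKIT instead update a QR factorization with Givens rotations at each Arnoldi step, which also yields the residual norm at no extra cost.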
The GMRES algorithm and a few of its variants, together with some other popular Krylov-type iterative solvers, have been implemented in several well-known packages, such as SPARSKIT (Saad 1994), Templates (Barrett et al. 1994), and Aztec (Tuminaro et al. 1999). We have adapted most of the SPARSKIT package to our recently rebuilt BEM extrapolation code. Considerable effort has been made in our new code to hide as many technical details as possible from the users, so that they only need to configure a few external parameter files for their own problems, avoiding the potential dangers of modifying a huge source code or providing inconsistent inputs.
Table 1: CPU timing information for assembling the BEM matrices, for solving the linear systems of equations with the Gauss solver, the LU solver, or the GMRES(50) solver with various settings of the relative residual bound ε, and for extrapolating the fields for 15 layers (CPU time in seconds).
Now, we take a square region with L_{x}=L_{y}=5 on the XOY plane, acting as the magnetogram region, on which the "magnetogram data'' are supplied by the analytical force-free magnetic field. The scheme is first to discretize the region with a uniform grid; then the BEM is applied to compute the boundary normal derivative of the magnetic field on the grid; finally, the extrapolation is conducted on several levels of height, with a vertical step z_{h}=0.2. In the intermediate step of computing the boundary normal derivative by the BEM, one normally needs to discretize the region only once, with a properly determined resolution, resulting in a linear system with a 3D right-hand side. Here, in order to investigate the asymptotic convergence behaviour of the GMRES(m) solver, we take 10 grids of increasing resolution, i.e. grids of m_{k} x n_{k} elements, where m_{k}=33+6(k-1), n_{k}=m_{k}-2, k=1, ..., 10. Thus, we obtain 10 linear systems with 3D right-hand sides, whose sizes N_{k}=m_{k}n_{k} range from 1023 to 7395. Note that these sizes are designed to increase in a "near'' linear manner, which is desirable for a proper representation of the results.
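The ten problem sizes follow directly from the grid formula above; a hypothetical helper reproducing them:

```python
# Reproduce the ten test-problem sizes: grids of m_k x n_k elements with
# m_k = 33 + 6(k-1), n_k = m_k - 2, giving linear systems of order N_k = m_k * n_k.
def grid_sizes(num_grids=10):
    sizes = []
    for k in range(1, num_grids + 1):
        m_k = 33 + 6 * (k - 1)
        n_k = m_k - 2
        sizes.append(m_k * n_k)
    return sizes

print(grid_sizes())  # smallest size 1023, largest 7395
```

The fourth and eighth sizes, 2499 and 5475, are the ones singled out later in the discussion of the solver crossover point and of the accuracy experiment.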
Figure 1: a) Performance of the Gauss solver, the LU solver, and the GMRES(50) solver with the relative residual bound ε = 10^{-16}, 10^{-6}, or 10^{-2}; dotted lines are extended from the actual timing data for prediction purposes; diamonds indicate the extra experiments used to confirm and correct the prediction lines; CPU times for assembling the BEM matrices and extrapolating the fields for 15 layers are also plotted for reference. b) Impact of the setting of ε on the accuracy of the field extrapolated with the GMRES(50) solver. c) Impact of the Krylov dimension m on the convergence behaviour of the GMRES(m) solver for ε = 10^{-6}. d) Impact of the Krylov dimension m on the convergence behaviour of the GMRES(m) solver for ε = 10^{-2}.
One may observe in Table 1 that the CPU time for the Gauss solver increases very quickly, from about 12.5 s to about 17 h, when N increases from 1023 to 7395, and that it almost always dominates the assembling and extrapolation times (by a large margin for the bigger sizes). One can also see that the ratio of the CPU time taken by the LU solver to that taken by the Gauss solver is considerably reduced when N increases from 1023 to 7395, i.e. by a factor ranging from 9 to 100! Still, we have verified that both solvers are O(N^{3}) procedures. The large disparity in CPU time between the two direct solvers should be due to the fact that the Gauss solver we tested implements Gaussian elimination with full pivoting, while the LU solver we tested implements Crout's method with partial pivoting and is heavily optimized for all target processor architectures, including the one we used. The fact that both direct solvers provide solutions of almost the same accuracy makes it a pity that the slowest solver has been in use for so many years. We note that the Gauss solver with full pivoting was chosen during the design of the original BEM extrapolation code because it was believed to be more accurate (see Liu et al. 1980), but no systematic numerical tests were conducted then to reveal the asymptotic performance of the two kinds of direct solvers applied to the BEM extrapolation code, since it was not intended for large-scale problems at that time.
Next, we see in Table 1 and Fig. 1a that the GMRES(50) solver with the relative residual bound ε = 10^{-16} becomes faster than the LU solver when N is greater than 2499, although the relative difference remains small over the range of N tested here. As the iterative solver is an O(N^{2}) procedure, it may have advantages over the LU solver for even bigger problems. However, in both cases, the CPU time needed to solve the linear systems and extrapolate the fields becomes relatively dominant compared to the assembly time, so that the differences will not be very noticeable.
By relaxing ε to 10^{-6} (or 10^{-2}), the GMRES(50) solver is accelerated by a factor ranging from 3 (or 15) to 4 (or 25) when N increases from 1023 to 7395, relative to the GMRES(50) solver with ε = 10^{-16}. We see that the extrapolation stage becomes relatively dominant in the BEM extrapolation if the GMRES(50) solver with ε = 10^{-6} or 10^{-2} is used for solving the BEM systems. In addition, we observe that the CPU time needed by the assembling stage is only slightly less than that needed by the GMRES(50) solver with ε = 10^{-16}.
The above timing data are further illustrated in Fig. 1a. Note that we have taken the natural logarithm of the timing data to represent them properly. One can observe intersections between the performance curves of the LU solver and the GMRES(50) solver with ε = 10^{-6} or 10^{-16} around N = 1023 or N = 2499, which demonstrates that investigating the asymptotic performance behaviour of these solvers is indeed necessary for acquiring a more complete knowledge of their performance. Since the trends of these performance curves are rather regular in the current coordinates, we can extrapolate them to predict the solution time of the corresponding solvers for problems with more degrees of freedom (see also Bucher et al. 2002). We estimate that, for a problem with N = 20 000 degrees of freedom, the Gauss solver requires about 22.9 days to obtain the solution, and the LU solver needs about 3.36 h for the same task, while the GMRES(50) solver can deliver its solution in about 50 min for ε = 10^{-16}, about 11 min for ε = 10^{-6}, or less than 2 min for ε = 10^{-2}. To verify our prediction, we have executed an additional experiment by solving a problem with N = 12 321 degrees of freedom with the Gauss solver and the GMRES(50) solver with ε = 10^{-6} or 10^{-2} respectively, as indicated by the diamond symbols in Fig. 1a.
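The prediction procedure amounts to fitting a straight line to log(CPU time) against log(N) and evaluating it at a larger N. The sketch below illustrates this on synthetic cubic-cost timings (the constant 2.5e-8 and the timing values are assumptions of this sketch, not the measured data of Table 1):

```python
import numpy as np

# Synthetic timings following an exact cubic-cost model t = c * N^3.
N = np.array([1023.0, 2499.0, 5475.0, 7395.0])
t = 2.5e-8 * N**3

# Straight-line fit in log-log coordinates: log t = intercept + slope * log N.
slope, intercept = np.polyfit(np.log(N), np.log(t), 1)

# Evaluate the fitted line at a larger problem size to predict its cost.
predicted = np.exp(intercept + slope * np.log(20000.0))
print(round(slope, 2))  # the fit recovers the exponent 3.0
```

For an O(N^{2}) iterative solver the same fit would recover a slope near 2, which is why the curves of the direct and iterative solvers must eventually cross in Fig. 1a.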
Table 2: Iteration counts of the GMRES(m) solver with m = 50 and various settings of the relative residual bound (ε = 10^{-16}, 10^{-6}, 10^{-2}).
The above timing data are, of course, tied to the machine architecture and compiler options, but the relative relations among these data should be less dependent on the computational environment. To this end, we have collected in Table 2 the iteration counts of the GMRES(50) solver (which are independent of the computing environment) for the three values of ε considered. Note that we have represented the iteration information with 3-component vectors, since each of the linear systems has three right-hand sides. We see that, for each value of ε and each problem size, the iteration counts for the second and third right-hand sides are close to (or the same as) each other, and greater than those for the first right-hand side. This is understandable, since there is a symmetry relation between the latter two components of the boundary data (see (14) and (15)), while the first component is rather different from the other two. In addition, we point out that, for ε = 10^{-16} or 10^{-6}, the number of iteration steps for each right-hand side increases with N: the larger N is, the slower the convergence rate of the GMRES(50) solver. On the contrary, for ε = 10^{-2} the number of iteration steps for each right-hand side remains constant, or even decreases, when N increases. Finally, we verified, for each problem size, that the ratios of the CPU times of the three variants of the iterative solver in Table 1 are nearly proportional to the corresponding ratios of iteration counts from Table 2.
The above analysis shows that the GMRES(50) solver holds great promise in efficiency compared with the Gauss solver or the LU solver, especially when the relative residual bound ε is relaxed to 10^{-2}. One may ask, however, whether it is reasonable to increase ε to a value as large as 10^{-2}. To answer this question, we fix m = 50, discretize the problem with a grid of 75 x 73 elements, and solve the resulting linear system of size 5475 by the GMRES(m) solver with five values of ε between 10^{-16} and 10^{-2}. Thus, we obtain five sets of 3D boundary derivatives of the magnetic field of differing accuracies, resulting in five sets of numerical 3D fields of differing accuracies after the extrapolation process.
Note that all five sets of numerical 3D fields are computed on the same cubic grid, so that we can use the method in Amari et al. (1998) to measure the accuracy of these computed fields against the analytical one. More precisely, one computes the Euclidean relative error of the extrapolated magnetic field B versus the analytical magnetic field B_{ref} on each extrapolation level (or height) by the formula (cf. Amari et al. 1998):

E = ||B - B_{ref}|| / ||B_{ref}||,

where ||.|| denotes the Euclidean norm taken over all grid points of the level.
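A direct implementation of this layer-wise error metric might look as follows; the array layout (height level, y, x, field component) is an assumption of this sketch:

```python
import numpy as np

def layer_relative_error(B_num, B_ref):
    """Euclidean relative error of an extrapolated field against a reference
    field, evaluated separately on each height level, in the spirit of the
    Amari et al. (1998) metric.  Arrays have shape (nz, ny, nx, 3)."""
    nz = B_ref.shape[0]
    errs = []
    for h in range(nz):  # one error value per extrapolation level
        diff = np.linalg.norm(B_num[h] - B_ref[h])
        errs.append(diff / np.linalg.norm(B_ref[h]))
    return np.array(errs)
```

Plotting the returned vector against height reproduces the kind of accuracy curves shown in Fig. 1b: the error typically grows with height, which is why the accuracy is said to remain satisfactory only "within some height".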
Now, let us consider the choice of the Krylov dimension m of the GMRES(m) solver. The scheme is to solve each of the 10 linear systems with 3D right-hand sides above by the GMRES(m) solver, taking m = 5, 10, 20, 30, 40, 50, 100, and 500 respectively, while fixing the relative residual bound ε to 10^{-6} or 10^{-2}. Figure 1c shows the impact of the value of m on the convergence speed of the GMRES(m) solver for ε = 10^{-6}. We see that the GMRES(m) solver is accelerated by a factor of 2 or 3 when m is increased from 5 to 20 and upwards, while the speed-up becomes less significant (or even null) when m is greater than 20 (or 100). It appears to us that m = 50 can be an optimal choice of the Krylov dimension for ε = 10^{-6}, since it does not require much more workspace than m = 100 would, and may save more time than m = 20 when N becomes even bigger. The situation changes somewhat when ε is increased to 10^{-2}, as shown in Fig. 1d (note that we have zoomed in 10 times in Fig. 1d compared to Fig. 1c for a proper representation of the results). We see that the convergence speed of the GMRES(m) solver hardly changes when m is set to 10 and upwards. In addition, we observe some subtle phenomena of mathematical interest when further zooming in on the figure. For example, for all N, the setting m = 500 is even slightly worse than m = 10 with respect to the convergence speed of the GMRES(m) solver for ε = 10^{-2}. However, a detailed analysis of these phenomena is beyond the scope of this paper; similar phenomena for the DGMRES algorithm have recently been reported in Zhou & Wei (2006). Generally speaking, we may take m around 20, 30, 40, or 50 as a proper choice of the Krylov dimension for both ε = 10^{-6} and ε = 10^{-2}. Our preference is m = 50, since a relatively bigger m makes the iterative solver more stable, and it appears fairer to set m = 50 when comparing the convergence speeds of GMRES(m, 10^{-6}) and GMRES(m, 10^{-2}).
We have successfully applied the GMRES(m) method to the BEM extrapolation of the solar force-free magnetic field on a well-known test problem, and investigated the optimal choice of the Krylov dimension m and the relative residual bound ε. By studying the asymptotic convergence behaviour of the GMRES(m) solver applied to the BEM systems of linear equations, we have shown that the GMRES(m) solver, with properly determined m and ε, is much more efficient than the Gauss solver or the LU solver when the size of the problem becomes very large, while the accuracy of the extrapolated field remains satisfactory up to a certain height. Thus, the BEM extrapolation has taken a significant step towards the routine task of extrapolating daily observed photospheric magnetogram data. In the future, we will apply the GMRES(m) algorithm to the non-linear case of the BEM extrapolation proposed by Yan & Sakurai (2000). Finally, we note that it is possible to apply the GMRES(m) algorithm to other field extrapolation methods involving large-scale linear systems of equations, such as the method for "a better linear force-free field'' proposed by Wheatland (1999) or the calculation method for the non-linear force-free field proposed by Sakurai (1981).
Despite the encouraging progress in efficiency, there are still problems to be dealt with when applying the rebuilt BEM extrapolation code to real data. (i) The BEM matrix requires a considerable amount of memory when N becomes very large (see, e.g., Wheatland 1999; Sakurai 1981 for similar problems). For example, for a high-resolution magnetogram (e.g. 1024 x 1024 pixels), that amount could be as large as 8192 GB, which would be challenging even for a supercomputer. Fortunately, this problem has been widely recognized in many research fields, and some promising methods, such as multipole expansions (Greengard & Rokhlin 1987), panel clustering (Hackbusch & Nowak 1989), and wavelet compression (Beylkin et al. 1991), have been proposed. We have already begun to study the adaptation of the wavelet approach to the solution of integral equations (Li et al. 2005) and will apply it to tackle the storage problem in the BEM extrapolation in a forthcoming paper (Li et al., in preparation). (ii) Since two-dimensional numerical quadrature is involved both in generating the BEM matrix and in computing the magnetic field above the boundary, the computational cost of the quadrature can be considerable when high spatial resolution is desired. It is therefore necessary to seek more effective quadrature methods, and we have also started to work on this aspect. (iii) The convergence behaviour of the GMRES(m) method applied to the BEM extrapolation with real magnetogram data remains to be investigated. It can be expected that the noise in the observed data, the distribution of α, and the size and resolution of the magnetogram region may affect the convergence behaviour of the GMRES(m) method, so that slow convergence may be encountered. In that case, we suggest trying a bigger m, but ε should not be bigger than 10^{-2}, according to the numerical experiments done in this paper.
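The memory estimate in point (i) is simple arithmetic: the dense BEM matrix has N^{2} double-precision entries at 8 bytes each. A minimal sketch, assuming one unknown per magnetogram pixel:

```python
# Storage cost of a dense N x N matrix of 8-byte (double precision) entries,
# expressed in GiB (2^30 bytes).  One unknown per magnetogram pixel is assumed.
def dense_matrix_gib(n_unknowns):
    return 8 * n_unknowns**2 / 2**30

print(dense_matrix_gib(1024 * 1024))  # -> 8192.0 GiB for a 1024 x 1024 magnetogram
```

By contrast, the largest test problem of this paper (N = 7395) fits in well under 1 GiB, which is why the storage issue only becomes pressing for real high-resolution magnetograms.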
In addition, since we have implemented most of SPARSKIT (which contains, among other things, 10 iterative solvers) in our rebuilt BEM extrapolation code, the users of our code can consider using other SPARSKIT solvers, such as DQGMRES, BCG, etc., in case the GMRES(m) solver converges slowly for the specific problem under consideration, or in order to check the accuracy by comparing results provided by different kinds of (fast) solvers. Furthermore, we have implemented our new code in a modular way, so that the users usually need not modify the source code but simply change parameters in input files to switch algorithms or set parameters, and may adopt new standard packages if more advanced methods, such as preconditioning, have to be considered. Alternatively, the users may just use the LU solver from LAPACK for comparison purposes on small problems. Finally, we note that further numerical experiments will be done in Li (in preparation) to compare the convergence behaviour of some other iterative solvers in SPARSKIT in the context of the BEM extrapolation.
Acknowledgements
The authors are deeply indebted to the anonymous referee for her/his critical and constructive comments on the first version of the paper. This work was supported by NSFC (10225313), MOST (G2000078403), and CAS grants. One of the authors (Yiwei Li) is also supported by Taiyuan University of Science and Technology, Taiyuan, China, while another (R. Langlet) was supported by the region of Franche-Comté, France. Professors Qinshao Zhang, Gening Xu, Zhiqing Li, Xingjian Yang, Xiyun Wang and Junlin Li of TYUST are thanked for their persistent encouragement and support to Y. Li. Professor Jingxiu Wang (NAOC) is thanked for a general discussion on the solar magnetic fields and for providing some useful references at the beginning of this work. Professor Huaning Wang (NAOC), Professor Xiangchu Feng (XDU) and Dr. Zhuheng Li (XDU) are thanked for some useful discussions. Drs. Chijie Xiao, Jiangtao Su, Guiping Zhou, Chengming Tan, and Yuzong Zhang of NAOC are thanked for their warm help and concern on this work. Professor Qijun Fu, Dr. Shujuan Wang, Sr. Engineers Yuying Liu, Zhijun Chen, Huirong Ji and Guoshu Ji, Ms. Hongxia Nie, and Ms. Hongrong Du are thanked for their warm help to Y. Li during his stay at NAOC.