A&A 380, 776788 (2001)
DOI: 10.1051/00046361:20011411
S. Richling^{1}  E. Meinköhn^{1,2}  N. Kryzhevoi^{1,3}  G. Kanschat^{2,3}
1  Institut für Theoretische Astrophysik,
Universität Heidelberg,
Tiergartenstr. 15,
69121 Heidelberg, Germany
2 
Institut für Angewandte Mathematik,
Universität Heidelberg,
Im Neuenheimer Feld 294,
69120 Heidelberg, Germany
3 
Interdisziplinäres Zentrum für Wissenschaftliches Rechnen,
Universität Heidelberg,
Im Neuenheimer Feld 368,
69120 Heidelberg, Germany
Received 16 May 2001 / Accepted 4 October 2001
Abstract
A finite element method for solving the monochromatic
radiation transfer equation including scattering in three dimensions
is presented. The algorithm employs unstructured grids which are
adaptively refined. Adaptivity as well as ordinate parallelization
reduce memory requirements and execution time and make it possible to
calculate the radiation field across several length scales for objects
with strong opacity gradients. An a posteriori error estimate for one
particular quantity is obtained by solving the dual problem. The
application to a sample of test problems reveals the properties of the
implementation.
Key words: radiative transfer  scattering  methods: numerical
Radiative transfer calculations are an important tool for the interpretation of spectra and images of astrophysical objects. A threedimensional treatment is required for objects with complex geometry and kinematics. Intruding examples are molecular clouds consisting of clumpy filaments (Wiseman & Ho 1998), circumstellar disks and jets accompanying the formation of multiple stellar systems (Zinnecker et al. 1999), planetary nebulae with precessing bipolar outflows (Miranda et al. 1999), quasars with infrared spectra indicating a clumpy nature of the torus (Haas et al. 2000), and highredshift radio galaxies with jetinduced star forming regions (Bicknell et al. 2000).
Threedimensional radiative transfer codes frequently employ the Monte Carlo method. This simple concept of following the path of photon packages works in principle for any problem. But Monte Carlo calculations are very timeconsuming, especially in three dimensions, because the error of the results decreases very slowly. When N is the number of simulated photons, the rate of convergence is in the worst case. Nevertheless, rates of convergence or even better can be obtained (e.g. Haber 1970). Monte Carlo methods are implemented in codes which consider scattering of photons by dust in different environments (e.g. Witt & Gordon 1996; Wolf et al. 1998; Városi & Dwek 1999; Wood & Reynolds 1999) and in codes which calculate line spectra in clumpy molecular clouds (Juvela 1998; Park & Hong 1998). However, the line transfer problem in three dimensions is also widely treated with the method of short characteristics (Väth 1994; Papkalla 1995; Fabiani Benedicho et al. 1997; Folini 1998). Alternatively, we will introduce a finite element method for the solution of the radiative transfer equation. Finite element methods are still very infrequently used for the modeling of astrophysical systems. However, Meier (1999) proposes a finite element code suitable for a wide range of astrophysically interesting differential equations. Radiative transfer calculations employing finite element methods were already performed by Dykema et al. (1996). They solve the zeroth and first moments of the transfer equation in two dimensions and use a finite element technique on fixed quadrilateral meshes for the computation of the Eddington tensor. In contrast, we solve the radiative transfer equation directly in three dimensions for discrete ordinates on a hexahedral mesh which is adaptively refined. Therefore, it is possible to calculate the radiation field across several length scales even for objects with strong opacity gradients. In addition, we apply ordinate parallelization for the integral operator via the Message Passing Interface (MPI) which permits the use of computer clusters.
This work provides an extensive description of the basic monochromatic radiative transfer code originally developed by Kanschat (1996) and of some tests which reveal the main properties of the finite element implementation. The series will be continued with the presentation and application of extensions to frequencycoupled radiative transfer: a line transfer version which considers the kinematics of the material and complete frequency redistribution and a continuum transfer version which utilizes the energy balance equation to determine selfconsistently a temperature distribution. The applications will focus on the radiation fields of highredshift galaxies.
In Sect. 2, we describe the application of the finite element method to the monochromatic radiative transfer equation and the details of our implementation, i.e. discretization, grid adaptation, structure and solution of the resulting matrix, and parallelization strategy. In Sects. 3, 4 and 5 we present results for three test problems: the searchlight test problem, the angular distribution of the specific intensity escaping a planeparallel layer, and the propagation of radiation emitted from an extended region embedded in a scattering halo. We summarize our results in Sect. 6.
(2) 
The left hand side of Eq. (1) will be abbreviated as an
operator
applied to the intensity function
,
yielding the
very compact operator form of the radiative transfer equation
Equation (1) is analyzed in Dautray & Lions (1993) and
the natural space for finding solutions is
Applying standard Galerkin finite elements to solve (5), we
choose a finite dimensional subspace W_{h} of W consisting of
functions that are piecewise polynomials with respect to a subdivision
or triangulation
of
.
The mesh size h is the
piecewise constant function defined on each triangulation cell K by
.
The discrete analogue of (5) is
finding
,
such that
On S^{2} we use fixed discretizations based on a refined icosahedron as
illustrated in Fig. 1. Quadrature points are the cell centers
projected on S^{2}. This projection method guarantees equally
distributed quadrature points, each associated with a solid angle
of the same size. In comparison to other numerical quadrature
methods, additional symmetry conditions are redundant and
discretization artifacts at the poles are avoided. Furthermore, we use
piecewise constant trial functions. That way, the sevendimensional
integration of the scattering term in the weak formulation
(4) can be calculated very efficiently, since integration
weights are not necessary. For example, the integration of the
intensity over the whole unit sphere is simply replaced by a sum
divided by the number of discrete ordinates M
For the space domain we use locally refined hexahedral meshes. The mesh size with respect to the space variable will be denoted by h in the course of this publication. Since the boundaries are arbitrary for our astrophysical application, we can choose a unit cube for and do not have to worry about boundary approximation. We use continuous piecewise trilinear trial functions in space.
Figure 1: Refined icosahedron with 80 and 320 cells. The cell centers projected on a unit sphere are equally distributed.  
Open with DEXTER 
In addition, computational goals in astrophysics are often more specific. The result of a simulation is to be compared with observations to develop a physical model for an object. For instance, in the case of a distant unresolved object, the total flux leaving the domain in one particular direction is of interest. Generally, a measured quantity like the total flux can be expressed as a linear functional . By linearity, the error of the measured quantity is , where . In the following, we will show, that it is possible to obtain an a posteriori estimate for , even if the exact solution is unknown.
Suppose,
is the solution of the dual problem
(10) 
A first approach in the development of strategies for grid refinement
based on a posteriori error estimates is the control of the global
energy or L^{2}error involving only local residuals of the computed
solution (Führer & Kanschat 1997). Formally, using the
functional
in the dual problem
(8), we obtain
for the left hand side of (9). Estimating the
right hand side of (9) yields
The grid refinement process on the basis of an a posteriori error
estimate is organized in the following way: Suppose that some error
tolerance TOL is given. The aim is to find the most economical grid
on which
(13) 
(14) 
refinement step  in %  in %  N  
0  0.0457  35  1.000  0.37  289  289 
1  0.0805  39  1.076  0.32  631  1089 
2  0.0833  38  1.104  1.7  1395  4225 
3  0.0963  34  1.109  7.6  2995  16641 
4  0.1018  36  1.171  14  6312  66049 
5  0.0833  36  1.106  24  13084  263169 
6  0.0833  34  1.080  24  28005  1050625 
7  0.0929  32  1.095  25  57606  4198401 
8  0.0934  29  1.093  23  118179  16785409 
The linear system of equations resulting from the discretization described above is large (10^{7} unknowns at least), sparse, and strongly coupled due to the integral operator.
The standard algorithm used in astrophysics to solve these systems is the socalled iteration. Considering the whole discrete system, it is a Richardson method with nearly blockJacobipreconditioning. Using a full Jacobi preconditioner is a first step to better convergence rates (Turek 1993). Since the transport operator is inverted explicitly, these methods converge very fast for transport dominated problems. Exploiting the triangular matrix structure of upwinddiscretizations, the inversion is indeed very cheap (one matrixvectormultiplication).
Unfortunately, in the interesting case of scattering dominance this method  like other stationary iterations  breaks down, since the condition number of the iteration matrix becomes very large. Since the convergence rate of preconditioned Richardson iteration methods only depends on the condition number, these are not suited for the scattering dominated case.
The eigenvalue distribution of the scattering matrix is clustered, where usually one of them is zero and the others are close to unity or at least bounded away from zero. We exploit this structure by applying a Krylov space method like GMRES or biCGSTAB (see e.g. Sleijpen & Fokkema 1993 and references therein) to the solution. Unfortunately, the scattering operator is perturbed by the transport operator. As a differential operator, it requires preconditioning by multilevel methods to obtain convergence independent of the mesh size.
Transport dominated problems differ in one specific point from elliptic problems: there is a distinct direction of information flow. This has to be considered in the development of parallelization strategies. While a domain decomposition for Poisson's equation should minimize the length of interior edges, this does not yield an efficient method for transport equations.
In Kanschat (2000), it was concluded that parallelization strategies for transport equations should not divide the domain across the transport direction. The solution of the radiative transfer equation consists of a bunch of transport inversions for different directions. The construction of an efficient domain decomposition method would require a direction dependent splitting of the domain . Since this causes immense implementation problems, we decided to use ordinate parallelization.
This strategy distributes the ordinate space S^{2} of the radiative transfer equation. Since we use discontinuous shape functions for the ordinate variable and there is no local coupling due to the integral operator, this results in a true nonoverlapping parallelization. Clearly, it has disadvantages too: As the integral is a global operator, ordinate parallelization involves global communication. Additionally, the resolution in space is restricted by the per node memory and not by the total memory of the machine.
Figure 2: Test I: intensity distribution (top) and structure of the grid (bottom) for refinement steps 2, 4 and 8. Each dot marks the location of a vertex. The incoming beam is introduced at x=1.0 between y=[0.875,0.750].  
Open with DEXTER 
Figure 3: Test I: intensity distribution at y=0.25 for different adaptive steps as indicated. The right hand side gives a more closer view to the range x=[0.55,0.45]. In addition, the crosses mark the location of the vertices.  
Open with DEXTER 
The searchlight beam test is a standard test for the transport operator: a narrow beam of radiation is introduced into the computational grid at a certain angle. The beam should cross a vacuum ( ) without dispersion.
For this test we started with a twodimensional, globally prerefined grid with cells which covers the domain x=[1,1] and y=[1,1]. This grid consists of vertices and has a resolution of . The beam of radiation with intensity was launched at x=1 between y=[0.875,0.750] at an angle of approximately . We calculated the radiation field for the basic grid and 8 successively refined grids, where in each step 25% of the cells with the worst L^{2}indicator (see Sect. 2.3) were refined.
Figure 2 shows the intensity distribution and the grid structure for step 2, 4 and 8. The minimal cell size of the grid at step 2 corresponds to an equidistant grid with cells. The dispersion of the beam at this step is qualitatively comparable to the results of the short characteristic method by Kunasz & Auer (1988). The following refinement steps resolve at first the beam and later mainly the fringes of the beam which is most clearly visible near the entrance of the beam into the grid. At step 8 the intensity distribution shows a straight beam with negligible dispersion. This result is obtained with about 10^{5} vertices which is less than a hundredth of the vertices required for an equidistant grid with the same resolution (see last two rows of Table 1). The functioning of the code is demonstrated in detail in Fig. 3. Here the intensity distribution at y=0.25is shown for all refinement steps. The right hand side of this figure magnifies the left fringe of the beam. The location of the vertices are indicated by crosses. As already expected from the grid structure, the first steps begin to resolve the beam itself and the following steps try to reproduce the intensity distribution at the sharp edges of the beam. Since the streamline diffusion discretization is a second order method, we encounter the problem of negative intensities and overshooting. Table 1 shows this behavior more quantitatively. For all steps the deviation of the minimum intensity and the maximum intensity from the analytical solution is about 10% of . With increasing resolution the fraction of vertices with negative intensities is slightly declining from 40% to 30%, whereas the fraction of vertices with intensity values greater than unity is continuously increasing to a fraction of 25%. The minimum intensities are somewhat better than those obtained with the parabolic upwind interpolation scheme by Kunasz & Auer (1988), who find for equally spaced grids as well as for logarithmic grids. Unsurprisingly, their relative number of cells with negative intensities is smaller, because the finite element code just favors the generation of finer cells in regions of discontinuities where negative intensity values occur. Negative intensity values cause problems in cases where even the mean intensity J turns out to be negative and therefore prevent the determination of important physical quantities. A proper treatment of these cases, instead of simply adopting as a lower limit, would require expensive code development involving the tracing of discontinuities in order to switch between second order and first order methods. Refer to shock capturing methods described in the literature, e.g. Eriksson & Johnson (1993).
Figure 4:
Test II: angular distribution of the intensity for different
total optical depth
and albedo . 

Open with DEXTER 
FE  FD  
DUAL  L2  GLOBAL  GLOBAL  
vertices  value  vertices  value  vertices  value  vertices  value  
125  0.4274  125  0.4274  125  0.4274  35937  0.5760  
501  0.5960  579  0.5250  729  0.5912  274625  0.6162  
2213  0.6609  2474  0.6397  4913  0.6420  2146689  0.6417  
8103  0.6723  10935  0.6616  35937  0.6612  
14953  0.6740  86903  0.6687  274625  0.6686 
FE  FD  FE  FD  
processors  1  2  4  8  16  1  1  2  4  8  16  1  
memory [MB]  31.9  16.7  9.0  5.2  3.3  13  196  99.6  52.2  28.7  15.1  99  
time [s]  846.5  371.8  121.4  56.2  48.2  1275.0  3167.2  1364.2  428.7  207.1  136.1  9878.9 
In order to test the scattering operator we calculate the radiation field of an infinitely extended planeparallel layer. In this case, we compare the results with those of an analytical method, the separable representation (SR) method, and a finite difference (FD) code. Both methods are briefly described in the appendix.
The test starts with a homogeneous planeparallel layer with thickness . We assume zero boundary conditions and a constant source term with extinction coefficient and albedo . We calculate the radiation field for optical depth and albedo and investigate the angular dependence of the specific intensity escaping the planeparallel layer.
For the calculations with the FE code we approximate the infinitely extended layer with a threedimensional anisotropic grid with initially 4^{3} cells which have an aspect ratio of 100:100:1. For each optical depth and albedo we perform up to a maximum of seven refinement steps and tap the outgoing intensity at the middle of the surface of the layer. We use local refinement criterions taken from (11) for a functional , i.e. only the intensity leaving the domain from one point in one particular direction is of interest. An aspect ratio of 100:100:1 means we get reliable results for , where is the anglecosine of ray direction relative to the vertical.
The FD code is also a threedimensional code (Appendix B). Here, we use a numerical grid with 32^{3} cells with the same aspect ratio of 100:100:1. The iterative procedure for the mean intensity Jis carried out until the relative change of J is smaller than 10^{5}.
The SR method (Appendix A) enables us to obtain the solution in terms of a function which is an infinite series. An approximation of this series by means of a finite sum is possible. A very good agreement of the approximated function with the exact one can already be obtained in the approximation with N=5 for a wide range of parameters, where N is the number of terms in the finite sum. For small the exact and the approximated function show a different behavior which causes wrong results in this range. But the higher the order of the approximation N the better are the results for small . We calculated the specific intensity for all cases with N=10. Figure 4 shows the angular distribution of the specific intensity escaping from a planeparallel layer obtained with the three methods. In general, the three codes yield the same results. Excellent agreement is obtained between the SR method and the FE method. The FE code needed 2 to 3 refinement steps for low optical depth and 5 refinement steps for to match the solution of the SR method very well. The resolution of refinement step 5 corresponds to a globally refined grid with 128^{3} cells. The FD code is not able to reproduce the results of the other methods for high optical depth and large albedo. In these cases the results of the FD code are too low. The results can be improved by using a grid with higher spatial resolution, but even a grid with 128^{3} cells fails to produce equally good results as the FE code. In order to accomplish the results of the SR method, the required accuracy of J and the number of ordinates M play a minor role. Since the FD code only uses a first order discretization, the very good results of the FE code must be attributed to the second order streamline diffusion discretization.
The general form of the intensity distributions displayed in
Fig. 4 becomes clear when considering the change of
along the distance
(15) 
(16) 
(17) 
(18) 
We use this test problem to point out the advantage of employing an adaptive grid refinement process as described in Sect. 2.3 compared with the commonly used structured grids. Therefore, we calculate the specific intensity in a particular direction on a variety of grids. The optical depth and albedo are 20 and 0.80, respectively. Table 2 lists the number of vertices and the computed value of the specific intensity escaping the layer from the point . The FE method uses grids which are generated by using local refinement criterions as described in Sect. 2.3 (DUAL, L2) and a global refinement criterion where all cells are refined in every step (GLOBAL). L2 indicates the control of the L^{2}error and DUAL employs the functional for a point value as described at the beginning of this section. The FD code is restricted to the use of structured grids only. First, we discuss the quality of the L2 and the DUAL grid compared with the quality of the structured grid used by the FE code. The quality of a grid is related to the question of how economic it is. This is defined by the number of vertices needed to produce an accurate solution. Table 2 shows that the L2 refinement strategy already improves the quality of the generated grid by a factor of 3 to 4. A dramatic enhancement is obtained by applying the DUAL refinement process, since approximately 20 times less vertices are needed to produce a similar solution as in the GLOBAL case. The DUAL local refinement criterions taken from (11) generate a grid that is much more appropriate for our test problem. Due to the local weights the indicators forces the refinement of the cells close to the boundary point where the outgoing intensity is measured. Finally, we compare the solution of the FE code (for the GLOBAL grid) with the solution of the FD method in Table 2. To compute a comparable solution, the FD code requires 129^{3} vertices, whereas for the FE method 17^{3} are sufficient. Again, this reflects the higher order of the FE streamline diffusion discretization.
0.1  1.0  10.0  
M  20  80  320  1280  20  80  320  1280  20  80  320  1280  
step 0  9.48  9.48  9.48  9.48  9.48  9.48  9.48  9.48  9.48  9.48  9.48  9.48  
step 1  8.04  8.04  8.04  8.04  8.04  8.04  8.04  8.04  8.04  8.04  8.04  8.04  
step 2  8.19  8.19  8.04    8.19  8.22  8.04    8.19  8.19  8.19    
step 3  8.19  8.19      8.19  8.19      8.20  8.20      
step 4  8.20        8.20        8.18  8.20     
Figure 5: Test III: dependence of the mean intensity J on the distance from the center r for different and M.  
Open with DEXTER 
The setup for the last test problem is as follows: The 3D
computational domain (x=[1,1], y=[1,1], z=[1,1]) contains a
spherically symmetric cloud with radius
which is illuminated
by an extended emission region with radius
.
The source
term is given by
(19) 
(20) 
We designed this test problem as a very simple model for Lyman halos associated with highredshift galaxies. Lyman halos are diffuse and clumpy emission line regions which are more than ten times larger than the corresponding radio emission (van Ojik et al. 1996,1997; Steidel et al. 2000). The Lyman line profiles suggest a complex velocity field of the material surrounding starforming galaxies. Hereafter, we intend to model Lymanhalos with the line transfer version of the finite element code. This test problem will help us to determine the demands on spatial resolution and the required number of ordinates.
Since the Lyman
line is a resonance line, we assume
everywhere. In this case, the zeroth moment equation
of the radiative transfer equation reduces to
(22) 
The numerical calculations start with a prerefined grid with 16^{3}cells. Thus, the size of the cells at step zero is equal to . The grid refinement process is based on the global L^{2}indicators taken from (12). We perform the calculations for and M=20,80,320,1280. Due to limited memory resources, the number of feasible refinement steps critically depends on M. In Table 4 we show for all calculations the values for F_{r}(1.0) averaged over all directions. In comparison to the analytical result, step zero overestimates and step one underestimates the radiative flux. With refinement step two the emission region is resolved sufficiently well and the relative error of the numerical value for the radiative flux is of order 10^{3}for all and M. This results proves the global fluxconserving property of the FE implementation. On the contrary, the FD code is not fluxconserving. The radiative flux obtained for the same problem with the FD code and a global spatial resolution comparable to step 2 is about 6% smaller than the analytical result.
Table 4 also shows, that refinement step 2 is not sufficient for calculations with and . In these cases, the error estimator prefers to refine cells in the outer regions of the cloud. Since we now know the minimal required resolution for the emission region, this effect can be avoided with a proper local prerefinement of the central region at step 0.
For the spherically symmetric test case, it is sufficient to use only 20 ordinates to get a very good result for the radiative flux. But the required number of ordinates is completely different, if one is interested in local quantities, for example the radiative flux from isolated regions for comparison with spatially resolved spectroscopic observations or the mean intensity in order to determine a temperature distribution. In Fig. 5 we show the distribution of the mean intensity J(r) at all vertices for step 2, except for M=1280, where we show step 1. Instead of a smooth distribution we find a large scatter of about three orders of magnitudes for small and M=20. The scatter is typical for discrete ordinates methods and decreases with higher optical depth and with increasing number of ordinates. At high optical depth it seems sufficient to use M=80, but at small optical depth one has to use at least M=320.
We have presented a finite element code for the solution of the monochromatic radiative transfer equation. The use of adaptively refined grids and ordinate parallelization enables us to compute the radiation fields of complex threedimensional configurations.
The searchlight beam test showed the capability of the code to resolve discontinuities within a few adaptive steps. An equidistant grid would require about 100 times more cells to achieve the same resolution. Due to the second order streamline discretization, negative intensities and overshooting occur in the vicinity of discontinuities.
The radiation field escaping from an infinite planeparallel layer was used as a test problem for the scattering operator. We compared the angular distribution of the intensity with the results of an analytical method and of a simple finite difference technique for various values of optical depth and albedo. The results of the finite element code are in close agreement with the analytical results. No matter how many ordinates are considered, the finite element code needs 25 adaptive steps to reach convergence. The finite difference code obtains fairly good results for low optical depth. But for high optical depth and high albedo the analytical results could only be reproduced with extraordinary high grid resolution. This reflects the lower order of the finite difference discretization. Due to efficient grid adaptation and the application of parallelization techniques, the finite element code can obtain more accurate results about 2070 times faster than the finite difference code. The total memory requirement of the two codes is of the same order of magnitude. The good performance of the finite element code is crucial for the planned extention to 3D frequencydependent radiative transfer problems.
Finally, we examined the propagation of radiation emitted from an extended object embedded in a scattering, spherically symmetric halo. This threedimensional problem is the basic setup for future line transfer calculations of Ly halos associated with highredshift galaxies. The results show that the total flux is already obtained after a few adaptive steps independent of the number of ordinates. But if one is interested in the spatial distribution of a quantity  for example the mean intensity  a high number of ordinates is required in order to obtain reliable results.
Acknowledgements
We thank Rainer Wehrse, Garii Efimov and Rolf Rannacher for their collaboration. This work was supported by the Deutsche Forschungsgemeinschaft (DFG) within the SFB 359 "Reaktive Strömungen, Diffusion und Transport'' and SFB 439 "Galaxien im jungen Universum''. The calculations were mainly performed on the computer clusters of the Institute for Applied Mathematics and the Interdisciplinary Center for Scientific Computing (IWR) in Heidelberg.
The separable representation method developed by Efimov et
al. (1995, 1997) allows to obtain an analytical solution
of the planeparallel radiative transfer equation
(A.1) 
(A.3) 
The finite difference code is based on an implicit discretization of the transfer equation, Eq. (1), in cartesian coordinates. The resulting linear system of equations is solved by a combination of recursion and iteration (Stenholm et al. 1991).
When
defines a specific direction and ,
and
are the spatial resolutions, the radiative
transfer equation is discretized in a first order upwind scheme
(B.1) 
(B.2) 
(B.3) 