A 3D radiative transfer framework

E. Baron; P. H. Hauschildt; B. Chen

doi:10.1051/0004-6361/200911681

Home

All issues

Volume 498 / No 3 (May II 2009)

A&A, 498 3 (2009) 987-992

Full HTML

Free Access

Issue		A&A Volume 498, Number 3, May II 2009


Page(s)		987 - 992
Section		Numerical methods and codes
DOI		https://doi.org/10.1051/0004-6361/200911681
Published online		19 March 2009

V. Homologous flows

E. Baron^1,2,3 - P. H. Hauschildt¹ - B. Chen²

1 - Hamburger Sternwarte, Gojenbergsweg 112, 21029 Hamburg, Germany
2 - Homer L. Dodge Department of Physics and Astronomy, University of Oklahoma, 440 W Brooks, Rm 100, Norman, OK 73019-2061 USA
3 - Computational Research Division, Lawrence Berkeley National Laboratory, MS 50F-1650, 1 Cyclotron Rd, Berkeley, CA 94720 USA

Received 19 January 2009 / Accepted 6 March 2009

Abstract
Context. Observations and theoretical calculations have shown the importance of non-spherically symmetric structures in supernovae. Thus, the interpretation of observed supernova spectra requires the ability to solve the transfer equation in 3-D moving atmospheres.
Aims. We present an implementation of the solution of the radiative transfer equation in 3-D homologously expanding atmospheres in spherical coordinates. The implementation is exact to all orders in v/c.
Methods. We use the methods that we have developed in previous papers in this series as well as a new affine method that makes use of the fact that photons travel on straight lines. The affine method greatly facilitates delineating the characteristics and can be used in the case of strong-gravitational and arbitrary-velocity fields.
Results. We compare our results in 3-D for spherically symmetric test problems with high velocity fields (up to 87% of the speed of light) and find excellent agreement, when the number of momentum space angles is high. Our well-tested 1-D results are based on methods where the momentum directions vary along the characteristic (co-moving momentum directions). Thus, we are able to verify both the analytic framework and its numerical implementation. Additionally, we have been able to test the parallelization over characteristics. Using 512² momentum angles we ran the code on 16 384 Opteron processors and achieved excellent scaling.
Conclusions. It is now possible to calculate synthetic spectra from realistic 3D hydro simulations. This should open an era of progress in hydro modeling, similar to that that occurred in the 1980s when 1-D models were confronted with synthetic spectra.

Key words: radiative transfer - relativity - stars: supernovae: general

1 Introduction

Supernovae of all types are known to deviate significantly from spherical symmetry. The evidence comes from both flux spectra, but particularly from the interpretation of spectropolarimetry (see Wang & Wheeler 2008, and references therein). In the case of core-collapse supernovae, the asymmetry is thought to be due to the underlying central engine which is probably asymmetric and this leads to geometrically asymmetric ejecta, with the asymmetry growing as one gets closer to the central engine (thus ``stripped'' supernovae such as type Ic are significantly more asymmetric than supernovae with intact hydrogen envelopes such as type IIP). Type Ia (thermonuclear) supernovae are thought to be geometrically rather round but the composition is thought to be asymmetrical. Since the light curve of type Ia supernovae is powered by the radioactive decay of $\ensuremath{^{56}{\rm Ni}}$ and its products, asymmetries in the $\ensuremath{^{56}{\rm Ni}}$ distribution will lead to asymmetries in the ionization fractions and opacities that will produce polarization and alter the flux spectra. Thus, particularly in type Ia supernovae one can accurately calculate light curves and spectra assuming homologous flow ( $v \propto r$ ) but including the geometrical or compositional asymmetry in three dimensions.

In this series of papers (Baron & Hauschildt 2007; Hauschildt & Baron 2009,2008,2006, henceforth Papers I-IV) we have built up the full characteristics method of solving the transfer equation in 3-D for static atmospheres in a variety of geometries. Here we build on the results of Paper IV for spherical geometry as well as those of Chen et al. (2007) for the affine method.

2 Transfer equation

Chen et al. (2007) showed that the transfer equation in flat spacetime could be written in terms of an affine parameter and that the right hand side could be evaluated in the co-moving frame provided that the wavelength (or frequency) was evaluated in the co-moving frame. However, the momentum directions could be held constant and coincide with those of the observer's frame.

We define the rest frame photon direction in spherical coordinates as

$\begin{displaymath}{\vec n}=(1,\theta_n,\phi_n),\> \vert{\vec n}\vert=1, \end{displaymath}$

(1)

or in Cartesian coordinates

$\begin{displaymath}{\vec n}=(\sin\theta_n \cos\phi_n, \sin\theta_n\sin\phi_n, \cos\theta_n), \end{displaymath}$

(2)

and the starting position of the photon

$\begin{displaymath}{\vec r}_0=(r_0,\theta_0,\phi_0). \end{displaymath}$

(3)

The 3-D geodesic can be parametrized as

$\begin{displaymath} {\vec r}(s)= {\vec r}_0+{\vec n}s, \end{displaymath}$

(4)

where s is the rest frame physical distance related to the affine parameter $\xi$ by

$\begin{displaymath} s \equiv \frac{h}{\lambda_\infty}\xi, \end{displaymath}$

(5)

and is measured starting from ${\vec r}_0$ . This gives us

$\begin{displaymath}\frac{{\rm d}r}{{\rm d}s} \equiv \dot{r}= \frac{ {\vec n} \cdot {\vec r}_0 +s}{r}=\frac{{\vec n}\cdot{\vec r}}{r}, \end{displaymath}$

(6)

$\begin{displaymath}\ddot{r}=\frac{1-\dot{r}^2}{r}, \end{displaymath}$

(7)

and

$\begin{displaymath}r=\vert{\vec r}\vert=\vert{\vec r}_0+{\vec n}s\vert=\sqrt{r_0^2+2({\vec n\cdot r_0})s+s^2}, \end{displaymath}$

(8)

where

$\displaystyle {\vec n\cdot r_0}$	=	$\displaystyle r_0[\sin\theta_0\sin\theta_n\cos(\phi_n-\phi_0)+\cos\theta_0\cos\theta_n],$
	=	x₀p_x+y₀p_y+z₀p_z.	(9)

Note also that

$\begin{displaymath}\frac{{\rm d}\gamma}{{\rm d}r}\equiv \gamma'=\gamma^3\beta\beta',\> \beta'=\frac{{\rm d}\beta}{{\rm d}r}\cdot \end{displaymath}$

(10)

From Eqs. (5) above and (18) of Chen et al. (2007), we find

$\begin{displaymath} \frac{\partial I_\lambda}{\partial s}\vert _\lambda +\frac{{... ...}\right)I_\lambda+\eta_\lambda \frac{\lambda_\infty}{\lambda} \end{displaymath}$

(11)

with

$\begin{displaymath} \frac{\lambda_\infty}{\lambda}=\gamma(r)\left[1-\dot{r}\beta(r)\right]\equiv f(s). \end{displaymath}$

(12)

From Eq. (12) we find

$\displaystyle \frac{1}{\lambda}\frac{{\rm d}\lambda}{{\rm d}s}$	=	$\displaystyle \frac{ (\beta/r)\left(1-\dot{r}^2\right)-\gamma^2\beta'\dot{r}\left(\beta-\dot{r}\right)}{1-\dot{r}\beta(r)}$
	$\textstyle \equiv$	a(s).	(13)

Now we have

$\begin{displaymath}\frac{\partial I_\lambda}{\partial s}\vert _\lambda + a(s)\la... ...ambda}=-[\chi_\lambda f(s)+5 a(s)]I_\lambda+\eta_\lambda f(s). \end{displaymath}$

(14)

Finally, this can be put into the standard form used in PHOENIX (Hauschildt & Baron 2004b,1999)

$\begin{displaymath}\frac{\partial I_\lambda}{\partial s} + a(s)\frac{\partial}{\... ..._{\lambda} = -\chi_\lambda f(s) I_\lambda + \eta_\lambda f(s), \end{displaymath}$

(15)

where a(s) is given by Eq. (13), and f(s) is given by Eq. (12).

In order to finite difference this equation we need to explicitly difference the $\frac{\partial}{\partial\lambda} (\lambda I_\lambda)$ term. As described in Chen et al. (2007) we can write

$\begin{displaymath}\frac{\partial}{\partial\lambda} (\lambda I_\lambda) = \frac{... ...mbda_{l-1} I_{\lambda_{l-1}}}{\lambda_l - \lambda_{l-1}}\cdot \end{displaymath}$

(16)

Then, Eq. (15) can be written as:

$\displaystyle \frac{{\rm d}I_\lambda}{{\rm d}s} + \left[a(s)\frac{\lambda_l } {... ...ambda_{l-1} I_{\lambda_{l-1}}} {\lambda_l - \lambda_{l-1}} + \eta_\lambda f(s).$

(17)

Now we can define an effective optical depth,

$\displaystyle {\rm d} \tau$	=	$\displaystyle - \left( \chi_\lambda f(s) + 4a(s) + \frac{a(s)\lambda_l}{\lambda_l - \lambda_{l-1}} \right) {\rm d}s$	(18)
	$\textstyle \equiv$	$\displaystyle \hat\chi {\rm d}s,$	(19)

which defines $\hat\chi$ . We can also define the traditional source function $S_\lambda = \eta_\lambda/\chi_\lambda$ , so that Eq. (17) becomes

$\displaystyle \frac{{\rm d}I_\lambda}{{\rm d}\tau}$	=	$\displaystyle I_\lambda + \frac{\chi_\lambda}{\hat\chi_\lambda} \left( S_\lambd... ...ambda}\frac{\lambda_{l-1} I_{\lambda_{l-1}}} {\lambda_l - \lambda_{l-1}}\right)$	(20)
	$\textstyle \equiv$	$\displaystyle I_\lambda + \hat S_\lambda,$	(21)

which defines $\hat S_\lambda$ .

The more sophisticated discretization in $\lambda$ described in Hauschildt & Baron (2004a) or Knop et al. (2009) can also be implemented. For the case of arbitrary velocity fields the method of Knop et al. (2009) will be required. Nevertheless, this straightforward method yields excellent results when compared with the more sophisticated treatment used in the 1-D code (see below).

3 Angular integration

To solve the scattering problem in the co-moving frame, we need to calculate the mean intensity and $\mbox{$\Lambda^*$ }$ in the co-moving frame. Recall that the specific intensity is calculated in a frame where five of the six phase-space variables are actually observer's frame quantities. In particular, the two momentum directions are fixed observer's frame quantities. Thus, we need to perform the angular integration in the co-moving frame. We have

$\begin{displaymath}u= \gamma[1,{\vec\beta}],\>\> u_0 = [1,0,0,0], \end{displaymath}$

(22)

and

$\begin{displaymath}p = \frac{hc}{\lambda}[1,\vec{\hat n}] .\end{displaymath}$

(23)

Then

$\begin{displaymath}{\rm d}\Omega = \left(\frac{u_0\cdot p}{u\cdot p}\right)^2 {\rm d}\Omega_0. \end{displaymath}$

Now

$\begin{displaymath}u\cdot p = \frac{hc}{\lambda}\gamma [ 1 - \vec{\beta\cdot \hat n}],\>\>\> u_0\cdot p = \frac{hc}{\lambda}\cdot \end{displaymath}$

(24)

Thus

$\displaystyle {\rm d} \Omega$	=	$\displaystyle (\gamma[1 - \vec{\beta\cdot \hat n}])^{-2} {\rm d} \Omega_0$
	=	$\displaystyle (\gamma[1 - \beta(r)\dot{r}])^{-2} {\rm d}\Omega_0$
	=	$\displaystyle f(s)^{-2}{\rm d}\Omega_0.$	(25)

3.0.1 Computation of $\mbox{$\Lambda^*$ }$

The computation of $\mbox{$\Lambda^*$ }$ proceeds following the same procedure as in Paper I. As demonstrated by Olson et al. (1986) and Olson & Kunasz (1987), the coefficients $\alpha$ , $\beta$ , and $\gamma$ can be used to construct diagonal and tri-diagonal $\mbox{$\Lambda^*$ }$ operators for 1D radiation transport problems. In fact, up to the full $\Lambda$ matrix can be constructed by a straightforward extension of the idea (Hauschildt & Baron 2004a; Hauschildt et al. 1994). These non-local $\mbox{$\Lambda^*$ }$ operators not only lead to excellent convergence rates but they avoid the problem of false convergence that is inherent in the $\Lambda$ iteration method and can also be an issue for diagonal (purely local) $\mbox{$\Lambda^*$ }$ operators. Therefore, it is highly desirable to implement a non-local $\mbox{$\Lambda^*$ }$ for the 3D case. The tri-diagonal operator in the 1D case is simply a nearest neighbor $\mbox{$\Lambda^*$ }$ that considers the interaction of a point with its two direct neighbors. In the 3D case, the nearest neighbor $\mbox{$\Lambda^*$ }$ considers the interaction of a voxel with the (up to) 3³-1=26 surrounding voxels (this definition considers a somewhat larger range of voxels than a strictly face-centered view of just 6 nearest neighbors). This means that the non-local $\mbox{$\Lambda^*$ }$ requires the storage of 27 (26 surrounding voxels plus local, i.e., diagonal effects) times the total number of voxels $\mbox{$\Lambda^*$ }$ elements.

The construction of the $\mbox{$\Lambda^*$ }$ operator proceeds in the same way as discussed in Hauschildt (1992) and Paper I. In the 3D case, the ``previous'' and ``next'' voxels along each characteristic must be known so that the contributions can be attributed to the correct voxel. Therefore, we use a data structure that attaches to each voxel its effects on its neighbors. The scheme can be extended trivially to include longer range interactions for better convergence rates (in particular on larger voxel grids). However, the memory requirements to simply store $\mbox{$\Lambda^*$ }$ ultimately scales like n³ where n is the total number of voxels. The storage requirements can be reduced by, e.g., using $\mbox{$\Lambda^*$ }$ 's of different widths for different voxels. Storage requirements are not so much a problem if a domain decomposition parallelization method is used and enough processors are available.

We describe here the general procedure of calculating the $\mbox{$\Lambda^*$ }$ with arbitrary bandwidth, up to the full $\Lambda$ -operator, for the method in spherical symmetry (Hauschildt et al. 1994). The construction of the $\mbox{$\Lambda^*$ }$ is described in Olson & Kunasz (1987), so that we here summarize the relevant formulae. In the method of Olson & Kunasz (1987), the elements of the row of $\mbox{$\Lambda^*$ }$ are computed by setting the incident intensities (boundary conditions) to zero and setting S(i_x,i_y,i_z)=1 for one voxel (i_x,i_y,i_z) and performing a formal solution analytically.

We describe the construction of $\mbox{$\Lambda^*$ }$ using the example of a single characteristic. The contributions to the $\mbox{$\Lambda^*$ }$ at a voxel j are given by

$\displaystyle \Lambda_{i,j}$	=	$\displaystyle 0 \quad {\rm for\ } i<j-1$	(26)
$\displaystyle \Lambda_{j-1,j}$	=	$\displaystyle f(s_{j-1})\gamma_{j-1} \quad {\rm for\ } i=j-1$	(27)
$\displaystyle \Lambda_{j,j}$	=	$\displaystyle \Lambda_{j-1,j}\exp(-\Delta\tau_{j-1}) + f(s_j)\beta^k_{j} \quad{\rm for\ } i=j$	(28)
$\displaystyle \Lambda_{j+1,j}$	=	$\displaystyle \Lambda_{j,j}\exp(-\Delta\tau_{j}) + f(s_{j+1})\alpha_{j+1} \quad{\rm for\ } i=j+1$	(29)
$\displaystyle \Lambda_{i,j}$	=	$\displaystyle \Lambda_{i-1,j}\exp(-\Delta\tau_{i-1}) \quad {\rm for} j+1 < i.$	(30)

These contributions are computed along a characteristic, here i labels the voxels along the characteristic under consideration. These contributions are integrated over solid angle with the same method (either deterministic or through Monte-Carlo integration) that is used for the computation of J. For a nearest neighbor $\mbox{$\Lambda^*$ }$ , the process of Eq. (30) is stopped with i=j+1, otherwise it is continued until the required bandwidth has been reached (or the characteristic has reached an outermost voxel and terminates). Comparing with the results of Paper I, the $\mbox{$\Lambda^*$ }$ operator is altered simply by the Doppler-shift factor f(s) at the appropriate point.

4 From co-moving frame to global inertial frame

The specific intensity $I_\lambda$ is observer dependent, it is related to the observer invariant phase space distribution F(x,p) by

$\begin{displaymath}I_{\lambda}=-\frac{c^2}{h}(u\cdot p)^5F(x,p), \end{displaymath}$

(31)

therefore, the invariant quantity should be $I_\lambda\lambda^5$ . We immediately have

$\begin{displaymath}I_{\lambda_\infty}=\left(\frac{\lambda}{\lambda_\infty}\right)^5 I_\lambda=\frac{I_\lambda}{f(s)^5}\cdot \end{displaymath}$

(32)

We do not need to transform the direction vector, because when we write down our transfer equation, the only co-moving quantity we used is the co-moving wavelength, the other two momentum space variables (e.g., n_x, n_y) are global inertial (for the case we are working on, the $\vec n$ vector is the direction of photon in physical space, not the direction seen by the co-moving observer). For our flat spacetime case, if we want the direction of photon seen by $u^a=\gamma[1,{\vec \beta}],$ we simply need to do a Lorentz boost, for example using Eq. (11.98) of Jackson (1975).

5 Application examples

As a first step we have built upon the MPI parallelized Fortran 95 program described in Papers I-IV. The parallelization of the formal solution is presently implemented over solid angle space as this is the simplest parallelization option and also one of the most efficient (a domain decomposition parallelization method will be discussed in a subsequent paper). In addition, the Jordan solver of the Operator splitting equations is parallelized with MPI. The number of parallelization related statements in the code is small.

Our basic continuum scattering test problem is similar to that discussed in Hauschildt (1992), Hauschildt & Baron (2004a) and Papers I-II. This test problem covers a large dynamic range of about 9 dex in the opacities and overall optical depth steps along the characteristics and, in our experience, constitutes a reasonably challenging setup for the radiative transfer code. The application of the 3D code to ``real'' problems is in preparation and requires a substantial amount of development work (in progress). Comparing this test case to real world problems in 1D we have found that this test is close to the worst case scenario and that convergence, etc is generally better in real world problems. We use a sphere with a grey continuum opacity parametrized by a power law in the continuum optical depth $\ensuremath{\tau_{{\rm std}}}\xspace$ . The basic model parameters are:

1.

Inner radius $\hbox{$R_{\rm in}$ }=10^{13}$ cm, outer radius $\hbox{$R_{\rm out}$ }= 1.01\times 10^{15}$ cm.

2.

Minimum optical depth in the continuum $\tau_{{\rm std}}^{{\rm min}} =10^{-4}$ and maximum optical depth in the continuum $\tau_{{\rm std}}^{\max} = 10^{4}$ .

3.

Grey temperature structure with $\ensuremath{T_{{\rm model}}}\xspace=10^4$ K.

4.

Outer boundary condition $I_{\rm bc}^{-} \equiv 0$ and diffusion inner boundary condition for all wavelengths.

5.

Continuum extinction $\chi_{\rm c} = C/r^2$ , with the constant C fixed by the radius and optical depth grids.

6.

Parametrized coherent and isotropic continuum scattering by defining

$\begin{displaymath}\chi_{\rm c} = \epsilon_{\rm c} \kappa_{\rm c} + (1-\epsilon_{\rm c}) \sigma_{\rm c} \end{displaymath}$

(33)

with $0\le \epsilon_{\rm c} \le 1$ . $\kappa_{\rm c}$ and $\sigma_{\rm c}$ are the continuum absorption and scattering coefficients.

The test model is just an optically thick sphere put into the 3D grid. This problem is used because the results can be directly compared with the results obtained with our 1D spherical radiation transport code (Hauschildt 1992) to assess the accuracy of the method. The sphere is centered at the center of the Cartesian grid, which is in each axis 10% larger than the radius of the sphere. The solid angle space was discretized in $(\theta,\phi)$ with $n_\theta=n_\phi$ if not stated otherwise. In the following we discuss the results of various tests. In all tests we use the full characteristics method for the 3D RT solution.

$\begin{figure} \par\includegraphics[angle=90,width=8.5cm,clip]{1681fig1.eps} \end{figure}$	Figure 1: The results of 1-D calculations are compared with 3-D calculations for $\beta_{\max} = (0.03,0.33,0.67,0.87)$ . The momentum space directions were discretized using $n_\theta = n_{\phi } = 256$ .
Open with DEXTER

5.1 LTE tests

In this test we have set $\epsilon=1$ to test the accuracy of the formal solution by comparing to the results of the 1D code. The 1D solver uses 64 radial points, distributed logarithmically in optical depth. For the 3D solver we tested `moderate' grids with $n_r=n_\phi=2*32+1$ and $n_\theta = 2*16+1$ points along each axis, for a total of $65^2*33 \approx 1.4\times 10^{5}$ voxels. The momentum space discretization uses in general, $n_\theta = n_{\phi } = 256$ points. In Fig. 1 we show the mean intensities as a function of distance from the center for both the 1D (+ symbols) and the 3D solver. For the 3D results J is plotted at every voxel on the surface and the spherical symmetry is reproduced very well at every point on the surface. The results show excellent agreement between the two solutions, thus the 3D RT formal solution is comparable in accuracy to the 1D formal solution. Figure 1 shows the results for four different maximum velocities, and clearly indicates that the 3D RT solution is just as accurate as the 1D code. This is actually quite a nice demonstration since the affine method used in the 3D code is completely different from the full co-moving momentum space method used in the 1-D code, that is, the specific intensity is solved for in different frames in the two codes. However, since J depends only on the co-moving frequency they can be directly compared in the same frame.

As shown in Paper I, for the conditions used in these tests a larger number of solid angle points significantly improves the accuracy of the mean intensities. Our tests show that reasonable accuracy can be achieved with as few as 16² momentum space points, but in these test calculations we have used more points in order to really compare the 3D results to the 1D results. A full investigation of the number of angle points needed for realistic asymmetric calculations will be the subject of future work.

The line of the simple 2-level model atom is parametrized by the ratio of the profile averaged line opacity $\chi_l$ to the continuum opacity $\chi_{\rm c}$ and the line thermalization parameter $\epsilon_l$ . For the test cases presented below, we have used $\epsilon_{\rm c}=1$ and a constant temperature and thus a constant thermal part of the source function for simplicity (and to save computing time) and set $\chi_l/\chi_{\rm c} = 10^2$ to simulate a strong line, with $\epsilon_l = (1.0,0.1)$ (see below). With this setup, the optical depths as seen in the line range from 10^-2 to 10⁶. We use 92 wavelength points to model the full line profile, including wavelengths outside the line for the continuum. We did not require the line to thermalize at the center of the test configurations, this is a typical situation one encounters in a full 3D configurations as the location (or even existence) of the thermalization depths becomes more ambiguous than in the 1D case.

The sphere is put at the center of the Cartesian grid, which is in each axis 10% larger than the radius of the sphere. For the test calculations we use voxel grids with the same number of spatial points in each direction (see below). The solid angle space was discretized in $(\theta,\phi)$ with $n_\theta=n_\phi$ Unless otherwise stated, the tests were run on parallel computers using a variety of number of CPUs, architectures, and interconnects.

$\begin{figure} \par\includegraphics[angle=90,width=8cm,clip]{1681fig2.eps} \end{figure}$	Figure 2: The results of 1-D calculations for a scattering line are compared with 3-D calculations for $\beta_{\max} = 0.03$ . The momentum space directions were discretized using $n_\theta = n_\phi = 512$ and the calculation was run on the Franklin Cray XT4 using 2¹⁴ = 16 384 processors.
Open with DEXTER

5.2 LTE line and continuum test

We first have set $\epsilon_l=1$ to test the accuracy of the formal solution by comparing to the results of the 1D code. The 1D solver uses 64 radial points, distributed logarithmically in optical depth. Comparing the line mean intensities ${\bar J}$ as function of distance from the center for both the 1D and the 3D solver we again found excellent agreement.

5.3 Tests with line scattering

We have run a number of test calculations similar to the LTE case but with line scattering included. In Fig. 2 we show the co-moving $J_\lambda$ as a function of $\lambda$ for $\epsilon_l=0.1$ with $\beta_{\max} = 0.03$ and $n_\theta = n_\phi = 512$ . The 3D calculations compare very well to the 1D calculations. The small variation in some parts of the surface of the sphere (each black line represents a pixel on the surface of the sphere) is due to the different wavelength discretization. The 3D case uses only the simple method described above, whereas the 1D case uses the full Crank-Nicholson-like method described in Hauschildt & Baron (2004a).

Figure 3 shows the wall-clock time as a function of resolution (number of momentum-space angles $n_\theta$ and $n_\phi$ or number of CPUs). The computational work is kept constant with each CPU required to calculate 16 characteristics. The modest (14%) increase in wall-clock time from 16 CPUs to 16 384 CPUs is acceptable given the huge increase in communication required and the fact that load balancing is quite simple. The wall-clock time for this test was about 625 s for each direction angle. The memory usage is controlled by the size of the spatial grid (this is also true in Papers I-IV). The only additional storage is that the values of the intensity for both the previous and current wavelength point must be stored. The total memory per process was about 100 MB.

$\begin{figure} \par\includegraphics[angle=90,width=8.5cm,clip]{1681fig3.eps} \end{figure}$	Figure 3: The wall-clock time to solve a scattering line $\epsilon = 0.1$ on the Franklin Cray XT4 as a function of momentum frame angular resolution. The test was run so the amount of computational work per processor was constant. The roughly 14% communication time increase from 16 to 16 384 processors is acceptable.
Open with DEXTER

6 Wavelength parallelization

We have implemented and tested a ``pipeline'' wavelength parallelization method using wavelength clusters in the manner described in Baron & Hauschildt (1998). As in Baron & Hauschildt (1998), the parallelization over characteristics is done within a ``wavelength cluster'' and each worker thus must send only its values of the specific intensity to the corresponding process in the next wavelength cluster. For the simple test problems considered so far the opacity calculation is trivial and hence, there is no speedup (but also no penalty) for this wavelength parallelization. However, in real world problems the time to calculate the opacities is roughly equal to the time required to solve the transfer equation and this leads to linear speedups in the 1D case of up to about a factor of eight.

7 Conclusions

We have implemented the affine method described in Chen et al. (2007) for the case of homologous flows and shown that it gives excellent results compared to the full co-moving method that we use in our 1D code. We have also been able to parallelize it both over characteristic and wavelength. The characteristic scaling is excellent and immediately brings us into the forefront of massively parallel computation. The next step is to include the 3D calculations in the full real world code and begin applying it to numerous astrophysical problems.

Acknowledgements

This work was supported in part by SFB 676 from the DFG, NASA grant NAG5-12127, NSF grant AST-0707704, and US DOE Grant DE-FG02-07ER41517. This research used resources of the National Energy Research Scientific Computing Center (NERSC), which is supported by the Office of Science of the US Department of Energy under Contract No. DE-AC02-05CH11231; and the Höchstleistungs Rechenzentrum Nord (HLRN). We thank all these institutions for a generous allocation of computer time.

Appendix A: Determining s from Coordinates

The characteristics are followed through the voxel grid from an entry boundary point to an exit boundary point. It is convenient to choose some particular voxel as a ``starting point'' r₀ and determine the distance s to the two boundary points. We will choose our sign convention such that the distance to the entry point is negative and the distance to the exit point is positive. Given a starting point r₀ we can find the distance to a boundary point R (where $R=R_{\rm in}$ , or $R=R_{\rm out}$ ) from Eq. (4) and find

$\begin{displaymath} s^2+2({\vec n \cdot r_0})s+(r_0^2-R^2)=0 \end{displaymath}$

(A.1)

The characteristics can be divided into three classes. Tangential characteristics (those that do not hit the inner boundary R_in have

$\begin{displaymath}R=R_{\rm out}, \end{displaymath}$

and satisfy the constraint that the impact parameter is greater than $R_{\rm in}$ ,

$\begin{displaymath}r_0^2-({\vec n\cdot r_0})^2>R^2_{\rm in}. \end{displaymath}$

(A.2)

For this case, Eq. (A.1) has two solutions

$\begin{displaymath}s_-=-({\vec n \cdot r_0})-\sqrt{({\vec n \cdot r_0})^2-(r_0^2-R^2)}, \end{displaymath}$

(A.3)

and

$\begin{displaymath}s_+=-({\vec n \cdot r_0})+\sqrt{({\vec n \cdot r_0})^2-(r_0^2-R^2)}, \end{displaymath}$

(A.4)

such that

$\begin{displaymath}s_-\leq 0\leq s_+. \end{displaymath}$

(A.5)

Core-intersecting characteristics include two cases, incoming and outgoing rays. Incoming core-intersecting characteristics are determined by

$\begin{displaymath}{\vec n\cdot r_0}<0, \end{displaymath}$

(A.6)

where

$\begin{displaymath}R=R_{\rm out}, \end{displaymath}$

and there is only one solution

$\begin{displaymath}s_-=-({\vec n \cdot r_0})-\sqrt{({\vec n \cdot r_0})^2-(r^2_0-R^2)}, \end{displaymath}$

(A.7)

the other solution

$\begin{displaymath}s_+=-({\vec n \cdot r_0})+\sqrt{({\vec n \cdot r_0})^2-(r^2_0-R^2)}, \end{displaymath}$

(A.8)

should be dropped because it passes through the core. Here

s_-<0<s₊.

(A.9)

For outgoing core-intersecting characteristics

$\begin{displaymath}{\vec n\cdot r_0}>0. \end{displaymath}$

(A.10)

this time the characteristic should start from the core

$\begin{displaymath}R=R_{\rm in}, \end{displaymath}$

For this case

$\begin{displaymath}s_-=-({\vec n \cdot r_0})+\sqrt{({\vec n \cdot r_0})^2-(r^2_0-R^2)}, \end{displaymath}$

(A.11)

is the desired solution and the other solution

$\begin{displaymath}s_+=-({\vec n \cdot r_0})-\sqrt{({\vec n \cdot r_0})^2-(r^2_0-R^2)} \end{displaymath}$

(A.12)

passes through the core. In this case,

0>s->s₊.

(A.13)

References

Baron, E., & Hauschildt, P. H. 1998, ApJ, 495, 370 [NASA ADS] [CrossRef] (In the text)
Baron, E., & Hauschildt, P. H. 2007, A&A, 468, 255 [NASA ADS] [CrossRef] [EDP Sciences]
Chen, B., Kantowski, R., Baron, E., Knop, S., & Hauschildt, P. 2007, MNRAS, 380, 104 [NASA ADS] [CrossRef] (In the text)
Hauschildt, P. H. 1992, JQSRT, 47, 433 [NASA ADS] (In the text)
Hauschildt, P. H., & Baron, E. 1999, J. Comp. Appl. Math., 109, 41 [NASA ADS] [CrossRef]
Hauschildt, P. H., & Baron, E. 2004a, A&A, 417, 317 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
Hauschildt, P. H., & Baron, E. 2004b, Mitteilungen der Mathematischen Gesellschaft in Hamburg, 24, 1
Hauschildt, P. H., & Baron, E. 2006, A&A, 451, 273 [NASA ADS] [CrossRef] [EDP Sciences]
Hauschildt, P. H., & Baron, E. 2008, A&A, 490, 873 [NASA ADS] [CrossRef] [EDP Sciences]
Hauschildt, P. H., & Baron, E. 2009, A&A, 498, 981 [NASA ADS] [CrossRef] [EDP Sciences]
Hauschildt, P. H., Störzer, H., & Baron, E. 1994, JQSRT, 51, 875 [NASA ADS]
Jackson, J. D. 1975, Classical Electodynamics, 2nd edn. (New York: Wiley) (In the text)
Knop, S., Hauschildt, P. H., & Baron, E. 2009, A&A, in press (In the text)
Olson, G. L., & Kunasz, P. B. 1987, JQSRT, 38, 325 [NASA ADS] (In the text)
Olson, G. L., Auer, L. H., & Buchler, J. R. 1986, JQSRT, 35, 431 [NASA ADS] (In the text)
Wang, L., & Wheeler, J. C. 2008, ARA&A, 46, 433 [NASA ADS] [CrossRef] (In the text)

All Figures

$\begin{figure} \par\includegraphics[angle=90,width=8.5cm,clip]{1681fig1.eps} \end{figure}$	Figure 1: The results of 1-D calculations are compared with 3-D calculations for $\beta_{\max} = (0.03,0.33,0.67,0.87)$ . The momentum space directions were discretized using $n_\theta = n_{\phi } = 256$ .
Open with DEXTER
In the text

$\begin{figure} \par\includegraphics[angle=90,width=8cm,clip]{1681fig2.eps} \end{figure}$	Figure 2: The results of 1-D calculations for a scattering line are compared with 3-D calculations for $\beta_{\max} = 0.03$ . The momentum space directions were discretized using $n_\theta = n_\phi = 512$ and the calculation was run on the Franklin Cray XT4 using 2¹⁴ = 16 384 processors.
Open with DEXTER
In the text

$\begin{figure} \par\includegraphics[angle=90,width=8.5cm,clip]{1681fig3.eps} \end{figure}$	Figure 3: The wall-clock time to solve a scattering line $\epsilon = 0.1$ on the Franklin Cray XT4 as a function of momentum frame angular resolution. The test was run so the amount of computational work per processor was constant. The roughly 14% communication time increase from 16 to 16 384 processors is acceptable.
Open with DEXTER
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.