EDP Sciences
Free Access
Issue
A&A
Volume 607, November 2017
Article Number A125
Number of page(s) 12
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/201731757
Published online 24 November 2017

© ESO, 2017

1. Introduction

The modelling of observations of astrophysical objects in the wavelength range from the UV to the submm is a challenging task. For a vast variety of scales, from proto-planetary systems to galaxies, the emission in this wavelength range is dominated either by primary sources of radiation (e.g. stars or active galactic nuclei, AGN), predominant in the UV and optical, or by the re-emission of absorbed photons by interstellar dust, predominant in the mid- and far-infrared. Usually, the near-infrared range is a region of transition between the two kinds of emissions.

The observed emission produced by the primary sources and the dust is mutually affected. On the one hand, the dust dims and scatters the light generated by the stars or AGNs. On the other hand, the light from the primary sources heats the dust, determining its temperature and thus its emission spectra. Furthermore, although most astrophysical objects are optically thin at long infrared wavelengths, the dust emission produced at one location can also be absorbed and scattered by the dust located elsewhere, a process often referred to as dust “self-heating”.

Performing dust radiative transfer (RT) calculations is the essential step to reproduce the observations in a (as much as possible) self-consistent way. The problem is computationally challenging because of its non-locality (in the spatial, angular direction, and wavelength dimensions) and non-linearity (e.g. the dust emission spectra depends non linearly on the absorbed luminosity). Furthermore, the presence of six independent variables (three spatial coordinates, two angular coordinates and the wavelength) makes it very challenging to handle the large memory required if the 3D dust radiation transfer equation has to be solved directly (see Steinacker et al. 2013, for a recent review).

For all the above reasons, the vast majority of the dust radiative transfer codes adopt a Monte-Carlo (MC) approach (e.g. see lists of codes in Steinacker et al. 2013; Pascucci et al. 2004; Pinte et al. 2009; and Gordon et al. 2017, hereafter G17), which is an elegant, flexible way to follow the propagation of light within dusty objects and it heavily reduces the memory requirements since there is no need to store the intensity as a function of the angular direction at each spatial position. The basic Monte-Carlo approach is not efficient in producing images at specific line-of-sights but this problem can be solved by combining it with a ray-tracing procedure called peel-off (Yusef-Zadeh et al. 1984). Other acceleration techniques to maximize the use of photon particles are available (Steinacker et al. 2013).

Despite all the known technical difficulties, in the last few years we have been developing a 3D dust RT code that is purely ray-tracing, that is, it does not make any use of MC techniques but is simply based on the calculation of the radiation intensity variation along numerous directions chosen deterministically by the code algorithm. The code, named DART-Ray, and its basic algorithm have been introduced in Natale et al. (2014, hereafter NA14). As stated in that article, the main motivation to develop this code is having a specific tool for the calculation of the radiation field energy density (RFED) throughout any region of the RT model under consideration. To this goal, our group already made extensive use of a ray-tracing RT code based on the scattering intensity approximation of Kylafis & Bahcall (1987; see e.g. Popescu et al. 2000, 2011). However, this code can be applied only to axisymmetric galaxy models, while DART-Ray is able to handle any geometry.

RFED are obviously also calculated by MC codes, but the main focus of MC codes is the production of surface brightness images that might or might not require the RFED being accurately calculated at all positions and all wavelengths. In particular, our focus on the RFED is due to the importance of this quantity in other fields of astrophysics, such as high-energy astrophysics, where it is necessary to calculate the radiation due to inverse-Compton of cosmic rays interacting with the interstellar radiation field produced by stellar and dust emission.

Furthermore, the DART-Ray algorithm is not a brute-force ray-tracing algorithm. It takes advantage of a so far not well-studied property of the radiation sources within RT models, that is, that these sources often do not contribute significantly to the RFED everywhere but only within a fraction of it called the source influence volume. Although being essentially a cell-to-cell radiation transfer algorithm, the gain in efficiency of DART-Ray with respect to a brute force algorithm comes from its method to estimate the extent of the source influence volumes and perform radiation transfer calculations only within them. The extent of these volumes in astrophysical objects and the possible advantages that can be exploited in radiation transfer codes have never been clarified. Intuitively, in dusty objects the extent of this volume could be quite reduced relative to the size of the models, especially for the scattered light sources, which are low intensity sources compared to the sources actually producing radiation, such as stars and dust thermal emission. DART-Ray allows one to examine the extent of the source influence volumes and thus verify when they are small relative to the entire model size. Finally, handling 3D dust radiative transfer in a manner different to those of the widely used MC techniques provides a useful test for the reliability of scientific results obtained by MC codes. In principle, an agreement between two or more MC codes could be due to the adoption of the same numerical method, but this interpretation can be discarded when a different RT solver obtains the same result. This kind of comparisons is already underway with the TRUST benchmark project in which DART-Ray is participating (see G17).

The code presented in NA14 was a good first step in the development of a mature code, but there was some scope for improvement to ameliorate several limitations: firstly it could only be executed one wavelength at a time, neglecting dust self-heating, which requires multi-wavelength runs; secondly parallelization was implemented only for shared memory machines; thirdly the inaccuracy from the blocking of the rays could not be set at the start of the calculation and could only be measured by re-running the model with a different value for a threshold parameter (fU, see Na14 or below); fourthly stochastically heated dust emission was excluded from the calculation (this was added in Natale et al. 2015). Furthermore, a substantial reduction in the execution time could have been achieved through the implementation of a more efficient algorithm for the optimization of the ray angular density.

In this paper, we present a new version of the code (hereafter DART-Ray V2), which is a substantial improvement to the one presented previously. The new code is publicly available and documented for both users and developers1. As well as addressing the issues highlighted above, we have added new features, such as the ability to create “internal observer” maps viewed from within the RT models in HEALPix format (which can be used for Milky Way studies) and at arbitrary lines of sight, without repeating the radiation transfer calculation. This latter feature can be used, for example, to create animations for the presentation of the results.

The structure of the paper is the following. In Sect. 2 we briefly summarize the radiation transfer algorithm used in DART-Ray. In Sect. 3 we describe the numerous updates to the code. In Sect. 4 we show the comparison of the code with benchmark solutions including dust self-heating. In Sect. 5 we present a study of the extent of the source influence volumes for radiation sources within different galaxy models. In Sect. 6 we discuss the advantages and disadvantages of the DART-Ray algorithm and in Sect. 7 its possible further improvements.

2. The DART-Ray dust radiation transfer algorithm

The general strategy of the RT algorithm of DART-Ray V2 is the same as the one presented in NA14. However, there are many differences regarding the implementation and the newly added capabilities (see Sect. 3). Here we make a brief summary of the RT algorithm, highlighting the main steps. We encourage users of the code and readers interested in more specific details to read the user guide and the code documentation on the code webpage as well as Sects. 2 and 3 of NA14 for further clarifications on specific points.

In DART-Ray, a RT model is subdivided into an adaptive 3D Cartesian grid of cells, each with a given input value of stellar volume emissivity jλ(r) (luminosity per unit volume per unit frequency and per unit solid angle) and dust optical depth per unit length kλρd(r) (with kλ the extinction coefficient and ρd the dust density). The albedo ωλ and the anisotropy parameter of the Henyey-Greenstein phase function gλ are determined by the assumed dust model. Given these input quantities, the code calculates:

  • the RFED Uλ(r) for each cell;

  • the scattered luminosity source function jλ,sca(r,θ,φ) which contains the scattered radiation luminosity per unit volume and per unit solid angle for each dusty cell. In general, the scattered luminosity is not isotropically distributed and thus depends on the angular direction (θ,φ);

  • the dust emission source function jλ,d(r) which contains the luminosity per unit volume and per unit solid angle produced in each cell containing dust;

  • the specific intensity Iλ,obs(r,θ,φ) of the radiation produced by each cell/point source, and reaching the observer located either far away or within the RT model. It is derived from the source terms jλ(r), jλ,sca(r,θ,φ) and jλ,d(r), and the optical depth between the cell/point source and the observer. It can be used to calculate surface brightness maps at the position of the observer.

The code performs first the RT calculation for the stellar emission and subsequently that for the dust emission (the latter added in this code version, see Sect. 3.3). In both cases, the RT algorithm is subdivided in three steps:

  • 1)

    the determination of a lower limitUλ,LL(r) to the RFED distribution Uλ(r);

  • 2)

    the processing of radiation coming directly from radiation sources;

  • 3)

    the processing of radiation scattered by dust.

In all these steps, the DART-Ray algorithm considers one radiation source at a time (that is, either an “emitting cell” whose stellar or dust volume emissivity is not zero or a point source). For each source, it calculates the contributions of the radiation emitted by the source to the RFED within a certain volume surrounding it. In steps 2 and 3 the contributions to jλ,sca(r,θ,φ) for each cell of this volume are also calculated. The value of these contributions is derived after each ray-cell intersection (see Sect. 3.2 in NA14). In the new code version, the ray tracing from a radiation source within the surrounding volume involves a ray angular density optimization procedure described in Sect. 3.1.1.

During step 1, the volume considered around each radiation source has a fixed extent chosen by the user (typically 1020% of the entire model size). In this way, a lower limit of the RFED distribution Uλ,LL(r) is derived because the contributions in the regions beyond these volumes are not taken into account.

In step 2, the ray-tracing calculation is performed once again from the beginning but this time the rays originating from the radiation sources are blocked if the ray contribution δUλ to the local RFED is “negligible” at all wavelengths, that is, when (1)where fU is a threshold parameter chosen indirectly by the user depending on the desired numerical accuracy (see Sect. 3.5).

Finally, during step 3 the scattered radiation stored within the dusty cells is processed. In opposition to step 2, the radiation produced by emitting cells is typically direction-dependent, since the assumed scattering phase function (Henyey-Greenstein) is in general not isotropic2. Nonetheless, apart from few technical differences, the calculation during this step proceeds essentially as in step 23. Since scattered radiation can be scattered multiple times, several scattering iterations are needed. These iterations are stopped when the remaining scattered radiation luminosity waiting to be processed is only a small fraction fL of the total scattered luminosity of the model as found at the end of step 2.

The code performs firstly the radiation transfer calculations only for the stellar emission. Then, it starts the calculation for the dust emission. The dust emission spectra produced by each dusty cell can be derived from the luminosity absorbed by the dust (which depends on Uλ(r), the dust density and the dust opacity coefficients). The radiation emitted by dust undergoes the same type of propagation as for the stellar emission with the difference that the extra-radiation absorbed in this process affects the dust temperature and thus its emission. Therefore, since dust emission and absorption are coupled, multiple iterations of the entire radiation transfer procedure described in this section are performed until the dust emission spectra have converged at all positions (see Sect. 3.3).

Once jλ(θ,φ), jλ,sca(θ,φ) and jλ,d are known, one can calculate the specific intensity Iλ,0 of the radiation departing from each source along any angular direction (see Eq. (7) of NA14). The specific brightness for the radiation arriving to the observer is then simply Iλ,obs = Iλ,0eτλ with τλ the optical depth between the source and the observer position. The code calculates Iλ,obs for all cells and point sources and then use volume rendering techniques to produce surface brightness maps. We note that, if the source functions are saved, Iλ,obs(r,θ,φ) can be calculated for arbitrary observer positions without repeating the entire RT calculation.

3. Update descriptions

3.1. Optimizations

Compared to NA14, we implemented two main changes affecting the code speed and memory requirement. These are an improved algorithm for the optimization of the ray angular density and the implementation of wavelength-dependent angular resolution for the scattering source function jλ,sca.

3.1.1. Ray angular density optimization

DART-Ray performs ray-tracing operations from each radiation source (either an emitting cell or a point source) throughout a 3D Cartesian adaptive grid. When a source contributes significantly to the RFED within a grid cell, it is necessary that at least several rays, originating from the source, intersect the cell in order to achieve good numerical accuracy for the RFED and the source functions jλ,sca and jλ,d at that grid position. In this way, one also avoids missing cells at similar distances. Unfortunately, the extent of the source influence volume cannot be known in advance. Therefore, it is not possible to set a sufficiently high ray angular density (the number of rays per unit solid angle) right at the beginning of the ray propagation, so that all the cells within the source influence volume are properly intersected. Instead, the ray angular density has to be derived iteratively while the rays propagate through the model. The optimal ray angular density is also not necessarily uniform over the entire solid angle, as seen from the radiation source, but it can well be direction-dependent. Furthermore, the variable cell size of the adaptive 3D grid of emissivity and opacity can make the optimal angular ray density vary with the distance from the radiation source.

The directions along which the rays are cast are those defined by the lines passing through the source position and the centres of the spherical pixels of a concentric sphere, subdivided according to the HEALPix sphere pixelation scheme (Górski et al. 2005). The advantage of using HEALPix is that the angular resolution of the sphere pixelation (defined as a quad-tree) can be varied easily and there are fast routines available for spherical pixel searches (e.g. to obtain spherical angles from the pixel number and vice versa). The basic idea is to start the RT calculation by using a low initial HEALPix resolution. While a ray is propagating throughout the model, the code can vary the ray angular density by moving from one HEALPix resolution level to the immediately higher or lower as described below. Because of the quad-tree structure of HEALPix, any change in HEALPix resolution corresponds to a factor 4 in the variation of the ray angular density.

thumbnail Fig. 1

Flowchart of the angular density optimization algorithm for the rays originating from a single source. All departing rays are collected in lists (high or low, see text) and their propagation is followed through the RT model. Depending on their beam size and contribution to the RFED, they are split or merged during the “select ray” phase. We note that, during step 1 of the RT algorithm (see Sect. 2), the decision condition δU>fUULL is substituted by dray<Rpre with dray the distance crossed by the ray and Rpre the input maximum distance that can be crossed in this step.

Open with DEXTER

Specifically, the ray angular density has to be increased when the following two conditions are met:

  • 1.

    the ray beam size is larger than the maximum allowed size, that is: (2)where ΩHP,EM is the solid angle of the beam associated with a ray, ΩINT is the solid angle subtended by the last intersected cell and Nrays is the minimum number of rays that has to intersect a cell (input-defined);

  • 2.

    the ray has either not reached the boundary of the user-defined region, during the calculation of Uλ,LL(r), or it does not contribute significantly to the RFED of the last intersected cell, during the processing of direct and scattered radiation.

Conversely, rays can be merged when the ray angular density is too high. That is, when: (3)where is the user-defined maximum number of rays allowed to cross a cell.

The previous version of DART-Ray already contained an algorithm for the ray angular density optimization, but it had the problem that many contributions to the RFED of cells already crossed by rays had to be recalculated several times (see NA14 for details). Instead, in DART-Ray V2 we implemented an optimization strategy for the ray angular density, in which rays can be split and merged along the path they are following, and it avoids repeating the ray-tracing calculations for cells already intersected with a sufficient number of rays.

The method we implemented is similar to the algorithm of Abel & Wandelt (2002), but with several technical differences. This method is described by the flowchart in Fig. 1. At the beginning, the code selects a ray and follows its propagation through the RT model. At each ray-cell intersection, it checks whether the ray-beam satisfies any of the conditions expressed in Eqs. (2) or (3). If not, it adds the ray contributions to the RFED and to the scattering source function. Unless the ray has already reached the model border, the ray propagation continues to the next cell intersection. If the ray beam is found too large after any of these intersections (that is, it satisfies Eq. (2)), the code checks if the ray still carries a significant contribution to the RFED (see Eq. (1)). If so, it adds the current ray to the “high” list, the list of rays to be split. Otherwise the ray further propagation is ignored. Instead, if the ray beam is found too small (according to Eq. (3)), the ray is added to the “low” list, the list of rays that can potentially be merged.

Once all rays within an HEALPix sector have been processed for the current HEALPix angular resolution, DART-Ray checks whether there are rays in the high ray list. If so, it proceeds with the ray tracing at immediately higher HEALPix resolution. That is, for each ray in the high ray list, four child rays are generated with directions corresponding to the HEALPix directions within the spherical pixel associated with the parent ray. The ray tracing calculation for these child rays starts directly from the same distance dray from the source which has been already crossed by the parent ray.

After all rays in the high ray list have been processed, the code looks for rays that can potentially be merged among those in the low ray list. In order to be merged, the directions of four rays in the list should be contained within the same HEALPix spherical pixel at the immediately lower angular resolution, and these four rays should have been blocked after crossing the same grid plane. If so, the code merges them into a single ray with specific intensity equal to the average intensity of the four merged rays. After that, the code starts the propagation of the newly created rays from the average distance crossed by corresponding merged rays. The code proceeds with the propagation of all rays from the high and low lists iteratively until there are no more rays in both lists.

3.1.2. Wavelength-dependent angular resolution for the scattering source function

The scattering source function jλ,sca(r,θ,φ) is the computed quantity requiring more memory in the DART-Ray code, since it depends on six independent variables (λ,r,θ,φ). By sampling appropriately each dimension, its values can be stored in a big array of size typically in the range ~1100 Gbytes. Furthermore, the algorithm needs one more array of the same size to store the scattering luminosity to be processed within each scattering iteration, separately from the total scattered luminosity stored in jλ,sca(r,θ,φ).

As big as it is, storing the scattering source function is still cheaper in terms of memory requirements than solving the radiative transfer equation directly for the specific intensity Iλ(r,θ,φ). This is because Iλ(r,θ,φ) can present unpredictably rapid angular variations at each spatial point r which are determined by the radiation sources and dust distribution geometry as well as the assumed scattering phase function Φλ(n,n) (dependent on the incoming light direction n and the scattering light direction n)4. The latter determines the angular re-distribution of the scattered radiation after each ray-cell intersection. Instead the rapidity of the angular variations for jλ,sca is determined only by the shape of Φλ, typically modelled as a Henyey-Greenstein profile (Henyey & Greenstein 1941): (4)with θ being the angle between n and n, and the anisotropy parameter gλ determining the angular width of the Φλ(n,n) profile. For typical interstellar dust models, this profile is quite sharp at UV wavelengths, but it gradually becomes flatter going towards the NIR and then almost completely flat in the FIR. Therefore, the number of angular points needed to sample jλ,sca(r,θ,φ) properly has to be quite high at shorter wavelengths, while relatively few points are sufficient in the FIR where scattering is essentially isotropic. This property of jλ,sca(r,θ,φ) allows a significant reduction in the memory requirement if the storage of Iλ(r,θ,φ) is not needed, as in DART-Ray.

The sampling points for the angular directions of jλ,sca(r,θ,φ) are those of a discretized HEALPix sphere with a total number of pixels equal to with Nside = 2kHP and kHP a positive integer value (see Górski et al. 2005). In DART-Ray we implemented the following formula to derive an appropriate kHP for the scattering source function at each wavelength: (5)with θλ,min, the pixel angular size for the required minimum angular resolution, given by: (6)with FWHM [ Φλ−Φλ(π) ] the Full Width Half Maximum of Φλ minus its “background” value at θ = π (in turn depending on gλ) and nFWHM the minimum number of pixels within the FWHM5. We found that by choosing nFWHM = 5 a good accuracy is reached for the benchmark models examined in Sect. 4. We note that the values of kλ,HP have to be integers, so the result of formula (5) is approximated to its integer part. A maximum allowed value kλ,HP = 2−3 has to be set to avoid very high memory requirements for very narrow Φλ profiles at short UV wavelengths.

Examples of this sampling can be seen in Fig. 2 for values of gλ in the range 0.10.6 (approximately the value range typical of the NIR to UV wavelength range). The figure shows the Henyey-Greenstein functions plotted over the entire sphere using Mollweide projection, together with the contours of the HEALPix pixels for the derived values of kλ,HP. Our implementation guarantees that at least several points are sampling the peak of the Henyey-Greenstein profile, which convolves any scattered light contribution added to jλ,sca(r,θ,φ).

The variable angular resolution for the scattering source function allows a considerable reduction in memory. For example, in the TRUST benchmark slab model (see G17) the “BASIC” lambda grid contains 31 wavelengths from the UV until 60 μm, the range we used for the stellar emission RT. By assuming kλ,HP = 2 at all wavelengths, and given about 700.000 3D grid points, the memory requirement for jλ,sca(r,θ,φ) is about 33 Gbytes. By using the variable angular resolution described above, this is reduced to about 12 Gbytes.

thumbnail Fig. 2

Examples of the implemented sampling of the Henyey-Greenstein phase function Φλ for different values of the anisotropy parameter gλ. The Φλ values on a sphere are visualized using Mollweide projection and according to the colour bar below each panel. The contours of the HEALPix pixels corresponding to the values of kλ,HP from Eq. (5) are overplotted as black lines. The dark blue circle shows the size of the FWHM [ Φλ−Φλ(π) ].

Open with DEXTER

3.2. Multi-wavelength RT

Implementing multi-wavelength calculations in a purely ray-tracing code is harder than for MC codes because of the high memory requirement for both the specific intensity Iλ(r,θ,φ) and the scattering source function jλ,sca. However, this task becomes easier once the memory requirements are reduced when Iλ(r,θ,φ) does not have to be stored in memory and by using a variable angular resolution for jλ,sca. Because of this latter optimization, DART-Ray V2 can perform multi-wavelength RT calculations without exceeding the RAM memory of modern computer cluster nodes typically of the order of 100 Gbytes. This addition allows the inclusion of dust self-heating (see Sect. 3.3) which, being non local in wavelength, cannot be easily handled using a succession of monochromatic RT calculations. Furthermore, many ray-tracing steps are exactly the same for all wavelengths and they are not repeated in multi-wavelength runs.

Since rays carry multi-wavelength intensities, one has to check that the ray blocking criterion (Eq. (1)) is fulfilled at all wavelengths. During the ray propagation, this criterion may only be satisfied at some wavelengths. In this case, the code still propagates the ray in the current direction but it does not add the contributions to the RFED and jλ,sca at the wavelengths for which the intensity has become negligible. Since updating jλ,sca is demanding computationally, this helps to reduce further the calculation time.

3.3. Dust self-heating

The previous version of DART-Ray assumed that the dust emission is always optically thin. Thus, it did not consider the absorption of dust emission at other locations (called dust self-heating) as well as the scattering of dust emission. This assumption is not correct for models which are optically thick in the infrared range. This was the main source of disagreement at infrared wavelengths between DART-Ray and the other codes in the TRUST slab benchmark project for the most optically thick models (see G17). In this section, we explain the implementation of the dust self-heating in DART-Ray V2. The result comparison for optically thick models is shown in Sect. 4.

The dust emission spectra at each position is determined by the RFED (or alternatively the average radiation field intensity), the dust density and the dust absorption coefficient (see e.g. Popescu et al. 2011; Steinacker et al. 2013). In DART-Ray this can be calculated assuming either equilibrium between the dust and the radiation field or by deriving the full stochastically heated dust emission spectra (see Natale et al. 2015; Camps et al. 2015, and the code user guide).

The dust emission RT calculation is performed after the stellar emission RT is completed. The main difference between the two calculations is that for the former the emission source spectra depend on the RFED. Therefore, while the stellar emission RT run can be performed only once, following the three steps procedure described in Sect. 2, the dust emission RT requires in principle several iterations of that procedure until the dust emission and the infrared RFED both converge. To handle these dust self-heating iterations, we implemented the following procedure:

  • 1.

    the dust emission spectra are calculated taking into accountonly dust heating from absorbed stellar emission;

  • 2.

    a first RT calculation for the dust emission is performed following the RT algorithm described in Sect. 2;

  • 3.

    the dust emission spectra are recalculated taking into account the dust heating due to both absorbed stellar emission and the absorbed dust emission;

  • 4.

    the difference between the dust emission spectra just calculated jd(r) and the ones calculated at the end of the previous dust self-heating iteration is evaluated: (7)

  • 5.

    another dust radiative transfer calculation is performed during which only the dust emission luminosity stored in Δjd(r) is processed. The RT algorithm is performed skipping the calculation of the RFED lower limit Uλ,LL, which is set equal to the RFED calculated in the previous iteration. Also, the RFED Uλ and jλ,sca are initialized with the corresponding values found in the previous iteration;

  • 6.

    steps 3–5 are repeated until at all positions and wavelengths.

As one can see, the dust emission RT iterations are performed without processing the same dust emission luminosity more than one time. For moderately optically thick models, Δjd(r) tends to be very small compared to jd(r) already after the first self-heating iteration. So, the iterations that follow proceed much faster compared to the first. We tested the validity of this approach in Sect. 4.

3.4. Parallelization

3D dust radiation transfer is computationally very expensive independently of the algorithm used. Therefore, most of the more advanced dust radiative transfer codes use parallelization to reduce the time needed for the calculations.

Task parallelization is straightforward to implement because 3D dust radiation transfer is largely an additive problem. For a given RT model, all the quantities to derive, with the exception of the dust emission source function, are equal to the sum of contributions provided by the radiation from the single sources. Therefore, task parallelization is done by distributing the processing of the radiation sources (or photon packages for MC codes) between different CPUs.

On shared memory machines, as the single nodes of a typical computer cluster, one can easily parallelize the loops over the radiation sources using OpenMP. Unlike MPI, OpenMP allows multiple CPUs to operate on shared arrays. In this way, there is no need for replicating any array or distributing arrays among different processes. Then, to take advantage of multiple nodes simultaneously, a hybrid OpenMP+MPI parallelization scheme is a natural choice, since one can use OpenMP for parallelization within a single node and MPI to handle communication between nodes.

With multiple nodes it is possible to increase substantially the number of CPUs, and so in principle reduce the total computational time. However, in practice, some overheads that are introduced by the time needed for nodes to communicate and to process the exchanged information. These overheads can become significant when data parallelization among nodes is implemented.

In DART-Ray, the vast majority of the memory consumption is due to the scattered luminosity source function. For this reason, we did not implement any data parallelization for the other arrays (e.g. the 3D spatial grid coordinates) which are all replicated in all nodes. Instead, we have implemented two data parallelization choices for the scattered luminosity source function: a “communication” mode and a “no-communication” mode.

In the communication mode, the scattered luminosity source function is distributed among the node memories such that each node contains the scattered source function for different sets of wavelengths. The communication between nodes is needed in two cases: firstly to add the δjλ,sca contribution at all wavelengths after each ray-cell intersection; secondly, during the scattering iterations, the values at all wavelengths of jλ,sca for each source are needed by the same node that has to process it.

In the first case, in order to minimize data exchange, instead of passing the δjλ,sca contribution to the scattering source function in each node, only the total scattered luminosities (integrated over all angular directions) and the ray directions are passed to the corresponding node. Then, this information is processed to calculate the angular distribution of scattered luminosity to be added to jλ,sca for a certain dusty cell. Furthermore, to minimize communication, large packets of data, containing the scattered luminosity contributions due to many ray-cell intersections, are collected within each node and then exchanged between the nodes (see code documentation for more details).

Despite all efforts we put to minimize data exchange and reduce the processing time of the received data, the overheads in the communication mode can still be substantial. Therefore, we also implemented a simpler no-communication mode where all arrays are replicated in each node, including the scattering source function. In this mode, data communication is performed only at the end of the radiation source loops to sum up the arrays calculated separately by all nodes. An example of the speed-up performances of the communication and no communication parallelization modes is shown in Fig. 3. In this figure, we show the wall clock speed-up of the calculation, compared to the serial execution, for the N-body and SPH galaxy model example contained in the DART-Ray current release (see DART-Ray User Guide). As expected, the no communication mode scales much better with the number of CPUs than the communication mode. However, it is also much more expensive in terms of memory. Nonetheless, with the typical RAM memory of computer cluster nodes of the order of hundreds of Gbytes and thanks to our implementation of the wavelength-dependent angular resolution for the scattering source function, it is possible to use this parallelization mode in the majority of cases. We recommend that users of the code use this mode, unless the memory requirements are so high that data distribution is unavoidable.

thumbnail Fig. 3

Wall clock speed-up with respect to the serial execution for an N-body and SPH galaxy simulation by using the no communication and communication parallelization mode. For all the parallel runs, we used eight CPUs for each MPI process (that is, eight OpenMP threads per MPI process).

Open with DEXTER

3.5. Control of the inaccuracy due to ray blocking

There are several factors affecting the numerical accuracy of the RT calculations performed by DART-Ray. Firstly, DART-Ray uses a spatial grid to discretize the distribution of the diffuse stellar emission and dust mass. The RFED distribution as well as the source functions are also evaluated on this same spatial grid. However, it is not possible to set the resolution of the spatial grid sufficiently high to attain a pre-defined level of accuracy by the end of the calculation. While creating the grid, one typically utilizes higher resolution grid elements in regions of higher stellar volume emissivity and dust density. Although reasonable to expect a more rapid variation of the radiation field in those regions, there is no guarantee that the spatial resolution is adequate everywhere on the grid. In the absence of iterative procedures to increase the spatial resolution during the RT calculation, the effect of the grid discretization on the numerical accuracy can only be checked by repeating the calculations at progressively higher spatial resolutions. Similarly, the finite number of angular directions of the rays that are cast from each radiation source, as well as the discretization of the scattering source function, also affect the calculation accuracy.

Apart from these factors, common to all 3D dust RT codes although in different forms6, in DART-Ray the numerical accuracy is also affected by the estimate of the extent of the source influence volumes. This is because DART-Ray calculates the RFED contributions from each source only within this volume surrounding the source itself, thus neglecting the contributions outside it. Since this is a core characteristic of DART-Ray, we are interested in quantifying the accuracy error due to the cut off of the rays when they reach the boundary of the estimated source influence volume. We note that this accuracy error is systematic, since it will always produce RFEDs which are underestimated compared to the correct value.

The ray cut off occurs when the RFED contribution δUλ carried by a ray satisfy the criterion 1. So, once a lower limit to the RFED distribution Uλ,LL has been estimated, the input-defined parameter fU is the key factor affecting the numerical accuracy of the calculation. In N14, we stated that fU should be low enough to preserve energy balance, in the sense that, at the end of the calculation, the total radiation luminosity that has been neglected because of the ray cut-off should be only a small fraction of the total luminosity of the model. In this case, the effect of cutting the rays is minimal because almost all the radiation luminosity has been followed in its propagation throughout the model. However, since this energy balance can only be checked at the end of the RT run, potentially several attempts had to be made to find the appropriate value for fU. Instead, in the following, we show how the value of the parameter fU can be set before an RT run such to guarantee the desired level of accuracy.

We have been able to find a relation between fU and the accuracy of the derived RFED distribution by making a minor change in the definition of δUλ(r) in formula 1, compared to NA14. This is now defined as: (8)where Iλ is the average specific intensity of the ray within the ray-cell intersection path, AEM is the area of the emitting cell originating the ray, ΩINT is the solid angle subtended by the intersected cell, LINT is the linear size of the intersected cell, VINT its volume and c the light speed. This formula differs only slightly from the corresponding formula in N14: the factor LINT replaces the ray-cell intersection path length, and the factor ΩINT replaces the ray beam solid angle ΩHP,EM. So, in the previous version, δUλ(r) in relation 1 was the RFED contribution of the single ray to the intersected cell RFED. Instead, δUλ(r) now represents approximately the total RFED contribution of the radiation source originating the ray to the intersected cell. In this way, every time the criterion for δUλ(r) is checked, the relevance of the radiation source in determining the local RFED is considered, not just that of the single ray (which can simply have a small intersection path or a small associated ΩHP,EM).

With this change, an appropriate value for fU can be derived as follows. When the condition 1 is realized during the ray propagation, one would like to assure that the small contribution δUλ(r) does not sum up with comparable contributions from many other radiation sources which cumulatively provide a non negligible contribution to the intersected cell RFED. In the highly improbable case that all other radiation sources in the RT model provide a RFED contribution as low as δUλ(r), the cumulative contribution iδUλ,i(r) would be such that: (9)where Ns is the total number of radiation sources in the model. By requiring that the RHS of the above inequality is only a small fraction aRT of the final value Uλ(r) for the RFED, we have: (10)in the above relation, the factor aRT represents the desired accuracy of the RT calculation at each position. By assuming conservatively that Uλ,LL(r) is a substantial fraction of Uλ(x), that is Uλ,LL(r) ~ 0.25Uλ(r), we have then a relation between fU and aRT: (11)By taking advantage of the above relation, DART-Ray sets the fU parameter to , for a given input-defined accuracy parameter aRT. We point out that this input-parameter can be used to control only the inaccuracy due to the blocking of the rays, not the other factors mentioned at the beginning of this section.

3.6. Other updates

We list here other relevant updates to the code.

3.6.1. Point sources

It is now possible to include a set of point sources at arbitrary positions within the 3D grid. This is useful for including unresolved objects, such as stars within a molecular cloud, or an AGN within a galaxy model.

3.6.2. Use of HDF5 files

The 3D grid, the output arrays, such as the RFED and the scattering source function, and the surface brightness maps are now be written to files in Hierarchical Data Format7 (HDF5), although the output can be defined and restricted by the user. HDF5 format offers quicker I/O and smaller file sizes compared to standard ASCII output.

3.6.3. Internal observer maps

Surface brightness maps, as seen by an observer within the RT model, can now be produced with DART-Ray. This is useful for creating images and animations for public presentations, and it allows the user to reproduce observations of the Milky Way. The output is in HEALPix format, which is a format used in all-sky surveys, including the recent Planck data in the infrared. This feature has been used in Popescu et al. (2017) and Natale et al. (in prep.) to construct a radiation transfer model of our own Galaxy.

thumbnail Fig. 4

Comparison of the λ = 35.11 μm edge-on images of the TRUST slab benchmark for the vertical optical depth τ(1 μm) = 10. The solutions provided by the old and new DART-Ray version are included as well as those of the other codes participating to the project. Units on the images are MJy/sr. The plots on the left show the average surface brightness profiles and the relative differences between the solutions along a vertical and a horizontal strip, whose boundaries are shown within the top two images (CRT and new DART-Ray code). The X-axis of these plots are in units of pixels. The inclusion of dust self-heating in the new version of DART-Ray allows a much closer agreement with the other codes. We note that these solutions are for the “effective grain” case (see G17).

Open with DEXTER

3.6.4. 2D mode

DART-Ray contains a 2D mode which can be used for axisymmetric models. The calculations are still performed on a 3D Cartesian grid but, in this mode, DART-Ray performs the ray-tracing calculations only for the cells located in the first grid octant. Then, taking advantage of the problem symmetries, it derives the RFED and scattering phase function contributions from cells in other octants. This mode is about a factor of 8 times faster than the standard 3D mode.

4. Comparison with TRUST benchmark solutions at high optical depth

DART-Ray has been the only purely ray-tracing code that provided solutions for the first benchmark paper of the TRUST radiation transfer benchmark project (see G17). In that study, several codes have been used to compare the results for a geometry constituted by a dusty slab of uniform density illuminated by a star placed above it. Each code had to provide both total SEDs and images for a set of observer lines-of-sight and a number of wavelengths. Four different models have been considered which varied only for the vertical optical depth of the slab at 1 μm. In that paper, it is shown that the DART-Ray solutions are in good agreement with all model solutions from the other codes, except for one model, which has the largest optical depth (τ(1 μm) = 10, see the TRUST benchmark website for all the comparison plots8). In addition, it was shown that the discrepancy in the dust emission for the most optically thick case was due to the absence of dust self-heating in the old version of DART-Ray. In particular, the lack of scattered dust emission on the images produced large discrepancies between DART-Ray, as well as TRADING, and most of the other codes (see Fig. 9 in G17).

DART-Ray V2 includes dust self-heating as well as dust emission scattering. In order to test that the implementation of these effects is correct, we re-calculated all the TRUST benchmark solutions with the current code. These solutions have now been added to the TRUST website where one can check that they differ at most by ~10% from the other code solutions in all cases. Here we show only the comparison of the images at λ = 35.11 μm, τ(1 μm) = 10 and for the edge-on view, which was taken as an example in G17 of the importance of including dust self-heating. This is shown in Fig. 4 where we included all the other code solutions as well as the old and new DART-Ray solutions. As one can see from the average vertical and horizontal profiles of the surface brightness, DART-Ray V2 produces a MIR image which is much closer to the images of the codes including dust emission scattering. The residual discrepancy is mainly due to the lower spatial resolution of the DART-Ray grid compared to that of the other codes (see G17) together with a contribution due to the ray blocking criterion up to a few percents. We found the same result for all other cases not shown here.

thumbnail Fig. 5

Distributions of the relative number of cells as a function of the relative discrepancy for the RFED for the MW model derived by assuming aRT = 0.005 and aRT = 0.05. The plots show the results for the UV, optical and NIR wavelengths used in the RT calculations. We note that the relative discrepancy is always lower than 5% as expected.

Open with DEXTER

thumbnail Fig. 6

Distributions of the relative number of cells as a function of the average ray length crossed by the rays from each cell, normalized to the model linear size, for the direct light processing phase and for some of the scattering iterations. The results are shown for each UV, optical, NIR wavelengths used in the calculations and for the optical depths scaled by the factor 0.5, 1, and 2 compared to the original MW model.

Open with DEXTER

Table 1

Median values of the average ray crossing length distribution for the Milky Way models with the optical depth scaled by the factors 0.5, 1 and 2.

Apart from images and SEDs, the other main quantity calculated by RT codes is the RFED. Unfortunately, this quantity is more difficult to compare because different codes use different types of grids and resolutions. For this reason, no comparison of the RFED has been made in G17 and the agreement for the dust emission has been taken as an evidence that the RFED have been calculated correctly. We note that the RFED solutions provided by DART-Ray for axisymmetric galaxy models have been compared and found in good agreement with those presented by Popescu & Tuffs (2013; see NA14).

5. The source influence volume in galaxy models

The efficiency of the DART-Ray algorithm is based on its criterion to block rays expressed by Eq. (1). In the best case scenario, this criterion is satisfied after the rays have crossed only a small part of the model. This would allow the RT calculations to proceed rather quickly. Instead, if the rays have to cross a large fraction of the model before being blocked, the DART-Ray algorithm becomes inefficient. It is therefore interesting to measure the lengths after which the rays are blocked compared to the model size. In this section, we show an analysis of the distribution of these crossed lengths for the Milky Way (MW) galaxy model presented in Popescu et al. (2017). The analytical formulae describing the distribution of stars and dust opacity at each wavelength for this model can be found in that paper.

As discussed in Sect. 3.5, the size of the source influence volume (and thus the lengths at which the rays are blocked) depends on the numerical accuracy that has to be reached in the RT calculation. For the tests presented in this section, we set a maximum numerical inaccuracy to 5%. This can be achieved by setting aRT = 0.05 since the code uses Eq. (11) to set the threshold parameter fU. In order to check that Eq. (11) can be used to set the maximum inaccuracy correctly, we also performed a much more accurate calculation with aRT = 0.005 and compared the results for the RFED for a UV (0.150 μm), an optical (0.443 μm) and a NIR (2.2 μm) wavelength. The distribution of the cells as a function of the relative discrepancy for the RFED for these two different calculations is shown in Fig. 5. As one can see, the relative discrepancy is never higher than 5% in absolute values, proving that the accuracy prescription used to set the threshold parameter fU works correctly.

By assuming aRT = 0.05, we derived the distribution of the average path crossed by rays departing from each cell for the Milky Way model at the UV, optical and NIR wavelengths mentioned above. We derived this distribution for the direct light processing phase as well as for each scattering iteration. Also, in order to see the effect of varying optical depth on the distributions, we also calculated them for Milky Way models with the dust opacity distribution artificially scaled by the factor 0.5 and 2. All these distributions are shown in Fig. 6. We note that the ray path lengths are expressed in units of the model linear size. Also, not all the scattering iteration distributions are shown in order to make the histograms clearer. The median values for all distributions are shown in Table 1.

From Fig. 6 and Table 1 a few conclusions can be drawn about the sizes of the source influence volumes for each cell and for each calculation phase. Independently of the optical depth, the sizes of the source influence volume are the highest for the direct light processing phase where they can be of the order of half of the model linear size or more. However, the volume sizes decrease with the order of the scattering iterations. In particular, the decrease is rather steeper in the NIR infrared wavelength than that at UV and optical wavelengths. We note that for the NIR wavelength the sizes of the influence volume for the direct light processing are the highest while, at the same time, they shrink rapidly with the order of the scattering iterations. This is because both the optical depth and the albedo of the models at NIR wavelengths are much smaller compared to UV and optical wavelengths. In fact, because of the lower optical depth in the NIR, the ray specific intensity decreases less rapidly during the ray propagation, and the collective contributions to the RFED by many cells at large distances are more important. This makes the source influence volumes larger for the direct light. At the same time, the scattered light has much lower intensity compared to the direct light and does not contribute much to the RFED far away from the dusty cells that originate it. Therefore, the influence volume sizes for the scattered light become small very quickly.

The effect of increasing the optical depth for the same geometry seems to be different for the direct light and the scattered light iterations. For the direct light, the sizes of the source influence volumes do not change much. Instead, for the same scattered light iteration, the influence volumes seem to be larger with increased opacity. There is no simple explanation for this effect. On one hand increasing the opacity increases the efficiency of scattering light. On the other, it also reduces more rapidly the ray intensity and thus the contribution to the RFED and to the scattered light intensity at large distances. The overall effect seems to be an enlargement of the source influence volumes as well as an increase of the number of scattering iterations required to complete the RT calculation.

6. Pro and cons of the DART-Ray code

The DART-Ray code is one of the few 3D dust RT codes which do not use the MC method (see Steinacker et al. 2013) and the only one using an algorithm based on estimating the source influence volume extents. The originality and the relative novelty of this code are accompanied by several advantages and disadvantages:

Advantages

  • no MC noise;

  • RT calculation very efficient for higher order of the scattered light;

  • it calculates the radiation field energy density accurately everywhere, even when its knowledge is not required to produce images;

  • alternative method that can be used to validate further scientific results obtained by MC codes;

  • allows to calculate images at arbitrary observer positions without repeating the entire RT calculation;

  • flexibility to change input geometry, dust model, stellar emission library;

  • easy to import N-body and SPH simulations in tipsy format.

Disadvantages

  • high memory requirements;

  • lack of subgrid resolution (exploited by MC codes);

  • typically longer calculation times compared to those of MC codes;

  • direct light calculation rather inefficient when the source influence volumes are close to the entire RT model;

  • only Cartesian adaptive grid implemented.

7. Possible further improvements

DART-Ray V2 is a major improvement compared to the code presented in NA14. Apart from the new capabilities, the code has now a solid structure and documentation that makes further development possible. The main barrier to overcome in DART-Ray is the reduction of the calculation time for models in which the sources have influence volume sizes of the order of the entire model size. For example, this typically happens for galaxy models in the infrared range, where the galaxy is more transparent and sources cumulatively contribute to the RFED at large distances. However, in the same models, the sources of scattered light tend to have rather small influence volumes and, therefore, the processing of scattered light proceeds much faster. A more efficient algorithm could be built which processes the direct light in a more efficient way than the adopted source-to-cell approach, while leaving the algorithm as it is for the scattered light processing.


2

Scattering is particularly anisotropic in the UV and optical wavelength regimes, while it is almost isotropic in the infrared (Draine 2003).

3

One important difference is that the value of Uλ,LL(x) is updated firstly with the RFED distribution found at the end of step 2 and then with that found at the end of each scattering iteration.

4

This problem affects all methods which determine Iλ(r,θ,φ) directly, including finite-differencing, discrete ordinates and other ray-tracing methods.

5

Formula (5) can be found by inverting the following equivalence between the approximate pixel solid angle required for the minimum angular resolution and the exact pixel solid angle, equal to the ratio of the total solid angle divided by the number of spherical pixels:

6

Even in MC codes, although photon particles can propagate in any possible direction and be scattered at any location within an RT model, the number of particle directions that can be followed is still finite. This inevitably produces a discretization error which can only be reduced by increasing substantially the number of particles.

Acknowledgments

G.N. and C.C.P. would like to acknowledge support from the Leverhulme Trust research project grant RPG-2013-418. V.P.D. is supported by STFC Consolidated grant #ST/M000877/1. G.N. thanks Dimitris Stamatellos for useful comments that helped improve the paper and Karl Gordon for uploading the solutions presented in this paper to the TRUST project website. We thank the anonymous referee for insightful comments on the code algorithm.

References

All Tables

Table 1

Median values of the average ray crossing length distribution for the Milky Way models with the optical depth scaled by the factors 0.5, 1 and 2.

All Figures

thumbnail Fig. 1

Flowchart of the angular density optimization algorithm for the rays originating from a single source. All departing rays are collected in lists (high or low, see text) and their propagation is followed through the RT model. Depending on their beam size and contribution to the RFED, they are split or merged during the “select ray” phase. We note that, during step 1 of the RT algorithm (see Sect. 2), the decision condition δU>fUULL is substituted by dray<Rpre with dray the distance crossed by the ray and Rpre the input maximum distance that can be crossed in this step.

Open with DEXTER
In the text
thumbnail Fig. 2

Examples of the implemented sampling of the Henyey-Greenstein phase function Φλ for different values of the anisotropy parameter gλ. The Φλ values on a sphere are visualized using Mollweide projection and according to the colour bar below each panel. The contours of the HEALPix pixels corresponding to the values of kλ,HP from Eq. (5) are overplotted as black lines. The dark blue circle shows the size of the FWHM [ Φλ−Φλ(π) ].

Open with DEXTER
In the text
thumbnail Fig. 3

Wall clock speed-up with respect to the serial execution for an N-body and SPH galaxy simulation by using the no communication and communication parallelization mode. For all the parallel runs, we used eight CPUs for each MPI process (that is, eight OpenMP threads per MPI process).

Open with DEXTER
In the text
thumbnail Fig. 4

Comparison of the λ = 35.11 μm edge-on images of the TRUST slab benchmark for the vertical optical depth τ(1 μm) = 10. The solutions provided by the old and new DART-Ray version are included as well as those of the other codes participating to the project. Units on the images are MJy/sr. The plots on the left show the average surface brightness profiles and the relative differences between the solutions along a vertical and a horizontal strip, whose boundaries are shown within the top two images (CRT and new DART-Ray code). The X-axis of these plots are in units of pixels. The inclusion of dust self-heating in the new version of DART-Ray allows a much closer agreement with the other codes. We note that these solutions are for the “effective grain” case (see G17).

Open with DEXTER
In the text
thumbnail Fig. 5

Distributions of the relative number of cells as a function of the relative discrepancy for the RFED for the MW model derived by assuming aRT = 0.005 and aRT = 0.05. The plots show the results for the UV, optical and NIR wavelengths used in the RT calculations. We note that the relative discrepancy is always lower than 5% as expected.

Open with DEXTER
In the text
thumbnail Fig. 6

Distributions of the relative number of cells as a function of the average ray length crossed by the rays from each cell, normalized to the model linear size, for the direct light processing phase and for some of the scattering iterations. The results are shown for each UV, optical, NIR wavelengths used in the calculations and for the optical depths scaled by the factor 0.5, 1, and 2 compared to the original MW model.

Open with DEXTER
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.