Computing the gravitational potential on nested meshes using the convolution method

Eduard I. Vorobyov; James McKevitt; Igor Kulikov; Vardan Elbakyan

doi:10.1051/0004-6361/202244701

Home

All issues

Volume 671 (March 2023)

A&A, 671 (2023) A81

Full HTML

Open Access

Issue		A&A Volume 671, March 2023


Article Number		A81
Number of page(s)		15
Section		Numerical methods and codes
DOI		https://doi.org/10.1051/0004-6361/202244701
Published online		08 March 2023

A&A 671, A81 (2023)

Computing the gravitational potential on nested meshes using the convolution method^★

Eduard I. Vorobyov¹, James McKevitt¹, Igor Kulikov² and Vardan Elbakyan³^,4

¹ University of Vienna, Department of Astrophysics, Türkenschanzstrasse 17, Vienna 1180, Austria
e-mail: eduard.vorobiev@univie.ac.at
² Institute of Computational Mathematics and Mathematical Geophysics SB RAS, Lavrentieva ave., 6, Novosibirsk 630090, Russia
³ Research Institute of Physics, Southern Federal University, Stachki Ave. 194, Rostov-on-Don 344090, Russia
⁴ School of Physics, University of Leicester, Leicester, LE1 7RH, UK

Received: 5 August 2022
Accepted: 6 January 2023

Abstract

Aims. Our aim is to derive a fast and accurate method for computing the gravitational potential of astrophysical objects with high contrasts in density, for which nested or adaptive meshes are required.

Methods. We present an extension of the convolution method for computing the gravitational potential to the nested Cartesian grids. The method makes use of the convolution theorem to compute the gravitational potential using its integral form.

Results. A comparison of our method with the iterative outside-in conjugate gradient and generalized minimal residual methods for solving the Poisson equation using nonspherically symmetric density configurations has shown a comparable performance in terms of the errors relative to the analytic solutions. However, the convolution method is characterized by several advantages and outperforms the considered iterative methods by factors 10–200 in terms of the runtime, especially when graphics processor units are utilized. The convolution method also shows an overall second-order convergence, except for the errors at the grid interfaces where the convergence is linear.

Conclusions. High computational speed and ease in implementation can make the convolution method a preferred choice when using a large number of nested grids. The convolution method, however, becomes more computationally costly if the dipole moments of tightly spaced gravitating objects are to be considered at coarser grids.

Key words: methods: numerical / gravitation / hydrodynamics

^⋆

Code can be viewed at https://github.com/jamesmckevitt/CM4NG

The Authors 2023

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1 Introduction

Gravity, together with magnetic fields and radiation, is a key mechanism that determines the formation and evolution of many astrophysical objects from planets to clusters of galaxies. On rare occasions the gravity force can be found analytically (Binney et al. 1987). In practice, various numerical methods are often employed to compute the gravitational potential using the Poisson equation (e.g., Binney et al. 1987; Press et al. 1992; Bodenheimer et al. 2007). The gravity force can then be calculated from the gradient of the potential using discretization schemes with various levels of complexity (e.g., Stone & Norman 1992; Kulikov & Vorobyov 2016; Wang & Yen 2020). The required computational resources are often substantial and may take up a large portion of the entire runtime of a three-dimensional numerical hydrodynamics code. Therefore, fast methods such as the fast Fourier transform (FFT) are often a method of choice if the geometry of the system permits regular meshes with equidistant cells (Press et al. 1992).

In many astrophysical applications, the FFT method is difficult to apply because of the requirement for irregular computational meshes that can better resolve the spatial regions where matter accumulates and density increases. For example, the gravitational collapse of interstellar clouds involves an increase in the gas density in the cloud interiors by many orders of magnitude (e.g., Masunaga et al. 1998; Larson 2003). Extensions of the FFT algorithm to nonequidistant meshes exist (Potts & Tasche 2008) but they are usually much slower than their classical counterparts. In such situations, nested meshes or, more generally, adaptive meshes are often used to achieve a better numerical resolution in the innermost regions (e.g., Matsumoto & Hanawa 2003; Machida et al. 2005; Tomida et al. 2015; Hennebelle 2018; Lebreuilly et al. 2020).

Finding the gravitational force on nested meshes is a complicated task. A contemporary approach involves solving for the Poisson equation first on the coarsest grid. The result is then used to calculate the gravitational potential on the refined grid using, for the boundary conditions, the potential at the coarsest grid (e.g., Matsumoto & Hanawa 2003; Guillet & Teyssier 2011; Wang & Yen 2020). Multigrid methods are often used on each level to improve the accuracy, especially when tightly spaced gravitating objects are present on the deepest levels of refinement (Matsumoto & Hanawa 2003; Matsumoto 2007; Truelove et al. 1998; Stone et al. 2020). Moving in this fashion from the coarsest to the most refined grid, the gravitational potential on the entire nested mesh can be retrieved. However, this “outside-in” approach has a potentially serious drawback – this algorithm is difficult to parallelize because computations on a refined grid cannot be done until the potential on a coarse grid is found. According to our experience, calculating the gravity force in numerical hydrodynamics equations on nested meshes with a moderate base resolution N ≤ 64³ may take a large fraction of the entire computational time. Therefore, developing alternative methods is worthwhile.

Here, we present a conceptually different method of computing the gravitational force on nested meshes, which employs the integral form of the gravitational potential and the convolution theorem to solve the resulting triple sum. To the best of our knowledge, this method has not been explicitly described in the literature (but see Ruffert & Mueller 1990), although its application to regular (non-nested) meshes in astrophysics is well known (e.g., Müller et al. 2012; Vorobyov et al. 2020; Stücker et al. 2020). The layout of this work is as follows. In Sect. 2 we briefly outline the outside-in solution procedure using the conjugate gradient method and provide the test results on two nonspherically symmetric mass distributions. In Sect. 3 we review the general convolution method and in Sect. 4 we describe its extension to nested meshes. The comparison of models is provided in Sect. 5. The analysis of runtimes is given in Sect. 6. Our main conclusions are summarized in Sect. 8.

Fig. 1

Example of a nested mesh with three levels of refinement: Ω¹, Ω², and Ω³. The gray-shaded cells represent boundary layers for the coarsest grid; the gravitational potential at these boundary layers has to be found a priori.

2 Outside-in conjugate gradient solution procedure

In this section, we briefly outline the outside-in solution method for the gravitational potential Φ on nested meshes. This solution procedure is used merely for comparison with an alternative method that we propose later in Sect. 3. In general, the gravitational potential of an arbitrary mass distribution can be found by solving for the Poisson equation $\nabla \cdot \nabla Φ=ΔΦ=4 π G ρ,$ $\nabla \cdot \nabla {\rm{\Phi = \Delta \Phi = 4}}\pi G\rho ,$ (1)

where Δ is the Laplace operator, G is the gravitational constant, and ρ is the mass volume density. Figure 1 presents a simple example of three nested meshes on the Cartesian two-dimensional grid (x, y). For simplicity we show only two dimensions. However, a third dimension will be added in practical tests. By default, the outer coarsest grid Ω¹ encompasses the entire computational domain with a size of [ω_L : ω_R] × [ω_D : ω_T]. The inner finer grids Ω² and Ω³ provide a better numerical resolution in the inner regions of the computational domain. The physical size of each nested grid in the x- and y-coordinate directions is determined as (ω_R − ω_L)/2^m−1 and (ω_T − ω_B)/2^m−1, where m takes values from 1 to 3 in our simple example. Furthermore, each nested grid is subdivided into N = 4 square cells. Such a mesh refinement is often needed to better resolve the central denser regions of astrophysical objects (e.g., gravitationally collapsing interstellar clouds of gas).

The outside-in solution procedure consists of a sequence of steps starting from the calculation of the gravitational potential on the coarsest Ω^{m = 1} grid. We discretize the Laplace operator using a simple five-point stencil $ΔΦ \approx \frac{1}{d x_{m}^{2}} (Φ_{i + 1, j}^{m} - 2 Φ_{i, j}^{m} + Φ_{i - 1, j}^{m}) + \frac{1}{d y_{m}^{2}} (Φ_{i, j + 1}^{m} - 2 Φ_{i, j}^{m} + Φ_{i, j - 1}^{m}),$ ${\rm{\Delta \Phi }} \approx {1 \over {{\rm{d}}x_m^2}}\left( {{\rm{\Phi }}_{i + 1,j}^m - 2{\rm{\Phi }}_{i,j}^m + {\rm{\Phi }}_{i - 1,j}^m} \right) + {1 \over {{\rm{d}}y_m^2}}\left( {{\rm{\Phi }}_{i,j + 1}^m - 2{\rm{\Phi }}_{i,j}^m + {\rm{\Phi }}_{i,j - 1}^m} \right),$ (2)

where dx_m and dy_m are the sizes of computational cells in x- and y-coordinate direction defined as dx_m = (ω_R − ω_L)/(2^m−1 N) and dy_m = (ω_T − ω_D)/(2^m−1 N), respectively. We note that we adopt dx_m = dy_m throughout the paper. A similar algorithm applies to the third z-coordinate dimension in practical tests. The discretization of the Laplace operator introduces boundary values of the potential Φ_B for each nested domain. The corresponding boundary ghost zones for the coarsest Ω¹ grid are shown in Fig. 1 as additional layers of gray-colored cells. These boundary values can be found using the multipole expansion of the Laplace equation ΔΦ = 0 in spherical harmonics (note that by our model setup the volume density outside the computational domain is equal zero). The pertaining formulae can be found in, for example, Jackson & Fox (1999). The higher-order harmonics are added until a desired accuracy in $Φ_{B}^{m = 1}$ ${\rm{\Phi }}_{\rm{B}}^{m = 1}$ is reached. This method usually converges fast for compact density configurations when most of the mass is concentrated in the center of the computational domain, but may become computationally costly or even diverge for arbitrary mass distributions with large amounts of mass present near outer boundaries.

Once the boundary values $Φ_{B}^{m = 1}$ ${\rm{\Phi }}_{\rm{B}}^{m = 1}$ of the gravitational potential are found, the potential on the coarsest grid Φ^m=1 can be computed using one of the methods of linear algebra. In this work, we employ the conjugate gradient method described in, for example, Press et al. (1992). As with any iterative method, a truncation error has to be established a priori. The completion of the iterative process in the conjugate gradient method occurs when the norm of the relative residuals falls below ϵ = 10⁻⁶. In practice, we substitute the calculated grid values of the potential into the numerical scheme for approximating the Poisson equation (see Eq. (2)) and use the resulting data to form a residual difference vector with the grid values of the right-hand side (4πGρ). The norm of this residual vector should not exceed 10⁻⁶ relative to the norm of the right-hand side vector. It is important to note that the achieved accuracy is not the value of the relative error in the gravitational potential. The relative error in the gravitational potential is improved only when the grid is refined. In the iterative process, it is only guaranteed that the grid values of the potential, when substituting into the scheme for the Poisson equation, restore the right-hand side with a chosen truncation error of ϵ = 10⁻⁶. The dependence of the solution on ϵ is investigated in Sect. 5. The potentials Φ^m on the finer grids can then be found by taking the corresponding potentials on the coarser grids Φ^m−1 as the boundary values (which are known from the previous step). Moving in this manner from the outer coarsest grid toward the inner finest grid the gravitational potential in the entire computational domain can be computed. In the following text, the outlined solution procedure is denoted as the outside-in conjugate gradient method (hereafter, OiCG method).

An important prerequisite for the OiCG method is a proper interpolation of the potentials on a coarser grid to be used as the boundary values on a finer grid (Matsumoto & Hanawa 2003; Wang & Yen 2020). Figure 2 shows two examples for the choice of the boundary values $Φ_{B}^{m = 2}$ ${\rm{\Phi }}_{\rm{B}}^{m = 2}$ on the second nested grid m = 2. The first choice shown in the bottom-right corner consists in taking the nearest value of the potential on the corresponding coarser grid m = 1, so that $Φ_{B}^{m = 2} = Φ_{4, 3}^{m = 1}$ ${\rm{\Phi }}_{\rm{B}}^{m = 2} = {\rm{\Phi }}_{4,3}^{m = 1}$ . The subscripts (4,3) refer to the cell number on the two-dimensional (x, y) mesh. This choice, however, is not optimal and will lead to degraded accuracy of the solution, as we subsequently demonstrate below. A better accuracy can be achieved by taking a linear interpolation of the potentials on the coarser m = 1 grid to the center of the corresponding ghost zone of the m = 2 mesh. The algorithm is illustrated in the upper-left corner of Fig. 2 and for the left-hand side interface between the nested grids (i = 1, x-coordinate) can be described as follows $\begin{array}{l} Φ_{B, (i, k, l)}^{m + 1} = 0.75 Φ_{i b, k b, l b}^{m} + 0.25 Φ_{i b + 1, k b + 1, l b + 1,}^{m} \\ if mod (k,2) \neq 0 and mod (1, 2) \neq 0, \end{array}$ $\matrix{ {{\rm{\Phi }}_{{\rm{B,}}\left( {i,k,l} \right)}^{m + 1} = 0.75{\rm{\Phi }}_{ib,kb,lb}^m + 0.25{\rm{\Phi }}_{ib + 1,kb + 1,lb + 1,}^m} \hfill \cr {\quad \quad \,\,\,\,\,\,\,\,\,\,\,{\rm{if}}\,{\rm{mod}}\left( {{\rm{k,2}}} \right)\, \ne 0\,\,{\rm{and}}\,\,{\rm{mod}}\left( {1,2} \right) \ne 0,} \hfill \cr }$ (3) $\begin{array}{l} Φ_{B, (i, k, l)}^{m + 1} = 0.75 Φ_{i b, k b, l b}^{m} + 0.25 Φ_{i b + 1, k b + 1, l b - 1,}^{m} \\ if mod (k,2) \neq 0 and mod (1, 2) = 0, \end{array}$ $\matrix{ {{\rm{\Phi }}_{{\rm{B,}}\left( {i,k,l} \right)}^{m + 1} = 0.75{\rm{\Phi }}_{ib,kb,lb}^m + 0.25{\rm{\Phi }}_{ib + 1,kb + 1,lb - 1,}^m} \hfill \cr {\quad \quad \,\,\,\,\,\,\,\,\,\,\,{\rm{if}}\,{\rm{mod}}\left( {{\rm{k,2}}} \right)\, \ne 0\,\,{\rm{and}}\,\,{\rm{mod}}\left( {1,2} \right) = 0,} \hfill \cr }$ (4) $\begin{array}{l} Φ_{B, (i, k, l)}^{m + 1} = 0.75 Φ_{i b, k b, l b}^{m} + 0.25 Φ_{i b + 1, k b - 1, l b + 1,}^{m} \\ if mod (k,2) = 0 and mod (1, 2) \neq 0, \end{array}$ $\matrix{ {{\rm{\Phi }}_{{\rm{B,}}\left( {i,k,l} \right)}^{m + 1} = 0.75{\rm{\Phi }}_{ib,kb,lb}^m + 0.25{\rm{\Phi }}_{ib + 1,kb - 1,lb + 1,}^m} \hfill \cr {\quad \quad \,\,\,\,\,\,\,\,\,\,\,{\rm{if}}\,{\rm{mod}}\left( {{\rm{k,2}}} \right)\, = 0\,\,{\rm{and}}\,\,{\rm{mod}}\left( {1,2} \right) \ne 0,} \hfill \cr }$ (5) $\begin{array}{l} Φ_{B, (i, k, l)}^{m + 1} = 0.75 Φ_{i b, k b, l b}^{m} + 0.25 Φ_{i b + 1, k b - 1, l b - 1,}^{m} \\ if mod (k,2) = 0 and mod (1, 2) = 0. \end{array}$ $\matrix{ {{\rm{\Phi }}_{{\rm{B,}}\left( {i,k,l} \right)}^{m + 1} = 0.75{\rm{\Phi }}_{ib,kb,lb}^m + 0.25{\rm{\Phi }}_{ib + 1,kb - 1,lb - 1,}^m} \hfill \cr {\quad \quad \,\,\,\,\,\,\,\,\,\,\,{\rm{if}}\,{\rm{mod}}\left( {{\rm{k,2}}} \right)\, = 0\,\,{\rm{and}}\,\,{\rm{mod}}\left( {1,2} \right) = 0.} \hfill \cr }$ (6)

Here, the indices k and l take values from 1 to N + 2 and ib, kb, and lb are related to i, k, and l as ib = N/4 + i/2 + 1, kb = N/4 + k/2 + 1, and lb = N/4 + l/2 + 1, respectively. The algorithm can be generalized to other grid interfaces in a similar manner. More sophisticated methods involving the introduction of additional boundary layers can be found in, e.g., Wang & Yen (2020).

To test the accuracy of the OiCG method, we computed the gravitational potential and gravitational acceleration for two mass configurations with known analytical solutions: an oblate homogeneous ellipsoid and two nonintersecting homogeneous spheres. The former has a ratio of semi-major axes a₁ : a₂ : a₃ = 0.5 : 1.0 : 1.0. The latter is composed of two homogeneous spheres, both having a radius of 0.2. Their centers lie in the x − y plane at −0.5 and 0.5 with respect to the coordinate center. This latter test model mimics a binary star. The computational domain has dimensions of [−2.25:2.25] in each coordinate direction. The ratio of masses of the two spheres is 2:1. The analytic solution for the gravitational potential of an oblate ellipsoid was derived by Chandrasekhar (1969) and is provided in Stone & Norman (1992). The analytic solution for the binary star is the sum of analytic solutions for the two homogeneous spheres, which can also be found in Stone & Norman (1992).

The left pair of columns in Fig. 3 presents the relative errors in the gravitational potential and gravitational acceleration for an ellipsoid with the ratio of semi-major axes a₁ : a₂ : a₃ = 0.5 : 1.0 : 1.0. The errors in the gravitational potential and acceleration are calculated with respect to the analytical solutions as follows ${err}_{Φ} = | \frac{Φ_{num} - Φ_{an}}{Φ_{an}} |; {err}_{g} = | \frac{{| g |}_{num} - {| g |}_{an}}{{| g |}_{an}} |,$ ${\rm{er}}{{\rm{r}}_{\rm{\Phi }}} = \left| {{{{{\rm{\Phi }}_{{\rm{num}}}} - {{\rm{\Phi }}_{{\rm{an}}}}} \over {{{\rm{\Phi }}_{{\rm{an}}}}}}} \right|;\,\,\,\,{\rm{er}}{{\rm{r}}_{\rm{g}}} = \left| {{{{{\left| g \right|}_{{\rm{num}}}} - {{\left| g \right|}_{{\rm{an}}}}} \over {{{\left| g \right|}_{{\rm{an}}}}}}} \right|,$ (7)

where the subscripts “num” and “an” refer to the numerical and analytical values, respectively.

The modulus of gravitational acceleration |g| was calculated using a simple six-point stencil for the central discretization scheme, so that the x-component of the gravitational acceleration is written as $g_{x} = - \frac{\partial Φ}{\partial x} \approx \frac{Φ_{i + 1, j, k}^{m} - Φ_{i - 1, j, k}^{m}}{2 d x_{m}} .$ ${g_x} = - {{\partial {\rm{\Phi }}} \over {\partial x}} \approx {{{\rm{\Phi }}_{i + 1,j,k}^m - {\rm{\Phi }}_{i - 1,j,k}^m} \over {2{\rm{d}}{x_m}}}.$ (8)

The other y- and z-components are expressed accordingly. This simple six-point stencil has to be modified at the cells adjacent to the boundary between fine and coarse grids. When calculating the gravitational acceleration at the cells of a finer grid that are adjacent to a coarser grid, we use the interpolated values of the gravitational potential as described in Eqs. (3)–(6). The opposite case of the gravitational acceleration on a coarser grid is more subtle. A straightforward method would be to use the potentials in the adjacent cells on a finer grid, arithmetically averaged to the geometrical center of the underlying coarser cell. A more physically motivated method makes use of the Gauss theorem and requires that the gravitational acceleration at the coarse-fine grid interface be equal when computed using the fine and coarse grid cells, respectively. We review these cases in Appendix A and note that we used the first simple method in this paper but in real simulations the second method may be more appropriate. The boundary conditions on the gravitational potential for the outermost mesh m = 1 were calculated using the analytical solution. Two mesh configurations were considered: m = 1 (no mesh refinement) and m = 4 (four nested grids). The number of grid cells for each mesh was set equal to N = 128 in each coordinate direction. Clearly, the solution with a simple extrapolation of the boundary potentials to the nearest cell on the coarse grid (as shown in the bottom row of Fig. 2) produces a notably higher relative error in both Φ and |g| than the solution that employs a more sophisticated linear interpolation scheme (middle row in Fig. 2). In the latter case, the mesh interfaces can still be noticed as locations where the relative errors experience jumps or discontinuities.

The relative errors in the gravitational potential and gravitational acceleration for a binary star are shown in the right pair of columns in Fig. 3. This example is useful because it tests the method on a nonaxisymmetric mass distribution. The resulting trends are nevertheless similar – mesh refinement and linear interpolation of the boundary potentials help to improve the accuracy and reduce the errors. The corresponding errors relative to the analytic solution for both density configurations, as well as the standard deviation in the relative errors σ, are provided in Tables 1 and 2 for comparison. The numbers separated by the slash correspond to the cases without and with a linear interpolation following Eq. (6), respectively, and σ represents the standard deviation in the errors.

Fig. 2

Examples of finding the boundary potentials for the m = 2 nested grid. The bottom-right example presents a simple extrapolation of the nearest potential at the m = 1 grid. The top-left example shows a more sophisticated linear interpolation scheme, which involves two neighboring potentials at the m = 1 level.

Fig. 3

Gravitational potential and acceleration on nested grids in the OiCG method. Left pair of columns: relative error in the calculation of the gravitational potential (first column) and gravitational acceleration (second column) for an oblate ellipsoid with the semi-major axis ratio 0.5:1.0:1.0. The horizontal bars present the relative errors in percent (log units). The top row illustrates the errors obtained with no mesh refinement (m = 1). The middle and bottom rows show the errors for the case of four nested grids (m = 4). In particular, the middle row corresponds to a linear interpolation of the boundary potentials, whereas the bottom row presents the errors when a simple extrapolation is used. Right pair of columns: relative error in the gravitational potential (first column) and gravitational acceleration (second column) for a binary star, the spatial location of which is outlined by two circles. The left component is twice as massive as the right one.

Table 1

Errors in gravitational potential: ellipsoid (left) and binary star (right).

Table 2

Errors in gravitational acceleration: ellipsoid (left) and binary (right).

3 Integral form of the gravitational potential and the convolution method

An alternative approach to computing the gravitational potential is to use the integral representation $Φ (r) = - G \underset{V}{∭} \frac{ρ (r')}{| r - r' |} d^{3} r',$ ${\rm{\Phi }}\left( {\bf{r}} \right) = - G\mathop{\int\!\!\!\int\!\!\!\int}\limits_{\kern-5.5pt V} {{{\rho \left( {{\bf{r}}\prime } \right)} \over {\left| {{\bf{r}} - {\bf{r}}\prime } \right|}}{{\rm{d}}^3}{\bf{r}}\prime } ,$ (9)

where the integration is performed over the volume V occupied by the computational domain. On a discrete Cartesian computational mesh the gravitational potential Φ(x_i, y_j, z_k) in a cell (i, j, k) can be written in the following form $\begin{array}{l} Φ (x_{i}, y_{j}, z_{k}) = - G \sum_{i', j', k'} \frac{M (x_{i'}, y_{j'}, z_{k'})}{\sqrt{{(x_{i} - x_{i'})}^{2} + {(y_{j} - y_{j'})}^{2} + {(z_{k} - z_{k'})}^{2}}} \\ = - G \sum_{i', j', k'} M (x_{i'}, y_{j'}, z_{k'}) G (x_{i} - x_{i'}, y_{j} - y_{j'}, z_{k} - z_{k'}), \end{array}$ $\matrix{ {{\rm{\Phi }}\left( {{x_i},{y_j},{z_k}} \right) = - G\sum\limits_{i\prime ,j\prime ,k\prime } {{{M\left( {{x_{i\prime }},{y_{j\prime }},{z_{k\prime }}} \right)} \over {\sqrt {{{\left( {{x_i} - {x_{i\prime }}} \right)}^2} + {{\left( {{y_j} - {y_{j\prime }}} \right)}^2} + {{\left( {{z_k} - {z_{k\prime }}} \right)}^2}} }}} } \hfill \cr {\quad \quad \quad \,\,\,\,\,\,\,\,\,\,\,\, = - G\sum\limits_{i\prime ,j\prime ,k\prime } {M\left( {{x_{i\prime }},{y_{j\prime }},{z_{k\prime }}} \right){\cal G}\left( {{x_i} - {x_{i\prime }},{y_j} - {y_{j\prime }},{z_k} - {z_{k\prime }}} \right),} } \hfill \cr }$ (10)

where M(x_i′, y_i′, z_k′) is the mass in a numerical cell (i′, j′, k′) with coordinates (x_i′, y_i′, z_k′), 𝒢(x_i − x_i′, y_j − y_j′, z_k − z_k′) is the inverse distance between cells (i, j, k) and (i′, j′, k′), and the summation is performed over all numerical cells.

Finding the gravitational potential for a numerical mesh with N³ cells by direct summation requires (2N)⁶ operations, which becomes numerically expensive for N greater than a few tens. Therefore, the convolution theorem is often invoked to compute the potential given by Eq. (10). The convolution theorem states that the sum of the following form (Binney et al. 1987) $h_{k} = \frac{1}{\sqrt{2 K}} \sum_{k' = - N}^{k' = N - 1} f_{k - k'} g_{k'}$ ${h_k} = {1 \over {\sqrt {2K} }}\sum\limits_{k\prime = - N}^{k\prime = N - 1} {{f_{k - k\prime }}{g_{k\prime }}}$ (11)

can be computed by taking the Fourier transforms separately for functions f_k−k′ and g_k′, multiplying them in the Fourier space to form ${\hat{h}}_{p} = {\hat{f}}_{p} {\hat{g}}_{p}$ ${{\hat h}_p} = {{\hat f}_p}{{\hat g}_p}$ , and finally taking the inverse Fourier transform of ${\hat{h}}_{p}$ ${{\hat h}_p}$ . For a two-dimensional grid, for example, this requires 2N[6 log₂(2N) + 1] additions and multiplications to be compared to (2N)² operations for a direct evaluation of the sum (11), see Binney et al. (1987). We note that the classic Fourier transformation method to solve the Poisson equation (e.g., Press et al. 1992) is inapplicable to Eq. (10). This is because this method is used to turn a system of discretized Poisson equations into an algebraic expression in the Fourier space, whereas Eq. (10) is already an algebraic expression, the value of which can be found most efficiently using the convolution theorem.

To apply the convolution theorem to the gravitational potential given by Eq. (10), certain manipulations with the mass M and inverse distance 𝒢 have to be made. The convolution theorem is applicable to periodic functions, whereas M and 𝒢 may not be periodic. This problem can be solved by doubling the computational domain in each coordinate direction and assigning zero values to M on this extended mesh (note that M on the original domain is kept intact). Both 𝒢 and M can then be made periodic on the extended domain by rearranging the extended mesh as shown in Fig. 4 (see also, e.g., Hockney & Eastwood 1988). We note that the zero-padding of the ghost domains makes the mass distribution in the active domain effectively isolated, meaning that this method can be only applied to isolated mass distributions and not to periodic ones.

Furthermore, one can notice that the inverse distance 𝒢 formally turns into infinity when i = i′, j = j′, and k = k′. This problem can be solved by introducing a smoothing length ϵ_sm, so that the inverse distance takes the following form $G = \frac{1}{\sqrt{{(x_{i} - x_{i'})}^{2} + {(y_{j} - y_{j'})}^{2} + {(z_{k} - z_{k'})}^{2} + ϵ_{sm}^{2}}}$ ${\cal G} = {1 \over {\sqrt {{{\left( {{x_i} - {x_{i\prime }}} \right)}^2} + {{\left( {{y_j} - {y_{j\prime }}} \right)}^2} + {{\left( {{z_k} - {z_{k\prime }}} \right)}^2} + _{{\rm{sm}}}^2} }}$ (12)

However, there is always an uncertainty as to what value is to assign to ϵ_sm (Müller et al. 2012). A way to circumvent the problem is to evaluate the input to the potential in the cell (i = i′, j = j′, k = k′) from the mass distribution within the cell itself. If we assume that the density ρ is constant within the cell, then the corresponding potential can be found as (see Eq. (9)) $\begin{array}{l} Φ (x_{i} = x_{i'}, y_{j} = y_{j'}, z_{k} = z_{k'}) \\ = - G ρ (x_{i}, y_{j}, z_{k}) \int_{- \frac{d x}{2}}^{\frac{d x}{2}} \int_{- \frac{d y}{2}}^{\frac{d y}{2}} \int_{- \frac{d z}{2}}^{\frac{d z}{2}} \frac{d x' d y' d z'}{\sqrt{x'^{2} + y'^{2} + z'^{2}}}, \end{array}$ $\matrix{ {{\rm{\Phi }}\left( {{x_i} = {x_{i\prime }},{y_j} = {y_{j\prime }},{z_k} = {z_{k\prime }}} \right)} \hfill \cr {\quad \,\,\,\,\,\,\, = - G\rho \left( {{x_i},{y_j},{z_k}} \right)\int\limits_{ - {{{\rm{d}}x} \over 2}}^{{{{\rm{d}}x} \over 2}} {\int\limits_{ - {{{\rm{d}}y} \over 2}}^{{{{\rm{d}}y} \over 2}} {\int\limits_{ - {{{\rm{d}}z} \over 2}}^{{{{\rm{d}}z} \over 2}} {{{{\rm{d}}x\prime {\rm{d}}y\prime {\rm{d}}z\prime } \over {\sqrt {x{\prime ^2} + y{\prime ^2} + z{\prime ^2}} }},} } } } \hfill \cr }$ (13)

where dx, dy, and dz are the sizes of the cell in each coordinate direction. We also used variable substitutions of the following form: x_i - x_i′ = x′, y_j − y_j′ = y′, and z_k − z_k′ = z′. The integral in Eq. (13) can be found either numerically or analytically (Macmillan 1985; Stücker et al. 2020). We note that the assumption of constant ρ may be relaxed and the input to the potential from the cell itself can also be computed for piecewise-linear density profiles (Stücker et al. 2020).

We emphasize that the convolution method does not require us to precompute the boundary values of the gravitational potential at the coarsest grid (using, e.g., the multipole expansion) because of the absence of spatial derivatives in Eq. (10). This can be considered as a clear advantage compared to numerical solutions of the Poisson equation. Nevertheless, the practical applicability of the method described in this section has been limited to Cartesian meshes with grid cells of similar size in each coordinate direction (equidistant grids). This limitation is dictated by the fast Fourier transform (FFT), which was developed for the case of equidistant grids. There are extensions of the FFT methods to nonequidistant grids (e.g., Potts & Tasche 2008), but in our experience they usually work notably slower than the classic FFT methods. The convolution method can be also applied to finding the gravitational potential of disk-like configurations on the polar grid (r, ϕ) with a logarithmic scaling in the radial direction. In this case, the introduction of a new variable u = ln r helps to map the polar grid onto the equidistant Cartesian grid, so that the classic convolution theorem can be used (Binney et al. 1987). However, such cases are rather specific and for a three-dimensional cylindrical grid (z, r, ϕ) the above described method of variable substitution does not work. For 3D cylindrical coordinates, the convolution theorem can be applied only along the ϕ and z-directions, while in the radial direction the direct summation has to be used, which considerably decelerates the method. For 3D spherical coordinates the situation is even worse since the convolution theorem works only along the ϕ-coordinate.

Fig. 4

Schematic illustration of the mesh doubling and rearrangement in the convolution method. The original active region 1 is doubled in each coordinate direction to form three ghost regions (2–4) that are filled with zero masses (M = 0). The doubled mesh is rearranged as shown by the arrows to make the inverse distance G and mass M periodic functions on the new domain.

Fig. 5

Decomposition of two nested grids (top) into three individual grids with cells of equal size to which the classic FFT method can be applied: a coarse grid (Ω¹, bottom left), refined grid (Ω², bottom center), and coarse grid with a cavity ( $Ω_{hole}^{1}$ ${\rm{\Omega }}_{{\rm{hole}}}^1$ , bottom right).

4 Modification of the convolution method for nested grids

In this section, we describe a modification to the convolution method that can be applied to nested Cartesian grids. A nested grid with only two levels of refinement (Ω¹ : Ω²) is used for simplicity to illustrate the method but the procedure can easily be generalized to any level of refinement.

Figure 5 presents an example of the mesh decomposition, which is used to compute the gravitational potential on nested grids. The original mesh (top) is split into three regular grids to which the FFT method can be applied: a coarse m = 1 grid (Ω¹, bottom left), a refined m = 2 grid (Ω², bottom center), and a coarse m = 1 grid with a central cavity ( $Ω_{hole}^{1}$ ${\rm{\Omega }}_{{\rm{hole}}}^1$ , bottom right). The physical size of the cavity is equal to that of the refined m = 2 grid. We note that the density ρ in the cavity is set equal to zero. The grid with a central cavity accounts for the gravitational potential in the refined m = 2 grid created by the matter that lies in the hatched cells outside of this grid. Indeed, if we calculate the potentials on the coarse and refined grids, and then simply add them up, the procedure would not be complete. This is because the calculation of the potential on the refined grid does not take into account the contribution of the matter that lies outside this grid.

The solution procedure consists of three steps: (1) calculating the potential Φ^m=2 on the refined m = 2 grid, (2) calculating the potential Φ^m=1 on the coarse m = 1 grid, and (3) calculating the potential $Φ_{hole}^{m = 1}$ ${\rm{\Phi }}_{{\rm{hole}}}^{m = 1}$ on the coarse m = 1 grid with a central cavity. An additional substep includes an interpolation of $Φ_{hole}^{m = 1}$ ${\rm{\Phi }}_{{\rm{hole}}}^{m = 1}$ to the centers of the refined m = 2 grid as, for example, shown in Fig. 6. The resulting interpolated potential ${\tilde{Φ}}_{hole}^{m = 1}$ ${\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 1}$ is used in the construction of the final solution. Other more sophisticated interpolation procedures can also be devised and may improve the accuracy of the final solution.

The potential on the original nested mesh is then constructed as illustrated in Fig. 7. The gravitational potential on the coarse m = 1 grid is set equal to Φ^m=1, while the potential on the refined m = 2 grid is set equal to the sum of Φ^m=2 and ${\tilde{Φ}}_{hole}^{m = 1}$ ${\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 1}$ . We note that each of the three steps is independent and can be done in parallel, unlike the outside-in solution procedure outlined in Sect. 2. This procedure can be extended to any number of nested grids as follows $\begin{array}{l} Φ (Ω^{1}) = Φ^{m = 1} \\ Φ (Ω^{2}) = Φ^{m = 2} + {\tilde{Φ}}_{hole}^{m = 1}, \\ Φ (Ω^{3}) = Φ^{m = 3} + {\tilde{Φ}}_{hole}^{m = 1} + {\tilde{Φ}}_{hole}^{m = 2}, \\ Φ (Ω^{n}) = Φ^{m = n} + \sum_{k = 1}^{n - 1} {\tilde{Φ}}_{hole}^{m = k}, \end{array}$ $\matrix{ {{\rm{\Phi }}\left( {{\Omega ^1}} \right) = {{\rm{\Phi }}^{m = 1}}} \hfill \cr {{\rm{\Phi }}\left( {{\Omega ^2}} \right) = {{\rm{\Phi }}^{m = 2}} + {\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 1},} \hfill \cr {{\rm{\Phi }}\left( {{\Omega ^3}} \right) = {{\rm{\Phi }}^{m = 3}} + {\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 1} + {\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 2},} \hfill \cr {{\rm{\Phi }}\left( {{\Omega ^n}} \right) = {{\rm{\Phi }}^{m = n}} + \sum\limits_{k = 1}^{n - 1} {{\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = k}} ,} \hfill \cr }$ (14)

where Φ(Ωⁿ) is the gravitational potential on the n-th nested grid. In the following section, we use four nested grids to test the proposed convolution method.

Fig. 6

Interpolation of the gravitational potential $Φ_{hole}^{m = 1}$ ${\rm{\Phi }}_{{\rm{hole}}}^{m = 1}$ , obtained on the coarse m = 1 grid with a central cavity, on the refined m = 2 grid highlighted with hatches. The resulting potential ${\tilde{Φ}}_{hole}^{m = 1}$ ${\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 1}$ is shown with the arrow.

Fig. 7

Construction of the final solution on the original nested mesh (Ω¹ : Ω²).

5 Comparison of the models

In this section, we carry out a comparison of the gravitational potential and acceleration calculated using the two considered models: the convolution method and OiCG method. As in Sect. 2, we consider an oblate ellipsoid and a binary star. We note, however, that here we use the multipole expansion with up to 6 multipole moments to find the gravitational potential at the outer boundary of the coarsest m = 1 grid. This makes the comparison more realistic because in real simulations the analytic solution is rarely available. In both cases, linear interpolation of the boundary potentials at the interfaces between the nested grids is applied. In the convolution method, the interpolation is only used to obtain the gravitational acceleration, while in the OiCG method the interpolation is also required to calculate the boundary potentials at the finer grids for the outside-in procedure to work.

We found that both methods are characterized by comparable accuracy, as can be seen from Fig. 8 for the case of four nested grids. The relative errors are provided in Tables 3 and 4 for convenience. For instance, the maximum error in the gravitational potential does not exceed 1.0% in the case of the binary star and is notably smaller for the ellipsoid. The mean errors are a few per mille (in %) in the case of the ellipsoid and just a fraction of a percent for the binary. As expected, the calculation of the gravitational acceleration shows larger relative errors and the maximum values can reach 10% for the binary. Nevertheless, the mean errors are smaller than 1%. The standard deviation is also low, implying that large errors are rare. A closer look at Fig. 8 indicates that the interfaces between the nested grids are hardly seen in the errors of the gravitational potential, whereas the errors in the gravitational acceleration are often the highest at the position of the interfaces. This means that more accurate schemes for calculating the gravitational acceleration from the gravitational potential (rather than a simple six-point stencil) may be needed to reduce the error and improve the accuracy (e.g., Wang & Yen 2020).

Figure 9 presents the relative errors (in %) for the oblate ellipsoid as a function of radial distance from the coordinate center in the vertical x − z plane. The errors have a wide spread but are mostly below 0.01% for the potential and 0.1% for the acceleration. There is little systematic difference between the two considered methods.

As another comparison test, we calculate the maximum and mean errors in the gravitational potential and acceleration of the oblate ellipsoid for nested grids ranging from m = 1 to m = 10. The results are shown in Fig. 10. The errors in the vertical x − z plane passing through the coordinate center are only considered. In the case of the OiCG method, we also vary the tolerance ϵ with which the solution is found by setting it to 10⁻⁶ and 10⁻⁸ (see Sect. 2). We note that the convolution method is not iterative and this procedure is not needed. Figure 10 indicates that both methods are characterized by errors in the gravitational potential of comparable magnitude. The convolution method produces relative errors in Φ that decline with increasing m, while the corresponding errors in the OiCG method saturate already at m ≥ 3. An important difference between the two considered methods can be seen in the behavior of relative errors for the gravitational acceleration. While the errors in the convolution method saturate with increasing m, the errors in the OiCG method begin to grow. The turning point depends on the tolerance ϵ with which the iterations are controlled. We found that the deterioration occurs in the very central regions of the ellipsoid. This test problem reveals a potentially dangerous behavior of the OiCG method and requires further investigation in the future.

Fig. 8

Comparison of the relative errors in the gravitational potential and acceleration for an oblate ellipsoid (left pair of columns) and binary star (right pair of columns). The top row shows the results for the convolution method, while the bottom row presents the relative errors for the OiCG method. Four nested grids and 128 grid zones in each coordinate direction are used.

Table 3

Errors in gravitational potential: ellipsoid (left) and binary star (right).

Table 4

Errors in gravitational acceleration: ellipsoid (left) and binary (right).

Fig. 9

Relative errors in the gravitational potential (top row) and gravitational acceleration (bottom row) as a function of radial distance from the coordinate center in the vertical x − z plane. The case of the oblate ellipsoid is shown. Left and right columns present the results for the OiCG and convolution method, respectively.

Fig. 10

Relative errors for the gravitational potential (top panel) and gravitational acceleration (bottom panel) for the oblate ellipsoid as a function of the number of nested grids. The black and red lines present the maximum and mean errors, respectively. The dashed lines correspond to the convolution method, while the solid and dash-dotted lines show the results for the OiCG method with the tolerance ϵ set equal to 10⁻⁶ and 10⁻⁸, respectively.

Fig. 11

Runtime vs. the number of cores in computing the gravitational potential of an oblate ellipsoid on m = 8 nested grids. Red and black lines correspond to the convolution method and OiCG method, respectively.

6 Performance comparison

In this section, we compare the two methods for computing the gravitational potential in terms of its performance on the Intel Xeon Platinum 8174 processors with 48 physical cores. We run the test in a loop that has 20 full cycles of the potential calculation in order to reduce the associated overheads and then divide the obtained runtime by a factor of 20. Figure 11 presents the results of our tests for the number of OpenMP threads ranging from 1 to 32, increasing by a factor of 2. The basic tests use the Intel/19 FORTRAN compiler without explicit optimization. Additionally, we run tests with the -Ofast optimization key. For the test configuration, we have chosen an oblate ellipsoid as in Sect. 5. The resolution is N = 128 cells in each direction and the number of nested grids is m = 8. Both values are based on our practical simulations of cloud core collapse and protoplanetary disk formation in the Solar metallicity environment, resulting in a numerical resolution of ≈1 au near the forming star. The tolerance ϵ for the OiCG method is set equal to 10⁻⁸. To compute the FFTs, we used two external libraries: the AMD Core Math Library (ACML) and the FFTW¹ library.

Analysis of Fig. 11 indicates that the convolution method outperforms the OiCG method by more than an order of magnitude on multicore OpenMP applications. Even without parallelization, the convolution method is faster than the OiCG method by a factor of several. The optimization key -Ofast marginally improves the performance of the OiCG method but has little effect on the convolution method. The ACML library is found to be somewhat faster than that of FFTW.

An even better performance on the convolution method can be achieved when graphics processing units (GPUs) are utilized. While an extension of the Intel FORTRAN compiler can offer an interface with GPUs (hereafter, devices) through OpenMP target offload, CUDA by NVIDIA is widely accepted as faster in most cases and thus was the chosen method of implementation (Gayatri et al. 2019). This required the adoption of the NVIDIA FORTRAN compiler². Various NVIDIA devices were used to accelerate the computations of the gravitational potential on each nested mesh, and this was coupled with CPU (hereafter, host) calculations of the boundary conditions and the final assembly of the gravitational potential from potentials on each individual mesh (see Eq. (14)). The results presented here are for the NVIDIA A40 and NVIDIA V100, the parameters for which can be found in Table 5. While the host of the NVIDIA V100, the Intel Xeon E5-2650 v3, offers a maximum of 20 physical cores, for consistency and in keeping with the dimensions of the computational domain, results are presented to a maximum of 16 physical cores. To compute the FFTs, the CUDA library cufft and the iso_c_binding module for FORTRAN to C interoperability were used. As in the example shown in Fig. 11, an oblate ellipsoid was chosen, with a numerical resolution of N = 128 grid zones in each coordinate direction and m = 8 nested meshes. The results are presented in Fig. 12. The CPU-GPU coupled version can be seen to achieve acceleration by a factor of 10 compared to the pure CPU model when at full OpenMP host parallelization. During operations accelerated by the device, 75% of the time is spent performing FFT computation, 20% on transferring the data between the host and the device, and 5% on other auxiliary computations. In total when using GPUs, a factor of 200 times acceleration can be achieved when compared to the original OiCG method with CPU-only implementation (see the black lines in Fig. 11).

Finally, we tested the performance of the Generalized Minimal Residual method (GMRes) when computing the gravitational potential using the discretized form of the Poisson equation (see, Eqs. (1) and (2)). In our particular case, we applied the OiCG method outlined in Sect. 2 to find the potential on each grid. Intel/19 FORTRAN compiler and the MKL Intel library were used for this test problem. No explicit optimization was used. We did this additional test to illustrate how our method can outperform the “standard” methods offered by frequently used numerical libraries. GMRes is an iterative method and, similar to the conjugate gradient method, it requires presetting a desired accuracy when calculating the gravitational potential. We set the tolerance in the relative residuals to ϵ = 10⁻⁶ to be consistent with the OiCG method. We found, however, that GMRes does not converge on N = 128 grid zones unless we reduce the tolerance ϵ. Therefore, we decreased the number of grid zones in each direction to N = 64. GMRes also required too many iterations to converge for nested grid levels m ≥ 7, which may be related to the problem of divergence discussed in the context of Fig. 10. We, therefore, limited our test runs to m = 6. The runtime for the GMRes method for an oblate ellipsoid (again, using 20 loops of full potential calculations to reduce the overheads) is 29 s and this number weakly depends on the OpenMP parallelization. This value is comparable to the best performance of the OiCG method but is much worse than the performance of the convolution method. Considering that the test was done on a twice lower resolution and with a smaller number of nested grids, the GMRes methods of the Intel MKL library is clearly inferior to both methods presented in this work. The GMRes method can be accelerated by reducing the tolerance ϵ, but we found that this leads to degraded accuracy in the calculation of the potential on the innermost nested grid and is therefore not recommended.

Table 5

Different device-host configurations.

Fig. 12

Runtime vs. the number of CPU cores in computing the gravitational potential of an oblate ellipsoid on m = 8 nested grids. The red lines show the results of CPU plus GPU computations, while the black lines present the CPU model only.

7 The dipole problem and the modified convolution method

The original convolution method described in Sects. 3 and 4 can be modified if specific cases are encountered. We illustrate this using the following idealized example problem. We consider an electric dipole with a dipole moment p = qd, where d is the distance between unit charges +q and −q. For simplicity, we place the dipole on a nested mesh with only two nested grids Ω¹ and Ω². Each grid has a numerical resolution of N = 32³. The unit charges are located near the geometrical center of the fine Ω² grid and their spatial position is chosen so as to cancel each other on the course Ω¹ grid. This means that on the coarse grid, the net charge is zero. The resulting electric field geometry is illustrated in Fig. 13. While the field lines show a regular and correct pattern on the fine Ω² grid, the field is artificially suppressed on the course Ω¹ grid because of the charge cancellation. The problem was outlined and successfully addressed in Matsumoto (2007); Truelove et al. (1998); Stone et al. (2020).

Our augmentation of the convolution method is described in detail below. We do not consider possible updates to the OiCG method as this method is used mainly for comparison purposes. Figure 14 presents an example of the mesh decomposition, which can be used to solve the aforementioned dipole problem on two nested grids. In particular, the solution procedure is split into three steps. The first two steps are identical to the original method: the potential on a fine m = 2 grid (Ω², bottom center) and on a coarse m = 1 grid with a central cavity ( $Ω_{hole}^{1}$ ${\rm{\Omega }}_{{\rm{hole}}}^1$ , bottom right) is found. The third step, however, involves expanding the fine grid to the physical size of the coarse grid as shown in the bottom left panel. The resulting doubled fine grid $Ω_{dpb}^{2}$ ${\rm{\Omega }}_{{\rm{dpb}}}^2$ has the same physical size as the coarse grid Ω¹ but the size of individual numerical cells is similar to that of Ω². As a result, $Ω_{dpb}^{2}$ ${\rm{\Omega }}_{{\rm{dpb}}}^2$ has twice as many cells in each coordinate direction. The additional layers of cells are then filled with zeros (white cells in the bottom left panel), while the original physical size of the fine m = 2 grid (indicated by the light-blue shade) retains the original values from the fine Ω² grid. The described algorithm can be written as follows $\begin{array}{l} Φ (Ω^{1}) = Φ_{hole}^{m = 1} + {\tilde{Φ}}_{dbl}^{m = 2}, \\ Φ (Ω^{2}) = {\tilde{Φ}}_{hole}^{m = 1} + Φ^{m = 2}, \end{array}$ $\matrix{ {} \hfill \cr {{\rm{\Phi }}\left( {{{\rm{\Omega }}^1}} \right) = {\rm{\Phi }}_{{\rm{hole}}}^{m = 1} + {\rm{\tilde \Phi }}_{{\rm{dbl}}}^{m = 2},} \hfill \cr {{\rm{\Phi }}\left( {{{\rm{\Omega }}^2}} \right) = {\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 1} + {{\rm{\Phi }}^{m = 2}},} \hfill \cr {} \hfill \cr }$ (15)

The introduction of the doubled fine grid allows us to avoid the charge cancellation on the coarse grid and retrieve the correct behavior of the electric field lines across the grid interface as illustrated in the top panel of Fig. 15. Now, the field lines are continuous and are present on the entire nested mesh (and not only on the fine mesh as was in the original method). Finding the potential on the coarse grid Ω¹ is now a more complex procedure than it was in the original method (see Eq. (14)). The potential on the doubled fine grid $Φ_{dbl}^{m = 2}$ ${\rm{\Phi }}_{{\rm{dbl}}}^{m = 2}$ should further be reduced to the coarse m = 1 grid (this mathematical operation is denoted by tilde) and added to the potential on the coarse grid with a hole $Φ_{hole}^{m = 1}$ ${\rm{\Phi }}_{{\rm{hole}}}^{{\rm{m}} = 1}$ . We note that the above procedure can in principle be extended to any number of nested grids (see Appendix B). For instance, the bottom panel in Fig. 15 presents the electric field of a dipole on three nested grids. As can be seen, the field lines are present and smooth across all grids. However, the modified convolution method suffers from computational overheads and we discuss this issue in more detail in Appendix B.

We want to make the following notes in this context. The electric dipole problem considered above is an extreme case, which helped us to highlight the problem. A close analog of an electric dipole in astrophysics is a binary star. However, masses are never cancelled out but add up to produce the common gravitational field. We have already considered a wide-separation binary system in Sect. 3 and our original convolution method has revealed good performance on this nonaxisymmetric test problem (see also Appendix C), but the situation with close binaries may be different. The dipole component of the gravity force of a close binary star falls off with distance faster (~1/r³) than the basic field component (~1/r²). This means that special care should be taken when computing the potential of a binary star in its vicinity, where the dipole moments are important, but at distances much greater than the binary separation the dipole contribution may be insignificant.

We illustrate the problem in Fig. 16 by computing the gravitational field of a close binary in the original and modified convolution methods. We consider a tight binary with companions located next to each other on the finest grid (Ω² for two nested grids and Ω³ for three nested grids). The densities in the neighboring cells occupied by the binary are set to ρ = 1 and ρ = 2, implying the companion mass ratio 1:2, and are zeroed everywhere else in the computational domain. We then calculate the relative error (in per cent) between the two methods, keeping in mind that the modified method computes the dipole moments accurately, but the original method does not. The top panel corresponds to the binary placed in the center of the finest grid. The bottom panel presents the case when the binary is located near the upper interface between Ω¹ and Ω² when two nested grids are considered and near the interface of Ω² and Ω³ for the case of three nested grids. Each level of refinement has N = 128³ grid cells.

We first consider two nested grids shown in the left column of Fig. 16. Both methods produce identical solutions on the refined Ω² grid and, hence, the relative error is zero there. The deviations between the two methods occur on the coarse Ω¹ grid, for which the original method does not reproduce correctly the dipole moments, and the errors depend on the spatial location of the binary. The errors are highest when the binary is near the grid interface (bottom panel) and can reach several per cent on the coarse grid. On the other hand, the errors stay below 1% if the binary is in the center of the fine grid (top panel). The case of three nested grids is shown in the right column of Fig. 16. Here, the comparison is made between the modified convolution method that correctly computes the dipole moments only on the finer Ω² and Ω³ grids, but not on the coarsest Ω¹ grid, and the modified convolution method that correctly computes the dipole moments on all three nested grids. Both cases produce identical potentials in the Ω² and Ω³ grids and the relative error is zero there, but they deviate in the outermost Ω¹ grid. However, the errors stay low and do not exceed 0.35%, regardless of the spatial position of the binary. If we degrade the numerical resolution to N = 64³ and N = 32³, the maximum errors slightly increase to 0.7 and 1.5%, which can be expected with the overall decrease in accuracy of the method as the resolution drops.

These test examples demonstrate the utility of the updated convolution method in correctly reproducing the dipole field of a binary across the coarse-fine grid interfaces. But it also shows that the error quickly drops with distance from the binary. A similar behavior was found when we varied the separation between the binary components by several grid cells. Consideration between the associated numerical costs and desired accuracy is needed if the dipole moments over a larger number of grid interfaces are to be computed. For instance, the convolution method was originally developed for astrophysical problems involving the gravitational collapse of an individual prestellar core. The star(s) and the circumstellar disk that form in the process are located within two-three innermost grids and the rest is filled with a rarified infalling envelope. The correct calculation of the dipole moments over the three innermost nested grids should be sufficient in this case. We also note that when applied to a smooth density distribution, e.g., an oblate ellipsoid, or wide separation binaries we did not find significant differences between the original and modified convolution methods (see Appendix C for details).

Finally, we performed a convergence test on the modified convolution method, in which we varied the resolution of each nested grid from N = 16 to N = 128, but kept the number of nested grid fixed at m = 4. An oblate ellipsoid with the ratio of semi-axes 0.5:1.0:1.0 was used for this purpose. The results are shown in Fig. 17. When the mean numerical errors are considered, the method demonstrates second-order accuracy in both the gravitational potential and acceleration. The maximum error is characterized by second-order convergence only for the gravitational potential, while the maximum error in acceleration reveals first-order convergence. We attribute this decrease in accuracy to boundary effects between the nested grids, where the adopted linear interpolation procedure of the gravitational potential is only first-order accurate. The development of more accurate interpolation schemes across the grid interfaces should be our next task in future works.

Fig. 13

Electric field vectors of a dipole schematically shown by the red and blue circles near the coordinate center. The original convolution method is applied to construct the field. We note that the field is present and regular on the fine Ω² grid but is absent on the coarse Ω¹ grid. This occurs because of the cancellation of the charges when they are projected from the fine to the coarse grid in the original convolution method.

Fig. 14

Decomposition of two nested grids (top) into three individual grids in the modified convolution method: a fine m = 2 grid (Ω², bottom center), a coarse m = l grid with a central cavity ( $Ω_{hole}^{1}$ ${\rm{\Omega }}_{{\rm{hole}}}^1$ , bottom right), and a doubled fine grid m = 2 ( $Ω_{dbl}^{2}$ ${\rm{\Omega }}_{{\rm{dbl}}}^2$ , bottom left). The density in the latter grid is set to zero everywhere except the original physical size of the fine m = 2 grid (shown with a light blue shade), where the density inherits the values from the fine m = 2 grid, ρ = ρ(Ω²).

Fig. 15

Electric field vectors of a dipole. The dipole is located at the center of the finest grid and is shown schematically by the red and blue circles. The modified convolution method is applied to construct the field (see Fig. 13 for comparison with the original method). The top panel presents the case of two nested grids, while the bottom panel corresponds to three nested grids. The black squares identify the grid interfaces. Note that the field is present and continuous across all nested grids Ω¹, Ω² and Ω³.

Fig. 16

Relative errors between the gravitational potentials of a close binary computed by the original and modified convolution methods. The horizontal bar shows the errors in percent. Two mesh setups are used for this test problem: left column – two nested grids and right column – three nested grids. The grid interfaces are shown by the dashed lines for convenience. The binary position is schematically indicated by the red and yellow circles. The arrows show the gravitational field in the modified convolution method. Note that the field lines are continuous across the grid interfaces.

Fig. 17

Convergence test in the modified convolution method. Shown are the mean (black lines) and maximum (red lines) errors in the gravitational potential and acceleration as a function of numerical resolution N. The dotted and dash-dotted lines show the linear and quadratic convergence for convenience.

8 Conclusions

In this paper, we presented an extension of the convolution method for calculating the gravitational potential on regular and equidistant grids, which can be applied to the practical case of nested grids. We compared the performance of the convolution method against the OiCG method using two test problems: an oblate ellipsoid and a binary star, for which the analytic solution of the gravitational potential and acceleration are known. We found that both methods are characterized by comparable computational errors, which do not exceed a few percent for the gravitational acceleration and are notably lower for the gravitational potential.

The convolution method, however, has several clear advantages over the iterative OiCG method. First, the convolution method does not require the precalculation of the gravitational potential at the boundaries of the coarsest m = 1 grid. Second, the convolution method permits decomposition of the nested grids into individual subgrids, followed by efficient computation of the potential on each subgrid in a parallel mode. Third, the convolution method does not require a preset tolerance ϵ, with which the iterations in the OiCG method are controlled. We found that the OiCG method reveals an increase in the errors with increasing m, and the magnitude of this effect depends on the predefined value of ϵ.

A comparison of runtimes of the OiCG and convolution methods demonstrates a clear advantage of the latter method, which can outperform the former by more than a factor of 10 on the CPUs with OpenMP parallelization. When GPUs are utilized, the convolution method can be accelerated by another factor of 10. The GMRes method offered by the Intel MKL library is found to be inferior in speed performance as compared to both methods presented in this work. The convolution method revealed an overall second-order convergence, except for the maximum error at the coarse-fine grid interfaces where the convergence is linear, in accordance with the adopted linear interpolation of the gravitational potential across the grid interface.

The disadvantage of the convolution method lies in the requirement to interpolate the gravitational potential from the coarse grids onto the refined grids. This step can be improved by devising more sophisticated interpolation procedures. When tightly spaced gravitating sources are present at the deepest levels of refinement, for example, close binary/multiple stars surrounded by a circumstellar disk, a modification to the convolution method is required to correctly calculate the dipole moments on the coarser grids. The modification introduces computational overheads, which may be substantial for a large number of nested grids, but acceptable accuracy can already be reached when the method is applied to each pair or triplet of multiple nested grids.

Acknowledgements

We are thankful to the referee for the comments and suggestions that helped to improve the manuscript. This work was supported by the FWF project I4311-N27 (E. I. V. and J. M.) and RFBR project 19-51-14002 (I. K. and V. E.). The simulations were performed on the VSC Vienna Scientific Cluster. We thank Oliver Hahn for pointing to an analytic solution of Eq. (13). The code that uses the convolution theorem for calculating the gravitational potential on CPUs and GPUs can be employed on reasonable request.

Appendix A Gravitational acceleration at the coarse-fine grid interface

The components of gravitational acceleration g across the coarse-fine grid interface can be calculated using three methods illustrated in Figure A.1. We take two nested grids, Ω¹ and Ω², for simplicity and consider two dimensions, with the indices i and k corresponding to the x- and y-coordinates, respectively. The schemes can straightforwardly be extended to three dimensions.

Fig. A.1

Three methods to calculate the gravitational acceleration at the cells adjacent to the coarse-fine grid interface. Top panel. Method A schematically illustrates a scheme to calculate the acceleration at the adjacent fine cells. Middle panel. Method B presents a scheme to calculate the acceleration at the adjacent coarse cells. Bottom panel Method C shows yet another scheme to calculate the acceleration at the adjacent coarse cells, which fulfills the Gauss theorem (see the text for details).

Panel A graphically describes the case of the fine grid cells that are adjacent to the coarse grid (hereafter, method A). The x-component of the gravitational acceleration at the fine-to-coarse grid interface is calculated as $g_{x, (i, k)}^{m = 2} = - \frac{Φ_{i + 1, k}^{m=2} - Φ_{B, (i - 1, k)}^{m=2}}{2 d x_{m=2}},$ $g_{{\rm{x,}}\left( {i,k} \right)}^{m = 2} = - {{\Phi _{i + {\rm{1,}}k}^{{\rm{m = 2}}} - \Phi _{{\rm{B,}}\left( {i - 1,k} \right)}^{{\rm{m = 2}}}} \over {2d{x_{{\rm{m = 2}}}}}},$ (A.1)

where $Φ_{B, (i - 1, k)}^{m = 2}$ ${\rm{\Phi }}_{{\rm{B,}}\left( {i - 1,k} \right)}^{m = 2}$ are the boundary values of the potential on the fine grid found using the interpolation scheme described in Fig. 2 of Sect. 2 and dx_m=2 is the size of the fine grid cell.

Panel B presents the opposite case of the coarse grid cells that are adjacent to the fine grid (hereafter, method B). The gravitational acceleration at the coarse-to-fine grid interface can be calculated as $g_{x, (i, k)}^{m = 1} = - \frac{Φ_{i + 1, k}^{m=1} - Φ_{(i - 1, k)}^{m=1}}{2 d x_{m=1}},$ $g_{{\rm{x,}}\left( {i,k} \right)}^{m = 1} = - {{\Phi _{i + {\rm{1,}}k}^{{\rm{m = 1}}} - \Phi _{\left( {i - 1,k} \right)}^{{\rm{m = 1}}}} \over {2d{x_{{\rm{m = 1}}}}}},$ (A.2)

where $Φ_{(i - 1, k)}^{m = 1}$ ${\rm{\Phi }}_{\left( {i - 1,k} \right)}^{m = 1}$ is found using an arithmetic average of $Φ_{(i, k)}^{m = 2}, Φ_{(i + 1, k)}^{m = 2}, Φ_{(i, k + 1)}^{m = 2}$ ${\rm{\Phi }}_{\left( {i,k} \right)}^{m = 2},{\rm{\Phi }}_{\left( {i + 1,k} \right)}^{m = 2},{\rm{\Phi }}_{\left( {i,k + 1} \right)}^{m = 2}$ , and $Φ_{(i + 1, k + 1)}^{m = 2}$ ${\rm{\Phi }}_{\left( {i + 1,k + 1} \right)}^{m = 2}$ . Here dx_m=1 is the size of the coarse grid cells.

Panel C presents a more subtle and physically motivated method (hereafter, method C) for calculating the gravitational acceleration at the coarse-to-fine grid interface (see also Feng et al. 2018). Here, we first calculate the x-components of gravitational acceleration at the coarse-fine grid interface using the coarse and fine grid values separately $g_{x, (i - 1 / 2, k)}^{m = 1} = - \frac{Φ_{i + 1, k}^{m=1} - Φ_{(i, k)}^{m=1}}{d x_{m=1}},$ $g_{{\rm{x,}}\left( {{{i - 1} \mathord{\left/ {\vphantom {{i - 1} {2,k}}} \right. \kern-\nulldelimiterspace} {2,k}}} \right)}^{m = 1} = - {{\Phi _{i + {\rm{1,}}k}^{{\rm{m = 1}}} - \Phi _{\left( {i,k} \right)}^{{\rm{m = 1}}}} \over {d{x_{{\rm{m = 1}}}}}},$ (A.3) $g_{x, (i - 1 / 2, k)}^{m = 2} = - \frac{Φ_{1, k}^{m=2} - Φ_{B, (i - 1, k)}^{m=2}}{d x_{m=2}},$ $g_{{\rm{x,}}\left( {{{i - 1} \mathord{\left/ {\vphantom {{i - 1} {2,k}}} \right. \kern-\nulldelimiterspace} {2,k}}} \right)}^{m = 2} = - {{\Phi _{{\rm{1,}}k}^{{\rm{m = 2}}} - \Phi _{{\rm{B,}}\left( {i - 1,k} \right)}^{{\rm{m = 2}}}} \over {d{x_{{\rm{m = 2}}}}}},$ (A.4) $g_{x, (i - 1 / 2, k + 1)}^{m = 2} = - \frac{Φ_{i, k + 1}^{m=2} - Φ_{B, (i - 1, k + 1)}^{m=2}}{d x_{m=2}} .$ $g_{{\rm{x,}}\left( {{{i - 1} \mathord{\left/ {\vphantom {{i - 1} {2,k}}} \right. \kern-\nulldelimiterspace} {2,k}} + 1} \right)}^{m = 2} = - {{\Phi _{i{\rm{,}}k + 1}^{{\rm{m = 2}}} - \Phi _{{\rm{B,}}\left( {i - 1,k + 1} \right)}^{{\rm{m = 2}}}} \over {d{x_{{\rm{m = 2}}}}}}.$ (A.5)

We note that these values of gravitational acceleration are defined at the cell interfaces and not at the cell centers. We then require that the gravitational accelerations at the coarse-fine grid interface be equal when using the coarse and fine grids separately $g_{x, (i - 1 / 2, k)}^{m = 1} = \frac{1}{2} (g_{x, (i - 1 / 2, k)}^{m = 2} + g_{x, (i - 1 / 2, k + 1)}^{m = 2}) .$ $g_{{\rm{x,}}\left( {{{i - 1} \mathord{\left/ {\vphantom {{i - 1} {2,k}}} \right. \kern-\nulldelimiterspace} {2,k}}} \right)}^{m = 1} = {1 \over 2}\left( {g_{{\rm{x,}}\left( {{{i - 1} \mathord{\left/ {\vphantom {{i - 1} {2,k}}} \right \kern-\nulldelimiterspace} {2,k}}} \right)}^{m = 2} + g_{{\rm{x,}}\left( {{{i - 1} \mathord{\left/ {\vphantom {{i - 1} {2,k + 1}}} \right. \kern-\nulldelimiterspace} {2,k + 1}}} \right)}^{m = 2}} \right).$ (A.6)

From the last expression it is straightforward to find the value of $Φ_{i + 1, k}^{m = 1}$ ${\rm{\Phi }}_{i + 1,k}^{m = 1}$ and then employ Equation (A.2) to compute the gravitational acceleration $g_{x, (j, k)}^{m = 1}$ $g_{x,\left( {j,k} \right)}^{m = 1}$ at the coarse grid cells that are adjacent to the fine grid. This scheme is not fully self-consistent in the sense that the value of $Φ_{j + 1, k}^{m = 1}$ ${\rm{\Phi }}_{j + 1,k}^{{\rm{m}} = 1}$ depends on the boundary values on the fine grid $Φ_{B, (i - 1, k)}^{m = 2}$ ${\rm{\Phi }}_{{\rm{B,}}\left( {i - 1,k} \right)}^{{\rm{m}} = 2}$ and $Φ_{B, (i - 1, k + 1)}^{m = 2}$ ${\rm{\Phi }}_{{\rm{B,}}\left( {i - 1,k + 1} \right)}^{{\rm{m}} = 2}$ , which in turn are found using the potentials on the coarse grid (see Eqs. 3–6). This inconsistency can be resolved through iterations when the value of $Φ_{j + 1, k}^{m = 1}$ ${\rm{\Phi }}_{j + 1,k}^{{\rm{m}} = 1}$ is substituted back to Eqs. (3)–(6) to find the boundary potentials $Φ_{B, (i - 1, k)}^{m = 2}$ ${\rm{\Phi }}_{{\rm{B,}}\left( {i - 1,k} \right)}^{{\rm{m}} = 2}$ and $Φ_{B, (i - 1, k + 1)}^{m = 2}$ ${\rm{\Phi }}_{{\rm{B,}}\left( {i - 1,k + 1} \right)}^{{\rm{m}} = 2}$ and the calculations involving Eqs. (A.3)–(A.6) are repeated. Our experience shows that convergence is achieved after several iterations.

The difference between methods B and C is illustrated in Figure A.2, which shows the relative error in the gravitational acceleration at the x − z plane obtained for both methods. We consider two nested grids and an ellipsoid with the ratio of semi-axes 0.75:1.5:1.5. We note that for both cases, method A is used to calculate the acceleration in the cells of the fine grid that are adjacent to the coarse grid. We found that the span of errors is similar in both methods. In particular, the maximum errors occur in the vicinity of the ellipsoid surface and also at the coarse-fine grid interface. When method B is considered, the error is maximal at the adjacent fine cells and is much smaller at the adjacent coarse cells. When method C is considered, the error is similar and maximal on both sides of the coarse-fine grid interface. This effect is noticeable if the coarse-fine grid interface lies inside the ellipsoid, as illustrated by the arrows in Figure A.2. The corresponding region with the highest error appears thicker in method C than in method B. To summarize, method C is physically more accurate as it fulfills the Gauss theorem at the grid interfaces, but it comes at the expense of an increased error in the gravitational acceleration at the coarse cells that are adjacent to the fine grid.

Fig. A.2

Relative error in the gravitational acceleration of an oblate ellipsoid. The top and bottom panels correspond to methods B and C for computing the gravitational acceleration at the coarse cells that are adjacent to the coarse-fine grid interface. The arrows illustrate the regions where differences between methods B and C are found. The color bars show errors relative to the analytic solution in per cent.

Appendix B Modified convolution method and reduction of overheads

In this section, we describe how the convolution method can be extended to tackle the dipole problem across multiple nested grids and highlight the associated numerical costs. First, we provide a scheme, in which the dipole moments are accurately computed for each pair of neighboring grids on a nested mesh with m ≥ 3. As was demonstrated in Sect. 7, this allows notably reducing the error in the gravitational potential of a close binary near the coarse-fine grid interface. The corresponding algorithm can be written as $Φ (Ω^{n}) = Φ_{hole}^{m=n} + {\tilde{Φ}}_{dbl}^{m=n+1} + \sum_{k = 1}^{n - 1} {\tilde{Φ}}_{hole}^{m=n}, for n < M_{depth}$ $\Phi \left( {{\Omega ^n}} \right) = \Phi _{{\rm{hole}}}^{{\rm{m = n}}} + \tilde \Phi _{{\rm{dbl}}}^{{\rm{m = n + 1}}} + \sum\limits_{k = 1}^{n - 1} {\tilde \Phi _{{\rm{hole}}}^{{\rm{m = n}}},\,{\rm{for}}\,n lt; {M_{{\rm{depth}}}}}$ (B.1) $Φ (Ω^{n}) = \sum_{k = 1}^{n - 1} {\tilde{Φ}}_{hole}^{m=n} + Φ^{m=n}, for n = M_{depth} .$ $\Phi \left( {{\Omega ^n}} \right) = \sum\limits_{k = 1}^{n - 1} {\tilde \Phi _{{\rm{hole}}}^{{\rm{m = n}}} + {\Phi ^{{\rm{m = n}}}},\,{\rm{for}}\,n = {M_{{\rm{depth}}}}.}$ (B.2)

Here, M_depth is the maximum number of nested grids. Each pair of neighboring grids Ωⁿ :Ωⁿ⁺¹ has now its own auxiliary doubled grid ${\tilde{Φ}}_{dbl}^{m = n+1}$ ${\rm{\tilde \Phi }}_{{\rm{dbl}}}^{{\rm{m}} = {\rm{n + 1}}}$ . As usual, the tilde sign means proper interpolation from a coarser to a finer grid and vice versa. When this modified scheme is used on CPUs, the runtime increases by about a factor of four compared to the original method. When CPUs and GPUs are used together, the runtime increases by a factor of 2.

When the dipole moment are to be computed across three neighboring grids (as in the bottom panel of Fig. 15), the corresponding algorithm reads as $Φ (Ω^{n}) = Φ_{hole}^{m=n} + {\tilde{Φ}}_{dbl,hole}^{m=n+1} + {\tilde{Φ}}_{qdp}^{m=n+2} + \sum_{k = 1}^{n - 1} {\tilde{Φ}}_{hole}^{m=n},$ $\Phi \left( {{\Omega ^n}} \right) = \Phi _{{\rm{hole}}}^{{\rm{m = n}}} + \tilde \Phi _{{\rm{dbl,hole}}}^{{\rm{m = n + 1}}} + \tilde \Phi _{{\rm{qdp}}}^{{\rm{m = n + 2}}} + \sum\limits_{k = 1}^{n - 1} {\tilde \Phi _{{\rm{hole}}}^{{\rm{m = n}}},\,}$ (B.3) $for n < M_{depth} - 1$ ${\rm{for}}\,n lt; {M_{{\rm{depth}}}} - 1$ (B.4) $\begin{array}{l} Φ (Ω^{n}) = Φ_{hole}^{m=n} + {\tilde{Φ}}_{dbl}^{m=n+1} + \sum_{k = 1}^{n - 1} {\tilde{Φ}}_{hole}^{m=n}, for n < M_{depth} \end{array}$ $\matrix{ {\Phi \left( {{\Omega ^n}} \right) = \Phi _{{\rm{hole}}}^{{\rm{m = n}}} +\tilde \Phi _{{\rm{dbl}}}^{{\rm{m = n + 1}}} + \sum\limits_{k = 1}^{n - 1} {\tilde \Phi _{{\rm{hole}}}^{{\rm{m = n}}},\,{\rm{for}}\,n lt; {M_{{\rm{depth}}}}} } \hfill \cr }$ (B.5) $\begin{array}{l} Φ (Ω^{n}) = \sum_{k = 1}^{n - 1} {\tilde{Φ}}_{hole}^{m=n} + Φ^{m=n}, for n < M_{depth} . \end{array}$ $\matrix{ {\Phi \left( {{\Omega ^n}} \right) = \sum\limits_{k = 1}^{n - 1} {\tilde \Phi _{{\rm{hole}}}^{{\rm{m = n}}} + {\Phi ^{{\rm{m = n}}}},\,{\rm{for}}\,n lt; {M_{{\rm{depth}}}}.} } \hfill \cr }$ (B.6)

Figure B.1 presents the corresponding mesh decomposition for the case of three nested grids. The solution procedure is now split into six steps. Three steps shown in the top row of the figure are similar to those of the original scheme (see Fig 5). We calculate the potential Φ^m=3 on the finest Ω³ grid, the potential $Φ_{hole}^{m = 1}$ ${\rm{\Phi }}_{{\rm{hole}}}^{{\rm{m}} = 1}$ on the coarsest $Ω_{hole}^{1}$ ${\rm{\Omega }}_{{\rm{hole}}}^1$ grid with a central hole, and the potential $Φ_{hole}^{m = 2}$ ${\rm{\Phi }}_{{\rm{hole}}}^{{\rm{m}} = 2}$ on the intermediate $Ω_{hole}^{2}$ ${\rm{\Omega }}_{{\rm{hole}}}^2$ grid with a central hole. The other three steps shown in the bottom row of Figure B.1 involve the calculation of the potentials on the auxiliary grids, which are introduced for the correct calculation of the dipole moments. We note that the doubled $Ω_{dbl,hole}^{2}$ ${\rm{\Omega }}_{{\rm{dbl,hole}}}^2$ grid has now a central hole, which is filled with zero densities to avoid counting twice the mass distribution on the finest Ω³ grid. The doubled $Ω_{dbl}^{3}$ ${\rm{\Omega }}_{{\rm{dbl}}}^3$ grid is analogous to $Ω_{dbl}^{2}$ ${\rm{\Omega }}_{{\rm{dbl}}}^2$ grid for the case of two nested meshes. Finally, the $Ω_{qdp}^{3}$ ${\rm{\Omega }}_{{\rm{qdp}}}^3$ grid is introduced, which is an extension of the finest Ω³ grid to the physical size of the coarsest Ω¹ grid. The additional cells are filled with zero densities. This grid allows us to correctly calculate the dipole moments at the coarsest Ω¹ grid. When the dipole moments are calculated across three neighboring grids, the runtime increases by a factor of 4.5 for N = 64 and by a factor of 8.5 for N = 128. An extension of the scheme to a larger number of neighboring grids can be done by analogy to Eqs. (B.1)–(B.6).

Fig. B.1

Decomposition of the m = 3 mesh in the modified convolution method accounting for the dipole moments. Six individual grids are now considered with the three grids in the bottom row representing the auxiliary steps needed to compute the dipole moments for three neighboring grids.

We note that the calculation of $Φ_{qdp}^{m = 3}$ ${\rm{\Phi }}_{{\rm{qdp}}}^{m = 3}$ comes at a substantial computational expense in both the computational time and the memory requirements. A solution to this problem may be in switching to direct summation when calculating the input of the mass distribution on the finest Ω³ grid to the total gravitational potential on the coarsest Ω¹ grid. In this case, the calculation of the potential $Φ_{qdp}^{3}$ ${\rm{\Phi }}_{{\rm{qdp}}}^3$ on the quadrupled grid $Ω_{qdp}^{3}$ ${\rm{\Omega }}_{{\rm{qdp}}}^3$ becomes obsolete. Instead, the input of the finest Ω³ grid to the potential on the coarsest Ω¹ grid is calculated as $\begin{array}{l} Φ_{d.s.}^{3 \to 1} (x_{i}, y_{j}, z_{k}) & = & - G \sum_{i^{'}, j^{'}, k^{'}} M^{m = 3} (x_{i^{'}}, y_{j^{'}}, z,_{k^{'}}) \\ \times & G^{3 \to 1} (x_{i} - x_{i^{'}}, y_{j} - y_{j^{'}}, z_{k} - z_{k^{'}}), \end{array}$ $\matrix{ {\Phi _{{\rm{d}}{\rm{.s}}{\rm{.}}}^{3 \to 1}\left( {{x_i},{y_j},{z_k}} \right)} \hfill & = \hfill & { - G\sum\limits_{i',j',k'} {{M^{m = 3}}\left( {{x_{i'}},{y_{j'}},z{,_{k'}}} \right)} } \hfill \cr {} \hfill & \times \hfill & {{{\cal G}^{3 \to 1}}\left( {{x_i} - {x_{i'}},{y_j} - {y_{j'}},z{_k} - {z_{k'}}} \right),} \hfill \cr }$ (B.7)

where M^m=3 is the mass distribution on the finest Ω³ grid and 𝒢^3→1 is the inverse distance between the cells on the Ω¹ grid with coordinates (xi, y_j, z_k) and the cells on the Ω³ grid with coordinates ( $x_{i}^{^{'}}, y_{j}^{^{'}}, z_{k}^{^{'}}$ $x_i^{^\prime },y_j^{^\prime },z_k^{^\prime }$ ). Our numerical experiments reveal that for a numerical resolution of N ≤ 32³ the direct summation is faster than the calculation of $Φ_{qdp}^{3}$ ${\rm{\Phi }}_{{\rm{qdp}}}^3$ using the convolution theorem.

Appendix C Further comparison of the modified and original convolution methods

Here, we perform an additional comparison between the original and modified convolution methods, the latter considering the dipole moments at every pair of neighboring grids (see Eqs. B.1–B.2). Figure C.1 presents the errors in the gravitational potential relative to analytic solutions for each method and also the relative errors in the gravitational potential between the two methods. Four nested grids with a numerical resolution of N = 128³ for each grid were used in this test problem. The mass distributions similar to those considered in Sect. 6 were used: an oblate ellipsoid and a wide separation binary. In particular, the ratio of semi-major axes of the ellipsoid was increased to 0.75:1.5:1.5 compared to the previous test problem (0.5:1.0:1.0) so the ellipsoid now covers all grid interfaces.

Fig. C.1

Comparison of the modified and original convolution method for an oblate ellipsoid (top row) and wide separation binary (bottom row). The errors in the gravitational potential compared to the analytical solution are shown in the left and middle columns (in log units), while the right column displays the relative error in the gravitational potential between the two methods in percent.

In the modified method, a smoother transition between the grid interfaces can be noted, but otherwise the potentials are similar. The relative error between the two methods for a smooth density distribution (ellipsoid, top row) is very small, < 0.006%, indicating that the modification to the convolution method in this case is not really required. For a wide separation binary (bottom row), the error increased somewhat but still stays quite low, < 0.045%. Therefore, the modified method is recommended for models where the formation of tightly spaced gravitating objects is expected, for example, close binary stars or gravitationally fragmenting protostellar disks.

References

Binney, J., Tremaine, S., & Ostriker, J. 1987, Galactic Dynamics, Princeton series in astrophysics (Princeton: Princeton University Press) [Google Scholar]
Bodenheimer, P., Laughlin, G. P., Rózyczka, M., & Yorke, H. W. 2007, Numerical Methods in Astrophysics: An Introduction (Boca Raton: CRC Press) [Google Scholar]
Chandrasekhar, S. 1969, Ellipsoidal Figures of Equilibrium (New Haven: Yale University Press) [Google Scholar]
Feng, W., Guo, Z., Lowengrub, J. S., & Wise, S. M. 2018, J. Comput. Phys., 352, 463 [NASA ADS] [CrossRef] [Google Scholar]
Gayatri, R., Yang, C., Kurth, T., & Deslippe, J. 2019, in Accelerator Programming Using Directives, eds. S. Chandrasekaran, G. Juckeland, & S. Wienke (Cham: Springer International Publishing), 75 [CrossRef] [Google Scholar]
Guillet, T., & Teyssier, R. 2011, J. Comput. Phys., 230, 4756 [Google Scholar]
Hennebelle, P. 2018, A&A, 611, A24 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Hockney, R. W., & Eastwood, J. W. 1988, Computer Simulation Using Particles (Boca Raton: CRC Press) [Google Scholar]
Jackson, J. D., & Fox, R. F. 1999, Am. J. Phys., 67, 841 [NASA ADS] [CrossRef] [Google Scholar]
Kulikov, I., & Vorobyov, E. 2016, J. Comput. Phys., 317, 318 [NASA ADS] [CrossRef] [Google Scholar]
Larson, R. B. 2003, Rep. Prog. Phys., 66, 1651 [NASA ADS] [CrossRef] [Google Scholar]
Lebreuilly, U., Commerçon, B., & Laibe, G. 2020, A&A, 641, A112 [EDP Sciences] [Google Scholar]
Machida, M. N., Matsumoto, T., Tomisaka, K., & Hanawa, T. 2005, MNRAS, 362, 369 [Google Scholar]
Macmillan, W. D. 1985, Theory of the Potential (New York: Dover Publications Inc.) [Google Scholar]
Masunaga, H., Miyama, S. M., & Inutsuka, S.-I. 1998, ApJ, 495, 346 [NASA ADS] [CrossRef] [Google Scholar]
Matsumoto, T. 2007, PASJ, 59, 905 [NASA ADS] [Google Scholar]
Matsumoto, T., & Hanawa, T. 2003, ApJ, 583, 296 [NASA ADS] [CrossRef] [Google Scholar]
Müller, T. W. A., Kley, W., & Meru, F. 2012, A&A, 541, A123 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Potts, D., & Tasche, M. 2008, J. Comput. Appl. Math., 222, 655 [NASA ADS] [CrossRef] [Google Scholar]
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical Recipes in FORTRAN. The Art of Scientific Computing (Cambridge: Cambridge University Press) [Google Scholar]
Ruffert, M., & Mueller, E. 1990, A&A, 238, 116 [NASA ADS] [Google Scholar]
Stone, J. M., & Norman, M. L. 1992, ApJS, 80, 753 [Google Scholar]
Stone, J. M., Tomida, K., White, C. J., & Felker, K. G. 2020, ApJS, 249, 4 [NASA ADS] [CrossRef] [Google Scholar]
Stücker, J., Hahn, O., Angulo, R. E., & White, S. D. M. 2020, MNRAS, 495, 4943 [Google Scholar]
Tomida, K., Okuzumi, S., & Machida, M. N. 2015, ApJ, 801, 117 [Google Scholar]
Truelove, J. K., Klein, R. I., McKee, C. F., et al. 1998, ApJ, 495, 821 [Google Scholar]
Vorobyov, E. I., Khaibrakhmanov, S., Basu, S., & Audard, M. 2020, A&A, 644, A74 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Wang, H.-H., & Yen, C.-C. 2020, ApJS, 247, 2 [NASA ADS] [CrossRef] [Google Scholar]

¹

https://fftw.org

²

https://developer.nvidia.com/cuda-fortran

All Tables

Table 1

Errors in gravitational potential: ellipsoid (left) and binary star (right).

In the text

Table 2

Errors in gravitational acceleration: ellipsoid (left) and binary (right).

In the text

Table 3

Errors in gravitational potential: ellipsoid (left) and binary star (right).

In the text

Table 4

Errors in gravitational acceleration: ellipsoid (left) and binary (right).

In the text

Table 5

Different device-host configurations.

In the text

All Figures

	Fig. 1 Example of a nested mesh with three levels of refinement: Ω¹, Ω², and Ω³. The gray-shaded cells represent boundary layers for the coarsest grid; the gravitational potential at these boundary layers has to be found a priori.
In the text

	Fig. 2 Examples of finding the boundary potentials for the m = 2 nested grid. The bottom-right example presents a simple extrapolation of the nearest potential at the m = 1 grid. The top-left example shows a more sophisticated linear interpolation scheme, which involves two neighboring potentials at the m = 1 level.
In the text

Fig. 3

Gravitational potential and acceleration on nested grids in the OiCG method. Left pair of columns: relative error in the calculation of the gravitational potential (first column) and gravitational acceleration (second column) for an oblate ellipsoid with the semi-major axis ratio 0.5:1.0:1.0. The horizontal bars present the relative errors in percent (log units). The top row illustrates the errors obtained with no mesh refinement (m = 1). The middle and bottom rows show the errors for the case of four nested grids (m = 4). In particular, the middle row corresponds to a linear interpolation of the boundary potentials, whereas the bottom row presents the errors when a simple extrapolation is used. Right pair of columns: relative error in the gravitational potential (first column) and gravitational acceleration (second column) for a binary star, the spatial location of which is outlined by two circles. The left component is twice as massive as the right one.

In the text

	Fig. 4 Schematic illustration of the mesh doubling and rearrangement in the convolution method. The original active region 1 is doubled in each coordinate direction to form three ghost regions (2–4) that are filled with zero masses (M = 0). The doubled mesh is rearranged as shown by the arrows to make the inverse distance G and mass M periodic functions on the new domain.
In the text

	Fig. 5 Decomposition of two nested grids (top) into three individual grids with cells of equal size to which the classic FFT method can be applied: a coarse grid (Ω¹, bottom left), refined grid (Ω², bottom center), and coarse grid with a cavity ( $Ω_{hole}^{1}$ ${\rm{\Omega }}_{{\rm{hole}}}^1$ , bottom right).
In the text

	Fig. 6 Interpolation of the gravitational potential $Φ_{hole}^{m = 1}$ ${\rm{\Phi }}_{{\rm{hole}}}^{m = 1}$ , obtained on the coarse m = 1 grid with a central cavity, on the refined m = 2 grid highlighted with hatches. The resulting potential ${\tilde{Φ}}_{hole}^{m = 1}$ ${\rm{\tilde \Phi }}_{{\rm{hole}}}^{m = 1}$ is shown with the arrow.
In the text

	Fig. 7 Construction of the final solution on the original nested mesh (Ω¹ : Ω²).
In the text

	Fig. 8 Comparison of the relative errors in the gravitational potential and acceleration for an oblate ellipsoid (left pair of columns) and binary star (right pair of columns). The top row shows the results for the convolution method, while the bottom row presents the relative errors for the OiCG method. Four nested grids and 128 grid zones in each coordinate direction are used.
In the text

	Fig. 9 Relative errors in the gravitational potential (top row) and gravitational acceleration (bottom row) as a function of radial distance from the coordinate center in the vertical x − z plane. The case of the oblate ellipsoid is shown. Left and right columns present the results for the OiCG and convolution method, respectively.
In the text

Fig. 10

Relative errors for the gravitational potential (top panel) and gravitational acceleration (bottom panel) for the oblate ellipsoid as a function of the number of nested grids. The black and red lines present the maximum and mean errors, respectively. The dashed lines correspond to the convolution method, while the solid and dash-dotted lines show the results for the OiCG method with the tolerance ϵ set equal to 10⁻⁶ and 10⁻⁸, respectively.

In the text

	Fig. 11 Runtime vs. the number of cores in computing the gravitational potential of an oblate ellipsoid on m = 8 nested grids. Red and black lines correspond to the convolution method and OiCG method, respectively.
In the text

	Fig. 12 Runtime vs. the number of CPU cores in computing the gravitational potential of an oblate ellipsoid on m = 8 nested grids. The red lines show the results of CPU plus GPU computations, while the black lines present the CPU model only.
In the text

Fig. 13

Electric field vectors of a dipole schematically shown by the red and blue circles near the coordinate center. The original convolution method is applied to construct the field. We note that the field is present and regular on the fine Ω² grid but is absent on the coarse Ω¹ grid. This occurs because of the cancellation of the charges when they are projected from the fine to the coarse grid in the original convolution method.

In the text

Fig. 14

Decomposition of two nested grids (top) into three individual grids in the modified convolution method: a fine m = 2 grid (Ω², bottom center), a coarse m = l grid with a central cavity ( $Ω_{hole}^{1}$ ${\rm{\Omega }}_{{\rm{hole}}}^1$ , bottom right), and a doubled fine grid m = 2 ( $Ω_{dbl}^{2}$ ${\rm{\Omega }}_{{\rm{dbl}}}^2$ , bottom left). The density in the latter grid is set to zero everywhere except the original physical size of the fine m = 2 grid (shown with a light blue shade), where the density inherits the values from the fine m = 2 grid, ρ = ρ(Ω²).

In the text

Fig. 15

Electric field vectors of a dipole. The dipole is located at the center of the finest grid and is shown schematically by the red and blue circles. The modified convolution method is applied to construct the field (see Fig. 13 for comparison with the original method). The top panel presents the case of two nested grids, while the bottom panel corresponds to three nested grids. The black squares identify the grid interfaces. Note that the field is present and continuous across all nested grids Ω¹, Ω² and Ω³.

In the text

Fig. 16

Relative errors between the gravitational potentials of a close binary computed by the original and modified convolution methods. The horizontal bar shows the errors in percent. Two mesh setups are used for this test problem: left column – two nested grids and right column – three nested grids. The grid interfaces are shown by the dashed lines for convenience. The binary position is schematically indicated by the red and yellow circles. The arrows show the gravitational field in the modified convolution method. Note that the field lines are continuous across the grid interfaces.

In the text

	Fig. 17 Convergence test in the modified convolution method. Shown are the mean (black lines) and maximum (red lines) errors in the gravitational potential and acceleration as a function of numerical resolution N. The dotted and dash-dotted lines show the linear and quadratic convergence for convenience.
In the text

Fig. A.1

Three methods to calculate the gravitational acceleration at the cells adjacent to the coarse-fine grid interface. Top panel. Method A schematically illustrates a scheme to calculate the acceleration at the adjacent fine cells. Middle panel. Method B presents a scheme to calculate the acceleration at the adjacent coarse cells. Bottom panel Method C shows yet another scheme to calculate the acceleration at the adjacent coarse cells, which fulfills the Gauss theorem (see the text for details).

In the text

Fig. A.2

Relative error in the gravitational acceleration of an oblate ellipsoid. The top and bottom panels correspond to methods B and C for computing the gravitational acceleration at the coarse cells that are adjacent to the coarse-fine grid interface. The arrows illustrate the regions where differences between methods B and C are found. The color bars show errors relative to the analytic solution in per cent.

In the text

	Fig. B.1 Decomposition of the m = 3 mesh in the modified convolution method accounting for the dipole moments. Six individual grids are now considered with the three grids in the bottom row representing the auxiliary steps needed to compute the dipole moments for three neighboring grids.
In the text

	Fig. C.1 Comparison of the modified and original convolution method for an oblate ellipsoid (top row) and wide separation binary (bottom row). The errors in the gravitational potential compared to the analytical solution are shown in the left and middle columns (in log units), while the right column displays the relative error in the gravitational potential between the two methods in percent.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Binney, J., Tremaine, S., & Ostriker, J. 1987, Galactic Dynamics, Princeton series in astrophysics (Princeton: Princeton University Press) [Google Scholar]

[2] Bodenheimer, P., Laughlin, G. P., Rózyczka, M., & Yorke, H. W. 2007, Numerical Methods in Astrophysics: An Introduction (Boca Raton: CRC Press) [Google Scholar]

[3] Chandrasekhar, S. 1969, Ellipsoidal Figures of Equilibrium (New Haven: Yale University Press) [Google Scholar]

[4] Feng, W., Guo, Z., Lowengrub, J. S., & Wise, S. M. 2018, J. Comput. Phys., 352, 463 [NASA ADS] [CrossRef] [Google Scholar]

[5] Gayatri, R., Yang, C., Kurth, T., & Deslippe, J. 2019, in Accelerator Programming Using Directives, eds. S. Chandrasekaran, G. Juckeland, & S. Wienke (Cham: Springer International Publishing), 75 [CrossRef] [Google Scholar]

[6] Guillet, T., & Teyssier, R. 2011, J. Comput. Phys., 230, 4756 [Google Scholar]

[7] Hennebelle, P. 2018, A&A, 611, A24 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[8] Hockney, R. W., & Eastwood, J. W. 1988, Computer Simulation Using Particles (Boca Raton: CRC Press) [Google Scholar]

[9] Jackson, J. D., & Fox, R. F. 1999, Am. J. Phys., 67, 841 [NASA ADS] [CrossRef] [Google Scholar]

[10] Kulikov, I., & Vorobyov, E. 2016, J. Comput. Phys., 317, 318 [NASA ADS] [CrossRef] [Google Scholar]

[11] Larson, R. B. 2003, Rep. Prog. Phys., 66, 1651 [NASA ADS] [CrossRef] [Google Scholar]

[12] Lebreuilly, U., Commerçon, B., & Laibe, G. 2020, A&A, 641, A112 [EDP Sciences] [Google Scholar]

[13] Machida, M. N., Matsumoto, T., Tomisaka, K., & Hanawa, T. 2005, MNRAS, 362, 369 [Google Scholar]

[14] Macmillan, W. D. 1985, Theory of the Potential (New York: Dover Publications Inc.) [Google Scholar]

[15] Masunaga, H., Miyama, S. M., & Inutsuka, S.-I. 1998, ApJ, 495, 346 [NASA ADS] [CrossRef] [Google Scholar]

[16] Matsumoto, T. 2007, PASJ, 59, 905 [NASA ADS] [Google Scholar]

[17] Matsumoto, T., & Hanawa, T. 2003, ApJ, 583, 296 [NASA ADS] [CrossRef] [Google Scholar]

[18] Müller, T. W. A., Kley, W., & Meru, F. 2012, A&A, 541, A123 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[19] Potts, D., & Tasche, M. 2008, J. Comput. Appl. Math., 222, 655 [NASA ADS] [CrossRef] [Google Scholar]

[20] Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical Recipes in FORTRAN. The Art of Scientific Computing (Cambridge: Cambridge University Press) [Google Scholar]

[21] Ruffert, M., & Mueller, E. 1990, A&A, 238, 116 [NASA ADS] [Google Scholar]

[22] Stone, J. M., & Norman, M. L. 1992, ApJS, 80, 753 [Google Scholar]

[23] Stone, J. M., Tomida, K., White, C. J., & Felker, K. G. 2020, ApJS, 249, 4 [NASA ADS] [CrossRef] [Google Scholar]

[24] Stücker, J., Hahn, O., Angulo, R. E., & White, S. D. M. 2020, MNRAS, 495, 4943 [Google Scholar]

[25] Tomida, K., Okuzumi, S., & Machida, M. N. 2015, ApJ, 801, 117 [Google Scholar]

[26] Truelove, J. K., Klein, R. I., McKee, C. F., et al. 1998, ApJ, 495, 821 [Google Scholar]

[27] Vorobyov, E. I., Khaibrakhmanov, S., Basu, S., & Audard, M. 2020, A&A, 644, A74 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[28] Wang, H.-H., & Yen, C.-C. 2020, ApJS, 247, 2 [NASA ADS] [CrossRef] [Google Scholar]

Computing the gravitational potential on nested meshes using the convolution method★

1 Introduction

2 Outside-in conjugate gradient solution procedure

3 Integral form of the gravitational potential and the convolution method

4 Modification of the convolution method for nested grids

5 Comparison of the models

6 Performance comparison

7 The dipole problem and the modified convolution method

8 Conclusions

Acknowledgements

Appendix A Gravitational acceleration at the coarse-fine grid interface

Appendix B Modified convolution method and reduction of overheads

Appendix C Further comparison of the modified and original convolution methods

References

All Tables

All Figures

Computing the gravitational potential on nested meshes using the convolution method^★