Performance of high-order Godunov-type methods in simulations of astrophysical low Mach number flows

High-order Godunov methods for gas dynamics have become a standard tool for simulating different classes of astrophysical flows. Their accuracy is mostly determined by the spatial interpolant used to reconstruct the pair of Riemann states at cell interfaces and by the Riemann solver that computes the interface fluxes. In most Godunov-type methods, these two steps can be treated independently, so that many different schemes can in principle be built from the same numerical framework. Because astrophysical simulations often test out the limits of what is feasible with the computational resources available, it is essential to find the scheme that produces the numerical solution with the desired accuracy at the lowest computational cost. However, establishing the best combination of numerical options in a Godunov-type method to be used for simulating a complex hydrodynamic problem is a nontrivial task. In fact, formally more accurate schemes do not always outperform simpler and more diffusive methods, especially if sharp gradients are present in the flow. For this work, we used our fully compressible S EVEN -L EAGUE H YDRO ( SLH ) code to test the accuracy of six reconstruction methods and three approximate Riemann solvers on two-and three-dimensional (2D and 3D) problems involving subsonic flows only. We considered Mach numbers in the range from 10 − 3 to 10 − 1 , which are characteristic of many stellar and geophysical flows. In particular, we considered a well-posed, 2D, Kelvin–Helmholtz instability problem and a 3D turbulent convection zone that excites internal gravity waves in an overlying stable layer. Although the different combinations of numerical methods converge to the same solution with increasing grid resolution for most of the quantities analyzed here, we find that (i) there is a spread of almost four orders of magnitude in computational cost per fixed accuracy between the methods tested in this study, with the most performant method being a combination of a low-dissipation Riemann solver and a sextic reconstruction scheme; (ii) the low-dissipation solver always outperforms conventional Riemann solvers on a fixed grid when the reconstruction scheme is kept the same; (iii) in simulations of turbulent flows, increasing the order of spatial reconstruction reduces the characteristic dissipation length scale achieved on a given grid even if the overall scheme is only second order accurate; (iv) reconstruction methods based on slope-limiting techniques tend to generate artificial, high-frequency acoustic waves during the evolution of the flow; and (v) unlimited reconstruction methods introduce oscillations in the thermal stratification near the convective boundary, where the entropy gradient is steep.


Introduction
High-resolution schemes for gas dynamics (see, e.g., van Leer 1979;Colella & Woodward 1984;Harten et al. 1987;Colella 1990;Liu et al. 1994;Jiang & Shu 1996;Colella & Sekora 2008;Toro 2009;Balsara 2017) are routinely used for modeling a broad variety of astrophysical flow phenomena.Their popularity derives from their conservation properties and robustness, which allow them to accurately capture both smooth and discontinuous solutions on the same computational grid without sacrificing numerical stability.These schemes are based on higher-order extensions of the original first-order accurate method of Godunov (1959) and their time-integration algorithm is typically carried out in three steps.
First, a pair of Riemann states is reconstructed at each grid cell interface by applying high-order monotonic interpolants to a set of cell-averaged hydrodynamic quantities.Second, the resulting Riemann problems are solved (either exactly or approximately) 17 to obtain fluxes across every cell boundary.Finally, the cell sur-18 face integral of the fluxes is evaluated, allowing the cell-volume-19 averaged state quantities to be advanced in time 1 .20   In most high-order Godunov schemes, the solution strategy 21 of the Riemann problem is independent of the spatial interpolant 22 used for reconstructing the Riemann states.Therefore, many dif-23 ferent schemes can be built from the same numerical framework.24 The choices made in the construction of a particular scheme, 25 however, do have a strong effect on its accuracy, that is the dif-26 A&A proofs: manuscript no.output ference between the numerical (U) and the true (u) solution, computed in some norm • (see, e.g., LeVeque 2002).Here, ∆x is the width of the grid cell and ∆t is the time step.Although the formal order of the spatial and temporal accuracy of a Godunovlike scheme, "m" and "n" in Eq. ( 1), can be derived for smooth flows, they do not give any information about the magnitude of the numerical errors generated on a given grid, which is problemdependent.The convergence rates can also be significantly lower than the formal order of accuracy of the scheme for problems that admit nonsmooth solutions.Consequently, formally higherthan-second-order interpolants do not always outperform simpler linear spatial reconstruction schemes when large gradients or discontinuities are present in the flow (Greenough & Rider 2003).Moreover, if the flow is stochastic or chaotic, as in the case of turbulence, convergence may not be achieved in the sense of Eq. ( 1), but rather the quality of the numerical results can only be judged in terms of the global or ensemble-averaged quantities that characterize the flow and its evolution.
Given these considerations, it is impossible to generalize the convergence properties of a certain combination of numerical methods in a Godunov-type scheme, so they have to be explored by running numerical tests.For such tests to be significant, they have to be challenging enough and close to the actual application case.The performance of the numerical scheme is another crucial aspect to be considered alongside its accuracy, especially in astrophysical simulations, which often test out the limits of what is feasible nowadays with available computational resources.Therefore, the question arises of what combination of different ingredients in a Godunov-type scheme should be used to produce the desired solution at a minimal computational cost.
Several comparison studies have been presented in the literature with the aim of shedding light on the behavior of different high-resolution schemes in simulations of complex hydrodynamic phenomena, such as forced turbulence (Klingenberg et al. 2007;Kritsuk et al. 2011;San & Kara 2015;Radice et al. 2015;Seo & Ryu 2023), convection (Müller 2020), jet evolution (Beckwith & Stone 2011;Musoke et al. 2020), magnetorotational instabilities in accretion disks (Flock et al. 2010), Richtmyer-Meshkov instabilities (Latini et al. 2007), and shear instabilities (McNally et al. 2012;Lecoanet et al. 2017).These numerical experiments focused on supersonic or mildly subsonic flow regimes, for which Godunov-type methods are highly optimized (see, e.g., LeVeque 2002;Toro 2009).Nonetheless, highresolution schemes have also been proven to be a powerful tool for modeling regimes of low Mach numbers (M |V|/c 0.1, where V is the fluid velocity and c is the sound speed), especially in simulations of terrestrial (see, e.g., Day & Bell 2000;Klein 2009;Dumbser et al. 2009;Motheau et al. 2018) and stellar (see, e.g., Meakin & Arnett 2007;Muthsam et al. 2010;Woodward et al. 2014;Goffrey et al. 2017;Müller 2020;Horst et al. 2021;Canivete Cuissa & Teyssier 2022) flows.To our knowledge, other than idealized tests, no extensive work along the lines of the aforementioned comparison studies has been presented for low-Mach-number flows yet.Only a few studies evaluated the impact of the order of the spatial and temporal discretization in the numerical scheme on the properties of highly subsonic turbulent flows, but they kept the Riemann solver fixed (Wongwathanarat et al. 2016;Teissier & Müller 2023).
In this work, we used our fully compressible Seven-League Hydro (SLH) code to test 18 different combinations of spatial reconstruction schemes and Riemann solvers on two test problems in which flows are highly subsonic (10 −3 M 10 −1 ).In partic-88 ular, we considered a two-dimensional (2D) Kelvin-Helmholtz 89 instability with smooth initial conditions and a 3D, turbulent 90 convection zone that entrains material from an upper, stably 91 stratified layer, where internal waves are free to propagate.The 92 initial conditions of the latter test were adopted from the work of 93 Andrassy et al. (2022).Here, we opted to reduce the strength of 94 the heat source driving the convection in order to achieve lower 95 convective speeds than those obtained by Andrassy et al. (2022).96 Also, contrary to that work, we provided performance measure-97 ments for all the methods tested in our study.In both tests, we 98 conducted a resolution study to analyze the convergence of the 99 numerical results obtained by each method. 100 The paper is structured as follows: in Sect.2, we provide 101 a detailed description of the equations solved and the numeri-102 cal methods included in this study.In Sect.3, we measure the 103 convergence properties of different Godunov-type schemes for 104 the 2D, Kelvin-Helmholtz instability test problem (see Sect. 3.1) 105 and for the 3D setup involving turbulent convection, convective 106 boundary mixing, and wave excitation (see Sect. 3.2).In Sect.4, 107 we use the kinetic energy spectrum of the convective flows sim-108 ulated in the latter test, which is close to a real astrophysical 109 application, to provide measurements of the computational cost 110 per fixed accuracy for each method.Finally, in Sect.5, we sum-111 marize the main results and we give some guidance on which 112 methods to use for specific applications. where is the set of conserved quantities, |Ω| is the volume of a fluid 119 element Ω enclosed by a surface ∂Ω, n is the outward normal 120 vector to the surface, and T = [F|G|H] is a tensor defined by the 121 flux vectors Here, ρ denotes the mass density, V = (u, v, w) the velocity field, 123 e tot = e int + 1 2 |V| 2 the total energy per unit mass, e int the specific 124 internal energy, and X the mass fraction of a passive scalar used 125 as a tracer advected with the fluid.The system represented by 126 Eq. ( 2) is closed by an equation of state (EoS), which gives the 127 gas pressure as a function of the density and internal energy, In this work, we only considered a perfect gas with a given adiabatic index γ, for which p(ρ, e int ) = (γ − 1)ρe int . (6) For modeling the test problem introduced in Sect.3.2, which involves the presence of a gravitational field, g = (g x , g y , g z ), the corresponding source term, must be included in the right-hand-side term of Eq. ( 2).Because we assumed g to be time-independent, we opted to solve an equivalent form of Eq. ( 2), in which the gravitational potential φ was directly added to e tot , eliminating the ρg • V source term from the energy equation, Numerical schemes that solve this form of Eq. ( 2) are capable of conserving the total energy of the system over time exactly.Such a property is particularly important in low-Mach-number hydrodynamics, where even small energy conservation errors can become comparable to the kinetic energy content of the flows (Müller 2020;Edelmann et al. 2021).

Seven-League Hydro code
In our study, Eq. ( 2) was solved numerically using the Seven-League Hydro code (SLH, Miczek 2013; Edelmann 2014), which was originally developed to model the broad variety of hydrodynamic processes that characterize the deep interiors of stars, such as shear instabilities (Edelmann et al. 2017), excitation of internal waves (Horst et al. 2020), convective boundary mixing (Horst et al. 2021;Andrassy et al. 2022Andrassy et al. , 2023)), and turbulent dynamos (Leidi et al. 2022(Leidi et al. , 2023)).SLH makes use of the finite-volume discretization on an arbitrarily curvilinear, but log- and it has been proven to scale up to several hundred thousand processes (Edelmann & Röpke 2016).
SLH allows the user to choose among many different numerical options at compile time, which makes this code perfectly suited to run the comparison study introduced in Sect. 1.In particular, in addition to the well-known Rusanov (Rusanov 1962), Roe (Roe 1981), and  In SLH, the cell-volume-averaged source term and the cell-181 surface-averaged fluxes are approximated using the midpoint 182 method, making the code at best second-order accurate in space.183 On a 3D, evenly spaced Cartesian grid, the final expressions for 184 these integrals read 1 1 1 where S i, j,k is the point value of the source term S at the center of 186 the cell represented by the set of indices (i, j, k) and vector quan-187 tities such as F i+1/2, j,k refer to the face-centered value of the flux 188 at the boundary between two adjacent cells, in this case (i, j, k) 189 and (i + 1, j, k).The volume of the cell and the surface of a cell 190 face are ∆V = ∆x 3 and ∆A = ∆x2 , respectively.The discretized 191 source term and fluxes in Eqs. ( 9)-( 12 The time update on the cell-volume-averaged conserved vari-196 ables, U i, j,k , is then carried out in a dimensionally unsplit fashion 197 using explicit or implicit time stepping.Here, we only consid-198 ered a limited set of all of the numerical methods available in 199 SLH to avoid constructing a too large parameter space.In the 200 following sections, we provide a detailed description of the al-201 gorithms that we used for running the tests presented in Sect.3.1 202 and 3.2. 203

204
We used six reconstruction methods as summarized in Table 1.205 We did not attempt to be exhaustive in the choice of our methods, 206 which is why, for example, essentially nonoscillatory (ENO) and 207 weighted ENO (WENO) schemes (see, e.g., Liu et al. 1994;208 Jiang & Shu 1996;Shu 2009) were left out from our compar-209 ison study 2 .We also excluded multidimensional reconstruction 210 methods.However, the methods we did include cover a wide range in complexity and formal order of accuracy and several of them are often used in stellar hydrodynamics.
The simplest are unlimited linear (LIN) and parabolic (PAR) methods, which have proved to be well-behaved in implicit simulations of slow flows using our SLH code (e.g., Horst et al. 2020Horst et al. , 2021;;Andrassy et al. 2023).Their main disadvantagethe generation of artificial oscillations around steep gradients -can be eliminated using slope limiters.Out of a wide spectrum of limiters available, we have decided to include the popular van Leer limiter (van Leer 1974) in combination with linear reconstruction (LIN+VL).This limiter has the total-variationdiminishing (TVD) property and eliminates the oscillations completely. 3As examples of higher-order methods with limiters, we included two versions of the widely used piecewise-parabolic method (PPM): Colella & Woodward (1984, CW84 hereinafter) and Colella & Sekora (2008, CS08 hereinafter), for which we introduce the acronyms PPM84 and PPM08, respectively.We also constructed a hybrid method, which we named piecewise sextic hybrid (PSH).It combines unlimited sextic reconstruction for dynamic variables (ρ, V, p) with the PPM08 method for passive scalars.We provide lower-order alternatives to the PSH method in Appendix A, although we did not test them in this study.
In the remainder of this section, we provide detailed descriptions of all the six reconstruction methods in unified notation.We do this to (i) maximize the reproducibility of our work, (ii) simplify the methods' more general original forms (e.g., for nonuniform grids), (iii) make clear what choices we made if several options were available, and (iv) to point out a few typographic mistakes in the original works.
We refer to the variable being reconstructed as a in Sections 2.3.1-2.3.6.Although reconstruction can be performed in physical coordinates, we chose to do it in logical coordinates defined by the cell index i, which leads to much simpler expressions.This choice did not influence our results in any way since we used uniform Cartesian grids.In some cases, we introduce the continuous linear logical coordinate ζ such that the coordinate of the center of cell i is ζ i = i and the left and right cell interfaces are located at spectively.The reconstruction is usually discontinuous at the interfaces.We use the notation a i+1/2,L and a i+1/2,R to denote the reconstructed states on the left and right side of the interface at ζ i+1/2 , respectively.Reconstruction is performed along each spatial axis separately, that is to say the reconstruction procedure is always one-dimensional.To highlight the difference between cell averages and point values, we use the notation a i for the average value of a within cell i.
In this work, we always reconstructed the set of cell-volumeaveraged primitive variables q, where q = (ρ, V, p, X), because it helps reducing oscillations near discontinuities as compared to reconstructing cell-volume-averaged conserved quantities U.
In SLH, transformations between q and U are performed using 2 nd -order approximations, where ξ is a nonlinear, invertible transformation, and U i is the i-th component of U. The unlimited linear reconstruction method is based on a linear 267 approximation to the underlying function a(ζ).Its slope δ i (in 268 cell-index coordinates) is estimated using the central difference 269 (the Fromm method) using which we obtain the reconstructed states The LIN method is exact wherever a( The van-Leer-limited linear reconstruction method is partic-276 ularly easy to describe in terms of the 2 nd -order-accurate, 277 interface-centered slopes The final slope δ i,lim is then obtained by applying the limiter of 279 van Leer (1974), which gives the reconstructed states The limiter makes the slope less steep where a(ζ) is strongly 282 curved and flat at local extrema.With a smooth and monotonic 283 function a(ζ), the effect of the limiter weakens upon grid refine-284 ment and the left and right states converge to those provided 285 by the LIN method.This makes the LIN+VL method formally 286 2 nd -order accurate away from any extrema.Two ghost cells are 287 required at domain boundaries.Notes.ID: identifier; Theoretical order: formal order of accuracy for smooth and monotonic solutions in 1D; Practical order: asymptotic order obtained in the 1D experiments reported in Appendix B (smooth but nonmonotonic solutions).In the case of PPM84, the higher order of accuracy corresponds to linear advection and the lower one to the propagation of a sound wave using the LHLLC flux function, see Appendix B for details.

The PAR method 289
The unlimited parabolic method assumes that a(ζ) can within 290 cell i be described using the parabola The three coefficients c n are uniquely determined by the require-292 ment that the averages of a(ζ) in cells i − 1, i, and i + 1 equal 293 a i−1 , a i , and a i+1 , respectively.The reconstructed states are then 294 obtained by evaluating Eq. ( 24) at ζ i−1/2 and ζ i+1/2 , respectively.

295
The resulting expressions are The PAR method is exact wherever a(ζ) is locally parabolic, This expression is the equivalent of Eq. 1.6 of CW84 in the spe-304 cial case of a uniform grid.CW84 then replace δ i by the limited which is the monotonized central limiter of van Leer (1977).In 307 our implementation, we did not set The interpolated value a i+1/2 is initially assigned to both a i+1/2,L and a i+1/2,R , meaning there is no discontinuity at the interface.However, CW84 approximate the distribution of variable a in cell i by the parabola uniquely defined by a i−1/2,R , a i+1/2,L , and the cell average a i .This parabola may in some cases take on values outside of the interval defined by a i−1/2,R and a i+1/2,L , that is overshoots may appear within the cell.To prevent this, a second limiting step is introduced.We express it in terms of the differences The limiter is then defined by the variable replacements The reconstructed states a i−1/2,R and a i+1/2,L are recovered using Eqs.( 29) and (30).Equation ( 31) is equivalent to Eq. 1.10 of CW84.This second limiter introduces discontinuities at cell interfaces in regions where the gradient of a(ζ) changes rapidly and it flattens the assumed parabola at local extrema.
Unlike CW84, we did not use the parabolic model of a(ζ) inside the cell in any way.We only needed the states at the two sides of each interface to construct Riemann problems and time integration was done using a Runge-Kutta scheme (see Sect. 2.5 for details).However, the second limiter (Eq.( 31)), which is based on the parabolic model, was still needed to remove oscillations and to introduce dissipation where necessary.
Although the interpolation formula that PPM84 starts with is 4 th -order accurate, the slope flattening introduced at all extrema reduces the practically attainable order of accuracy substantially for nonmonotonic solutions even if they are smooth.Our 1D experiment in Sect.B.1 gives the order of 2.3 whereas Colella & Sekora (2008) reach the order of 2.6 in a similar advection experiment with a Gaussian-shaped profile and PPM84 reconstruction.The PPM84 method requires three ghost cells at domain boundaries.

The PPM08 method
The piecewise parabolic method of CS08 is based on ideas similar to those of CW84 in constructing the PPM84 scheme and PPM08 also contains two limiters.However, the limiters are flattening them.

351
In PPM08, the first estimate of a i+1/2 is obtained using the If a i+1/2 does not satisfy the condition (c.f.Eq. 13 of CS08) a limiter is applied.It is based on three 2 nd -order-accurate second In Eq. ( 35), the more common finite difference formula with a 357 prefactor 4 would arise if the involved quantities were of the 358 same kind, meaning all three point values, or all three averages.

359
These derivatives are combined to obtain a limited derivative if all three derivatives have the same sign and 362 D 2 a i+1/2,lim = 0 (39) otherwise.We used C = 1.25 in Eq. ( 38).The limited derivative 363 is then used to modify the value of a i+1/2 , cell i is close to a local extremum, which should be preserved if are combined to judge the solution's smoothness (CS08 have one wrong index in their equivalent of our Eq.( 43), c.f. their Eq.21).
We then set if all four second derivatives have the same sign and otherwise.Finally, the reconstructed states are updated, If D 2 a * i < ε = 10 −12 we do not modify the reconstructed states in this step.Equations ( 48) and ( 49) are equivalent to Eq. 23 of CS08.
If the condition in Eq. ( 41) is not satisfied, meaning cell i is not in the vicinity of a local extremum, we use Eqs.( 29)-( 31) instead of Eqs. ( 48) and ( 49) to limit the reconstructed states.CS08 propose to use a slightly less restrictive limiter away from extrema (their Eq. 26), but that limiter produces oscillations with our time-discretization scheme and we did not use it.
Just as we did in the case of PPM84, we only used the reconstructed states and not the assumed parabolic model of a(ζ) within the cell, see Sect.2.3.4 for details.The PPM08 method is 6 th -order accurate for smooth functions a(ζ) even if they are not monotonic.Our 1D experiment in Sect.B.1 confirms this.The PPM08 method requires four ghost cells at domain boundaries.

The PSH method
Whereas all of the previous reconstruction methods are applied to all variables in the same way, the PSH method is hybrid: it is an unlimited piecewise-sextic method for all dynamic variables combined with PPM08 for passive scalars.This allows us to eliminate certain issues that occur with methods containing limiters when applied to slow flows (see Sect. 3.1 for details) while essentially eliminating oscillations in the passive scalars, which could represent mass fractions.The piecewise sextic reconstruction assumes that within cell i a(ζ) can be described by the sextic polynomial The seven coefficients c n are uniquely determined by the requirement that the averages of a(ζ) in cells i − 3 + n equal a i−3+n for n = 0, 1, . . ., 6.The reconstructed states are then obtained by evaluating Eq. ( 50) at ζ i−1/2 and ζ i+1/2 , respectively.The result-ing expressions are The method is exact wherever a(ζ) is locally a sextic polynomial, 7 th -order accurate for general but smooth functions a(ζ), and it requires four ghost cells at domain boundaries.

Approximate Riemann solvers
The reconstructed pair of primitive state quantities, q i+1/2,L,R , de- for consistency, we refer to the low-dissipation HLLC solver as "LHLLC" throughout the text5 .In the rest of this section, we summarize the main aspects of each of these solvers and provide the implementation details whenever several choices can be made for specifying the value of a certain quantity that is needed to evaluate the numerical flux.

RUSANOV
The RUSANOV flux is computed by adding an upwind, numerical diffusive term proportional to the maximum wave speed at the cell interface, S max , to every component of the central flux.
The final expression for the numerical flux reads6 where U L,R are the left and right sets of conserved quantities, respectively.In SLH, S max is estimated as where c = (γp/ρ) 1/2 is the sound speed.The diffusive term in Eq. ( 53) scales with the Mach number of the flow M and allows showing the wave structure of the HLLC solver for a Riemann problem of gas dynamics.The S L,R are the two outer sonic waves (solid lines), while S * is the speed of the linearly degenerate contact and shear waves (dashed line) that separate the two intermediate "star" states, U * L,R .
the scheme to achieve numerical stability by smearing out any 446 discontinuity that may arise in the vector of state quantities U.

447
The RUSANOV solver is one of the simplest schemes that 448 can be used to approximate the fluxes at grid cell interfaces, 449 which makes it very efficient in terms of Floating Point Oper-450 ations per Second.However, it does not take into account the 451 complex structure of the solution arising from the Riemann prob-452 lem of gas dynamics (see, e.g., Toro 2009), so the states between 453 the two outer waves in the Riemann fan are averaged out.For 454 this reason, this flux function is particularly diffusive for trans-455 porting contact and shear waves, which lack the self-steepening 456 property of sound waves.Different from RUSANOV, the HLLC solver restores the linearly 459 degenerate contact and shear waves back to the structure of the 460 solution of the Riemann problem (see Fig. 1).In this method, 461 the numerical flux is chosen according to the sign of the wave 462 speeds in the Riemann fan, 463 While the computation of the physical fluxes Here, U * L and U * R represent the state quantities in the star regions.468 In order to solve Eqs. ( 56)-( 57), proper estimates of the wave 469 speeds S L and S R must be provided beforehand.SLH computes 470 these wave speeds as After some assumptions and algebraic manipulations (see Toro 472 2009), these estimates allow Eqs. ( 56)-( 57) to be solved for the 473 with and S * being the speed of the intermediate wave, The gas pressure is preserved across the middle wave and takes 477 the value These expressions are then inserted back into Eqs.( 56)-( 57) to 479 compute F * L and F * R .Finally, the interface flux is chosen according to Eq. ( 55).

481
In our study, we used a variant of the original HLLC solver and we computed p * using a linearized Riemann solver for the 487 equations of gas dynamics (see, e.g., Toro 1991), where c = (c L + c R )/2 and ρ = (ρ L + ρ R )/2.The system in Eq. ( 69) is consistent with the physical fluxes in the sense that 490 and it satisfies the Rankine-Hugoniot jump conditions across the contact wave S * as the original solver, Diagnostic tests of the Kelvin-Helmholtz instability problem,  We stress, however, that the fluxes in Eq. ( 69) do not satisfy 497 the jump conditions across the sonic waves S L and S R (see 498 Eqs. ( 56) and ( 57)).Therefore, there is no guarantee that the 499 resulting scheme preserves positivity of density and internal en-  (Harten et al. 1983) in simulations involving the presence of material interfaces or the propagation of entropy waves.However, the effects of the numerical dissipation introduced by HLLC on the evolution of the flow become progressively more dominant as M → 0, thus producing unnecessarily large diffusive errors in highly subsonic velocity regimes (see, e.g., Fleischmann et al. 2020).In our variant of HLLC, this behavior is caused by the upwind term in the expression for p * (see Eq. ( 70)), This term scales with M, which is inconsistent with the scaling of pressure fluctuations dynamically generated by subsonic flows.In fact, in the asymptotic limit M → 0, the solution to the compressible Euler equations approaches the incompressible regime (Guillard & Viozat 1999), in which the gas pressure is homogeneous in space except for fluctuations proportional to M 2 .At low Mach numbers, the numerical term in Eq. ( 73) can eventually become larger than the physical pressure fluctuation at the cell interface, thus yielding highly inaccurate numerical results.
In order to correct for the flawed scaling of the numerical dissipation introduced by HLLC-like methods, here we followed the approach described in Minoshima & Miyoshi (2021), who proposed to multiply the diffusive term in Eq. ( 73) by a factor φ proportional to the local Mach number of the flow7 .Such a correction was originally applied to the magnetohydrodynamic solver HLLD, but it can easily be used in HLLC by setting all magnetic field components to zero, resulting in with The final expression for p * then reads The resulting upwind term in this "low-dissipation" version of the HLLC flux (LHLLC) scales with M 2 when the flow is subsonic, so the ratio of the numerical diffusive term to the amplitude of pressure fluctuations is independent of M.
We note that the same fix cannot equally be applied to the RUSANOV flux without sacrificing numerical stability.In particular, a diffusive coefficient proportional to M 2 would result in too little dissipation for sound waves.This is not the case for LHLLC, in which the complex upwinding performed in Eq. ( 69) guarantees that the scheme remains stable for the propagation of sound waves (see also Appendix B.1).

Time discretization
Because the acoustic Courant-Friedrichs-Lewy (CFL, Courant et al. 1928) criterion on the time step becomes excessively strict in regimes of very low Mach numbers, implicit time discretization techniques are typically better suited for simulating the evolution of such slow flows (see, e.g., Viallet et al. 2011;Miczek et al. 2015;Dumbser et al. 2019).However, we recognize that most hydrodynamic codes nowadays do not have time-implicit integration capabilities, whose implementation requires a considerable effort from code developers.Thus, to make our study easily reproducible, we decided to target in our test setups, see Sect.3.1 and 3.2, flows with Mach numbers in the range 10 −3 M 10 −1 , where simple time-explicit marching schemes are still competitive with implicit ones.In this work, explicit time integration was performed in a semi-discrete fashion, in which the cell-surface integral of the fluxes and the cell-volume integral of the source terms in Eq. ( 2) were first separately discretized in space whilst the system was left continuous in time according to the method of lines (see Sect. 2.2).The resulting system of ordinary differential equations (see Eq. ( 13)), was then solved numerically to advance the cell-volume-averaged state quantities in time.To solve Eq. ( 13), we used the third-order accurate, strong stability preserving (SSP) RK3 method of Shu & Osher (1988), in which the update on In particular, we computed the spatial residuals at stage s, using the numerical techniques described in Sects.2.3 and 2.4.
Finally, in order to achieve numerical stability, we limited the time step according to where N dim is the number of spatial dimensions.In all the tests presented in Sect.3, we always adopted CFL = 0.8.We preferred to use a third-order accurate time stepper over less computationally expensive (but more inaccurate) methods, such as the "Midpoint rule" or SSP-RK2 (Shu & Osher 1988), so that the spatial instead of the temporal discretization would contribute most to the building up of global truncation errors.

Godunov-type methods
In this section, we check if the methods included in our study (described in Sect.2) converge to the same numerical solution for several physical quantities of interest.In particular, we test a Kelvin-Helmholtz instability and a more complex setup charac-587 terized by the presence of turbulent convective flows, turbulent 588 entrainment, and wave excitation.The results of the latter set of 589 simulations allow us to estimate the computational cost per fixed 590 accuracy for any given scheme, which we show in Sect. 4. We first test all of our 18 combinations of numerical schemes 593 as described in Sects.2.3 and 2.4 on a 2D Kelvin-Helmholtz 594 problem with the initial condition where γ = 1.4 and The smooth function η(y) provides a resolvable transition be-597 tween layers moving in opposite horizontal directions.The ini-598 tial speed of sound is unity, so M 0 is a tunable initial Mach num-599 ber of the shear flow.We discuss solutions with M 0 = 10 −1 , 600 10 −2 , and 10 −3 .In Eq. ( 84), a smooth initial perturbation with an 601 amplitude of M 0 /10 is included as a velocity component perpen-602 dicular to the shear flow.The computational domain, assumed to 603 be periodic in both x and y, spans 0 ≤ x ≤ 2, −0.5 ≤ y ≤ 0.5.

604
The fact that the transition function η(y) between the shear-605 ing layers is smooth8 allows us to compute numerically con-606 verged solutions even in the absence of physical viscosity as 607 long as the simulations are stopped before the flow field be-608 comes chaotic (see also Robertson et al. 2010;McNally et al. 609 2012;Lecoanet et al. 2017;Berlok & Pfrommer 2019).Each of 610 the two transitions spans only 1 /16 of the domain height and is 611 poorly resolved on the coarser grids used in our tests.Therefore, 612 we improve the accuracy of the initial cell averages that involve 613 η(y) by averaging η(y) over 100 points uniformly distributed in 614 the y-range covered by each cell.We measure numerical errors 615 with respect to a reference solution computed using PSH recon-616 struction and the LHLLC flux function on a 8192 × 4096 grid.617 The solution for M 0 = 10 −2 is shown in Fig. 2 at four points 618 in time9 .As the instability grows in amplitude, the sinusoidal 619 initial perturbation is rolled up into a series of vortices.Parts of 620 the initial shear layers are stretched and become trapped in the 621 centers of the vortices.Other parts of the shear layers become 622 substantially narrower.We quantify this phenomenon by com-623 puting the minimum scale height min(H X ) ≡ 1/ max (|∇X|) of 624 the passive scalar X. Figure 3 shows that this quantity drops by 625 as much as a factor of 28 between t = 0 and t = 0.  scale artifacts produced by different methods.All of the six reconstruction functions lead to extremely diffusive solutions with the RUSANOV flux, although high-order methods with limiters (PPM84, PPM08, PSH) still preserve steep gradients at some places.Numerical diffusion is strongly suppressed with the HLLC flux function owing to the explicit treatment of the contact wave in HLLC.The three highest-order methods (PPM84, PPM08, PSH) reproduce the structure of the primary vortices (c.f.Fig. 2) much more closely than the lower-order methods (LIN, LIN+VL, PAR).However, the two PPM methods develop secondary instabilities around the primary vortices.This effect, not present in the reference solution, occurs also with the LH-LLC flux function (see also McNally et al. 2012).The secondary instabilities become the dominant source of numerical errors.We find that these instabilities tend to grow when excessive velocity shear is generated at the grid scale and there is not enough numerical dissipation to suppress their growth.Therefore, the excitation of artificial, short wavelength Kelvin-Helmholtz instabilities is favored on coarser grids, which generate larger shear at the grid scale across the poorly resolved slip line, and by less dissipative Riemann solvers.For the same reason, simulations run with the HLLC solver are more prone to developing secondary instabilities at higher rather than lower Mach numbers, as HLLC introduces less numerical diffusion into the system when modeling faster flows (see, e.g., the panels for PPM84+HLLC in Fig. 4 and Fig. D.1).As the grid is refined, the shear layers are progressively better resolved thus reducing grid scale shear and suppressing the growth of the secondary instabilities.Not surprisingly, the smooth interiors of the primary vortices are best represented with the highest-order method PSH (c.f.Fig. 2).The magnitude of the largest overshoot (or undershoot) in the passive scalar is shown in the insets in Figs. 4, D.1, and D.2.Thanks to their use of limiters, the LIN+VL, PPM84, PPM08, and PSH methods largely eliminate the overshoots, making the methods useful for the advection of mass fractions.We observe only some accumulation of round-off-level overshoots (up to ≈ 10 −14 ) with LIN+VL, independently of the initial Mach number or grid resolution.The PPM84, PPM08, and PSH methods produce small but finite overshoots.At t = 0.8M −1 0 , the magnitude of the largest overshoots is ≈ 10 −4 with PPM84, ≈ 10 −2 with PPM08, and ≈ 10 −6 with PSH across all grids and all initial Mach numbers.However, they only occur in a few cells or groups of cells scattered across the computational grid, so the total mass affected by this effect is negligibly small.Moreover, the amplitude of such overshoots, drops to the round-off level in many of our simulations as the resolution is increased.Unlim-G.Leidi et al.: Performance of high-order Godunov methods ited methods produce substantially larger overshoots, reaching 8% and 17% for LIN and PAR, respectively.They are the result of the dispersion errors characteristic of linear schemes of 2 nd or higher order, so they are nearly always present around sharp structures and affect a much larger amount of mass than the sporadic and isolated overshoots produced by PPM84, PPM08, and PSH.The benefits of the hybrid approach in PSH thus become obvious -secondary instabilities are suppressed (as compared with PPM84 and PPM08) thanks to the absence of limiters for dynamic variables while mass fractions, for which limiters are used, are well behaved.
We quantify numerical convergence of all of our methods by measuring L 1 errors with respect to the reference solution in the density ρ, kinetic energy E k,y associated with motions perpendicular to the initial shear flow, and mass fraction X of the passive scalar.We define the relative L 1 error for any quantity q as where q ref is the reference solution and the sums run over the whole computational grid of N x × N y cells.The error is nor- produce the smallest errors on a given and sufficiently fine grid.
The errors differ by up to one order of magnitude, although the overall 2D scheme is 2 nd -order accurate in all of the cases.
The magnitude of numerical errors strongly increases when we decrease initial Mach number with the non-low-Mach flux functions RUSANOV and HLLC.As described above, the spatial structure of errors produced by linear methods differs from that produced by methods with limiters.background of relatively weak, large-scale sound waves.On the other hand, the solutions computed using the PPM84 and PPM08 757 methods show a large amount of small-scale numerical noise.758 After inspecting the time dependence of these artificial struc-759 tures11 , we conclude that some of these structures travel with the 760 flow while others have the character of small-scale sound waves.761 We believe that both originate from cumulative effects of the lim-762 iters switching their local state many times over the time span of 763 the simulation12 .The switching can result in rapid changes in the 764 magnitude of the discontinuities at cell faces and, consequently, 765 in the amount of dissipation applied in the Riemann solver (i.e., 766 the numerical flux function).This effect is weakest with the low-767 dissipation flux function LHLLC but it is still clearly present 768 on the 4096 × 2048 grid, see Fig. D.8.The structures disappear 769 when we disable the limiters in PPM84 and PPM08 (not shown 770 in the figures).We do not observe the fast-propagating, small-771 scale sound waves with LIN+VL but the method does produce 772 thin structures in the velocity divergence around the primary vor-773 tices, which follow the flow.

774
Overall, the PSH method produces by far the smallest errors 775 with the RUSANOV and HLLC flux functions and initial Mach 776 numbers M 0 ≤ 10 −2 (Figs. 5 and D.4).The only exception is 777 the poor convergence (or even divergence) in the passive tracer 778 X observed with the PSH+RUSANOV combination in simula-779 tions with M 0 ≤ 10 −2 performed on very fine grids.This effect 780 is also likely related to the limiters switching their local state 781 many times over the time span of the simulation but we did not 782 investigate it further.In this section, we consider a test problem in which a variety 786 of complex hydrodynamic phenomena can be captured on the 787 same computational grid.These include convective transport of 788 energy, turbulent mixing at a convective boundary, and the prop-789 agation of internal waves in a stably stratified layer, all of which 790 are often encountered in simulations of geophysical and stellar 791 hydrodynamics.The initial conditions for this test are adopted 792 from the work of Andrassy et al. (2022) and they represent a 793 hydrostatic solution of Eq. (2).In particular, the thermodynamic 794 conditions in this setup are similar to those found in an oxygen-795 burning shell of a massive star.We map the hydrostatic stratifica-796 tion (see Fig. 7) on an evenly spaced, 3D, Cartesian grid defined 797 by (x, y, z) The gravitational acceler-798 ation points downward in the y-direction, where g 0 = −1.414870and As in Andrassy et al. (2022), we work with rescaled quantities 801 such that the speed of sound and the density at the base of the 802 of the convective oxygen shell as unit of length.A detailed list 804 of units with the associated values can be found in Table 1 of The initially hydrostatic stratification is described by a 807 piecewise-polytropic relation in the form where γ 0 = 5/3, γ 1 = 1.3, and η(y) is a smooth function, The adiabatic index is γ = γ 0 .In this work, we assume peri- is added to the right-hand side of Eq. ( 2) as In the work of Andrassy et al. (2022), the amplitude of the heat IGWs in the stable layer.Due to the fully compressible nature of SLH, we also expect short-wavelength sound waves to be generated, although at much lower amplitudes than those of IGWs at the typical Mach numbers encountered in this test problem (Lighthill 1952).
As done for the test described in Sect.3.1, here we run simulations for each of the 18 considered combinations of Riemann solvers and spatial reconstruction schemes.To judge the numerical convergence of our results, each combination of methods is run on grids with 128 3 and 256 3 cells.Additionally, we run a single simulation on a 512 3 grid using the LHLLC Riemann solver and the PAR reconstruction scheme, which we consider the reference solution for this test problem.However, because of the chaotic nature of the turbulent flows that arise in the convective layer, convergence is not expected in the exact flow morphology, so we do not compute L 1 errors as done in Sect.3.1.Instead, we analyze the convergence of the numerical results in terms of ensemble-averaged quantities that are representative of the dynamical properties of the system, such as kinetic energy spectra computed in the convective and stable layers.
To break the initial symmetry, we add a perturbation to the hydrostatic density stratification in the form ∆ρ = 1.1×10 −5 q(y) q0 [sin(3πx)+cos(πx)][sin(3πz)−cos(πz)].( 95) The subsequent evolution of the system and the development of convection is shown in Fig. 8.The density perturbation, alongside the action of the heat source, generate packets of fluid with higher entropy content than the adiabatic surroundings.The packets of hot and low-density material buoyantly rise in the adiabatic stratification until they reach y ≈ 2. At this height, the temperature stratification turns subadiabatic and the buoyant force acting on the rising plumes changes sign, forcing them to overturn.IGWs excited at the bottom boundary of the subadiabatic region propagate upward in the stratification (with characteristic Mach numbers in the range from 0.005 to 0.01) and are subsequently reflected at the top boundary of the domain.Shear instabilities break the large-scale buoyant structures that arise in the adiabatic layer and initiate the cascade of kinetic energy toward smaller scales.Turbulent convection fully develops after approximately one convective turnover time scale, τ conv = 133 time units, which we define according to In Eq. ( 96), the root-mean-square convective speed is averaged over several convective turnover time scales and L conv = 1 is taken as representative of the vertical extent of the convection zone.In our reference simulation, the root-mean-square Mach number in the convection zone is13 M rms = 0.019±0.001,which is in agreement with the value predicted by the M rms ∝ q1/3 0 scaling relation.All simulations are run until t max = 32τ conv to have a proper coverage of the dynamical evolution of the system and to compute meaningful time averages needed for the following analysis.

Flow morphology and spatial kinetic energy spectra
In Fig. 9, we show snapshots of the Mach number taken at the vertical plane z = 0, obtained at the final time t = 32τ conv with all the methods tested here.Although all panels show results computed using the same grid resolution (256 3 ), there is a vast difference between the methods in terms of effects induced by numerical dissipation.With the RUSANOV flux, convection mainly happens in the form of large-scale plumes and IGWs are barely excited in the stable layer except when high-order methods such as PSH or PPM08 are used to reconstruct the Riemann states.Smaller-scale structures typical of turbulent flows become progressively more visible in the convection zone with higherorder reconstruction methods and less dissipative solvers.Internal waves with progressively shorter wavelengths also tend to be more visible in the stable layer with less dissipative methods.The effects of the numerical dissipation on the properties of the flow are better quantified in Fig. 10 and 11, where we show the power spectra of kinetic energy extracted from a horizontal plane in the middle of the convection zone (y = 1.5) and stable layer (y = 2.5) as functions of the horizontal wavenumber, We compute k x and k z as where .represents the floor function and N x and N z are the numbers of cells in the x− and z−direction, respectively.The spectra are averaged over the time interval t ∈ (10τ conv , 32τ conv ).
In the convection zone, all of the curves approximately agree with the Kolmogorov scaling law (k −5/3 h ) on some intermediate scales.We notice that the kinetic energy spectra shown in Fig. 10 present a pile-up of kinetic energy at the bottom of the inertial range.This phenomenon, also known as "bottleneck effect," is often observed both in hydrodynamic simulations (see, e.g., Dobler et al. 2003) and experimental studies (see, e.g., Küchler et al. 2019) of turbulent flows.The extent of the inertial range of the turbulent kinetic energy spectrum greatly differs from method to method.With a fixed spatial reconstruction scheme, the kinetic energy spectrum obtained with LHLLC enters the viscous subrange (where the kinetic energy of the turbulent eddies is dissipated into internal energy of the gas) at higher wavenumbers as compared to both HLLC and RUSANOV.Reconstruction schemes based on slope limiters are characterized by shallower spectra in the viscous subrange than those generated by unlimited schemes.In the stable layer (see Fig. 11), RUSANOV and HLLC generate much weaker IGWs than the reference run even on large scales, except when used in combination with PSH.
Both in the convective and stable layers, all spectra converge to the reference solution if the order of the spatial reconstruction method or the grid resolution are increased.

Turbulent entrainment at the convective boundary
The overturning of the convective flows at the upper convective boundary gives rise to a variety of hydrodynamic processes (including shear instabilities, convective overshoot, and breaking of surface waves) that lead to the entrainment of material from the upper, stably stratified layer into the convection zone.The conglomeration of these processes is known in the literature as "convective boundary mixing" (see, e.g., Meakin & Arnett 2007;Andrassy et al. 2020;Horst et al. 2021;Anders & Pedersen 2023) and it increases the size of convective regions over time.As part of our comparison study, we quantify the impact of the choice of a specific combination of methods in Godunov-type schemes 951 on the evolution of the convective boundary.At any given point 952 in time, we assume that the vertical coordinate of the boundary 953 between the convective and stable layer, y cb , is the position at 954 which the horizontal average of the passive scalar X takes the 955 value where X min and X max are the minimum and maximum values of 957 X, respectively.Such a choice for y cb is justified by the fact that 958 the initial abundance of the passive scalar, X, smoothly transi-959 tions from 0 to 1 at y ≈ 2, which is the boundary of the initially 960 adiabatic layer where convection sets in first.The time evolu-961 tion of y cb is shown in Fig. 12 for all of our combinations of 962 methods.We note that PPM-based methods used in combina-963 tion with RUSANOV do not show signs of numerical converge 964 to the reference solution.Also, the other reconstruction schemes 965 used with RUSANOV seem to accelerate the mixing at the con-966 vective boundary with respect to the reference run, especially 967 on the 128 3 grid.In reality, this is an artifact of the method 968 used to estimate the position of the convective boundary y cb .969 In fact, because RUSANOV does not resolve shear or contact 970 waves (see Sect. 2.4), the initial gradient in X at the convec-971 tive boundary is further smoothed out by the effects of numer-972 ical dissipation.When convection fully develops, it mixes the 973 passive scalar that has diffused inside the convection zone and 974 homogenizes its abundance, thus shifting the formal position of 975 X = X outward.This effect is further enhanced with the most 976 diffusive reconstruction methods tested here.Therefore, with the 977 RUSANOV solver, entrainment of material from the stable to 978 the convective layer is mostly determined by numerical diffusion 979 rather than turbulent mixing and the distribution of the passive 980 tracer is not representative of y cb .On the other hand, all of the 981 other methods tested here quickly achieve numerical converge 982 to the reference solution with increasing grid resolution, the or-983 der of the reconstruction scheme, or by using progressively less 984 dissipative Riemann solvers.Among the data sets obtained with 985 the six reconstruction schemes, the maximum relative deviation 986 of y cb from the reference solution at the final time is 20% and 987 15% with HLLC on the 128 3 and 256 3 grids, respectively, while 988 with LHLLC it decreases from 12% to 5% with increasing grid 989 resolution.
where 14 Here, we assume that the gas constant is unity. 15https://zenodo.org/doi/10.5281/zenodo.10280900 16The array of the values of v obtained at each time step is multiplied by the Hanning window function to reduce the amplitude of the discontinuities in the signal at the boundaries of the time domain.
nonevanescent) in which the spectra are characterized by a steep decrease in power (although non monotonic) toward higher frequencies.The forest of lines visible at frequencies in the range from 1 to 50 correspond to the resonant pressure modes of the cavity as predicted by the linear wave theory in the Cowling approximation (Aerts 2021, see also Fig. 15).In the frequency regime of IGWs, the power spectrum seems to converge to the reference solution when increasing the grid resolution or when using less dissipative Riemann solvers.A more quantitative analysis of the IGW spectrum would require substantially longer simulations, which would benefit from more time averaging as well as reach higher frequency resolution.
On the other hand, the high-frequency domain is well resolved even in simulations run on the 128 3 grid, and differences between the spectra are clearly visible.Overall, there is a large spread in power (by almost 20 orders of magnitude at the Nyquist frequency) among the different methods tested in our study, which does not decrease with grid resolution.Therefore, numerical convergence is not achieved at high frequencies.We note that the spectra with the highest power density in the high-frequency range are always those obtained with limited reconstruction methods, in particular PPM84 and PPM08, while unlimited reconstruction methods tend to generate much quieter spectra and that are closer to the reference solution.In the simulations of the Kelvin-Helmholtz instability shown in in Sect.3.1, we find that a power excess in high-frequency, short-wavelength sound waves is generated by the complex limiting procedure performed in PPM84 and PPM08 to reconstruct the Riemann states at cell interfaces.In this setup, a similar phenomenon is likely responsible for generating an acoustic power excess close to the Nyquist frequency, where the spread between the spectra is maximum.The power excess, however, is still large down to frequencies similar to the that of the fundamental oscillation mode of the cavity ω 0 = 1.1.Also, PPM-based methods used in combination with HLLC excite much stronger resonant lines than those generated with LHLLC, and their peak power density is almost as high as the flat part of the spectrum associated with IGWs.When used with PPM84, both RUSANOV and LHLLC produce a broad feature in the frequency spectrum whose peak is at ω peak ≈ 50 and 100 on the 128 3 and 256 3 grids, respectively.If these were sound waves, their wavelength λ would be λ = 2π which is close to the Nyquist frequency in space.Therefore, such a feature in the spectrum may be caused by odd-even cell decoupling traveling at the local speed of sound.Additional peaks are observed at even higher frequencies than the frequency of the least resolvable sound wave on the grid (with a wavelength of two cells), so these are most likely numerical artifacts.Tests performed with lower CFL factors of 0.4 and 0.2 show that the amplitude of these peaks is slightly reduced when using shorter time steps in the simulation, but the level of the continuum in the power spectrum remains essentially unaltered.In light of these results, we advise against using PPM-based methods in simulations of sound generation by low-Mach-number turbulence.

Performance metrics
After proving that all of the methods tested in our study converge to the correct solution for most of the physical quantities of inter- Equation ( 106) is rescaled such that the cost of the simulation run with PPM08 + LHLLC on any given grid is unity.All values of Θ(N x = 128) and Θ(N x = 256) are shown in Table 4.The spread of almost four orders of magnitude in the relative computational cost among the different methods is due to the steep dependence of Θ(N x ) on N eff (N x ), which varies by as much as a factor of 10 (see Table 2).On the other hand, Θ(N x ) only scales linearly with δt(N x ), which in our simulations varies at most by a factor of ≈ 3 (with LIN + RUSANOV and PPM84 + LHLLC achieving the lowest and highest wall-clock time per cell-update, respectively).The most expensive combination of methods among those tested in our study (in terms of computational cost per fixed resolving power) is LIN + VL + RUSANOV, ∼1000 times as expensive as PPM08 + LHLLC.The strong numerical dissipation generated by RUSANOV leads to very poor performance of the finite-volume scheme even when used in combination with PPM-based reconstruction methods.Only with the unlimited PSH method such a flux function is capable of achieving acceptable performance (Θ(N x ) ≈ 4).When the reconstruction scheme is kept the same, using progressively less dissipative Riemann solvers decreases the cost of reaching the same effective resolution.At the typical Mach numbers encountered in the convection zone (M ≈ 0.02), the combination of HLLC and second-order reconstruction schemes or PAR is considerably more expensive than PPM08 + LHLLC (Θ(N x ) in the range from 10 to 30).The performance of the scheme increases when HLLC is used with PPM-based reconstruction schemes (Θ(N x ) in the range from 2 to 6) and it is even higher than that of PPM08 + LHLLC when coupled to PSH.PSH + LHLLC is the most performant method on both grids according to the chosen metric.Overall, the computational cost of the finite-volume scheme is considerably reduced when using the low-dissipation solver (by a factor from 2 to 10 with respect to a scheme using HLLC and the same spatial reconstruction method).The relative performance of the scheme only varies slightly when LHLLC is used in combination with reconstruction methods less accurate than PSH, with the worst performance being achieved by LIN + VL (Θ(N x ) ≈ 3).
We note that, for most methods, the value of Θ(N x ) sensibly increases (by as much as a factor of ≈ 2) from the 128 3 to the 256 3 grid.Such differences are due to the small but systematic decrease of the rescaled effective resolution N eff (N x )/N x with increasing the grid resolution (this behavior is also confirmed by the reference solution computed on the 512 3 grid, see Table 2).Although the differences observed in N eff (N x )/N x among the two grids are at most 20%, they are significantly amplified after applying the steep scaling relation between Θ(N x ) and N eff (N x ).However, this effect is negligible if compared to the large spread obtained in the values of Θ(N x ) on a given grid.Furthermore, we stress that our measure of Θ(N x ) is based on a crude approximation of effective resolution of the turbulent flows, so the values provided in Table 4 should only be taken as estimates of the relative performance of different Godunov-type methods in simulations of turbulent convection.The measurements of the absolute performance provided in Table 3, which are needed to estimate Θ(N x ), may also depend on the parallelization strategy and the number of cores used to run the simulations, especially for the methods that require many ghost cells (e.g., PPM08 and PSH) and are therefore characterized by higher communication costs.

Summary and conclusions
High-resolution, finite-volume schemes are popular methods for simulating the behaviour of astrophysical fluids.There is a wide range of spatial-reconstruction schemes, numerical flux  The low-Mach flux function LHLLC is found to be much less dissipative and much more accurate than the HLLC and RU-SANOV flux functions at the two lowest Mach numbers considered.Even though the overall 2D numerical scheme is 2 nd -order accurate, the errors produced by different spatial reconstruction schemes span as much as an order of magnitude.Unlimited reconstruction schemes of increasing order, up to 7 th , provide progressively more accurate results when the solution is sufficiently well resolved.However, these schemes are of limited use for the advection of mass fractions due to the oscillations and overshoots they produce around discontinuities.This issue is largely eliminated by the use of limiters in the schemes LIN+VL, PPM84, and PPM08.However, we find that the switching behaviour of the limiters introduces spurious structures and smallscale sound waves to the solutions, which severely reduce the accuracy of the methods when applied to slow flows.Our hybrid method PSH, which combines unlimited, 7 th -order reconstruction for dynamic variables with PPM08 for mass fractions, is found to provide the most accurate solutions in nearly all of our simulations of the Kelvin-Helmholtz instability.
Our second test problem, chosen to be as close as possible to practical applications in the dynamics of stellar interiors, involves stratified turbulent convection generating waves in and entraining mass from an overlying stably stratified layer.The typical Mach number of the convection is ≈ 0.02.The turbulent nature of the convective flow makes the solutions chaotic but space-and time-averaged quantities can still be meaningfully compared between different simulations.We ran the simulations on grids of 128 3 and 256 3 cells with one additional simulation on a 512 3 grid serving as a reference solution.
Qualitatively speaking, the solutions match the trends seen in the simulations of the Kelvin-Helmholtz instability, that is to say combinations of the least dissipative flux functions with solving power in both the convective and stable layers.Spatial to a conventional Riemann solver by even larger factors (Leidi et al. 2022).Although low-dissipation Riemann solvers such as LHLLC bring clear advantages in simulations of subsonic flows when used in Eulerian codes, their robustness and accuracy properties in quasi-Lagrangian, moving-mesh schemes still has to be verified.We note that such schemes hold promise for a particularly significant improvement because mesh motions aim at reducing advection errors by minimizing the velocities relative to the cell interfaces.This naturally leads to low-Mach-number flows in the comoving frame, which are better modeled by low-Mach-number Riemann solvers.We are currently testing an implementation of the LHLLC/D solvers of Minoshima & Miyoshi (2021) in the moving-mesh MHD code AREPO (Springel 2010), which we will report in a forthcoming study.
On the other hand, the choice of the spatial reconstruction scheme depends on the physical application of interest.Overall, second-order reconstruction methods lead to the generation of considerably more inaccurate results than higher-order schemes when the Mach number of the flow is low.If numerical overshoots have to be suppressed whilst retaining high resolving power, PPM84 and PPM08 should be the methods of choice.However, higher-order unlimited reconstruction methods should be preferred over PPM-based methods in simulations of sound wave generation by subsonic turbulence because they generate much less numerical noise in the acoustic spectrum.Less oscillatory alternatives to the hybrid PSH method, such as the 3 thorder piecewise parabolic hybrid (PPH) or the 5 th -order piecewise quartic hybrid (PQH) methods described in Appendix A, could offer the best compromise between the complete removal of numerical overshoots and reduction of acoustic noise.
Finally, we note that there are complex astrophysical environments where both high-and low-Mach-number flows can coexist.This scenario often occurs, for instance, in accretion disks (see, e.g., Klessen & Hennebelle 2010), star forming regions (see, e.g., Wang et al. 2024), or outer layers of stars (see, e.g., Wedemeyer et al. 2017).To model such diverse flow regimes, the simple use of unlimited higher-order reconstructions and lowdissipation solvers may give rise to numerical instabilities, especially in the proximity of shock fronts.To cure this problem, an alternative approach based on the local dynamical properties of the flow is needed (see, e.g., Mignone et al. 2011;Fleischmann et al. 2020).One possibility is to add a selection criterion to the function that computes the flux at a cell interface such that the LHLLC solver is used if the Mach number of the flow is lower than a conservative threshold (e.g., M 0.6), whereas a proper shock-capturing method (e.g., HLL or HLLC) is chosen for modeling faster flows.A similar argument can be made for the choice of the spatial reconstruction scheme.To avoid introducing large oscillations in the state quantities near large, nonlinear discontinuities, schemes that are more dissipative and possibly close to being TVD (such as LIN+VL, PPM84, PPM08, or WENO) should be preferred over high order unlimited methods.Eq. (C.1).This feature allows CTO-WENO to achieve thirdorder spatial accuracy in smooth parts of the flow while at the same time it remains robust near discontinuities.The parameter WENO that appears in the denominator of α m avoids division by zero in the case a smoothness indicator IS m = 0. Thus, the oscillatory behavior of a(ζ) is also determined by the value of this parameter.In fact, when WENO IS L,R,C , the weighting coefficients become almost equal and the scheme is unlimited even near discontinuities, making it similar to our PAR method.On the other hand, if WENO IS m , the weighting coefficients are only determined by IS L,R,C and the scheme becomes close to being TVD near discontinuities or in poorly resolved parts of the flow.
Although CTO-WENO, unlike LIN+VL or the PPM methods, does not involve any conditional statements, the weighting coefficients that occur in a(ζ) can still abruptly change from time step to time step if a barely resolved wave or feature in the flow crosses that particular cell.Such a rapid change in the form of the polynomial a(ζ) can generate high-frequency perturbations in the state quantities on the grid scale and affect the propagation of sound waves.Therefore, we expect a version of CTO-WENO that uses small values of WENO to generate artificial acoustic noise, whereas large values of WENO should generate results closer to our unlimited reconstruction methods.
fully compressible, inviscid Euler equations with 116 a source term S in the integral form ically rectangular, Eulerian grid to retain the conservation properties of the fluid-dynamics equations.Numerical solutions to Eq. (2) are computed by means of Godunov-type methods based on the definition of Riemann problems at cell interfaces.The code is parallelized using the Message Passing Interface (MPI) fines a Riemann problem at the cell interface i + 1/2, which SLH solves by means of 1D approximate Riemann solvers to obtain the face-centered value of the fluxes F i+1/2 .We ran the tests presented in Sect. 3 using two widely popular flux functions, namely the RUSANOV and HLLC solvers.Because in this work we only focused on simulations of subsonic flows, for comparison we also built a low-dissipation version of HLLC following the approach of Minoshima & Miyoshi (2021), who modified the Harten-Lax-van Leer-Discontinuities (HLLD, Miyoshi & Kusano 2005) scheme for magnetohydrodynamics to diminish the magnitude of the numerical dissipation for low-Mach-number flows.The authors called this low-dissipation flux "LHLLD," so,

482of
Toro et al. (1994), which allowed the low-Mach correction 483 presented in the next section to be implemented trivially into the 484 solver.In particular, we directly evaluated the physical fluxes in 485 the selected state of the Riemann fan, 493 described in Sect.3.1, showed that the numerical solutions com-494 puted with our modified version and the original solver of Toro 495 et al. (1994) are virtually indistinguishable for subsonic flows. 496

Fig. 2 .
Fig. 2. Reference solution to the Kelvin-Helmholtz problem with initial Mach number M 0 = 10 −2 .The solution was computed using PSH reconstruction and the LHLLC flux function on an 8192 × 4096 grid.The mass fraction X of the passive scalar is shown at four points in time: late in the linear growth of the instability (t = 0.2 M −1 0 ), at an early stage of nonlinear evolution (t = 0.4 M −1 0 ), at a stage when the primary vortices have fully formed (t = 0.8 M −1 0 ; the final time for all of our other Kelvin-Helmholtz simulations), and at a late stage when fine threads have formed inside the primary vortices (t = 1.6 M −1 0 ).We use the same color scale as in Figs. 4, D.2, and D.1, although 0 ≤ X ≤ 1 in the reference solution.

Fig. 3 .Figure 4
Fig.3.Resolution dependence of the minimum scale height min(H X ) of the passive scalar in the Kelvin-Helmholtz problem expressed in units of the computational cell width and shown at five points in time.Grid resolution is given by the number N x of computational cells along the x axis.The initial Mach number is M 0 = 10 −2 and we use PSH reconstruction and the LHLLC flux function in this series of simulations.Once the steepest gradients in X become resolved the minimum scale height starts to follow the linear scaling relations shown.
Figures D.1 and D.2, respectively, show that the differences between the three flux functions become smaller with M 0 = 10 −1 and much larger with M 0 = 10 −3 .This is expected because the amount of numerical dissipation (relative to the flow of interest) introduced by the RUSANOV and HLLC flux functions increases with decreasing Mach number of the flow, see Sect.2.4.With M 0 = 10 −3 , only the PSH method reproduces the basic structure of the primary vortices when combined with the HLLC flux.Comparing Fig. 4 with Fig. D.2, we see that the LHLLC flux produces results independent of the initial Mach number M 0 because all of the flows are considerably subsonic.All of the methods we test converge to the same flow pattern with M 0 = 10 −2 , see Fig. D.7 for solutions computed on the 4096 × 2048 grid 10 .
Figs. 5, D.3, and D.4), confirming its low-Mach property.The highest-order reconstruction methods usually (but not always) Figures 5, D.3, and D.4      show that the linear methods LIN, PAR, and for dynamic variables also PSH approach 2 nd -order convergence on sufficiently fine grids even if the magnitude of the errors is much larger than what we obtain with the LHLLC flux function.The two PPM methods converge poorly in ρ at low Mach numbers (Figs. 5 and D.4).Because the initial density is constant everywhere, all of the density fluctuations are the integrated effects of the divergence (or convergence) of the velocity field in the continuity equation.The magnitude of the velocity divergence in simulations computed on the 512 × 256 grid is shown in Figs. 6, D.5, and D.6.The solutions computed using the LIN, PAR, and PSH methods clearly show the structure of the primary vortices with some oscillations in the shear layers and a simulations of convection, turbulent entrainment, and 784 wave excitation 785

810
odic boundary conditions in the horizontal x-and z-direction, 811 whereas reflecting, stress-free boundaries are used in the verti-812 cal y-direction.This problem is set up such that turbulent con-813 vective flows develop in the lower half of the domain, which is 814 initially adiabatic, while internal gravity waves are free to prop-815 agate in the upper, stably stratified layer.To keep track of the 816 position of the upper convective boundary, at t = 0 we fill the sta-817 ble layer with a passive scalar whose abundance smoothly turns 818 to 0 across the upper boundary of the adiabatic region according 819 to X = η(y).820 In order to drive the convection, we include a time-821 independent heat source that continuously injects energy into the 822 system close to the base of the box.The rate of energy released 823 per unit volume, 824 q

826
source was q0 = 3.795720 × 10 −4 , which gave rise to convective 827 flows with a root-mean-square Mach number M rms ≈ 0.04.To 828 make the problem more challenging, here we decrease q by a 829 factor of ten, so q0 = 3.795720 × 10 −5 .The lower heating rate, 830 according to the well-established M ∝ q1/3 0 relation (see, e.g., 831 Woodward et al. 2014; Käpylä 2021; Horst et al. 2021), should 832 drive convection at M rms ≈ 0.02.833 In this setup, internal gravity waves (IGWs) are excited by 834 the interaction of the convective flows with the bottom boundary 835 of the stably stratified layer.The wavelength of IGWs in the di-836 rection of gravity becomes shorter when the waves are excited 837 at progressively lower temporal frequencies (Sutherland 2010).838 At the heating rate we consider, the most prominent IGWs that 839 originate at the convective boundary are only barely spatially re-840 solved on the coarsest of our grids with 128 3 cells.Therefore, 841 we decide not to decrease q0 even further for these simulations 842 because it would give rise to convective flows with lower char-843 acteristic frequencies and lead to the generation of unresolved 844

T
Fig. 13.Overall, the amplitude of the undershoots increases with 1029
(Berberich et al. 2021;Edelmann et al. 2021der methods, some of 174 which are described in Sect.2.3.In problems involving the pres-175 ence of a gravitational field, the deviation well-balancing method 176(Berberich et al. 2021;Edelmann et al. 2021) is used to preserve 177 hydrostatic solutions and to reduce the strength of spurious flows 178 generated by grid discretization errors in strongly stratified me-179 dia.
(Liou 2006;Miczek et al. 2015;Minoshima & Miyoshi 20214)approximate Riemann solvers, SLH adopts special low-Mach-number methods(Liou 2006;Miczek et al. 2015;Minoshima & Miyoshi 2021) to reduce the excessive numerical dissipation introduced by shock-capturing schemes at low Mach numbers (see Sect. 2.4).A wide spectrum of spatial reconstruction methods can be used to generate a pair or Riemann states at each grid cell interface, ranging from (first-order accurate)

Table 1 .
Overview of spatial reconstruction methods used in this work.
the fluxes in the intermediate regions, 465 F * L and F * R , are obtained by solving the Rankine-Hugoniot jump 466 conditions across the two outer sonic waves, As discussed in Sect.2.4.2,HLLC restores the intermediate, linearly degenerate waves, so it is generally more accurate than two-wave solvers such as RUSANOV or HLL 500ergy when the flow is nearly transonic, in which case effects of 501 compressibility and nonlinearities can become dominant.Such a 502 flow regime, however, was not considered in this study.
Meakin & Arnett 2007; upper convective boundary aRizzuti et al. 2023)by means of horizontal averages in the Brunt-Väisälä frequency, 993 which determines both the spectrum of convectively driven 994 IGWs (see, e.g., Sutherland 2010) and the rate of mass entrain-995 ment from the stable layer according to the bulk-Richardson-996 number mixing model (see, e.g.,Meakin & Arnett 2007; An-997  drassy et al. 2020;Rizzuti et al. 2023).Here, the Brunt-Väisälä 998 frequency is computed as Leidi et al.:Performance of high-order Godunov methods est17, we can now search for the most efficient way to generate a numerical solution at the desired accuracy.In principle, one 1114 could use the L 1 errors computed from the simulations of the 1115 Kelvin-Helmholtz instability (see Sect. 3.1) to find the resolu-1116 tion of the grid on which the scheme achieves a given level of 1117 accuracy.Then, the computational cost of the simulation run on 1118 such a grid can be estimated if the wall-clock time spent by the 1119 program to perform a single cell update is known.Here, we previs (N x ) cannot be defined uniquely.Here, we choose By combining the values of N eff (N x ) and the average wall- G.1138 Therefore, λ 1153

Table 2 .
Effective resolution N eff (N x ) as defined in in Sect.3.2 for all of the 18 Godunov-type methods tested in our study on grids with N x = 128 and N x = 256 cells per dimension.Here, all numbers are given as a percentage of the grid resolution N x for an easier comparison between different grids.In this metric, the rescaled effective resolution of the reference run (PAR + LHLLC + 512 3 ) is 24.43.

Table 3 .
Mean wall-clock time spent by the program to advance the solution by one time step with RK3 in a single cell of the computational grid, in units of µs.The numbers provided in the tables are averages of five measurements, each of which is obtained by evolving the setup described in Sect.3.2 for 100 steps.Relative errors in the averages (in the sense of 1σ) are smaller than 2% in all cases.Every simulation run for this analysis is MPI parallelized using grids with 32 3 cells per task.
1228focus on the spatial accuracy of the schemes and the related need 1229 to suppress time-stepping errors.1230 We considered two main test problems.The first is a Kelvin-1231 Helmholtz instability problem, in which the initial shear flows 1232 have Mach numbers of 10 −1 , 10 −2 , and 10 −3 .We used a smooth 1233

Table 4 .
Relative computational cost of each Godunov-type method considered in this study to achieve the same effective resolution as the PPM08 + LHLLC combination (Θ, see Eq. (106)) in the simulations described in Sect.3.2.Here all numbers are rounded to two significant figures.make it possible to obtain numerically convergeable solutions to the inviscid Euler equations at a fixed point in time in the nonlinear phase of the instability.We characterized the accuracy of the solutions by (i) performing a qualitative assessment of the spatial structure of the solutions and (ii) by measuring L 1 errors with respect to a reference solution obtained on a fine grid of 8192 × 4096 cells.