Issue |
A&A
Volume 672, April 2023
|
|
---|---|---|
Article Number | A91 | |
Number of page(s) | 21 | |
Section | Numerical methods and codes | |
DOI | https://doi.org/10.1051/0004-6361/202245730 | |
Published online | 06 April 2023 |
A general basis set algorithm for galactic haloes and discs
Department of Astrophysics, University of Vienna,
Türkenschanzstraße 17,
1180
Vienna, Austria
e-mail: edward.lilley@univie.ac.at; glenn.vandeven@univie.ac.at
Received:
19
December
2022
Accepted:
14
February
2023
We present a unified approach to (bi-)orthogonal basis sets for gravitating systems. Central to our discussion is the notion of mutual gravitational energy, which gives rise to a ‘self-energy inner product’ on mass densities. We consider a first-order differential operator that is self-adjoint with respect to this inner product, and prove a general theorem that gives the conditions under which a (bi-)orthogonal basis set arises by repeated application of this differential operator. We then show that these conditions are fulfilled by all the families of analytical basis sets with infinite extent that have been discovered to date. The new theoretical framework turns out to be closely connected to Fourier-Mellin transforms, and it is a powerful tool for constructing general basis sets. We demonstrate this by deriving a basis set for the isochrone model and demonstrating its numerical reliability by reproducing a known result concerning unstable radial modes.
Key words: galaxies: halos / galaxies: structure / methods: numerical
© The Authors 2023
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1 Introduction
Orthogonal basis sets play a key role in the efficient calculation of the gravitational potential of perturbed, isolated mass distributions. They can also be of great value when investigating the stability of dynamical models for galaxies. Both these topics have attracted renewed interest recently in light of the mounting observational evidence that the Milky Way and other galaxies are not as symmetric in shape as assumed previously (Vera-Ciro & Helmi 2013; Law & Majewski 2010), and moreover that they may not be in exact dynamical equilibrium (Erkal et al. 2021; Petersen & Peñarrubia 2021). A small sample of recent applications of basis sets includes: efficiently reconstructing individual trajectories in time-varying snapshots of N-body simulations of dark matter haloes (Lowing et al. 2011; Sanders et al. 2020; Petersen et al. 2022a); flexible non-parametric models for the Milky Way (Garavito-Camargo et al. 2021); and a wide variety of perturbation calculations (Hamilton et al. 2018; Fouvry & Prunet 2022).
The development of the so-called biorthogonal basis sets began with Clutton-Brock (1972, 1973), who introduced two remarkable analytical sets of potential-density pairs based on the Kuzmin (1956) disc and Plummer (1911) sphere, respectively. These mathematical discoveries (along with some later results discussed below), while fortunate, are limited. It has long been recognised that to make best use of the basis set technique, one would prefer complete freedom in choosing the zeroth-order (as well as underlying coordinate system and geometry), while making minimal sacrifices regarding computational efficiency.
To this end, there are basically three possible directions of generalisation. One might hope to have the good fortune of finding other ‘analytical’ basis sets, taking some known model as the zeroth-order potential-density pair and then hoping that by some ingenious change of variables or integral transform a set of orthogonal higher-order functions could be written down. This approach is limited but has provided a handful of further results in both spherical polar coordinates (Hernquist & Ostriker 1992; Zhao 1996; Rahmati & Jalali 2009; Lilley et al. 2018b,a) and for infinitesimally thin discs (Kalnajs 1976; Qian 1993). Generally speaking, for both spheres and thin discs, basis sets exist for some double power laws and for certain types of exponential distributions of mass.
Secondly, one could posit an arbitrary sequence of nonorthogonal potential-density pairs, and from them derive an orthogonal set using the Gram-Schmidt algorithm. This is the approach of Saha (1993), Robijn & Earn (1996). The downsides are the large number of expensive numerical integrations required to compute the required inner products, the numerical instability inherent to the Gram-Schmidt process, and the uncertain completeness or convergence properties of the resulting orthogonal basis.
Lastly, the strategy devised by Weinberg (1999), Petersen et al. (2022b) generalises the original result of Clutton-Brock (1973) directly by noticing that the potential-density relation takes the form of a Sturm-Liouville eigenfunction equation with a certain weight function; by choosing a different weight function and using a numerical Sturm-Liouville solver, a different set of eigenfunctions (and hence basis set) can be found. This approach has the upside that certain guarantees about completeness| and convergence can be made, but the downside that the resulting eigenfunctions must be tabulated numerically on a coordinate grid.
In this paper we describe a different generalisation of Clutton-BrockÆs original results – we jettison the eigenfunction equation but retain a three-term recurrence relation. Essentially our approach is motivated by the observation that the extant basis sets1 so far described in the literature admit the curious property of ‘tridiagonality’ with respect to a radial derivative operator. That is, for a given density basis function ρn(r) (suppressing the angular indices and coordinates), the following holds, (1)
where an, bn and cn are constants. This may seem to be merely a curiosity, but upon further reflection it motivates a far-reaching generalisation: armed with just knowledge of an arbitrary (smooth) zeroth-order basis element, the tridiagonality property (Eq. (1)) allows us to build up an entire ladder of basis elements recursively, using just one additional integral perrecur-sive step. The resulting basis elements are linear combinations of derivatives of the zeroth order and hence require no further interpolation. Along these lines, in Sect. 2 we present an algorithm to generate general basis sets from arbitrary zeroth-order potential-density pairs.
Underlying this main result is a link to the general theory of orthogonal polynomials, which motivates us to claim completeness of the resulting basis sets. This theoretical background is discussed in Sect. 3, where we introduce the Fourier-Mellin transform, and show a correspondence between tridiagonal orthogonal basis sets and orthogonal polynomials in the transformed space. Key to this link is the notion of the gravitational self-energy inner product, and an operator (𝒟) that is self-adjoint with respect to it.
The new approach was, in part, first suggested implicitly by Kalnajs (1976), who introduced the Fourier-Mellin transform (but not naming it as such) in the case of thin discs2, but nevertheless only used it to rederive the Clutton-Brock (1972) basis set. Those results are partially repeated (using our updated notation) in Sect. 4.2, where we show that a formalism equivalent to the spherical case exists for thin discs in cylindrical polar coordinates, along with a similar self-adjoint operator (𝒜).
As further motivation for our new algorithm, in Sect. 4 we demonstrate concretely how the formalism applies to some existing basis sets in the literature. Specifically we show that the two major families of basis sets – corresponding to double power laws in the spherical (Lilley et al. 2018a) and thin disc (Qian 1993) scenarios (along with their various limiting forms) – both possess the tridiagonality property, and hence each admit a representation in terms of a polynomial in 𝒟 or 𝒜, respectively.
In Sect. 5, we return to the general algorithm described in Sect. 2, and discuss the numerical and computational issues that arise when trying to implement it in practice. In particular it is necessary to find a fast, stable method to evaluate the requisite numerical integrals. This is most easily accomplished using Gauss-Laguerre quadrature in the transformed (Fourier–Mellin) space, first computing the underlying system of orthogonal polynomials. The recommended procedure is illustrated with the case of the isochrone model, which we use in Sect. 5.4 to recover a known result about unstable radial modes.
Finally in Sect. 6, we discuss some of the geometric ideas underlying the new formalism. We outline how our results might be extended to other geometries or coordinate systems relevant to the study of realistic galaxies, and give an outlook on future work to be done in the area.
2 Description of algorithm
First we make some new definitions as well as recapitulating the standard terminology. We define the ‘self-energy inner product’ 〈·, ·〉 on mass densities (2)
This is sometimes referred to as the mutual gravitational potential energy of ρ1 with respect to ρ2. Of course, the total self-energy is just ‖ρ‖2 = 〈ρ,ρ〉, which here must clearly always be real and positive (although the normal convention is for this quantity to be negative, the overall choice of sign is irrelevant for our purposes). It is important that Eq. (2) obeys the standard properties of an inner product: linear in its first and conjugate linear in its second argument. Generically we allow mass densities to be complex-valued, as it eases some of the following derivations; however the entire formalism (necessarily) also works in the case of purely real mass densities. We are also not limited to densities with finite total mass, only finite total self-energy3. Finally we note that if we have a solution to Poisson’s equation for ρ1 and ρ2, finding their gravitational potentials to be Φ1 and Φ2, then (using Green’s identities) we can rewrite the inner product (Eq. (2)) as (3)
We set the gravitational constant G = 1 throughout. Now we introduce both spherical polar coordinates (r, φ, ϑ) and cylindrical polar coordinates (R, φ, z), the latter being used here only in the situation where the mass density is confined to a thin disc aligned with the z-axis. We define two important operators, (5)
These have the important property of being self-adjoint with respect to the inner product (Eq. (2)) (see Appendix A for a proof), that is (7)
and (when f and ɡ are thin discs) (8)
Our standard notation for basis sets is as follows. We denote by {ρnlm}a complete basis for the set of smooth mass densities satisfying: (9)
The set {ρnlm} is assumed orthogonal with respect to Eq. (2), (10)
These basis functions are the product of radial and angular components, (11)
where Knl are constants factored out of ρnl just to simplify the expressions; and is the radial part of the Laplacian when operating on (radial functions) × (a spherical harmonic of order l): (13)
The purely radial functions ρnl (r) and Φnl(r) are real-valued, and satisfy a biorthogonality relation (14)
and it is for this reason that such basis sets are traditionally referred to as ‘bi-orthogonal’. We take Ylm throughout to be a unit-normalised (complex) spherical harmonic. If non-orthonormal spherical harmonics are employed then Nnlm must contain the appropriate factor that normalises them. The radial functions Φnl and ρnl are typically functions of the quantity r/rs, where rs is some ‘scalelength’ with units of length; we generally use rs = 1 implicitly4.
An analogous notational convention is used throughout for the case of a thin disc. We write {σnm} to represent a complete basis, where (15)
In an abuse of notation we suppress the z-dependence and elide the quantities which have subscript nm, writing the potential in the disc plane as (16)
We now describe a natural method for deriving basis sets with any smooth analytical zeroth-order element. We shall focus on the spherical case, and afterwards describe the (slight) changes required in the thin disc case.
The first step is to choose a suitable zeroth-order potential, which we denote Φ (r). This can be chosen according to the problem at hand, the only requirements being that it must be a smooth spherically symmetric function of r, and the potential-density pair must have finite total gravitational self-energy. Starting from Φ we must then invent a function Φ0l(r) that provides the zeroth radial order for the higher multipoles indexed by l. This function must satisfy two boundary conditions5: Φ0l ~ rl as r → 0 and Φ0l ~ r−l−1 as r → ∞. One way to achieve this is to take (17)
but any choice with the correct asymptotic behaviour will do just as well6. Once Φ0l is chosen, the corresponding density multipoles ρ0l are fully determined by (18)
where K0l is an arbitrary constant chosen to simplify the algebra.
The defining relation for the basis set with zeroth order ρ0l is the differential-recurrence relation, (19)
with initial conditions ρ−1,l = 0, and where βnl are some (as yet undetermined) constants. Note that the operator applied to the ρnl term on the RHS is equal to −i𝒟 (Eq. (5)). We can immediately write down a similar recurrence for the potential elements, (20)
due to the commutation relation between 𝒟 and the radial Laplacian (see Appendix B). By taking the inner product of Eq. (19) with both ρn+1,l and ρn−1,l, and exploiting the self-adjointness property (Eq. (7)), we find that the constants βnl are given by (21)
This is just the ratio of the gravitational self-energy of the nth and (n − 1)th basis elements. Because the RHS of Eq. (19) depends only on the nth and lower elements, we can now build up the entire sequence of basis elements by alternating applications of Eqs. (19) and (21).
This deceptively simple algorithm leaves some unresolved issues: firstly, whether these basis sets are truly complete; secondly, how we deal with the differentiation required in Eq. (19); and thirdly, whether the numerical integrals in Eq. (21) are stable.
We can give at least convincing heuristic answers to these questions. The question of completeness we consider in the course of the theoretical discussion in Sect. 3.2. The repeated differentiation will in general require some form of ‘symbolic’ or ‘automatic’ differentiation, which we discuss in Sect. 5.3 –unless the specific form of the zeroth-order allows for a simplification. The question of numerically calculating the recurrence coefficients βnl is thorny, and we return to it in Sect. 5 after developing in Sect. 3 the theoretical machinery that links these basis sets to the theory of general orthogonal polynomials.
Our resulting basis elements are linear combinations of the higher-derivatives of the zeroth-order functions: {𝒟nρ0lm} in the case of the density, and {𝒟nΦ0lm} in the case of the potential. This means that, given a closed form zeroth-order, all higher elements are generated through differentiation – no numerical interpolation is required, unlike Weinberg (1999)’s algorithm based on Sturm–Liouville eigenfunctions. In fact, given a particular zeroth-order, a basis computed via the Sturm–Liouville approach will not in general coincide with the basis set developed from our own algorithm, except for certain special cases that are known to obey eigenfunction equations (for example the Zhao 1996 basis sets).
In addition, unlike Saha (1993), we are able to avoid the brute force approach of Gram Schmidt orthogonalisation (with complexity O(n2) in the number of inner products, and uncertain numerical stability). This is due to the self-adjointness of the operator 𝒟, which ensures that each basis set maps onto an underlying orthogonal polynomial in Fourier-Mellin space, a mathematical connection elaborated upon in Sect. 3. Thus we can reuse the large body of literature regarding the construction of general orthogonal polynomials, the most important property being that any set of orthogonal polynomials obeys a three-term recurrence relation - this relation is transferred over to the basis set, manifesting as the differential-recurrence relation (19).
Lastly we note that in the case of a thin disc the surface densities σnm have fundamental differential-recurrence relation (22)
where the operator applied to σnm on the RHS is now -i𝒜 (Eq. (6)); but the algorithm is otherwise identical to the spherical case. In both the spherical and thin disc case the algorithm can be initialised by choosing either the zeroth-order potential or the zeroth-order density; but starting with a density may be more difficult in the thin disc case, as analytical potential-density pairs are harder to come by the required boundary conditions on ψ0m (with azimuthal index m standing in for l) are unchanged, as is the requirement of smoothness and finite self-energy.
3 Theoretical background
3.1 Functional calculus of 𝒟 and the Fourier–Mellin transform
Consider the eigenfunctions of 𝒟, which we denote Ψs. These satisfy (23)
We combine this with a spherical harmonic to define the ‘𝒟-eigenbasis’ (25)
Now let F(r) be a general mass density, and F1m(r) its spherical multipole moments. Then the expansion coefficient of F in the 𝒟-eigenbasis is (see Appendix C for proof) (26)
where Kl(is) is defined through (27)
and ℳr is the Mellin transform, (28)
We subsequently refer (with some precedent) to the combination (Eq. (26)) of taking multipole moments and a Mellin transform as the three-dimensional ‘Fourier–Mellin transform’. We can re-express F in terms of its Fourier–Mellin expansion coefficients using the Mellin inversion theorem (via an appropriate change of variable), (29)
where the inverse Mellin transform ℳ−1 is (30)
for some constant c, the choice of which does not affect any of our results. The mutual gravitational energy of two general mass densities F and G can therefore be expressed as (31)
Because 𝒟 is self-adjoint the spectral theorem applies, and we can consider arbitrary bounded complex-valued functions of 𝒟. The Fourier–Mellin transform can be viewed as the (unitary) map to the space in which 𝒟 acts as a multiplication operator. In practice though, we can limit ourselves to considering polynomials in 𝒟.
The formalism developed above also applies mutatis mutandis to the thin disc case. The derivation is now mostly the same as that found in Kalnajs (1971,1976), but we update his notation. Our self-adjoint operator A has eigenfunctions Σs satisfying (32)
The functions Σsm(R) are Kalnajs’ ‘logarithmic spirals’7. For a general razor-thin mass density σ(R) we have the thin disc version of the Fourier–Mellin transform (see Appendix C.2 for proof), (33)
where σm(R) are the cylindrical multipoles of σ(R), and (34)
3.2 Tridiagonality and polynomials
Associated to each of our basis sets is a polynomial we refer to as the ‘index-raising polynomial’ – depending on the normalisation we write either ρn1(s) or ρnl(s) (for a polynomial of degree n in the variable s). The general result proved below is that applying the nth-degree polynomial with argument 𝒟 to the zeroth-order density element gives the nth-order density element. It may help with interpretation to note that these polynomials in a sense ‘live’ in the Fourier–Mellin space introduced in the previous section.
There are several related statements that one can make about a given basis set and its associated index-raising polynomial:
The tridiagonality of the density basis functions {ρnlm} with respect to the operator 𝒟;
The expressibility of each basis function in terms of a polynomial in 𝒟 applied to the lowest-order basis function, with these polynomials obeying a three-term recurrence relation;
The orthogonality of ρnl(s) with respect to a weight function wl(s) given in terms of the Mellin transform of ρ0l;
The orthogonality of the basis functions {ρnlm} with respect to the self-energy inner product 〈· , ·〉.
Below we show that the first and second statements are equivalent. We also find that the third statement implies the fourth, and the second and fourth together imply the third. However, while it is easy to show that the third statement implies the second, the converse is much harder. Favard’s theorem guarantees that a set of polynomials obeying a three-term recurrence relation is orthogonal with respect to some measure, however this is a difficult computation and is not what we actually want. In practice we want the freedom to specify zeroth-order basis elements, not the recurrence coefficients themselves.
Therefore, to construct an arbitrary basis set we impose the first and fourth statements. Then the second and third statements (which provide the polynomials Pnl(s) or pnl(s)) are a useful representation of the underlying basis set, which we exploit in order to solve numerical issues in the implementation described in Sect. 5.
The idea of finding orthogonal polynomials from tridiagonal matrices or operators is not new; in the finite-dimensional case the corresponding matrix is called a ‘Jacobi matrix’, and gives rise to polynomials of discrete argument. Our work invokes the infinite-dimensional case, in which a ‘Jacobi operator’ (here 𝒟 or 𝒜) operates on an infinite sequence of functions, which we generally assume to be a complete orthogonal set that spans the relevant function space. Such infinite-dimensional Jacobi operators are studied in Granovskii & Zhedanov (1986), Ismail & Koelink (2011), Dombrowski (1985), and our set-up mimics the development given in the first paper, with the difference that our 𝒟 and 𝒜 are taken as given and do not arise from any Lie algebraic considerations8.
As in the previous section, we give the main derivations in the case of spherical polar coordinates; the thin disc case then follows with little modification.
3.2.1 Polynomials from tridiagonality
We show that any set of densities {ρnlm} that is tridiagonal with respect to D gives rise to an expression for each ρnlm in terms of an index-raising polynomial in 𝒟, of the form (35)
By ‘tridiagonality’ we mean that the following expression holds, (36)
for some constants anl, bnl and cnl. First, define (37)
From Eq. (36), there exists an expansion of χnlm of the form (38)
Then by inverting Bnjl(interpreted as a matrix with respect to the n j indices) it is evidently possible to write an expansion for ρnlm of the form (39)
To prove that ρnl(s) is a polynomial, take the Fourier–Mellin expansion of Eq. (39), (41)
where the second equality uses the self-adjointness property (Eq. (7)) as well as the definition of the eigenbasis (Eq. (25)). Dividing through by 〈Ψslm,ρ0lm〉 then gives Pnl(s) as a polynomial in s with (as yet undetermined) coefficients Anjl. But from the definition (Eq. (39)), we see that Pnl(𝒟) is just the operator expression for ρnlm that raises the radial index from 0 to n, which is Eq. (35). To find the three-term recurrence relation for Pnl(s), take the Fourier–Mellin expansion of Eq. (36), divide through by 〈Ψslm,ρ0lm〉, and rearrange, giving (42)
For the converse statement, substituting 𝒟 for s in the above recurrence and left-applying to ρ0lm trivially recovers the tridiagonality property (we must also take the initial conditions P0l = 1 and P−1l = 0).
3.2.2 Orthogonal polynomials
From Favard’s theorem we know that the Pnl(𝒟) are a system of orthogonal polynomials, as they satisfy a three-term recurrence relation (42). However, in order to actually construct the orthogonalising weight function, we first assume that the underlying basis functions are already orthogonal. It follows that (43)
where the (positive, real-valued) weight function wl(s) is related to the zeroth-order density basis function ρ0l by (44)
the proof of which is in Appendix D. The orthogonality relation (43) works in both directions: if we instead assume that Pnl(s) are orthogonal with respect to (a given) wl(s), then the orthogonality of the ρnlm follows.
In fact, Pnl(s) can be written in terms of purely real polynomials pnl(s), which are also orthogonal with respect to (45)
It is often more convenient in applications to deal with these real-valued polynomials. Without loss of generality (up to normalisation) we can take the polynomials pnl(s) to be monic9, obeying a three-term recurrence relation (47)
In this way we only have to consider the single sequence of recurrence coefficients βnl. According to this normalisation the Pnl(s) therefore obey the recurrence (48)
Replacing s with 𝒟 and applying to ρ0l on the right then leads to the defining recurrence for the density basis elements (Eq. (19)). Alternatively we can express the density and potential directly in terms of pnl(s), (49)
3.2.3 Disc case
As expected, similar results apply in the case of thin discs. Take {σnm} to be a set of infinitesimally thin surface densities that are tridiagonal with respect to the operator 𝒜 and orthogonal with respect to 〈·, ·〉. We have index-raising polynomials Pnm(s), defined by (50)
This gives rise to a representation of the basis functions via repeated application of the operator 𝒜, (51)
The orthogonality relation can be written (52)
The details of this derivation are in Appendix D.2. As before we can instead use real-valued polynomials pnm(s), also orthogonal with respect to Ωm(s). The potential and surface density in terms of pnm(s) are (54)
In general it is difficult to find the z-dependence of the potential for thin discs analytically, although in exceptional cases there may be a simple solution (for example the Kuzmin-Toomre discs). In any case because pnm(𝒜) acts by differentiation with respect to R alone, this guarantees that if the z-dependence of the zeroth-order potential is known, then the correct the z-dependence is preserved in all higher-order potential basis elements. This will have important implications when considering the extension of our results to the Robijn & Earn (1996) method for thickened-disc basis sets, however we do not pursue this in the present work.
3.3 Completeness
We make some informal comments about the completeness of a general basis set {ρnlm}, derived from a zeroth-order ρ0l(r) as described above. The completeness of the angular part of each basis (the spherical harmonics) is taken as given.
The question then of whether a set {ρ0l, 𝒟ρ0l, 𝒟2ρ0l,…} forms a complete basis for (the th multipole of) the space of mass densities is the same as asking whether ρ0l is a ‘cyclic vector’ for the operator 𝒟. This is related to the completeness of the associated orthogonal polynomials pnl(s), as powers of 𝒟 correspond to powers of s; so we require that the monomials sn (weighted by wl(s)) form a complete basis for functions on the interval (−∞, ∞). This is achieved if wl(s) is nonzero everywhere. By the definition of wl(s), this then requires the Mellin transform of ρ0l to be nonzero everywhere, which in turn requires that 𝒟nρ0l be non-vanishing everywhere for all n (Marín & Seubert 2006).
Therefore, to be a valid zeroth-order density, ρ0l must fulfil the following: (55)
These conditions are required to hold for all n ∈ ℕ; restricting to n = 0 gives the conditions (9) on ‘representable’ mass densities. While these conditions are fairly restrictive, in general any reasonable analytical potential-density pairs will satisfy them; in particular those described in the following section whose corresponding basis sets or index-raising polynomials have closed form expressions.
4 Application to known basis sets
In Sect. 3, we developed a theoretical justification for the simple algorithm described in Sect. 2. We now provide further motivation by applying the formalism to some concrete examples of basis sets from the literature. Remarkably, all known analytical spherical (resp. thin disc) basis sets of infinite extent have a representation in terms of 𝒟 (resp. 𝒜). In fact, it is extremely theoretically suggestive that these previously described analytical basis sets have index-raising polynomials that can be written in terms of known classical orthogonal polynomials. The expressions we derive below for the various basis sets’ index-raising polynomials may appear complicated; however the presence of a classical polynomial indicates simply that in each case the recurrence coefficient βnl (Eq. (19)) can be written as a rational combination of the given basis set’s fixed shape parameters.
4.1 Spherical case
4.1.1 Clutton-Brock’s Plummer basis set
The simplest possible useful basis set in spherical polar coordinates is that of Clutton-Brock (1973), which uses the Plummer (1911) model as its zeroth-order. By making an appropriate variable substitution, Clutton-Brock transformed the Poisson equation for the radial components (Eq. (12)) into the defining second-order differential equation for the Gegenbauer polynomials (DLMF, Sect. 18.8). Each radial density and potential component is proportional to just one polynomial, (56)
and the normalisation constant is10 (57)
This basis set is in fact a special case of the family described in Sect. 4.1.2, but as it is the simplest (and earliest) of all the spherical basis sets we present it in some depth as a didactic example.
Plugging into the definition of the weight function (Eq. (44)), we find that (58)
where w(x; a, b) is the weight function for the continuous Hahn polynomials pn(x; a, b) (Appendix E.1). It can be verified that each basis element and indeed Fourier-Mellin-transforms into a single continuous Hahn polynomial. Specifically, we find that (59)
Looking at the definition of a continuous Hahn polynomial (Eq. (E.1)), we find a hypergeometric function that terminates after n terms, but where the argument s appears as a ‘parameter’. Given how this relates to the definition of the density elements, this means that (60)
where the operator 𝒟 alarmingly also appears as a parameter; each term in the sum is proportional to a Pochhammer symbol whose argument involves 𝒟. However these are unproblematic to evaluate, as they expand to (61)
and each occurrence of r∂r then operates to the right on (r) in the expected fashion. The index-raising polynomials (of argument 𝒟 or 𝒜) in the remainder of this section are evaluated in a similar way.
4.1.2 The double power law basis sets
Practically all known double power law basis sets in spherical polar coordinates are contained within one super-family described in Lilley et al. (2018a, containing within it the basis sets of Clutton-Brock 1973; Hernquist & Ostriker 1992; Zhao 1996; Rahmati & Jalali 2009; Lilley et al. 2018b). There are two free parameters (α and v) controlling both the asymptotic power law slope and turnover. We refer to the expressions given in Lilley et al. (2018a) for the potential, density and normalisation constants (Eqs. (30)–(33) of that work), and label them with the superscript LSE. The zeroth-order has , and as r → ∞. Inserting into the definition of the weight function (Eq. (44)) and writing µ = α(1 + 2l), we find that (62)
which is again proportional to a continuous Hahn weight function (Appendix E.1). Explicitly for the index-raising polynomials we have (63)
4.1.3 The cuspy-exponential basis sets
These basis sets were not mentioned in Lilley et al. (2018a) and are therefore newly presented in the literature11; but they are a straightforward derivation from the double power law result, obtained by letting the parameter v and the scalelength simultaneously tend to infinity. The result is a family of basis sets with both an exponential fall-off and a central cusp in density, both controlled by the parameter α − hence the nickname ‘cuspy-exponential’ (CE). The lowest order density function is . Important cases are α = 1/2 which gives a Gaussian, and α = 1 which is a density familiar to chemists as the Slater-type orbital. We use the superscript CE for these basis functions. The density and potential are (with µ = α(1 + 2l)) (64)
where γ(µ, z) is the (lower) incomplete Gamma function and (z) is a Laguerre polynomial. The relevant constants are (65)
We can apply the limiting procedure directly to (s), and the calculation is simpler than for the basis functions themselves. The operator 𝒟 does not depend on the scalelength, and hence is unaffected by the limiting procedure. So we need only consider the limit in v. The result is proportional to a Meixner-Pollaczek polynomial (z; ϕ, Appendix E.2), (66)
4.2 Thin disc case
4.2.1 Clutton-Brock’s Kuzmin-Toomre basis set
The Kuzmin–Toomre model (Kuzmin 1956; Toomre 1963) is the simplest power law for the infinitesimally thin disc. This model provides the zeroth-order for a basis set introduced by Clutton-Brock (1972). This basis set turns out to be a special case of Qian’s family (Sect. 4.2.2), but here at least we can write down simple expressions in terms of a single Gegenbauer polynomial (Aoki & Iye 1978), so it is worth recording the results separately. The density and potentials in the plane are (67)
and the normalisation constant in the orthogonality relation is (68)
The corresponding Fourier-Mellin weight function (Eq. (53)) is then (69)
which is proportional to the weight function for a Meixner-Pollaczek polynomial (Appendix E.2) with parameters λ = m + 1/2 and ϕ = π/2. So the index-raising polynomials have a simple expression in terms of the Meixner-Pollaczek polynomials, (70)
4.2.2 Qian’s k-basis sets
The family of basis sets introduced by Qian (1993) is a generalisation of Clutton-Brock (1972), allowing for an arbitrary generalised Kuzmin–Toomre model to be the zeroth order.
That is, the zeroth-order density functions are (using the superscript Q) (71)
Here B(x, y) = Γ(x)Γ(y)/Γ(x + y) is the standard Beta function, and the prefactors have been chosen so that all derived expressions are compatible with those in Qian (1993). The higher-order potential and density functions that Qian provides are given in terms of very complicated recursion relations, that are only valid when k is an integer. However there is no such limitation in our representation. The weight function is proportional to that for a continuous Hahn polynomial pn(s; a, b, Appendix E.1), and so the index-raising polynomial is (72)
We therefore have closed form expressions for and , that are valid for all real values of k, as long as the zeroth-order model has finite total self-energy. The original Clutton-Brock (1972) basis set (Sect. 4.2.1) is recovered when k = 0. The normalisation constant for the orthogonality relation can be derived from that of the continuous Hahn polynomials, and is (73)
4.2.3 Qian’s Gaussian basis set
A Gaussian density profile is another plausible model for the density of a galactic disc, and such a basis set was also studied by Qian (1993). Just as we derived the cuspy-exponential basis sets of Sect. 4.1.3 from the double power law result by taking the infinite limit of the shape parameter v, it turns out that Qian’s basis set for the Gaussian disc can be derived by taking the limit k → ∞ in the corresponding expressions (Eq. (71)) for the generalised Kuzmin-Toomre basis set of Sect. 4.2.2. The zeroth-order density and potential are (using the superscript G) (74)
The function denoted 1F1 is a confluent hypergeometric (Kummer) function, that reduces to combinations of modified Bessel functions for any given m. At zeroth-order we have the well-known result that the potential of a plain Gaussian disc involves a single modified Bessel function,
Again Qian gives the higher-order potential and densities only as complicated recursion relations. However, explicit expressions follow upon taking the limit k → ∞ in Eq. (72). We find that (75)
where is a Meixner-Pollaczek polynomial (Appendix E.2). Then Eqs. (74) and (75) can be combined to find (76)
which works because the factor of cancels out in 𝒜; there is a similar expression for the potential functions.
4.2.4 Exponential disc
Interestingly, there is another thin disc model which has classical index-raising polynomials: we briefly sketch the derivation for an exponential disc.
We require all density components to fall off exponentially like e−R but also to behave like an interior multipole as R → 0, so as a zeroth-order ansatz for the density we take simply (77)
This gives a weight function (via Eq. (53)) proportional to that for a continuous Hahn polynomial12. Thus the index-raising polynomials can be written down explicitly as (78)
along with closed form expressions for the recurrence coefficient and normalisation constant. The remaining complication is the zeroth-order potential. The m = 0 case is awkward but classical (Binney & Tremaine 1987, Ch. 2) and uses modified Bessel functions, (79)
Deriving expressions when m > 0 is trickier – we give the details in Appendix F –but it can be accomplished with the following differential-recurrence relation: (80)
Some examples of the potential basis elements are plotted in Fig. 2.
5 Numerical implementation
At the end of Sect. 2 we mentioned the main obstacles to the effective implementation of the new algorithm – primarily the numerical stability when computing the coefficients βnl, but also the need to compute repeated radial derivatives of the zeroth-order elements.
For the recurrence coefficients βnl the difficulty is that naively computing the integrals (Eq. (21)) become computationally expensive very quickly with increasing order n (and to some extent also with l). Therefore it is essential to pick a numerical integration method that is fast without sacrificing accuracy. Unfortunately due to the total freedom in choice of zeroth-order ρ0l, it is difficult to find a quadrature scheme for the integrals (Eq. (21)) that is optimal in general.
Fortunately, due to the link to the polynomials pnl(s) developed in Sect. 3, we can take advantage of the extensive literature on the construction of general orthogonal polynomials. Following Gautschi (1985) we have two options: either the discretised Stieltjes procedure or the modified Chebyshev algorithm. As it happens, computing the recurrence coefficients naively as in Eq. (21) is directly analogous to using the discretised Stieltjes procedure, except that now we perform the integrals in Fourier– Mellin space. This turns out to be the better option numerically, as the modified Chebyshev algorithm runs into floating point issues sooner due to catastrophic cancellation of terms. However, for completeness we describe both algorithms (Sects. 5.1 and 5.2). We also discuss computer-assisted techniques for performing the repeated differentiations (Sect. 5.3).
All these methods are illustrated throughout for a basis set constructed to have the isochrone model (Henon 1959) as its zeroth-order, and we follow up the numerical discussion with a demonstration of the validity of the isochrone-adapted basis set (Sect. 5.4); however the underlying methods we describe are applicable to any suitable zeroth-order model. The potential, density and polynomial weight function for the isochrone model are as follows: (81)
The precise l-dependence of these expressions is of course arbitrary to some extent, but we have made a suitable natural choice.
5.1 Discretised Stieltjes procedure
The sequence of recurrence coefficients βnl that we need to compute can be expressed as the ratio of two integrals, (82)
and so for each higher n we need one additional evaluation of Inl. Evaluations of βnl alternate with applications of the recurrence relation (47) to find the next basis element ρnl. Once sufficient βnl have been found, the potential or density functions ρnlm and Φnlm, can be evaluated via their own recurrences as described in Sect. 2.
The difficulty then is in finding an appropriate strategy to compute the integrals Inl. We opt to evaluate them in Fourier– Mellin-space, using the polynomials pnl(s) directly, and making use of the fact that the integral can be written (83)
Therefore, the first step is to determine the weight function wl(s). This can be found in terms of the (Fourier-)Mellin transform of either the zeroth-order potential or the density, (84)
The Mellin transform is perhaps one of the less familiar integral transforms, but in practice a wide variety of Mellin transforms can be found in closed form (helped especially by computer algebra systems), in part because with a logarithmic change of variable it can be written as a Fourier transform. All the polynomial weight functions considered in this paper can be found symbolically using MATHEMATICA13. Numerical evaluation of the Mellin transform is also an option – by transformation to the Fourier transform and approximation using fast Fourier transform methods – however we do not pursue this further in the present work.
Now we consider the asymptotic behaviour of the weight function wl(s) as s → ±∞. The smoothness requirement (Eq. (9)) on ρ0l forces wl(s) to decay faster than any power of s, that is, at least exponentially. We expect that (85)
so we need to determine the decay constant a. In the case of our isochrone basis set this asymptotic behaviour is derived from the behaviour of the complex gamma function at infinity (DLMF, Sect. 5.11.9), giving wl(s) ~ |s|−1 e−π|s|, or a = π. When wl(s) can be written down, it is usually simple to read off the decay constant a; for example, the double power law basis sets (Sect. 4.1.2) have a = α.
The at-least-exponential decay of the weight function suggests that the appropriate discretization scheme for Eq. (83) is Gauss–Laguerre quadrature. To implement this for the isochrone case, rewrite Eq. (83) to pull out a factor of e−πs, and use the symmetry of the integrand to change the domain of integration to (0, ∞) (defining x = πs), (86)
We can then implement Gauss-Laguerre quadrature of order v, as a weighted sum over evaluation points xjv, which are the roots of the vth Laguerre polynomial Lv(x): (87)
The quadrature rule of order v integrates polynomials exactly up to order 2v − 1, so to compute Inl with the isochrone weight function we would expect to need at least v ≥ n. (An acceptable rule of thumb is that Inl requires v = max(n + l, 10).) It may be necessary to compute the weights wjv and roots xjv to a higher order of precision internally using arbitrary-precision arithmetic, but this is not a bottleneck in practice – and typically Gauss-Laguerre quadrature is implemented as a library function whose implementation details are hidden. In this way we can get, for example, 50 orders of βnl to floating-point precision in under a tenth of a second using one core of a modern CPU.
The radial parts of some examples of potential elements in the isochrone basis set are plotted in Fig. 1.
Fig. 1 Radial parts of the isochrone potential basis for n = 0, 1, 2, 3 and l = 0 (top) and l = 1 (bottom). The potentials have been unit-normalised. |
5.2 Modified Chebyshev algorithm
This is an alternative method described in Gautschi (1985), which we find to be less numerically stable in practice. However we describe it here for completeness, as it may yet find some usefulness (for example to facilitate finding exact expressions for the recurrence coefficients in certain cases).
Gautschi’s modified Chebyshev algorithm prescribes the ‘modified moments ‘14 (88)
Here, are some auxiliary set of (monic) polynomials, orthogonal with respect to a symmetric measure on the interval (−∞, ∞), and obeying a three-term recurrence relation (89)
By symmetry this means that are nonzero only for even k. In principle the choice of auxiliary polynomial is wide open, but the obvious choice in our case (for ease and stability of computation) is the monic Hermite polynomials , for which . We can then proceed to find the ‘mixed’ moments (90)
via a system of recurrence relations that produces the desired recurrence coefficients βnl as a byproduct: (91)
In practice (for our isochrone basis set) we find that σjkl suffers from catastrophic cancellation beyond approximately j = 20. Alternatively if the modified moments are known in closed form then this method is convenient for finding ‘exact’ recurrence coefficients. This turns out to be the case for the isochrone basis set, for which see Appendix G.
Fig. 2 Radial parts of the exponential disc basis for n = 0, 1, 2, 3 and m = 0 (top) and m = 1 (bottom). The potentials have been unit-normalised. |
5.3 Repeated differentiation
There are three classes of algorithm for computer-assisted differentiation: 1. finite-differencing, 2. symbolic differentiation, and 3. automatic differentiation. The first of these we can discount pretty much immediately as being wildly numerically unstable and expensive compared to the other two.
The second, symbolic differentiation via computer algebra, is potentially competitive at low expansion orders, but it is hard to predict the degree of blow-up in the number of algebraic terms. It depends strongly on the precise form of the function that is being differentiated. In practice we find that efficient application of symbolic differentiation at high expansion orders requires alternating between differentiation and algebraic simplification.
Of course, one may also attempt symbolic differentiation by hand, attempting to find simplifications that reduce the tower of applications of 𝒟n to a simpler form – whether this is possible also depends on the form of Φ0l and ρ0l. Many of the basis sets considered in Sect. 4 have simple closed forms (at all orders) due to fortunate simplification in repeated differentiation. For example, taking the double power law basis (Sect. 4.1.2) with parameters α = 1/2, v = p − 3/2 and n = l = 0 (and labelling each density function with p), we have (92)
Using this identity in Eq. (63) then leads (after some further simplification) to a known closed form expression for . However it is likely to be difficult to find easy differentiation formulas in general. For our isochrone basis set, the method we give in Appendix G for computing the modified moments can be adapted to find expressions for the higher-order derivatives, but the result is complicated and of dubious numerical stability.
The third method, automatic differentiation (AD) is what we find to be most competitive in practice. This is a general term referring to a class of algorithms implemented entirely at the software library level, that provides an evaluation of the derivative at a single point given only knowledge of the chain rule and the differentiation rules for primitive arithmetic operations and standard mathematical library functions. Essentially, the function to be differentiated is written in ordinary code, and the AD algorithm ‘automatically’ deduces the correct sequence of chain rule steps to carry out. For our purposes we require higher-order derivatives; while applying an AD algorithm to itself works in principle (and often works in practice) it is very inefficient, as the AD logic itself must be differentiated. It is better to use an AD implementation that natively understands higher-order derivatives.
As we are coding in the JULIA programming language, we use a suitable library called TAYLORSERIES.JL (Benet & Sanders 2019). A special variable t(N) is instantiated that represents the first N terms of an (abstract) Taylor series. Given a point r0, we can use t + r0 as the argument of any ordinary mathematical function15; the result is the first N coefficients of the Taylor series around r0 that approximates that function. For example, setting N = 3 and r0 = 1.0 and using the potential of the isochrone model (Eq. (81)) as our function, the computer prints a data structure representing the following truncated Taylor series,
When it comes to the actual implementation, we have two choices, which we find to have similar efficiency in practice. The first option begins with computing the vector of derivatives (at a point r0) up to some maximum order N all in one go, (93)
In fact, because 𝒟 can be expressed as a single differentiation with respect to a transformed variable (via r d/dr = d/ds, where s = log r), Vl can be obtained directly from a single N-term Taylor series evaluation. Separately, we derive from βnl the matrix elements (Al)nj = Anjl in the expansion (94)
To evaluate a vector of potential functions at a single point, (95)
we perform the contraction Φl= Al · Vl. At each different point r1, we have to re-compute V but not A.
The second option is to use the recurrence relation directly (Eq. (19)) or (Eq. (20)). Because we know ahead of time that we want N iterations of the recurrence relation, we set up the Taylor series t(N), and use r0 + t(N) as the dependent variable. The length of the series then shrinks as we go up the ladder of basis function evaluations. In practice this second method seems to be marginally slower than the first one, as more operations on the abstract Taylor series need to be performed.
5.4 Unstable modes of a spherical system
It is important to check whether a basis set constructed according to the prescriptions of Sect. 5 actually works in practice. One simple approach might be to just construct n basis functions, and integrate up the n × n square of inner products, testing whether orthogonality is achieved to a given floating-point precision. However, we know that it is possible to construct basis sets that are genuinely orthogonal but whose expansions of realistic mass densities fail to converge in practice, or display other undesirable numerical effects16. Therefore we choose to demonstrate the validity of our approach by reproducing a physical result from the literature – the unstable radial mode of the isochrone model.
We use the discretised Stieltjes method described in Sect. 5.1, where the basis set is adapted to the isochrone model at zeroth order. However the specific adaptation is not the crucial part; for this particular application only the perturbing density needs to be accurately resolved by the basis elements, so the key feature required of the basis set is only that it has the correct asymptotic behaviour. To this end, we adapt the code and method of Fouvry & Prunet (2022) to show that the same unstable mode is recovered by our isochrone-adapted basis set. The part of the code that implements the basis set has been provided online in a GIT repository17.
The details of the computation can be found in Fouvry & Prunet (2022). In brief, we start with knowledge of an isotropic distribution function that solves the collisionless Boltzmann equation for the isochrone potential. We also have the corresponding action and angle coordinates (J, w) as a function of position and momentum, which for the isochrone potential are known in closed form. Then, each potential basis element must be Fourier-transformed with respect to the angle coordinates (96)
out of which a matrix M is formed18, (97)
where the Rn(s) represents the collisionless Boltzmann operator for a perturbation with growth rate proportional to est. The unstable growing mode then corresponds to a solution A of the matrix equation (98)
with the vector of coefficients A = (Anl) giving the expansion of the mode δΦ with respect to the basis {Φnlm}, (99)
A plot of this mode is shown in Fig. 3. The maximum expansion orders were nmax = 6 and lmax = 2, with a scale length of rs = 1 and a maximum resonance number of . All our other integration parameters are identical to those in Fouvry & Prunet (2022, Appendix C), where a matching result was obtained using the Clutton-Brock (1973) basis set with nmax = 100 and rs = 20 – the mode shape also agreeing with the original result of Saha (1991). As mentioned previously, it is not strictly necessary to exactly match the zeroth-order element of the basis set to the underlying equilibrium model. However the basis elements must have the correct asymptotic behaviour, so using the isochrone-adapted basis set guarantees that this condition is satisfied. Nevertheless, our results do hint that accurate mode recovery may be possible with many fewer basis elements when the basis is suitably adapted, although we hesitate to draw any firm conclusions until a more systematic comparison can be drawn.
Calculating the matrix M is very computationally expensive, as it requires multiple truncated infinite summations, over several indices (n, l, and the vector of wavenumbers n). It also requires two nested integrations, as the Fourier transform (Eq. (96)) must also be performed numerically. In the general non-isochrone case a third level of integration is required, because the action and angle coordinates are no longer known in closed form. Any method of reducing this computational effort is therefore desirable. It is possible that judicious choice of basis elements and application of their differential-recursion relation Eq. (20) may ameliorate these calculations, but further investigation is needed.
6 Discussion and conclusions
We have reformulated the study of bi-orthogonal basis sets using the language of Fourier–Mellin transforms. This unexpected development unifies many previous results into a coherent theoretical framework. The general idea of generating new potential-density pairs from old by differentiation is not entirely new. Traditionally this is accomplished by differentiating with respect to the model’s scalelength – in particular, Aoki & Iye (1978) found compact expressions for the thin disc basis of Clutton-Brock (1972) by repeatedly applying the operator a∂a (for a the scalelength) and orthogonalising the resulting sequence of potential-density pairs by the Gram-Schmidt process. |Subsequently de Zeeuw & Pfenniger (1988), in the course of deriving a series of ellipsoidal potential-density pairs, noted that the operators r∂r and obey an important commutation relation (which we re-derive in Appendix B). Therefore Aoki & Iye (1978)’s result – and by extension our algorithm presented here – can be expressed in terms of the coordinates alone, without reference to an arbitrary scalelength.
The formalism developed in Sects. 2–3 deserves some further interpretation. In particular, the operator 𝒟 on which the whole development hinges may appear to have been plucked out of thin air, but it is in fact no accident: 𝒟 is precisely the inflnites-imal generator of the scaling symmetry of the self-energy inner product (Eq. (2)). To see this, let St be a ‘radial scaling’ operator, (100)
As is immediately evident from dimensional analysis, this preserves the self-energy, (101)
The operator 𝒟 is now defined in terms of the infinitesimal generator of St, (102)
Differentiating Eq. (101) with respect to the parameter t, it is immediately evident that 𝒟 is self-adjoint19. In Sect. 3.1, we implicitly invoked Stone’s theorem from functional analysis to provide a Fourier-like transform whose integral kernel is the eigenfunction of a self-adjoint operator. In our case the operator is 𝒟, the eigenfunction is Ψs (Eq. (24)), and the resulting integral transform is exactly the radial part of the Fourier-Mellin transform that we defined in Eq. (26). The spherical harmonics arise from a similar argument applied to the generators of the coordinate rotations20.
This line of reasoning suggests that it may be worthwhile to look for other symmetries of the self-energy inner product, perhaps arising from other coordinate systems or geometries in which the Laplacian separates. Given a set of three mutually commuting operators arising from three symmetries of the self-energy, we would expect to be able to construct a basis set formalism similar to that of the present work. To sketch out what this looks like in full generality, let τ be a suitable self-adjoint operator according to the criteria just described (restricting to one spatial dimension for the sake of discussion). Then the self-adjointness condition (Eq. (7)) combined with the properties of the inner product (Eq. (2)) implies that (103)
where τ* is the Hermitian adjoint of τ with respect to the ordinary inner product on L2 functions (Eq. (A.3)). If we suppose further that we have found a set of orthogonal potential functions {Φn}, with an index-raising polynomial pn(s) such that (104)
then the associated density functions (obeying ) are given by (105)
There are further simplifications involved in Sect. 3, which come about essentially because 𝒟 = 𝒟* + c for some constant c, which means that the eigenfunctions of 𝒟 and 𝒟* are the same up to a constant shift in the eigenvalue. Generically we would expect a different relationship between τ and τ*.
The task remaining, which we leave to future efforts, is therefore to classify the symmetries of the self-energy inner product, in order to develop expansions that are usefully adapted to different coordinate systems and geometries. In a sense, the ‘holy grail’ would be the construction of an expansion adapted to the confocal ellipsoidal coordinate system, appropriate for studying the equilibrium dynamics of ellipsoidal galaxies21.
Some symmetries are already known. For example, in Cartesian coordinates (x, y, z) we trivially have the three cardinal translations (x → x + a etc.). Writing down their associated infinitesimal generators X = i∂x, Y = i∂y and Z = i∂z, their joint eigenfunction eik r is just the kernel of the standard Fourier transform, with the wavevector k taking the role of the (continuous) eigenvalue. The Fourier transform would therefore play the same role in the resulting basis set formalism as the Fourier-Mellin transform did in ours (Sect. 3). Poisson solvers directly using the Fourier transform are ubiquitous in astrophysical applications, so it would be interesting to construct a set of ‘Cartesian’ basis functions and compare their performance with the current state of the art.
Other symmetries are known from classical potential theory. Firstly, the Kelvin transform; this is an inversion in a sphere and preserves the self-energy up to a sign (Kalnajs 1976). However, it is not a continuous symmetry, so there is no associated infinitesimal self-adjoint operator. Secondly, a symmetry that takes spheres to concentric ellipsoids (sometimes called homeoids); this maps the spherical radius to an ‘ellipsoidal’ radius, . It has long been known that this transformation preserves the mutual self-energy of any two charge or mass densities (Carlson 1961), up to a constant factor that is essentially just an elliptical integral of the three semi-axes (a, b, c). We can use this to transform any purely spherical basis set22 into one stratified on concentric ellipsoids. It is important to note, however, that the concentric ellipsoids in this transformation are distinct from the confocal ellipsoids inherent in the ellipsoidal coordinate system that is more dynamically relevant due to its relationship to the Stäckel potentials (de Zeeuw 1985; de Zeeuw et al. 1986).
Also, we would like to mention some gaps in our analysis. While we purport in this work to provide a general theory of orthogonal basis sets, there are some aspects that are still not fully characterised. Firstly, it is clear from Sect. 4 that there exists a connection between basis sets that have a classical index-raising polynomial Pn(s), and those whose potential and density elements are known in closed form (that is, possessing a recurrence relation independent of 𝒟 or 𝒜). However, the exact nature of this connection is unknown, although it is likely related to the fact that the Hahn-type polynomials appearing in the various index-raising polynomials obey second-order difference equations23. Secondly, we do not touch on the issue of basis sets appropriate for finite-radius systems. This was approached by Kalnajs (1976) in the case of thin discs, using a formalism initially similar to our own. There are also contributions from Polyachenko & Shukhman (1981) for finite spheres, and from Tremaine (1976) for finite elliptical discs. In general, it appears to be straightforward to construct basis sets for finite systems out of polynomials or Bessel functions, but it would be useful to make a concrete connection between the finite bases of Kalnajs (1976) and our new formalism. A more rigorous form of the argument about completeness in Sect. 3.3 would also be desirable, as would a quantitative comparison with basis sets computed via the Sturm-Liouville approach of Weinberg (1999).
We end with some broader speculation. It is possible that applications may be found for the general ideas developed here, beyond the solution of Poisson’s equation. In physics we are often required to compute the inverse of Hermitian operators with a continuous spectrum – a well-known example being the Schrödinger operator for certain boundary conditions and choices of potential. These operators could conceivably be supplied with a set of (adapted) orthogonal basis functions, by identifying a suitable commuting set of self-adjoint operators and then diagonalising their cyclic vectors. Any such basis set then provides an infinite series representation of the Green’s function of the underlying Hermitian operator24 where the coordinates appear multiplicatively separated in each term. A use for such series representations may be found in various applications. The appearance of tridiagonal Jacobi operators in particular may presage links to similar numerical methods in quantum mechanics (Alhaidari et al. 2008; Ismail & Koelink 2011).
Fig. 3 Recovery of the unstable radial mode of the isotropic isochrone model. The mode is recovered well despite the low (nmax = 6) number of basis functions used. |
Acknowledgements
E.L. and G.vdV. acknowledge funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 724857 (Consolidator Grant ArcheoDyn). We thank Jean-Baptiste Fouvry for granting permission to adapt his computer code for the purposes of our Sect. 5.4; and also to the referee Michael Petersen for numerous helpful suggestions that have strengthened the results of this paper.
Appendix A Self-adjointness of 𝒟
Let f, ɡ be densities that are non-zero on a δ-dimensional hyperplane in three-dimensional space (δ ≤ 3). Then (A.1)
where the (three-dimensional Newtonian) Green’s function G is (A.2)
and ϕ is the angle between the two position vectors. Also define the ordinary L2 inner product, (A.3)
We write θ = r∂r and . Preliminaries: first note that (A.4)
and also note that (from integration by parts on r) (A.5)
where to obtain the final result we applied (A.5), then (A.4), and then (A.5) again. So define 𝒟 in a δ-dependent way, as (A.7)
Setting δ = 3 (no restriction to a hyperplane) gives the appropriate result for spherical geometry. For thin discs we have 𝒜 = 𝒟|δ=2. We could also consider δ = 1 for an infinite line density.
Appendix B Commutator of 𝒟 and
Working on a δ-dimensional hyperplane again, write the potential using the Green’s function (A.2), (B.1)
where we used (A.4) and then (A.5). Note that in the spherical (δ = 3) case this is equivalent to calculating the following commutator, (B.3)
which can be shown directly by differentiation and the Leibniz rule; however (B.3) is inapplicable to the thin disc (δ = 2) case, so the previous derivation in terms of the Green’s function is required. Now, writing these results in terms of the self-adjoint operator 𝒟, we have (B.4)
Specialising to the spherical case, if for some basis set {ρn} there exists a suitable index-raising polynomial Pn(s), we have (B.5)
for the density functions, and (B.6)
for the potentials. Analogously, in the thin disc case, we have (B.7)
Appendix C The Fourier-Mellin transform
We develop expressions for the forwards and reverse Fourier-Mellin transform, and the corresponding orthogonality relation. A similar procedure is followed for both the spherical and the thin disc cases.
Appendix C.1 Spherical case
We work in spherical polar coordinates (r,ϑ, φ), with . Our density basis function for the Fourier-Mellin transform is Ψslm, defined in (25). The corresponding potential, obeying , is (C.1)
where Kl(is) is defined in (27). The expansion of an arbitrary mass density F with respect to the Ψslm-basis is the Fourier-Mellin transform of F: (C.2)
where are the spherical multipole moments of F. Inverting this using the Mellin inversion theorem (30) (choosing the constant c = 5/2 in the integral), we have (C.3)
The potential corresponding to the density F can be expressed similarly by replacing Ψslm(r) in (C.3) by its potential ϕslm(r). Finally, the mutual energy of two densities F1 and F2 is (C.4)
and the Fourier-Mellin basis functions satisfy the orthogonality relation (C.5)
Appendix C.2 Disc case
We work in cylindrical polar coordinates (R, φ, z), with . Define 𝒜 = i(R∂R + 3/2). Then for two arbitrary thin disc densities σ1,σ2 ∞ δ(z) we have 〈𝒜σ1, σ2〉 = 〈σ1, 𝒜σ2〉, so 𝒜 is self-adjoint (see Appendix A for proof, setting δ = 2 at the end to give the thin disc case). The eigenfunctions of 𝒜 are Σs(R) = R−is−3/2 with real eigenvalue s. We then adjoin a cylindrical harmonic to form the basis functions (Kalnajs’ logarithmic spirals) (C.6)
Using Toomre’s Hankel-transform method we can find the potential corresponding to this density, which is25 (C.7)
so that in the plane (that is, first acting with the full Laplacian, then afterwards setting z = 0) we have (C.8)
Now we compute the thin disc Fourier-Mellin transform for an arbitrary thin disc density σ, (C.9)
where σm(R) are the cylindrical multipoles of σ(R). Using the Mellin inversion theorem to invert this transform (30) (with constant c = 3/2) gives (C.10)
Therefore the mutual energy of two thin disc densities can be expressed as (C.11)
We also have the orthogonality relation (C.12)
As noted in Sect. 3.2.3, these results are independent of the z-dependence of the potential away from the disc plane.
Appendix D Orthogonality relation
Appendix D.1 Spherical case
For the inner product of any two density basis functions ρnlm we have (D.1)
The non-polynomial factors in the above expression are collected into a weight function wl(s), which can be written explicitly in terms of the Mellin transform of the zeroth-order density (or a similar expression in terms of the zeroth-order potential - see (84)), (D.2)
We also assume we have found the (real) monic polynomials orthogonal with respect to the weight function wl(s), writing them as pnl(s), so that (D.3)
Now write Pnl(s) in terms of pnl(s) as (D.4)
so that the orthogonality relation for the pnlm becomes (D.5)
We have that Pnl(𝒟) is a real operator, because (D.6)
This ensures that applying Pnl(𝒟) to a real function (such as p0l(r)) gives a real result. Note that we used pnl(−x) = (−l)npnl(x), which is true for any orthogonal polynomial where the weight function and domain of integration are both symmetric.
Appendix D.2 Thin disc case
For σnm = Pnm(𝒜)σ0m we have the orthogonality relation (D.7)
where the weight function can be written in terms of either the zeroth-order potential or density, (D.8)
Appendix E Classical polynomials
Here we record two types of orthogonal polynomial that are used in Sect. 4 –the continuous Hahn and the Meixner-Pollaczek polynomials. We summarise only the properties that are relevant for our purposes, and direct the reader to other sources for more comprehensive information (DLMF, Sect. 18.19).
These two polynomials are perhaps obscure compared to the well-known classical polynomials of Jacobi, Laguerre and Hermite. However, a slight generalisation of the notion of ‘classical’ leads to the Askey scheme (Koekoek et al. 2010), according to which the continuous Hahn and Meixner-Pollaczek polynomials lie just one level above the Jacobi polynomials. Like the standard classical polynomials, all Askey polynomials possess 1. closed form expressions in terms of hypergeometric functions, and 2. three-term recurrence relations with simple expressions for the recurrence coefficients. The latter property means that detailed knowledge about the polynomials is usually unnecessary, and the end-user can just plug in the recurrence formulas (E.4) and (E.11).
Appendix E.1 Continuous Hahn
The continuous Hahn polynomials conventionally take four real parameters, usually written in terms of two complex parameters: . We restrict ourselves to the case of two real parameters26, so and , and an explicit representation in terms of a terminating 3F2 hypergeometric series is (E.1)
The orthogonality relation is (E.2)
Note that pn(s; a, b) is a real-valued polynomial in s of degree n, symmetric in the parameters a and b, despite the fact that s appears (abnormally) in the ‘parameter’ part of the hypergeometric function. Like any orthogonal polynomial on a symmetric interval, each individual polynomial is either an even or an odd function, according to the parity relation pn(−s; a, b) = (−1)npn(s; a, b). We also define the monic form of the polynomials, (E.3)
The monic form obeys the three-term recurrence relation (E.4)
Appendix E.2 Meixner-Pollaczek
The Meixner-Pollaczek polynomials are another set of orthogonal polynomials on the interval (−∞, ∞), depending on two real parameters λ and ϕ, and have an explicit representation in terms of a terminating2 F1 hypergeometric function (E.5)
The orthogonality relation is (E.6)
Note that once again the variable x appears in the parameter part of the hypergeometric function. The weight function is (E.7)
In the case that the parameter ϕ = π/2, the Meixner-Pollaczek polynomials can be derived from the continuous Hahn polynomials in two different ways (DLMF, Sect. 18.21): if the two parameters (of the latter) differ by one half (E.8)
or if the second parameter is taken to infinity (E.9)
and for the case ϕ = π/2 the three-term recurrence relation is (E.11)
Appendix F Exponential disc potential
We find the potential multipoles corresponding to the exponential disc density given in (77), using Toomre’s Hankel transform method as a starting point. Applying the Toomre method to the disc density gives an auxiliary function (F.1)
from which the potential is found via (F.2)
Surprisingly, this integral does not appear in the standard tables, and computer algebra provides an unsatisfactory result involving a Meijer G-function. The m = 0 case is given (79), but to derive the higher-orders we need to combine two basic ideas. Firstly, Lynden-Bell (1989) shows how to (in effect) raise the angular index m of the RHS of (F.2), using an operator (modifying his notation) (F.3)
that obeys (for generic ψm and ɡm) (F.4)
Secondly, inspired by the use of the operator θ = R∂R in the main part of the present work, we apply it to (F.2) and perform some integration by parts to find (writing θk = k∂k) (F.5)
It remains to apply linear combinations of θ and Δm to (F.2), and then rearrange the terms inside the integral sign according to our knowledge of ɡm(k) such that only a term proportional to ɡm+1(k) Jm+1(kR) remains on the RHS. The result is the recursion relation given in (80).
Appendix G Exact moments for the isochrone
Using the expressions for the isochrone model in (81), we seek the modified moments of self-energy (G.1)
The auxiliary polynomials p̃jl(s) here are the monic Hermite polynomials27. To facilitate variable substitutions in this integral, it is useful to rewrite both and in rationalised-surd form, (G.2)
We also define an auxiliary quantity Kjl, (G.3)
In fact Kjl can always be reduced (by a computer algebra system such as MATHEMATICA) to a form a + b/π with a, b rational. Writing the integral for the zeroth-order self-energy µ0l with the variable substitution , we find that (G.4)
To find the higher-order moments, consider the following polynomial Qjl(t) of degree 2 j – 1, (G.5)
and note that r∂r = –t(1 – t2)∂t. Using the recurrence relation (89) for the auxiliary polynomials p̃jl(s) we can therefore write Qjl(t) recursively as (G.6)
Writing out the polynomial explicitly as , we have the following recurrence on the coefficients qjkl, (G.7)
Now insert this into the integral for μ̃jl, finally giving us the modified moments (G.8)
Appendix G.1 Specific exact coefficient expressions
Plugging the expression for the modified moments into the modified Chebyshev method described in Sect. 5.2, we can get exact expressions for the recurrence coefficients βnl. Setting (for Bz(a, b) an incomplete Beta function), the first few are
References
- Alhaidari, A. D., Yamani, H. A., Heller, E. J., & Abdelmonem, M. S., 2008, The J-Matrix Method (Netherlands: Springer) [CrossRef] [Google Scholar]
- Aoki, S., & Iye, M. 1978, PASJ, 30, 519 [Google Scholar]
- Benet, L., & Sanders, D. P. 2019, J. Open Source Softw., 4, 1043 [NASA ADS] [CrossRef] [Google Scholar]
- Binney, J., & Tremaine, S. 1987, Galactic Dynamics (Princeton, NJ, Princeton University Press), 747 [Google Scholar]
- Carlson, B. C. 1961, J. Math. Phys., 2, 441 [CrossRef] [Google Scholar]
- Clutton-Brock, M. 1972, Ap&SS, 16, 101 [NASA ADS] [CrossRef] [Google Scholar]
- Clutton-Brock, M. 1973, Ap&SS, 23, 55 [NASA ADS] [CrossRef] [Google Scholar]
- de Zeeuw, T. 1985, MNRAS, 216, 273 [NASA ADS] [CrossRef] [Google Scholar]
- de Zeeuw, T., & Pfenniger, D. 1988, MNRAS, 235, 949 [NASA ADS] [CrossRef] [Google Scholar]
- de Zeeuw, T., Peletier, R., & Franx, M. 1986, MNRAS, 221, 1001 [NASA ADS] [CrossRef] [Google Scholar]
- Dombrowski, J. 1985, Pac. J. Math., 120, 47 [Google Scholar]
- Earn, D. J. D. 1996, ApJ, 465, 91 [NASA ADS] [CrossRef] [Google Scholar]
- Erkal, D., Deason, A. J., Belokurov, V., et al. 2021, MNRAS, 506, 2677 [NASA ADS] [CrossRef] [Google Scholar]
- Fouvry, J.-B., & Prunet, S. 2022, MNRAS, 509, 2443 [NASA ADS] [Google Scholar]
- Garavito-Camargo, N., Besla, G., Laporte, C. F. P., et al. 2021, ApJ, 919, 109 [NASA ADS] [CrossRef] [Google Scholar]
- Gautschi, W. 1985, J. Comput. Appl. Math., 12–13, 61 [Google Scholar]
- Granovskii, Y. I., & Zhedanov, A. S. 1986, Sov. Phys. J., 29, 387 [NASA ADS] [CrossRef] [Google Scholar]
- Hamilton, C., Fouvry, J.-B., Binney, J., & Pichon, C. 2018, MNRAS, 481, 2041 [NASA ADS] [CrossRef] [Google Scholar]
- Henon, M. 1959, Ann. Astrophys., 22, 126 [NASA ADS] [Google Scholar]
- Hernquist, L., & Ostriker, J. P. 1992, ApJ, 386, 375 [Google Scholar]
- Ismail, M. E., & Koelink, E. 2011, Adv. Appl. Math., 46, 379 [Google Scholar]
- Kalnajs, A. J. 1971, ApJ, 166, 275 [CrossRef] [Google Scholar]
- Kalnajs, A. J. 1976, ApJ, 205, 745 [CrossRef] [Google Scholar]
- Koekoek, R., Lesky, P. A., & Swarttouw, R. F. 2010, Hypergeometric Orthogonal Polynomials and Their q-Analogues (Berlin, Heidelberg: Springer) [CrossRef] [Google Scholar]
- Kuzmin, G. G. 1956, Publ. Tartu Astrofizica Observ., 33, 75 [NASA ADS] [Google Scholar]
- Law, D. R., & Majewski, S. R. 2010, ApJ, 714, 229 [Google Scholar]
- Lilley, E. J. 2020, PhD thesis, University of Cambridge [Google Scholar]
- Lilley, E. J., Sanders, J. L., & Evans, N. W. 2018a, MNRAS, 478, 1281 [NASA ADS] [CrossRef] [Google Scholar]
- Lilley, E. J., Sanders, J. L., Evans, N. W., & Erkal, D. 2018b, MNRAS, 476, 2092 [NASA ADS] [CrossRef] [Google Scholar]
- Lowing, B., Jenkins, A., Eke, V., & Frenk, C. 2011, MNRAS, 416, 2697 [NASA ADS] [CrossRef] [Google Scholar]
- Lynden-Bell, D. 1989, MNRAS, 237, 1099 [NASA ADS] [CrossRef] [Google Scholar]
- Marín, J., & Seubert, S. M. 2006, J. Math. Anal. Applic., 320, 599 [CrossRef] [Google Scholar]
- Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493 [Google Scholar]
- Olver, F. W. J., Daalhuis, A. B. O., Lozier, D. W., et al. 2022, NIST Digital Library of Mathematical Functions, Release 1.1.7 of 2022-10-15 [Google Scholar]
- Petersen, M. S., & Peñarrubia, J. 2021, Nat. Astron., 5, 251 [NASA ADS] [CrossRef] [Google Scholar]
- Petersen, M. S., Peñarrubia, J., & Jones, E. 2022a, MNRAS, 514, 1266 [CrossRef] [Google Scholar]
- Petersen, M. S., Weinberg, M. D., & Katz, N. 2022b, MNRAS, 510, 6201 [NASA ADS] [CrossRef] [Google Scholar]
- Plummer, H. C. 1911, MNRAS, 71, 460 [Google Scholar]
- Polyachenko, V. L., & Shukhman, I. G. 1981, Soviet Ast., 25, 533 [NASA ADS] [Google Scholar]
- Qian, E. E. 1993, MNRAS, 263, 394 [CrossRef] [Google Scholar]
- Rahmati, A., & Jalali, M. A. 2009, MNRAS, 393, 1459 [NASA ADS] [CrossRef] [Google Scholar]
- Robijn, F. H. A., & Earn, D. J. D. 1996, MNRAS, 282, 1129 [NASA ADS] [CrossRef] [Google Scholar]
- Saha, P. 1991, MNRAS, 248, 494 [NASA ADS] [CrossRef] [Google Scholar]
- Saha, P. 1993, MNRAS, 262, 1062 [CrossRef] [Google Scholar]
- Sanders, J. L., Lilley, E. J., Vasiliev, E., Evans, N. W., & Erkal, D. 2020, MNRAS, 499, 4793 [CrossRef] [Google Scholar]
- Toomre, A. 1963, ApJ, 138, 385 [NASA ADS] [CrossRef] [Google Scholar]
- Tremaine, S. D. 1976, MNRAS, 175, 557 [NASA ADS] [CrossRef] [Google Scholar]
- Vera-Ciro, C., & Helmi, A. 2013, ApJ, 773, L4 [Google Scholar]
- Weinberg, M. D. 1999, AJ, 117, 629 [NASA ADS] [CrossRef] [Google Scholar]
- Zhao, H. 1996, MNRAS, 278, 488 [Google Scholar]
Many popular mass laws have infinite mass but finite self-energy, for example the NFW model (Navarro et al. 1997).
Other choices may be preferable from an analytical point of view, for example Φ0l = rlΦ(r2l+1) or Φ0l = rlΦ(r)/(1 + r)2l, the latter suggested by Saha(1993).
See for example Kalnajs (1971, Eq. (14)), set u = log R and relabel θ → φ and α → −s. The RHS there is then proportional to our Σsm(R), apart from a factor of R−3/2.
The operators 𝒟 and 𝒜 do in fact arise as the generators of symmetries of the self-energy inner product; see the discussion in Sect. 6.
‘Monic’ meaning that the term of highest-degree has coefficient 1. Note that in Sect. 4 the polynomials are not necessarily in monic form.
This corrects a typo in Clutton-Brock (1973).
See Lilley (2020, Ch. 6) for a detailed derivation.
Some MATHEMATICA code demonstrating this is included in the repository at https://github.com/ejlilley/basis
For example, the ‘defective’ NFW basis set constructed in Lilley (2020, Ch. 2), which does not converge with the addition of higherorder angular terms. See also Saha (1991), who suggests that “glitches and generally anomalous behaviour” in the recovery of modes may be related to the form of the chosen basis functions – this should be systematically investigated.
Limited work on perturbation analysis has been done for the fully ellipsoidal case, including for example Tremaine (1976). There is also some existing work on (non-orthogonal) spheroidal basis sets (Earn 1996; Robijn & Earn 1996).
All Figures
Fig. 1 Radial parts of the isochrone potential basis for n = 0, 1, 2, 3 and l = 0 (top) and l = 1 (bottom). The potentials have been unit-normalised. |
|
In the text |
Fig. 2 Radial parts of the exponential disc basis for n = 0, 1, 2, 3 and m = 0 (top) and m = 1 (bottom). The potentials have been unit-normalised. |
|
In the text |
Fig. 3 Recovery of the unstable radial mode of the isotropic isochrone model. The mode is recovered well despite the low (nmax = 6) number of basis functions used. |
|
In the text |
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.