Issue 
A&A
Volume 670, February 2023



Article Number  A134  
Number of page(s)  22  
Section  Planets and planetary systems  
DOI  https://doi.org/10.1051/00046361/202244955  
Published online  21 February 2023 
An improved Representative Particle Monte Carlo method for the simulation of particle growth
Institute of Theoretical Astrophysics (ITA), Center for Astronomy (ZAH), RuprechtKarlsUniversität Heidelberg,
AlbertUeberleStr. 2,
69120
Heidelberg, Germany
email: moritz.beutel@ziti.uniheidelberg.de
Received:
12
September
2022
Accepted:
21
November
2022
Context. A rocky planet is formed out of the agglomeration of around 10^{40} cosmic dust particles. As dust aggregates grow by coagulation, their number decreases. But until they have grown to hundreds of kilometres, their number still remains well above the number of particles a computer model can handle directly. The growth from micrometres to planetesimalsized objects therefore has to be modelled using statistical methods, often using size distribution functions or Monte Carlo methods. However, when the particles reach planetary masses, they must be treated individually. This can be done by defining two classes of objects: a class of many small bodies or dust particles treated in a statistical way, and a class of individual bodies such as one or more planets. This introduces a separation between small and big objects, but it leaves open how to transition from small to big objects, and how to treat objects of intermediate sizes.
Aims. We aim to improve the Representative Particle Monte Carlo (RPMC) method, which is often used for the study of dust coagulation, to be able to smoothly transition from the manyparticle limit into the singleparticle limit.
Results. Our new version of the RPMC method allows for variable swarm masses, making it possible to refine the mass resolution where needed. It allows swarms to consist of few numbers of particles, and it includes a treatment of the transition from swarm to individual particles. The correctness of the method for a simplified twocomponent test case is validated with an analytical argument. The method is found to retain statistical balance and to accurately describe runaway growth, as is confirmed with the standard constant kernel, linear kernel, and product kernel tests as well as by comparison with a fiducial nonrepresentative Monte Carlo simulation.
Key words: planets and satellites: formation / protoplanetary disks / methods: numerical / methods: statistical / accretion / accretion disks
© The Authors 2023
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1 Introduction
How planets are formed is a difficult question to answer. We know that the process starts with micrometresized cosmic dust particles in a protoplanetary disk, which coagulate and form ever larger dust aggregates (Birnstiel et al. 2016). As these aggregates grow to a pebble size (of the order of millimetres to centimetres, henceforth conveniently called pebbles), they may, under certain conditions, form large overdensities in the disk that gravitationally collapse to form multikilometre sized planetesimals (Johansen et al. 2009; Wahlberg Jansson & Johansen 2014). From there onwards, these planetesimals agglomerate gravitationally to form planetary embryos (Wetherill & Stewart 1993; Kokubo & Ida 1998), which, in turn, grow further via accretion of planetesimals and any remaining pebbles (Safronov 1964; Ormel & Klahr 2010; Lambrechts & Johansen 2012; Bitsch et al. 2015).
One of the fundamental limiting factors of any numerical treatment of this problem is the vast number of particles involved. To form a single Earthlike planet, we need ~10^{40} dust particles, ~10^{30} pebbles, or ~10^{10⋯20} planetesimals (dependent on their mass). This would not necessarily be a problem if we could treat these as an equilibrium fluid, just in the way we treat a gas. Unfortunately, solid particles behave very differently than a gas, and thus require a more sophisticated treatment.
For small particles, the problem is often approached with a particle size distribution function n(m)dm, giving the number of particles per unit volume for a particle mass interval dm. This function is then sampled numerically on a grid in m, and the growth and fragmentation are then modelled using a version of the Smoluchowski coagulation equation (Smoluchowski 1916). This method (which we call the continuum approach) has been used by numerous teams (e.g. Weidenschilling 1980; Tanaka et al. 2005; Dullemond & Dominik 2005) and is reasonably fast and efficient. Furthermore, it can be extended in space by adding the spatial dimensions to the coordinates (Birnstiel et al. 2010; Okuzumi et al. 2012; Drążkowska et al. 2019). It works very well for small particles because they are well coupled to the gas. These particles do not have independent orbital elements, but instead largely move along with the gas, except for a slow drift. However, with this method, it is hard to add independent particle properties. By the Eulerian nature of such a gridbased approach, all particles have the same properties for each location and mass. Approximate methods exist to overcome this limitation (Okuzumi et al. 2009; Stammler et al. 2017), but a full solution would require an extension of the dimensionality of the problem, creating a huge computational overhead. For instance, by adding only a porosity parameter given by the value p, the particle distribution function now becomes n(m, p) dm dp. This quickly pushes the problem beyond the limit of computational feasibility.
For large particles (planetesimals and upwards), the problem becomes even more severe, because now the particles are large enough to acquire their own independent orbital elements. A distribution function treatment as described above becomes unfeasible. Instead, the Nbody method is the method of choice.
As a result of these (and other) complications, it is hard to develop a model of planet formation that starts with dust and ends with fullgrown planets. The few complete dusttoplanets models available today involve a patchwork of different methods covering different size regimes (e.g. Ormel et al. 2017; Schneider & Bitsch 2021; Emsenhuber et al. 2021).
If we want to develop a single technique that can, at least in principle, handle large amounts of dust particles as well as individual planetlike bodies, and everything in between, the method must be based on a particle sampling approach. This could be a Monte Carlo method (stochastic), an Nbody method (deterministic), or a blend of both. In this approach, it is easy to add any number of particle properties to the problem, because we simply store these properties for each computational particle. Even if all particles are in the same location and have the same mass, we can then still handle a heterogeneous set of supplementary properties (porosity, charge, chemical composition, orbital elements, etc.), which is not possible for continuum methods.
Ormel et al. (2007) developed a Monte Carlo solver for dust coagulation in which particles can have not only a mass but also other properties such as porosity or ice content. This method stands at the basis of further works (e.g. Krijt et al. 2015) and overcomes the limitations of the continuum methods. While the method was developed for small particles, it has also been used for modelling runaway growth of planetesimal size bodies (Ormel et al. 2010).
An alternative Monte Carlo method was developed by Zsom & Dullemond (2008). They key difference with the method of Ormel et al. (2007) is that it is rigourously based on the concept of ‘representative particles’. We therefore call this method the ‘Representative Particle Monte Carlo’ (RPMC) method in this paper, and provide a brief review of the method in Sect. 2. This method has several advantages over other methods, one of them being its robustness and simplicity. But it has, so far, only been formulated in a somewhat restrictive sense: (1) All swarms must have the same mass, and (2) all swarms must contain a very large number of actual particles. As a result, this method has so far only been applied to problems of dust coagulation and fragmentation, where the number of actual particles is so large that condition (2) is always fulfilled. In this context the method has been successfully used for modelling dust coagulation, compactification, and fragmentation based on the full complexity of laboratory measurements (Zsom et al. 2010, 2011a). It has also been used for exploring the effect of external disturbances on the protoplanetary disk (Zsom et al. 2011b), the effect of particle collisions during the gravitational collapse of a pebble cloud (Wahlberg Jansson et al. 2017), and for various other applications (e.g. Windmark et al. 2012; Drążkowska et al. 2014; Krijt & Ciesla 2016; Krijt et al. 2016).
As the RPMC method stands, however, it cannot be used for problems where the growth proceeds to the point when swarms become individual particles (e.g. planets). In fact, the validity of the method, in the form presented by Zsom & Dullemond (2008), already formally breaks down well before that. And although the equalmass swarms provide a certain robustness, and focus computational effort where the mass is, this condition can also severely limit the applications of the method. There are wellknown particle growth problems where initially insignificant particles can grow to dominate the process, for example the planetesimal runaway growth regime (Greenberg et al. 1978) or the ‘lucky particle’ scenario (Windmark et al. 2012). To model these with a Monte Carlo approach either requires a very large number of representative particles, or the clever splitting of swarms into smaller swarms to resolve particles of interest. This is only possible if the method allows swarms of different masses.
The goal of this paper is to improve the RPMC method such that the drawbacks (1) and (2) are remedied. The method will then still have the advantages of robustness of the original RPMC method, but will gain in flexibility and applicability so that it can be used to model planet formation over a wide range of particle masses. In a companion paper (Beutel et al., in prep.), we address another problem of the RPMC method: the n^{2} scaling of all possible pairwise interactions.
2 Conceptual summary of the original RPMC method
Let us consider the problem of coagulation or fragmentation of, say, N = 10^{30} particles. Obviously we cannot model all of these particles individually. With the RPMC method we consider, of these 10^{30} particles, only n particles which together form a representative sample of the actual particles. The number n must be large enough that this sample provides a good enough representation of the true distribution. At the same time, it must be small enough to keep the computational cost manageable.
The sampling is massweighted. That means that a boulder with a mass of 1000 kg will have a million times higher chance to be one of the representative particles than a pebble of 1 g. However, if there are a million times more pebbles than boulders (meaning that the total mass in pebbles equals the total mass in boulders), then the chance that representative particle i happens to be a pebble is the same as that it happens to be a boulder.
Now let us assume that representative particle i is a boulder of m_{B} = 1000 kg. We let it undergo an event that causes it to fragment into a mass distribution n_{fr}(m). The total mass of the mass distribution is still 1000 kg:
In the RPMC method, rather than splitting the numerical particle up into a large number of new numerical particles, we randomly choose a new mass for the representative particle using the probability distribution function
In other words, we use massweighting to choose the new property of the representative particle, so the sampling remains massweighted. All the other fragments are ‘forgotten’. If only a single such fragmentation event were to happen, such forgetfulness would be problematic, since a single randomly chosen particle does not represent an entire distribution of fragments. But since we are dealing with very large numbers of boulders, such fragmentation events will occur many times (each time for a different representative particle j ≠ i). Statistically, the outcomes of many representative particles follow the fragment mass distribution n_{fr}(m).
The representative particles form a representation of the real (full) collection of particles at a given time. They contain all the information we have of the system at that particular time. In order to reconstruct the full collection of particles from this limited information, the RPMC method follows the principle of Ockham’s Razor: if we do not know more than the information we currently have of the representative particles, then the simplest assumption possible is that the other (unknown) particles are the same as the representative ones. Given that the representative particles follow a massweighted sampling, each of them represents a fraction 1/n of the total mass of solids. And so we divide the total mass up into n equalmass parts, and assume each part to consist of particles with identical properties as their representative particle.
Although it might be possible to conceive different methods of extrapolating the full distribution of particles from the set of representative particles, our simple method has the advantage that it trivially guarantees the idempotency of the selection process. Regardless of the how representative particles are chosen from swarms (random pick, mass median, nearest to average, etc.), we would end up with an exactly equivalent set of representative particles after extrapolating a full particle ensemble and selecting a new set of representative particles.
We note that, so far, we have not used the concept of ‘swarms’. The RPMC method does not require this concept. But to make it easier to talk about the method, and to compare it to the ‘superparticles’ often used in computational astrophysics (e.g. Johansen et al. 2007), it is convenient to define a swarm in the context of RPMC. The key lies in the massweighted sampling used by RPMC. If the total mass of solids is M, and we used n representative particles which are randomly sampling the solids in a massweighted fashion, then we can say that each representative particle represents a fraction 1/n of the total mass of solids. We then assign to each representative particle i a swarm of N_{i} particles with identical properties as the representative particle. The total mass of the swarm is M_{i} = M/n, so that N_{i} = M_{i}/m_{i}, where m_{i} is the mass of the individual representative particle. If the properties of the representative particle change, so will the properties of the other particles in the swarm. This also means that if the representative particle becomes more massive (i.e. m_{i} increases) due to, for instance, a collideandmerge event with another particle, the other swarm particles also become more massive. As the total mass of the swarm stays constant, the number of particles in the swarm N_{i} will then drop. This is a purely statistical effect, not a true sudden disappearance of particles.
To better understand how the representative particles work, it is helpful to pretend that all particles are composites of the same elementary substance, and that a representative particle is identified with a ‘tracer atom’ of this substance. If tracer atoms were initially chosen at random from the entirety of atoms, then at any time the particle mass distribution of the system must be approximated by the mass distribution of representative particles, assuming that the number of particles is so much larger than the number of tracer atoms that the possibility of two tracer atoms ending up in the same particle can be neglected. In this picture, swarms just embody the simplest possible way of extrapolating an approximate global mass distribution from the mass distribution of representative particles.
We have not yet introduced any collisions between particles. The fragmentation event is, so far, a purely hypothetical spontaneous event affecting only the representative particle itself. The RPMC method can, however, easily handle collisions. Since each swarm consists of truly many particles, when a representative particle of swarm j collides with a particle from swarm k, the probability that it hits the representative particle of swarm k is extremely low. Instead, it will most likely (with a probability close to 1) hit another particle of that swarm, leaving representative particle k alone. A collision nearly always affects both colliding particles. For instance, upon highvelocity impact, both particles can fragment. Or upon lowvelocity impact, the particles can merge. But since with almost certainty the representative particle j will collide with a nonrepresentative particle from swarm k, only representative particle j will fragment or merge, while representative particle k will be entirely unaffected as it is not involved in the collision. This means that only the properties of representative particle j, and by extrapolation those of swarm j, are changed, while swarm k stays unaffected. This asymmetric interaction between swarms is a key feature of the RPMC method. It means that on individual collision basis the results can be skewed, but statistically over many collisions the results are correct.
In Zsom & Dullemond (2008), the method is described in much more detail. It is explained how to randomly sample collision events, how to modify the representative particle involved, how to update the collision matrix, how to deal with pathological cases, etc. We discuss these issues in Sect. 4.3, where we define our improved version of the RPMC method.
An advantage of the RPMC method is that it naturally keeps the number of representative particles unchanged, even as particles merge or shatter. While the actual number of particles may decline (as particles coalesce) or increase (as particles shatter), the number of representative particles stays, by design, the same. This avoids the need of reintroducing computational particles as the number of particles declines (to keep the resolution optimal), or forcefully merging particles as the number of particles increases (to prevent runaway computational cost). Also, it naturally conserves mass. These properties make the RPMC method exceptionally stable and robust. But this robustness comes at a cost. First, the interactions between particles become asymmetric, which is not problematic but can be confusing. Furthermore, the method has the basic assumption of massweighted sampling, which automatically means that all swarms have the same total mass, namely M/n. And the method assumes that two representative particles have a negligible chance of colliding with each other, which requires that the swarm mass always is huge compared to the particle mass (many particles per swarm). Together, these disadvantages limit the applicability of the RPMC method. Addressing these shortcomings is the purpose of this paper.
3 Mathematical formalisation of the RPMC method
3.1 Fundamentals
We consider an ensemble of N physical particles which interact through collisions. The number N will typically be very large (say, N ~ 10^{30}). Here and in the following, we refer to a ‘physical particle’ as any particle of the system we study, while only a (tiny) subset of these particles, the ‘representative particles’, are stored on the computer. Each physical particle k is associated with a particle mass m_{k} and possibly other properties. These particle properties, in particular the mass, may span many orders of magnitude. We denote the full set of particle properties of particle k with the symbol q_{k} ∈ ℚ, where ℚ is the space of intrinsic properties (e.g. mass, charge, porosity) and statistical properties (e.g. velocity dispersion).
For every pair of particles j and k, we define the raw collision rate λ(q_{j}, q_{k}) as the rate (in units of s^{−1}) at which both particles collide. It is necessarily commutative, λ(q_{j}, q_{k}) = λ(q_{k}, q_{j}). If the particles are distributed homogeneously and isotropically in a volume V, the collision rate λ(q_{j}, q_{k}) can be written as
where σ(q_{j}, q_{k}) is the collision crosssection and ∆v(q_{j}, q_{k}) the average relative velocity between the particles. Because particles cannot collide with themselves, we define a true collision rate λ_{jk} which explicitly suppresses selfcollision:
The collision rate λ_{j} of particle j with any other particle in the ensemble is given by
The total rate of collisions is
Supposing, for the moment, that we had a small enough number of physical particles N that we can store them all on the computer (i.e. they would all be representative, though only of themselves), we would then proceed as follows to simulate the particle collisions. Following the method of Gillespie (1975), we choose a random time increment
where ξ is a uniform random number drawn from the interval [0, 1). We then draw random indices j and k from the discrete distribution defined by the joint probability
where is the chance that particle j is involved in the collision event and P(kj) = λ_{jk}/λ_{j} is the chance that particle k is the collision partner of particle j given that particle j will suffer a collision. We then advance the system by the time increment ∆t and let particles j and k collide. This collision will generally lead to a modification of the properties of the particles j and k involved. It can also lead to the merging of particles (in the case of coagulation) or to the creation of additional particles (by means of fragmentation). Therefore, the collision rate λ has to be recomputed after a collision. It also needs to be recomputed if the properties of the particles change by means of other processes.
Fig. 1 Indexing of the full ensemble of physical particles, with the representative particles ordered at the beginning of the sequence. The entirety of particles in a given ensemble is indexed from 1 to N. The first n indices refer to RPs, and the remaining indices belong to nonRPs. Because N may be enormously large, tracking the entirety of particles is not feasible. Only the n RPs are actually modelled on the computer, while the properties of the (N − n) nonRPs are estimated from the known properties of the n representative ones. 
3.2 Representative particles
Simulating a full ensemble of N ~ 10^{30} particles is computationally unfeasible. We therefore pick a representative subset of n representative particles with n ≪ N and simulate only them while still allowing them to interact with all the other physical particles. For convenience we abbreviate ‘representative particle’ as RP henceforth.
Also for notational convenience, we assume that all physical particles are indexed from 1 to N, but with the RPs arranged at the beginning of the ordering. In other words, the particles with indices 1 to n are the RPs, all others are not, as illustrated in Fig. 1.
Let us now define the total collision rate of the RPs, or more precisely: the total collision rate of all RPs (indices 1 to n) with all physical particles (indices 1 to N):
We adapt Gillespie’s method for the simulation of RPs by substituting for λ in Eq. (7). However, is not a computable quantity because the properties of the (N − n) nonRPs are not known. To obtain a computable estimate of , we first need to clarify how RPs actually represent other particles.
The basic assumption underlying the RPMC method is that a RP k can be regarded as a primus inter pares (first among equals), chosen from a larger swarm 𝒮_{k} of particles whose properties are similar to q_{k}. Here, 𝒮_{k} is the index set of the N_{k} physical particles in the swarm k belonging to RP k. In other words: the n RPs divide the total ensemble of N physical particles up into n swarms, each containing N_{k} physical particles, such that
Given that, for each swarm k, we only have information about the RP k, the simplest possible assumption about the properties of the other particles in that swarm is to say that they are not only similar in some unspecified way, but in fact identical:
What still needs to be defined is how large each swarm is. This is best defined through a conserved quantity. The only practical quantity that is strictly conserved, irrespective of the complexity of the model, is mass. And so we divide the total mass M up into swarms of masses M_{k} such that
In other words: each swarm k has a mass M_{k}, which is conserved throughout the simulation, unless a swarm is deliberately split or merged. The mass of a swarm and the number of particles in the swarm are directly related via
where m_{k} is the mass of each of the particles in the swarm k. In the original RPMC method of Zsom & Dullemond (2008) all swarms by design had the same mass, so that
where M is the total mass in the system and n is the number of RPs, and thus also the number of swarms. We later show that this equalmass partitioning of swarms is not necessary for the method to work.
3.3 Simulation of representative particles
To approximate the total collision rate of RPs in Eq. (9), we first split it in two terms,
where the first term describes the collisions among RPs and the second term describes collisions between RPs and nonrepresentative swarm particles.
By invoking the assumption that all particles in a swarm have identical properties, as stated by Eq. (11), we can replace the sum of RP–nonRP collision rates with the sum of RP–RP collision rates multiplied with the number of nonRP particles in the swarm,
We note that the raw collision rate λ(q_{j}, q_{i}) had to be used here instead of the true collision rate λ_{ji}. The reason is that, while a RP j cannot collide with itself, it may collide with another particle from its own swarm. In the true collision rate, λ_{jj} had been set to 0 to avoid selfcollisions, so its use in Eq. (16) would unphysically avoid collisions between particles from the same swarm. Hence the use of the raw collision rate instead.
Rewriting the first term of Eq. (15) as
and using Eqs. (4), (16), we find
From this we can define the RP–swarm collision rate ,
and the cumulative collision rate of RPs ,
Although λ_{jk} and λ(q_{j}, q_{k}) are commutative, is not.
The term −(1 + δ_{jk})/2 in the above expressions accounts for the possibility that RP j collides with a particle from swarm k that is itself a RP. The probability that this happens is
For such (usually rare) events, we have to avoid doublecounting, because this potential collision is also accounted for by RP k. So this collision has to be counted only half, meaning that we have to subtract 1/2 from the number of particles N_{k} in the swarm k, as done in Eq. (18). If j = k, that is, if we consider collisions of RP j with members of its own swarm, then instead of subtracting just 1/2 we have to subtract 1, because apart from the RP itself, there are only (N_{j} − 1) particles in the swarm. Hence the term −(1 + δ_{jk})/2 in Eq. (18).
3.4 The largeparticlenumber approximation of the original RPMC method
The original RPMC method now makes the additional assumption that the number of particles per swarm is much greater than unity,
This allows neglecting the probability of RPRP collisions . With this assumption, we can define the simplified RP–swarm collision rate as
and the simplified cumulative collision rates as
Gillespie’s method is then applied to simulate RP collisions by substituting λ with λ in Eq. (7):
A RP suffering a collision is then chosen by drawing an index j from the discrete distribution with probability
Then a collision partner from swarm k is chosen, where the index k is drawn from the discrete distribution with the probability
3.5 Stochastic representation of collisional outcomes
With this method we can randomly determine the time of the next collision event, the RP j suffering a collision, and the swarm k which represents the collision partner. The collision can have a variety of results: coagulation, fragmentation, pulverisation, cratering, etc. (Güttler et al. 2010; Blum 2018; Wurm & Teiser 2021), and it can yield a wide variety of particle distributions ranging from a single merged particle to an almostcontinuous spectrum of fragment masses. This raises the question of how the resulting particle distribution can be incorporated into the simulation model.
Of course, in case of fragmentation or pulverisation, adding every resulting particle to the simulation is infeasible. Instead, we use the sampling nature of the Monte Carlo process to let the resulting particle distribution realise itself stochastically. Let
denote the number density distribution of the resulting fragments from a collision of RP j with a particle from swarm k. It is normalised as follows:
where m(q) is the mass of a particle with properties q, and the integration is over all possible properties q.
We now draw a single set of particle properties q^{coll} from the weighted distribution
In other words, a collision in our simulation always consumes the representative particle and replaces it with a new representative particle sampled from the particle mass distribution of the collision outcome,
The masses of both swarms do not change,
For a single collision, the method always chooses a single outcoming particle, so that the number of RPs remains constant. If the collision leads to merging, then the resulting particle will have a mass m(q_{j}) + m(q_{k}). If, instead, the collision leads to a distribution of fragments, then the resulting particle will be a single massweighted choice among those fragments. The fragment distribution will then emerge as large numbers of collisions lead to many samplings of similar fragment distributions.
As demonstrated in Zsom & Dullemond (2008), the RPMC method yields the expected result for the standard analytic coagulation tests, albeit with substantial noise owed to the Monte Carlo nature of the method. The method also has the desirable consequence that the number of RPs always remains constant, which allows for simpler and more efficient implementation of the method. Finally, the representativeness of the sampling remains optimal: every RP keeps representing the same total mass.
4 The improved RPMC method
As mentioned before, the RPMC method, as described in Sect. 3, has two major limitations: (1) all swarms have the same mass and (2) all swarms must contain a very large number of physical particles. Here we present an improved RPMC method that overcomes these limitations. We put these improvements to the test in later sections.
4.1 Unequalmass swarms
The original RPMC method assumed that all swarms have equal mass M_{k} = M/n. The reason for this choice was that the representative particles were considered to be randomly chosen from the full set of particles, using mass as the weight factor. This fits well to the ‘atomistic’ picture mentioned above: we randomly pick n ‘atoms’, all having equal weight, and then consider, for each representative atom, the dust particle it is in to be the representative particle. This guarantees that the n representative particles together form an unbiased sampling of the full ensemble of particles.
However, as we saw in Sect. 3.2, the formalisation of the RPMC algorithm allows for unequal swarm masses M_{k} ≠ M/n, as long as the sum of all these masses equals the total mass M. The way this enters into the algorithm is through the modified number of particles per swarm
which appears in the collision rate calculation in Eq. (18). If we define the ‘weight’ of a RP as
then in the original RPMC method all RPs had equal weight (i.e. they were all equally ‘important’). But in the improved RPMC method the RPs are now no longer necessarily of equal weight.
We can split any swarm up into a number of smaller swarms. For example, we might replace a RP k with weight W_{k} with two RPs l, m with weights W_{l} + W_{m} = W_{k}. Or to put it in terms of their swarms: we split the swarm of mass M_{k} up into two swarms of masses M_{l} + M_{m} = M_{k}. Because the particles in both swarms have identical mass, m_{l} = m_{m} = m_{k}, the total number of particles is not changed, N_{l} + N_{m} = N_{k}.
4.2 The limits to growth
A much more difficult problem is to enable the RPMC method to deal with swarms that contain only a few or even just a single particle.
The original RPMC method was constructed to simulate a growth process over many orders of magnitude with a constant number of RPs while keeping resolution optimal. The method crucially relies on the assumption stated in Eq. (22) that the number of particles in each swarm is much greater than unity, N_{k} ≫ 1 ∀k ∈ {1, …, n}. Equivalently, the particle mass is assumed to be much smaller than the swarm mass, m_{k} ≪ M_{k} ∀k ∈ {1, …, n}. This assumption allowed the use of the simplified collision rate defined in Eq. (23) instead of in Eq. (18), and it allowed us to keep the number of RPs n constant because we could neglect RP–RP collisions.
When the particle mass m_{k} is no longer negligibly small compared to the swarm mass M_{k}, several problems appear with the original RPMC method. Firstly, the number of particles in a swarm is computed through N_{k} = M_{k}/m_{k}. Like all RP and swarm properties, N_{k} is an expectation value. Therefore, a fractional swarm particle count such as N_{k} = 1.3 is not per se problematic. But if we want to treat singleparticle swarms as nonstochastic objects–for example, we might want to identify singular particles with Nbody objects in a hybrid MC/Nbody simulation – we must decide for an integer number of particles. However, by rounding N_{k} to an integer, we no longer conserve the total mass of the system. Secondly, by neglecting the possibility of RP–RP collisions, we were able to adopt the simplified RP–swarm collision rate of Eq. (23). . But for N_{k} ~ 1, this collision rate significantly overestimates the actual RP–swarm collision rate given in Eq. (19). In the extreme case of N_{k} = 1, we have
(To solve this problem, one might be tempted to simply use the actual RP–swarm collision rate instead of approximating it as . However, it turns out that the approximation must be made, otherwise the RPMC method is not statistically balanced, as demonstrated in Appendix C). Thirdly, a particle cannot grow in mass beyond the swarm mass: m_{k} ≤ M_{k}, whereas in reality it might do so. The original RPMC method strictly keeps M_{k} constant in time, meaning that this creates an artificial upper bound to the mass a physical particle can acquire. When increasing the number of swarms n, we can decrease the number of particles N_{k} in swarms k ∈ {1, …, n} to improve the resolution, but as a result, we also decrease the largest individual mass m_{k} ≤ M_{k} which the simulation can represent. Finally, if N_{k} is not large, the probability that RP j will collide with the RP of swarm k ≠ j is not negligibly small.
4.3 Extending the method
To overcome the problems outlined in the previous section, we propose some modifications to the RPMC method. As the basis of our extension, we first classify all swarms k ∈ {1, …, n} in two categories: manyparticles swarms with N_{k} > N_{th}, and fewparticles swarms with N_{k} ≤ N_{th}, where the simulation parameter N_{th} is the particle number threshold.
Every RP is associated with a swarm. When a RP j interacts with a particle from swarm k, we can use the above classification to define interaction regimes: the interaction operates in the manyparticles regime if both swarms j, k are manyparticles swarms, and in the fewparticles regime if at least one swarm of j, k is a fewparticles swarm.
We then use the following adaption of the effective RP–swarm collision rate:
where we define the swarm multiplicity factor N_{jk}:
Interactions in the manyparticles regime are handled with the conventional RPMC method: we assume that a RP j always interacts with a nonRP from swarm k. A collision then alters only the properties of RP j.
If the interaction operates in the fewparticles regime, we make the simplest possible choice and impose that, in the swarm with fewer particles, all particles operate in ‘lockstep’, that is, they all do as the RP does, at the same time. Therefore, instead of RP–swarm interactions, we now consider swarm–swarm interactions. Imposing N_{j} ≤ N_{k} without loss of generality, the result of the collision is still determined following the procedure described in Sect. 3.5, q_{j} → q^{coll}, but the swarm masses M_{j} and M_{k} now grow and shrink as mass is transferred to the swarm with fewer particles:
By assumption of N_{j} ≤ N_{k}, the swarm mass M_{k} cannot become negative.
Interactions in the fewparticles regime will therefore upset the sampling balance: as swarm weights change, different RPs represent different fractions of the total mass. As was argued in Sect. 4.1, this does not impair the validity of the simulation, but it may lead to inefficient use of the available RPs. The particle number threshold N_{th} should therefore be chosen as small as possible so that the sampling is not unnecessarily disequilibrated. At the same time, N_{th} should be large enough such that, for N_{k} ≥ N_{th}, the simplified RP–swarm collision rate Eq. (23), , only negligibly overestimates the RP–swarm collision rate . In our tests we found satisfactory results with N_{th} ~ 10.
This simple prescription allows unconstrained growth in the RPMC method while conserving mass and keeping the RP count unchanged. However, it has some implications that are detrimental to our goal of modelling runaway growth processes with high accuracy. First, unless N_{j} happens to be a power of 2, swarm selfinteraction will still result in a fractional particle number. More gravely, the method usually does not resolve the runaway body individually. Let us consider a swarm j representing the heaviest bodies. By sweeping up smaller particles, the swarm has just become a fewparticles swarm, N_{j} = N_{th}. Let us assume N_{th} = 10. The mass of RP j will grow further by coagulation of smaller bodies; but because of the mass transfer precept given in Eq. (38), the swarm body count will stay the same, N_{j} = 10. In other words, we have 10 bodies simultaneously experiencing runaway growth. This is at odds with the mass separation characteristic of runaway growth, sometimes paraphrased as ‘the winner takes it all’. Runaway growth happens when the accretion rate has a superlinear proportionality with mass. Then, the body which first enters the runaway growth regime has the largest as well as the fastestgrowing accretion rate; it will separate from the particle mass distribution and ‘starve out’ its competitors. Even worse, if bodies in the runaway regime cannot accrete each other, and their swarms will therefore not selfinteract, then the swarm particle count cannot drop below N_{th} at all, and runaway bodies will never be resolved individually.
We therefore relinquish a core property of the RPMC method: we allow the number of RPs to grow. We impose that a swarm j that reaches the particle number threshold, N_{j} ↘ N_{th}, is split into N_{th} singleparticle swarms. The RP of a singleparticle swarm i with N_{i} = 1 represents only itself.
To accurately model a runaway process, the initial number of RPs n^{0} and the particle number threshold N_{th} must then be chosen large enough that swarms whose RPs surpass the critical mass are already resolved individually:
There is still the question of how to handle fractional particle counts. A manyparticles swarm k becomes a fewparticles swarm as the swarm particle count falls to or below the particle number threshold, N_{k} ≤ N_{th}. This can happen in two different ways: (1) RP k may grow by accumulating mass through a collision in the manyparticles regime where the swarm mass M_{k} is conserved, thereby decreasing the swarm particle count N_{k} = M_{k}/m_{k}; or (2) a fewparticles swarm may subtract mass from swarm k through a fewparticles interaction as per Eq. (38). In both cases, N_{k} may end up nonintegral. Some kind of compromise then has to be made when splitting up swarm k into individual RPs. Assuming nonintegral N_{k}, some possibilities are: (i) slightly adjusting the mass m_{k} before the splitting such that N_{k} becomes an integral number; (ii) splitting up the swarm to ⎾N_{k}⏋ particles, one of which is assigned a slightly lower mass (N_{k} − ⎿N_{k}⏌)m_{k}. In our implementation we chose the latter option. We argue that, if N_{th} is chosen large enough, this adjustment is of negligible significance. Furthermore, the boosting technique described in the next section will actually make the swarm particle count N_{k} an integral number in most situations where a regime transition would occur.
Fragmentation can be another cause of fractional particle counts. Consider a fewparticles swarm j, N_{j} ≤ N_{th} whose representative particle collides with a particle from some other swarm k, resulting in a distribution of fragment masses. As per Sect. 3.5, a new mass m′_{j} is sampled from the number density distribution of the resulting fragments, while swarm j absorbs the mass of the impacting particles from swarm k as per Eq. (38). The new particle number can then become nonintegral, for example if a cratering event leaves the new particle mass m′_{j} only slightly smaller than the combined mass, m′_{j} = (1 − ϵ)(m_{j} + m_{k}) with ϵ ≪ 1:
This incongruity points at a more severe problem: statistical treatment is not appropriate for fragmentation of individual bodies. As an example, consider the collision of protoEarth and a protoplanet called Theia, an event which is hypothesised to have formed the Moon by cratering. We assume that protoEarth is represented by RP j with N_{j} = 1. With the collision method discussed above, the mass of protoEarth and Theia would be combined into a new unified swarm, and the new mass of its representative body would be chosen from the distribution of the fragments as either cratered protoEarth (most likely), the Moon (with a chance of 1 : 82), or debris (unlikely). The collision would thus likely result in a swarm of 1.0123 Earthmass planet(s) or, on rare occasions, 82 Moons. This is clearly meaningless and wrong.
To ameliorate this we suggest, for noncoagulating collisions involving selfrepresenting RPs, to decree that every fragment whose mass exceeds a certain threshold m_{th} be represented individually, and therefore to add new RPs to the simulation as required. From the subm_{th} end of the fragment distribution one can then sample one or more statistically representative fragments as RPs. With this approach, the collision of protoEarth and Theia would result in two selfrepresenting RPs of Earth and Moon mass and one or more RPs representing a swarm of debris.
4.4 Boosting
Despite tracking only a small subset of the particles, the simulation still models every single interaction of the representative particles. For wide distributions of particle masses, this is a severe constraint on the efficiency of the simulation. For example, assume that a boulder with a mass of 1000 kg accretes pebbles of mass 1 g. To grow to twice its initial mass, the boulder must accrete 10^{6} pebbles, and each interaction is modelled as a separate RPMC event, which slows down the simulation tremendously. This problem was noted by Zsom & Dullemond (2008, Sect. 2.3), who proposed to group accumulation events of small particles together, effectively decreasing the collision rate while proportionally increasing the gain of mass during an individual event. We now introduce a formalised version of this boosting procedure.
4.4.1 Grouping interactions
To overcome the inefficiency inherent in a method that tracks every individual collision event, we want to group together similar events. To this end, we introduce a boost factor β_{jk} for interactions between RP j and swarm k. If the collision can be assumed to lead to ‘hitandstick’ coagulation, the collision rate from Eq. (36) is substituted with the boosted collision rate
where we defined the boosted swarm multiplicity factor
and the mass transferred to the RP during a collision then is
For interactions in the fewparticles regime, we again impose N_{j} ≤ N_{k} without loss of generality. During a boosted coagulation, mass is then transferred between swarms j and k as per
It is clear that
must hold, where β_{jk} = 1 neutralises the boosting effect. In the following we discuss further constraints for β_{jk}, and how a suitable boost factor can be determined while obeying these constraints.
4.4.2 Constraining the boost factor
Zsom & Dullemond (2008) propose to set a mass accumulation threshold q_{m} := δm/m and choose the boost factor such that each collision event grows the mass of the larger particle by at least q_{m}. For example, if we wanted to accept a relative mass gain of 5% during a single collision event, we would set q_{m} = 0.05 and choose an initial boost factor of
The boost factor β_{jk} is subject to additional constraints. If the collision operates in the manyparticles regime, growth by coagulation will decrease the number of particles in swarm j as the mass of its RP grows,
To avoid an abrupt ‘regime change’, we would like to avoid boosting collisions such that the swarm particle count falls below the particle number threshold, and hence we require N′_{j} ≥ N_{th}, which implies the constraint
For the fewparticles regime, we had imposed N_{j} ≤ N_{k} without loss of generality. This allowed us to pretend that all particles in a swarm j grow by accumulating one particle from swarm k each. When introducing a boost factor, we must ensure that the new swarm mass M′_{k} remains a nonnegative value; hence it follows from Eq. (44) that the boost factor must satisfy
An interaction in the fewparticles regime may change the mass of swarm k. If swarm k is a manyparticles swarm, the loss of mass might push its swarm particle count below N_{th}, which we again want to avoid:
However, we want to allow for the depletion of a manyparticles swarm, and thus apply this constraint only if the swarm would not be depleted because β_{jk} < N_{k}/N_{j}.
To summarise, closedform expressions for the boost factor can be given as
for interactions in the manyparticles regime, and as
for interactions in the fewparticles regime.
4.4.3 Boosting and fragmentation
The boost factor effectively groups particles together before having them accreted. Boosting is therefore not directly applicable if collisions lead to other outcomes, such as pulverisation or cratering. Depending on the sophistication of the fragmentation model used, a given collision may lead to either coagulation or fragmentation with a certain probability, or only a fraction of the combined mass is shattered to smaller pieces. For an overview of the findings of laboratory studies of dust coagulation and fragmentation, see Güttler et al. (2010), and Blum (2018).
A simple way of keeping the benefits of boosting while allowing for fragmentation is to exploit the probabilistic nature of the simulation. When simulating a collision, the new mass of the RP is chosen randomly by sampling the fragment mass distribution as explained in Sect. 3.5. If a fraction (1 − f_{frag}) of the total mass remains in one cohesive body, then the probability that this massive body is chosen as the RP is (1 − f_{frag}). Equivalently, one could say that the collision rate is decomposed in two contributions for fragmentation and for coagulation,
A boost factor is then applied only to the partial collision rate corresponding to coagulation:
Boosting decreases the effective collision rate, hence the likelihood f_{frag} that the RP is sampled from the mass distribution of fragments must increase:
Bukhari Syed et al. (2017) show that the transition from bouncing to fragmentation is smooth: the mass of the largest remaining fragment
which is equal to the mass of the heavier particle in the case of bouncing, , continuously decreases as the kinetic impact energy increases. If the swarm particle k is much lighter than RP j, m_{k} ≪ m_{j}, a large number of cratering impacts are required to effectuate a significant change in the mass of the largest fragment. Similar to coagulation events, we can also group cratering events together by generalising the initial boost factor (Eq. (46)) and boosted mass transfer (Eq. (43)) as
where the transferred mass is given as
5 Validating the statistical balance
The RPMC method is asymmetric at its core: when operating in the manyparticles regime, it is assumed that a RP always interacts with a nonRP, and hence only the properties of the RP are changed. This asymmetry seems natural in the atomistic picture illustrated in Sect. 2. However, in the RPswarm picture, correctness is not obvious. Consider the grainsandboulders example originally given in Zsom & Dullemond (2008, Sect. 3.1). We have an initial ensemble of particles with only two different masses: heavy bodies with mass , henceforth called boulders, and light particles with mass , from now on referred to as grains. We assume that mass is the only relevant particle property. We further assume that collisions are always of the hitandstick type, and that they happen only between boulders and grains. Because boulders do not collide with each other, their number cannot decrease. Likewise, because grains do not suffer mutual collisions, they cannot grow. Therefore, the number of boulders must remain invariant as they grow by accreting grains. In the RP–swarm picture, when a grainmass RP hits and accretes a boulder, its entire swarm is converted to boulders, causing a sudden increase in the number of boulders. Conversely, when a bouldermass RP j accretes a grain, its mass grows, m_{j} → m_{j} + m_{g}, but the mass of its swarm is not changed, M_{j} → M_{j}. Therefore, the number of boulders in the swarm N_{j} = M_{j}/m_{j} decreases, M_{j}/m_{j} → M_{j}/(m_{j} + m_{g}). This section will formally prove that, for the specific case of the grainsandboulders example, the two effects indeed cancel statistically.
5.1 Grains and boulders
We first define the collision rate as
where the mutual collision rate Λ(m, m′) is an arbitrary commutative function of the masses m, m′. Boulders can therefore collide with grains, but collisions among boulders and among grains are suppressed.
Let us now represent this system by a selection of n = n_{g} + n_{B} RPs, n_{g} of which represent the grains, and n_{B} represent the boulders. For notational convenience, we group the RP indices in two index sets 𝔗_{g} and 𝔗_{B} at any given time t, where 𝔗_{g} holds the indices of the grainmass RPs, and the indices in 𝔗_{B} refer to bouldermass RPs:
In the RPMC simulation, the total number of grains N_{g} and total number of boulders N_{B} are given by
With the mutual collision rate between a boulder j and a grain conveniently abbreviated as
we find the RP–swarm collision rates of Eq. (23) to be
and the cumulative collision rates of Eq. (24) to be
where i ∈ 𝔗_{g} and j ∈ 𝔗_{B}, and where we defined the RP–swarm collision rate for any grainmass RP with boulder swarm j, the cumulative grainmass RP collision rate , and the cumulative RP collision rate for bouldermass RP j. The boulder average 〈X〉_{B} of a particlespecific quantity X_{j} was defined as
A grainmass RP can collide with a boulder, but it thereby turns into a bouldermass RP which cannot collide with boulders anymore by choice of the collision rate function in Eq. (60). For a collision rate , the probability that a given grainmass RP has suffered a collision with a boulder during time ∆t is given by the exponential distribution
The expected number of grains at time t + ∆t is therefore
In the limit of ∆t → 0, this yields
which is equivalent to Eq. (B.5) of the analytical model.
The RPMC method keeps swarm masses constant, . In the swarms that already had boulder mass at time t, the number of boulders must therefore decrease over a duration ∆t:
Growth of boulders is a Poisson point process, so for a bouldermass RP j ∈ 𝔗_{B} with mass m_{j} at time t we have
and with M_{j} = N_{j}m_{j}, Eq. (70) evaluates to
At the same time, some of the grainmass swarms become bouldermass swarms after their RP hits a boulder and becomes part of it. The chance for a grainmass RP i ∈ 𝔗_{g} to collide with a boulder is again given by the exponential distribution given in Eq. (67),
Given that RP i undergoes a collision with a boulder, we now need to determine which boulder will be its collision partner. Because the collision rate is independent of the boulder mass, every boulder has the same chance of colliding with the given grainmass RP i. The chance P(j∣i) that RP i collides with a boulder from swarm j ∈ 𝔗_{B} is therefore given by the relative collision rate of a given grain with swarm j,
If RP i collides with a boulder from swarm j, its entire swarm is converted to a swarm of boulders. The swarm mass remains the same, , while the grain RP sticks to the boulder and thereby grows to the mass by time t + ∆t. The number of boulders in the new swarm will thus be
The total number of boulders in the swarms which were composed of grains at time t but have been converted to boulders by time t + ∆t is therefore
Noting that
we find
We then obtain the expected total number of boulders at time t + ∆t by summing up the two contributions in Eqs. (72) and (78). The contributions cancel exactly to first order in ∆t:
The RPMC method can thus be expected to keep the number of boulders N_{B} invariant,
This result demonstrates in the RP–swarm picture that the two paradoxical properties exactly cancel, and that the RPMC method is statistically correct. It is worth pointing out that not only the totality of dN_{B}/dt vanishes, but for every individual j ∈ 𝔗_{B}, the corresponding summands in Eqs. (72) and (78) cancel out. In other words, every swarm of boulders maintains its own statistical balance with the totality of grains.
5.2 Mutating swarm masses
As pointed out in Sect. 4.1, it is reasonable to have each RP i represent approximately the same relative mass M_{i} ≈ M/n, where M is the total mass in the system. But our formal definition of the RPMC method does not impose any assumptions on the values of swarm masses M_{i}. This is crucial because the extension of the method presented in Sect. 4.3 transfers mass between swarms. Specifically, in an interaction that operates in the fewparticles regime, a fewparticles swarm – usually a singleparticle swarm – may remove mass from a manyparticles swarm. In the following we argue that this does not affect the statistical balance of the method.
It should first be noted that the statistical balance only concerns manyparticles swarms, as only interactions between manyparticles swarms operate in the asymmetric manyparticles regime. To incorporate the effect of mass transfers, we admit that the mass M_{i} of any manyparticles swarm i may be changed by an external source or sink modelled as
where the continuous function f (m, t) is the expected relative inflow/outflow rate of particles of mass m at time t. We justify this approach by arguing that, if k is a manyparticles swarm and j is a singleparticle swarm, we have N_{j} = 1 and N_{k} > N_{th}, hence N_{j} ≪ N_{k}, and therefore the change N_{k} → N_{k} − 1 is small enough to be modelled as a continuous process.
With the source or sink model of Eq. (81), swarm mass at time t + ∆t can be expanded as
This leads to a firstorder deviation from N_{g} and N_{B} during time increment ∆t,
where the grain inflow rate f_{g} and the mean boulder inflow rate 〈f〉_{b} are defined as
Any influence of inflow or outflow on the mass transfer terms in Eqs. (72) and (76) is of second order in ∆t and hence does not upset the balance of Eq. (80). We also find
which is equivalent to the result found by adding the source or sink term to the analytical model discussed in Appendix B.
6 Numerical tests
To assess the correctness and applicability of the improved RPMC method, we run a series of numerical tests which can be grouped in three parts.
In Sect. 6.1, we first simulate two wellknown coagulation kernels for which analytical solutions are available, the constant and the linear kernel, leading to selfsimilar mass distributions. Equivalent tests were already conducted in Zsom & Dullemond (2008); we additionally verify that unequal samplings of the mass leave the result unimpaired as claimed in Sect. 4.1.
Second, in Sect. 6.2 we study the transition to the fewparticles regime and the simulation of runaway growth. A fullscale nonrepresentative Monte Carlo simulation of the constant kernel is compared to a simulation with the extended RPMC method. We also study runaway growth by simulating the product kernel, a test kernel with an analytical solution available which develops a very sharp runaway characteristic. Third, we test a runaway kernel that mimics gravitational focussing, finding no effect of the simulation parameters on the evolution of the mass distribution as long as the resolution criterion in Eq. (39) is met.
Third, in Sect. 6.3 we adopt two variants of the grainsandboulders example given by Zsom & Dullemond (2008) for which analytical predictions can be made. We study how well the RPMC method retains the statistical balance depending on the number of RPs used.
6.1 Selfsimilar solutions
The constant kernel
and the linear kernel
are wellknown special cases of the Smoluchowski equation Smoluchowski (1916)
an integrodifferential equation for the number density distribution f(m, t) which describes coagulation processes in a continuous mass distribution. The first term represents the loss of particles at mass m due to coagulation with other particles, while the second term describes the gain of particles by coagulation of masses m′ + (m − m′) = m.
The results for a standard equalmass sampling, M_{k} = M/n∀k, are shown in Fig. 2, finding very good agreement with the analytical solutions, which are summarised in Appendix A. The same tests were rerun with heterogeneous swarm mass samplings. For the constant kernel, swarm masses were drawn from the lognormal swarm number density distribution
where we identified σ ≡ µ/20 and chose µ such that the mean of the distribution is exp(µ + σ^{2}/2) = M/n. The samples M_{k} were normalised such that Σ_{k} M_{k} = M = 10^{20} held exactly. A logspace distribution was chosen because we wanted to assess the resiliency of the method not only to small variations in swarm mass; with the lognormal distribution given, swarm masses span several orders of magnitude. For the linear kernel test, we sampled the RP masses m_{k} from the initial particle number density distribution f(m, 0) in Eq. (A.2) and then chose a constant swarm particle count N_{k} ≡ 𝒩 ∀k such that Σ_{k} m_{k}N_{k} = 𝒩Σ_{k}m_{k} = M = 10^{18}: instead of an equalmass sampling, we chose an equalcount sampling of swarm masses. While this was done mainly for reasons of simplicity, the resulting swarm mass distribution also spans a few orders of magnitude. Both simulations yielded results not significantly different from those obtained with equalmass samplings, demonstrating that the RPMC method works well despite considerable variations of swarm masses. The results are not shown separately because they are visually indiscernible from those in Fig. 2.
Fig. 2 Analytical solutions and RPMC simulations for standard coagulation tests: (a) constant kernel λ(m, m′) = Λ = const; (b) linear kernel λ(m, m′) = Λ_{0} (m + m′). The RPMC simulation runs use n = 1024 RPs. 
6.2 Regime transition and runaway growth
To quantify how well the regime transition works, we study three scenarios. First, we again run a simulation with the constant kernel in Eq. (88), but we let it run until all mass is concentrated in a single particle. We note that the analytical solution in Eq. (A.1) is not applicable in this case because it is a solution to the continuous Smoluchowski equation (Eq. (90)), which approximates discrete coagulation processes only for particle number densities much greater than unity, f(m, t) ≫ 1. Instead, we run a fiducial simulation of n = 10^{8} RPs with swarm particle counts N_{k} = 1∀k – that is, every RP only represents itself – and compare this to a regular RPMC simulation with n = 1 024 RPs that transition into the fewparticles regime as they grow. As can be seen in Fig. 3, the two simulation runs are in good agreement.
Next, we study the product kernel
The coagulation process produces a runaway particle at dimensionless time η = 1. The analytical solution for the product kernel is discussed in Appendix A.
In Fig. 4 we compare the particle number density of the analytical solution in Eq. (A.9) to an RPMC simulation with n = 2048 RPs, finding good agreement even in the runaway regime at η > 1. The figure can be directly compared to Fig. 5 in Ormel & Spaans (2008). Figure 5, which is modelled after Fig. 2 in Wetherill (1990), shows the analytical prediction for the number of particles, the mass of small bodies, and the mass of the runaway particle. The runaway particle mass produced by the RPMC simulation is also shown and in good agreement with the prediction.
In our third test of runaway growth, we use a collision kernel inspired by gravitational focussing (cf. e.g. Armitage 2017, Sect. III.B.1):
with some constant coefficient Λ_{0} and a threshold mass m_{th}. The collision rate therefore scales with m^{2/3} for masses below mth and with m^{4/3} for masses above, which implies that particles above the threshold mass sweep up mass much more efficiently. As with any powerlaw collision rate, we expect the initial mass distribution to relax to a selfsimilar state while in the ∝ m^{2/3} regime. As the largest particles approach the threshold mass, we expect them to start ‘running away’, making the mass distribution bimodal. Finally, the most massive particle will end up accumulating all the remaining mass.
Results are given in Fig. 6, confirming our expectations: the initial distribution quickly becomes selfsimilar and stays that way until t ~ 30. After the highermass end of the distribution exceeds the threshold mass m_{th} = 10^{7}, a second peak starts to form, clearly visible at t = 50. At t = 60 the runaway particles are already depleting the tail end of the mass distribution, and almost all mass has concentrated in a single particle by t = 80. We find no notable sensitivity to the number of RPs as long as the resolution criterion Eq. (39) is satisfied.
Fig. 3 Transition into the fewparticles regime for the constant kernel Eq. (88): (a) Fiducial nonrepresentative Monte Carlo simulation with n = 10^{8} particles; (b) RPMC simulation with n = 1 024 RPs; (c) mean mass 〈m〉 and relative mass spread of both simulations in comparison. The discontinuities in (a) at t = 10^{2} are an aliasing effect caused by histogram binning. (a) and (b) show snapshots of a single simulation, while (c) shows results averaged over 10 runs, with statistical error bounds indicated the by dashed curves. 
Fig. 4 Particle mass density at different times for the product kernel coagulation test, compared to the analytical solution in Eq. (A.9) (solid curves). The RPMC simulation uses n = 2 048 particles and different values of the particle number threshold: (a) N_{th} = 1; (b) N_{th} = 10. The results shown were averaged over 5 runs, with a standard deviation indicated by the error bars. 
Fig. 5 Fraction of particle mass in the runaway particle for the product kernel coagulation test, compared to the analytical solution in Eq. (A.9) (solid curves). The RPMC simulation uses n = 2 048 particles and different values of the particle number threshold: (a) N_{th} = 1; (b) N_{th} = 10. The results shown were averaged over 5 runs, with a standard deviation indicated by the error bars. 
Fig. 6 RPMC simulation with the runaway collision rate in Eq. (93), a dimensionless threshold mass of m_{th} = 10^{7} (indicated by dotted red horizontal line in (b)), a total mass of M = 10^{10}, a homogeneous swarm mass distribution, a particle number threshold of N_{th} = 20, and n = 2500 RPs. Every individual swarm i has mass M_{i} = M/n; particles are resolved individually once they grow beyond the mass M/(nN_{th}) (indicated by the dashed orange horizontal line in (b)). Each panel shows a time series of histograms. The massweighted particle number density is shown on the vertical axis of (a), whereas in (b) the massweighted histogram bin counts are colourencoded on a logarithmic scale, and the mass of the runaway particle is shown separately (red curve). Histogram bin counts and mass of runaway particle are averaged over 10 runs, the error bounds of the runaway particle being indicated by red dashed curves. 
6.3 Grains and boulders
We simulate the grainsandboulders example introduced and analysed in Sect. 5 to validate the results and to confirm that statistical balance is indeed retained. We first consider a massindependent mutual collision rate,
with some constant Λ_{0}. An analytical prediction for the expected mean boulder mass 〈m〉_{B} and its standard deviation is developed in Appendix B. It is shown for different initial conditions in Fig. 7. The model was then simulated with the RPMC method for the same set of initial conditions; results are given in Fig. 8. We observe very good agreement with the analytical prediction.
Boulders remain boulders after a hitandstick collision with a grain. The number of boulders therefore cannot change, dN_{B}/dt = 0, and thus
When simulating an ensemble of grains and boulders with the RPMC method, the number of boulders is
If the simulation operates in the manyparticles regime, the swarm masses M_{j} do not change. Therefore, as bouldermass RPs grow, the number of boulders per swarm N_{j} = M_{j}/m_{j} decreases. At the same time, some grainmass RPs collide with a boulder and therefore become bouldermass RPs. Both effects contribute to the sum in Eq. (96), and as argued before we expect them to cancel exactly.
Figure 9a visualises this statistical balance. As can be seen, the two contributions operate on slightly different timescales. It is more likely for a boulder to hit a grain than for a grain to hit a boulder; but when a bouldermass RP accumulates a grain, it just grows a little and thus represents slightly fewer boulders than before, whereas a grainmass RP that hits a boulder will instantly convert its entire swarm to boulders, causing a surge in total boulder count. We can observe the relative smoothness of the particle number decay as well as the more erratic particle number growth in Fig. 9a.
We naturally expect the resolution of the simulation to improve if we add more RPs. The influence of the number of RPs on the accuracy of the result is studied in Fig. 9b for different RP counts. The statistical precision of the final boulder mass is estimated as the standard deviation of the final masses in repeated runs of the simulation and shown as error bars around the mean value. We find it to be insensitive to the particle mass ratio . However, the physical spread of the mass distribution, whose standard deviation is visualised as a filled area, does depend on the particle masses and the total masses as predicted by Eq. (B.17),
The larger the ratios and , the smaller the physical spread .
Next, we consider a mutual collision rate linearly correlated to mass,
with some constant Λ_{0}. Fig. 10 compares the analytical prediction of the average boulder mass 〈m〉_{B} and its standard deviation derived in Appendix B to an equivalent RPMC simulation, finding very good agreement.
Fig. 7 Analytical solution given in Eqs. (B.11), (B.15) to the grainsandboulders model introduced in Sect. 5 for different boulder–grain particle mass ratios: (a) total number of grains N_{g}; (b) expected boulder mass 〈m〉_{B}. The total mass ratio of grains and boulders is , so boulders are expected to grow to 7 times their initial mass, as indicated by the horizontal black line. The horizontal axes refer to dimensionless time . In (b), the filled area indicates the physical spread of the mass distribution given by Eq. (B.17). 
Fig. 8 RPMC simulation of the grainsandboulders model introduced in Sect. 5 for different boulder–grain particle mass ratios for comparison with Fig. 7: (a) total number of grains N_{g}; (b) average boulder mass 〈m〉_{B}. The simulation uses n = 1 792 RPs divided into n_{B} = 256 bouldermass RPs and n_{g} = 6n_{B} = 1 536 grainmass RPs with equalweight swarms, M_{i} = M_{j} ∀i, j ∈ {1, …, n}, and an initial number of grains. The boulder masses in (b) are averaged over 1 consecutive runs; the filled area indicates the physical spread of the mass distribution, and the error bars indicate the standard deviation of the mean boulder mass over 1 runs. The horizontal black line indicates the expected final boulder mass . 
Fig. 9 Further analysis of the RPMC simulation of the grainsandboulders model: (a) contributions to dN_{B}/dt: dN_{g→B}/dt (solid curve), −dN_{B→B}/dt (dashed curve); (b) statistics of final boulder mass as a function of RP count n: standard deviation of the mean over 10 runs (error bars) and physical spread (filled area). Simulation parameters as in Fig. 8. In (a), the full n = 1 792 RPs were used; in (b), the number of RPs was varied. n was divided into n_{B} bouldermass RPs and n_{g} = 6n_{B} grainmass RPs with equalweight swarms, M_{i} = M_{j}∀i, j ∈ {1, …, n}. The horizontal black line indicates the expected final boulder mass . 
Fig. 10 Average boulder mass 〈m〉_{B} and the physical spread for a massdependent collision rate Λ(m, m′) = Λ_{0} (m + m′) for different bouldergrain particle mass ratios: (a) analytical model as given by Eqs. (B.19), (B.21); (b) RPMC simulation. The simulation uses n = 1 792 RPs divided into n_{B} = 256 bouldermass RPs and n_{g} = 6n_{B} = 1 536 grainmass RPs with equalweight swarms, M_{i} = M_{j}∀i, j ∈ {1, …, n}. The boulder masses in (b) are averaged over 10 runs; the filled area indicates the physical spread of the mass distribution, and the error bars indicate the standard deviation of the mean boulder mass over 10 runs. The horizontal black line indicates the expected final boulder mass . The total mass ratio of grains and boulders is , so boulders are expected to grow to 7 times their initial mass, as indicated by the horizontal black line. The horizontal axes refer to dimensionless time . 
7 Discussion
The proposed extensions to the RPMC method are strictly additive in the sense that they do not change how the method operates in the manyparticles regime to which the original RPMC method was constrained. As such, the considerations of Zsom & Dullemond (2008, Sect. 3.2) with regard to the required number of representative particles continue to apply. In Sect. 3.3 of the same paper, two main limitations of the method are discussed. The first limitation – that swarms must consist of many particles – is overcome by our extension, but the other – that, as an explicit method, the RPMC method is illsuited for simulating ‘stiff’ problems in which two adverse effects nearly balance each other – remains and is in fact aggravated by our extension of the method. When following the traditional implementation strategy, the variability of the number of representative particles in the extended method adds substantially to the computational cost; however, in a companion paper, we devise a different implementation strategy which is much less sensitive to the number of RPs.
7.1 Limitations
The original RPMC method was not wellsuited to simulate systems in which two strong effects nearly balance each other, leading only to a small net change. As an example, Zsom & Dullemond (2008, Sect. 3.3) suggest a system in which coagulation and fragmentation processes lead to a ‘semisteady state’ in which particles grow and fragment many times, and explains that, as an explicit method, the RPMC method must simulate the individual growth and fragmentation processes, rendering the problem ‘stiff’. This limitation is not removed by our extension of the method, and can in fact be further aggravated by it: If a RP in such a system grows large enough that its swarm particle count falls below the particle number threshold N_{th}, it will be split up to individual RPs. If all of these RPs subsequently fragment to small particles which then recommence the growth cycle, the number of RPs will steadily grow while the particle number distribution remains stable. With an everincreasing number of RPs, the simulation will quickly exceed its computational resources.
The root of this problem is an asymmetry in our extension of the method: we split up swarms as their particle mass grows larger, but we never merge multiple swarms into one. Unlike splitting, which can be done without consequences for the statistical outcome of the simulation, merging usually entails a loss of information because the particle properties q_{j}, q_{k} of two swarms j, k to be merged will rarely be exactly identical, and hence have to be averaged over in some way. A merging step was omitted for simplicity, and because it was deemed unnecessary for our main goal in extending the method, which is to simulate planetesimal growth to and beyond the runaway growth regime. However, there is nothing fundamentally precluding a merging step in the simulation, and such an addition would indeed be necessary before the method can be used to simulate semisteady processes (which would however retain their stiffness even with merging). Merging was an integral part of the Monte Carlo method of Ormel & Spaans (2008); in this method, different ‘species’ (the semantic equivalent of swarms in the RPMC method) could be merged by averaging their properties. A concrete example of such a merging prescription is given in Krijt et al. (2015, Sect. 3.4). A similar approach could in principle be incorporated into the RPMC method.
7.2 Computational cost
It was mentioned that a growing number of RPs may significantly increase the computational cost of the simulation. In the traditional implementation described in Zsom & Dullemond (2008), n^{2} RP–swarm interaction rates must be computed, stored, and updated for n RPs. Each time a swarm is split up into N_{th} individual RPs, this cost increases accordingly. The companion paper (Beutel et al., in prep.) explores an alternative sampling method which avoids computing all n^{2} RP–swarm interaction rates. Using a bucketing scheme, this alternative sampling method makes the computational cost less sensitive to the number of RPs n and more dependent on the breadth of the occupied subspace of the parameter space ℚ, and thereby makes the improved RPMC method computationally viable.
7.3 Boosting
In Sect. 4.4, a boosting scheme was proposed to ensure efficient simulation of growth processes where large bodies accrete large amounts of small bodies. Although an intuitive justification for the approach was given, it remained unclear how it impacted the statistical result of the simulation. This impact can be studied by means of the grainsandboulders example:
In Sect. 6.3, we studied the influence of the bouldergrain mass ratio on the simulation outcome. We predicted (cf. Fig. 7) and confirmed numerically (cf. Fig. 8) a spread of the final boulder mass that was anticorrelated with the boulder–grain mass ratio . Now, a boost factor β_{jk} > 1 would be equivalent to accreting more massive grains with a lower collision rate, and the effect of boosting is therefore quantitatively similar to the spread of the final boulder mass distribution in Eq. (97). Although the spread of the mass distribution increases as the boulder–grain mass ratio is increased, it was found that the fidelity of the RPMC method, quantified by the uncertainty of the expected final boulder mass, did not depend on the mass ratio (cf. Fig. 8b). Analogously, boosting seems justifiable provided that the spread of the mass distribution incurred does not exceed the desired precision of the result. This requirement may be significant when dealing with sharply peaked distributions as in the grainsandboulders example, but it becomes much less restricting for ensembles with a broad mass distribution such as the standard coagulation tests (cf. Fig. 2) which were conducted with a mass accumulation threshold of q_{m} = 5%, leading to no significant deviation from analytical predictions.
7.4 Avoiding nonintegral particle numbers
As mentioned in Sect. 4.3, nonintegral swarm particle counts N_{k} pose a challenge when swarms become fewparticles swarms and are thus split up to individual RPs. We note that the boost factor implicitly helps mitigating this problem. A manyparticles swarm can become a fewparticles swarm in two scenarios: (1) by accumulating mass through a collision in the manyparticles regime, or (2) by losing mass through a collision in the fewparticles regime. Equation (48) constrains the boost factor such that, in scenario (1), the new swarm particle count does not fall below the particle number threshold, , thus obtaining a ‘precision landing’ of if permitted by the other constraints. For the fewparticles regime, by exhausting the constraint β_{jk} ≤ N_{k}/Nj given in Eq. (49) we can let swarm j deplete swarm k completely by choosing β_{jk} = N_{k}/N_{j}, again if permitted by the other constraints.
8 Summary and conclusions
We have presented a method for stochastic simulation of particle–particle interactions. The method builds upon the RPMC method introduced by Zsom & Dullemond (2008). The most severe limitation of the RPMC method was that, due to its asymmetric construction, the growth of a given representative particle was effectively constrained by the amount of mass it represented. Our extension of the method overcomes this growth barrier by allowing the coexistence of, and the transition between, representative particles and individual, selfrepresenting particles. With our method, it is now possible to accurately model runaway growth processes while largely retaining the structural advantages of the RPMC method such as the stability of the sampling weights.
In Sect. 2, we gave a formal definition of the RPMC method, which we subsequently extended in Sect. 4.3. In Sect. 5 we proved for a generalised variant of the grainsandboulders example given in Zsom & Dullemond (2008) that, despite the asymmetry in the modelling of interactions, the RPMC method is in fact statistically balanced and produces correct results. Finally, in Sect. 6 we conducted extensive numerical testing to validate that the extended RPMC method indeed produces plausible results for different cases of orderly growth and runaway growth.
In our extended version of the RPMC method, representative particles are added to the simulation as particles transition between the representative manyparticles regime and the individual singleparticle regime. We have thereby abandoned a key feature of the original RPMC method: a stable number of representative particles, which was useful because it allowed for predictable performance and simple implementation. In a traditional implementation of the RPMC method, adding new representative particles would introduce substantial computational cost. In the companion paper (Beutel et al., in prep.), we present an implementation strategy that not only alleviates the cost of adding nearidentical representative particles to the simulation but can also overcome the 𝒪(n^{2}) computational complexity and storage requirements of the traditional method.
With our extension, the RPMC method can be used to study growth processes over a vast dynamic range, extending into the oligarchic regime where individual bodies arise which cannot be treated stochastically. In this regard, it is similar to the Monte Carlo method of Ormel & Spaans (2008); in the fewparticles regime, our method indeed faces similar challenges. A possible advantage of our method over other approaches may be that its main benefit, constant resolution through the intrinsic preservation of an equalmass sampling, is retained. Interactions between manyparticle swarms keep swarm masses unchanged; and swarms can lose but not gain mass through interactions with fewparticles swarms, which implies that the resolution is never impaired.
With the proposed method, planetary growth processes can now be studied for the entire range of masses up to fullgrown planets. Regimes of runaway growth or oligarchic growth, where individual particles accumulate large fractions of the total mass, can be faithfully represented. In addition, the transition to individual particles gives rise to the possibility of embedding a representative particle Monte Carlo simulation in another type of simulation which traces particles individually, for example a gravitational Nbody simulation.
The original RPMC method has been used for coagulation problems involving more than just the mass as a particle property, for instance spatial resolution (Drążkowska et al. 2013, 2014) or additional intrinsic particle properties (e.g. Zsom et al. 2010; Krijt & Ciesla 2016; Krijt et al. 2016; Windmark et al. 2012). The added capability, discussed in this paper, of transiting to the fewparticles regime allows to address a variety of problems involving the growth from pebbles, via planetesimals, to planets; runaway and oligarchic growth, and the subsequent Nbody dynamics between the oligarchs, possibly driven by pebble accretion onto these oligarchs; or planetesimal and planet formation in dust rings while incorporating the Nbody dynamics of these objects, with the newly formed planets starting to stir the pebbles and planetesimals in the ring.
Acknowledgements
We would like to thank Robert Strzodka, Til Birnstiel, Sebastian Stammler, Joanna Drazkowska, Tommy Chi Ho Lau and Natalia Dzyurkevich for valuable feedback and useful discussions. We also thank the referee, Sebastiaan Krijt, for his insightful comments and questions which helped us to improve the paper significantly. This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC 2181/1390900948 (the Heidelberg STRUCTURES Excellence Cluster).
Appendix A Analytical solutions and equalweight samplings for standard coagulation tests
For the constant kernel Eq. (88) and initial conditions f(m, 0) = N^{0}δ_{D}(m − m^{0}), the Smoluchowski equation (90) has the analytical solution
with the total dimensionless mass M (e.g. Ohtsuki et al. 1990).
Sampling an initial set of RPs with equalweight swarms is easy because the definition of the number density distribution f(m, 0) implies that all RPs have the same mass initially. We simply select an ensemble of n RPs with homogeneous particle masses m^{0} and swarm masses M/n.
For the linear kernel Eq. (89) with an initial number density distribution
and the initial average mass , the Smoluchowski equation has the analytical solution
with the total dimensionless mass M, where I_{1} (·) is the modified Bessel function of the first kind (e.g. Ohtsuki et al. 1990).
Drawing equalweight samples for the nontrivial initial mass distribution of the linear kernel test is a bit more involved. Let us assume that some number density distribution f(m) is to be sampled by n RPs of masses m and equal weight
Because of the equalweight sampling, masses of RPs must instead be drawn from the massweighted density distribution
For the initial number density distribution of the linear kernel test (Eq. (A.2)), the massweighted density distribution is
We can then draw RP masses with inverse transform sampling by first computing the cumulative representative number density distribution
then inverting the relation ξ = F_{R}(m, 0),
where W_{k}(z) is the Lambert W function, and then drawing a uniform random number ξ ∈ [0, 1).
An analytical solution for the product kernel Eq. (92) was constructed by Trubnikov (1971) and made applicable for times η > 1 by Wetherill (1990). The particle number density is
where m^{0} is the initial particle mass, N^{0} is the initial number of particles of mass m^{0}, k ∈ ℕ is a dimensionless mass argument, and η is a dimensionless time coordinate.
As with the constant kernel test, all particles have the same mass initially, making the sampling of equalweight swarms trivial.
Wetherill (1990) explains that, although the equation solved by Trubnikov (1971) becomes inconsistent for η > 1 when a runaway particle has separated from the continuous distribution, the solution in Eq. (A.9) still correctly describes the small body mass distribution, that is, the mass distribution of all particles except for the runaway particle. Using conservation of mass, we can infer the mass of the runaway particle by direct summation of Eq. (A.9).
Appendix B Analytical model for grainsandboulders test
In this section we develop an analytical model for the grainsandboulders example used in Sect. 5. We derive a system of differential equations for the remaining number of grains N_{g}, the average boulder mass 〈m〉_{B} and the variance of the boulder mass distribution , and we give analytical expressions for the cases of constant and linear mutual collision rates, cf. Eqs. (94, 98).
For any given grainmass RP, the grain–boulder collision rate is
where we define the boulder average 〈X〉_{B} of a particlespecific quantity X_{j} as
(note that in Eq. (66) the boulder average was defined differently in the context of the RPMC method) and where the mutual collision rate of any grain with a boulder j was abbreviated as
The number of grains N_{g} decreases stochastically, so we can only reason about the expected value of N_{g}. At some time t, the expected value will change over a time increment ∆t with the grain–boulder collision rate given in Eq. (B.1),
where the time dependencies of quantities X are henceforth abbreviated as X ≡ X(t), X′ ≡ X(t + ∆t). We thus obtain the differential equation
The growth of a given boulder mass is described by a Poisson point process. The chance that boulder j accumulates k grains during a given time increment ∆t,
is given by the Poisson distribution with the probability mass function Pois_{k}(λ_{B,j}∆t) defined as
for which the expected value of k equals the variance,
For a time increment ∆t, the expected average value of is thus
By taking the difference quotients to the limit ∆t → 0, we find
where we again note that the number of boulders cannot change, N_{B} = const. By relating Eqs. (B.5) and (B.10), the average boulder mass 〈m〉_{B} can thus be expressed in terms of the number of grains N_{g} as
To find a differential equation for the variance of the boulder mass as a function of time, we evaluate the variance at times t and t + ∆t,
By subtracting Eq. (B.12) from Eq. (B.13) and taking the difference quotient for ∆t → 0, we obtain
where the first term describes the shot noise originating from the discrete nature of the collision events, and the second term represents the covariance of boulder masses and collision rates.
For a constant mutual collision rate, Λ(m, m′) = Λ_{0}, Eq. (B.5) is solved by exponential decay,
where is the initial number of grains and the grainboulder collision rate λ_{g} in Eq. (B.1) simplifies to
Because collision rates and boulder masses are uncorrelated, the variance is fully determined by shot noise, and we can directly integrate Eq. (B.14):
where we note that the initial variance is zero, , because all boulders have the same mass initially. In a homogeneous Poisson point process, the variance of the number of collisions equals the expected number of collisions. We see that this also holds true for our nonhomogeneous Poisson process if collision rates do not depend on particle masses.
Let us now study a linear mutual collision rate, Λ(m, m′) = Λ_{0} (m, m′) with some constant Λ_{0}. By evaluating Eq. (B.2), we find
Eq. (B.5) then has the solution
where we defined the abbreviations
Using Eq. (B.14) and identifying the variance as per Eq. (B.12), we find the variance of the average boulder mass governed by the differential equation
where the covariance term describes the broadening of the variance due to the mass dependency of collision rates: smaller boulders will grow more slowly than average, whereas larger boulders will sweep up grains faster.
Appendix C RPMC approximation and statistical balance
In Item 2 of Sect. 4.2, one of the problems with extending the RPMC method to fewparticles swarms was identified in the fact that the simplified RP–swarm collision rate of Eq. (23) overestimates the RP–swarm collision rate Eq. (19), which becomes significant for N_{k} ~ 1. An obvious way of attempting to mitigate this problem would be to use the RP–swarm collision rate instead of its approximation . But while this would lead to a statistically correct interaction rate, the resulting mass distribution would be skewed. In Fig. C.1, it can be seen that the approximation must be made for the RPMC method to be statistically balanced: without the approximation, the final boulder mass does not converge to the expected value of as the number of RPs n is increased.
Fig. C.1 As in Fig. 9b, but with an initial number of only grains: (a) using the approximate collision rates , and (b) using the true collision rates instead. As can be seen, using the true collision rates in the manyparticles regime would distort the statistical balance of the RPMC method. 
The statistical imbalance can be understood by considering the different origins of the N_{g} and N_{j} factors in Eqs. (78) and (72): some originate from collision rates and and would hence be adjusted by summands of −1/2 and −n_{g}/2, respectively, whereas others came from summation over all grainmass and bouldermass swarms and thus are not adjusted. The statistical balance equation (80) would then end up end up different from0 with a spurious dependence on the simulation parameters n_{g} and .
Appendix D List of symbols
A crossreferenced overview of the most commonly used symbols in this work is given in Table D.1.
List of commonly used symbols.
References
 Armitage, P. J. 2017, ArXiv eprints [arXiv:astroph/0701485] [Google Scholar]
 Birnstiel, T., Dullemond, C. P., & Brauer, F. 2010, A&A, 513, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Birnstiel, T., Fang, M., & Johansen, A. 2016, Space Sci. Rev., 205, 41 [Google Scholar]
 Bitsch, B., Lambrechts, M., & Johansen, A. 2015, A&A, 582, A112 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Blum, J. 2018, Space Sci. Rev., 214, 52 [NASA ADS] [CrossRef] [Google Scholar]
 Bukhari Syed, M., Blum, J., Wahlberg Jansson, K., & Johansen, A. 2017, ApJ, 834, 145 [NASA ADS] [CrossRef] [Google Scholar]
 Drążkowska, J., Windmark, F., & Dullemond, C. P. 2013, A&A, 556, A37 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Drążkowska, J., Windmark, F., & Dullemond, C. P. 2014, A&A, 567, A38 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Drążkowska, J., Li, S., Birnstiel, T., Stammler, S. M., & Li, H. 2019, ApJ, 885, 91 [Google Scholar]
 Dullemond, C. P., & Dominik, C. 2005, A&A, 434, 971 [CrossRef] [EDP Sciences] [Google Scholar]
 Emsenhuber, A., Mordasini, C., Burn, R., et al. 2021, A&A, 656, A69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Gillespie, D.T. 1975, J. Atmos. Sci., 32, 1977 [NASA ADS] [CrossRef] [Google Scholar]
 Greenberg, R., Hartmann, W. K., Chapman, C. R., & Wacker, J. F. 1978, Icarus, 35, 1 [NASA ADS] [CrossRef] [Google Scholar]
 Güttler, C., Blum, J., Zsom, A., Ormel, C. W., & Dullemond, C. P. 2010, A&A, 513, A56 [Google Scholar]
 Johansen, A., Oishi, J. S., Mac Low, M.M., et al. 2007, Nature, 448, 1022 [Google Scholar]
 Johansen, A., Youdin, A., & Mac Low, M.M. 2009, ApJ, 704, L75 [NASA ADS] [CrossRef] [Google Scholar]
 Kokubo, E., & Ida, S. 1998, Icarus, 131, 171 [Google Scholar]
 Krijt, S., & Ciesla, F. J. 2016, ApJ, 822, 111 [NASA ADS] [CrossRef] [Google Scholar]
 Krijt, S., Ormel, C. W., Dominik, C., & Tielens, A. G. G. M. 2015, A&A, 574, A83 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Krijt, S., Ciesla, F. J., & Bergin, E. A. 2016, ApJ, 833, 285 [Google Scholar]
 Lambrechts, M., & Johansen, A. 2012, A&A, 544, A32 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Ohtsuki, K., Nakagawa, Y., & Nakazawa, K. 1990, Icarus, 83, 205 [NASA ADS] [CrossRef] [Google Scholar]
 Okuzumi, S., Tanaka, H., & Sakagami, M.a. 2009, ApJ, 707, 1247 [NASA ADS] [CrossRef] [Google Scholar]
 Okuzumi, S., Tanaka, H., Kobayashi, H., & Wada, K. 2012, ApJ, 752, 106 [NASA ADS] [CrossRef] [Google Scholar]
 Ormel, C. W., & Klahr, H. H. 2010, A&A, 520, A43 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Ormel, C. W., & Spaans, M. 2008, ApJ, 684, 1291 [NASA ADS] [CrossRef] [Google Scholar]
 Ormel, C. W., Spaans, M., & Tielens, A. G. G. M. 2007, A&A, 461, 215 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Ormel, C. W., Dullemond, C. P., & Spaans, M. 2010, Icarus, 210, 507 [NASA ADS] [CrossRef] [Google Scholar]
 Ormel, C. W., Liu, B., & Schoonenberg, D. 2017, A&A, 604, A1 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Safronov, V. S. 1964, Probl. Cosmogeny, 6, 71 [Google Scholar]
 Schneider, A. D., & Bitsch, B. 2021, A&A, 654, A71 [Google Scholar]
 Smoluchowski, M. V. 1916, Phys. Zeitsch., 17, 557 [Google Scholar]
 Stammler, S. M., Birnstiel, T., Panic, O., Dullemond, C. P., & Dominik, C. 2017, A&A, 600, A140 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Tanaka, H., Himeno, Y., & Ida, S. 2005, ApJ, 625, 414 [NASA ADS] [CrossRef] [Google Scholar]
 Trubnikov, B. A. 1971, Sov. Phys. Dokl., 16, 124 [Google Scholar]
 Wahlberg Jansson, K., & Johansen, A. 2014, A&A, 570, A47 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Wahlberg Jansson, K., Johansen, A., Bukhari Syed, M., & Blum, J. 2017, ApJ, 835, 109 [NASA ADS] [CrossRef] [Google Scholar]
 Weidenschilling, S. J. 1980, Icarus, 44, 172 [NASA ADS] [CrossRef] [Google Scholar]
 Wetherill, G. W. 1990, Icarus, 88, 336 [NASA ADS] [CrossRef] [Google Scholar]
 Wetherill, G. W., & Stewart, G. R. 1993, Icarus, 106, 190 [Google Scholar]
 Windmark, F., Birnstiel, T., Güttler, C., et al. 2012, A&A, 540, A73 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Wurm, G., & Teiser, J. 2021, Nat. Rev. Phys., 3, 405 [CrossRef] [Google Scholar]
 Zsom, A., & Dullemond, C. P. 2008, A&A, 489, 931 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Zsom, A., Ormel, C. W., Güttler, C., Blum, J., & Dullemond, C. P. 2010, A&A, 513, A57 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Zsom, A., Ormel, C. W., Dullemond, C. P., & Henning, T. 2011a, A&A, 534, A73 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Zsom, A., Sándor, Z., & Dullemond, C. P. 2011b, A&A, 527, A10 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
All Tables
All Figures
Fig. 1 Indexing of the full ensemble of physical particles, with the representative particles ordered at the beginning of the sequence. The entirety of particles in a given ensemble is indexed from 1 to N. The first n indices refer to RPs, and the remaining indices belong to nonRPs. Because N may be enormously large, tracking the entirety of particles is not feasible. Only the n RPs are actually modelled on the computer, while the properties of the (N − n) nonRPs are estimated from the known properties of the n representative ones. 

In the text 
Fig. 2 Analytical solutions and RPMC simulations for standard coagulation tests: (a) constant kernel λ(m, m′) = Λ = const; (b) linear kernel λ(m, m′) = Λ_{0} (m + m′). The RPMC simulation runs use n = 1024 RPs. 

In the text 
Fig. 3 Transition into the fewparticles regime for the constant kernel Eq. (88): (a) Fiducial nonrepresentative Monte Carlo simulation with n = 10^{8} particles; (b) RPMC simulation with n = 1 024 RPs; (c) mean mass 〈m〉 and relative mass spread of both simulations in comparison. The discontinuities in (a) at t = 10^{2} are an aliasing effect caused by histogram binning. (a) and (b) show snapshots of a single simulation, while (c) shows results averaged over 10 runs, with statistical error bounds indicated the by dashed curves. 

In the text 
Fig. 4 Particle mass density at different times for the product kernel coagulation test, compared to the analytical solution in Eq. (A.9) (solid curves). The RPMC simulation uses n = 2 048 particles and different values of the particle number threshold: (a) N_{th} = 1; (b) N_{th} = 10. The results shown were averaged over 5 runs, with a standard deviation indicated by the error bars. 

In the text 
Fig. 5 Fraction of particle mass in the runaway particle for the product kernel coagulation test, compared to the analytical solution in Eq. (A.9) (solid curves). The RPMC simulation uses n = 2 048 particles and different values of the particle number threshold: (a) N_{th} = 1; (b) N_{th} = 10. The results shown were averaged over 5 runs, with a standard deviation indicated by the error bars. 

In the text 
Fig. 6 RPMC simulation with the runaway collision rate in Eq. (93), a dimensionless threshold mass of m_{th} = 10^{7} (indicated by dotted red horizontal line in (b)), a total mass of M = 10^{10}, a homogeneous swarm mass distribution, a particle number threshold of N_{th} = 20, and n = 2500 RPs. Every individual swarm i has mass M_{i} = M/n; particles are resolved individually once they grow beyond the mass M/(nN_{th}) (indicated by the dashed orange horizontal line in (b)). Each panel shows a time series of histograms. The massweighted particle number density is shown on the vertical axis of (a), whereas in (b) the massweighted histogram bin counts are colourencoded on a logarithmic scale, and the mass of the runaway particle is shown separately (red curve). Histogram bin counts and mass of runaway particle are averaged over 10 runs, the error bounds of the runaway particle being indicated by red dashed curves. 

In the text 
Fig. 7 Analytical solution given in Eqs. (B.11), (B.15) to the grainsandboulders model introduced in Sect. 5 for different boulder–grain particle mass ratios: (a) total number of grains N_{g}; (b) expected boulder mass 〈m〉_{B}. The total mass ratio of grains and boulders is , so boulders are expected to grow to 7 times their initial mass, as indicated by the horizontal black line. The horizontal axes refer to dimensionless time . In (b), the filled area indicates the physical spread of the mass distribution given by Eq. (B.17). 

In the text 
Fig. 8 RPMC simulation of the grainsandboulders model introduced in Sect. 5 for different boulder–grain particle mass ratios for comparison with Fig. 7: (a) total number of grains N_{g}; (b) average boulder mass 〈m〉_{B}. The simulation uses n = 1 792 RPs divided into n_{B} = 256 bouldermass RPs and n_{g} = 6n_{B} = 1 536 grainmass RPs with equalweight swarms, M_{i} = M_{j} ∀i, j ∈ {1, …, n}, and an initial number of grains. The boulder masses in (b) are averaged over 1 consecutive runs; the filled area indicates the physical spread of the mass distribution, and the error bars indicate the standard deviation of the mean boulder mass over 1 runs. The horizontal black line indicates the expected final boulder mass . 

In the text 
Fig. 9 Further analysis of the RPMC simulation of the grainsandboulders model: (a) contributions to dN_{B}/dt: dN_{g→B}/dt (solid curve), −dN_{B→B}/dt (dashed curve); (b) statistics of final boulder mass as a function of RP count n: standard deviation of the mean over 10 runs (error bars) and physical spread (filled area). Simulation parameters as in Fig. 8. In (a), the full n = 1 792 RPs were used; in (b), the number of RPs was varied. n was divided into n_{B} bouldermass RPs and n_{g} = 6n_{B} grainmass RPs with equalweight swarms, M_{i} = M_{j}∀i, j ∈ {1, …, n}. The horizontal black line indicates the expected final boulder mass . 

In the text 
Fig. 10 Average boulder mass 〈m〉_{B} and the physical spread for a massdependent collision rate Λ(m, m′) = Λ_{0} (m + m′) for different bouldergrain particle mass ratios: (a) analytical model as given by Eqs. (B.19), (B.21); (b) RPMC simulation. The simulation uses n = 1 792 RPs divided into n_{B} = 256 bouldermass RPs and n_{g} = 6n_{B} = 1 536 grainmass RPs with equalweight swarms, M_{i} = M_{j}∀i, j ∈ {1, …, n}. The boulder masses in (b) are averaged over 10 runs; the filled area indicates the physical spread of the mass distribution, and the error bars indicate the standard deviation of the mean boulder mass over 10 runs. The horizontal black line indicates the expected final boulder mass . The total mass ratio of grains and boulders is , so boulders are expected to grow to 7 times their initial mass, as indicated by the horizontal black line. The horizontal axes refer to dimensionless time . 

In the text 
Fig. C.1 As in Fig. 9b, but with an initial number of only grains: (a) using the approximate collision rates , and (b) using the true collision rates instead. As can be seen, using the true collision rates in the manyparticles regime would distort the statistical balance of the RPMC method. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.