The Seven Sisters DANCe III: Projected spatial distribution

Methods. We compute Bayesian evidences and Bayes Factors for a set of variations of the classical radial models by King (1962), Elson et al. (1987) and Lauer et al. (1995). The variations incorporate different degrees of model freedom and complexity, amongst which we include biaxial (elliptical) symmetry, and luminosity segregation. As a by-product of the model comparison, we obtain posterior distributions and maximum a posteriori estimates for each set of model parameters. Results. We find that the model comparison results depend on the spatial extent of the region used for the analysis. For a circle of 11.5 parsecs around the cluster centre (the most homogeneous and complete region), we find no compelling reason to abandon Kings model, although the Generalised King model, introduced in this work, has slightly better fitting properties. Furthermore, we find strong evidence against radially symmetric models when compared to the elliptic extensions. Finally, we find that including mass segregation in the form of luminosity segregation in the J band, is strongly supported in all our models. Conclusions. We have put the question of the projected spatial distribution of the Pleiades cluster on a solid probabilistic framework, and inferred its properties using the most exhaustive and least contaminated list of Pleiades candidate members available to date. Our results suggest however that this sample may still lack about 20% of the expected number of cluster members. Therefore, this study should be revised when the completeness and homogeneity of the data can be extended beyond the 11.5 parsecs limit. Such study will allow a more precise determination of the Pleiades spatial distribution, its tidal radius, ellipticity, number of objects and total mass.


Introduction
The projected spatial distribution (PSD), also known as number surface density, of a stellar cluster is the two dimensional (2D) projection, in the plane of the sky, of its three dimensional (3D) space distribution. Because celestial coordinates are far more easily measured than parallaxes (at least before Gaia), only a small fraction of the objects with stellar positions have distance estimates. Furthermore, the relative uncertainties in the celestial coordinates yield far more precise measurements (by a factor of 10 4 ) of distances perpendicular to the line of sight than those achieved by parallaxes along this line so far (except perhaps for very close objects). This explains why most of the previous works devoted to studying the spatial distribution of stars in clusters have been done using the PSD.
In the case of the Pleiades, cross-matching the Hipparcos catalogue (Perryman et al. 1997) with the 2109 candidate mem-bers of Bouy et al. (2015), shows that only 70 of them have parallax measurements. This figure has roughly doubled with the first Gaia data release DR1 (Gaia Collaboration et al. 2016), and is expected to improve based on the longer time baselines and hence more accurate measurements of subsequent Gaia releases. In preparation for the analysis of these upcoming data sets and to narrow down the set of models that will be tested in the context of 3D studies, we have initiated a re-examination of the current analytical alternatives to describe the PSD of the Pleiades cluster. Table 1: Survey, and derived core and tidal radius for recent studies in the literature.
The study of the spatial distribution also has implications that go beyond its intrinsic interest. One of them is the existence of mass segregation as a result of star formation and dynamical interactions in the cluster. This effect has been predicted by numerical simulations of the internal cluster dynamics; see, for example, Terlevich (1987); Kroupa et al. (2001); Moraux et al. (2004); Converse & Stahler (2010). Confirming and quantifying its dependance on various parameters (e.g. initial mass function, core mass function, total mass of the cluster, presence or absence or massive stars, T-or OB-association) shall provide important input to the models and simulations of star formation and dynamical evolution.
In the specific case of the Pleiades, mass segregation has been reported in the works of Raboud & Mermilliod (1998); Pinfield et al. (1998); Kroupa et al. (2001); Adams et al. (2001); Moraux et al. (2004); Converse & Stahler (2008). Yet, Loktin (2006), using radial and tangential velocity dispersions, found no hint of mass segregation in a sample of 340 stars contained in the central 2.3 • . However, his results may arise from the low number and extent of his sample. All the mentioned works performed their analyses by binning the stellar samples in mass or distance ranges. It is well known however that fitting a function to a binned data set can introduce biases (Bevington & Robinson 2003;Nousek & Shue 1989), and that modifying the bin width could improve the fitting to a preferred model (Towers 2012). Thus, the use of bins in previous works and the contradictory mass-segregation results found by Loktin (2006) may suggest that the hypothesis of mass segregation in the Pleiades requires a more solid reexamination.
In this work we aim at addressing this hypothesis on the basis of the largest and least contaminated sample of Pleiades candidate members found to date: the combined list of candidate members from Bouy et al. (2015) and Olivares et al. (2017). We avoid the binning biases by using Bayesian inference methods applied to continuous and thus non-binned distributions. In addition, these Bayesian methods allow a quantitative comparison of the competing models, including those with and without mass segregation. This will allow us to establish on firm grounds the analytical expression of the Pleiades PSD, and its potential dependence on stellar mass.
In Section 2 we briefly describe the data set that forms the basis of our analysis. In Section 3 we present the set of radially symmetric analytical models we used, as well as their extension to biaxially symmetric (elliptical) profiles. We also include a luminosity dependence of the core radius (as a proxy to the investigation of mass segregation). In Section 4 we describe the foundations of model selection in the Bayesian framework. We then discuss and compare the results that we obtain for the posterior distributions of the various models in Section 5, where we also briefly describe our estimates on the total mass and number of members in the cluster. Finally, in Section 6 we summarise the conclusions drawn from the study.

The data sample
The data set used to compare the models in Section 3 corresponds to the high-membership-probability candidate members of Olivares et al. (2017), in the middle and faint luminosity end, with the addition of the Tycho-2 Pleiades high-luminosity candidate members from Bouy et al. (2015). This joint data set comprises the equatorial coordinates R.A. and Dec. (in the following α and δ), proper motions, photometry, and membership probabilities of 2060 sources. In this analysis we work only with the positions, membership probabilities, and J photometric band. The latter is the reddest most available band for this list of members, and is used as a proxy for the mass and to explore evidence of mass segregation.

Completeness of the sample
To properly establish the probabilistic framework, it is necessary to take into account the observational constraints of the data. The Pleiades DANCe catalogue is constrained by its sky coverage and the different degrees of completeness (see Bouy et al. 2013Bouy et al. , 2015. Although the data set extends up to a radius of 6.5 • , Bouy et al. (2015) conservatively assume that the census is homogeneous in coverage and limiting magnitude only in the central 3 • radius area.
Here, we estimate the completeness of the whole of the joint Tycho+DANCe survey in terms of the J band luminosity and spatial coverage, which also applies to our list of candidate members. In Fig. 1 we show the distributions of the number of sources in the combined DANCe+Tycho catalogue as a function of the radial position for different limiting magnitudes and bins in the J band. The radial position is computed assuming a distance of 134.4 pc to the Pleiades cluster (Galli et al. 2017) and a centre at α, δ = [56.65, 24.13]. As can be seen from the top panel of this Figure, the DANCe+Tycho catalogue is spatially complete until a radial distance of 11.5 pc (∼ 5 • ). The latter corresponds roughly to the sky coverage of the UKIDSS survey (Lawrence et al. 2007). Above this limit, the density of sources drops with two different slopes. The first one is created by the sawtooth pattern at the edge of the DANCe survey, while the last one corresponds to the more extended selection box used for the Tycho survey. To evaluate the photometric completeness, we assume that the distribution of sources in the sky region of the Pleiades is uniform (this simplistic assumption is sufficient for our current purpose). We compare the radial density of sources of different J magnitude bins with that of a synthetic sample uniformly distributed in space and truncated at the completeness radius of 11.5 pc. The radial distribution of this synthetic sample and those of the three magnitude bins are shown in the bottom panel of Fig.  1. As can be seen from the latter, the joint Tycho+DANCe survey is expected to be complete until magnitude ∼ 19 in the J band. Above this limit, the distribution of sources departs significantly from the expected one. Hence, we restrict our list of candidate members to those with: i) J band observed and less than 19 mag., and ii) radial distances less than 11.5 pc. This results in 1954 candidate members, which represents more than 50% more candidate members than those of Converse & Stahler (2010), who did the latest analysis of the Pleiades PSD. Accounting for completeness and the previous truncation in the data set is essential to avoid possible bias in the inferred parameters (see Appendix A). Nevertheless, we remind the reader that the inhomogeneities (e.g. spatial resolutions, gaps in luminosity) of the DANCe+Tycho data set are so complex (and some of them only partially understood) that they can indeed bias the sample of candidate members in unknown ways. For example, the gap in luminosity coverage between the faint end of the Tycho-2 catalogue and the bright end of the DANCe survey (see Fig. 8 of Bouy et al. 2015) may result in undetected sources, therefore unmeasured proper motions and, finally, an incomplete list of candidate members. Another important constraint is the number of cluster stars observed within the survey area coverage. Truncating the probability distributions properly accounts for the cluster members Article number, page 3 of 39 A&A proofs: manuscript no. 31996 left outside the truncation radius. However, due to the several artefacts surrounding the images of bright sources (e.g. halos, spikes, saturation), potential cluster members could also remain undetected. Furthermore, these artefacts can severely bias any evidence of mass segregation, as the most massive and brightest stars are located at the centre of the cluster. However, the statistical treatment of the impact of these artefacts lays beyond the scope of this work.
The information provided by the observational constraints, which we call I, consists of the maximum radius, R max = 11.5pc, and the number of stars observed within this radius, N = 1954. These constraints will be incorporated into the model through the likelihood.

Contamination
Olivares et al. (2017) estimate a contamination rate of 4.3±0.2% in their sample of candidate members at the probability threshold of p 84% 2 > 0.84. This would amount to 84 of their 1963 candidate members. Also, Sarro et al. (2014) estimate that the contamination rate of their methodology is 11.0 ± 2.0% for a probability threshold of p = 0.5, similar to that used by Bouy et al. (2015) to classify the candidate members of their Tycho+DANCe data set. Thus, in our combined Tycho+DANCe list of candidate members, we acknowledge a mean contamination rate of ∼ 8% (approx. 156 objects). We expect these contaminating sources to be uniformly distributed in right ascension and declination because the position on the sky was explicitly removed from the calculation of membership probabilities. Nevertheless, there may be a mild positive gradient of the density towards the Galactic centre. In addition, these contaminants may not be uniformly distributed in J band, with possible concentrations around 14 and 17 mag, where the entanglement of field and cluster populations is higher. The quantification of this possible dependency of contaminants with photometric magnitude and its consequences lay beyond the objective of this work and will be analysed in future studies.

Spherical models
In this Section we consider spherically symmetric models of the spatial distribution of Pleiades members. In the following paragraphs we give a brief description of each model, its analytical parameterisation, and the corresponding references.
Our starting point is the classical King's profile. Although it was introduced as an empirical law to describe the number surface density of globular clusters, it has also been used to describe open clusters (see Alonso-Santiago et al. 2017;Panwar et al. 2017, for recent applications), globular clusters (Myeong et al. 2017) and even to study galaxies (Robotham et al. 2017), halo substructure (Sohn et al. 2007) and the dark matter distribution (Jiang & van den Bosch 2016). The analytical description of the surface number density of stars n is given by where r c , the core radius, is a scale factor, r t is the tidal radius, and k is a constant related (but not equal) to the central surface density. In the following we use R instead of r (as is often commonly done in the literature) to refer to the distance from the system centre projected on the celestial sphere.
In addition to the classical King's profile we have tested two extensions of it. We define the Generalised King's profile (hereafter GKing) as the classical King's profile without fixing the exponents of the analytical expression. Instead of Eq. 1, we have where the classical King's profile is recovered for α = 0.5 and β = 2. To the best of our knowledge, only in the work of Robotham et al. (2017) has a similarly modified King's profile been used. However, the profile used by those authors is more restrictive than the one presented here, requiring that β = α −1 , and that both terms (R/r c ), and (r t /r c ) are at the power of 2.
The optimised generalised King's profile (hereafter OGKing) is the GKing profile with the values of α and β fixed at the maximum-a-posteriori (MAP) values of the GKing parameters. This maximises the Bayesian evidence and reduces the dimensionality of the parameter space.
To avoid the use of a tidal radius in the radial profile, we have also considered the model proposed by Elson et al. (1987), henceforth EFF, to describe young open clusters in the Large Magellanic Cloud. Their surface density (in star counts per solid angle) is given by with r c the core radius, and γ the slope of the profile at radii much larger than the core radius. Finally, we analyse a more general parameterisation introduced in Lauer et al. (1995), Byun et al. (1996) and Zhao (1997), where the projected mass density is given as Equation 4 represents a double power law, with r c being the so-called core or break radius, γ and β the exponents of the inner and outer regions, respectively, α the width of the transition region, and k a scale constant. Meaningful values of these parameters fulfil the following conditions: α > 0 and 0 ≤ γ ≤ β. The aforementioned works assume this functional form for the projected surface brightness, the projected mass density ρ, and the volume density v, although the latter two are related by integration: where z is the distance along the line of sight.
In this work we use the same analytical expression as in Eq. 4 but for the number density n(R). We call this model the generalised density profile 3 (hereafter GDP), as it comprises many simpler models, each of which corresponding to particular choices of the model parameters. Several density profiles proposed to describe galaxies can indeed be grouped by parameter values. For example, α = 1 includes models by Navarro et al. (1997), Hernquist (1990), Jaffe (1983), and Moore et al. (1999). Similarly, α = 1/2, γ = 0 includes the models by Plummer (1911) (with β = 5), by Sackett & Sparke (1990) and by de Zeeuw (1985). The EFF model corresponds also to α = 1/2, γ = 0. King's profile, however, cannot be cast into this general model unless the tidal radius r t is fixed at infinity.
For our spatial analysis, we also considered the restricted generalised profile (RGDP), corresponding to the generalised profile with the value γ fixed at 0.
We note that we have used similar names for parameters r c and γ in all the aforementioned formulations. However, these parameters do not share the same meaning amongst models. The latter is distinctively specified by each model relation M.
In all cases, the R coordinate is defined with respect to the cluster centre. The actual values of R then depend on the choice of this origin (see Sect. 3.2).

Central symmetry constraint
In the above, we have defined six models: King, GKing, OGKing, EFF, GDP and RGDP. Each of them has a different set of parameters. For example, the King's model depends on two parameters (r c and r t ), the EFF model depends on two other parameters (r c and γ), and the generalised profile GDP depends on four parameters (α, β, γ and r c ).
In reality, there are always two more parameters that do not appear explicitly in any of the above analytical formulations of the number density profiles. These are the cluster centre coordinates from which all radial distances R are measured. It is not a minor question because the problem is degenerate, and there is a maximum likelihood solution for each choice of the cluster centre. In principle, one could even choose a poor cluster centre estimate that renders the angular distribution of members asymmetric, and obtain a maximum likelihood fit better than those obtained with a better centred estimate. The models assume central symmetry, but this can only be ensured approximately. There is a region of non-negligible extent, where the cluster centre may be, and any particular choice of its position will influence the posterior distribution inferred. Thus, in order to propagate appropriately this uncertainty about the cluster centre position in our posterior inferences, we have included the two cluster centre coordinates, α c and δ c , as further parameters of our models (their allowed intervals will be described in Sect. 4.3).
For any given choice of the central coordinates, we calculate the radial distance, R, and the position angle θ of each star in our data set. To avoid biases introduced by projection effects of objects located far from the cluster centre, we project each object's coordinates into the plane of the sky along the line-of-sight vector (see for example, Eq. 1 of van de Ven et al. 2006).
The requirement of central symmetry is enforced by the inclusion of a multiplicative term in the likelihood. For a given set of parameter values of α c , δ c , we divide the computed polar angles of individual stars, θ, into four symmetric quadrants (divisions at [0, π/2, π, 3π/2]) and require that the number of stars in each quadrant be Poisson distributed with a mean rate given by N q = N tot /4. Under this model, the likelihood of any given proposal for the model parameters (α c , δ c ) will be where N i , i = 1, 2, 3, 4 is the number of sources in each quadrant, and P(N i |N q ) is the Poisson distribution with mean rate N tot /4 evaluated at N i .

Elliptical models
In this Section we extend the aforementioned spherical models to allow for deviations from radial symmetry. We do this by allowing variations of the radial profile that depend on the angular coordinate but still maintain biaxial symmetry. This can be done in many ways. In this work we focus on the simplest one: the analytical expression of the radial profile is maintained along any radial direction but the profile parameters (e.g. r c and r t in the King profile) have an ellipse-like dependence on the angular coordinate.
This requires the definition of a coordinate system centred at the cluster centre (parameters α c and δ c ), and potentially rotated from the RA-Dec system of axes. Thus, we further include the angle φ between the principal axes of the ellipse and RA-Dec system as a parameter of these models. The coordinatesx and y of Eq. 6 are rotated by angle φ to obtain coordinates x and y. Then, R and θ are computed from the latter by means of Eq. 7.
The radially symmetric parameters of the previous Section have now an angular dependency, which is now expressed by means of the characteristic radii at the semi-major and semiminors axes (denoted by subscripts a and b, respectively). These new radii are expressed as where θ is the position angle measured from the semi-major axis, and r a and r b are the parameters representing the characteristic radius at the semi-major and -minor axis, respectively.
We illustrate this new biaxial dependency in the King's profile. The surface number density is now where r c and r t are obtained from Eq. 9. Explicitly they are, r t (θ) = r ta · r tb (r ta sin(θ)) 2 + r tb cos(θ)) 2 , where r ca and r ta are the core and tidal semi-major axis of the ellipse, and r cb and r tb correspond to the semi-minor axis. We highlight that we do not constrain the two ellipses to have the same aspect ratio, but they are co-aligned. For the other model, the surface densities are similarly obtained. We do not incorporate any angle dependence for the exponents α, β or γ.
The position angle of the semi-major axis with respect to the Right Ascension axis (φ) is constrained using the equivalent of the radial symmetry likelihood term, except that now the position angle has its origin at the semi-major axis.

Segregated models
Finally, in this Section we introduce another set of profiles to revisit the problem of mass segregation in the context of the Pleiades.
We consider the previous biaxially symmetric models to which we add a dependence of the core radius with the J magnitude. We select the J magnitude because it is the reddest of the magnitudes that are available for all candidate members. We assume that stars of the same mass have approximately the same magnitude and that distance differences (due to the 3D spatial extent of the Pleiades) average out. The core radius dependence with the J magnitude is modelled as where J mode is the mode of the J band distribution. The slope of the relationship, κ, is independent of the angle θ. Therefore, for J = J mode = 13.6 the model reduces to the elliptic profile described in Section 5.2. A positive value of κ corresponds to smaller values of the core radius for stars brighter than J mode = 13.6; in other words, it describes a system where the more massive stars are more concentrated than the less massive ones.

Bayesian analysis
As mentioned in Section 2, our data set may be contaminated. Thus, in an effort to minimise the possible impact that these contaminants may have on our inference, we also model their spatial distribution. Hence, our model of the spatial distribution of stars not only includes the model of the Pleiades cluster, but also a field component which is modelled by a uniform spatial distribution U within the maximum radius R max .
The measured properties of each of star in our data set can be assumed to be unaffected by the measured properties of any other star in the data set (this assumption is called statistical independence). Under this assumption, the probability that the data set was generated by the mixture of cluster and field is the product of the probabilities that each of the stars was generated by this mixture.
Allowing D = {d i , π i } N i to denote our data set, with d comprising the sky coordinates and J magnitude, and π the cluster membership probability of each object, the probability or likelihood of the data set D, given the model M, constraints I, and parameters q, is The probability p(d|q, M, I) depends on the profile under consideration and is described in the following Section.

Probabilistic framework
To avoid the use of bins and to properly infer the parameters of the models presented in Section 3, we need to convert the projected stellar densities into probability density functions that describe the probability of finding a star between R and R + dR, under the assumption of spherical symmetry. The probability density function p(R) is constructed from the definition: where N is the total number of stars in the system. This probability is renormalised to integrate to unity at the truncation radius R max , which in our data set corresponds to 11.5 pc. Thus, All the probabilities rendered by our set of models are renormalised according to the previous equation. However, in the following and for the sake of simplicity, we only present the nontruncated probabilities. Applying Eq. 15 to the spherically symmetric King's profile, we obtain Actually, in probabilistic inference we write this probability function as: where we have defined a new constant, k 1 = k·2π N , and made explicit the dependence of the probability on the underlying analytical expression (M 1 ), the constraints I, and the values of the parameter set (k 1 , r c and r t ). In practice, k 1 is treated as a normalisation constant (to enforce unit integral) and there is no need to know the total number of stars in the system.
For the generalised King's profile, this becomes Likewise, the expression for the EFF model is And finally, the GDP model is given by with γ = 0 for the RGDP model.
For the elliptical and luminosity segregated density profiles, the likelihoods are obtained similarly by adding φ and replacing r c and r t by r ca , r cb and r ta , r tb in the model parameters, and introducing the dependence on θ and J in the relations. For example, the likelihood of the biaxial King's profile is p(R, θ|φ, r ca , r cb , r ta , r tb , k 5 , I,

Model selection
In this Section we aim at comparing the aforementioned analytical parametrizations of the projected stellar densities in the light of the currently available data. In order to do so, we use the Bayesian evidence, also known as marginal likelihood. In the following, we use evidence (and its plural evidences 4 ) to refer to the Bayesian evidence. The evidence is the key for model comparison in the Bayesian framework. In this framework, the model comparison is done on the basis of the model posterior probability p(M|D). This is the probability of model M given the collected data D. In our opinion, this is the most natural way to compare and select (if needed) models in the scientific context. The posterior probability can be expressed as using Bayes' theorem. The ratio of posterior probabilities can then be expressed as If there is no difference in the prior probabilities for models i and j, then the posterior ratio is equal to the marginal likelihood ratio (also known as Bayes Factor), where the marginal likelihood (i.e. the evidence) is the full likelihood marginalised over the model parameters q, as follows It is important to remark that the Bayesian model comparison naturally incorporates a preference towards the less complex models if they are equally supported by the data. In fact, the preference is towards models with less effective parameters (understood as parameters that the data can constrain).
The computation of the posterior probability distributions and the evidence of each model is carried out in practice using the Nested Sampling (Skilling 2006) algorithm as implemented in PyMultiNest (Buchner et al. 2014).

Priors
In the spherical models, we have assumed exponential priors, with a scale value of 1, and truncated at 100, for all exponent parameters α, β, γ, normal priors for the central coordinates (with mean at [56.65 • , 24.13 • ] and standard deviation of one degree), and Half-Cauchy priors for radial parameters (with scale parameter at 10 pc). These priors fall in the category of weakly informative ones (see Gelman 2006).
In the biaxially symmetric models, we use the same priors as for the radially symmetric ones but we restrict the semi-major axes of the core and tidal radii to be larger than, or at least equal to, their corresponding semi-minor axis. We also include a uniform prior for the angle φ ∈ [−π/2, π/2].
In the luminosity segregated models, in addition to the previous priors, we use a normal, N(0, 0.5), as a prior for κ, which represents our prior beliefs of almost negligible luminosity segregation.
The code to perform the analysis of the present work, together with the data set described in Section 2, is available at https://github.com/olivares-j/PyAspidistra

Results and Discussion
We apply the Bayesian formalism described in Section 4 to the data set detailed in Section 2. Thus, for each of our models we obtain the posterior distribution of its parameters, together with its evidence. Appendix B contains the details of the inferred posterior distributions, figures of the fitted densities and marginal distributions, together with the uncertainties of the parameters in each analysed model. Table 2 summarises the evidences and Bayes Factors resulting from all our models and their extensions. In the following we use these evidences to discuss the model comparison.
The boundaries for decision making from Bayes Factors should be set ab initio. We mostly discuss our results following the classical scale by Jeffreys (1961). In this scale, the strength of the evidence 5 is said to be: Inconclusive if the Bayes Factor is 3:1, weak if it is ∼ 3:1, moderate if it is ∼12:1, and strong if it is 150 :1. Nevertheless, we hope that our conclusions can be shared by the reader independently of the scale used to categorise the Bayes Factors.

Models with radial symmetry
The upper-left panel of Table 2 summarises the evidences and Bayes Factors obtained from our radially symmetric models. In addition, Table 3 shows the Maximum-a-Posteriori (MAP) estimate of each parameter in the radially symmetric models (uncertainties are shown in the Appendix B in the form of covariance matrices).
We observe that the evidences cluster in two groups. On one hand there is the family of King's models, where the evidence to compare between them is inconclusive and weak in favour of OGKing over GKing. On the other hand there are the EFF, GDP, and RGDP, where there is weak evidence supporting EFF over GDP and RGDP. There is inconclusive evidence supporting RGDP over GDP.
In addition, we observe that in GDP and RGDP, parameters r c and β show large correlations (0.85 and 0.92 for GDP and RGDP, respectively) and are relatively unconstrained with large uncertainties; see Appendix B. Despite this fact, the models still provide evidences comparable to those of the other models, suggesting that these two parameters, although necessary for the model, are unconstrained by the data, and therefore not penalised by the evidence. Aiming at eliminating this source of degeneracy, we tested models in which one of these two parameters was removed. However, the fits and evidence resulting from them were poorer than that of the RGDP. Thus, we consider these parameters as necessary for this model.
We find that the introduction of more flexibility in the analytical expressions of the classical radially symmetric profiles does not provide an increased amount of evidence, and results, in some cases, in unconstrained parameters and a loss of the interpretability associated to the original formulations. Therefore, the competing models are within the King's family, with insufficient evidence to select amongst them. Only additional, perfectly acceptable prejudices like physical interpretability or the ability to compare with previous results can be invoked to choose one (e.g. King's profile) over the rest.
The Bayes Factors seem to indicate (with inconclusive evidence however) that the best model is the OGKing. However, the fact that this profile has a larger evidence than any of the remaining models should come as no surprise since it results from fixing the values of α and β of the GKing model to their MAP values.
Comparing the rest of the models, we see that the poorest model is GDP with moderate evidence against it. The best models are again in King's family, followed by EFF and RGDP.
The conclusion from the comparison of these radially symmetric profiles is that i) there is no compelling reason to abandon the widely used King profile, and ii) there are slightly better models, but we lack evidence to prove if they truly represent a requirement to make the King's profile more flexible to accommodate the data.

Biaxially symmetric models
The central panel of Table 2 contains the logarithm of the evidences and Bayes Factors of the biaxially symmetric models. The evidences follows a pattern similar to that observed for the radially symmetric models, with the exception of those that are against the GDP model. We can conclude that there is strong evidence for the family of King's models and against the GDP one. The evidence is still moderate and too weak to compare the rest of the models.
Additionally, we compute a posteriori (from the MCMC chains) the ellipticities 6 rc and rt , which are defined as, with the latter available only for the King's family of models. Table 4 shows the MAP estimate for the parameters in the models of this Section, together with the mode of the distributions of ellipticities. Uncertainties for the latter are given in Appendix B.
We can observe that models with no tidal radius have similar rc ellipticities with a mean value of 0.23 ± 0.01. This value is similar to the 0.17 found by Raboud & Mermilliod (1998), who use a multicomponent analysis to derive the directions (although its value is not given) and the aspect ratio of the ellipse's axes. However, it is very interesting to see that the models within King's family result in lower values of the ellipticity in the central region and larger values in the outer one. This result is ex-  levich 1987). By comparing the evidences of the biaxially symmetric models to those of the radially symmetric ones, we can conclude that in all cases there is strong evidence in favour of the biaxial models.

Models with luminosity segregation
The lower-right panel of Table 2 summarises the evidences and Bayes Factors of models with luminosity segregation. Also, Table 5 shows the MAP of the inferred distributions for this set of models, together with the derived ellipticities.
We observe that the ellipticities follow the same pattern as those of the previous Section. This is expected because we explicitly model the luminosity segregation as independent of the position angle.
The luminosity segregation inferred here is non negligible with κ in the range 0.1 to 0.25 pc mag −1 , thus indicating that it is indeed an important parameter. However, in all the models, the marginal posterior distribution of κ does not discard the zero value (see the marginal posterior of κ in Appendix B).
The evidences provided by the models with luminosity segregation follow a similar pattern as those from radial symmetry. However, in this case the best model is the classical King's, which shows only moderate evidence against the EFF, RGDP, and GDP models. The evidence of King's model over GKing and OGKing is weak.
The evidences provided by the luminosity segregated models lead to them being strongly favoured over the radially and biaxially symmetric ones in all cases. We can conclude that, despite having a small value of κ, the luminosity segregation is an important parameter regardless of the model used.

Total mass and number of members
In this Section we use the inferred values of the parameters in King's family of models to derive simple estimates of the total number of members and mass of the Pleiades cluster.
For each model and extension within the King's family, we estimate the total number of cluster members by integrating the surface density profile until the tidal radii inferred for the model. This is done for each set of parameters returned by PyMultiNest. The resulting distributions of the total numbers fore each model and extension in the King's familly are shown in Fig. 2. Additionally, Table 6 shows the mode of these distributions. As can be seen from this Table,  The abbreviations Ctr, Ell, and Seg stand for the radial and biaxial symmetric models, and those with luminosity segregation, respectively.
We also estimated the total mass of the cluster using the posterior samples of the parameters returned by PyMultiNest. To gain an estimate of the total mass we use the tidal force resulting from the interaction of the self-gravitating cluster with the galactic potential. A detailed derivation of the Jacobi radius under the Hill's approximation can be found at p. 681 of Binney & Tremaine (2008). Following the mentioned authors, the Jacobi radius is given by, where G is the gravitational constant, m the total mass of the cluster, and Ω 0 the circular frequency of the cluster around the galactic centre, which can be expressed in terms of the Oort's constants A 0 and B 0 as Ω 0 = A 0 − B 0 . In the following, we assume an over-simplistic correspondence between the tidal radius of the King's family of models and Jacobi radius. Binney & Tremaine (2008, p. 677) provide a detailed list of reasons why this correspondence is only approximate. Thus, using the Oort's constant values given by Bovy (2017, A = 15.3 ± 0.4 kms −1 kpc −1 and B = −11.9 ± 0.4 kms −1 kpc −1 ), we can derive an estimate of the total mass of the cluster for each inferred value of the tidal radius.  Figure 3 shows the distributions of the total mass derived from the posterior distributions of the parameters of the King's family of models with biaxial symmetry and luminosity segregation (the distributions of total mass resulting from the radial and biaxial models are shown in Appendix B). As a summary, Table  7 shows the mode of each of these total mass distributions.
As can be seen from this Figure and Table, inferring the total mass by means of the poorly constrained tidal radius leads to large uncertainties and probably biased estimators. This effect has already been observed by Raboud & Mermilliod (1998), who derived a total mass of 4000 M with a confidence interval ranging from 1600 M to 8000 M . These values are in good agreement with the ones reported in Table 7 and observed in Fig.  3.
Given the large ellipticity of the cluster, we also investigated the possibility of deriving the total mass by means of the tidal elongation effect. However, the values determined are even more poorly constrained than those determined using Eq. 26.
The results of this Section show that: i) there is still a large fraction (up to 20%) of cluster members that lay beyond the spatial coverage of our data set, and, ii) although poorly unconstrained, the distributions of the total mass of the cluster seem to suggest that it is highly unlikely that the total mass of the cluster lays below the 1000 M limit, as commonly stated in the literature. However, the large and unconstrained mass distribution could also be an artefact resulting from: i) the poor correspondence between the Jacobi radius and the tidal radius, ii) the poorly constrained values of the tidal radius, and iii) dynamical effects not taken into account to derive Eq. 26 (e.g. the cluster is not a point mass but a self gravitating and rotating system).

Conclusions
In this work we have formulated the existing radially symmetric alternatives for the spatial distribution of stars in open clusters in a probabilistic framework. The set of distributions reviewed include i) the classical King's profile with two variants put forward by us, ii) the EFF model, and, iii) a general profile inspired by galactic profiles together with a more restricted version of it. We have used Bayesian techniques to both obtain posterior probability distributions for the parameters, and evidences for each model. With them, we compare and select the best model, given the data (and its possible biases). Furthermore, we have computed Bayes Factors for all pairwise model comparisons. Due to high correlations among their r c and β parameters, the GDP and RGDP models loose their physical interpretability. The result of the comparison amongst models with radial symmetry is that the King's family of models is only mildly superior, with weak and moderate evidence, to those models without the tidal radius parameter. Furthermore, we have analysed biaxially symmetric extensions of our set of models. The results indicate that deviations from spherical symmetry have strong evidence when compared to the more simple radially symmetric models. Additionally, the distribution of ellipticities derived from the EFF, GDP, and RGDP models peak at 0.22 ± 0.01, which is similar to the value of 0.17 found by (Raboud & Mermilliod 1998). Within the King's family, the models return ellipticities that are small (mean rc = 0.07 ± 0.02) and large (mean rt = 0.44 ± 0.14) in the inner and outer parts of the cluster, respectively. This effect is expected from the dynamical interaction of the cluster with the galactic potential, and is also predicted by numerical simulations. We use Bayesian model selection with Bayes Factors to analyse mass segregation. We prefer to remain in the domain of direct observables and study potential differences in the parameters of the spatial distribution as a function not of mass, but of the apparent J-band magnitude. The Bayes Factors show strong evidence in favour of the luminosity segregated models, and against the simpler biaxially symmetric ones. We interpret this result as strong evidence for mass segregation.
The above conclusions heavily depend on the sample of Pleiades members selected for the analysis. In our probabilistic analysis we took into account the possibility that our sample is contaminated, but a J-band-dependent contamination rate (J-band contamination gradient) could mimic a mass segregation such as the one observed here. In addition, the halos and artefacts in the images of the central and bright stars can induce a spatial incompleteness that could also artificially enhance the slope of the luminosity segregation. Thus, our results must be taken with care. In the near future, we expect to conduct similar studies given the more homogenous and well characterised data sets (e.g. new releases of Gaia's data).
Although the the GKing and OGKing models introduced here have greater evidences and fitting properties than the classical King's profile, there is no strong evidence supporting an abandonment of the latter. Nevertheless, the GKing profile is a good alternative to the King's classical profile and should be compared with it in light of new and more complete data sets.
From the model selection process, we can conclude that the classical King's profile extended to include biaxial symmetry and mass/luminosity segregation should be the starting point in future analyses of the spatial distribution of open clusters.
Finally, we use the posterior distributions of the parameters in King's model family to obtain rough estimates of the total mass and number of systems in the cluster. We observe that even the largest census of candidate members (Bouy et al. 2015;Olivares et al. 2017) may lack up to 20% of the predicted number of stellar systems. The probability distribution function of the cluster total mass, which is determined using approximations of the tidal force exerted by the galactic and cluster potentials, reveals that it is highly unlikely that the true cluster total mass lays below the 1000 M limit.
The results of this work suggest that, although the Pleiades cluster is one of the most studied in the literature, the daughters of Atlas still keep many of their secrets within the oceans of the sky; probably awaiting the arrival of the final Gaia's data.

Appendix A: Effects of truncation on King's profile.
Statistical truncation occurs when an unknown number of sources lay beyond a threshold value. This threshold value can originate in the measuring process or in the post-processing of the data. The resulting data set does not contain any information about objects beyond the threshold.
Performing inference on truncated data can bias the recovered parameters if the truncation mechanism is not included in the analysis. Nevertheless, bias can still appear if poor statistics are used to summarise the results. Practically speaking, if the truncation is too restrictive it could also lead to bias due to a reduced sample size. To estimate the impact of these effects, we generated synthetic data sets from the King's profile, at true values of r c = 2.0 pc and r t = 20.0 pc, and infer the parameters under different sample sizes (1000,2000, and 3000 objects) and truncation radii (5,10,15,20 pc). We repeat each estimation ten times to account for randomness in the sample. Figure A.1 shows the posterior distributions inferred at each sample size and truncation radius. As can be seen, accounting for truncation results in posterior distribution that correctly recovers the true parameter values. However, due to the large asymmetry in the posterior distributions of the tidal radius at the lower truncation radius (5 pc), the Maximum A Posteriori (MAP) statistic can be severely biased. Figure A.2 shows the mean relative error of this statistic as a function of the truncation radius. As can be seen, the larger biases appear at the extreme case where the truncation radius is only one fourth of the true tidal radius. We note that although the MAP estimates of each of the ten realisations are biased, estimates are made in a similar way above and below the true value; except at the truncation radius of 5 pc, where they slightly over estimate the value. Also, the MAP is unbiased above truncation radii of half the tidal radius, in spite of the number of stars (at least for the tested values).
This example shows that the inference of the parameters in the King's profile can be biased even after truncation has been accounted for. In particular, the tidal radius can be severely affected by truncation radius below one half of the tidal radius. Since this phenomenon is observed under the weakly informative priors used (half-cauchy centred at zero and scale parameter of 100), this effect can be generalised to any maximumlikelihood estimator, the χ 2 statistic particularly.
Finally, as can be seen in Fig. A.3, inferring King's profile parameters without properly accounting for truncation leads to even larger biases.

Appendix B: Posterior distributions
This Appendix contains the details of the inference performed for each of the models and extensions presented in Section 3. It is structured in the same way as that Section. It starts with the radial models, then continues with the biaxial extensions, and finishes with the luminosity segregated ones. For each extension we give: i) the uncertainties of the MAP for each model, and, ii) The MAP uncertainties and correlations are summarised by covariance matrices. These are computed using the 68.2% of samples from the MCMC that were the closest to the MAP value.  Mean relative error (r c and r t , respectively) of the MAP statistic inferred from ten random realisations of different sample sizes (line styles) and truncation radii (colours). The uncertainties correspond to the standard deviation of the ten inferred MAPs.

Posterior distributions
They represent the 2σ uncertainties and correlations of the parameters at the vicinity of the MAP. For the biaxial and luminosity segregated models we also give the ellipticity distributions computed a posteriori from the core and tidal (when available) semi-major and semi-minor axes resulting from the PyMultiNest samples.
Finally, this Appendix also contains the distributions of the total mass of the cluster derived from the radial and biaxial models in the King's family.
0.001 0.010 -0.000 -0.000 r c [pc] -0.001 -0.000 0.054 -0.100 r t [pc] -0.003 -0.000 -0.100 1.951                        Fig. B.16: Inferred density of the luminosity segregated models. The data are binned in three bins of the J band: J < 12, 12 J 15, and 15 < J (with colours green, cyan and magenta, respectively). The MAP is shown by means of three coloured solid lines, the colours correspond to those of the J band bins. In these MAPs, the core radius is increased accordingly to Eq. 13 using the mean value of the J band in each bin. Also shown are 100 samples from the posterior distribution (grey lines). Olivares, J. et al.: The seven sisters DANCe III.