Two empirical regimes of the planetary mass-radius relation

Dolev Bashi; Ravit Helled; Shay Zucker; Christoph Mordasini

doi:10.1051/0004-6361/201629922

Open Access

Issue		A&A Volume 604, August 2017


Article Number		A83
Number of page(s)		5
Section		Planets and planetary systems
DOI		https://doi.org/10.1051/0004-6361/201629922
Published online		11 August 2017

A&A 604, A83 (2017)


Dolev Bashi¹, Ravit Helled¹,², Shay Zucker¹ and Christoph Mordasini³

¹ School of Geosciences, Tel-Aviv University, Tel-Aviv, Israel
² Center for Theoretical Astrophysics & Cosmology, Institute for Computational Science, University of Zurich, 8057 Zürich, Switzerland
³ Physics Institute, University of Bern, 3012 Bern, Switzerland

Received: 18 October 2016
Accepted: 25 January 2017

Abstract

Today, with the large number of detected exoplanets and improved measurements, we can reach the next step of planetary characterization. Classifying different populations of planets is not only important for our understanding of the demographics of various planetary types in the galaxy, but also for our understanding of planet formation. We explore the nature of two regimes in the planetary mass-radius (M-R) relation. We suggest that the transition between the two regimes of “small” and “large” planets occurs at a mass of 124 ± 7 M_⊕ and a radius of 12.1 ± 0.5 R_⊕. Furthermore, the M-R relation is R ∝ M^{0.55 ± 0.02} and R ∝ M^{0.01 ± 0.02} for small and large planets, respectively. We suggest that the location of the breakpoint is linked to the onset of electron degeneracy in hydrogen, and therefore to the planetary bulk composition. Specifically, it is the characteristic minimal mass of a planet that consists of mostly hydrogen and helium, and therefore its M-R relation is determined by the equation of state of these materials. We compare the M-R relation from observational data with the relation derived by population synthesis calculations and show that there is a good qualitative agreement between the two samples.

Key words: planets and satellites: composition / planets and satellites: fundamental parameters / planets and satellites: general

© ESO, 2017

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Exoplanet studies have now reached the level at which planet characterization is possible. There are hundreds of planets with measured masses and radii. Knowledge of these two physical properties provides valuable clues about the planetary composition through the mass-radius (M-R) relationship. Traditionally, planets have been divided into two main groups. The first includes the massive gas-dominated planets, while the second consists of the small terrestrial planets (e.g., Weidenschilling 1977). In part, this division is inspired by the solar system, where massive planets are composed of volatile materials (e.g., Jupiter), while the terrestrial planets are small and consist of refractory materials. However, the diversity in masses and radii of exoplanets¹ has taught us that this separation is somewhat arbitrary and may be overly simplistic (see review by Baraffe et al. 2014, and references therein).

While the first detected exoplanets had relatively high masses and radii, the number of small exoplanets has increased dramatically in recent years through improvements in technology and detections from space (e.g., CoRoT, Baglin et al. 2006, and Kepler, Borucki et al. 2010). Of course, since most exoplanets have been detected via radial velocity measurements or transits, there is a difference when defining a “small planet” by mass or by radius. In terms of mass, it is customary to define small planets as planets with masses lower than ~30 M_⊕ (Mayor et al. 2011; Howard et al. 2010), while in terms of radius, small exoplanets are often those with radii smaller than 4 R_⊕ (e.g., Marcy et al. 2014; Weiss & Marcy 2014). These divisions are partially based on the behavior of the planetary mass function of exoplanets.

Previous studies that examined the M-R relation have suggested a transition between small planets (Neptune-like) and large planets (Jovian). Based on visual estimates of the M-R and mass-density relations, Weiss et al. (2013) suggested that the transition point occurs at a mass of ~150 M_⊕. The derived slopes of the M-R relations in the different regimes were found to be R ∝ M^0.54 for M_p < 150 M_⊕ and R ∝ M^-0.039 for massive planets (M_p > 150 M_⊕). Hatzes & Rauer (2015) analyzed changes in the slope of the mass-density relation. Using a similar slope criterion, they located the breakpoint at a mass of ~0.3 M_J ≃ 95 M_⊕. In a recent study, Chen & Kipping (2017) presented a detailed forecasting model built upon a probabilistic M-R relation using a Markov chain Monte Carlo method. According to their classification, the transition between small and large planets occurs at 0.41 ± 0.07 M_J ≃ 130 ± 22 M_⊕, corresponding to the transition between Neptunians and Jovians, with slopes of R ∝ M^0.59 and R ∝ M^-0.04 for the low- and high-mass planets, respectively. Interestingly, although the studies do not agree exactly on the transition mass between the two regimes, they do agree that it is significantly higher than the traditional cutoff at 20−30 M_⊕. This essentially suggests that the change in the occurrence rate seen in the mass function of exoplanets (at ~30 M_⊕), that is to say, in the frequency of planets, is not the same as the change in the behavior of the M-R relation, which is linked to the planetary composition.

In this paper we present the results of a study we performed in order to empirically characterize the transition point between small and large planets based on their M-R relation. On the one hand, our aim was to perform a straightforward quantitative study that determines simple numerical information: the two M-R power-law indices and the transition mass. On the other hand, we opted for a kind of least-squares fit, and not an elaborate probabilistic recipe. Our hope was that this would allow a more intuitive yet rigorous characterization of the planetary M-R relation. Finally, we also compare the exoplanet population to formation models and find a good qualitative agreement.

2. Sample

The data we use include only planets with masses and radii that are based directly on observations, as opposed to being inferred from planetary physics models. Our sample consists of 274 exoplanets queried from http://exoplanets.org in March 2016. The planet with the lowest mass in our sample is Kepler-138b, with a mass of 0.0667 ± 0.0604 M_⊕; this is well below the mass of Earth. The object with the highest mass in our sample is CoRoT-3b, with a mass of 6945 ± 315 M_⊕ (= 21.85 ± 0.99 M_J); it is a brown dwarf. Every planet in our sample must have a measured mass and radius, together with their uncertainties. We therefore exclude planets with reported masses that are estimated based on a theoretical M-R relation. All planets in the sample are transiting planets, whose masses have been measured either by RV (238 through RV follow-up, and 9 that were first detected by RV), or using transit-timing variations (TTVs, e.g., Nesvorný & Morbidelli 2008; 27 planets). It should be noted that almost all the TTV planets are of low mass. The top panel of Fig. 1 shows the resulting M-R diagram.

Fig. 1

Top: M-R relation of the exoplanets considered in the analysis. The dashed lines identify the three different regimes we consider for the weighting (see text). Bottom: the M-R relation and the derived best-fit curves, and M-R relations.

3. Analysis

The model we assume is that of two mass regimes that differ by the M-R power law. In the log-log plane, this translates into a continuous piecewise linear function, with two segments that we had to fit to the data points. In spite of our ambition to apply the most basic techniques of simple regression to perform this fit, several problems conspire to turn this into a somewhat more complicated problem.

First, the two variables – the planetary mass and radius – are both measured with non-negligible errors. If we could assume that only one of them (e.g., the mass) had errors, the problem could have been treated as a standard regression problem. Because uncertainties exist for both variables, we face the field of errors-in-variables (EIV) problems, which are surprisingly more difficult than standard regression problems (e.g., Durbin 1954), and there is not one agreed approach to analyze them.

As difficult as EIV problems are, in our case the complexity is further exacerbated by the fact that we aim to fit not a linear function, but a continuous piecewise-linear function, rendering futile any hope to solve the problem analytically. Even under the assumptions of standard regression, where the so-called explanatory variable has no uncertainty, the problem (dubbed “segmented regression”) is not trivial (e.g., Hinkley 1969).

Another difficulty arises because of the nature of our specific sample. A glance at the top panel of Fig. 1 reveals that the data points are not scattered evenly across the logarithmic mass range. The points corresponding to the smaller planets seem to be much more sparse than those of the Jovian planets. The same is true for the very large planets, with masses of a few Jupiter masses. There seem to be three mass intervals with varying density of sample points. The origin of this differentiation lies beyond the scope of this study, and in any case, it may very well be a combination of observational bias and astrophysical processes of formation and evolution. The smaller number of massive planets (above a Jupiter mass) is a result of the low occurrence rate of such planets (e.g., Cumming et al. 2008), while the clustering of small planets is probably a result of the massive efforts to detect low-mass planets and their high occurrence rate (e.g., Howard et al. 2012).

There are various reasons for this sampling variability, ranging from observational biases to physical effects related to planet formation and evolution. However, the fact remains that for the purposes of regression analysis, the mass affects the sampling. In regression theory this amounts to endogenous sampling. While fitting a simple straight line might be affected only slightly by this imbalanced sampling, it is not guaranteed for a piecewise-linear function. Any fitting procedure should take this imbalance into consideration.

To streamline the discussion, we denote

$$ x = \log M_\mathrm{p} , $$ (1)

$$ y = \log R_\mathrm{p} . $$ (2)

The choice of logarithm base is irrelevant as long as it is consistent throughout the calculation. In the end, the values in linear scale are important, not the values in logarithmic scale. Now our sample, in the log-log plane, consists of a set of ordered pairs (x_i,y_i). We furthermore denote by Δx_i and Δy_i the corresponding logarithmic uncertainties derived from the uncertainties in the linear scale using the standard transformation. In cases where the transformation led to asymmetric uncertainties, we still assigned symmetric errors by taking the more conservative (larger error) of the two error estimates.
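The conversion of a linear-scale measurement into log space can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function name `log_with_error`, the choice of base 10, and the first-order fallback for error bars that reach zero are all assumptions made here for the example.

```python
import math


def log_with_error(value, err):
    """Return (log10(value), symmetric log-space error).

    The upper and lower error bars are mapped to log space separately,
    and the larger (more conservative) of the two is kept.
    """
    if value <= 0 or err < 0:
        raise ValueError("value must be positive and err non-negative")
    x = math.log10(value)
    upper = math.log10(value + err) - x
    if value - err > 0:
        lower = x - math.log10(value - err)
    else:
        # Assumption: when the lower bound is not positive its logarithm
        # is undefined, so fall back to first-order error propagation.
        lower = err / (value * math.log(10))
    return x, max(upper, lower)


# Example with the lowest-mass planet quoted in Sect. 2 (Kepler-138b).
x, dx = log_with_error(0.0667, 0.0604)
```

For a measurement with a large relative error, such as the one in the example, the lower error bar dominates in log space, which is why the larger of the two is the conservative choice.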

In our quest for the best-fit piecewise-linear function, we chose what is probably the most intuitive approach to EIV: a total least-squares approach (TLS, e.g., Markovsky et al. 2010). Similarly to standard regression, in TLS the problem is represented as a minimization problem of a sum of squares. Each data point contributes to the total sum-of-squares its orthogonal distance from the fitted line, measured in units of the two uncertainties. In the simple case where we fit a simple linear function, when we denote the slope and intercept of the line by a and b, the contribution of the point (x_i,y_i) would be

$$ \frac{(y_i - a x_i - b)^2}{a^2(\Delta x_i)^2 + (\Delta y_i)^2} , $$

where Δx_i and Δy_i are the errors of x_i and y_i. Golub (1973) and Golub & Van Loan (1980) were the first to introduce an algorithm to solve this basic TLS problem, using singular value decomposition. They also showed that even in this simple linear case a solution is not guaranteed to exist.

In our case, where the function we seek consists of two straight lines, we simply calculate for each point the weighted orthogonal distances from the two lines and include the smaller distance in the total sum-of-squares:

$$ S(a_1,b_1,a_2,b_2) = \sum_{i=1}^N \min \left\{ \frac{(y_i - a_1 x_i - b_1)^2}{a_1^2(\Delta x_i)^2 + (\Delta y_i)^2} , \frac{(y_i - a_2 x_i - b_2)^2}{a_2^2(\Delta x_i)^2 + (\Delta y_i)^2} \right\} , $$ (3)
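The unweighted sum of Eq. (3) can be written down directly. The sketch below is illustrative only; the function names and the representation of each data point as a tuple `(x, y, dx, dy)` are assumptions of this example.

```python
def tls_term(x, y, dx, dy, a, b):
    """Squared distance of (x, y) from the line y = a*x + b,
    scaled by the uncertainties dx and dy."""
    return (y - a * x - b) ** 2 / (a * a * dx * dx + dy * dy)


def two_line_objective(points, a1, b1, a2, b2):
    """Sum over all points of the smaller of the two scaled distances."""
    total = 0.0
    for x, y, dx, dy in points:
        total += min(tls_term(x, y, dx, dy, a1, b1),
                     tls_term(x, y, dx, dy, a2, b2))
    return total


# Toy data in the log-log plane: three points on one of the lines,
# one point off both.
points = [(0.0, 0.0, 0.1, 0.1), (1.0, 0.5, 0.1, 0.1),
          (3.0, 1.0, 0.1, 0.1), (2.0, 1.2, 0.1, 0.1)]
s = two_line_objective(points, a1=0.5, b1=0.0, a2=0.0, b2=1.0)
```

Because each point is assigned to whichever line is closer, the objective is not smooth in the parameters, which is one reason a derivative-free optimizer is a natural choice for minimizing it.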

where N is the total number of points and a₁, b₁, a₂ and b₂ are the slopes and intercepts of the two straight lines. S is parameterized by four numbers whose meaning is somewhat arbitrary. This is true especially for the two intercepts b₁ and b₂, which are functions of the arbitrary location of the zero point of x. We can instead parameterize S by an alternative quadruple that is physically more meaningful: the two coordinates of the breakpoint (breakpoint mass and corresponding radius), and the two slopes of the separate mass regimes.
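The change of variables can be made explicit. Assuming the breakpoint $(x_\mathrm{b}, y_\mathrm{b})$ is the intersection of the two lines, as the continuity of the model implies, and that $a_1 \neq a_2$ (the symbols $x_\mathrm{b}$ and $y_\mathrm{b}$ are introduced here for illustration), setting $a_1 x_\mathrm{b} + b_1 = a_2 x_\mathrm{b} + b_2$ gives

$$ x_\mathrm{b} = \frac{b_2 - b_1}{a_1 - a_2} , \qquad y_\mathrm{b} = a_1 x_\mathrm{b} + b_1 , $$

and conversely

$$ b_1 = y_\mathrm{b} - a_1 x_\mathrm{b} , \qquad b_2 = y_\mathrm{b} - a_2 x_\mathrm{b} . $$

The quadruple $(x_\mathrm{b}, y_\mathrm{b}, a_1, a_2)$ therefore carries the same information as $(a_1, b_1, a_2, b_2)$, and the breakpoint mass and radius follow by inverting the logarithms of $x_\mathrm{b}$ and $y_\mathrm{b}$.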

When we set out to minimize S, we found that the solution was numerically unstable. Using different starting points for the optimization algorithm (Nelder-Mead simplex algorithm, see Nelder & Mead 1965) resulted in different solutions. This meant that around the global minimum of S(a₁,b₁,a₂,b₂) there were many local minima. We suspect that this instability resulted from the endogenous sampling problem to which we alluded above (the mass distribution shown in Fig. 2). In order to rectify this problem, we decided to introduce weights into the definition of S, which balance the effect each mass range has on the final solution. As is clear from the top panel of Fig. 1, there are apparently three intervals²: M_p < 69 M_⊕, 69 M_⊕ ≤ M_p < 1660 M_⊕, and M_p ≥ 1660 M_⊕. Figure 2 further demonstrates the differentiation in mass by portraying a histogram of the mass, together with the borders we chose among the three mass ranges.

Fig. 2

Histogram of planetary mass, showing clearly the three empirical mass ranges. The division into three intervals was performed in order to improve the quality of the statistical analysis (see text for details).

The weighting scheme we applied is known in statistics as inverse probability weighting, which is designed to alleviate the implications of endogenous sampling (e.g., Wooldridge 1999). We thus multiplied the contribution of each data point by a weight that was assumed to compensate for the effect of the size of the mass-range set to which the data point belonged. The weight we assigned was simply proportional to the inverse of the size of the set: N/N_c, where N is the total number of planets and N_c is the size of the set. Table 1 details the three mass-range sets, their sizes, and the corresponding weights. The final expression for S is thus

$$ S(a_1,b_1,a_2,b_2) = \sum_{i=1}^N w_i \, \min \left\{ \frac{(y_i - a_1 x_i - b_1)^2}{a_1^2(\Delta x_i)^2 + (\Delta y_i)^2} , \frac{(y_i - a_2 x_i - b_2)^2}{a_2^2(\Delta x_i)^2 + (\Delta y_i)^2} \right\} , $$ (4)

where w_i is the weight of each point.
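The weights N/N_c can be computed from the masses alone. The following sketch assumes the interval borders of 69 and 1660 M_⊕ quoted above; the function name and the toy mass list are hypothetical and do not reproduce the values of Table 1.

```python
import bisect


def inverse_probability_weights(masses, edges=(69.0, 1660.0)):
    """Assign each planet the weight N / N_c, where N_c is the number
    of planets in the same mass interval (masses in Earth masses)."""
    # bisect_right places a mass equal to an edge in the upper interval,
    # matching M < 69, 69 <= M < 1660, and M >= 1660.
    bins = [bisect.bisect_right(edges, m) for m in masses]
    counts = {}
    for k in bins:
        counts[k] = counts.get(k, 0) + 1
    n = len(masses)
    return [n / counts[k] for k in bins]


# Toy masses (Earth masses), not the paper's sample.
weights = inverse_probability_weights([1.0, 10.0, 50.0, 100.0, 300.0, 2000.0])
```

With this choice each occupied interval contributes the same total weight N to the sum, so a densely sampled mass range cannot dominate the fit simply by containing more points.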

Table 1

Details of the three mass-range sets, and the resulting weights used in the analysis.

After optimizing S, we went on to obtain error estimates for the four variables, using a Monte Carlo resampling approach. We randomly drew new data points from a Gaussian distribution. The expected values of the Gaussian distribution were the nominal values of x and y, and we used the error bars as the widths (standard deviations) of the Gaussian distribution. We repeated the resampling procedure for 100 000 such random realizations of the data. The resulting random sample yielded the error estimates.
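The resampling step can be sketched generically. In the example below, `fit` stands for any routine that maps a list of points `(x, y, dx, dy)` to a tuple of fitted parameters; the ordinary least-squares line `ols_line` is only a stand-in for the actual segmented fit, and all names, the seed, and the small default number of draws are assumptions of this illustration (the analysis described above used 100 000 realizations).

```python
import random
import statistics


def ols_line(points):
    """Stand-in fit: ordinary least-squares slope and intercept."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    a = sxy / sxx
    return a, my - a * mx


def monte_carlo_errors(points, fit, n_draws=1000, seed=0):
    """Standard deviation of each fitted parameter over resampled data."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Draw each coordinate from a Gaussian centred on the nominal
        # value, with the error bar as the standard deviation.
        resampled = [(rng.gauss(x, dx), rng.gauss(y, dy), dx, dy)
                     for x, y, dx, dy in points]
        draws.append(fit(resampled))
    return [statistics.stdev(column) for column in zip(*draws)]


points = [(0.0, 0.1, 0.05, 0.05), (1.0, 0.6, 0.05, 0.05),
          (2.0, 1.1, 0.05, 0.05), (3.0, 1.5, 0.05, 0.05)]
slope_err, intercept_err = monte_carlo_errors(points, ols_line)
```

Passing the fit as a function keeps the resampling loop independent of the model, so the same loop could wrap a minimizer of the weighted objective instead of the stand-in used here.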

4. Results

Using the approach we outlined in the previous section, we obtained estimates for the two slopes, and the breakpoint. We found the breakpoint at a mass of 124 ± 7 M_⊕ and a radius of 12.1 ± 0.5 R_⊕. The resulting power laws of the two regimes (based on the two slopes in the x-y plane) are R ∝ M^{0.55 ± 0.02} for small planets, and R ∝ M^{0.01 ± 0.02} for large planets. The bottom panel of Fig. 1 shows the derived relation.
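As a worked illustration, the fitted relation can be evaluated as a broken power law anchored at the breakpoint. The sketch below uses only the central values quoted above and ignores their uncertainties and the intrinsic scatter of the sample; the function name and its parameter names are hypothetical.

```python
def radius_from_mass(mass, m_break=124.0, r_break=12.1,
                     slope_small=0.55, slope_large=0.01):
    """Central-value radius (Earth radii) for a mass in Earth masses,
    using a continuous broken power law through the breakpoint."""
    if mass <= 0:
        raise ValueError("mass must be positive")
    slope = slope_small if mass < m_break else slope_large
    return r_break * (mass / m_break) ** slope


# By construction the relation passes through the breakpoint itself.
r_at_break = radius_from_mass(124.0)
# A Jupiter-mass planet (about 317.8 Earth masses) lies on the flat branch.
r_jupiter_mass = radius_from_mass(317.8)
```

Anchoring both branches at the breakpoint keeps the relation continuous, in line with the continuous piecewise-linear model assumed in the log-log plane.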

It is interesting to note also that according to our analysis, Saturn is “a small planet” (see also Chen & Kipping 2017; Weiss et al. 2013). Indeed, based on internal structure models, the heavy element fraction in Saturn is estimated to be between ~20% and 40% (e.g., Guillot 2005). Saturn’s mass is nevertheless not very far from the transition point, and it is important to note that the transition mass at ~120 M_⊕ must be understood as a statistical quantity. As can be seen in Fig. 1, there is a region near the breakpoint in the fit at 120 M_⊕ that could either be considered as the continuation of the regime where the radius increases with mass to even higher masses, or as a continuation of the high-mass regime (with approximately constant radius) to even lower masses. This transition regime approximately covers a mass range broader than the range derived in the analysis, somewhere between about 80 and 120 M_⊕. According to the data, the actual transition therefore occurs at the higher end of this mass range. Another point that should be taken into account is that the apparent transition is also affected by stellar irradiation, while Saturn experiences a much lower irradiation than most of the planets that were used in our statistical analysis.

Our results are in good agreement with previous studies. The analysis we used to obtain the results was simple and intuitive and did not rely on subjective estimates. The fact that the transition occurs at a planetary mass higher than that of Saturn supports the idea that the change in the M-R relation for large planets is due to the dominating composition, which in the case of massive planets is a mixture of hydrogen and helium. The data suggest that for planets with masses higher than ~120 M_⊕, the planetary radius is determined by the equation of state of these light elements (e.g., Zapolsky & Salpeter 1969; Fortney et al. 2007). The dominating H-He composition and the compression due to the high mass also naturally explain the weak dependence of the radius on mass for giant planets that consist of mostly hydrogen and helium (e.g., Guillot 2005). Planets with lower mass are less compressed and therefore have a radius that increases with mass. The relatively broad spread of the low-mass planets around the line suggests that the planets can have various compositions in this mass regime.

4.1. Comparison with theoretical calculations

In this section we briefly compare the observational data with theoretical results from planet population syntheses based on the core-accretion paradigm (Mordasini et al. 2012). These calculations yield the planetary bulk composition (solids and H/He) and the post-formation entropy based on the planets’ formation track. Here we use two sets of core (heavy element) compositions: silicates and iron, or water. These two sets are chosen to assess the impact of various compositions of the solid core on the predicted radii of the synthetic planets. The first core is differentiated, and its composition is assumed to consist, by mass, of 1/3 iron (inner core) and 2/3 perovskite (outer core), similar to Earth and several low-mass extrasolar planets (e.g., Santos et al. 2015). The second composition corresponds to cores consisting exclusively of water ice. While pure water cores are unlikely to exist, these cores represent the limiting case of low-density cores. In all cases, the modified polytropic equation of state is used to derive the core radius, taking into account the pressure exerted by the surrounding envelope (see Mordasini et al. 2012). The star is assumed to be 1 M_⊙. Planets with semimajor axes of 0.01 to 0.5 AU are included in order to have a better comparison with the measurements. The formation model includes the effect of type I and II orbital migration. During the evolutionary phase, no mechanisms that can lead to inflation of the planetary radius (bloating) are included, whereas the effect of atmospheric escape is considered as described in Jin et al. (2014). The planetary opacity used in the formation models is the combination of the interstellar medium (ISM) opacities (Bell & Lin 1994) reduced by a factor 0.003 plus the grain-free opacities of Freedman et al. (2014). The reduction factor was determined in Mordasini et al. (2014) by comparison with detailed simulations of the grain dynamics by Movshovitz et al. (2010). During the planetary evolution, we assume a grain-free opacity because grains are expected to grow and settle to deeper regions after gas accretion is terminated (e.g., Movshovitz & Podolak 2008).

The observations and the theoretical data are compared in Fig. 3. As can be seen from the figure, the general M-R relation is similar, but there are also important differences. Both data sets show two different regimes. In the low-mass regime, both the observational and synthetic data show a large scatter in the M-R relation, which stems in the synthetic population from different envelope-core mass ratios, which in turn reflect different formation histories. For giant planets, the simulated planets follow a narrow M-R relation, which is clearly a consequence of neglecting bloating, assuming solar opacity, and an internal structure consisting of a pure H/He envelope surrounding a core made of pure heavy elements (i.e., a core+envelope internal structure). This is in contrast to the observations that also contain planets that have significantly larger radii, and probably different compositions and/or internal structures. In addition, the theoretical data correspond to a given age (5 × 10⁹ yr), while the observed population includes various ages. However, since most of the detected planets are observed around relatively old stars, we do not expect a large impact on the goodness of fit to the observed M-R relation.

Fig. 3

M-R relation: observations vs. theoretical data. The circles correspond to the observations, while the shaded area represents the results from planet population synthesis models.

In the giant planet regime, the observed exoplanet population contains planets with significantly larger radii than the synthetic planets, but also planets with smaller radii. The large radii can be attributed to bloating, while the smaller planets suggest that there are some planets that contain significantly higher amounts of heavy elements than in the synthetic population. This could be the result of a more efficient accretion of solids during formation, or giant impacts at later times. The effect of bloating on the population of small planets still needs to be studied in detail, although some work on this topic has already been presented (e.g., Lopez et al. 2012; Owen & Wu 2013). At the moment, it is still unclear whether an inflation mechanism is required in order to explain some of the small exoplanets with very low mean densities, since the existence of an (H-He) atmosphere can significantly increase the planetary radius. In addition, unlike massive planets, which are expected to be H-He dominated, small planets have a large spread of heavy elements and various fractions of H-He. This introduces a degeneracy with inflation mechanisms for low-mass planets: an observed M-R relation can probably either be caused by the existence of a more massive H-He envelope without inflation, or alternatively by a physical mechanism that causes the planet to be large, that is, inflation. A better understanding of inflation and atmospheric loss in small- and intermediate-mass planets is clearly desirable.

Clearly, the two data sets should be compared only qualitatively. This is because the observed planets have a variety of ages, atmospheric opacities, and of course, possibly mixed compositions. As a result, the partially strong and tight correlations in the theoretical M-R relation should not be considered realistic, as they simply represent the composition of pure ice or rock planets (in the case of the bare cores), or the artificially narrow M-R relation of giant planets all having the same atmospheric opacity and lacking a bloating mechanism. Nevertheless, there is a rather good agreement in terms of the transition between small and large planets in the M-R diagram.

Table 2

M-R relations derived for small and large planets by previous studies and in this work.

5. Discussion and conclusions

Our analysis suggests that the transition between large and small planets occurs at a mass (radius) of 124 ± 7 M_⊕ (12.1 ± 0.5 R_⊕). As expected, we established two mass-radius relations for exoplanets. For low-mass planets, the radius increases with increasing mass, and the M-R relation we derive is R ∝ M^{0.55 ± 0.02}, whereas for the large planets, the radius is almost independent of the mass, and the M-R relation is R ∝ M^{0.01 ± 0.02}.

Planetary mass and heavy element content almost exclusively determine the radius of low-mass planets (<124 M_⊕). The turnover point at this mass is probably due to the characteristic boundary between planets that are mostly gaseous (H-He dominated) and planets that consist of varying compositions and therefore do not have a single M-R relation. When the planet mass exceeds 124 M_⊕, the relation is flattened and is even consistent with a small negative slope, since we are approaching a slope of a compressed hydrogen-helium-dominated planet.

This work identifies the transition point between small and large planets based on the M-R relation. This transition point is not the same as the point derived from studies of measured frequency of planets (occurrence rate), although the two might be linked. From the point of view of standard planet formation models, the transition from a heavy-element-dominated composition to a hydrogen-helium-dominated composition occurs at a mass where the core and envelope mass are similar (crossover mass). Statistical simulations of planet formation have shown (e.g., Mordasini et al. 2015) that this leads to a break in the planetary occurrence rate at about 30 M_⊕, but the actual value can vary significantly depending on the assumed solid-surface density, opacity, accretion rates, etc. It is therefore interesting to note that not many planets are observed with masses between 30 and 120 M_⊕ (see Fig. 1; see also Mayor et al. 2011). This may suggest that the two transitions are linked. Finding the link between the two transition points can reveal crucial information on planetary formation and characteristics, and we hope to address this topic in the future.

Thinking about planetary characterization in terms of the M-R relation is useful, but it should be noted that in reality, there is an M-R-flux, or even M-R-flux-time relation for planets. This is because the stellar flux and the time evolution are expected to affect the radius of the planet at a given time. These relations will be better understood in the future when exoplanet detections include larger radial distances and various ages of stars, as expected from the PLATO mission.

¹ See http://exoplanets.org for exoplanet properties.

² It is beyond the scope of this study to perform a rigorous clustering analysis. There seems to be a consensus in data-mining literature that at this stage there is not a single clustering algorithm or criterion that is guaranteed to be the best. An intuitive division at this stage is therefore completely acceptable (e.g., Estivill-Castro 2002).

Acknowledgments

We thank the anonymous referee for valuable comments and suggestions. R.H. acknowledges support from the Israel Space Agency under grant 3-11485 and from the United States - Israel Binational Science Foundation (BSF) grant 2014112. C. M. acknowledges the support of the Swiss National Science Foundation via grant BSSGI0-155816 “PlanetsInTime”.

References

Baglin, A., Auvergne, M., Boisnard, L., et al. 2006, 36th COSPAR Scientific Assembly, 36, 3749
Baraffe, I., Chabrier, G., Fortney, J., & Sotin, C. 2014, in Protostars and Planets VI, eds. H. Beuther, R. S. Klessen, C. P. Dullemond, & T. Henning (Tucson: University of Arizona Press), 763
Bell, K. R., & Lin, D. N. C. 1994, ApJ, 427, 987
Borucki, W. J., Koch, D., Basri, G., et al. 2010, Science, 327, 977
Chen, J., & Kipping, D. M. 2017, ApJ, 834, 17
Cumming, A., Butler, R. P., Marcy, G. W., et al. 2008, PASP, 120, 531
Durbin, J. 1954, Rev. Inst. Int. Stat., 22, 23
Estivill-Castro, V. 2002, SIGKDD Explor., 4, 65
Fortney, J. J., Marley, M. S., & Barnes, J. W. 2007, ApJ, 659, 1661
Freedman, R. S., Lustig-Yaeger, J., Fortney, J. J., et al. 2014, ApJS, 214, 25
Golub, G. 1973, SIAM Rev., 15, 318
Golub, G., & Van Loan, C. 1980, SIAM J. Numer. Anal., 17, 883
Guillot, T. 2005, Annu. Rev. Earth Planet. Sci., 33, 493
Hatzes, A. P., & Rauer, H. 2015, ApJ, 810, L25
Hinkley, D. V. 1969, Biometrika, 56, 495
Howard, A., Marcy, G. W., Johnson, J. A., et al. 2010, Science, 330, 653
Howard, A. W., Marcy, G. W., Bryson, S. T., et al. 2012, ApJS, 201, 15
Jin, S., Mordasini, C., Parmentier, V., et al. 2014, ApJ, 795, 65
Lopez, E. D., Fortney, J. J., & Miller, N. 2012, ApJ, 761, 59
Marcy, G. W., Weiss, L. M., Petigura, E. A., et al. 2014, Proc. Natl. Acad. Sci., 111, 12655
Markovsky, I., Sima, D. M., & Van Huffel, S. 2010, WIREs Comp. Stats., 2, 212
Mordasini, C., Alibert, Y., Georgy, C., et al. 2012, A&A, 547, A112
Mordasini, C., Klahr, H., Alibert, Y., Miller, N., & Henning, T. 2014, A&A, 566, A141
Mordasini, C., Mollière, P., Dittkrist, K. M., Jin, S., & Alibert, Y. 2015, Int. J. Astrobiol., 14, 201
Movshovitz, N., & Podolak, M. 2008, Icarus, 194, 368
Movshovitz, N., Bodenheimer, P., Podolak, M., & Lissauer, J. J. 2010, Icarus, 209, 616
Nelder, J. A., & Mead, R. 1965, Comput. J., 7, 308
Nesvorný, D., & Morbidelli, A. 2008, ApJ, 688, 636
Owen, J. E., & Wu, Y. 2013, ApJ, 775, 105
Santos, N. C., Adibekyan, V., Mordasini, C., et al. 2015, A&A, 580, L13
Weidenschilling, S. J. 1977, Ap&SS, 51, 153
Weiss, L. M., & Marcy, G. W. 2014, ApJ, 783, L6
Weiss, L. M., Marcy, G. W., Rowe, J. F., et al. 2013, ApJ, 768, 14
Wooldridge, J. M. 1999, Econometrica, 67, 1385
Zapolsky, H. S., & Salpeter, E. E. 1969, ApJ, 158, 809

All Tables

Table 1

Details of the three mass-range sets, and the resulting weights used in the analysis.

Table 2

M-R relations derived for small and large planets by previous studies and in this work.

All Figures

Fig. 1 Top: M-R relation of the exoplanets considered in the analysis. The dashed lines mark the three mass regimes used for the weighting (see text). Bottom: the M-R relation with the derived best-fit curves.

Fig. 2 Histogram of planetary mass, clearly showing the three empirical mass ranges. The division into three intervals was performed in order to improve the quality of the statistical analysis (see text for details).

Fig. 3 M-R relation: observations vs. theoretical data. The circles correspond to the observations, while the shaded area represents the results from planet population synthesis models.
