Issue 
A&A
Volume 650, June 2021



Article Number  A177  
Number of page(s)  8  
Section  Planets and planetary systems  
DOI  https://doi.org/10.1051/0004-6361/202140375  
Published online  24 June 2021 
Machine learning techniques in studies of the interior structure of rocky exoplanets
^{1} State Key Laboratory of Lunar and Planetary Sciences, Macau University of Science and Technology, Macau, PR China
email: ddni@must.edu.mo
^{2} CNSA Macau Center for Space Exploration and Science, Macau, PR China
Received: 19 January 2021
Accepted: 27 April 2021
Context. Earth-sized exoplanets have been discovered and characterized thanks to new developments in observational techniques, particularly those planets that may have a rocky composition comparable to that of the terrestrial planets of the Solar System. Characterizing the interiors of rocky exoplanets is one of the main objectives in investigations of their habitability. Theoretical mass-radius relations are often used as a tool to constrain the internal structure of rocky exoplanets. However, one mass-radius curve represents only a single interior structure, and a great deal of computation time is required to obtain all possible interior structures that comply with the given mass and radius of a planet.
Aims. We apply a machine-learning approach based on mixture density networks (MDNs) to investigate the interiors of rocky exoplanets. We aim to provide a well-trained MDN model to quickly and efficiently predict the interior structure of rocky exoplanets.
Methods. We presented a training data set of rocky exoplanets with masses between 0.1 and 10 Earth masses based on three-layer interior models by assuming Earth-like compositions. This data set was then used to train the MDN model to predict the layer thicknesses and core properties of rocky exoplanets, where planetary mass, radius, and water content are inputs to the MDN. The performance of the trained MDN model was investigated in order to discern its predictive ability.
Results. The MDN model is found to show good performance in predicting the layer thicknesses and core properties of rocky exoplanets when compared with the actual solutions obtained by solving the interior models. We also applied the MDN model to the Earth and the super-Earth exoplanet LHS 1140b. The MDN predictions are in good agreement with the interior model solutions within the uncertainties of planetary mass and radius. More importantly, the MDN model takes a much shorter computational time than the interior model calculations, offering a convenient and powerful tool for quickly obtaining information on planetary interiors.
Key words: methods: numerical / planets and satellites: terrestrial planets / planets and satellites: interiors
© ESO 2021
1 Introduction
Thousands of exoplanets of varying mass, radius, and orbital parameters have been discovered and confirmed to be orbiting other stars, including hot Jupiters, mini-Neptunes, and super-Earths (Batalha 2014; Gillon et al. 2017). Even Earth-sized exoplanets with masses smaller than 10 Earth masses (M_{⊕}) have been discovered owing to new developments in observational techniques, in particular, those planets that may have a rocky composition comparable to that of the terrestrial planets of the Solar System. This opens up a new era in planetary habitability and raises further questions of life beyond the Solar System. Along with the advent of space-based missions such as Kepler and CoRoT, exoplanet studies have been moving from the phase of discovery to characterization. To date, several hundred exoplanets have been characterized by mass and radius based on space-based or ground-based observations, including a few small-mass exoplanets. Based on our knowledge of mass and radius, one can derive the mean density of observed exoplanets and estimate their bulk composition by comparison with theoretical mass-radius curves. In this area, considerable efforts have been devoted to quantitative investigations of mass-radius relations, especially for rocky exoplanets (Valencia et al. 2006; Seager et al. 2007; Sotin et al. 2007; Zeng & Sasselov 2013; Brugger et al. 2017; Turbet et al. 2020). These works look to various interior models with different focuses and cover the wide span of exoplanets. In the direction of forward modeling, different interior models are capable of interpreting the measured mass and radius within their uncertainties, showing the inherent degeneracy in finding the interior structure of exoplanets. In order to address the degeneracy issue, an analysis may also be performed as an inverse problem. Such inverse calculations require solving the internal structure equations for a large number of interior parameters; thus, they are generally time-consuming. Valencia et al. (2006) concluded that different mantle compositions affect the mass-radius relation for massive terrestrial planets. Sotin et al. (2007) found that the mass-radius relation shows some dependency on the total water content. In turn, Rogers & Seager (2010) and Dorn et al. (2015, 2017) performed a full probabilistic inverse analysis of planetary compositions in terms of mass, radius, and stellar refractory elemental abundances, taking into account the uncertainties of measurements and models. Additionally, some works have been devoted to interpreting the data of newly discovered exoplanets based on interior structure models by assuming plausible compositions such as terrestrial bulk compositions and stellar abundances (Wagner et al. 2012; Lillo-Box et al. 2020; Noack et al. 2017; Dorn et al. 2018; Hoolst et al. 2019).
More recently, machine learning (ML) has been applied as a powerful and reliable tool in Earth and planetary sciences. For example, Zhao et al. (2019) applied modern ML approaches based on random forests (RFs) and deep neural networks (DNNs) to explain the origin and involvement of fluid in the generation of Cenozoic basalts in Northeast China. Atkins et al. (2016) used mixture density networks (MDNs) to infer the governing parameters for mantle convection. Ulmer-Moll et al. (2019) used RFs to compute the radius of exoplanets based on their mass, their equilibrium temperature, and several stellar parameters. Alibert & Venturini (2019) trained DNNs to compute the critical core mass and envelope masses of forming planets. Baumeister et al. (2020) trained MDNs for exoplanets with 0.01 M_{⊕} < M < 25 M_{⊕} to infer the distribution of possible thicknesses of each planetary layer. In addition, a wide range of ML-based techniques have been used in astrophysics to detect exoplanets (Shallue & Vanderburg 2018; Pearson et al. 2018; McCauliff et al. 2015; Schanche et al. 2019).
The past detection and characterization of exoplanets have demonstrated that exoplanets are extremely common objects in the Universe and that they are much more diverse than originally expected. The search for rocky exoplanets that might have liquid water at the surface is of particular interest, since water is an essential ingredient for life. As far as we know, exoplanet habitability requires planetary magnetic fields as well as plate tectonics or a stagnant lid in addition to water. The presence of planetary magnetic fields may reduce atmospheric loss and shield planetary surfaces from high-energy charged particles (Boujibar et al. 2020). Plate tectonics plays an essential role in long-term climate stabilization through the carbon cycle on Earth (Raymond et al. 2007; Spiegel et al. 2014) and Earth-sized stagnant lid planets could possess habitable climates as well (Noack et al. 2017; Foley 2019; Godolt et al. 2019). For rocky planets, these factors are all related to planetary cores. The generation and maintenance of magnetic fields require the existence of a liquid core and fluid motion driven by thermochemical convection. Stagnant lid and plate tectonics are related to the thermal state of the mantle and core. However, even for the planets in the Solar System, the properties of core-mantle boundary (CMB) regions and planetary cores are difficult to assess. Future observational constraints of exoplanetary magnetic fields might provide a route to understanding the internal dynamics and thermal state of planetary cores (Driscoll 2018). In the present study, we intend to apply ML approaches to predict the layer thicknesses and core properties of rocky exoplanets. Such a data-driven approach decouples the data generation from the ML inference. Once an ML model is well trained on the generated data, it can be useful in quickly characterizing the interior of rocky exoplanets. 
Two main observables for exoplanets are mass and radius, and a combination of both, such as mass-radius curves, can provide much more information about the interior composition and internal structure of observed exoplanets. However, mass and radius alone allow too many degenerate solutions to constrain planetary interiors. Driscoll (2018) proposed three possible ways to break this degeneracy: (1) new observations of atmospheric compositions; (2) orbital observations for tidal response; and (3) remote magnetic field observations. In consideration of method (2), Baumeister et al. (2020) added fluid Love numbers k_{2} as an additional observable and showed a considerable decrease of the degeneracy. In contrast to Baumeister et al. (2020), we follow the perspective of method (1) to break the degeneracy of interior structures. As is well known, exoplanet habitability is one of the hot topics in current exoplanetary sciences. Taking the Earth as a good representation, habitability usually requires an appreciable water budget on the surface of terrestrial planets (Raymond et al. 2007). The water mass fraction (WMF) of rocky planets provides not only important information on the habitability of rocky planets, but also insights for discussing the mass-radius relationship for rocky exoplanets. Sotin et al. (2007) presented different mass-radius curves for extrasolar Earth-like planets and water-rich rocky planets with 50% H_{2}O. In this work, we trained an MDN model to predict the radial structure and core properties of rocky exoplanets on the input of mass, M, radius, R, and the WMF.
To address this issue, we first describe the three-layer interior models of rocky exoplanets and introduce the composition of different layers in Sect. 2. Then we present the ML approaches applied in this field along with the training data sets in Sect. 3. Section 4 presents our results concerning the predictive ability of the ML approach and its applications to Earth-sized and super-Earth exoplanets. Finally, the main results of this paper are summarized in Sect. 5.
2 Interior structure models
We model the internal structure of rocky exoplanets using a newly developed interior model named ExoPlex^{1} (Lorenzo & Unterborn 2018). A rocky exoplanet consists of three compositionally distinct layers: an iron-rich core, lower and upper silicate mantles, and a water-ice outer shell. The silicate crust is regarded as part of the upper mantle since its mass is negligible with respect to that of the silicate mantle. The pressure profile is calculated from the equation of hydrostatic equilibrium, (1)
The temperature profile is calculated using an adiabatic temperature gradient, (2)
We adopt surface boundary conditions of P(R) = 1 bar and T(R) = T_{Pot}, where T_{Pot} is the potential temperature: T_{Pot} = 1600 K if a planet has no water-ice layer above the mantle and T_{Pot} = 300 K if a planet has water-ice layers. A set of independent parameters, which are used to describe the composition of each layer in addition to planetary mass and radius, will be explained below for each layer.
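The displayed forms of Eqs. (1) and (2) are not reproduced in this copy. For orientation only, the textbook forms of hydrostatic equilibrium and of an adiabatic temperature gradient are sketched below. The symbols ρ (density), g (gravitational acceleration), G (gravitational constant), m(r) (mass enclosed within radius r), α (thermal expansivity), and C_{P} (specific heat capacity) are assumptions introduced here, and the notation of the original equations may differ.

```latex
\begin{align}
  \frac{\mathrm{d}P}{\mathrm{d}r} &= -\rho(r)\, g(r),
  \qquad g(r) = \frac{G\, m(r)}{r^{2}}, \\
  \frac{\mathrm{d}T}{\mathrm{d}r} &= -\frac{\alpha\, g(r)\, T}{C_{P}} .
\end{align}
```

Both equations are integrated inward from the surface boundary conditions P(R) and T(R) stated above.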
2.1 Layer 1: iron-rich core
It is generally recognized that the density of the Earth’s core is about 5−10% smaller than the density of pure iron, suggesting the existence of light alloying elements such as sulfur, oxygen, and silicon. However, little is known about the amount of light elements in exoplanetary cores. Moreover, Earth’s core is predominantly in the liquid phase, with a small solid inner core accounting for less than 2% of Earth’s mass. The phase of the inner core is expected to have little influence on the Earth’s interior (Sotin et al. 2007; Valencia et al. 2006). Furthermore, a completely liquid core could be ubiquitous for super-Earths (Valencia et al. 2006). With these points in mind, it is assumed that the modeled planet exhibits a liquid core composed of an Fe-FeS alloy with a molar fraction of 13% FeS, as suggested in Sotin et al. (2007). The core equation of state (EoS) adopted in ExoPlex comes from Anderson & Ahrens (1994), an empirical fit of liquid iron to a second-order Birch-Murnaghan EoS centered at 1 bar and 1811 K.
2.2 Layer 2: silicate mantle
The silicate mantle consists of various minerals. ExoPlex computes stable mineral phases and their thermodynamic properties using the Gibbs free energy minimization package Perple_X (Connolly 2009). Considering that most of the Earth’s mantle minerals are composed of the chemical elements Si, Mg, O, Fe, Ca, and Al, the silicate mantle of rocky exoplanets is taken to comprise the oxides SiO_{2}, FeO, MgO, CaO, and Al_{2}O_{3}, and the mineral phases of modeled planets are assumed to be those that potentially appear in the Earth’s mantle. Given a bulk silicate mantle composition, an assemblage of thermodynamically consistent phases as well as their corresponding densities are computed at a certain temperature and pressure through the Gibbs minimization scheme. The formulation and EoS parameters involved in these calculations are as given by Stixrude & Lithgow-Bertelloni (2005, 2011). In this work, the bulk mantle composition is assumed to be Earth-like, which is defined by the relative molar abundances of Si, Fe, Ca, and Al to Mg. We use an Earth-like value of 0.884 for the molar Si/Mg ratio in all modeled planets (Sotin et al. 2007). The magnesium number (Mg#), defined as the mole fraction Mg/(Mg+Fe) in the silicates, is often used to describe the major element composition of the mantle material (Sotin et al. 2007; Brugger et al. 2017). Here we determine the Fe/Mg ratio in terms of the Mg#. The Earth has an Mg# of 0.9 for its mantle silicates (Sotin et al. 2007), so we use a value of 1/9 for the molar Fe/Mg ratio in all modeled planets. CaO and Al_{2}O_{3} are minor components and their relative molar abundances are fixed to the Earth’s values Ca/Mg = 0.06 and Al/Mg = 0.08. The silicate mantle is assumed to be chemically homogeneous, which means all the chemical parameters are constant throughout the mantle.
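The step from Mg# = 0.9 to Fe/Mg = 1/9 follows directly from the definition Mg# = Mg/(Mg+Fe), which rearranges to Fe/Mg = (1 − Mg#)/Mg#. A minimal sketch of that arithmetic is given below; the function name `fe_to_mg_ratio` and the dictionary `mantle_ratios` are illustrative and are not part of ExoPlex.

```python
from fractions import Fraction


def fe_to_mg_ratio(mg_number):
    """Molar Fe/Mg implied by Mg# = Mg / (Mg + Fe)."""
    if not 0 < mg_number <= 1:
        raise ValueError("Mg# must lie in (0, 1]")
    return (1 - mg_number) / mg_number


# Earth-like mantle value quoted in the text: Mg# = 0.9.
# Exact rational arithmetic avoids floating-point round-off in the check.
earth_fe_mg = fe_to_mg_ratio(Fraction(9, 10))

# Relative molar abundances (normalized to Mg) quoted for all modeled planets.
mantle_ratios = {
    "Si/Mg": 0.884,
    "Fe/Mg": float(earth_fe_mg),
    "Ca/Mg": 0.06,
    "Al/Mg": 0.08,
}

print(earth_fe_mg)  # the exact ratio as a fraction
```

A pure-MgO end member (Mg# = 1) gives Fe/Mg = 0, and the ratio grows without bound as Mg# approaches zero, which is why the helper rejects Mg# ≤ 0.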
2.3 Layer 3: water-ice shell
In order to cover possible exoplanets with various water contents, we allow for a water-ice shell on top of the silicate mantle. At some depth, H_{2}O is present as high-pressure ice polymorphs instead of liquid water. ExoPlex takes into account both liquid water and various high-pressure ice polymorphs. Ices VI and Ih are included in addition to the most stable high-pressure phase, ice VII. Their respective EoSs come from Bezacier et al. (2014) and Feistel & Wagner (2006). We note that some minor phases, namely ices II, V, and III, are ignored in ExoPlex by limiting the minimum surface temperature to 300 K.
3 Machine-learning approaches and training data sets
The DNN algorithm (LeCun et al. 2015), with its powerful abilities for mining the information of high-dimensional data, has been widely applied to various fields in recent years. A DNN is a network structure composed of multiple processing layers, including one input layer, several hidden layers, and one output layer. Essentially, the training process of DNNs is aimed at finding a function that minimizes the difference between the computed output value and the real output (target) value. A DNN describes functional (i.e., single-valued) mappings well, as is the case in forward problems. However, DNNs show mediocre performance on inverse problems where one input value yields more than one output (Bishop 1994). By contrast, the MDN algorithm combines the structure of a conventional neural network with a probability mixture model, which yields multimodal probability distributions instead of discrete values. Therefore, the MDN method is useful for cases of nonuniqueness without sacrificing degenerate solutions. The interior structures of rocky exoplanets are indeed degenerate with respect to current observations. Here, we use a Gaussian mixture model for our MDN model.
Figure 1 illustrates the MDN structure with Gaussian mixture models combined with the neural network. The input layer consists of planetary mass, radius, and WMF as inputs to the MDN. Three hidden layers are used with 512 neurons each. The output layer contains five separate MDN layers, corresponding to six outputs [the radial fractions of each layer, the core mass fraction (CMF), the CMB pressure and temperature]. Each of the five MDN layers is built using 20 Gaussian mixture components and each Gaussian mixture component is characterized by means μ, variances σ, and mixing coefficients π. This configuration, including the layer size as well as the number of mixture components and neurons per layer, was found to be among the best-performing MDN structures in a grid search over a manually specified hyperparameter space.
Rectified linear unit (ReLU) functions are used to activate the hidden layers, while non-negative exponential linear unit (NNELU) functions are used to activate the means μ and variances σ in the output layer. Mixing coefficients π are trained as logits and they are scaled between zero and one using softmax functions. In the training process, we use the adaptive moment estimation (Adam) optimizer (Kingma & Ba 2014) with a learning rate of 0.001, which is regarded as fairly robust to the choice of hyperparameters (Goodfellow et al. 2016). To prevent the network from overfitting the training data set, early stopping (Montavon et al. 2012) and dropout (Srivastava et al. 2014) techniques are applied to the MDN model. The network stops training when the validation loss hardly drops within a few epochs. Dropout is applied to all hidden layers with a probability of 0.05. The weights are initialized randomly before training starts.
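To make the output-layer activations concrete, the following is a minimal pure-Python sketch of the NNELU and softmax functions together with the density and negative log-likelihood of a one-dimensional Gaussian mixture. It is an illustration only and is not the TensorFlow/Keras implementation used in this work. NNELU is taken here as 1 + ELU(x), σ is treated as a standard deviation, and all function names and example numbers are assumptions introduced for this sketch.

```python
import math


def nnelu(x):
    """Non-negative ELU, taken here as 1 + ELU(x): positive for every real x."""
    return 1.0 + x if x > 0 else math.exp(x)


def softmax(logits):
    """Map logits to mixing coefficients that are positive and sum to one."""
    shift = max(logits)  # subtracting the maximum improves numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_pdf(y, pis, mus, sigmas):
    """Density of a 1D Gaussian mixture at y (sigmas are standard deviations)."""
    return sum(
        pi * math.exp(-0.5 * ((y - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for pi, mu, s in zip(pis, mus, sigmas)
    )


def negative_log_likelihood(y, pis, mus, sigmas):
    """NLL of a single target value y under the predicted mixture."""
    return -math.log(mixture_pdf(y, pis, mus, sigmas))


# Hypothetical raw network outputs for a two-component mixture.
raw_mu, raw_sigma, logits = [-1.0, 0.5], [-2.0, 0.0], [0.0, 0.0]
mus = [nnelu(v) for v in raw_mu]
sigmas = [nnelu(v) for v in raw_sigma]
pis = softmax(logits)
loss = negative_log_likelihood(1.2, pis, mus, sigmas)
```

Because NNELU is strictly positive, the mixture widths can never collapse to zero or become negative, which keeps the logarithm in the loss well defined.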
A simple step-by-step description of the ML approach is given in the following:
Train-test split. The data we use are randomly split into training and testing data sets with a 90–10 split, which means that 90% of the data are used for training the ML model, while the remaining 10% are used for testing the final trained model. During the model training, 10% of the training data set, forming a validation data set, is used to validate the model results at the end of each epoch.
Feature scaling. The raw data span a wide range of values. As a result, gradient descent, an optimization algorithm for finding a local minimum of a differentiable function, converges very slowly when training an MDN model (Ioffe & Szegedy 2015) and may yield a poor ML model. In order to avoid this, min-max normalization, which rescales the features to the [0, 1] range, is applied using the scikit-learn library (Pedregosa et al. 2011).
Model training. The scaled input data are passed through the constructed MDN structure and the outputs are computed. The computed outputs are compared with the real data. The difference between them is measured by the so-called loss function. Here, we use the average negative log-likelihood (NLL) of the target values under the predicted distributions as the loss function of every output layer. Then, the loss is propagated backward from the output end to the input layer and the weight values of the MDN are adjusted so that the loss is reduced in the next iteration. In other words, the computed outputs of the MDN become closer to the real ones.
Assessing the MDN model. The performance of the final trained MDN model is assessed using the testing data set.
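The first two steps above can be sketched in a few lines of standard-library Python. This is a stand-in for illustration and not the code used in this work, which relies on scikit-learn for the scaling. The toy rows, the helper names, and the choice to fit the scaling bounds on the training subset only are assumptions introduced here.

```python
import random


def split(rows, held_out_fraction=0.1, seed=0):
    """Shuffle rows and return (kept, held_out) subsets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_held = int(round(len(shuffled) * held_out_fraction))
    return shuffled[n_held:], shuffled[:n_held]


def fit_min_max(rows):
    """Per-column minima and maxima of a list of equal-length rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Rescale each column to [0, 1] using previously fitted bounds."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Toy (mass, radius, WMF) rows; the values are made up for illustration.
rows = [[0.1 + 0.1 * i, 0.5 + 0.015 * i, 0.001 * i] for i in range(100)]

train_val, test = split(rows, 0.1, seed=0)    # 90-10 train-test split
train, val = split(train_val, 0.1, seed=1)    # 10% of training data for validation

lows, highs = fit_min_max(train)              # bounds from the training subset
train_scaled = apply_min_max(train, lows, highs)
val_scaled = apply_min_max(val, lows, highs)
test_scaled = apply_min_max(test, lows, highs)
```

With bounds fitted on the training subset, validation or test values that fall outside the training range map slightly outside [0, 1]; fitting on the full data set instead would avoid this at the cost of using information from the held-out data.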
In this work, we use the Python libraries TensorFlow (Abadi et al. 2016), Keras (Chollet et al. 2015), and MDN layers (Martin & Duhaime 2019) to build and train the MDN model.
We do not consider mini-Neptunes and restrict our attention to rocky exoplanets with masses between 0.1 and 10 Earth masses. In modeling the interior structure of these exoplanets, we first initialize the structure of the modeled planet according to the inputs of total planet mass M, CMF, and WMF. The planet mass is selected randomly between 0.1 and 10 Earth masses. The CMF is set to a random value between 0 and 0.7, and the WMF is randomly sampled between 0 and 0.1. Some interior models without water-ice shells (i.e., WMF = 0) are considered as well in order to cover waterless planets. The composition and mass distribution of the core and mantle are computed as specified above. We note that these restrictions are based on our current knowledge of the solid objects in the Solar System and may not cover all rocky exoplanets. Then, the radial pressure and temperature profiles within each layer and at its interface are computed from the basic physical principles, and the density profile within each layer is updated according to the adopted EoS. The iterative process is performed until convergence, providing temperature, pressure, density, and radial fractions of each layer. We obtained ~220 000 data points for the MDN model, among which the data points for waterless planets account for about 9.1%. Figure 2 shows the histograms of various quantities for this data set.
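The sampling of interior-model inputs described above can be sketched as follows. The text states only that the values are chosen randomly within the quoted ranges, so the uniform distributions used here are an assumption, as is the way waterless planets are generated (the hypothetical parameter `dry_fraction`, set to the ~9.1% share reported for the final data set). Each sample would then be passed to the interior model, which is not reproduced here.

```python
import random


def sample_model_inputs(n, dry_fraction=0.091, seed=0):
    """Draw n sets of (mass, CMF, WMF) inputs for the interior model.

    Uniform sampling and the dry_fraction mechanism are assumptions made
    for illustration; the source only quotes the parameter ranges.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        is_dry = rng.random() < dry_fraction
        samples.append({
            "mass_earth": rng.uniform(0.1, 10.0),  # planet mass in Earth masses
            "cmf": rng.uniform(0.0, 0.7),          # core mass fraction
            "wmf": 0.0 if is_dry else rng.uniform(0.0, 0.1),  # water mass fraction
        })
    return samples


inputs = sample_model_inputs(1000)
n_dry = sum(1 for s in inputs if s["wmf"] == 0.0)
```

Fixing the seed makes the generated grid reproducible, which is convenient when the expensive interior-model runs need to be repeated or extended.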
Fig. 1
Schematic diagram of the mixture density network (MDN) structure where Gaussian mixture models are combined with the neural network. Three hidden layers are involved with 512 neurons each. The Gaussian mixture parameters contain means μ, variances σ, and mixing coefficients π. 
Fig. 2
Histograms of the input and output parameters: mass, radius, water mass fraction (input, blue) and water radial fraction, core radial fraction, core mass fraction, CMB pressure, CMB temperature (output, red). We note that the output parameter of the mantle radial fraction is not shown since it is uniquely determined given water and core radial fractions. 
Fig. 3
Negative log-likelihood (NLL) loss for the MDN model trained on mass and radius inputs only and for the MDN model trained on mass, radius, and WMF inputs. (a) Dependence of the NLL loss on the size of the training data set for the two MDN models. (b) Learning curve of the MDN trained on mass and radius inputs only for both the training and validation data sets. (c) Learning curve of the MDN trained on mass, radius, and WMF inputs for both the training and validation data sets. 
4 Results
4.1 Model training and testing
The MDN is also trained on the input of mass and radius only for comparison with the MDN trained on mass, radius, and WMF inputs. In order to gain insight into the training performance of the MDNs, we plot in Fig. 3 the NLL losses for these two MDNs. Figure 3a shows the dependence of the NLL loss on the size of the training data set. For both MDNs, the loss first decreases as the training size increases and reaches a plateau beyond a certain size N [N = 6 × 10^{4} for the MDN with (M, R) inputs and N = 9 × 10^{4} for the MDN with (M, R, WMF) inputs]. This suggests that the training data set of this work is sufficient to establish good MDN models for both cases. We used all the training data to train our MDN models. Figures 3b,c illustrate the learning curves of the two MDNs, where the training and validation losses are shown as a function of the training epoch. In both MDNs, the average NLL loss levels off at an approximately constant value for both the training and validation curves.
At the end of the model training, the testing data set is passed to our trained MDN models to evaluate the training performance of the MDN. Figure 4 illustrates the ability of the MDN trained on mass and radius inputs only to predict the interior properties of rocky exoplanets, where the distributions predicted by the MDN from the testing data set are displayed versus the actual values obtained from the interior models. The MDN predictions reproduce the core radial fraction (CRF) well for planets with large cores, but the MDN tends to overestimate the CRF and underestimate the mantle radial fraction (MRF) for planets with small cores and large mantles. This result is similar to the findings of Baumeister et al. (2020). As the core size gets smaller and the mantle size gets larger, a slight increase of the core mass brings in a larger relative increase in the CRF, corresponding to more degenerate solutions to the radial structure of modeled planets. This causes some difficulties for the MDN in predicting the CRF and MRF of small-core planets. Boujibar et al. (2020) have suggested that T_{CMB} has little effect on the mineralogy of the lowermost mantle, although it does play an important role in determining the physical state of the core and the thermal evolution of planetary bodies. Here, the thermal state of planetary cores is characterized by temperature and pressure at the CMB (T_{CMB} and P_{CMB}). The MDN can reproduce the actual P_{CMB} values well and barely constrain the actual T_{CMB} values. This is quite consistent with the conclusions of Unterborn & Panero (2019), who showed that the P_{CMB} and T_{CMB} values are mostly a function of planetary radius for planets with radii less than 1.5 Earth radii.
The pressure profile is calculated by integrating the hydrostatic Eq. (1). The CMF has a direct influence on the CMB location and hence affects the P_{CMB} value. That is, a greater CMF makes the CMB shallower, reducing the integration interval of the hydrostatic equation. However, a larger CMF brings in an increase in the gravitational acceleration g(r), enhancing the pressure gradient in Eq. (1). Under the balance of these two effects, the resulting P_{CMB} values show small variations for a given mass. Because the adiabatic temperature gradient is primarily a function of the hydrostatic pressure (see Eq. (2)), the resulting T_{CMB} values also show weak sensitivities to planetary structures for a given mass. In addition, we may note that the MDN predictions are quite accurate for small values of P_{CMB} and T_{CMB}. This is because the low pressure or temperature at the CMB does not allow for much variation in possible interior structures under the conditions of hydrostatic equilibrium and the adiabatic approximation.
Figure 5 illustrates the ability of the MDN trained on mass, radius, and WMF inputs. Adding WMF as an additional input significantly enhances the ability of the MDN. The most considerable improvement lies in the water radial fraction (WRF). This can be easily understood as the WRF is directly correlated with the WMF. However, the MDN still has a tendency to overestimate the CRF and underestimate the MRF for planets with small cores and large mantles. The MDN predicts the core properties of rocky exoplanets with a high accuracy. In particular, for T_{CMB} and P_{CMB}, the distributions predicted by the MDN are concentrated along the diagonal.
4.2 Application to Earth-sized and super-Earth exoplanets
In this section, we apply the MDN model trained on mass, radius, and WMF inputs to two rocky planets, one Earth-sized and one super-Earth: the Earth is regarded as a typical Earth-sized representative and the water-rich exoplanet LHS 1140b is chosen as a super-Earth representative. We establish the validation data set for both planets using the above interior models, where the uncertainties on mass and radius are considered by introducing a standard deviation (3)
In Eq. (3), M and R are the true values of mass and radius, and ΔM and ΔR denote the deviations of one sample from the true values. Here, the standard deviation σ is fixed at 1% and 2%, respectively. The amount of water in the oceans on the Earth’s surface is known to be M_{oce} = 2.3 × 10^{−4} M_{⊕} (Schubert et al. 2001), while the water content in the Earth’s mantle is uncertain because we cannot access the mantle directly. Earth’s M_{oce} may be uncommon on rocky small-radius planets (Tian & Ida 2015; Kite & Ford 2018) and high-resolution simulations of rocky-planet formation suggest that surviving terrestrial planets have a WMF between 3 × 10^{−3} and 2 × 10^{−2} (Raymond et al. 2007). Moreover, the interior models of Earth-sized exoplanets have shown that a larger water content would be possible for a planet with Earth mass and Earth radius if larger amounts of iron are present (Noack et al. 2017). Following Sotin et al. (2007), we adopt a value of WMF = 5 × 10^{−4} for Earth-like planets. Water-rich planets are obvious candidates for habitable worlds and have attracted increasing attention. Recently, knowledge of the exoplanet LHS 1140b has been improved with ESPRESSO and TESS (Lillo-Box et al. 2020). The mass and radius of LHS 1140b are determined as M = 6.48 ± 0.46 M_{⊕} and R = 1.635 ± 0.046 R_{⊕}. Based on the interior model of Brugger et al. (2017) and the method of Dorn et al. (2015), Lillo-Box et al. (2020) also found the WMF of LHS 1140b peaking at WMF = 4 × 10^{−2}, around 80 times the water content adopted for the Earth. Figures 6 and 7 show the MDN predictions for the Earth and the super-Earth exoplanet LHS 1140b using mass, radius, and WMF as inputs to the MDN, compared with the validation data set consisting of the interior model solutions. In both cases, the MDN predictions are consistent with the interior model solutions. 
For the layer thicknesses, as discussed in the previous subsection, the MDN predicts the CRF and MRF of LHS 1140b better than those of the Earth since LHS 1140b exhibits a larger core and a smaller mantle. The MDN predictions of T_{CMB} and P_{CMB} show particularly good agreement with the interior model solutions for both the Earth and LHS 1140b, while the CMFs of the Earth and LHS 1140b are moderately and adequately predicted, respectively.
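One simple way to build perturbed (M, R) samples for such a validation set is sketched below. It does not reproduce Eq. (3), whose displayed form is not available in this copy; instead it assumes independent Gaussian relative deviations ΔM/M and ΔR/R with relative standard deviation σ. The function name and the sample size are illustrative.

```python
import random


def perturbed_samples(mass, radius, sigma, n, seed=0):
    """Draw n (mass, radius) pairs scattered around the nominal values.

    Assumption for illustration: independent Gaussian relative deviations
    with standard deviation sigma (e.g. 0.01 for 1%).
    """
    rng = random.Random(seed)
    return [
        (mass * (1.0 + rng.gauss(0.0, sigma)),
         radius * (1.0 + rng.gauss(0.0, sigma)))
        for _ in range(n)
    ]


# Nominal values in Earth units: the Earth and LHS 1140b (Lillo-Box et al. 2020).
earth_1pct = perturbed_samples(1.0, 1.0, 0.01, 1000)
lhs1140b_2pct = perturbed_samples(6.48, 1.635, 0.02, 1000)

mean_mass = sum(m for m, _ in lhs1140b_2pct) / len(lhs1140b_2pct)
```

Each perturbed pair would then be solved with the interior model at the adopted WMF, and the resulting spread of solutions forms the bands against which the MDN predictions are compared.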
Fig. 4
Predicted distributions of six output quantities from the MDN trained on mass and radius inputs only versus the actual values obtained from the interior models for the testing data set: water radial fraction, mantle radial fraction, core radial fraction, core mass fraction, as well as pressure and temperature at the core-mantle boundary (CMB). Blue dashed diagonal lines denote a perfect performance of the MDN. The predicted distributions are colored according to the local probability density (the color scale is black at the maximum). 
Fig. 5
Predicted distributions of six output quantities from the MDN trained on mass, radius, and water mass fraction inputs, as shown in Fig. 4. 
Fig. 6
MDN predictions of the layer thicknesses and core properties for the Earth. Colored lines denote the combined Gaussian mixture prediction of the MDN. Vertical bands denote the validation results from the interior models, where the shade of each band corresponds to different standard deviations σ (σ = 1% for darker tones and σ = 2% for lighter tones). 
Fig. 7
MDN predictions of the layer thicknesses and core properties for the super-Earth exoplanet LHS 1140b, as shown in Fig. 6. 
5 Summary
In this paper, we present a ML approach to infer the interior structure of rocky exoplanets with 0.1 M_{⊕} < M < 10 M_{⊕}. Threelayer interior models are constructed using ExoPlex and data training is performed based on the MDN model. The resulting MDN model is available on GitHub^{2}. We comparethe MDN predictions with the results obtained from the interior model calculations. The MDN model shows good performancein predicting the layer thicknesses and core properties of rocky exoplanets, as shown in Fig. 5. The MDN can be efficiently trained using the highdimensional data of planetary interiors to bypass the interior model calculations. We also apply the MDN model to the Earth and the superEarth exoplanet LHS 1140b. Our finding is that the MDN predictions are consistent with the interior model solutions, as shown in Figs. 6 and 7. The trained MDN model offers a convenient and powerful tool for quick knowledge of planetary interiors and, hence, more insight into exoplanetary habitability.
To date, determining the water content of rocky exoplanets has proven to be a challenging task. Upcoming space- and ground-based observatories, such as the James Webb Space Telescope (JWST) and the next generation of Extremely Large Telescopes, might make it possible in the future. Direct measurements of water vapor in exoplanetary atmospheres are expected to be extended from hot Jupiters to planets with considerably lower masses and temperatures, probably in the super-Earth or even Earth-sized regime (Kreidberg et al. 2014; Benneke et al. 2019; Pinhas et al. 2019). Spurred on by the search for habitable worlds and signals of life, characterizing the surface of rocky exoplanets might be possible for non-transiting exoplanets through measurements of integrated spectra and light curves (Montañés-Rodríguez et al. 2006; Fujii et al. 2010).
The present analysis is preliminary and is meant to serve as a starting point for analyses of this type. In fact, the internal structure and composition of rocky exoplanets could be more complex than what we consider here. The interior-modeling effort could be extended by including more elements, such as adding atmospheric layers above the water-ice shell and considering more mineralogical transformations in the mantle. This would introduce a larger degeneracy in the model solutions, a situation in which the MDN has clear advantages over other approaches, since it returns a probability distribution over solutions rather than a single value. Moreover, it has been demonstrated by Baumeister et al. (2020) that adding the fluid Love number k_{2} as an additional input to the MDN significantly reduces the degeneracy of the MDN predictions with regard to possible interior structures. Similarly, the inclusion of the WMF input considerably improves the performance of the MDN, as shown in Figs. 4 and 5. Therefore, additional inputs to the MDN could be worthwhile, along with more complex interior models.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant No. 12022517), the Science and Technology Development Fund, Macau SAR (File Nos. 0005/2019/A1 and 0048/2020/A1), and the Pre-Research Projects on Civil Aerospace Technologies of China National Space Administration (Grant Nos. D020308 and D020303).
References
Abadi, M., Barham, P., Chen, J., et al. 2016, TensorFlow: a System for Large-Scale Machine Learning (Berkeley: USENIX Association)
Alibert, Y., & Venturini, J. 2019, A&A, 626, A21
Anderson, W. W., & Ahrens, T. J. 1994, J. Geophys. Res. Solid Earth, 99, 4273
Atkins, C., Valentine, A. P., Tackley, P. J., & Trampert, J. 2016, Phys. Earth Planet. Inter., 257, 171
Batalha, N. M. 2014, PNAS, 111, 12647
Baumeister, P., Padovan, S., Tosi, N., Montavon, G., & Nettelmann, N. 2020, ApJ, 889, 42
Benneke, B., Wong, I., Piaulet, C., et al. 2019, ApJ, 887, L14
Bezacier, L., Journaux, B., Perrillat, J.-P., et al. 2014, J. Chem. Phys., 141, 104505
Bishop, C. M. 1994, Mixture Density Networks, Tech. Rep. NCRG/94/004
Boujibar, A., Driscoll, P., & Fei, Y. 2020, J. Geophys. Res. Planets, 125, e2019JE006124
Brugger, B., Mousis, O., Deleuil, M., & Deschamps, F. 2017, ApJ, 850, 93
Chollet, F., et al. 2015, Keras, https://keras.io
Connolly, J. A. D. 2009, Geochem. Geophys. Geosyst., 10, Q10014
Dorn, C., Khan, A., Heng, K., et al. 2015, A&A, 577, A83
Dorn, C., Venturini, J., Khan, A., et al. 2017, A&A, 597, A37
Dorn, C., Bower, D. J., & Rozel, A. 2018, in Handbook of Exoplanets, Assessing the Interior Structure of Terrestrial Exoplanets with Implications for Habitability (Berlin: Springer)
Driscoll, P. E. 2018, in Handbook of Exoplanets, Planetary Interiors, Magnetic Fields, and Habitability (Berlin: Springer), 1
Feistel, R., & Wagner, W. 2006, J. Phys. Chem. Ref. Data, 35, 1021
Foley, B. J. 2019, ApJ, 875, 72
Fujii, Y., Kawahara, H., Suto, Y., et al. 2010, ApJ, 715, 866
Gillon, M., Triaud, A. H. M. J., Demory, B.-O., et al. 2017, Nature, 542, 456
Godolt, M., Tosi, N., Stracke, B., et al. 2019, A&A, 625, A12
Goodfellow, I., Bengio, Y., & Courville, A. 2016, Deep Learning (Cambridge: MIT Press)
Hoolst, T. V., Noack, L., & Rivoldini, A. 2019, Adv. Phys. X, 4, 1630316
Ioffe, S., & Szegedy, C. 2015, ArXiv preprint [arXiv:1502.03167]
Kingma, D. P., & Ba, J. 2014, ArXiv preprint [arXiv:1412.6980]
Kite, E. S., & Ford, E. B. 2018, ApJ, 864, 75
Kohavi, R. 1995, A study of cross-validation and bootstrap for accuracy estimation and model selection, Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, Canada
Kreidberg, L., Bean, J. L., Désert, J.-M., et al. 2014, ApJ, 793, L27
LeCun, Y., Bengio, Y., & Hinton, G. 2015, Nature, 521, 436
Lillo-Box, J., Figueira, P., Leleu, A., et al. 2020, A&A, 642, A121
Lorenzo, A., & Unterborn, C. 2018, https://doi.org/10.5281/zenodo.1208161
Martin, C., & Duhaime, D. 2019, https://doi.org/10.5281/zenodo.2578015
McCauliff, S. D., Jenkins, J. M., Catanzarite, J., et al. 2015, ApJ, 806, 6
Montañés-Rodríguez, P., Pallé, E., Goode, P. R., & Martín-Torres, F. J. 2006, ApJ, 651, 544
Montavon, G., Orr, G., & Müller, K.-R. 2012, Neural Networks: Tricks of the Trade, 2nd edn. (Berlin: Springer)
Noack, L., Snellen, I., & Rauer, H. 2017, Space Sci. Rev., 212, 877
Pearson, K. A., Palafox, L., & Griffith, C. A. 2018, MNRAS, 474, 478
Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, J. Mach. Learn. Res., 12, 2825
Pinhas, A., Madhusudhan, N., Gandhi, S., & MacDonald, R. 2019, MNRAS, 482, 1485
Raymond, S. N., Quinn, T., & Lunine, J. I. 2007, Astrobiology, 7, 66
Rogers, L. A., & Seager, S. 2010, ApJ, 712, 974
Schanche, N., Cameron, A. C., Hébrard, G., et al. 2019, MNRAS, 483, 5534
Schubert, G., Turcotte, D. L., & Olson, P. 2001, Mantle Convection in the Earth and Planets (New York: Cambridge University Press)
Seager, S., Kuchner, M., Hier-Majumder, C. A., & Militzer, B. 2007, ApJ, 669, 1279
Shallue, C. J., & Vanderburg, A. 2018, AJ, 155, 94
Spiegel, D. S., Fortney, J. J., & Sotin, C. 2014, PNAS, 111, 12622
Sotin, C., Grasset, O., & Mocquet, A. 2007, Icarus, 191, 337
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. 2014, J. Mach. Learn. Res., 15, 1929
Stixrude, L., & Lithgow-Bertelloni, C. 2005, Geophys. J. Int., 162, 610
Stixrude, L., & Lithgow-Bertelloni, C. 2011, Geophys. J. Int., 184, 1180
Tian, F., & Ida, S. 2015, Nat. Geosci., 8, 177
Turbet, M., Bolmont, E., Ehrenreich, D., et al. 2020, A&A, 638, A41
Ulmer-Moll, S., Santos, N. C., Figueira, P., Brinchmann, J., & Faria, J. P. 2019, A&A, 630, A135
Unterborn, C. T., & Panero, W. R. 2019, J. Geophys. Res. Planets, 124, 1704
Valencia, D., O’Connell, R. J., & Sasselov, D. 2006, Icarus, 181, 545
Wagner, F. W., Tosi, N., Sohl, F., Rauer, H., & Spohn, T. 2012, A&A, 541, A103
Zeng, L., & Sasselov, D. 2013, PASP, 125, 227
Zhao, Y., Zhang, Y., Geng, M., Jiang, J., & Zou, X. 2019, Geophys. Res. Lett., 46, 5234