A&A, Volume 686, June 2024
Article Number: L7
Number of pages: 9
Section: Letters to the Editor
DOI: https://doi.org/10.1051/0004-6361/202450223
Published online: 30 May 2024
Letter to the Editor
NeuralCMS: A deep learning approach to study Jupiter’s interior
1 Department of Earth and Planetary Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
e-mail: maayan.ziv@weizmann.ac.il
2 Université Côte d’Azur, Observatoire de la Côte d’Azur, CNRS, Laboratoire Lagrange, Nice, France
3 Institut für Astrophysik, Universität Zürich, Winterthurerstr. 190, 8057 Zürich, Switzerland

Received: 3 April 2024
Accepted: 6 May 2024
Context. NASA’s Juno mission provided exquisite measurements of Jupiter’s gravity field that, together with the Galileo entry probe atmospheric measurements, constrain the interior structure of the giant planet. Inferring the range of plausible interior structures remains a challenging inverse problem: it requires a computationally intensive search over combinations of planetary properties, such as the cloud-level temperature, composition, and core features, entailing the computation of ∼10⁹ interior models.
Aims. We propose an efficient deep neural network (DNN) model to generate high-precision, wide-ranging interior models based on the very accurate but computationally demanding concentric Maclaurin spheroid (CMS) method.
Methods. We trained a sharing-based DNN with a large set of CMS results for a four-layer interior model of Jupiter, including a dilute core, to accurately predict the gravity moments and mass, given a combination of interior features. We evaluated the performance of the trained DNN (NeuralCMS) to inspect its predictive limitations.
Results. NeuralCMS shows very good performance in predicting the gravity moments, with errors comparable to the uncertainty due to differential rotation, and a very accurate mass prediction. This allowed us to perform a broad parameter space search by computing only ∼10⁴ actual CMS interior models, resulting in a large sample of plausible interior structures, and reducing the computation time by a factor of 10⁵. Moreover, we used a DNN explainability algorithm to analyze the impact of the parameters setting the interior model on the predicted observables, providing information on their nonlinear relation.
Key words: methods: numerical / planets and satellites: gaseous planets / planets and satellites: interiors / planets and satellites: individual: Jupiter
© The Authors 2024
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
The interior structure of Jupiter holds information on its formation and evolution processes, two research fields that are highly related to one another (Helled et al. 2022; Miguel et al. 2022). The range of plausible interior structures of Jupiter is constrained by the gravity field accurately measured by NASA’s Juno mission (Bolton et al. 2017; Iess et al. 2018; Durante et al. 2020) and by atmospheric measurements from both Juno (Li et al. 2020) and the Galileo probe (von Zahn et al. 1998; Seiff et al. 1998; Wong et al. 2004). It is also affected by the surface winds and their internal structure, which significantly contribute to the gravity field (Kaspi et al. 2018, 2023). Inferring this range requires the exploration of a large parameter space of interior models to identify those consistent with the observations.
Two approaches relate the above observables to the physical parameters defining the interior structure of a gas giant: the theory of figures (ToF; Zharkov & Trubitsyn 1978), implemented, for example, to seventh order by Nettelmann et al. (2021) and to fourth order (Nettelmann 2017) in the CEPAM model (Guillot & Morel 1995; Guillot et al. 2018) used by Howard et al. (2023a); and the more accurate concentric Maclaurin spheroid (CMS) method (Hubbard 2013), used by Militzer et al. (2022), which is more computationally demanding (Militzer et al. 2019). One way to overcome the computational burden of the CMS approach is to correct the ToF results with offsets to the gravity moments that make up for the precision difference (Guillot et al. 2018; Miguel et al. 2022). However, the offsets are defined for specific parameters and might not represent the entire parameter space.
Previous studies have suggested deep learning approaches to characterize the interiors of exoplanets by predicting the distribution of interior features given the planetary mass, radius, and several additional parameters (e.g., the fluid Love number k2, the effective temperature, the temperature at one bar), thus addressing the inverse problem directly (Baumeister et al. 2020; Zhao & Ni 2022; Baumeister & Tosi 2023). Recently, Haldemann et al. (2023) presented an approach that allows inference in both directions: the inverse problem and the forward interior model.
In this work we present NeuralCMS, a new approach to accelerate the CMS method by predicting the model results with a deep neural network (DNN), which in practice can compute millions of interior models simultaneously, or a single interior model within milliseconds. DNNs are a suitable choice for regressing the CMS results as they can theoretically approximate any nonlinear function between an adjustable number of inputs and outputs (Goodfellow et al. 2016). We used a DNN for the principal task of approximating the detailed forward CMS model constrained by the gravity moments and mass. NeuralCMS can then be used in any search algorithm, such as Monte Carlo methods (Miguel et al. 2022; Militzer 2023), to assemble a sample of plausible interior structures. We also demonstrate that, with the advance of explainable DNN techniques (Samek et al. 2021), further investigation of the nonlinear relations between the interior features and the observables becomes possible.
In Sect. 2 we describe the numerical and theoretical interior model, followed by a description of the dataset used to train the DNN. In Sect. 3 we present the DNN architecture, performance, and training specifics. In Sect. 4 we demonstrate the efficiency gained by our approach by performing a simple grid search for plausible interior solutions for Jupiter.
2. Jupiter interior structure model
Our numerical interior model is based on a publicly available CMS model (Movshovitz 2019; Movshovitz et al. 2020). CMS (Hubbard 2013) is an iterative method to compute the shape and gravity harmonics (J2n) of a rotating fluid planet. The planet is assembled from multiple concentric Maclaurin spheroids, each set by its equatorial radius; imposing hydrostatic equilibrium between the gravitational and rotational potentials, the method solves for the shape of each spheroid (Fig. 1). We modeled Jupiter with N = 1041 spheroids, spaced as in Howard et al. (2023a). We validated our CMS model against the analytic n = 1 polytrope solution (Wisdom & Hubbard 2016; see Appendix A).
Fig. 1. Schematic view of Jupiter’s dilute core model used in this study and the computational process: given a combination of the seven marked interior parameters (left), the CMS method (right) converges to solve the gravity moments and mass, which are then compared to the Juno measurements to determine feasibility. The image in Jupiter’s schematic (left) is available at https://www.planetary.org/space-images/merged-cassini-and-juno.
We constructed a four-layer model of Jupiter, as shown in Fig. 1, similar to Miguel et al. (2022) and Howard et al. (2023a). The outer layer is mostly composed of hydrogen and helium, with mass fractions X1 and Y1, respectively. We set Y1/(X1 + Y1) = 0.238 to be consistent with the Galileo probe measurements (von Zahn et al. 1998). The mass fraction of heavier elements, or metallicity, in this layer is denoted Z1, and was constrained in the atmosphere by both Juno and Galileo to be higher than the solar abundance (Wong et al. 2004; Li et al. 2020; Howard et al. 2023a). Recent interior models still struggle to reproduce this important observation (Howard et al. 2023b). The outer envelope is treated as adiabatic, with a constant entropy determined by the temperature at one bar, measured by the Galileo probe to be T1bar = 166.1 ± 0.8 K (Seiff et al. 1998) and recently suggested to reach T1bar = 170.3 ± 3.8 K after a reassessment of the Voyager radio occultations (Gupta et al. 2022).
The boundary between the inner He-rich and the outer He-poor envelopes is set by the pressure P12, representing the region where immiscibility of He in H occurs, which, based on simulations of phase separation of H and He mixtures, should lie between ∼0.8 and ∼3 Mbar (Morales et al. 2013). We set the metallicity of the inner envelope to Z2 = Z1. We then implemented a dilute core by imposing an inward increase in the mass fraction of heavy elements, using the formulation of Miguel et al. (2022) with two main controlling parameters: Zdilute, the maximum mass fraction of heavy elements in the dilute core, and mdilute, the extent of the dilute core in normalized mass (see Appendix B for more details). The helium mass fraction in the inner envelope and the dilute core regions is set by requiring the planet’s overall helium abundance to be consistent with the protosolar value, Yproto = 0.278 ± 0.006 (Serenelli & Basu 2010). Most recent models agree on the presence of a dilute core inside Jupiter, although its extent exhibits discrepancies between interior and formation models; recently, models with a dilute core small enough to be consistent with Juno were suggested (Howard et al. 2023a). Finally, we allowed for the presence of a compact core composed of heavy materials only, controlled by its normalized radius rcore.
It was shown that the choice of the equation of state (EOS) for hydrogen and helium strongly affects the interior model (Miguel et al. 2016; Howard et al. 2023a). For this work, we did not explore this effect but used only the pure H and He tables from Chabrier et al. (2019), and the nonideal mixing effect tables, accounting for the interactions between H and He, from Howard & Guillot (2023). We used the Sesame water EOS (Lyon & Johnson 1992) for heavy materials. We used the additive volume law combined with the nonideal corrections to compute the density and entropy at a given pressure and temperature (Howard & Guillot 2023; Howard et al. 2023b). The EOS for the compact core is the analytical solution from Hubbard & Marley (1989).
Unlike other CMS-based models, we did not restrict the calculated planetary mass to a specific value. As with the gravity moments, the mass is compared to the Juno-derived value within its uncertainty. The mass was computed through the observed GM = 1.266865341 × 10¹⁷ m³ s⁻² (Durante et al. 2020), and while our model does not use the Newtonian constant of gravitation G explicitly, the range of its suggested values results in a noticeable uncertainty in Jupiter’s mass. Dividing the measured GM by the extremum values of G collected by CODATA (Tiesinga et al. 2021) gives a mass range between 1.8978 and 1.8988 × 10²⁷ kg. We used G = 6.673848 × 10⁻¹¹ m³ kg⁻¹ s⁻² (Mohr et al. 2012), to be consistent with Howard et al. (2023a), resulting in MJ = 1.8983 ± 0.0005 × 10²⁷ kg. Deviations in the equatorial radius Req, stemming either from measurement errors (Lindal 1992) or from the dynamical height due to the wind (Galanti et al. 2023), suggest a similar mass uncertainty.
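This bookkeeping is easy to verify numerically. The snippet below, a minimal check in Python (ours, not part of the NeuralCMS code), divides the measured GM by the adopted G to recover MJ, and back-computes the values of G implied by the quoted mass extremes.

```python
# Mass from the Juno-derived GM and a chosen value of G (see text).
GM = 1.266865341e17   # m^3 s^-2 (Durante et al. 2020)
G = 6.673848e-11      # m^3 kg^-1 s^-2 (Mohr et al. 2012)

M_J = GM / G
print(f"M_J = {M_J:.4e} kg")  # ~1.8983e27 kg, as quoted above

# Back-compute the G values corresponding to the quoted mass range:
for M_bound in (1.8978e27, 1.8988e27):
    print(f"M = {M_bound:.4e} kg  <-  G = {GM / M_bound:.5e} m^3 kg^-1 s^-2")
```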
We reduced the characterization of Jupiter’s interior to the seven input parameters shown in Fig. 1 and allowed them to vary, while Req = 71 492 km (Lindal 1992) and Jupiter’s rotation rate of 9 h 55 min 29.7 s (Riddle & Warwick 1976) were kept constant. For training, we used previously computed results of over 10⁶ CMS interior models, with three setups of 7D input samples: (1) sparsely gridded inputs and (2) randomly sampled inputs, both yielding a large range in the outputs J2n and mass; and (3) densely gridded inputs whose outputs are closer to the Juno measurements. The deep learning model was trained to accurately predict a broad range of interior models, to be used in a wide search for plausible models. The training dataset range and distribution are presented in Table B.1 and Fig. B.1.
3. A deep sharing-based neural network
In recent years, feedforward neural networks have been a popular machine-learning approach in many research fields. They are capable of deciphering information and relations in multidimensional data (LeCun et al. 2015). They are generally composed of input and output layers connected through several hidden layers, all built from a varying number of neurons. Their training is controlled by minimizing a so-called loss function between the predicted and the true output values, practically by optimizing the weights and biases on the connections between neurons (LeCun et al. 2015). Many machine-learning algorithms were suggested to address multi-target regression problems where the outputs are correlated (Borchani et al. 2015; Cui et al. 2018).
For this work we adopted a sharing-based architecture (Caruana 2002; Reyes & Ventura 2019), shown in Fig. 2. Our feedforward DNN comprises an input layer fully connected to a shared hidden layer with 1024 neurons. The shared layer is fully connected to five separate private hidden blocks, one for each output. Each block contains two 1024-neuron fully connected layers computing a single output value. The internal hidden layers are activated with the nonlinear Rectified Linear Unit (ReLU) function (Goodfellow et al. 2016), which is commonly used and easy to optimize. Since the gravity moments and mass are all functions of the density structure and shape of the planet, the correlation between them is designed to be learned by the shared layer. The private blocks then act as single-output regressors. We find that using the mass as an output parameter improves the prediction of J2n, but the mass itself is more precisely predicted by an entirely separate DNN that takes the same seven input parameters and has four fully connected layers with 1024, 512, 256, and 128 neurons, respectively, each activated with ReLU. Appendix C discusses other architectures that were tested. A sketch of the shared architecture is given below.
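The following is a minimal PyTorch sketch consistent with the description above; the layer sizes and activations come from the text, while the class and variable names are ours, and this is not the authors’ released code.

```python
import torch
import torch.nn as nn

class SharingBasedDNN(nn.Module):
    """One shared hidden layer feeding five private two-layer blocks,
    one per output (J2, J4, J6, J8, and mass)."""

    def __init__(self, n_inputs: int = 7, n_outputs: int = 5, width: int = 1024):
        super().__init__()
        # Shared layer: learns the correlations between the outputs, which
        # all depend on the planet's density structure and shape.
        self.shared = nn.Sequential(nn.Linear(n_inputs, width), nn.ReLU())
        # Private blocks: two 1024-neuron layers each, acting as
        # single-output regressors on top of the shared representation.
        self.private = nn.ModuleList([
            nn.Sequential(
                nn.Linear(width, width), nn.ReLU(),
                nn.Linear(width, width), nn.ReLU(),
                nn.Linear(width, 1),
            )
            for _ in range(n_outputs)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.shared(x)
        return torch.cat([block(h) for block in self.private], dim=-1)

model = SharingBasedDNN()
print(model(torch.rand(3, 7)).shape)  # torch.Size([3, 5])
```

The separate mass network described above would replace the mass column with the output of a plain 1024-512-256-128 fully connected stack.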
Fig. 2. Schematic diagram of the DNN architecture presented in this study. The hidden layers, both shared and private, are marked by dashed black outlines and contain 1024 neurons each. The mass is used for training on the gravity moments J2n but is predicted separately.
We normalized the inputs to values between zero and one using the bounds of the training dataset (see Table B.1). The gravity moments were scaled by 10⁶ and taken as positive values, and the mass was scaled by 10⁻²⁷. We initialized the DNN weights with the Kaiming uniform distribution to prevent vanishing or exploding activations from harming the training (He et al. 2015). During training, we compared the predicted and true output values using a weighted mean squared error loss function,

L = (1/(N W)) Σ_{i=1}^{N} [ Σ_{n=1}^{4} (Ĵ_{2n,i} − J_{2n,i})² / (3σ_{2n})² + (M̂_i − M_i)² / ΔM² ],

where hats denote DNN predictions; N is the number of samples; ΔM = 0.0005 × 10²⁷ kg is the mass uncertainty discussed in Sect. 2; 3σ_{2n} is Juno’s 3σ uncertainty for J2n (Durante et al. 2020); and W = Σ_{n=1}^{4} (3σ_{2n})⁻² + ΔM⁻² is the sum of the weights. This gives a larger weight to the more accurately measured observables. The loss function was minimized using the Adam optimizer (Kingma & Ba 2014), with a learning rate starting from 0.001 and reduced tenfold at manually chosen epochs. We took 80% of the models from the full dataset for training, leaving the rest to validate the trained DNN. The DNN was trained for 700 epochs using the PyTorch library (Paszke et al. 2019), showing no overfitting (Fig. 3f).
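In code, this loss amounts to an inverse-variance-weighted mean squared error. Below is a minimal sketch of our own; the uncertainty values are placeholders standing in for the Durante et al. (2020) measurements, not the published numbers.

```python
import torch

def weighted_mse(pred: torch.Tensor, true: torch.Tensor,
                 unc: torch.Tensor) -> torch.Tensor:
    """MSE with each output weighted by the inverse square of its
    measurement uncertainty, normalized by the sum of the weights."""
    w = 1.0 / unc**2                                   # per-output weights
    per_sample = (w * (pred - true) ** 2).sum(dim=-1)  # weighted squared errors
    return per_sample.mean() / w.sum()                 # average over samples

# Placeholder uncertainties for (J2, J4, J6, J8) scaled by 1e6 and the
# mass scaled by 1e-27 (Delta_M = 0.0005); not the published values.
unc = torch.tensor([1e-2, 1e-2, 1e-2, 1e-2, 5e-4])
loss = weighted_mse(torch.rand(8, 5), torch.rand(8, 5), unc)
```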
Fig. 3. Performance of NeuralCMS on a sample of 10⁴ models from the validation dataset (a–e). The dashed black lines are the standard deviation of the full validation dataset error ϵσ. The red patch represents the combined uncertainty from dynamics (Miguel et al. 2022) and measurement errors (Durante et al. 2020) for the gravity harmonics.
The performance of our DNN was evaluated by the prediction errors ϵ for each output. Figure 3 shows the prediction errors as a function of the true output values, compared to the uncertainty stemming from measurement errors (Durante et al. 2020) and the wind (Galanti et al. 2023; Miguel et al. 2022). For all model outputs except J2, the standard deviation of the prediction error ϵσ is smaller than the combined uncertainty, and it is mostly comparable to the wind-derived uncertainty. Specifically, for J2 × 10⁶, ϵσ = 0.789 is about twice the wind-related uncertainty but smaller than the offset applied to ToF results by a factor of ∼2−7 (Guillot et al. 2018; Miguel et al. 2022). Debras & Chabrier (2018) evaluated the uncertainty on J2 × 10⁶ related to assumptions of the CMS method to be roughly 0.1, which is lower than ϵσ. Deviations in J2 × 10⁶ from the analytical solution for the n = 1 polytrope found in previous studies with a similar number of spheroids are between ∼0.1 and ∼3 (Wisdom & Hubbard 2016; Guillot et al. 2018; Debras & Chabrier 2018; Nettelmann et al. 2021). We note that the ToF offsets and the deviations from the polytropic solution are systematic errors, whereas ϵσ spans both positive and negative values (see Fig. 3). Table D.1 compares these sources of error and uncertainty with the DNN’s performance.
The relatively small prediction errors allow the DNN to eliminate the vast majority of interior models that deviate from the Juno measurements. Moreover, the prediction errors are independent of the output values, highlighting the DNN’s ability to predict a large range of interior models. Because the measured J2 and J4 are very accurate compared to the DNN errors, interior models still need to be calculated with CMS to eliminate models falsely predicted by the DNN to be consistent with the observations. Also, the CMS output contains valuable information we wish to retrieve, such as the density and composition profiles.
4. Performance and interpretation of NeuralCMS
To demonstrate the computational efficiency gained by using NeuralCMS, we performed a simple grid search exploring all possible combinations of an equally spaced grid with m points for each of the seven parameters. We ran the first grid search using only the DNN, with a wide range for all the parameters given by the axes bounds in Fig. 4, the determined Yproto = 0.278 ± 0.006 (Serenelli & Basu 2010), and P12 between 0.8 and 5 Mbar, with m = 17 grid points per parameter, exploring over 4 × 10⁸ interior models. This procedure takes ∼2 h. The results were used to reduce the range of the input parameters by eliminating models that do not satisfy a wind-effect criterion around the Juno measurements, widened by the maximal absolute prediction errors on the validation dataset, ϵmax. The wind-effect criterion is the range by which we allow models to deviate from the Juno measurements, in anticipation of a subsequent exploration of the effect of a coupled wind model: it accepts models that are within 2 × 10⁻⁶ and 10⁻⁶ of the measured J2 and J4, respectively, and within the mass uncertainty discussed in Sect. 2. These values were added to the prediction errors considered. The range of Z1 and Zdilute was significantly reduced after the first grid search, as shown in the left column of Fig. 4. The sketch below illustrates this first stage.
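The following illustrative Python sketch mirrors the acceptance logic described above. The parameter bounds, the stand-in model, and the ϵmax values are placeholders of ours, and the demo grid is coarser than m = 17; only the J2 and J4 values are the approximate Juno measurements.

```python
import torch

m = 5  # grid points per parameter in this demo; the paper used m = 17
bounds = [           # illustrative (min, max) per parameter; see Table B.1
    (160.0, 175.0),  # T1bar [K]
    (0.0, 0.06),     # Z1
    (0.8, 5.0),      # P12 [Mbar]
    (0.272, 0.284),  # Yproto
    (0.0, 0.6),      # Zdilute
    (0.1, 0.8),      # mdilute
    (0.0, 0.3),      # rcore
]
axes = [torch.linspace(lo, hi, m) for lo, hi in bounds]
grid = torch.cartesian_prod(*axes)  # (m**7, 7) parameter combinations

# Stand-in for the trained NeuralCMS DNN (inputs should be min-max
# normalized as in Sect. 3); outputs: |J2n| scaled by 1e6, mass by 1e-27.
model = torch.nn.Linear(7, 5)

# Approximate Juno values (Durante et al. 2020), in the scaled units:
J2_obs, J4_obs, M_obs = 14696.57, 586.61, 1.8983
# Wind-effect criterion (2 and 1 in 1e-6 units; 0.0005 in 1e27 kg),
# widened by placeholder eps_max prediction errors:
tol_J2, tol_J4, tol_M = 2.0 + 3.0, 1.0 + 1.5, 0.0005 + 0.001

accepted = []
with torch.no_grad():
    for batch in grid.split(100_000):  # evaluate the grid in batches
        pred = model(batch)
        ok = ((pred[:, 0] - J2_obs).abs() < tol_J2) \
           & ((pred[:, 1] - J4_obs).abs() < tol_J4) \
           & ((pred[:, 4] - M_obs).abs() < tol_M)
        accepted.append(batch[ok])
accepted = torch.cat(accepted)  # candidates for the next stage / CMS runs
```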
Fig. 4. Correlation between interior features for the two grid search stages. The black points are model results, and the blue shading is the models’ distribution. The left column shows accepted models predicted by NeuralCMS in the first grid search, within the DNN’s maximal absolute prediction errors on the validation dataset. The right column shows accepted models computed with CMS in the second, tighter grid search. The axes range is the initial wide search range. The range of P12 and Yproto was not reduced. The middle panels nicely reproduce previous results (Howard et al. 2023a).
Using NeuralCMS, we encountered the known difficulty of finding solutions that are simultaneously consistent with the Juno gravity measurements, the Galileo-measured T1bar, and the high measured atmospheric metallicity (Howard et al. 2023b). Moreover, the correlation between mdilute and Zdilute shown in the middle-left panel of Fig. 4 is similar to the results produced by Howard et al. (2023a) with the same EOS and a slightly different setup (see their Fig. 13), providing another validation of our model. We note that some of the accepted models shown in the left column of Fig. 4 will be eliminated by the CMS calculations, because we allowed for large prediction errors given the relatively sparse grid; the first search therefore serves only to roughly reduce the range of the interior parameters.
The second grid search was performed over the narrowed parameter range obtained from the initial search, with a denser grid of m = 20 points per parameter, exploring 1.28 × 10⁹ interior models. Again we applied the wind-effect criterion, now widened by the 3σ prediction errors on the validation dataset (ϵ3σ), retrieving 10 871 possible models to test against actual CMS calculations. Of these combinations, 2927 interior models were accepted by the wind-effect criterion according to their CMS results. The initial search was enough to produce a compact distribution of models with respect to the parameters presented in Fig. 4, which was further tuned by the second, denser grid search. Testing the grid search predictions made by the DNN against the CMS outputs yields a performance similar to that observed on the validation dataset. Importantly, using NeuralCMS, only ∼10⁴ actual CMS models need to be computed, instead of the impractical ∼10⁹, to assemble a sample of ∼3000 plausible interior models, all falling within ϵ3σ.
The DNN can also be used to reveal the contribution of the interior parameters to the predictions. SHapley Additive exPlanations (SHAP) values are a game-theoretic measure assigning each model input an impact value that represents its contribution to a specific prediction (Lundberg & Lee 2017). SHAP values provide a locally accurate approximation such that, for a single prediction, the SHAP values of all inputs sum to the deviation of the predicted value from its mean over a reference dataset. This provides an interpretation of the magnitude and direction in which each input moves the model predictions. For this work we used Deep SHAP (Lundberg & Lee 2017), which linearizes the DNN’s nonlinear components to backpropagate the SHAP computation through the network. We took all CMS-accepted models as the reference dataset and calculated SHAP values for 500 random models from these accepted models (see the sketch below). As an example, we examined the SHAP values for J6, an observable that is usually difficult to fit (Debras & Chabrier 2019; Militzer et al. 2022). Figure 5 shows these SHAP values for all 500 models, each corresponding to a point in each row. For example, we marked with black circles a specific model with the highest SHAP value for Zdilute (see Table E.1). The parameters controlling the planet’s core (mdilute, Zdilute, and rcore) have the largest effect on the prediction of J6. Moreover, the analysis shows that all model parameters have an overall monotonic effect on J6 (color-coded in Fig. 5). The analysis also highlights the interplay among input features. The SHAP values for T1bar and Z1 exhibit similar magnitudes but opposite directions, such that high values of Z1 (T1bar) contribute positively (negatively), effectively balancing out each other’s contribution to J6. Conversely, the dilute core parameters contribute to J6 with the same sign. Similar results are observed for the other gravity harmonics.
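Such an analysis can be set up with the shap Python library. The sketch below shows the intended usage with a stand-in model and random placeholder data in place of the CMS-accepted sample; only the sample sizes follow the text.

```python
import shap
import torch

# Stand-in single-output model (e.g., predicting J6) and placeholder data;
# in the paper the reference set is the 2927 CMS-accepted models.
model = torch.nn.Sequential(torch.nn.Linear(7, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 1))
reference = torch.rand(2927, 7)                             # placeholder reference dataset
samples = reference[torch.randperm(len(reference))[:500]]   # 500 random accepted models

explainer = shap.DeepExplainer(model, reference)
shap_values = explainer.shap_values(samples)  # per-input contributions, ~(500, 7)

# Local accuracy: for each model, the seven SHAP values sum to the
# deviation of its prediction from the mean prediction over `reference`.
```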
Fig. 5. Contribution, in ppm, to the prediction of J6 of 500 interior models (each point in a row is an individual model) that are consistent with the Juno measurements within the wind-effect criterion. Higher SHAP values correspond to a larger contribution to the predicted J6. Points are stacked vertically where there is a high density of model solutions. The colors scale the values of each interior parameter. For example, only Z1, when high, contributes positively to J6. The black circles correspond to a specific interior model with the highest SHAP value for Zdilute.
5. Conclusion
We present NeuralCMS, an efficient deep learning approach to explore the range of plausible interior structures of Jupiter, constrained by the Juno-measured gravity field and mass. This is done by training a DNN to predict the results of the sophisticated and computationally demanding CMS method. We trained a sharing-based DNN using results from over 10⁶ CMS calculations, showing good performance compared to the uncertainties associated with the gravity moments and mass.
We show that NeuralCMS can be used to eliminate interior models inconsistent with the measured gravity field and to substantially reduce the number of actual CMS runs. We demonstrate its efficiency by performing a grid search for model solutions consistent with Juno. Evaluating over 10⁹ possible models with NeuralCMS allowed us to identify ∼10⁴ candidates on which actual CMS runs were performed, producing a large sample of nearly 3000 plausible interior models and reducing the computational time by a factor of 10⁵. This would not have been computationally feasible using the CMS model alone.
We demonstrate that despite the DNN’s complex nature, it is possible to interpret the relations between the physical interior parameters and their contributions to the observables using SHAP values. As an example, we show that within the range relevant to the Juno measurements, the parameters controlling the planet’s core have the largest impact on the predicted J6, with a high dilute core metallicity associated with a small dilute core extent, and vice versa, the two thus compensating for each other.
NeuralCMS can be used in any search methodology to detect plausible interior structures without a single CMS computation, acknowledging its prediction errors. It can also be expanded to include additional interior parameters (e.g., the equatorial radius and a temperature jump in the He rain region) and to treat more complex interior structures of Jupiter or other gaseous planets. NeuralCMS is available on GitHub.
Acknowledgments
We thank the referee for useful comments and the Juno Interior Working Group for valuable discussions. This work was supported by the Israeli Space Agency and the Helen Kimmel Center for Planetary Science at the Weizmann Institute.
References
- Agarwal, S., Tosi, N., Breuer, D., et al. 2020, Geophys. J. Int., 222, 1656
- Baumeister, P., & Tosi, N. 2023, A&A, 676, A106
- Baumeister, P., Padovan, S., Tosi, N., et al. 2020, ApJ, 889, 42
- Bolton, S. J., Adriani, A., Adumitroaie, V., et al. 2017, Science, 356, 821
- Borchani, H., Varando, G., Bielza, C., & Larranaga, P. 2015, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 5, 216
- Caruana, R. 2002, Neural Networks: Tricks of the Trade (Springer), 165
- Chabrier, G., Mazevet, S., & Soubiran, F. 2019, ApJ, 872, 51
- Cui, L., Xie, X., Shen, Z., Lu, R., & Wang, H. 2018, IISE Trans. Healthcare Syst. Eng., 8, 291
- Debras, F., & Chabrier, G. 2018, A&A, 609, A97
- Debras, F., & Chabrier, G. 2019, ApJ, 872, 100
- Durante, D., Parisi, M., Serra, D., et al. 2020, Geophys. Res. Lett., 47, e86572
- Galanti, E., Kaspi, Y., & Guillot, T. 2023, Geophys. Res. Lett., 50, e2022GL102321
- Goodfellow, I., Bengio, Y., & Courville, A. 2016, Deep Learning, Adaptive Computation and Machine Learning (London: The MIT Press)
- Guillot, T., & Morel, P. 1995, A&AS, 109, 109
- Guillot, T., Miguel, Y., Militzer, B., et al. 2018, Nature, 555, 227
- Gupta, P., Atreya, S. K., Steffes, P. G., et al. 2022, Planet. Sci. J., 3, 159
- Haldemann, J., Ksoll, V., Walter, D., et al. 2023, A&A, 672, A180
- He, K., Zhang, X., Ren, S., & Sun, J. 2015, Proceedings of the IEEE International Conference on Computer Vision, 1026
- Helled, R., Stevenson, D. J., Lunine, J. I., et al. 2022, Icarus, 378, 114937
- Howard, S., & Guillot, T. 2023, A&A, 672, L1
- Howard, S., Guillot, T., Bazot, M., et al. 2023a, A&A, 672, A33
- Howard, S., Guillot, T., Markham, S., et al. 2023b, A&A, 680, L2
- Hubbard, W. B. 2013, ApJ, 768
- Hubbard, W. B., & Marley, M. S. 1989, Icarus, 78, 102
- Iess, L., Folkner, W. M., Durante, D., et al. 2018, Nature, 555, 220
- Kaspi, Y., Galanti, E., Hubbard, W. B., et al. 2018, Nature, 555, 223
- Kaspi, Y., Galanti, E., Park, R. S., et al. 2023, Nat. Astron., 7, 1463
- Kingma, D. P., & Ba, J. 2014, arXiv e-prints [arXiv:1412.6980]
- LeCun, Y., Bengio, Y., & Hinton, G. 2015, Nature, 521, 436
- Li, C., Ingersoll, A., Bolton, S., et al. 2020, Nat. Astron., 4, 609
- Lindal, G. F. 1992, AJ, 103, 967
- Lundberg, S. M., & Lee, S. I. 2017, in Advances in Neural Information Processing Systems 30, eds. I. Guyon, U. V. Luxburg, S. Bengio, et al. (Curran Associates, Inc.), 4765
- Lyon, S., & Johnson, J. 1992, LANL Report LA-UR-92-3407
- Miguel, Y., Guillot, T., & Fayon, L. 2016, A&A, 596, A114
- Miguel, Y., Bazot, M., Guillot, T., et al. 2022, A&A, 662, A18
- Militzer, B. 2023, ApJ, 953, 111
- Militzer, B., Wahl, S., & Hubbard, W. B. 2019, ApJ, 879, 78
- Militzer, B., Hubbard, W. B., Wahl, S., et al. 2022, Planet. Sci. J., 3, 185
- Mohr, P. J., Taylor, B. N., & Newell, D. B. 2012, Rev. Mod. Phys., 84, 1527
- Morales, M. A., Hamel, S., Caspersen, K., & Schwegler, E. 2013, Phys. Rev. B, 87, 174105
- Movshovitz, N. 2019, CMS-Planet GitHub Repository, version 2.0, https://github.com/nmovshov/CMS-planet
- Movshovitz, N., Fortney, J. J., Mankovich, C., Thorngren, D., & Helled, R. 2020, ApJ, 891, 109
- Nettelmann, N. 2017, A&A, 606, A139
- Nettelmann, N., Movshovitz, N., Ni, D., et al. 2021, Planet. Sci. J., 2, 241
- Paszke, A., Gross, S., Massa, F., et al. 2019, Advances in Neural Information Processing Systems, 32
- Reyes, O., & Ventura, S. 2019, Int. J. Neural Syst., 29, 1950014
- Riddle, A. C., & Warwick, J. W. 1976, Icarus, 27, 457
- Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Müller, K.-R. 2021, Proc. IEEE, 109, 247
- Seiff, A., Kirk, D. B., Knight, T. C. D., et al. 1998, J. Geophys. Res., 103, 22857
- Serenelli, A. M., & Basu, S. 2010, ApJ, 719, 865
- Shrikumar, A., Greenside, P., & Kundaje, A. 2017, International Conference on Machine Learning, PMLR, 3145
- Tiesinga, E., Mohr, P. J., Newell, D. B., & Taylor, B. N. 2021, J. Phys. Chem. Ref. Data, 50, 033105
- von Zahn, U., Hunten, D. M., & Lehmacher, G. 1998, J. Geophys. Res., 103, 22815
- Wisdom, J., & Hubbard, W. B. 2016, Icarus, 267, 315
- Wong, M. H., Mahaffy, P. R., Atreya, S. K., Niemann, H. B., & Owen, T. C. 2004, Icarus, 171, 153
- Zhao, Y., & Ni, D. 2022, A&A, 658, A201
- Zharkov, V. N., & Trubitsyn, V. P. 1978, Physics of Planetary Interiors (Pachart Publishing House), 388
Appendix A: CMS model validation
To validate our CMS model, we tested it against the analytic Bessel solution for Jupiter’s uniformly rotating n = 1 polytrope (Wisdom & Hubbard 2016). This was done using the same spheroid radii grid as used by Howard et al. (2023a). Table A.1 shows our model convergence to the analytic Bessel solution with an increasing number of spheroids. Our model exhibits good convergence with N = 1024 spheroids.
Table A.1. Validation of our CMS model for Jupiter’s n = 1 polytrope with an increasing number of CMS spheroids (N).
Appendix B: Dilute core formulation and the training data range
Jupiter’s dilute core is implemented differently in various interior models. For this Letter, we followed the formulation of Miguel et al. (2022), defining the mass fraction of heavy elements in the inner envelope and the dilute core as

Z(m) = Zdilute − (1/2)(Zdilute − Z2) [1 + tanh((m − mdilute) / δmdil)],

where m is the normalized mass, Zdilute is the maximum metallicity in the dilute core, mdilute represents the extent of the dilute core in normalized mass, and δmdil controls the steepness of the metallicity gradient and was set to δmdil = 0.075. This region is not an adiabat, but we treat each spheroid as one. We infer the interpolated entropy (S) and temperature (T) profiles such that, for each spheroid i in the dilute core region, Ti+1 = T(Pi+1, Si, Yi, Zi) and Si+1 = S(Pi+1, Ti+1, Yi+1, Zi+1).
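A short sketch of this profile (with illustrative parameter values of our choosing) shows the intended limiting behavior: Z → Zdilute deep inside and Z → Z2 in the envelope.

```python
import numpy as np

def Z_profile(m, Z2=0.02, Zdilute=0.18, mdilute=0.45, dm_dil=0.075):
    """Heavy-element mass fraction vs. normalized mass m: a smooth tanh
    transition from the envelope value Z2 up to Zdilute in the dilute core."""
    return Zdilute - 0.5 * (Zdilute - Z2) * (1.0 + np.tanh((m - mdilute) / dm_dil))

m = np.linspace(0.0, 1.0, 11)
print(np.round(Z_profile(m), 4))  # ~0.18 deep inside, ~0.02 in the outer envelope
```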
Table B.1 and Fig. B.1 show the parameter range and distribution of the training and validation datasets. The range of most parameters is larger than observational or theoretical constraints to allow for a broad exploration of the parameter space.
Fig. B.1. Distribution of the inputs (blue) and the outputs (red) in the training dataset. For Yproto, the solid blue line is the determined value and the dashed blue lines represent the uncertainty (Serenelli & Basu 2010). The solid black lines are Juno-derived values (Durante et al. 2020). We note that some of the histogram bins for the outputs are not visible due to the high frequency of models close to the Juno values. We refer readers to Table B.1 and the x-axis range for the full range of the dataset.
Table B.1. Parameter range of the training and validation datasets.
Appendix C: Exploration of DNN architectures
We found our proposed deep learning architecture (Fig. 2) to perform best at predicting the gravity moments compared to the other architectures tested. First, we tried a fully connected network architecture (LeCun et al. 2015) of varying depth and size (i.e., varying numbers of hidden layers and of neurons per layer) predicting all five output parameters together. Second, we tried a similar fully connected architecture regressing each parameter separately, which was successful only for the mass prediction. Lastly, we adopted a sharing-based architecture (Caruana 2002), again tested with different sizes and depths. Our chosen architecture may seem large compared, for example, with the network used by Agarwal et al. (2020) for a similar regression task, with three hidden layers of fewer than 100 neurons each. Those authors compensated for the small network size with a very long training of 4.4 × 10⁶ epochs, a few orders of magnitude longer than our training process. Again, we note that no overfitting occurred during the training, supporting the validity of the chosen architecture.
Appendix D: Quantitative DNN performance evaluation
In addition to the performance evaluation shown in Fig. 3, we present a quantitative comparison of the DNN mean prediction errors with other sources of uncertainty in Table D.1. The 1σ prediction errors on the validation dataset (ϵσ) are comparable to the Juno measurement uncertainty for J6 and J8. All prediction errors are comparable to the uncertainty due to the wind, and they are significantly lower than the offsets that must be applied to results produced by the ToF expansion (Zharkov & Trubitsyn 1978). The maximal absolute prediction errors (ϵmax) are comparable to the ToF offsets of Guillot et al. (2018). In our CMS model, we set the outermost spheroid radius to the measured Req = 71 492 km (Lindal 1992) at one bar. This relies on the assumption that the upper atmosphere (P < 1 bar) can be neglected. Debras & Chabrier (2018) evaluated the uncertainty stemming from this assumption to be lower than ϵσ for J2 but higher for J4 and J6. These authors also evaluated the discretization error on J2 × 10⁶, when using a polytropic EOS, to be of a similar magnitude to the value shown in Table D.1 for neglecting the upper atmosphere.
Table D.1. NeuralCMS performance on the validation dataset compared to other sources of uncertainty.
Appendix E: Description of SHAP values
The SHAP values are a game theory approach that assigns an impact value for each model input, representing its contribution to a specific prediction (Lundberg & Lee 2017). For deep learning models, SHAP is combined with the additive feature attribution method DeepLIFT (Shrikumar et al. 2017), which practically linearizes the nonlinear components of the DNN, to provide explanations based on a locally accurate approximation:

f(x) = ȳ + Σ_{i=1}^{M} ϕ_{xi, y},

where x is a specific input; f(x) is the prediction model (the DNN in our case); ȳ is the mean of all output predictions y in a reference dataset, used as a baseline value; M is the number of input features; and ϕ_{xi, y} is the SHAP value of the input xi for a predicted output y (Lundberg & Lee 2017). This means that for each prediction, the SHAP values of all inputs sum to the difference between the predicted value and the mean of all predictions in the reference dataset. For this work, we used Deep SHAP through the DeepExplainer of the SHAP Python library (Lundberg & Lee 2017), which combines SHAP values computed for smaller components of the DNN into values for the whole network by backpropagating the computation through the network. We took all 2927 models accepted by their CMS results as the reference dataset and calculated SHAP values for 500 random models among them. More details on the specific model marked by black circles in Fig. 5 can be found in Table E.1.
Table E.1. SHAP values for the prediction of J6 of the interior model with the highest SHAP value for Zdilute, marked with black circles in Fig. 5.