Issue 
A&A
Volume 650, June 2021
Parker Solar Probe: Ushering a new frontier in space exploration



Article Number  A198  
Number of page(s)  8  
Section  The Sun and the Heliosphere  
DOI  https://doi.org/10.1051/0004-6361/202141063  
Published online  30 June 2021 
A powerful machine learning technique to extract proton core, beam, and α-particle parameters from velocity distribution functions in space plasmas
^{1}
Laboratory for Atmospheric and Space Physics, University of Colorado,
Boulder,
CO,
USA
email: daniel.vech@lasp.colorado.edu
^{2}
Smithsonian Astrophysical Observatory,
Cambridge,
MA
02138
USA
^{3}
Astrophysical and Planetary Sciences Department, University of Colorado,
Boulder,
CO,
USA
^{4}
Lunar and Planetary Laboratory, University of Arizona,
Tucson,
AZ
85719,
USA
^{5}
BWX Technologies, Inc.,
Washington
DC
20002,
USA
^{6}
Climate and Space Sciences and Engineering, University of Michigan,
Ann Arbor,
MI
48109,
USA
Received:
12
April
2021
Accepted:
18
May
2021
Context. The analysis of the thermal part of velocity distribution functions (VDFs) is fundamentally important for understanding the kinetic physics that governs the evolution and dynamics of space plasmas. However, calculating the proton core, beam, and α-particle parameters for large data sets of VDFs is a time-consuming and computationally demanding process that always requires supervision by a human expert.
Aims. We developed a machine learning tool that can extract proton core, beam, and α-particle parameters using images (2D grids of pixel values) of VDFs.
Methods. A database of synthetic VDFs was generated, which was used to train a convolutional neural network that infers bulk speed, thermal speed, and density for all three particle populations. We generated a separate test data set of synthetic VDFs that we used to compare and quantify the predictive power of the neural network and a fitting algorithm.
Results. The neural network achieves significantly smaller root-mean-square errors when inferring proton core, beam, and α-particle parameters than a traditional fitting algorithm.
Conclusions. The developed machine learning tool has the potential to revolutionize the processing of particle measurements since it allows the computation of more accurate particle parameters than previously used fitting procedures.
Key words: turbulence / plasmas / waves / methods: statistical
© ESO 2021
1 Introduction
The solar wind is a hot, tenuous plasma propagating away from the Sun’s surface (e.g., Wolfe et al. 1966). Determining the properties of the particle populations (protons, electrons, and α-particles) in the solar wind is fundamentally important for describing the radial evolution of the plasma (e.g., Richardson & Smith 2003), the dissipation of turbulent energy (e.g., Coleman Jr 1968), the onset of plasma instabilities (e.g., Kasper et al. 2002), and wave-particle interactions (e.g., Howes et al. 2017). Protons constitute 95% of the number density of the solar wind (Feldman et al. 1978) and are typically well described by a combination of a peaked function, most often a Maxwellian velocity distribution function (VDF), for the primary proton population and a separate VDF for the smaller secondary population that flows differentially with respect to the primary population (e.g., Tu et al. 2004; Alterman et al. 2018). Typically the primary proton population is referred to as the core and the smaller one as the beam. The second most common ion species is α-particles, which constitute ≈4% of the solar wind number density. Velocity distribution functions in the solar wind are measured in situ by particle detectors such as electrostatic analyzers (ESA, e.g., Sauvaud et al. 2008) and Faraday cups (e.g., Ogilvie et al. 1995; Kasper et al. 2016).
Faraday cups have been used for space plasma exploration for over half a century and they have flown on several spacecraft including NASA’s Parker Solar Probe (PSP) mission (Fox et al. 2016; Kasper et al. 2016). A Faraday cup measures the currents due to charged particles reaching the collector plates. The discrimination of charged particles and the energy-per-charge determination are based on a time-varying positive potential, which chops a selected portion of the charged particle flux. A capacitor integrates the chopped current from each collector plate over a fixed time interval and the resulting voltage is then converted to a digital signal (Ogilvie et al. 1995). The voltage resolution of Faraday cups is typically between 5 and 13%.
The key fluid parameters of the VDFs (velocity, density, thermal speed) are derived by fitting the observations to a bi-Maxwellian function. This is an iterative process that aims to minimize the residuals in the fit. The fitting procedure is particularly difficult or potentially impossible when there is a significant overlap between the core, beam, and α-particle populations, which leaves large uncertainties in the computed fitting parameters. When fitting to multiple populations is impossible, cruder estimates for density, thermal speed, and bulk speed have to be made, such as partial velocity moments of the distribution as measured. The weakness of partial moments is that they do not extrapolate to account for the tails or the overlapping portions of the distributions.
Some Faraday cup experiments use multiple cups with different orientations, such as Voyager (Bridge et al. 1977), Wind (Ogilvie et al. 1995), and Europa Clipper (Grey et al. 2018), or a spinning platform, such as Wind (Ogilvie et al. 1995) and IMP-8 (Lazarus & Paularena 1996), to obtain additional geometrical constraints that break the degeneracy between populations. The problem of confusion between multiple populations is especially difficult for experiments that measure a single projection of the solar wind VDF: single-sensor, three-axis stabilized systems like PSP or the Deep Space Climate Observatory (Burt & Smith 2012).
Another difficulty is that modern Faraday cups are capable of high cadence (≈0.05–1 s) sampling of the solar wind proton VDFs, leading to the generation of massive data sets. The Faraday cup (Solar Probe Cup, SPC, Case et al. 2020) of PSP obtains over 10^{6} proton VDFs during each solar encounter period. The processing of this huge data set is a time-consuming, computationally demanding process that requires supervision by a human expert.
Recently, machine learning (ML) techniques have been applied to several problems in space plasma and solar physics, such as predicting solar flares (e.g., Chen et al. 2017; Zheng et al. 2019; Li et al. 2020). One particular ML technique that has played an important role in these advancements is the Convolutional Neural Network (CNN). These networks are most commonly applied to image recognition since they have an exceptional ability to identify patterns and features. The output of a CNN can be a discrete variable (a variable that can only take on specific values such as predefined image labels) or a continuous variable, which can take on an unlimited number of values. Given the success of previous studies, we ask the question: is it possible to infer proton core, beam, and α-particle parameters using images (2D grids of pixel values) of VDFs? Such a predictive tool could have immense applications for plasma investigations by significantly reducing the time and effort needed to post-process the data and obtain accurate particle fits.
In this paper, we develop a powerful ML technique using a Convolutional Neural Network, which can infer proton core, beam, and α-particle parameters using only images of VDFs as the input. We generate a large data set of synthetic VDF images where the core, beam, and α-particle parameters are randomly assigned and the VDFs’ characteristics (signal-to-noise ratio, width of the voltage bins) mimic the measurements of SPC. We train the CNN and compare the inferred particle parameters with the ones obtained with the fitting algorithm of the SPC instrument team. Our results suggest that the predictive power of the CNN is significantly better than that of previous fitting procedures even when the VDFs are affected by substantial noise.
2 Training data set
For the generation of the synthetic VDFs, we used the SPC response function for a high-Mach-number Maxwellian plasma (Stevens et al., in prep). The SPC instrument team fits this response function to the measured solar wind fluxes and in Sect. 4 we use this function to extract the core, beam, and α-particle parameters from the synthetic VDFs. The SPC response function has the following form. (1)
In Eq. (1), ϕ(V) is the differential flux, which is a function of modulator voltage V, ΔV is the width of the voltage bins (in units of volts), q is the elementary charge, w is the thermal speed, m is the mass of a proton, u_{*} = u∕v and w_{*} = w∕v where u is the bulk speed of the distribution and . The i subscript corresponds to the i-th energy bin. In Eq. (2), [pA/V], A is the sensitive area of the collector plate and n is the particle density. To simplify our calculations we assume that in Eq. (2).
For the generation of the synthetic VDFs, we use the range of [300; 5000] V for the modulator voltage, which overlaps with the range used by SPC in nominal operation mode. The energy resolution of SPC varies in the range of 5–10%. We plot the synthetic VDFs on a grid, which has a constant 5% energy resolution. For more details on the SPC instrument characteristics, see Case et al. (2020).
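As a rough illustration of the grid described above, the following sketch builds voltage bin edges with a constant fractional width of 5% between 300 V and 5000 V and converts them to proton speeds with the non-relativistic relation v = sqrt(2qV/m). The function names, and the choice to define the 5% width relative to the lower edge of each bin, are assumptions made for illustration and are not taken from the SPC processing software.

```python
import math

Q_E = 1.602176634e-19      # elementary charge [C]
M_P = 1.67262192369e-27    # proton mass [kg]

def voltage_edges(v_min=300.0, v_max=5000.0, resolution=0.05):
    """Bin edges with constant fractional width dV/V = resolution.

    Assumption: each bin is `resolution` times its lower edge wide.
    """
    edges = [v_min]
    while edges[-1] * (1.0 + resolution) <= v_max:
        edges.append(edges[-1] * (1.0 + resolution))
    return edges

def proton_speed_km_s(voltage):
    """Speed of a proton with energy per charge `voltage` [V], in km/s."""
    return math.sqrt(2.0 * Q_E * voltage / M_P) / 1.0e3

edges = voltage_edges()
speeds = [proton_speed_km_s(v) for v in edges]
```

Under these assumptions the 300–5000 V range corresponds to proton speeds of roughly 240–980 km s^{−1}.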
To make the VDFs more realistic, we add random noise to them. The Maxwellian response function with added noise is given by Eq. (3), (3)
where f_{total} corresponds to the sum of the core (f_{core}), beam (f_{beam}), and α-particle (f_{α}) flux charge densities (obtained with Eq. (1)), r_{1,2} are random phases drawn from a uniform distribution with [0; 2π] boundaries, and g is a random variable drawn from a normal distribution with zero mean. The g ⋅ 0.04 term corresponds to the quantization error, errors associated with the modulator waveform, and any additional noise source that is proportional to the signal. The Γ ⋅ sin(r_{1,2}) terms model background noise on the two separate components of the SPC signal, which are current amplitudes measured respectively in phase and out of phase with the high-voltage modulation waveform (Case et al. 2020). We use Γ = 0.5 pA, which makes for a reasonable comparison to SPC.
The nine parameters of the distributions (bulk speeds V_{c,b,α}, thermal speeds V_{th, c, b,α} and densities n_{c, b,α}, where the c, b, and α subscripts correspond to the core, beam, and α-particles, respectively) are drawn randomly from uniform distributions, whose lower and upper boundaries are listed in Table 1. The ranges were chosen such that they are similar to the solar wind properties observed in the inner heliosphere by PSP (Kasper et al. 2019) and near 1 AU as well (Wilson III et al. 2018; Klein & Vech 2019). The range of differential flows between particle populations was chosen such that full and partial overlap between particle populations is possible, which is a challenging problem for any fitting algorithm.
In the solar wind the three particle populations are not always observed simultaneously; therefore, for the training and testing of the CNN, we generate three groups of images (10^{4} images in each category) with one, two, and three Maxwellian response functions. Figure 1 shows an example of the synthetic VDFs, which were randomly generated and include three particle populations. The bottom panel shows an actual input image for the training of the CNN where the particle populations have been summed up and noise was added to the resulting distribution.
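The random assignment of parameters can be sketched as follows. The bounds in `PLACEHOLDER_BOUNDS` are hypothetical placeholders chosen only so that the example runs; they are not the values of Table 1, and the function and key names are likewise illustrative.

```python
import random

# Hypothetical (lower, upper) bounds for illustration only; NOT the Table 1 values.
PLACEHOLDER_BOUNDS = {
    "core":  {"bulk_speed": (300.0, 600.0), "thermal_speed": (20.0, 80.0), "density": (50.0, 300.0)},
    "beam":  {"bulk_speed": (350.0, 700.0), "thermal_speed": (20.0, 80.0), "density": (5.0, 100.0)},
    "alpha": {"bulk_speed": (300.0, 600.0), "thermal_speed": (20.0, 80.0), "density": (1.0, 20.0)},
}

def draw_parameters(populations, bounds=PLACEHOLDER_BOUNDS, rng=None):
    """Draw every parameter of every requested population from a uniform distribution."""
    rng = rng or random.Random()
    return {
        pop: {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds[pop].items()}
        for pop in populations
    }

# One synthetic VDF with all three populations (nine parameters in total).
example = draw_parameters(("core", "beam", "alpha"), rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps a generated data set reproducible.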
For generating the image data set, the VDFs are plotted on a fixed scale along the X axis (200–1000 km s^{−1}), which ensures that the CNN learns to associate higher speeds with VDFs that are positioned closer to the right side of the image and vice versa for slower speeds. The Y axis of the plot is logarithmic and upper and lower thresholds (10^{−0.3}–10^{2.5}) are chosen such that all VDFs fit in the image box.
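The fixed plotting window can be illustrated with a simplified rasterizer that maps (speed, flux) samples onto a square pixel grid with a linear X axis (200–1000 km s^{−1}) and a logarithmic Y axis (10^{−0.3}–10^{2.5}). This is only a sketch: it marks individual samples rather than drawing a connected curve, and the function name and the default 100 × 100 size are assumptions for illustration.

```python
import math

def rasterize(speeds, fluxes, size=100, x_range=(200.0, 1000.0), log_y_range=(-0.3, 2.5)):
    """Return a size x size grid of 0/1 pixels; row 0 is the top of the image."""
    image = [[0] * size for _ in range(size)]
    x_lo, x_hi = x_range
    y_lo, y_hi = log_y_range
    for v, f in zip(speeds, fluxes):
        if f <= 0.0:
            continue  # non-positive values cannot be shown on a logarithmic axis
        log_f = math.log10(f)
        if not (x_lo <= v <= x_hi and y_lo <= log_f <= y_hi):
            continue  # sample falls outside the fixed plotting window
        col = min(int((v - x_lo) / (x_hi - x_lo) * size), size - 1)
        row = min(int((y_hi - log_f) / (y_hi - y_lo) * size), size - 1)
        image[row][col] = 1
    return image
```

Because the axis limits never change, the same speed always lands in the same pixel column, which is the property the fixed X scale is meant to provide.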
Table 1 Ranges of possible values for the core, beam, and α-particle populations.
Fig. 1 Top panel: one of the 10^{4} randomly generated core (blue), beam (red), and α-particle (yellow) distributions. Bottom panel (actual input image for the CNN): sum of these three particle populations with added noise. 
3 Training and testing the CNN
Our goal is to develop a data processing pipeline that takes the input image of a VDF and extracts the corresponding particle parameters. This includes two steps (see Fig. 2): in the first step, our goal is to train a CNN, which is capable of identifying the correct number of particle populations in a given VDF. In the second step, we train separate neural networks for extracting bulk speed, thermal speed, and density of VDFs with 1, 2, and 3 particle populations.
The CNN used in this paper was obtained from an online repository^{1}. This network was originally designed to classify images of handwritten numbers.
For step one, we train an image classification CNN on 27 000 labeled VDF images (9000 images selected from each category) where the labels are “1”, “2”, and “3”, corresponding to the number of particle populations in a VDF. The remaining 3000 VDFs (1000 from each category) are used for testing the CNN’s performance. The model parameters used are summarized in Table 2. For the image classification problem we use the Stochastic Gradient Descent (SGD) optimizer (Bottou 2012). The optimizer is an algorithm that changes the parameters of the network, such as the weights and the learning rate, to reduce the losses and find a local minimum of a differentiable function. The learning rate defines the size of the steps that the optimizer takes in the direction of the local minimum. If the learning rate is too large, there is a risk that the optimizer misses a local minimum. On the other hand, a very low learning rate means that the training process takes a substantial amount of time. We found that decreasing the initial learning rate did not lead to further improvements in the accuracy of the classification. We also found that increasing the number of epochs (the number of times that the learning algorithm works through the entire training data set) above 20 did not lead to further improvements in the prediction accuracy. Various image sizes (50 × 50, 75 × 75 pixels) were tested; we found that increasing the image size above 100 × 100 pixels leads to no further improvement in the model accuracy, so the generated VDF images are rescaled to this size.
Figure 3 shows the confusion matrix of the CNN using the 3000 test images. In the confusion matrix the rows correspond to the inferred class (Output Class; “1”, “2”, and “3” corresponding to the number of particle populations in a VDF) and the columns correspond to the true class (Target Class). The correctly classified observations are on the diagonal of the matrix. Both the right column and the bottom row are important to assess the accuracy of the classification. The bottom row shows the percentages of all the examples belonging to each class that are correctly (true positive, green) and incorrectly (false negative, red) classified. The right column shows the percentage of cases when the CNN predicted 1 (80.0% correct), 2 (59.7% correct), and 3 (58.6% correct) particle populations and the prediction was correct. The values in the bottom row and right column do not necessarily have to be the same. For example, if our network inferred 1 particle population for all the 3000 VDFs, then there would be 100% in the bottom left corner of the confusion matrix and 33.3% in the top right corner, respectively, which would be a very poor classification.
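The trade-off in choosing the learning rate can be seen in a toy example. The sketch below applies plain fixed-step gradient descent (not the stochastic variant, and not the network used in this work) to the one-dimensional function f(x) = x²; the function name and step values are chosen purely for illustration.

```python
def gradient_descent(learning_rate, x0=1.0, steps=50):
    """Minimize f(x) = x**2 with fixed-step gradient descent; return the final x."""
    x = x0
    for _ in range(steps):
        gradient = 2.0 * x                 # derivative of x**2
        x = x - learning_rate * gradient   # step against the gradient
    return x

small_step = gradient_descent(0.1)   # each step multiplies x by 0.8: converges toward 0
large_step = gradient_descent(1.1)   # each step multiplies x by -1.2: overshoots and diverges
```

With the small step the iterate approaches the minimum at x = 0, whereas with the large step every update jumps past the minimum and the distance to it grows.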
The overall accuracy of the CNN is 67.6%. The network has excellent ability to distinguish VDFs with 1 particle population from the other two categories (97.7% true positive rate). The most challenging problem is distinguishing between 2 and 3 particle populations, in particular when the Target Class is 2 (42.9% true positive rate).
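The quantities read off the confusion matrix can be computed as in the following sketch, which uses the same convention as above (rows are the inferred class, columns the true class). The function and variable names are illustrative, and the example reproduces the degenerate case in which every test VDF is assigned to class “1”.

```python
def classification_rates(matrix):
    """matrix[i][j] = number of samples with inferred class i and true class j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[k][k] for k in range(n)) / total
    # Bottom row of the figure: fraction of each true class that was recovered.
    true_positive_rate = []
    for j in range(n):
        column_sum = sum(matrix[i][j] for i in range(n))
        true_positive_rate.append(matrix[j][j] / column_sum if column_sum else float("nan"))
    # Right column of the figure: fraction of each inferred class that was correct.
    precision = []
    for i in range(n):
        row_sum = sum(matrix[i])
        precision.append(matrix[i][i] / row_sum if row_sum else float("nan"))
    return accuracy, true_positive_rate, precision

# Degenerate classifier: all 3000 test VDFs (1000 per true class) labeled "1".
degenerate = [[1000, 1000, 1000], [0, 0, 0], [0, 0, 0]]
accuracy, tpr, precision = classification_rates(degenerate)
```

For this degenerate matrix the true positive rate of class “1” is 100% while its precision is only 33.3%, which is why both the bottom row and the right column need to be inspected.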
In the second step, we train separate CNNs to extract particle parameters from VDFs with one, two, and three particle populations. For this regression problem, we use a CNN that has only one output variable; this significantly reduces the training time and complexity of the model. This means that we train 3, 6, and 9 separate CNNs for VDFs with 1, 2, and 3 particle populations, respectively. We have tested CNNs with three output parameters (e.g., bulk speed, thermal speed, and density of a particle population); however, we found a slight (≈20%) increase in the root-mean-square error (RMSE) of the inferred values. This might be explained by the fact that the bulk speed, thermal speed, and density are randomly generated (Table 1), so the CNN could not identify correlations between them. It is also possible that the RMSE increased because we increased the number of free variables in the CNN.
Before the training process, the output parameter is normalized to its z-score (Z = (y − μ)∕σ, where y is the output parameter and μ and σ are the mean and standard deviation of y, respectively), which is a standard ML approach to increase the numerical stability of the model (e.g., Nawi et al. 2013). The output of the regression network is a dimensionless quantity (Z), which is transformed back to physical units with σ and μ (i.e., y = Zσ + μ), which are obtained from the training data. We split the initially generated 10^{4} images from each of the three categories into a training set (9000 images) and a testing set (1000 images) and train separate CNNs for all 18 parameters (3+6+9) in total. During the training process the goal of the CNN is to adjust its model parameters so that it achieves the lowest RMSE when inferring the single output parameter. We found that for all parameters the CNN converged to a steady RMSE value after 10–12 epochs; between epochs 12 and 20 the RMSE fluctuated by ±2–3% with respect to the previously achieved RMSE value and no further improvement was achieved.
After training both the classification and regression networks, our data processing pipeline shown in Fig. 2 can be tested. We take the 3000 test VDFs (1000 from each of the three categories; these VDFs were not used for training any of the networks) and use the previously trained CNN to classify them into 3 categories. Then, we use the separate CNNs to infer the bulk speed, thermal speed, and density of each particle species. We compare the inferred parameters with the ground truth parameters (i.e., the parameters used for generating the VDF) even if the number of particle populations is incorrectly inferred. For example, if a VDF includes 3 particle populations but the classification network infers that only 2 particle populations are present, then our comparison will be based on the inferred two particle populations.
The result of this test is summarized in Table 3, where we list the RMSE and Pearson correlation coefficients between the inferred and ground truth values. In the CNN the quality of the inferred values is not evaluated based on a test such as the chi-square statistic, which can give misleading results in the case of the traditional fitting algorithm (e.g., a low chi-square value despite large deviations of the fit from the data).
The training data set was based on “reasonable” VDFs, which were generated with noise amplitudes typically observed in the SPC data. To test the response of our technique to “unreasonable” VDFs, we generated 3000 VDFs (with the parameters listed in Table 1) where we increased the amplitude of the background noise from 0.5 (see Eq. (3)) to 20. These VDFs are indistinguishable from pure noise and do not show any resemblance to the example in Fig. 1. We found that on this data set the CNN inference had no correlation (Pearson correlation coefficients of ≈0) with the ground truth values; however, all nine inferred parameters were in the range of the training data set (Table 1). This test shows that the CNN has to be used carefully on real data since in the case of extreme noise the inferred values are still “reasonable” and do not stand out from the rest of the data as being erroneous.
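The normalization and its inverse can be written compactly as below. This is a minimal sketch: the helper names are illustrative, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not specify which one is used.

```python
import statistics

def fit_zscore(training_values):
    """Return (mu, sigma) estimated from the training targets only."""
    mu = statistics.fmean(training_values)
    sigma = statistics.pstdev(training_values)  # assumption: population standard deviation
    return mu, sigma

def to_z(y, mu, sigma):
    """Physical units -> dimensionless z-score, Z = (y - mu) / sigma."""
    return (y - mu) / sigma

def from_z(z, mu, sigma):
    """Dimensionless network output -> physical units, y = Z * sigma + mu."""
    return z * sigma + mu

# Toy targets (e.g., bulk speeds in km/s) standing in for a training set.
mu, sigma = fit_zscore([300.0, 400.0, 500.0])
recovered = from_z(to_z(450.0, mu, sigma), mu, sigma)
```

The same μ and σ obtained from the training targets are reused for the test data, so that no information from the test set enters the normalization.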
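The two-step logic of the pipeline can be sketched as a simple dispatch: the classifier selects the number of populations, which in turn selects the set of single-output regressors to apply. All names below are hypothetical, the models are stand-in functions rather than trained networks, and the ordering (core, then beam, then α-particles) is an assumption about how the one- and two-population cases are composed.

```python
PARAMETERS = ("bulk_speed", "thermal_speed", "density")
POPULATIONS = ("core", "beam", "alpha")

def build_regressors(make_model):
    """Return {n: {(population, parameter): model}} for n = 1, 2, 3 populations."""
    return {
        n: {(pop, par): make_model(n, pop, par)
            for pop in POPULATIONS[:n] for par in PARAMETERS}
        for n in (1, 2, 3)
    }

def process_vdf(image, classifier, regressors):
    """Classify the image, then apply only the regressors of the inferred class."""
    n_populations = classifier(image)
    inferred = {key: model(image) for key, model in regressors[n_populations].items()}
    return n_populations, inferred

# Stand-in models so the sketch runs end to end; trained networks would replace them.
def dummy_classifier(image):
    return 2

def dummy_factory(n, population, parameter):
    return lambda image: 0.0

regressors = build_regressors(dummy_factory)
n_pop, result = process_vdf(None, dummy_classifier, regressors)
```

The nested dictionary holds 3 + 6 + 9 = 18 single-output models, matching the number of regression networks described above.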
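For reference, the two summary statistics can be computed from paired lists of inferred and ground truth values as in the sketch below; the function names are illustrative and the numbers in the example are toy values, not results from this work.

```python
import math

def rmse(inferred, truth):
    """Root-mean-square error between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(inferred, truth)) / len(truth))

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy values only.
truth = [1.0, 2.0, 3.0]
inferred = [1.0, 2.0, 5.0]
error = rmse(inferred, truth)
correlation = pearson(inferred, truth)
```

The two measures are complementary: a constant offset between inferred and true values raises the RMSE but leaves the correlation coefficient unchanged.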
Fig. 2 Workflow of the proposed approach to process VDFs. 
Table 2 Hyperparameters of the CNNs.
Fig. 3 Confusion matrix of the classification problem. 
Fig. 4 Inferred and ground truth values for the core, beam, and α-particle parameters. 
Table 3 RMSE and Pearson correlation coefficients (CC) of the inferred particle parameters by the CNN and the fitting algorithm on the 3000 test VDFs.
4 Comparison to traditional fitting algorithm
The fitted parameters are obtained by non-linear least-squares regression of the synthetic VDFs to the analytic model of the SPC instrument response (Eq. (1)) to one, two, or three isotropic Maxwellian distributions of inflowing ions. The RMSE and Pearson correlation coefficients of the fitting algorithm on the test data are listed in Table 3. It can be seen that the CNN achieves lower RMSE values and higher Pearson correlation coefficients for all nine parameters.
In Fig. 4 we compare the inferred and the fitted values for the 3000 test VDFs. The top row shows the core parameters: the CNN achieves a minor improvement compared to the fitting algorithm (factor of 1.075 decrease in RMSE) in its computation of bulk speed, while the CNN RMSEs are significantly smaller (factors of 1.57 and 2.27) than those of the fitting algorithm for the core thermal speed and density. The beam parameters in the second row show that the CNN RMSE values are smaller than the fitting algorithm RMSE values by factors of 2.33, 3.34, and 2.32 for the beam speed, thermal speed, and density, respectively. Finally, the bottom row shows the α-particle comparison. The α-particle speed in the SPC algorithm is determined by shifting the bulk proton speed by the α-to-proton mass factor, hence the large discrepancies between the SPC values and the ground truth values. The CNN predictions of α-particle speed have a similar RMSE value to the proton core and beam speeds. The CNN predictions of α-particle thermal speed have the lowest correlation coefficient (0.25, see Table 3) among all the parameters. The CNN α-particle density predictions are in good agreement with the ground truth values and the RMSE value is only 1.06 cm^{−3}.
In Fig. 5 we compare the inferred total (core + beam) proton bulk speed, thermal speed, and density from the CNN and the fitting algorithm. The RMSE and correlation coefficients are listed in Table 4. By comparing Figs. 4 and 5 it can be seen that the CNN does not make significant systematic errors in the effort to separate the core and the beam into two pieces. Additionally, Fig. 5 clarifies that overshoots from the fitting algorithm for the core density and the core temperature are almost always accompanied by undershoots in the beam density and temperature.
There are fundamental differences between the two methods that must be considered for the correct interpretation of Figs. 4 and 5. The fitting algorithm tested here was designed to process a very broad range of VDFs that occur throughout the orbit of PSP and is able to handle SPC-specific noise anomalies while imposing as few assumptions as possible. In contrast, the CNN had the advantage that it was tested on a domain that has the same properties as the training data. Therefore, the achieved RMSE values correspond to the best-case scenario for the network. Another important difference is that the analytical fitting provides error bars, so the quality of the fits can be investigated.
The SPC data pipeline imposes constraints of co-movement (V_{c} = V_{α}) and temperature equilibrium (V_{th, c} = V_{th,α}) to minimize the α-proton confusion in a one-directional way: protons being confused for α-particles is acceptable but α-particles being confused for protons is not. Similar constraints are not used in the data pipelines of other Faraday cups such as Wind, which is a major advantage of those data sets. Additionally, a Wind Faraday cup spectrum comprises about 1240 measurements of charged particle flux distributed over 40 different projection geometries, as compared to an SPC spectrum, which comprises about 30 measurements on one geometrical projection. The proposed CNN operates as a stand-alone data processing pipeline; however, it also has the potential to improve the SPC fitting algorithm by inferring the number of particle populations in a given VDF, in which case the constraints on α-particles would not be required.
Fig. 5 Inferred and ground truth values for the total (core + beam) proton population. 
Table 4 RMSE and Pearson correlation coefficients (CC) of the inferred total (core + beam) proton parameters by the CNN and the fitting algorithm on the 3000 test VDFs.
5 Test with real data
We have selected 7500 VDFs (19 August 2019 00:00 to 20 August 2019 00:00, orbit #3) from SPC to test our approach. Only those VDFs were selected that had a data quality flag of 0 (good data) as determined by the instrument team.
Figure 6 shows the comparison of the core and secondary population parameters computed with the fitting algorithm and the CNN. In panel g the ion VDFs are shown; the peak of each VDF was normalized to 1 and plotted on a logarithmic scale between 0.01 and 1. The largest discrepancy between the two methods is at approximately 00:20–01:20. In Figs. 6d–f it can be seen that this is the only interval where the CNN predicted 2 particle populations. By inspecting VDFs from this interval (Fig. 7a) we can see that there is a substantial secondary population between 00:20 and 01:20, and similar features are not present in VDFs outside this interval (Fig. 7b), although the fitting algorithm suggested that there is a beam.
This test shows that the CNN has a significantly higher threshold for labeling a VDF as having two particle populations than the fitting algorithm. In Fig. 3 we showed that the CNN has an excellent ability (97.7% true positive rate) to distinguish between 1 vs. 2 (and 3) particle populations; therefore, we suggest that outside the 00:20–01:20 interval there is no substantial secondary particle population. The CNN predicted 3 particle populations in only 2 instances out of the 7500 VDFs, so those cases are not shown.
We have quantified the differences between the fitting algorithm and the CNN in terms of Pearson correlation coefficients and root-mean-square differences, which are shown in Table 5. These results show that the least-squares fits offer a reliable way to measure the core protons with SPC data.
6 Conclusions
In this paper, we have developed a new, powerful ML tool to compute parameters of one, two, and three Maxwellian response functions. We have demonstrated that the proposed Convolutional Neural Network achieves significantly lower RMSE (Fig. 4) than the previously used fitting technique to derive bulk speed, thermal speed, and density of particle populations.
The fitting algorithm tested here was designed to work in very diverse conditions with minimal assumptions. In contrast, the CNN is significantly less robust since it was trained on a specific range of solar wind parameters and it works best on a data set that lies in the domain of the training data set. An additional disadvantage of the CNN is that for an input VDF with extreme noise the network infers “reasonable” (however, incorrect) particle parameters. We suggest that the fitting algorithm should be used in tandem with the CNN to achieve the best performance. This approach would make it possible to filter out VDFs that are affected by substantial noise by comparing the CNN’s inferred core parameters with those of the fitting algorithm, which are expected to be in good agreement when the noise levels are reasonable. When the two sets of core parameters are consistent, the second and third populations can be extracted with the CNN.
More accurate computation of solar wind proton and α-particle parameters has major implications for the improved characterization of space plasmas. For example, linear stability analysis is a frequently used technique to understand wave modes and energy exchange between electromagnetic fields and particles in space plasmas. However, large inaccuracies in the input parameters (such as proton thermal speed or differential flow between ion species) may lead to false results suggesting that the plasma is stable (or unstable) against certain instabilities. Our technique will provide more accurate input parameters for linear stability calculations than previous fitting techniques, which will lead to a better understanding of the turbulent cascade, including the scale-to-scale energy transfer and the growth rate of unstable wave modes.
Throughout this paper, we used the characteristics of the SPC instrument to demonstrate the feasibility of our approach; however, the technique can be easily adapted to any currently operating or former particle detector and can be applied to both ions and electrons whose distributions can be modeled through theoretical calculations. In the training and test VDFs we did not explicitly include a combination of core proton and α-particle populations. A subset of the core proton and beam VDFs shows similarities (in terms of density ratio, temperature ratio, and range of differential flow) to the expected core and α-particle signatures. Our priority was to use only 3 classes of VDFs in an effort to maximize the accuracy of the classification network and keep the number of free parameters in the CNN low. On PSP the time-of-flight section of the Solar Probe Analyzers (SPAN-A) makes it possible to sort particles by their mass/charge ratio, permitting differentiation of ion species, which allows the correct determination of the particle species of the second and third particle populations (Kasper et al. 2016).
We recognize that the CNN technique described in this paper is limited by the fact that it is inherently one-dimensional (i.e., it uses measurements of only the radial dimension of the VDF) and that the CNN was trained on a typical range of solar wind parameters. In reality, space plasmas are typically anisotropic with respect to the local magnetic field (e.g., Hellinger et al. 2006), and the flow velocity is not purely radial (e.g., Kasper et al. 2019). Further work will remove these limitations by extending the CNN technique to incorporate solar wind flow angle measurements and three-dimensional VDFs.
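One possible form of such a consistency check is sketched below. The relative tolerance of 20% and all names are hypothetical choices made for illustration; an appropriate threshold would have to be determined from the data.

```python
def core_is_consistent(cnn_core, fit_core, rel_tol=0.2):
    """True when every core parameter from the two methods agrees within rel_tol.

    Both arguments map parameter names to values; rel_tol is a hypothetical threshold.
    """
    return all(
        abs(cnn_core[name] - fit_core[name]) <= rel_tol * abs(fit_core[name])
        for name in fit_core
    )

# Toy values only: the first pair agrees, the second does not.
fit = {"bulk_speed": 400.0, "thermal_speed": 50.0, "density": 100.0}
cnn_good = {"bulk_speed": 410.0, "thermal_speed": 47.0, "density": 95.0}
cnn_bad = {"bulk_speed": 410.0, "thermal_speed": 90.0, "density": 95.0}

keep_good = core_is_consistent(cnn_good, fit)
keep_bad = core_is_consistent(cnn_bad, fit)
```

VDFs that fail the check would be set aside as potentially noise-dominated, and only the remaining ones would be passed on for the extraction of the second and third populations.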
Fig. 6 Comparison of the SPC fitting algorithm and the CNN on real test data. Panel g: the peak of each VDF was normalized to 1 and the normalized phase space densities are plotted on a logarithmic scale. 
Table 5 Root-mean-square difference and Pearson correlation coefficients for the core and secondary particle population derived with the CNN and the fitting algorithm on the real test data.
Fig. 7 Panel a: VDF from 00:44:51, which was identified as having two particle populations by both methods. Panel b: VDF from 03:00:20, where the fitting algorithm suggested both a core and a secondary population while the CNN’s prediction was that there is only a core. 
Acknowledgements
K.G.K. was supported by NASA Grant 80NSSC19K0912. D.V. was supported by NASA contract 80NSSC21K0454. D.M. was supported by NASA contract 80NSSC19K0305. The authors thank the Parker Solar Probe team for their support. All data used in this paper is available on the SWEAP data archive: http://sweap.cfa.harvard.edu/pub/data/sci/sweap
References
Alterman, B., Kasper, J. C., Stevens, M. L., & Koval, A. 2018, ApJ, 864, 112
Bottou, L. 2012, in Neural Networks: Tricks of the Trade (Berlin: Springer), 421
Bridge, H., Belcher, J., Butler, R., et al. 1977, Space Sci. Rev., 21, 259
Burt, J., & Smith, B. 2012, in 2012 IEEE Aerospace Conference, IEEE, 1
Case, A. W., Kasper, J. C., Stevens, M. L., et al. 2020, ApJS, 246, 43
Chen, S., Xu, L., Ma, L., et al. 2017, in 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), IEEE, 198
Coleman, P. J., Jr. 1968, ApJ, 153, 371
Feldman, W., Asbridge, J., Bame, S., & Gosling, J. 1978, J. Geophys. Res. Space Phys., 83, 2177
Fox, N., Velli, M., Bale, S., et al. 2016, Space Sci. Rev., 204, 7
Grey, M., Westlake, J., Liang, S., et al. 2018, in 2018 IEEE Aerospace Conference, IEEE, 1
Hellinger, P., Trávníček, P., Kasper, J. C., & Lazarus, A. J. 2006, Geophys. Res. Lett., 33, L09101
Howes, G. G., Klein, K. G., & Li, T. C. 2017, J. Plasma Phys., 83, 535830401
Kasper, J. C., Lazarus, A. J., & Gary, S. P. 2002, Geophys. Res. Lett., 29, 20
Kasper, J. C., Abiad, R., Austin, G., et al. 2016, Space Sci. Rev., 204, 131
Kasper, J., Bale, S., Belcher, J. W., et al. 2019, Nature, 576, 228
Kingma, D. P., & Ba, J. 2014, ArXiv e-prints [arXiv:1412.6980]
Klein, K., & Vech, D. 2019, Res. Notes AAS, 3, 107
Lazarus, A., & Paularena, K. 1996, in Proceedings, AGU Chapman Conference Measurement Techniques for Space Plasma, AGU Geophysical Monograph
Li, X., Zheng, Y., Wang, X., & Wang, L. 2020, ApJ, 891, 10
Nawi, N. M., Atomi, W. H., & Rehman, M. 2013, Procedia Technol., 11, 32
Ogilvie, K., Chornay, D., Fritzenreiter, R., et al. 1995, Space Sci. Rev., 71, 55
Richardson, J. D., & Smith, C. W. 2003, Geophys. Res. Lett., 30
Sauvaud, J.-A., Larson, D., Aoustin, C., et al. 2008, in The STEREO Mission (Berlin: Springer), 227
Tu, C.-Y., Marsch, E., & Qin, Z.-R. 2004, J. Geophys. Res. Space Phys., 109, A5
Wilson III, L. B., Stevens, M. L., Kasper, J. C., et al. 2018, ApJS, 236, 41
Wolfe, J. H., Silva, R. W., & Myers, M. A. 1966, J. Geophys. Res., 71, 1319
Zheng, Y., Li, X., & Wang, X. 2019, ApJ, 885, 73