Upper limit of the solar wind protons backscattering efficiency from Comet 67P/Churyumov-Gerasimenko

Context. Solar wind ions backscattering is a fundamental plasma-surface interaction process that may occur on all celestial bodies exposed to the solar wind and lacking a significant atmosphere or magnetosphere. Yet, observations have been limited to the regolith-covered Moon and Phobos, one of the Martian moons. Aims. We aim to expand our knowledge of the process to include comets by investigating the backscattering of solar wind protons from the surface of comet 67P/Churyumov-Gerasimenko. Methods. We used one of the ion spectrometers on board ESA’s Rosetta spacecraft to search for evidence of backscattered solar wind protons from the cometary surface. The signal of interest was expected to be very weak and several statistical treatments of the data were essential to eliminate any influence from background noise and instrumental effects. Due to limited knowledge of the signal location within the observed parameter space, we conducted a statistical analysis to identify the most probable conditions for detecting the signal. Results. No significant solar wind backscattered protons were ever observed by the instrument. The statement applies to the large spectrum of observation conditions. An upper limit of the backscattered proton flux is given, as well as an upper limit of the backscattering efficiency of 9 × 10 − 4 . Conclusions. The surface of comet 67P/Churyumov-Gerasimenko distinguishes itself as a notably weak reflector of solar wind protons, with its backscattering efficiency, at most, as large as the lowest observed backscattering efficiency from the lunar regolith.


Introduction
Space weathering is the collection of processes that both erode and chemically modify the interaction surfaces of celestial bodies directly exposed to magnetospheric, heliospheric and interstellar space environments.Solar radiation, micrometeorite impacts, cosmic rays, and plasma-surface interactions, collectively result in significant transformations of the physical and optical properties of, for example, the atmospheric top layers of Mars and Venus, the lunar regolith, or the icy surfaces of the outer planets' moons.
Plasma precipitation is one of the contributions to space weathering.Plasma precipitates on all celestial bodies and may change the composition and/or morphology of the matter with which it interacts.For example, on the Moon, solar wind H + implants in the oxygen-rich regolith, leading to hydration of the uppermost surface layer (Pieters et al. 2009).The resulting hydroxyl-and/or water-bearing volatile elements migrate and might accumulate in permanently shadowed regions at or close to the poles, where they constitute a possible water source for future long-term human activities.
On Mars and Venus, plasma precipitation changes the atmospheric composition.Solar wind He 2+ ions precipitating onto the upper atmosphere constitute one of the major sources of atmospheric helium, at least on Mars (Stenberg et al. 2011;Stenberg Wieser et al. 2015).Oxygen ions picked up by the solar wind may later re-impact on the upper atmosphere and sputter neutral oxygen, enhancing the total atmospheric escape from the planets (Curry et al. 2015;Luhmann & Kozyra 1991).
In the Jovian system, the Jovian magnetospheric plasma precipitating on Ganymede's surface is the primary agent responsible for the moon's strong albedo asymmetry (Fatemi et al. 2016).Ganymede's dipole-like magnetic field shields the equatorial regions better.The uneven sputtering of the surface constituents results in clear asymmetries between the equatorial and polar regions, as well as between the leading and trailing hemispheres (Pontoni et al. 2022).
Plasma precipitation can also be used as a tool for probing surface properties and the surrounding space environment.One notable example is the study of the Moon.Schaufelberger et al. (2011) studied the scattering function of energetic neutral H, resulting from the neutralization and backscattering of precipitating solar wind H + from the lunar surface.They showed energetic neutral H atoms are preferentially reflected back into the sunward direction, contrary to the classical forward scattering found in laboratory experiments.This unexpected result exposed the highly porous microscopic structure of the lunar regolith.
Observations of energetic neutral H were also used to produce images of lunar magnetic anomalies (Wieser et al. 2009).Strong localized magnetic fields shield the surface from the solar wind resulting in lower energetic neutral atom fluxes, directly correlated to the magnetic field intensity.Additionally, Futaana et al. (2006) showed that the fluxes of energetic neutral atoms sputtered from the lunar regolith by precipitating solar wind ions are high enough to be detected up to the lunar orbit.This allows an elemental map of the Moon's surface to be constructed, reflecting its chemical composition.
Sputtering induced by solar wind ions has also been observed at comets.For example, Wurz et al. (2015) derived a compositional map of comet 67P/Churyumov-Gerasimenko (hereafter comet 67P) from the observation of O, Na, K, Si, Ca, and S sputtered from its surface.
Comets are pristine remnants from the formation of our Solar System.They spend most of their existence in dark regions of the outer solar system, where solar radiation is weak and the solar wind is too tenuous to induce substantial changes.As comets journey closer to the Sun, the increasing solar radiation causes the volatile compounds within the nucleus to sublimate and release gas and dust into space.The comet's atmosphere, referred to as a coma, grows in size as the comet approaches the Sun and the cometary nucleus is often completely shielded from solar wind precipitation close to perihelion (Hansen et al. 2016;Nilsson et al. 2017).Further away from the Sun, the solar wind reaches the cometary nucleus, tending to an asteroid-like solar wind-surface interaction.
Comet nuclei stand out in their composition and structure.They consist of clusters of ice and millimeter-sized dust aggregates loosely packed together in very low gravity (Blum et al. 2017).This "rubber pile" structure results in low bulk densities, with values such as 0.6 g cm −3 for comet 1P/Halley (Sagdeev et al. 1988) and 0.535 g cm −3 for comet 67P (Jorda et al. 2016).The underdense nuclei structure is often characterized by large porosities, ranging from 70 to 75% for comet 67P (Jorda et al. 2016).These levels of porosity are similar to that of volcanic pumice, which is known for its ability to float on water.Comets also stand out as some of the darkest objects in our Solar System.In the case of comet 67P, an albedo of 6.2% at 0.55 µm was observed (Ciarniello et al. 2015).This is comparable to the albedo of a newly laid asphalt surface.Notably, Bibring et al. (2015) analyzed images from the Philae lander, part of ESA's Rosetta mission to comet 67P (Glassmeier et al. 2007) and discovered a complex, fractured surface with a wide range of grain sizes and strong local variations in albedo (>10%).In this context, studying plasma-surface interactions at comets may reveal more about their surface composition and structure.
When the comet activity is low, solar wind plasma precipitates onto the cometary nucleus in a similar way it does onto the Moon's surface.At the Moon, a resulting fundamental process is the reflection/backscattering of solar wind H + from the lunar regolith, first observed by JAXA's SELENE mission (Saito et al. 2008).A subsequent study using data from ISRO's Chandrayaan-1 spacecraft (Lue et al. 2014) have confirmed this observation, and they both showed that ∼0.1-1% of solar wind H + are reflected back into space by the lunar regolith.Additional observations from the two NASA's ARTEMIS probes (Lue et al. 2018) revealed that the fraction of reflected solar wind H + decreases as the solar wind H + impact speed increases.Specifically, it decreases from approximately ∼0.5% at a solar wind speed of 300 km s −1 to ∼0.3% at 600 km s −1 .Furthermore, up to 20% of solar wind H + impinging on the lunar regolith are reflected back into space as energetic neutral atoms (Wieser et al. 2009;Funsten et al. 2013, Chandrayaan-1, IBEX).
Additionally, measurements by Mars Express suggested that 0.5-10% of solar wind H + backscatter from the regolithcovered surface of Phobos (Futaana et al. 2010), but subsequent measurements from MAVEN were unable to reproduce this result (Deniau et al. 2022).
In this paper, we analyze data from one of the ion mass analyzers on board Rosetta to investigate if solar wind H + backscatter from the surface of comet 67P.Our objective is to quantify the backscattering efficiency, which is defined as the ratio of incoming to reflected solar wind H + fluxes.

Instrument description
We use data from the Ion Composition Analyzer (ICA, Nilsson et al. 2006;Wieser et al. 2017), part of the Rosetta Plasma Consortium instrument suite (Carr et al. 2007) on the ESA's Rosetta mission to the comet 67P (Glassmeier et al. 2007).At the end of the Rosetta mission, ICA had recorded close to two years of scientific data.
ICA measures the energy, mass and arrival direction of positively charged ions.First, an electrostatic deflection system sweeps through elevation angles between ±45 • with respect to the instrument vertical symmetry axis (see Fig. 5a).The sweep has 16 linear steps, giving an elevation angular resolution of approximately 5.6 • .Particles passing the elevation filter enter a top-hat electrostatic analyzer resolving their energy per charge (q) from 25 eV/q to 40 keV/q in 96 exponentially distributed steps, with an energy resolution of 7%.Ions with an energy per charge matching the settings of the electrostatic analyzer enter the mass analyzer consisting of a cylindrical magnetic field.The latter separates the ions in mass by deflecting the lighter ions more than the heavier, making them hit a micro-channel plate (MCP) at different locations depending on their mass.Each hit on the MCP is producing an electron shower.Behind the MCP is a set of 32 concentrical ring anodes detecting the radial impact position of the electrons, giving the original ion mass per charge.The mass resolution allows for the separation of H + , He + , He ++ and heavier ions.Similarly, a set of 16 sector anodes measures the azimuthal arrival direction of ions, with an angular resolution of 22.5 • (see Fig. 5a).ICA has a total field of view of 360 • × 90 • and an effective field of view of about 2π sr due to spacecraft shadowing.
ICA counts particles distributed in a five-dimensional parameter space, denoted as Θ, spanning over the azimuthal and elevation angles, mass, energy, and time.While the mass and the azimuthal angle of the ions are measured continuously, the instrument sweeps through the entire energy range at each elevation step.A full scan of Θ by ICA consists of 16 (azimuth) × 96 (energy) × 32 (mass) × 16 (elevation) bins and takes 192 s, which is the nominal time resolution.

Anode cross-talk
The ICA instrument differentiates the mass and the azimuthal arrival direction of the ions from their splatting position on 32 ring anodes and 16 sector anodes, respectively.Electronic interferences due to capacitive-coupling between preamplifiers can lead to cross-talk between these anodes.As a result, a beam of ions splatting on a sector anode, that is an azimuthal bin, can generate a signal on a different sector anode.A similar cross-talk occurs between the ring anodes (mass bins).
The cross-talk is in the order of a few percent and is largely suppressed in the data used in this study.However, cross-talk signature of very strong (high flux) signals are still visible and might A245, page 2 of 13 Canu-Blot, R., et al.: A&A, 683, A245 (2024) Fig. 1.Probability density function of counts in a bin under a Poisson process of mean λ T = 3.5 before and after applying the background reduction at a level of r = 2.Because of the background reduction, probabilities of counts larger than 0 and equal or below r are set to 0. Although the original mean number of counts is λ T = 3.5, after the background reduction is applied zero count becomes the most probable observation.The expected value of the background-reduced distribution can be estimated analytically by λO = c>r λ c T (c − 1)! e −λ T .This estimator is a special case of the general estimator defined in Eq. (A.4).pollute weak signals.We know that sector 0 cross-talks significantly and sectors 1 and 13 have very low sensitivity (Bergman et al. 2021), and we exclude data from them.Additionally, we found out that sectors 2 and 3 are not reliable enough due to their higher level of cross-talk in comparison to other sectors, and we also exclude data from them.We expose in Sect.3.4 how we mitigated cross-talk from the remaining sectors.

Mitigating instrumental background reduction
Penetrating radiation, electronic noise and decay of radioactive isotopes in the MCP glass generate a background signal visible in the ICA data, worsening its compressibility.To fit the data rate allocated by the limited spacecraft telemetry bandwidth, the ICA on-board central processing unit compresses the data after applying a background reduction.
The background reduction is applied such that for one specific bin of n original counts, the background-reduced bin has: 0 counts if n ≤ r, n counts otherwise, with r the background reduction level.In this way, all small background counts are converted to zeros, allowing for a more efficient data compression.While the level r was most of time set to 2, it varied between 0, 1 and 2 depending on observation cases.
Although background reduction allows for a finer coverage of Θ by saving telemetry, it damages or removes very weak signals by reducing the instrument sensitivity.The damaged and lost information cannot be recovered.Moreover, binned data is Poisson-distributed (Barlow 2013) but the background reduction makes it no longer true. Figure 1 shows the effect of a background reduction of r = 2 applied on a Poisson distribution of mean λ = 3.5.
To account for the non-Poisson behavior, we needed to calculate the correction factor between the observed mean number of counts per bin λ O , as reported by ICA, and the true mean number of counts per bin λ T , as detected by ICA.We approach the problem using a Monte-Carlo simulation.An analytical derivation of the problem is additionally given in Appendix A. The Monte-Carlo simulation is setup as such: we first generate a data matrix of randomly distributed independent counts following a Poisson distribution of mean λ T .The data matrix is generated within a region of Θ in which any signal mean λ is assumed constant.The Top panel: relation between the true mean number of counts λ T , as detected by ICA, and the observed mean number of counts λ O , as reported by ICA after the background reduction and binning are applied.In this figure, the relation is averaged over all energies for clarity ; in a real application, the correction factor is different for each energy bin.The observed mean λ O is significantly reduced for low true mean values λ T .The deviation vanishes for larger λ T ≳ 3 (inset).Bottom panel: ratio between the true mean number of counts and the observed mean number of counts.In all panels, the analytical estimate matches very well (within the MC uncertainties) the MC simulation, validating both approaches.The undistorted unit response is shown by the dashed line in all panels.smaller this region is, the less likely are fluctuations of the signal mean inside it.Then, we apply the same background reduction and binning as what the instrument applied on-board.The data matrix is processed through our pipeline (see Sect. 3), thus imitating the handling of a real dataset.Finally, the processed data matrix is averaged over all dimensions but the energy, giving for a known λ T the simulated λ O at each energy step.That is, we have for a given region of Θ-small enough to assume the mean of any signal to be constant-a correction factor between the observed and true signal mean at a given energy step.
The procedure is repeated for We iteratively add values to λ T until the real observed mean λ O,ICA ∈ λ O , finally giving λ T,ICA from linear interpolation.Figure 2 shows an example of the relation between λ O and λ T for a background reduction level of r = 2.One can deduce that if ICA observed on average 0.5 counts per bin, then the true average number of counts per bin must have been ∼0.9.The correction factor K = λ T /λ O converges to 1 for large λ T ≳ 3.For the rest of the analysis, the background reduction is corrected using the Monte-Carlo estimate.

Data selection and processing
From the parameter space Θ sampled by the ICA instrument, we identify regions that are most likely to contain the backscattered solar wind H + signal.These identified regions are referred to as the sub-volume Θ s+b , which is a subset of the overall parameter A245, page 3 of 13 Fig. 3. Data selection and processing pipeline.After the raw data is selected in angular space, the selection and processing that follow are identical for both foreground and background matrices.The instrumental data reduction correction is described in Sect.2.3.Fig. 4. Mission profile and time periods used in this study.Lower panel: spacecraft distance to the comet (solid line) and the comet distance to the Sun (dashed line).The spacecraft orbited comet 67P from September 2014 until its mission ended on September 30, 2016.The spacecraft distance to the comet's surface spanned from a thousand kilometers down to the surface, at the mission end on impact.During this period, the comet's heliocentric distance varied between 3.8 and 1.24 au.Middle panel: the solar wind H + density as seen by ICA.The color scale is logarithmic and increases for lighter colors.The solar wind ion cavity was observed for heliocentric distances below ∼2 au.Upper panel: the selected mission segments (green), adding up to a total of 74 days of data.space Θ.In parallel, we select a background-only region Θ b ⊂ Θ, from which the instrument's noise floor is estimated.
The data selection and processing pipeline are summarized in Fig. 3.

Selected mission segments
The sublimation process of the icy surface of comet 67P as it heats up approaching the Sun generates an atmosphere.For large out-gassing rates a distinct region within the comet's coma is formed, void of solar wind ions, as described by Behar et al. (2017).This region, also referred to as the 'solar wind ion cavity', is clearly visible in Fig. 4 (middle panel).There, the solar wind H + density is shown, and a clear depletion is visible for heliocentric distances below ∼2 astronomical units (au).Our investigation focuses on the backscattering of solar wind H + from the cometary surface, and for this purpose, we have chosen mission segments occurring outside the solar wind ion cavity.
We have also limited the data to times when the spacecraft distance to the comet is less than 40 km.By concentrating on this limited spatial range, we can neglect the influence of electromagnetic fields on both solar wind H + and backscattered H + .Thus, we assume that all H + move along a straight line at constant energy.This simplified our analysis as all H + seen by ICA when looking at the comet are assumed to originate from it and no particle back-tracing is required.
Finally, we only selected times when ICA had a full energy coverage.The resulting selected time segments are shown in Fig. 4 (upper panel), accounting for a total of 74 days.

Angular selection
The ICA instrument has 16 azimuthal bins, or sectors, each divided into 16 elevation bins, as illustrated in Fig. 5a.Sectors 0, 1, 9, 10, 11, 12, 13, 14 and 15 are partially obstructed by the spacecraft body and solar panels (Bergman et al. 2021), and we mask them out to eliminate possible particle reflection from spacecraft surfaces.Excluding sectors 2 and 3 (see Sect. 2.2), we retain sectors 4-8 for data integration.It is worth noting that during the selected mission segments, the Rosetta spacecraft adopted a terminator orbit, from which the cometary nucleus was mainly visible in sectors 4-6 (Nilsson et al. 2015, Fig. 1).
Additionally, we systematically mask out angular bins in the antisolar wind flow direction.To account for the variation of the solar wind angular spread, the masking is applied for a 60°halfangle cone, centered about the solar wind direction.

Signal angular space
The backscattered solar wind H + are assumed to originate from the cometary nucleus.As a result, we define the angular region of the parameter space signal region Θ s+b as all solid angles directly looking at the cometary nucleus, as shown in Fig. 5b.The cometary nucleus is simplified as a spherical body of diameter equal to its largest dimension (Jorda et al. 2016).

Background angular space
We assume the direct surroundings of the comet to be free from any H + signature.Therefore, a background-only region, Θ b , is defined as the angular region around the signal region covered by Θ s+b as sketched in Fig. 5b.

Energy resampling and normalization
To improve the overall statistics, we accumulate data for as long as possible.This means we compare data from different solar wind conditions and instrumental configurations.
Throughout the mission, the ICA instrument underwent several software updates.The energy table, which maps energy bins to physical units (eV/q), changed with each new software version.To unify data from different energy tables we resampled the data into a single reference energy table.The resampling resulted in interference patterns that we cancel out as explained in Sect.3.5.
The reference energy table was normalized to the solar wind H + energy (defined at peak flux).This allowed comparing data obtained under different solar wind conditions.Additionally, normalizing the data to the solar wind H + energy is particularly relevant when considering particle-surface interactions; we anticipate the backscattered signal to be a fraction of the solar wind H + energy, and normalizing with respect to it simplifies data interpretation.

Energy selection
In Sect.2.2, we discussed how strong signals can cross-talk over mass and azimuthal bins.We have identified two significant signals, namely the solar wind H + and the low-energy (<100 eV/q) cometary ions, whose cross-talk signatures could contaminate the count rate in Θ s+b and Θ b .Although angular bins in the anti-solar wind flow direction are masked out, the azimuthal cross-talk of solar wind H + could still affect Θ s+b and Θ b .Similarly, the mass cross-talk of heavier cometary ions could introduce contamination in the H + mass bin.This results in the presence of nonphysical counts in certain energy regions of Θ s+b and Θ b .
To address this cross-talk contamination, we apply an energy mask.We assume that the solar wind H + and cometary ions are normally distributed in energy and establish upper and lower energy boundaries for data integration by considering two standard deviations above and below the peak of the signals, respectively.Additionally, we mask out energy bins 17-20 (ranging from 40 to 110 eV/q) to exclude any instrumental effects resulting from a change in the elevation stepping scheme.

Flat-field correction
The energy resampling, as discussed in Sect.3.3, introduces artificial fluctuations in the data that can be misinterpreted as physical.To correct the data, we need to characterize the effect our data processing has on a real data matrix.This characterization can be seen as an analog to the flat-field correction commonly used in digital imaging.By applying this correction, we ensured that a uniform input yields a uniform output from our data processing.
The characterization process is straightforward: we create a matrix of ones of the same shape as the real data matrix.This matrix is masked and resampled in energy, in the same ways as the real data matrix.This yields the processing response to unity, describing how does our processing pipeline affect a uniform input.We finally divide the real data matrix by the processing response to unity to obtain the corrected data matrix.

Signal energy spectrum
We consider two separate data matrices that correspond to different regions in the parameter space: the foreground region Θ s+b and the background region Θ b .Θ s+b contains both the possible signal and background, while Θ b contains only background data.After applying instrumental background reduction and flat-field correction, both matrices are summed over all dimensions but energy, yielding two energy spectra.Each spectrum represents the total number of counts observed in each energy bin within the parameter space region of interest.

Removing the background
Our goal is to estimate the signal mean λ s , representing the expected average number of counts ICA would observe from a signal-only random process.However, the number of observed counts in the foreground region Θ s+b results from the convolution of two random processes: signal and background.
To isolate the signal, one solution is to subtract the average background, derived from Θ b , from the observed counts in Θ s+b .However, as the number of observed counts is a Poissondistributed random variable, the subtraction of an average background may result in negative estimates of λ s .
To ensure meaningful results, we employ the method introduced by Feldman & Cousins (1998).This technique, which relies on a likelihood ratio ordering principle, allows us to estimate guaranteed positive confidence intervals of a random process parameter, for example the signal mean λ s , in the presence of nuisance parameters, for example the background mean λ b .By doing so, we prevent the construction of nonphysical confidence intervals, especially in situations where the number of observed background counts is larger than the number of observed foreground counts.We also avoid constructing undercovered intervals, that is when the confidence intervals does not cover the true value of λ s within the specified confidence level.

Probabilistic model
To calculate a confidence interval for the expected average number of counts originating from a possible signal process in Θ s+b , we define a probability model.The observed number of counts in Θ s+b , denoted as c, is assumed to arise from two Poisson processes: a signal process with a mean λ s and a background process with a mean λ b .The probability density function is defined as: The background mean, λ b , is determined from the background region Θ b where we observed b background counts.The large size of this region, giving a relative standard error on the mean in the order of 10 −6 , provides a precise estimate of the mean: λ b = b.
The parameter τ accounts for the different sizes of the foreground and background regions.τ defines the ratio between the probability that a background count occurs in the foreground region and the probability it occurs in the background region: A245, page 5 of 13 To judge the significance of the calculated signal mean above the background level, we compare it to the measurement sensitivity.We define it at a 90% confidence level as the average upper limit of λ s that would be obtained by an ensemble of n observations of the number of counts c n in Θ s+b if no signal was present (λ s = 0).In other words, the measurement sensitivity is the minimum signal mean required to distinguish it from zero.For large background counts this limit corresponds to ∼1.64σ of the background count distribution.For each energy bin, if the calculated signal mean exceeds this limit, the total number of counts c observed in the bin is unlikely to result only from the background process.However, given the 90% confidence level, on average 10% of the background-only data are expected to exceed the measurement sensitivity.

From counts to physical units
All statements about the signal mean λ s and the measurement sensitivity are given in terms of counts, an instrumentdependent unit.For a generalized statement using physical units, we convert counts to differential directional number flux j cm −2 sr −1 eV/eV s −1 , normalized to solar wind H + energy as discussed in Sect.3.3.The eV/eV in the differential flux unit results from the normalization.
From Kessel et al. (1989, Eq. (2)), we derive the following relation: where i is the energy bin index, N i is the number of counts, GF is the instrument geometric factor for the proton mass in cm 2 sr eV/eV (Nilsson et al. 2006), E i is the bin central energy in eV, T i is the total measurement time in s and E sw is the average solar wind H + energy impinging onto the surface in eV.We define the time-average differential directional number flux as: In the time period considered, the average solar wind H + energy was E sw = 963 eV.The geometric factor GF , set equal to 1.7 × 10 −5 cm 2 sr eV/eV, is approximated as constant for the selected instrument operational modes, the energy range of interest and the looking direction.It is important to note that j is the differential directional number flux averaged over the integration time period and the solid angle covered by the comet as seen by ICA.

Differential flux energy spectrum
Figure 6 shows the differential directional energy number flux of protons originating from the comet 67P.The striped area from 1 to 4 E sw , referred to as the solar wind exclusion region, covers the energy range of solar wind H + and cross-talk signatures of He + and He ++ .This region has been masked out.In almost every energy bin, the signal is below the measurement sensitivity.We conclude there is no discernible foreground signal in the region of interest Θ s+b .
To be conservative in our estimate, we use the measurement sensitivity as an upper limit for the backscattered solar wind H + differential directional number flux.From 0.1 to 1.0 E sw , the differential flux upper limit ranges from 1.1 × 10 4 to 3.3 × 10 1 cm −2 sr −1 eV/eV s −1 .In comparison, from the lunar surface, Lue et al. (2014) reported a peak backscattered solar wind H + flux at an energy of 0.64 E sw .In our study, that corresponds to a differential flux upper limit of j (E = 0.64 E sw ) ≈ 1.3 × 10 3 cm −2 sr −1 eV/eV s −1 .

Background check at high energies
We do not expect to observe any signal above 4 E sw .All data above 4 E sw shown in Fig. 6 should thus only arise from a background process.We use this energy range to compare the background probability density functions of Θ s+b and Θ b .
By applying a two-samples Kolmogorov-Smirnov test (Barlow 2013), we can certify at a 90% confidence level the background counts observed in both regions originate from the same probability density function (p-value = 6.8 × 10 −1 ⩽̸ 0.1).In other words, the background level derived from Θ b is A245, page 6 of 13 Canu-Blot, R., et al.: A&A, 683, A245 (2024) consistent with the background level of Θ s+b , for E > 4 E sw .This consistency supports our conclusion that there is no significant signal.

Backscattering efficiency
The solar wind H + backscattering efficiency R is the ratio between the H + total flux emitted from the cometary surface and the solar wind H + total flux impinging onto the surface.For the emitted flux we only consider energies between 0.1 E sw to 1.0 E sw , with E sw the solar wind H + energy.We also assume that the angular scattering at the surface is uniform and follows a cosine angular distribution.Such a distribution is more applicable to a powder-like substance like regolith than a cosine square distribution often used for scattering from polycrystalline surfaces (Cassidy & Johnson 2005).Solar wind parameters are obtained from ICA data (Nilsson et al. 2020).The total emitted H + flux f backscattered cm −2 s −1 is then estimated by f backscattered ≡ π/E sw • 1.0 E sw 0.1 E sw jdE, with E sw the average solar wind H + energy in the observation interval and π comes from the angular integration.
Ions emitted from the surface may charge exchange in the thin cometary atmosphere as they travel from the surface to the Rosetta spacecraft.To estimate the loss, we model the cometary neutral atmosphere density as N(r) = n 0 (R 0 /r) a , with r the distance from the nucleus, n 0 the neutral density measured by the COmet Pressure Sensor (COPS, part of the ROSINA package; Balsiger et al. 2007) at a distance R 0 from the comet nucleus (Gasc et al. 2017), and the exponent a fitted to the measured neutral density profile (Edberg et al. 2022).
From the neutral density model, we compute the averaged neutral column density from the spacecraft to the surface during the covered time period.Assuming a cross-section of 10 −15 cm 2 for charge-exchange between protons to oxygen atoms (Wurz 2000), we find that about one third to half of the H + emitted from the surface are neutralized prior reaching ICA.We neglect possible charge-exchange of solar wind H + on the inbound part of the trajectory to the surface as the charge state of the emitted particles is independent of the charge state of the impinging particles in the energy range considered (Hird et al. 1991;Van Wunnik et al. 1983;Jans et al. 2001;Wieser et al. 2002).The total solar wind hydrogen flux onto the cometary surface f sw , independent of its charge state, is estimated to be equal to the solar wind H + flux as seen by ICA.Using the measurement sensitivity from Fig. 6 as the upper limit of the observed backscattered differential flux j and including charge exchange losses of ions traveling from the cometary surface to ICA, we obtain an upper limit of the backscattering efficiency: with ν = 0.4 the fraction of the emitted H + lost to charge exchange.

Discussion: investigating the absence of a visible signal
We conclude from Fig. 6 that no significant backscattered solar wind H + signal is visible in the parameter space region Θ s+b .
Either the signal intensity is too weak for ICA to differentiate it from the background noise, or the signal was missed for other reasons.

Reasons to miss an existing signal
Assuming the signal is intense enough to be visible, the following scenarios can explain why it does not show up in our analysis: 1.The backscattering signal originates from a different region of the parameter space Θ that falls outside the coverage of Θ s+b .2. The data from Θ b overestimates the background in Θ s+b , raising the measurement sensitivity shown in Fig. 6.A data point drawn from the signal process would then be misinterpreted as a background-only process.3. The region Θ s+b encompasses more than just the signal region, lowering the signal-to-noise ratio and making the signal indistinguishable from background.The first scenario is very unlikely: we expect backscattered H + to originate from the comet with an energy below the solar wind energy, which is included in Θ s+b (see Sect. 3).The background check in Sect.4.3.1 confirms that the background estimation at high energies is the same for Θ b and Θ s+b .There is no reason this does not hold true at lower energies, thus we disregard the second scenario.The third scenario, however, merits a more detailed investigation laid out in the next section.

Can we improve the signal-to-noise ratio?
Due to the lack of prior knowledge about the backscattering signal distribution over time, we initially integrate data over as long as possible (within the limits specified in Sect.3.1), yielding the energy spectrum presented in Fig. 6.One possible explanation for the lack of a visible signal is that the data includes time periods with no significant backscattering, thus degrading the signal-to-noise ratio.To address this, we need to better constrain the time periods during which the backscattering process is most likely to occur.
The flux of backscattered solar wind H + is likely influenced by, for example, the solar wind bulk parameters, the cometary activity and the overall observation geometry.For example, we expect the backscattered flux to be proportional to the solar wind H + flux impinging on the surface.Similarly, the orbital parameters likely have an influence, for example the closer to the comet, the higher the expected observed backscattered flux.Additionally, a composition dichotomy between the north-south hemispheres, first observed by Hoang et al. (2019), could cause the observed backscattered flux to depend on the side of the comet facing the spacecraft.The comet orientation relative to the spacecraft continuously changes with the comet rotation period of ∼12 h (Jorda et al. 2016) and along the spacecraft's orbit.These highly dynamic observation conditions may cause large time variations of the signal strength as seen by ICA over time.

Search for a signal
To identify time periods that are more likely to contain a signal, we test for the presence of any signal within these periods.We create shorter time periods by binning the data according to the values of several parameters that might affect the signal strength.These parameters are the solar wind H + energy, temperature, and density, the cometary ion energy and density, the Sun-Comet-spacecraft angle, the spacecraft's distance from the comet, the comet's distance from the Sun, as well as the latitude and longitude of the sub-nadir point.For each of these bins (now representing shorter time periods) we now investigate the possibility that it contains a signal.

Joint probabilistic model and significance
As we are investigating the foreground and background data over shorter time periods, we can no longer assume the background mean λ b to be exactly known like in Sect.4.1.1.Therefore, the uncertainty on the background mean must be added in our probabilistic model.
The joint probabilistic model of observing c counts in Θ s+b and b counts in Θ b is given by: (5) The model parameters are the same as the ones defined in Sect.4.1.1.The two components of Eq. ( 5) represent the probabilities associated with the foreground and background regions.We note the background mean λ b is a nuisance variable, unlike in Eq. ( 1) where it is assumed to be exactly known.
To evaluate the presence of a significant signal in the binned data (see Sect. 5.2.1), we construct a statistical hypothesis test based on the profile likelihood ratio as described in Appendix B.1.The statistical test Λ (details are given in Appendix B.2) compares the goodness-of-fit of a given observation from two competing probabilistic models, which general description is given by Eq. ( 5).The first model defines the null hypothesis H 0 : λ s = 0, that is the absence of a signal.The alternative model/hypothesis is H a : λ s > 0, where λ s is assumed to be log-normally distributed with respect to the solar wind H + energy.Working at a 90% confidence level, for a given observation, if the statistical test Λ is greater than a threshold corresponding to a p-value of 10%, then we can reject the null hypothesis H 0 .This would indicate the presence of a signal.

We likely cannot improve the signal-to-noise ratio
Figure 7 shows the best fit of the signal mean λ s versus the energy normalized to the solar wind H + energy and four parameters: the sub-nadir latitude, the sub-nadir longitude and the estimated solar wind H + and cometary ions densities.The 'best fit' refers to the maximum likelihood estimate of the signal mean, as discussed in Appendix B.2.The data in this example is segmented into three quantiles using variable bin sizes.The statistical test Λ is evaluated for each of these fits, and is shown in the upper panel of Fig. 7. Here, Λ gives the goodness of fit a lognormally distributed signal across the energy range of interest.Each fit is evaluated individually, characterized by one unique Λ.
Most tests Λ are below the critical threshold above which we can reject H 0 .While two tests do exceed this threshold, it is likely due to statistical fluctuation within the 90% confidence level.In other words, the tests suggest the absence of a signal.The same is true (not shown) for the other time-dependent parameters listed in Sect.5.2.1.Changing the number of quantiles (that is the number of bins in Fig. 7) does not significantly change the result.The random behavior of the fits shown in Fig. 7 is an additional element suggesting the background-only nature of the data in Θ s+b .Indeed, the best fits of the signal average location and spread in energy appear to fluctuate randomly, supporting the initial conclusion that there is no signal.
These observations indicate either the signal is not visible anywhere in the signal parameter space Θ s+b and/or it does not correlate well with any of the parameters listed in Sect.5.2.1.As we would expect the signal to be a function of, for example, Fig. 7. Results of the statistical hypothesis test Λ.The test was applied to data binned according to the values of four parameters, namely the solar wind H + density n sw , the cometary ions density n ci , and the latitude and longitude of the spacecraft sub-nadir (SN) point on the cometary nucleus.The data are distributed into three quantiles using variable bin sizes.The upper panel exhibits the results of the Λ test for each bin, with a dashed horizontal line denoting the critical value u crit above which H 0 is rejected at a 90% confidence level.In practical terms, if Λ falls below this threshold, we can confidently conclude the absence of a significant signal.In the lower panel, we visualize the optimal distribution of the signal mean λ s as a function of the energy normalized to the solar wind H + energy and the four binned parameters.
the solar wind H + density, among others, we conclude that the signal-to-noise ratio cannot be improved.This leads us to dismiss the third point in Sect.5.1 and ultimately to the conclusion that no backscattering signal is discernible in the ICA data.As a result, the upper limit of the backscattering efficiency, R = 9 × 10 −4 , determined in Sect.4.4, stands as a reasonable estimate we can likely not improve further.

Reasons for a truly small signal
There is the possibility that the solar wind H + , although seen at the spacecraft location, do not reach the cometary surface.This, however, seems very unlikely as the spacecraft is on average 20 km away from the nucleus; a substantial deflection of the solar wind H + within this short distance seems improbable.As pointed out in Sect.4.4, the charge state of the solar wind hydrogen does not matter.The neutralization of the solar wind H + as they travel from the a typical spacecraft cometo-centric distance to the surface does not affect our results.Additionally, Wurz et al. (2015) observed a particle population sputtered from the cometary surface, indicating the solar wind was reaching the cometary surface for conditions similar to those used in this study.
Table 1 shows estimates of the backscattering efficiency for other small/airless bodies in the Solar system.Comet 67P distinguishes itself as a notably weak reflector of solar wind H + , with a backscattering efficiency below the lowest observed backscattering efficiency from the lunar regolith.
The surface of comet 67P, and potentially cometary surfaces in general, seems to interact differently with the surrounding A245, page 8 of 13 Canu-Blot, R., et al.: A&A, 683, A245 (2024) Deniau et al. (2022) reported no evidence of backscattered solar wind H + from Phobos.An upper limit of the differential directional energy flux is given as 10 6 eV cm −2 s −1 sr −1 eV −1 .Similarly, Futaana et al. (2021) reported observations from two Mars Express flybys (out of more than a dozen) of H + potentially coming from Phobos' surface.However, the sporadic nature of the observations remains unexplained.
plasma than the surface of the Moon and Phobos.The sublimation of icy compounds within a comet gardens/refreshes its surface frequently, while the impact vaporization and sputtering processes at Moon and Phobos are much slower.At the Moon, solar wind H + implants into the top 100 nm of the regolith layer, where it can reach saturation if all oxygen atoms have an associated hydrogen (Farrell et al. 2015).The build-up of hydrogen within the uppermost layer might lead to an increased fraction of sputtered/reflected solar wind H + .At a comet, a similar buildup process might be hindered by the more frequent refreshing of its surface.
The porosity of the uppermost layer of lunar regolith is a key parameter in understanding plasma-surface interactions on the Moon.From energetic neutral H observations (Chandrayaan-1), Szabo et al. (2022) derived a global porosity of the uppermost regolith layer of ∼0.85.They additionally observed a decreasing reflection efficiency of solar wind H + as energetic neutrals with increasing porosity.Such a dependence might hold true also for reflected ions, as the charge state of the reflected particle is determined after the scattering process.Comet 67P has a large global porosity (Jorda et al. 2016), though the porosity of its interaction surface with the surrounding plasma is not known.The low backscattering efficiency of comet 67P might be an indictor of a high microscopic porosity of its surface, further promoted by the comet's low gravity.
The difference in backscattering efficiency between lunar regolith and the surface of comet 67P may also be attributed to differences in their surface compositions (Alexander et al. 2016;Wurz et al. 2015).

Conclusion
Solar wind ions backscattering is a fundamental plasma-surface interaction that may occur on all celestial bodies exposed to the solar wind and lacking a significant atmosphere or magnetosphere.The observations, however, have so far been limited to the Moon and Phobos.In this paper we investigate the backscattering of solar wind H + from the surface of comet 67P, utilizing data from the mass analyzer ICA on board the Rosetta spacecraft.Our statistical analysis showed that no signal was detected by the instrument, putting an upper limit to the backscattered differential directional number flux from 1.1 × 10 4 to 3.3 × 10 1 cm −2 sr −1 eV/eV s −1 for H + energies of 0.1 to 1.0 E sw .We conclude that the surface of comet 67P has a very low reflection efficiency for solar wind H + of <9 × 10 −4 .Possible reasons for the low reflectivity of the surface include the fast gardening of the surface, its composition and porosity.

B.2. Search for a signal of hypothesized distribution
In this section, we apply the general method presented in Sect.B.1 to our case study.

B.2.1. Applying the profile likelihood ratio
We have two sets of observations.c is the counts observed in the region Θ s+b where we expect two random processes: a signal and a background.These counts are integrated over all dimensions but the energy, distributed exponentially across R bins.
is the region size ratio as defined in Sect.4.1.1.
We assume the background process in Θ s+b to be the same as in Θ b .Under H a , the signal process in Θ s+b is assumed to follow a lognormal distribution in energy of unknown mean γ and standard deviation σ.The total likelihood L of obtaining the two observations c and b, given the joint probabilistic model defined in Equation 5, is: i Lognormal e, γ, σ 2 de, with Lognormal e, γ, σ 2 the Lognormal probability density function evaluated at the energy e, i the energy bin index, and [E − i , E + i ] the energy interval covered by the energy bin i. λ s i is the mean of the Poisson process for the signal within the energy bin i, and λ s is the signal strength, that is the total number of counts originating from the signal process.Here, (γ, σ) are parameters of the signal probabilistic model, belonging to the (D=2)-dimensional parameter space S.These parameters are non-identifiable under H 0 , as no signal is present.Additionally, we define λ At a point in the parameter space S, that is for a fixed set of (γ, σ), the profile likelihood ratio is given as the ratio of the total likelihoods maximized independently under the two competing hypotheses.From Equation B.1, we have: The maximization of the total likelihoods is performed over both λ b and λ s .Alternatively, the maximum likelihood estimate (MLE) of λ b can be analytically determined as: the background mean MLE becomes the arithmetical average between the two observations c and b.The MLE λb is only a function of λ s i and thus reduces the dimensionality of the maximization of the numerator in Equation B.6 to 1.The profile likelihood ratio finally becomes: We now have a tool to assess the goodness-of-fit of the data in Θ s+b , that is c, to the presence of a lognormally distributed signal of know" mean and spread against the background-only alternative, constrained by the data from Θ b , that is b.To quantify the significance of one observation of Λ (γ, σ), we must know its tail distribution.
Wilk's theorem (Wilks 1938) states that, under some validity conditions and if H 0 is true, Λ (γ, σ) converges asymptotically to a χ 2 k distribution.The number of degrees of freedom k is equal to the difference in dimensionality between the maximization of the numerator and denominator in Equation B.1.Here, k = 1.
The application of the necessary conditions for Wilk's theorem to be valid is evaluated below, following the recommendation of Algeri et al. (2020): 1. Λ (γ, σ) converges asymptotically to a χ 2 k distribution.This requires that R i c i → ∞.We assume Θ s+b to be large enough such that we have sufficient data for the asymptotical behavior to hold true. 2. Only values of the parameters over which the maximization is performed, here λ s , that are not on the boundaries of their parameter spaces are allowed.The parameter space of λ s is R + , and thus this condition fails under H 0 as λ s = 0.
3. The null hypothesis (λ s = 0) must be a limiting case of its alternative (λ s > 0).This holds true in our application.4. All parameters are identifiable under both hypotheses.Eventhough (γ, σ) are unknowns of our model, Λ (γ, σ) does not have any non-identifiable parameters as it is evaluated for one given point of S. 5.The probabilistic model (given in Equation 5) accurately represents one of the two competing hypotheses.We assume this to be true.
In our case, all conditions are met but the second one.For the null hypothesis under which we want to estimate the distribution of Λ (γ, σ), the parameter λ s is defined at the boundary of its parameter space.To account for this, the asymptotical distribution becomes Λ (γ, σ | H 0 ) ∼ 1/2 χ 2 1 + 1/2 δ(0), with δ(0) the Dirac delta distribution centered at 0 (Algeri et al. 2020, p. 247).

B.2.3. Maximization over S
The profile likelihood ratio defined previously is evaluated at a fixed point in the parameter space S. We now have to maximize it over S to find the best fit the model defined in Equation B.5 to our observations.This gives the maximized profile likelihood ratio (as presented in Equation B.2): Λ = max (γ,σ)∈S Λ (γ, σ) . (B.9) The MLE of the signal strength λ s , its average location γ and its spread in energy σ are finally given as: λs , γ, σ = arg max When Λ (γ, σ) is evaluated at a point of S, it is distributed according to a χ 2 distribution.When mapped to the parameter space S, Λ (γ, σ) becomes a χ 2 random field for which (γ, σ) is the global maximum coordinates.The statistical distribution of Λ is no longer χ 2 as the probability of obtaining a significant Λ (γ, σ) observation rises with the size of the parameter space S.This is called the "look-elsewhere effect" (Gross & Vitells 2010), and is a direct effect of the non-identifiability of the parameters θ under the null hypothesis H 0 .The distribution of Λ is therefore unknown and must be estimated to compute the significance of a given observation in terms of p-value.Vitells (2011) proposes a method of estimating the tail distribution of Λ based on the theory of random fields.As stated above, the test statistic Λ (γ, σ) is a χ 2 random field over S. The excursion set of the field over some level u is defined as A u = {(γ, σ) ∈ S : Λ (γ, σ) > u}.Let us denote by ϕ (A u ) the Euler characteristic of the excursion set A u .Practically, ϕ (A u ) counts the number of disconnected elements composing A u .As the level u increases, the random field increasingly rarely exceeds it.For large enough u, ϕ (A u ) asymptotically becomes the excursion probability over u.For large u, we can therefore approximate the tail distribution of Λ, giving: Equation B.14 is valid for every u.This equation can be solved for the constants (N 1 , N 2 ) by replacing u with two different levels (u 1 , u 2 ).The choice of (u 1 , u 2 ) must be such to obtain the smallest error on E ϕ (A u ) while running a small number of MC simulations..15evaluated at different levels u, with the constants N 1 = 2.3 ± 0.1 and N 2 = 16.6 ± 0.2 estimated from 20 MC simulations.For each MC simulation, the parameter space S is discretized in 400 evaluations.From this plot, we can conclude if Λ ≳ 8.5, then the null hypothesis H 0 is rejected at a 90% confidence level.It is important to note for small u, the estimated p-value is greater than 1, which for a probability is not possible.This is because the estimation drawn from Equation B.14 only applies to large u.In our case, the prediction for p-values < 10 −1 is reasonable, as demonstrated by Vitells (2011, Figs. 2 & 6) in a similar context.
From Equations B.14 and B.11, and given the level u is sufficiently large, we have:

Fig. 2 .
Fig.2.Observation bias due to the background reduction scheme.Both estimates from a Monte-Carlo simulation and the Eq.(A.4) are shown.Top panel: relation between the true mean number of counts λ T , as detected by ICA, and the observed mean number of counts λ O , as reported by ICA after the background reduction and binning are applied.In this figure, the relation is averaged over all energies for clarity ; in a real application, the correction factor is different for each energy bin.The observed mean λ O is significantly reduced for low true mean values λ T .The deviation vanishes for larger λ T ≳ 3 (inset).Bottom panel: ratio between the true mean number of counts and the observed mean number of counts.In all panels, the analytical estimate matches very well (within the MC uncertainties) the MC simulation, validating both approaches.The undistorted unit response is shown by the dashed line in all panels.

Fig. 5 .
Fig. 5. a.An external view sketch of the ICA instrument angular bins.b.An illustration of the angular spaces associated to both matrices.Here, the symbols Θ s+b and Θ b only represent the angular dimension of the parameter spaces.c.The projection of the angular spaces as seen by ICA.

Fig. 6 .
Fig. 6.Time-average differential directional number flux energy spectrum of protons originating from the comet, after subtracting the background.The black dots represent the background-subtracted data, horizontal error bars represent the energy bin width and vertical error bars a 90% confidence interval.The dashed line represents the measurement sensitivity; any signal below this threshold is indistinguishable from background noise.The striped area covers the solar wind and has been masked out.
, R., et al.: A&A, 683, A245 (2024) which represents the mean of the Poisson process for background counts in each energy bin.Both λ s and λ b are relevant under both hypotheses.Following the formalism introduced in Sect.B.1, we have θ = (γ, σ) and θ ′ = λ b .
Statistical distribution of Λ and p-value estimation

E
ϕ (A u ) ≈ P[Λ > u] .(B.11)One important result fromAdler (2007) is that the expectation value of the Euler characteristic ϕ (A u ) can be analytically described as:E ϕ (A u ) = D=2 d=0 N d ρ d (u).(B.12)For a χ 2 k random field, the functions ρ d (u) are derived fromAdler (2007, Theorem 15.10.1 on p. 429)  as: u) = A(k, d = 1) • u (k−1)/2 e −u/2 , (B.13b) ρ 2 (u) = A(k, d = 2) • u (k−2)/2 e −u/2 • u − 1 {k≥2} (k − 1) , (B.13c)with 1 the indicator function.The function A(k, d) does not depend on u and can be included in the constant N d .For k = 1, the Equation B.12 reduces to:E ϕ (A u ) = 1 2 P χ 2 k > u + e −u/2 N 1 + N 2 √ u .(B.14) Fig. B.1.Equation B.15 evaluated at different levels u, with the constants N 1 = 2.3 ± 0.1 and N 2 = 16.6 ± 0.2 estimated from 20 MC simulations.For each MC simulation, the parameter space S is discretized in 400 evaluations.From this plot, we can conclude if Λ ≳ 8.5, then the null hypothesis H 0 is rejected at a 90% confidence level.It is important to note for small u, the estimated p-value is greater than 1, which for a probability is not possible.This is because the estimation drawn from Equation B.14 only applies to large u.In our case, the prediction for p-values < 10 −1 is reasonable, as demonstrated byVitells  (2011, Figs. 2 & 6)  in a similar context.
the relation between the level u and the pvalues is given in Fig. B.1.A245, page 13 of 13

Table 1 .
Backscattering efficiency of solar wind H + on small bodies in the Solar System.