A new approach for the statistical denoising of Planck interstellar dust polarization data

Dust emission is the main foreground for cosmic microwave background (CMB) polarization. Its statistical characterization must be derived from the analysis of observational data because the precision required for a reliable component separation is far greater than what is currently achievable with physical models of the turbulent magnetized interstellar medium. This letter takes a significant step toward this goal by proposing a method that retrieves non-Gaussian statistical characteristics of dust emission from noisy Planck polarization observations at 353 GHz. We devised a statistical denoising method based on wavelet phase harmonics (WPH) statistics, which characterize the coherent structures in non-Gaussian random fields and define a generative model of the data. The method was validated on mock data combining a dust map from a magnetohydrodynamic simulation and Planck noise maps. The denoised map reproduces the true power spectrum down to scales where the noise power is an order of magnitude larger than that of the signal. It remains highly correlated to the true emission and retrieves some of its non-Gaussian properties. Applied to Planck data, the method provides a new approach to building a generative model of dust polarization that will characterize the full complexity of the dust emission. We also release PyWPH, a public Python package, to perform GPU-accelerated WPH analyses on images.


Introduction
The quest for primordial B-modes in the polarization of the cosmic microwave background (CMB) faces a major challenge, namely the accurate characterization of one of its main foregrounds: the polarized emission of interstellar dust from our Galaxy (BICEP2/Keck and Planck Collaborations 2015). To rightfully claim any detection of the cosmological signal, one has to fully take into account the complexity of the Galactic contribution to the sky emission (Hensley & Bull 2018; Clark & Hensley 2019; Pelgrims et al. 2020; Planck Collaboration XI 2020). The statistical characterization of Galactic polarized foregrounds is also important to extract the CMB lensing potential from sky observations (Beck et al. 2020).
The Galactic contribution is sourced by the thermal emission of nonspherical dust grains in the diffuse interstellar medium (ISM). The mixture of dust and gas is described as a turbulent magnetized fluid, and the grains tend to align statistically with their long axes perpendicular to the local magnetic field, thus polarizing the emission (Planck Collaboration XII 2020). The physical and chemical processes regulating and structuring the diffuse ISM are nonlinearly coupled (Draine 2011), leading to the emergence of strong scale couplings, which may be evidenced by the highly non-Gaussian statistics observed for many tracers (Burkhart et al. 2009).
Capturing non-Gaussianity is therefore essential to characterize the dust signal. To tackle this issue, several authors have introduced machine learning algorithms that need to be trained (Aylor et al. 2020; Krachmalnicoff & Puglisi 2021; Petroff et al. 2020; Thorne et al. 2021). Another approach, which does not involve any learning step, uses statistics defined from nonlinear transforms, namely the wavelet scattering transform and the wavelet phase harmonics (WPH). These describe the couplings between scales in non-Gaussian fields, and they have been applied to ISM data (Allys et al. 2019; Regaldo-Saint Blancard et al. 2020; Saydjari et al. 2021), as well as to the study of the large-scale structure of the Universe (Cheng et al. 2020). They have been successfully used to characterize simulated and observational data with a high signal-to-noise ratio (S/N), and to build generative models for these data. However, these approaches face difficulties with noisy data, which is the case of Planck polarization observations. In this letter, we introduce a statistical denoising method using the WPH statistics to retrieve the statistical properties of the noise-free dust emission.
In Sect. 2 we present our statistical denoising method and show how well it performs on noisy simulated data of the polarized emission from dust. In Sect. 3 we apply it to the Chamaeleon-Musca 353 GHz polarized emission, as observed by the Planck satellite, and discuss our results. In Sect. 4 we conclude and suggest a few perspectives for follow-up studies. This paper also includes two appendices. Appendix A introduces the WPH statistics and the denoising procedure in detail. Appendix B presents a first comparison with other methods. We also provide PyWPH, a public Python package to perform GPU-accelerated WPH analyses on images.

(arXiv:2102.03160v2 [astro-ph.CO], 25 May 2021)

Description of the method
We observe a noisy map d that is modeled as

d = s + n, (1)

with s being the target truth map and n being an additive noise signal. The noise n is an unknown realization of a random field N. We assume that we know N, meaning that we are able to generate as many independent realizations of it as needed. We call {n_1, ..., n_M} M of these. We introduce a method to retrieve the statistical properties of s while denoising d. This statistical denoising consists in iteratively deforming d to build a denoised map s̃ such that the {s̃ + n_i}_i maps and the d map become "close enough" in a given statistical space. It employs the possibility introduced in Zhang & Mallat (2021) to build a generative model from WPH statistics, that is, the ability to generate new random realizations based on statistical constraints. In the following, we call φ the operator that computes a set of summary statistics from a given map. The algorithm consists in minimizing the loss function

L(u) = Σ_{i=1}^{M} ||φ(u + n_i) − φ(d)||², (2)

where ||·|| denotes the Euclidean norm. We choose u_0 = d to initialize the optimizer. The denoised map s̃ corresponds to an approximate minimum obtained by performing this optimization in pixel space, using an L-BFGS optimizer (Byrd et al. 1995). We note that a limitation of this algorithm lies in the (ideal) assumption that we know N: in practice, any non-modeled statistical property of the noise will be considered to be part of the signal s. The choice of the operator φ is obviously paramount to the quality of the method and must be tailored to the properties of s and n. In the context of ISM polarization data, we expect s to be relatively regular and to exhibit non-Gaussian signatures due to the interactions between scales (e.g., filamentary structures), while the noise is expected to be highly irregular and close to a (possibly spatially varying) Gaussian white noise.
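To make the optimization concrete, the minimization of Eq. (2) can be sketched in PyTorch. The WPH operator is replaced here by a stand-in `phi_toy` (band-averaged Fourier power), since the full WPH operator is beyond a short example; the function names and parameter choices are illustrative assumptions, not the PyWPH implementation.

```python
import torch

def denoise(d, noise_reals, phi, n_iter=20):
    """Minimize L(u) = sum_i ||phi(u + n_i) - phi(d)||^2 over pixel maps u.

    d           : (H, W) real tensor, the noisy map.
    noise_reals : (M, H, W) tensor of independent noise realizations n_i.
    phi         : callable mapping a map to a 1D tensor of summary statistics.
    """
    target = phi(d).detach()            # statistics of the observed map, fixed
    u = d.clone().requires_grad_(True)  # initialize the optimizer at u0 = d
    opt = torch.optim.LBFGS([u], max_iter=n_iter, line_search_fn="strong_wolfe")

    def closure():
        opt.zero_grad()
        loss = sum(torch.sum(torch.abs(phi(u + n) - target) ** 2)
                   for n in noise_reals)
        loss.backward()
        return loss

    opt.step(closure)
    return u.detach()

def phi_toy(x, n_bins=8):
    """Stand-in statistics (NOT the WPH operator): mean |FFT|^2 in annuli."""
    f = torch.fft.fft2(x)
    p = f.real ** 2 + f.imag ** 2
    H, W = x.shape
    ky, kx = torch.meshgrid(torch.fft.fftfreq(H), torch.fft.fftfreq(W),
                            indexing="ij")
    k = torch.sqrt(kx ** 2 + ky ** 2)
    edges = torch.linspace(0, 0.5, n_bins + 1)
    return torch.stack([p[(k >= a) & (k < b)].mean()
                        for a, b in zip(edges[:-1], edges[1:])])
```

Swapping `phi_toy` for a WPH operator recovers the spirit of the method: the statistics of u plus each noise realization are driven toward those of the observed map.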
In this work, the φ operator computes WPH statistics. These statistics are designed to characterize the coherent structures that appear in non-Gaussian random fields, by quantifying the phase alignment between different scales (Zhang & Mallat 2021; Mallat et al. 2020). Their ability to define a generative model has been studied in particular by Allys et al. (2020), who demonstrate quantitatively that they include most of the information captured by various other statistics, such as the power spectrum, the bispectrum, or the Minkowski functionals. These statistics were applied to real-valued data. Here we extend them to complex-valued polarization maps Q + iU. Statistics derived from this complex variable efficiently characterize the full complexity of the polarization signal (Regaldo-Saint Blancard et al. 2020). We refer readers to Appendix A for a presentation of the WPH statistics and a detailed description of the denoising procedure.

Validation on a simulation
In this section, we assess the performance of our denoising algorithm by applying it to mock data d = s + n, emulating a noisy Q + iU Planck polarization signal. We show figures of merit based on power spectra and probability distribution functions (PDFs) of the increments of the maps. These are only shown for Q, but they are similar for U. We postpone figures of merit based on other kinds of statistics, including the WPH statistics, to a future study.
Here s is a simulated Q + iU map, representing a typical linear polarization signal produced by the thermal emission of dust in the diffuse ISM. It was built from the same magnetohydrodynamic simulation as the one used in Regaldo-Saint Blancard et al. (2020) and following the same procedure. We refer to this paper for more details on the construction of this simulated map. We simulated a Planck observation of dust polarization using Planck instrumental noise maps introduced in Sect. 3 (Planck Collaboration III 2020). We have a total of 300 realizations of this noise at our disposal. We picked one of these, called n, to build d, and we used the remaining M = 299 noise maps, labeled {n 1 , . . . , n M }, for the denoising algorithm.
We define the S/N of d as the ratio of the standard deviations of Re(s) and Re(n) (in the following, the real part of a complex Q + iU map refers to its Q map). We adjusted this S/N by scaling s, so that the impact of the noise "resembles" that on the Planck map presented in Sect. 3. This was not straightforward since the power spectrum of the simulated map s has a different slope than that of the estimate of the power spectrum of the dust emission from the Planck map (see Figs. 2 and 5). We decided to adjust the S/N so that the scale k_i, at which the power spectra of n and s intersect, coincides with the one from the Planck map. Figs. 2 and 5 show that k_i/(2π) ≈ 0.08 px⁻¹. This procedure leads to S/N ≈ 0.4 for the simulated map, which is more than two times lower than the S/N of the observational map discussed in Sect. 3.
We show in Fig. 1 (top) the simulated Q maps corresponding to the truth s (left), the noise n (middle), and the resulting noisy map d (right). The map Re(s) exhibits coherent structures such as filaments and large smooth regions that are characteristic of its non-Gaussian statistical properties. On the other hand, Re(n) seems close to an inhomogeneous white Gaussian noise. The spatial inhomogeneity appears as variations of the local standard deviation due to the Planck satellite scanning strategy. Finally, in Re(d), coherent structures at intermediate and small scales are hard or impossible to identify.
We applied our denoising method to the simulated noisy map d and show, in Fig. 1 (bottom), the resulting denoised Q map Re(s̃) (left) next to the difference between the noisy and denoised maps Re(d − s̃) (middle) and the denoising error Re(s − s̃) (right).
The map Re(s̃) shows that the noise level has been drastically reduced and that we were able to recover the filamentary structure down to a minimum scale. We can identify the smooth regions of Re(s) even if some visible noise remains. The similarities between Re(d − s̃) and Re(n) are striking. The local variations of the standard deviation of the noise are clearly recovered, demonstrating that the inhomogeneity of the noise is not an issue for our method. Finally, Re(s − s̃) exhibits some remaining structures that match the thinnest filaments appearing in Re(s), on top of a more diffuse background. This indicates that down to a minimum scale, below which the algorithm struggles to recover features, most of the structures are efficiently reconstructed.

Figure 2 compares the power spectra of the six maps shown in Fig. 1 plus the cross-spectrum between the denoised and truth maps. These power spectra were estimated by computing the mean square moduli of the Fourier components of the maps binned on linearly spaced isotropic wavenumber bins, and the uncertainties correspond to the standard deviation of the mean. The cross-spectrum was computed similarly, except for the bins above k/(2π) = 0.14 px⁻¹, which were logarithmically spaced in order to lower the statistical variance. We first point out that the power spectrum of d coincides with the sum of those of s and n because of the statistical independence between s and n. The power spectra of Re(s) and Re(s̃) are in very good agreement with each other up to 0.18 px⁻¹, at which scale the noise power is ten times that of the signal. At smaller scales, where the noise dominates the signal even more, our algorithm is not able to retrieve the true power spectrum but gets closer to it. The power spectrum of Re(d − s̃) coincides with that of Re(n) at small scales, and this agreement progressively worsens toward larger scales.
This large-scale behavior shows that our algorithm does not remove the already negligible noise, as shown by the superposition of the power spectrum of Re(s − s̃) and that of Re(n) at these scales. The cross-spectrum s̃ × s is slightly below the power spectrum of s at intermediate scales, and this discrepancy increases toward the smallest scales. At intermediate scales, where the power spectrum of Re(s̃) matches that of Re(s), we suspect this discrepancy to stem from differences between the phases of the Fourier components of s and s̃, which would deserve further quantification. Nevertheless, the production of a denoised map whose power spectrum coincides with that of s, even where n is ten times more powerful than s, and that retains a significant correlation with s is a striking success of our method.
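The binned power-spectrum estimator described above (mean square moduli of the Fourier components in isotropic wavenumber bins, with the standard error of the mean as uncertainty) can be sketched in numpy; the function name and binning choices are ours, not the paper's code.

```python
import numpy as np

def isotropic_power_spectrum(x, n_bins=64):
    """Bin-averaged power spectrum of a 2D map.

    Returns the bin centers k/(2*pi) in px^-1, the mean square modulus of
    the Fourier components in each bin, and the standard error of the mean.
    """
    H, W = x.shape
    power = np.abs(np.fft.fft2(x)) ** 2 / (H * W)
    ky, kx = np.meshgrid(np.fft.fftfreq(H), np.fft.fftfreq(W), indexing="ij")
    k = np.hypot(kx, ky)
    edges = np.linspace(0.0, 0.5, n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    ps, err = np.full(n_bins, np.nan), np.full(n_bins, np.nan)
    for i, (a, b) in enumerate(zip(edges[:-1], edges[1:])):
        sel = power[(k >= a) & (k < b)]
        if sel.size:
            ps[i] = sel.mean()
            err[i] = sel.std() / np.sqrt(sel.size)
    return centers, ps, err
```

A cross-spectrum estimate follows the same pattern with Re(f1 * conj(f2)) in place of |f|².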
To better characterize the non-Gaussianity of s̃, we computed the PDFs of the increments of Q for the noisy, denoised, and truth maps, and we plot them in Fig. 3 for three scalar lags. The increment δQ_l(r) for a scalar lag l at a position r is the set of differences Q(r) − Q(r + ℓ) over the lag vectors ℓ with l ≤ |ℓ| < l + 1 in pixel units. These statistics are usually computed on velocity maps, the tails of the PDFs characterizing the intermittent dissipation of turbulence in the ISM (Hily-Blant et al. 2008; Hily-Blant & Falgarone 2009). Contrary to the case of the noisy map, the distributions of increments for Re(s) are far from Gaussian for every lag. This is a clear signature of the non-Gaussianity of the data, as we expect Gaussian-distributed increments for homogeneous Gaussian data. Our method recovers these statistics and their non-Gaussian tails with limited distortion for each lag, demonstrating its efficiency in retrieving non-Gaussianity in the data. A more thorough characterization of this non-Gaussianity is postponed to further studies.

Fig. 3. PDFs of the increments of Q for d (noisy), s (truth), and s̃ (denoised), computed for three logarithmically spaced lags going from 2 to 32 pixels. For each lag, the increments were normalized by the standard deviation σ_l of the Gaussian that fits the core of the PDF of the noisy map. In the bottom panels, we also show the ratio of the PDFs of s̃ with those of s for each lag.
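The increment statistic defined above can be sketched in a few lines of numpy. This is an illustrative implementation under stated simplifications (periodic boundaries, and the plain standard deviation as a proxy for the Gaussian core fit used in the paper).

```python
import numpy as np

def increments(Q, lag):
    """Pool the differences Q(r) - Q(r + l) over all integer lag vectors l
    with lag <= |l| < lag + 1 (periodic boundaries for simplicity)."""
    out = []
    for dx in range(-lag - 1, lag + 2):
        for dy in range(-lag - 1, lag + 2):
            if lag <= np.hypot(dx, dy) < lag + 1:
                out.append((Q - np.roll(Q, (dy, dx), axis=(0, 1))).ravel())
    return np.concatenate(out)

def increment_pdf(Q, lag, bins=100, sigma=None):
    """PDF of the increments for one scalar lag, normalized by sigma
    (here the sample std if not provided)."""
    delta = increments(Q, lag)
    sigma = delta.std() if sigma is None else sigma
    hist, edges = np.histogram(delta / sigma, bins=bins, density=True)
    return 0.5 * (edges[:-1] + edges[1:]), hist
```

Passing the σ_l of the noisy map to `increment_pdf` for all three maps reproduces the common normalization used in Fig. 3.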

Presentation of the data
We now apply our denoising method to a Q + iU polarization map of the Chamaeleon-Musca region observed at 353 GHz with the Planck satellite (PR3 data, Planck Collaboration III 2020). At this frequency, the CMB is negligible and the dominant components are the dust emission and the noise (see Fig. 34 of Planck Collaboration IV 2020). We consider the Q and U maps corresponding to the full mission and those corresponding to the two half-missions, as well as the 300 end-to-end simulated Q and U maps of the noise and the systematics of the instrument for the full mission (the ones used in Sect. 2). We projected all of these maps on 512 × 512 grids with a pixel size of 2.35′, centered on the region of the Chamaeleon-Musca clouds at Galactic coordinates (l, b) = (300.26°, −16.77°). We used a Gnomonic projection through the HEALPix/healpy package (Górski et al. 2005; Zonca et al. 2019). In Fig. 4, we show the projected full-mission Q map that we aim to denoise (left), as well as the denoised map discussed in the next section (middle). The same maps for U are given in Fig. B.2 in Appendix B.

Denoising results
We applied the denoising method presented in Sect. 2 to the full-mission Q + iU map, using the corresponding 300 noise maps. Figures 4 and B.2 (middle) show the resulting denoised Q and U maps, respectively. The overall noise level has been clearly mitigated, although subtle residuals of the patterns due to the scanning strategy remain, and we can now discern a more complex layout of structures even in the regions where the signal is weak. In Fig. 5, we show a power spectrum analysis of the denoising of the Q map, comparing the power spectra of the noisy and denoised maps Re(d) and Re(s̃), of one noise map Re(n), and of the difference between the noisy and denoised maps Re(d − s̃), as well as the cross-spectrum between the two half-mission maps. This cross-spectrum gives an estimate of the power spectrum of the truth signal since the noise signals are independent. It is satisfactory to see that the power spectrum of Re(s̃) is consistent with this cross-spectrum for scales down to k/(2π) ∼ 3.4 deg⁻¹. Also, Re(d − s̃) behaves similarly to what we observed in Sect. 2.2, with a power spectrum consistent with that of Re(n) when the noise has a larger power than the signal, falling below what is expected when the emission of the dust begins to dominate. Similar results are obtained for U.

Figure 6 compares the PDFs of the increments of Q for the noisy and denoised maps, and for the same three scalar lags as in Fig. 3. Based on the results of Sect. 2, we expect the PDFs of the denoised map to be good estimates of the statistics of the truth map. These results show important differences between the noisy and denoised maps that are again a signature of the non-Gaussianity of the truth signal. We find similar results for the increments of U.
The literature on image denoising is abundant (see, for instance, Buades et al. 2010; Dabov et al. 2007; Zhang et al. 2017), and a thorough comparison of our method with other denoising algorithms would be beyond the scope of this paper. Nevertheless, to give some elements of comparison, we compare our method in Appendix B to the following: (1) Wiener filtering and sampling methods, which are widely used in astrophysics; and (2) the GNILC method (Planck Collaboration Int. XLVIII 2016), which was used on Planck polarization data and provides a local smoothing kernel in order to optimally remove the noise. For the Wiener approach, we find that neither the Wiener-filtered image nor realizations drawn from the Wiener posterior distribution are able to retrieve the PDFs of the increments, even when the true power spectrum is given as input. The comparison with GNILC shows that our method recovers the true power spectrum and the PDFs of the increments better.

Conclusion and prospects
We have introduced a new method for the denoising of Planck polarization data based on the generation of random synthetic maps from WPH statistics that characterize the non-Gaussianity of the dust emission. This method takes advantage of the strong statistical differences between the signal of interest (non-Gaussian and regular) and the noise (close to Gaussian and irregular) by performing an optimization that constrains the statistical properties of the denoised map plus noise realizations.
We applied our method to mock Q + iU noisy data designed to emulate typical Planck polarization maps of dust in the diffuse ISM. The denoised map has a power spectrum that coincides with the true power spectrum down to a minimum scale where the power of the noise is ten times that of the signal, while being highly correlated with the truth signal. It recovers the PDFs of the increments for various isotropic lags, demonstrating that our method is able to retrieve non-Gaussianity in the data. Finally, we applied this method to a 353 GHz Planck observation of the Chamaeleon-Musca field.
Our method has been introduced as a statistical denoising method, but it should be more generally applicable to component separation problems. In particular, we expect this method to be efficient at disentangling dust emission from the CMB, cosmic infrared background, and noise in Planck maps, provided that we have relevant models of each of these contaminants at our disposal. Therefore, it could hopefully enhance the scientific outcome of Planck data.
One of the main motivations of our work is to build a generative model of the dust polarization signal that takes into account the non-Gaussianity of the data. Such a model derived from the analysis of Planck data may be used to simulate the dust polarization sky. Our denoising algorithm constitutes a step toward this modeling, in the sense that WPH statistics of the dust emission, corrected to a first approximation from noise contamination, may be computed from the denoised map. A natural extension of this work would be to quantitatively assess how well we can recover the WPH statistics of the truth map with this approach. This will be the subject of a future paper.
Fig. A.1. Real part (left) and imaginary part (right) of a 512 × 512 bump-steerable wavelet ψ_{6,π/4}. The wavelet is centered in the middle of the map for better visualization.

Appendix A: The WPH statistics
In this appendix, we introduce the WPH statistics used in this paper, referring to Allys et al. (2020) and Mallat et al. (2020) for additional details. We also describe the detailed denoising procedure introduced in Sect. 2.1. We call X a given random field and x one of its realizations.

Appendix A.1: Bump-steerable wavelets
The WPH statistics rely on a bank of wavelets that are dilations and rotations of a mother bump-steerable wavelet ψ, defined in Fourier space as follows (Mallat et al. 2020):

ψ̂(k) = exp( −(k − ξ_0)² / (ξ_0² − (k − ξ_0)²) ) · 1_{(0, 2ξ_0)}(k) × cos^{L−1}(arg(k)) · 1_{[0, π/2]}(|arg(k)|), (A.1)

with k = |k|, 1_A(x) being the indicator function that returns 1 if x ∈ A and 0 otherwise, and ξ_0 = 0.85π being the central wavenumber of the mother wavelet. The dilated and rotated wavelets are defined as

ψ_{j,θ}(r) = 2^{−2j} ψ(2^{−j} R_{−θ} r), (A.2)

with j being the index of the dilation by a factor 2^j, θ being the angle of rotation, and R_θ the rotation by an angle θ. We call ξ_{j,θ} = 2^{−j} ξ_0 u_θ, with u_θ = cos θ u_x + sin θ u_y, the central wavevector of the wavelet obtained by such a transform. In practice, we consider J dilation indices j ranging from 0 to J − 1, and we divide 2π into 2L evenly spaced rotation angles θ. The bank of wavelets is thus the set of 2JL wavelets {ψ_{ξ_{j,θ}} : j = 0, ..., J − 1; θ = 0, π/L, ..., (2L − 1)π/L}, and these wavelets cover most of Fourier space with their respective band-passes.
In this paper, we work with 512×512 maps and choose J = 8 and L = 8. We show in Fig. A.1 one example wavelet from the bank.
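Assuming the bump profile of Eq. (A.1), one wavelet of the bank can be built directly in Fourier space. This is a sketch under that assumption; the exact normalization and conventions of PyWPH may differ.

```python
import numpy as np

def bump_steerable_wavelet(N, j, theta, L=8, xi0=0.85 * np.pi):
    """Fourier-space wavelet psi_{j,theta} on an N x N grid: the mother
    wavelet evaluated at 2^j R_{-theta} k (peak at 2^{-j} xi0 u_theta)."""
    k1d = 2 * np.pi * np.fft.fftfreq(N)   # angular wavenumbers
    ky, kx = np.meshgrid(k1d, k1d, indexing="ij")
    # Rotate by -theta and dilate by 2^j.
    kxr = 2.0 ** j * (np.cos(theta) * kx + np.sin(theta) * ky)
    kyr = 2.0 ** j * (-np.sin(theta) * kx + np.cos(theta) * ky)
    k = np.hypot(kxr, kyr)
    arg = np.arctan2(kyr, kxr)
    # Radial bump supported on 0 < k < 2 xi0, centered on xi0.
    radial = np.zeros_like(k)
    inside = (k > 0) & (k < 2 * xi0)
    u = (k[inside] - xi0) ** 2
    radial[inside] = np.exp(-u / (xi0 ** 2 - u))
    # Angular window: cos^(L-1) restricted to |arg(k)| <= pi/2.
    angular = np.where(np.abs(arg) <= np.pi / 2,
                       np.cos(arg) ** (L - 1), 0.0)
    return radial * angular
```

Its inverse FFT gives the complex-valued wavelet shown in Fig. A.1.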

Appendix A.2: WPH moments
The WPH moments of X are covariances of the phase harmonics of the wavelet transform of X. With {ψ_{ξ_1}, ..., ψ_{ξ_N}} our bank of wavelets labeled by their central wavevectors ξ_i, the wavelet transform of X corresponds to the set {X ⋆ ψ_{ξ_1}, ..., X ⋆ ψ_{ξ_N}}, where ⋆ denotes the convolution operation. These convolutions amount to a local band-pass filtering of X on the scales probed by each of the wavelets. For a homogeneous X, the WPH moments of X are

C_{ξ_1,p_1,ξ_2,p_2}(τ) = Cov( [X ⋆ ψ_{ξ_1}(r)]^{p_1}, [X ⋆ ψ_{ξ_2}(r + τ)]^{p_2} ), (A.3)

with z ↦ [z]^p = |z| · e^{ip arg(z)} being the phase harmonic operator. These WPH moments are able to capture interactions between different scales of X thanks to the phase harmonic operator. Indeed, the covariance between X ⋆ ψ_{ξ_i} and X ⋆ ψ_{ξ_j} vanishes when the wavelets ψ_{ξ_i} and ψ_{ξ_j} have nonintersecting band-passes, and it is otherwise a function of the power spectrum of X only. With proper p_i and p_j indices, the phase harmonic operator can make [X ⋆ ψ_{ξ_i}]^{p_i} and [X ⋆ ψ_{ξ_j}]^{p_j} comparable in the sense that they share common spatial frequencies, allowing an extraction of high-order information through their covariance.
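The phase harmonic operator and an empirical WPH moment can be sketched as follows. This is illustrative code (periodic convolutions computed by FFT, wavelets given as Fourier-space arrays), not the PyWPH implementation.

```python
import numpy as np

def phase_harmonic(z, p):
    """[z]^p = |z| * exp(i p arg(z)): keeps the modulus, multiplies the
    phase by p (nonlinear for p != 1; p = 0 gives the modulus)."""
    return np.abs(z) * np.exp(1j * p * np.angle(z))

def wph_moment(x, psi_hat_1, p1, psi_hat_2, p2, tau=(0, 0)):
    """Empirical Cov([x*psi_1]^p1(r), [x*psi_2]^p2(r+tau)) for a periodic
    map x, with wavelets psi_hat_* given in Fourier space."""
    c1 = np.fft.ifft2(np.fft.fft2(x) * psi_hat_1)   # band-pass filterings
    c2 = np.fft.ifft2(np.fft.fft2(x) * psi_hat_2)
    h1 = phase_harmonic(c1, p1)
    h2 = np.roll(phase_harmonic(c2, p2), tau, axis=(0, 1))
    return np.mean((h1 - h1.mean()) * np.conj(h2 - h2.mean()))
```

For example, multiplying the phase by p = 2 maps z = i (phase π/2) to phase π, which is how harmonics of a large-scale band can be brought to the frequencies of a smaller-scale band.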
We discretize the τ variable similarly to Allys et al. (2020). This introduces, for each wavelet ψ_{j,θ}, a grid of vectors τ_{n,α}, with a radial index n = 0, ..., ∆_n setting a length that scales with the dilation factor 2^j, and an orientation angle α ∈ {−π/4, 0, π/4, π/2}. To avoid a redundancy in the information contained in the coefficients, we discarded the translations for which n > min(J − 1 − j, ∆_n). Allys et al. (2020) identified a relevant set of WPH moments to build a generative model of real-valued simulated data of the large-scale structure of the Universe, which is highly non-Gaussian at late times and at scales smaller than 100 h⁻¹ Mpc. In the present work, we adopted the same corresponding WPH statistics and applied them similarly to complex-valued data, except for the scaling moments discussed below. In practice, we estimate a set of moments that can be divided into the following five categories:
– the S^(1,1) moments, of the form C_{ξ,1,ξ,1}, at every τ_{n,α} with ∆_n = 2. They capture the power spectrum information of the ξ band-pass;
– the S^(0,0) moments, of the form C_{ξ,0,ξ,0}, at every τ_{n,α} with ∆_n = 2. They capture information related to the sparsity of the data in the ξ band-pass;
– the S^(0,1) moments, of the form C_{ξ,0,ξ,1}, at τ = 0 only. They capture information related to the couplings between the scales of the same ξ band-pass;
– the C^(0,1) moments, of the form C_{ξ_1,0,ξ_2,1}, considering 0 ≤ j_1 < j_2 ≤ J − 1 and |θ_1 − θ_2| ≤ π/2, at every τ_{n,α} with ∆_n = 2 when θ_1 = θ_2 and at τ = 0 when θ_1 ≠ θ_2. They capture information related to the correlation between local levels of oscillation for the scales in the ξ_1 and ξ_2 band-passes;
– the C^phase moments, which capture the phase alignment between the scales in two different band-passes (we refer readers to Allys et al. 2020 for their precise definition).
We call C̃_{ξ_1,p_1,ξ_2,p_2}(τ) the normalized estimates of the WPH moments C_{ξ_1,p_1,ξ_2,p_2}(τ). In practice, they are defined for a map x as

C̃_{ξ_1,p_1,ξ_2,p_2}(τ) = ⟨ ([x ⋆ ψ_{ξ_1}(r)]^{p_1} − μ_{ξ_1,p_1}) · conj([x ⋆ ψ_{ξ_2}(r + τ)]^{p_2} − μ_{ξ_2,p_2}) ⟩ / (σ_{ξ_1,p_1}(x_0) σ_{ξ_2,p_2}(x_0)), (A.4)

where the brackets ⟨·⟩ stand for a spatial mean on the r variable, conj denotes the complex conjugate, μ_{ξ,p} is the spatial mean of [x ⋆ ψ_ξ]^p, and σ_{ξ,p}(x_0) is the standard deviation of [x_0 ⋆ ψ_ξ]^p computed on a reference map x_0. These estimates depend on the reference map x_0. We found that the choice of x_0 can have a significant impact on the results of our denoising method, as explained below.

Appendix A.3: Scaling moments
Similarly to Allys et al. (2020), we complement these statistics with a small number of coefficients, the estimates of the so-called scaling moments L_{j,p}, which better constrain the largest scales that are not probed by the WPH moments. These moments are constructed from a bank of scaling functions {ϕ_j}_{0≤j≤J−1}, which correspond to dilations of an isotropic Gaussian filter ϕ defined in Fourier space by

ϕ̂(k) = exp( −|k|² / (2σ²) ), (A.5)

with σ = 0.496 × 2^{−0.55} ξ_0. For a real-valued random field X, the scaling moments of X are

L_{j,p} = Cov( [(X − E[X]) ⋆ ϕ_j(r)]^p, [(X − E[X]) ⋆ ϕ_j(r)]^p ). (A.6)

In practice, we only estimated the scaling moments with p ∈ {0, 1, 2, 3} and 2 ≤ j ≤ J − 2. We define the corresponding normalized estimates L̃_{j,p}(x) computed for a map x by replacing the expected values with spatial means and normalizing with the corresponding statistics of a reference map x_0. These normalized estimates depend on the same reference map x_0 introduced for the estimates of the WPH moments of X. As we deal with complex maps x in this work, we computed these estimates for their real and imaginary parts separately.
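One plausible reading of the scaling moments can be sketched in numpy. Both the Gaussian filter form and the phase-harmonic variance convention below are assumptions of this sketch (the garbled source equations admit other conventions), so treat it as illustrative.

```python
import numpy as np

def gaussian_filter_hat(N, j, xi0=0.85 * np.pi):
    """Fourier transform of the dilated isotropic Gaussian phi_j on an
    N x N grid, with the sigma value quoted in the text."""
    sigma = 0.496 * 2.0 ** (-0.55) * xi0
    k1d = 2 * np.pi * np.fft.fftfreq(N)
    ky, kx = np.meshgrid(k1d, k1d, indexing="ij")
    k = np.hypot(kx, ky)
    return np.exp(-(2.0 ** j * k) ** 2 / (2 * sigma ** 2))

def scaling_moment(x, j, p):
    """Variance of the p-th phase harmonic of the mean-subtracted,
    low-pass-filtered (square, periodic) map."""
    low = np.fft.ifft2(np.fft.fft2(x - x.mean())
                       * gaussian_filter_hat(x.shape[0], j))
    h = np.abs(low) * np.exp(1j * p * np.angle(low))  # phase harmonic [.]^p
    return np.mean(np.abs(h - h.mean()) ** 2)
```

Note that with this convention the p = 0 moment is the variance of the modulus of the filtered map, which is nontrivial.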

Appendix A.4: Denoising procedure
We applied our denoising method to a simulated noisy map d in two steps. We performed a first denoising of d using a φ operator that only takes into account the estimates of the S^(1,1) moments, related to the power spectrum information, and the estimates of the scaling moments, normalizing the estimates with u_0 = d. This first step yields a map s̃_0 whose power spectrum is much more consistent with that of the truth map s. Then, we performed a second denoising of d using a φ operator that includes the whole set of estimates of WPH moments and scaling moments, but now normalizing these estimates with u_0 = s̃_0. We call s̃ the output of this second step. This particular choice for the normalization of the estimates has proven to be more efficient at retrieving the power spectrum and the PDFs of the increments of the truth map s. Table A.1 gives the resulting number of statistical coefficients per class of moments computed for a complex 512 × 512 pixel map x and for the chosen parameters.

Table A.1. Number of statistical coefficients per class of moments.
S^(1,1): 960 | S^(0,0): 960 | S^(0,1): 128 | C^(0,1): 6336 | C^phase: 2752 | L_{j,p}: 40 | Total: 11176

In this paper, the φ operator used for the second step of the procedure computes a set of 11176 statistical coefficients (of which 40 correspond to the estimates of the scaling moments), which corresponds to approximately 4% of the number of pixels of the input map.

Appendix B: Comparison to other methods
Appendix B.1: Wiener filtering and posterior sampling

Wiener filtering is a typical approach to signal inference for problems of the form given by Eq. 1. In this case, the Wiener-filtered reconstruction s̃_W (Wiener 1949; Zaroubi et al. 1995) is given by

s̃_W = W d = S (S + N)⁻¹ d, (B.1)

where S = ⟨s s†⟩ and N = ⟨n n†⟩ are the signal and noise covariance matrices, respectively. For brevity, we have assumed ⟨s⟩ = ⟨n⟩ = 0. This is the linear filter that gives the expected minimum variance of the residuals (W d − s) for known S and N. Furthermore, it is both the maximum a posteriori and the posterior mean solution if the noise and signal are drawn from Gaussian distributions. That is, if the real-valued data are distributed with a likelihood

p(d | s) ∝ exp( −(1/2) (d − s)† N⁻¹ (d − s) ), (B.2)

and the prior distribution for the real-valued signal is

p(s) ∝ exp( −(1/2) s† S⁻¹ s ), (B.3)

then the Wiener posterior distribution is given by

p(s | d) ∝ exp( −(1/2) (s − s̃_W)† (S⁻¹ + N⁻¹) (s − s̃_W) ). (B.4)

For the problem described in this work, the noise covariance N can be estimated from the samples of noise realizations, and the signal covariance S can either be assumed or jointly estimated from the data (Wandelt et al. 2004). We avoided inverting the dense W matrix by implementing a messenger field approach, first described by Elsner & Wandelt (2013) and now widely used in cosmology (e.g., Jasche & Lavaux 2015; Alsing et al. 2017; Jeffrey et al. 2018a,b; Kodi Ramanah et al. 2019). We assumed that the underlying signal is homogeneous and isotropic, such that the signal covariance S is diagonal in harmonic space, with elements corresponding to the one-dimensional power spectrum. The uncorrelated noise gives a diagonal noise covariance N in pixel space, so that the messenger field can efficiently iterate between harmonic and pixel space.
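In the idealized case where both covariances are diagonal in Fourier space (stationary signal and white noise), the Wiener filter and posterior sampling reduce to per-mode operations, which can be sketched in numpy. This diagonal shortcut stands in for the messenger-field solver, which handles the realistic mixed-basis case; all names are illustrative.

```python
import numpy as np

def wiener_filter_fourier(d, signal_ps, noise_ps):
    """s_W = S (S + N)^{-1} d with both covariances diagonal in Fourier
    space; signal_ps and noise_ps are per-mode powers E|s_hat(k)|^2."""
    gain = signal_ps / (signal_ps + noise_ps)
    return np.real(np.fft.ifft2(gain * np.fft.fft2(d)))

def sample_wiener_posterior(d, signal_ps, noise_ps, rng):
    """Draw one map from the Gaussian posterior p(s|d): mean s_W and
    per-mode variance (S^-1 + N^-1)^-1."""
    H, W = d.shape
    mean_hat = (signal_ps / (signal_ps + noise_ps)) * np.fft.fft2(d)
    var = 1.0 / (1.0 / signal_ps + 1.0 / noise_ps)
    # The FFT of unit white noise has per-mode variance H*W; rescale so
    # the fluctuation carries the posterior per-mode variance.
    white_hat = np.fft.fft2(rng.normal(size=(H, W)))
    fluct_hat = np.sqrt(var / (H * W)) * white_hat
    return np.real(np.fft.ifft2(mean_hat + fluct_hat))
```

With flat spectra S = 4 and N = 1, the filter reduces to multiplying the map by the constant gain 0.8, which makes the shrinkage toward zero explicit.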
As we are concerned with polarization data, we used a spin-2 harmonic transformation between {s Q , s U } and { s E , s B }, using the curl-free E-mode and divergence-free B-mode representation. The signal covariance in harmonic space is therefore a concatenation of {S E , S B }. This formulation preserves the relevant Q-U correlation.
Even if the Wiener-filtered reconstructed signal s̃_W is the maximum of the posterior distribution p(s|d), functions f(s̃_W) do not correspond to the maximum of p(f(s)|d). For example, the two-point statistics (e.g., variance or power spectra) of s̃_W are not unbiased estimates of the power spectrum of s. Instead, we can draw sample images s_i from the posterior p(s|d) (Eq. B.4), so that the transformed samples f(s_i) are correctly distributed according to p(f(s)|d). These realizations are generated by amending the messenger field algorithm (Elsner & Wandelt 2013). Fig. B.1 shows the Wiener-filtered reconstruction of the simulated Q map (left panel). The low-amplitude power spectrum of the residual map Re(s − s̃_W) demonstrates that the pixel values are close to the truth, whereas the power spectrum of Re(s̃_W) is biased low (center panel), which is as expected. To compare the methods, we compared with functions of samples s_i from the Wiener posterior. Though not plotted here, the realizations s_i did have the correct power spectra with relatively small sample variance, as we input the true power spectrum (which was not done for the WPH denoising method).
As a goal of this denoising work is to retain the statistical non-Gaussianity in the signal, which could be represented in the WPH coefficients, we again compared the PDFs of the increments δQ_l. The right panel of Fig. B.1 shows the comparison between the data (d), the signal (s), the Wiener-filtered (mean) map (s̃_W), and a realization (s̃_{W,sample}) drawn from the Wiener posterior. We see that neither the Wiener mean map s̃_W nor the sample s̃_{W,sample} captures the non-Gaussianity in terms of δQ_l increments as successfully as our WPH denoising method. Indeed, for |δQ_l|/σ_l ≳ 1, there is a clear divergence from the true PDF, showing that the tails of the true statistics are not recovered at all. Furthermore, inspection of the WPH coefficients of the Wiener posterior samples φ(s_i) shows them to be noticeably further from those of the truth map than φ(s̃). These preliminary results are encouraging, but not surprising. The Gaussian prior distribution in the statistical model of the Wiener posterior leads to a poor recovery of the non-Gaussian structures that are intrinsic to the polarized dust emission.

Appendix B.2: GNILC
The GNILC method is a wavelet-based component separation method that is designed to extract the emission of the Galactic foregrounds from the Planck full-sky maps. It has been applied to polarization maps Q and U (Planck Collaboration IV 2020; Planck Collaboration XII 2020) to disentangle the thermal dust polarization emission from the CMB polarization and instrumental noise over the entire sky. We show in Figs. 4 and B.2 (right column) the resulting Q and U maps, respectively, for the Chamaeleon-Musca field observed at 353 GHz. We note that at this frequency, the CMB can be safely ignored, so that the main effect of the GNILC algorithm is to denoise the dust emission. In these maps, we clearly see that the smallest scales, most contaminated by the noise, have been filtered out. Compared to GNILC,