BeyondPlanck VII. Bayesian estimation of gain and absolute calibration for CMB experiments

We present a Bayesian calibration algorithm for CMB observations as implemented within the global end-to-end BeyondPlanck (BP) framework, and apply it to the Planck Low Frequency Instrument (LFI) data. Following the most recent Planck analysis, we decompose the full time-dependent gain into a sum of three orthogonal components: one absolute calibration term, common to all detectors; one time-independent term that can vary between detectors; and one time-dependent component that is allowed to vary between one-hour pointing periods. Each term is then sampled conditionally on all other parameters in the global signal model through Gibbs sampling. The absolute calibration is sampled using only the orbital dipole as a reference source, while the two relative gain components are sampled using the full sky signal, including the orbital and Solar CMB dipoles, CMB fluctuations, and foreground contributions. We discuss various aspects of the data that influence gain estimation, including the dipole/polarization-quadrupole degeneracy and anomalous jumps in the instrumental gain. Comparing our solution to previous pipelines, we find good overall agreement, with relative deviations of -0.67% (-0.84%) at 30 GHz, 0.12% (-0.04%) at 44 GHz, and -0.03% (-0.64%) at 70 GHz with respect to Planck DR4 (Planck 2018). These deviations are within expected error bounds, and we attribute them to differences in data usage and general approach between the pipelines. In particular, the BP calibration is performed globally, resulting in better inter-frequency consistency. Additionally, WMAP observations are used actively in the BP analysis, which breaks degeneracies in the Planck data set and results in better agreement with WMAP. Although our presentation and algorithm are currently oriented toward LFI processing, the procedure is fully generalizable to other experiments.


Introduction
The cosmic microwave background (CMB) anisotropies are among the most important observables available to cosmologists, and accurate determination of their statistical properties has been a main goal for a multitude of collaborations and experiments during the last three decades (e.g., Smoot et al. 1992; de Bernardis et al. 2000; Kovac et al. 2002; Bennett et al. 2013). A fundamentally important step in any CMB analysis pipeline is photometric calibration: the process of mapping the instrument readout to the incoming physical signal. In general, this procedure involves comparing some specific feature in the measured data with a known calibration model, for instance the CMB dipole or astrophysical foreground signal (or both), or comparing the total measured power with a reference load at a known physical temperature (e.g., Planck Collaboration V 2016).
The multiplicative factor that converts between sky signal and detector readout is called the gain. This factor typically depends on the local environment of the detectors, such as the ambient temperature, and is therefore in principle different for each sample. However, as long as the detectors are thermally stable on reasonably long time scales, it is usually a good approximation to assume that the gain is constant over some short period of time, or at least that it varies smoothly in time. For instance, the WMAP team adopted and fitted a six-parameter model for the gain, using housekeeping data such as focal plane temperature measurements to interpolate in time (Jarosik et al. 2007; Hinshaw et al. 2009; Greason et al. 2012). For LFI, we will assume that the gain factor is constant throughout each Planck pointing period (PID), the timescale during which the satellite scans a given "circle" on the sky; these last roughly an hour each. We will also assume that the gain varies smoothly between neighboring PIDs, except during a small set of events during which the instrument was actively modified by the mission control center, for instance during cooler maintenance.
The Planck LFI Data Processing Centre (DPC; Planck Collaboration II 2014; Planck Collaboration V 2016; Planck Collaboration II 2020) used an onboard 4 K reference load to support the 30 GHz calibration for the early results, while for the other channels, and for all channels in later releases, they relied primarily on the CMB dipole signal. Gain fluctuations and correlations were modelled and suppressed by boxcar averaging over a signal-to-noise-dependent window size. The Planck HFI DPC (Planck Collaboration VIII 2014; Planck Collaboration III 2020) also used the CMB dipole signal for gain estimation, but in this case they assumed a constant gain factor throughout the whole mission, relying on the excellent thermal stability of the Planck instrument. Apparent gain variations were instead assumed to arise from non-linearities in the analog-to-digital conversion module, which then allowed for a deterministic correction. A similar approach has also been adopted by the recent SROLL2 re-analysis initiative (Delouis et al. 2019). We refer to the latest DPC maps as Planck 2018.
In the most recent official analysis (Planck DR4;1 Planck Collaboration Int. LVII 2020), the Planck team adopted the LFI gain model for all channels up to and including the 857 GHz channel. A novelty introduced in that analysis, however, was a decomposition of the gain factor into two nearly orthogonal components: an absolute (or baseline) gain factor, which was assumed to be constant for the entire mission, and a detector-specific gain mismatch factor that could vary both in time and between detectors. This approach allowed each component to be estimated separately, using the calibrator best suited to it. For example, the low signal-to-noise (but well-understood) orbital dipole was used to calibrate the absolute gain factor, given the long integration time involved in estimating this particular component. Solving for all relevant factors was then performed jointly with other relevant quantities.
1 Sometimes referred to as NPIPE.
In this paper, we adopt the Planck DR4 approach, and decompose the full gain into the above-mentioned components, and we estimate these jointly with all other parameters in the full model. Thus, the main novel feature presented in this paper is the integration of the gain estimation procedure within a larger Gibbs framework, as summarized by BeyondPlanck (2022), which performs joint estimation of all relevant parameters in a statistically consistent manner, including the CMB and astrophysical foreground sky signal.
The rest of the paper is structured as follows: In Sect. 2, we aim to build intuition regarding gain estimation, presenting the general data model that we use and highlighting various important features of this model, as applied to real-world LFI observations. Next, in Sect. 3 we describe our main gain estimation procedure, before showing results in Sect. 5, and comparing these with those derived by other pipelines. Finally, we summarize in Sect. 6, with an eye toward future experiments and applications.

Data and modelling considerations
We start our presentation with a general discussion of the gain-related data model, and of how to account for various complications that arise when fitting this model to real-world data.

Data model
As described by BeyondPlanck (2022), the main goal of the BeyondPlanck analysis framework is to develop an end-to-end Bayesian analysis platform for CMB data, starting from raw time-ordered data. As for most Bayesian problems, the key step in our approach is therefore to write down an explicit parametric model for the observed data from a given detector, d_{t,i}, where t is the index denoting the sample,2 and i is the index denoting the detector in question. In the current analysis, we adopt the following high-level model,

$$d_{t,i} = g_{t,i}\, s^{\mathrm{tot}}_{t,i} + n^{\mathrm{corr}}_{t,i} + n^{\mathrm{wn}}_{t,i},$$

where n^corr_{t,i} and n^wn_{t,i} are correlated and white noise, respectively, g_{t,i} is the gain factor, and s^tot_{t,i} denotes the total signal. Here, s^tot_{t,i} is given in kelvin, while d_{t,i} is the instrument readout, which is measured in volts, meaning that the unit of g_{t,i} becomes V/K. The total signal can further be decomposed into

$$s^{\mathrm{tot}}_{t,i} = P_{tp,i}\left[ B^{\mathrm{symm}}_{pp',i}\, s^{\mathrm{sky}}_{p',i} + B^{\mathrm{asymm}}_{pp',i}\left( s^{\mathrm{orb}}_{p',i} + s^{\mathrm{fsl}}_{p',i} \right)\right].$$

In this expression, P is the pointing matrix, which contains the mapping between the pointing direction of the instrument, p, and the sample index t; B^symm and B^asymm denote convolution with the symmetric and asymmetric beams, which quantify the fraction of the total signal coming from direction p' when the instrument is pointing towards p; s^sky is the sky signal (including the Solar dipole); s^orb is the orbital dipole (to be discussed below); and s^fsl represents signal leakage through the far sidelobes. For further details regarding any of these objects, we refer the interested reader to BeyondPlanck (2022) and references therein. The main topic of the current paper is the estimation of g_{t,i}. In this respect, it is important to note that all other free parameters in the data model, including s^sky_{t,i} and n^corr_{t,i}, are also unknown, and must be estimated jointly with g_{t,i}. Casting this statement into Bayesian terms, we wish to sample from the posterior distribution,3

$$P(g, s^{\mathrm{tot}}, s^{\mathrm{orb}}, n^{\mathrm{corr}}, \ldots \mid d).$$
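As a concrete illustration, the data model above can be simulated in a few lines. All numbers below (gain value, signal amplitude, noise levels) are invented for illustration, and the AR(1) process merely stands in for the true 1/f correlated noise:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a short TOD stream d = g * s_tot + n_corr + n_wn.
nsamp = 10_000
gain = 31.5                                        # hypothetical gain [V/K]
t = np.arange(nsamp)
s_tot = 3.36e-3 * np.sin(2 * np.pi * t / 600.0)    # dipole-like signal [K]

n_wn = rng.normal(0.0, 1.0e-3, nsamp)              # white noise [V]
n_corr = np.zeros(nsamp)                           # correlated noise as AR(1)
for k in range(1, nsamp):
    n_corr[k] = 0.999 * n_corr[k - 1] + rng.normal(0.0, 1.0e-5)

d = gain * s_tot + n_corr + n_wn                   # readout in volts
```

Because both noise terms are added after the gain is applied, recovering g requires disentangling it from s_tot and n_corr, which is the subject of the rest of this paper.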
That is, we aim to model the global state of the instrument and data, and to map out the probability of various points in parameter space by sampling from this distribution. This may at first glance seem like an intractable problem. However, a central component of the BeyondPlanck framework is parameter estimation through Gibbs sampling. According to the theory of Gibbs sampling, samples from a joint posterior distribution may be drawn by iteratively sampling from all conditional distributions. In other words, when sampling the gain, we may assume that the sky signal and correlated noise parameters are perfectly known. And likewise, when sampling the sky signal or correlated noise parameters, we may assume that the gain is perfectly known. The correlations between these various parameters are then probed by performing hundreds or thousands of iterations of this type. Thus, for the purposes of calibration alone, we do not need to be concerned with many aspects that indirectly affect the gain, such as CMB dipole or correlated noise estimation (Planck Collaboration II 2014; Planck Collaboration V 2016; Planck Collaboration II 2020). Instead, we are here concerned only with defining an adequate model for g, and with expressing it in a way that minimizes degeneracies with other parameters in the Gibbs chain.
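The iterate-over-conditionals idea can be sketched with a deliberately simple toy model: data d = g*s + b + white noise, where the known template s plays the role of the sky signal and the constant offset b stands in for correlated noise. All numbers are illustrative, not LFI values:

```python
import numpy as np

rng = np.random.default_rng(0)

sigma = 0.1
s = np.sin(np.linspace(0, 20, 2000))             # perfectly known signal template
d = 2.0 * s + 0.5 + rng.normal(0, sigma, s.size) # true gain 2.0, true offset 0.5

g, b = 0.0, 0.0                                  # deliberately poor starting point
chain = []
for _ in range(2000):
    # Draw g from P(g | b, d): Gaussian with GLS mean and variance sigma^2/(s^T s)
    r = d - b
    g = (s @ r) / (s @ s) + sigma / np.sqrt(s @ s) * rng.standard_normal()
    # Draw b from P(b | g, d): Gaussian around the mean residual
    r = d - g * s
    b = r.mean() + sigma / np.sqrt(r.size) * rng.standard_normal()
    chain.append((g, b))

g_mean = np.mean([c[0] for c in chain[500:]])    # posterior mean of the gain
b_mean = np.mean([c[1] for c in chain[500:]])
```

After discarding burn-in, the chain averages recover the true gain and offset; the spread of the chain directly provides their marginal uncertainties.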
As discussed above, the gain is generally not constant in time. A very conservative (and somewhat naïve) model would therefore be to assume that the gain is in fact different for every sample t. However, this model clearly does not take into account our full knowledge about the instrument (Planck Collaboration XXVIII 2014). In particular, we do know that the gain is expected to correlate with the detector temperature, and this temperature does not change significantly on timescales of just one sample. Rather, based on available housekeeping data, a good assumption is that the gain is constant within a given pointing period (PID, or scan) -which is defined as a collection of samples measured over a period of about an hour, during which the instrument spins about its axis once per minute while keeping the spin axis vector stationary. Between each scan, the instrument performs a slight adjustment of the satellite spin axis, ensuring that new sky areas are covered in consecutive pointing periods.
To reflect the assumption of constant gain within each scan, we rewrite our data model as follows,

$$d_{t,i} = g_{q,i}\, s^{\mathrm{tot}}_{t,i} + n^{\mathrm{corr}}_{t,i} + n^{\mathrm{wn}}_{t,i}, \qquad (4)$$

where q now denotes PID. Thus, t is used to indicate a specific sample, while q represents a collection of samples. From Eq. (4), we immediately note the presence of two important degeneracies, involving the sky signal and noise, respectively. If we attempt to fit for g, s^tot, and n^corr simultaneously, without knowing anything about any of them, we see that a given solution, say {g_0, s^tot_0, n^corr_0}, will result in an identical goodness-of-fit as another solution {g_1, s^tot_1, n^corr_1}, as long as either

$$g_0\, s^{\mathrm{tot}}_0 = g_1\, s^{\mathrm{tot}}_1$$

or

$$n^{\mathrm{corr}}_1 = n^{\mathrm{corr}}_0 + g_0\, s^{\mathrm{tot}}_0 - g_1\, s^{\mathrm{tot}}_1.$$

In other words, the gain is multiplicatively degenerate with the signal, and additively degenerate with the correlated noise. Such degeneracies are mainly a computational problem: with two degenerate parameters in a Gibbs chain, exploring the resulting distributions takes a much larger number of samples than for uncorrelated parameters. A main topic of this paper is how to break these degeneracies in a statistically self-consistent and computationally efficient manner.
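Both degeneracies are easy to verify numerically; the following minimal sketch constructs two different parameter sets that reproduce the data exactly, and hence have identical goodness-of-fit:

```python
import numpy as np

rng = np.random.default_rng(1)
s0 = np.cos(np.linspace(0, 10, 1000))       # candidate signal solution
g0_, n0 = 1.0, rng.normal(0, 0.1, 1000)     # candidate gain and correlated noise
d = g0_ * s0 + n0                           # "observed" data (white noise omitted)

# Multiplicative degeneracy: double the gain, halve the signal.
g1, s1 = 2.0 * g0_, s0 / 2.0
# Additive degeneracy: shift the mismatch g0*s0 - g1*s1 into the noise term.
n1 = n0 + g0_ * s0 - g1 * s1

# Both solutions reproduce the data exactly.
assert np.allclose(g0_ * s0 + n0, g1 * s1 + n1)
```

Only external information, such as a calibrator of known amplitude or a noise model with a constrained power spectrum, can select between such solutions.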

Absolute versus relative gain calibration
So far, we have been talking about the calibration of a given detector in isolation, which relates to what we call absolute calibration. Absolute calibration refers to correctly determining the "true" value of the gain, and is important for accurately determining the emitted intensity of astrophysical components, such as the CMB. Another closely related concept is relative calibration, 4 which quantifies calibration differences between radiometers. Because of Planck's scanning strategy, which only provides weak cross-linking 5 (Planck Collaboration I 2011), it is impossible to estimate the three relevant Stokes parameters (the intensity, I, and two linear polarization parameters, Q and U) independently for each detector. Rather, the polarization signal is effectively determined by considering pairwise differences between detector measurements, while properly accounting for their relative polarization angle differences at any given time. Therefore, any instrument characterization error that induces spurious detector differences will be partially interpreted by the analysis pipeline as a polarization signal. If our relative gain calibration is wrong, such differences will be introduced.
Given the high sensitivity of current and future CMB experiments, the gain must be estimated to a fractional precision better than O(10 −3 ) for robust CMB temperature analysis, and better than O(10 −4 ) for robust polarization analysis. Accurate relative calibration is thus even more important than accurate absolute calibration, and this will, as discussed in the next section, inform the choices we make on how to estimate these two types of calibration.

The Solar and orbital CMB dipoles
One of the most powerful ways to break the signal/gain degeneracy mentioned in the previous section is to observe a source of known brightness. If that source happens to be significantly stronger than other sources in the same area of the sky, we could fix s^tot_{t,i} in Eq. (4), and the gain would essentially be determined by the ratio of the data to the known source brightness.

[Fig. 1: Comparison of different contributions to the time-ordered data seen by Planck at 30 GHz, for a PID whose orientation is close to perpendicular to the dipole axis. The blue and black curves show the orbital and Solar CMB dipoles, respectively, while the red line shows contributions from small-scale CMB fluctuations and Galactic foregrounds. The gray line shows instrumental noise.]
Unfortunately, the number of available astrophysical calibration sources that may be useful for CMB calibration purposes is very limited, given the stringent requirements discussed in Sect. 2.2. For instance, the brightness temperature of individual planets within the Solar system is only known to about 5 % (Planck Collaboration VIII 2016), while few other local sources are known with a precision better than 1 %.
The key exception is the CMB dipole. The peculiar velocity of the Planck satellite relative to the CMB rest frame induces a strong apparent dipole on the sky due to the Doppler effect. Specifically, photons having an anti-parallel velocity relative to the satellite motion are effectively blue-shifted, while photons with a parallel velocity are redshifted.
It is useful to decompose the peculiar spacecraft velocity into two components: the motion of the Solar system relative to the CMB rest frame, v_solar, and the orbital motion of the Planck satellite relative to the Solar system barycenter, v_orbital. Thus, the total velocity of the satellite relative to the CMB rest frame is v_tot = v_solar + v_orbital. Taking into account the full expression for the relativistic Doppler effect, the induced dipole reads

$$s^{\mathrm{dipole}}(\boldsymbol{x}, t) = T_{\mathrm{CMB}} \left( \frac{1}{\gamma\left(1 - \boldsymbol{\beta} \cdot \boldsymbol{x}\right)} - 1 \right), \qquad (7)$$

where x is a unit vector in the direction of observation, β = v_tot/c, and γ = (1 − |β|²)^{−1/2}. The total dipole is time dependent because of the motion of the satellite over the course of the mission. We can similarly define a Solar dipole, s^solar(x), and an orbital dipole, s^orb(x, t), which are induced by the Solar and orbital velocities alone, respectively. Both dipoles play crucial roles in CMB calibration: the orbital dipole for absolute calibration, and the Solar dipole for relative calibration. Starting with the orbital dipole, we note that this depends only on the satellite's velocity with respect to the Sun. This is exceedingly well measured through radar observations, and known with a precision better than 1 cm s−1 (Godard et al. 2009). Given an orbital speed of 30 km s−1, this results in an overall relative precision better than O(10−6). However, Eq. (7) also depends on the CMB monopole, which is measured by COBE-FIRAS to be 2.72548 K ± 0.57 mK (Fixsen 2009), corresponding to a relative uncertainty of 0.02 %, or O(10−4). Thus, the absolute calibration of any current or future CMB experiment cannot be determined with a higher absolute precision than O(10−4) until a next-generation CMB spectral distortion experiment, for instance PIXIE (Kogut et al. 2011), is launched. Still, this precision is more than sufficient for Planck calibration purposes.
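The precision argument above is simple enough to check with back-of-the-envelope arithmetic (values as quoted in the text; the leading-order dipole amplitude is T_CMB * β):

```python
# Back-of-the-envelope check of the orbital-dipole precision argument.
c = 299_792_458.0        # speed of light [m/s]
T_cmb = 2.72548          # COBE-FIRAS monopole [K] (Fixsen 2009)
dT_cmb = 0.00057         # monopole uncertainty [K]
v_orb = 30_000.0         # orbital speed of the satellite [m/s]
dv_orb = 0.01            # quoted velocity precision, ~1 cm/s

beta = v_orb / c
dipole_orb = T_cmb * beta            # leading-order orbital dipole, ~0.27 mK
rel_err_velocity = dv_orb / v_orb    # ~3e-7: better than O(1e-6)
rel_err_monopole = dT_cmb / T_cmb    # ~2e-4: the limiting O(1e-4) term
```

The velocity uncertainty is thus negligible, and the FIRAS monopole uncertainty sets the absolute-calibration floor.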
The second CMB dipole component corresponds to the Sun's motion with respect to the CMB rest frame. While this velocity is intrinsically unknown, one may estimate this from the relative amplitude of the Solar and orbital dipoles. This is illustrated in Fig. 1, which compares the orbital and Solar dipole signals (blue and black curves) with contributions from Galactic foreground emission and instrumental noise at 30 GHz for about three minutes of time-ordered observations. The Solar dipole is effectively determined by the relative amplitude ratio between the black and blue curves in this figure.
Based on this approach, the most recent Planck analyses have determined that the Solar CMB dipole amplitude is about 3.36 mK, corresponding to a Solar velocity of about 370 km s−1 (Planck Collaboration I 2020; Planck Collaboration Int. LVII 2020). For comparison, large-scale CMB polarization fluctuations typically exhibit variations smaller than O(1 µK) (Planck Collaboration IV 2018), and consequently the relative calibration of different detectors must be better than O(10−4) to avoid significant contamination of the polarization signal by the Solar CMB dipole. Achieving this level of precision in the presence of correlated noise, Galactic foregrounds, far sidelobe contamination, and other sources of systematic uncertainty is the single most difficult challenge associated with large-scale CMB polarization science.
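The O(10−4) requirement follows from a one-line estimate: a relative gain error ε leaks roughly ε times the Solar dipole into detector differences, and hence into polarization (values taken from the text; purely illustrative):

```python
# A relative calibration error epsilon leaks ~epsilon * (Solar dipole)
# into the polarization maps.
dipole_solar_uK = 3360.0   # Solar CMB dipole amplitude [uK]
pol_scale_uK = 1.0         # typical large-scale polarization signal [uK]

epsilon = 1e-4             # relative calibration error
leakage_uK = epsilon * dipole_solar_uK   # ~0.34 uK of leaked dipole
```

Even at ε = 10−4, the leaked dipole is already comparable to the polarization signal itself, which is why relative calibration dominates the error budget.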

The degeneracy between the CMB temperature dipole and polarization quadrupole
In Sect. 2.1 we noted that the gain is multiplicatively degenerate with the signal, and additively degenerate with the noise. Within this broad categorization, there are also some particularly important gain/signal degeneracies that are worth highlighting, and perhaps the most prominent example is that with respect to the CMB polarization quadrupole. To illustrate this, consider the case in which two detectors report different CMB dipole amplitude signals; how could such a difference be explained? One possible explanation is a calibration mismatch, i.e., that the absolute calibration of one or both detectors is mis-estimated.
Another possible explanation could be a large CMB polarization quadrupole signal. Due to the scanning strategy adopted by Planck, in which the same ring is observed repeatedly for one hour, a polarization quadrupole can easily appear with a dipolar signature, depending on the particular phase orientation of the mode in question. This is illustrated in Fig. 2. The thick black line shows the CMB temperature dipole as a function of time, while the thin colored lines show three random polarization quadrupole simulations, all observed with the Planck scanning strategy. Out of the three random quadrupole simulations, two have a time-domain behaviour that very closely mimics the CMB temperature dipole, and in the presence of noise and instrumental effects, it would be exceedingly difficult to distinguish between the two models. The Planck DR4 pipeline therefore explicitly estimates a transfer function to account for this effect (Planck Collaboration Int. LVII 2020).
In the following, we adopt a slightly different strategy: we include the CMB polarization component in the calibration procedure, using the current sample in the Gibbs chain. In order to break the above-mentioned degeneracy, we replace the polarization quadrupole of the CMB map with a random value drawn from the best-fit ΛCDM model, using a value of D^EE_2 = 0.0308827 µK². Thus, we marginalize over this component, and naturally propagate the uncertainties it introduces into the Gibbs samples. We expect the combined effect of this addition to be negligible because of the low value of the ΛCDM prediction, and, as shown by Brilenkov et al. (2022), the gain estimation is unaffected by this marginalization.
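A sketch of this replacement step: the quoted D^EE_2 value is converted to a power spectrum amplitude via C_l = 2π D_l / (l(l+1)), and a random quadrupole is drawn from it. The real-valued five-coefficient draw below is a simplifying assumption for illustration, not the actual harmonic-space convention of the pipeline:

```python
import numpy as np

rng = np.random.default_rng(11)

# Convert D_l to C_l at l = 2: C_l = 2*pi*D_l / (l*(l+1)).
D2_EE = 0.0308827                    # uK^2, value quoted in the text
C2_EE = 2 * np.pi * D2_EE / (2 * 3)  # ~0.032 uK^2

# Draw the five quadrupole coefficients of a replacement E-mode quadrupole.
a2m = rng.normal(0.0, np.sqrt(C2_EE), 5)
```

The small variance of the draw (a fraction of a µK²) is what makes the impact of this marginalization negligible in practice.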

Processing masks and PID selection
The Gibbs sampling framework used by BeyondPlanck requires an explicit parametric model that describes the CMB, foregrounds, and the instrument. If this model turns out to be an insufficient representation of the actual data, the Gibbs sampler will attempt to fit any modelling errors using the parameters at its disposal. Ideally, such unexplained contributions should end up as an excess residual in r = d − g s^tot − n^tot, but in practice they often also contaminate the other model parameters, such as the CMB. The correlated noise, n^corr, is one such parameter that, because of its relatively unconstrained and global structure, ends up absorbing a wide range of modelling errors, as discussed by Ihle et al. (2022). Furthermore, as already noted in Sect. 2.1, there is a tight degeneracy between the correlated noise and the gain, and n^corr is therefore a sensitive monitor for gain errors. Figure 3 illustrates this in terms of one arbitrarily selected 30 GHz n^corr sample from the main BeyondPlanck analysis (BeyondPlanck 2022). The left column shows such a sample in the default model, in which the gain is allowed to vary from PID to PID, while the right column shows the same when enforcing a time-independent gain. While the maps in the left column are visually dominated by scan-aligned random stripes, as expected for n^corr, the maps in the right column (in particular the top row) show large excesses with a dipolar pattern along each Planck scanning ring. This is the archetypal signature of gain modelling errors, and it clearly demonstrates the need for a time-variable gain model. At the same time, there is also a clear quadrupolar pattern in the default configuration, with a positive excess along the Galactic plane and a negative excess near the Galactic poles.
This structure is visually consistent with a near sidelobe modelling residual, in the sense that the Galactic foreground signal is slightly over-smoothed compared to the prediction of the nominal beam model, and the resulting residual is picked up by the correlated noise component. This is not surprising, considering that about 1 % of the full LFI 30 GHz beam solid angle is unaccounted for in the GRASP beam model (Planck Collaboration II 2020), and some of this missing power may be in the near sidelobes. Fortunately, we also see from the same plot that the impact of this effect is modest, and accounts for only about 1 µK at 30 GHz.
More generally, because we have an incomplete understanding of both the instrument and the microwave sky, modelling errors will at some level always be a concern when estimating both gain and correlated noise. Furthermore, these modelling errors will typically be stronger near the Galactic plane or bright compact sources, where foreground uncertainties are large. For this reason, it is customary to apply a processing mask while estimating these quantities, omitting the parts of the sky that are least understood from the analysis. In BeyondPlanck, we define processing masks based on data-minus-signal residual maps for each frequency (Ihle et al. 2022), which are shown in Fig. 4.
In addition, as discussed by Suur-Uski et al. (2022), we also exclude a number of PIDs from the main analysis, for similar reasons as for applying processing masks. Most of these PIDs, however, do not correspond to particularly problematic areas of the sky, but rather to unmodelled instrumental changes or systematic errors, such as cooler maintenance or major satellite maneuvers. Excluded PIDs will show up as gaps in all PID plots in this paper.

Breaking degeneracies through multi-experiment analysis
As described in BeyondPlanck (2022), BeyondPlanck includes as part of its data selection several external data sets that are necessary to break fundamental degeneracies within the model. One particularly important example in this respect is the inclusion of low-resolution WMAP polarization data. In the same way that the WMAP experiment was unable to measure a few specific polarization modes on the sky due to peculiarities in its scanning strategy (Jarosik et al. 2007), Planck is also unable to constrain some modes defined by its own scanning strategy (Planck Collaboration II 2020). However, because the WMAP and Planck scanning strategies are intrinsically very different, their degenerate modes are not the same, and therefore all sky modes may be measured to high precision when analyzing both data sets jointly. This will be explicitly demonstrated in Sect. 5.4, where we compare the BeyondPlanck sky maps to those derived individually from each experiment. The morphology of these frequency difference maps corresponds very closely to the correction templates produced by the WMAP and Planck teams, respectively (Jarosik et al. 2007; Planck Collaboration II 2020), and BeyondPlanck is statistically consistent with both data sets. This agreement is a direct and natural consequence of performing a joint fit, and there is no need for additional explicit template corrections in BeyondPlanck.
At the same time, it is also important to note that only the Planck LFI data are currently modelled in terms of time-ordered data, whereas the WMAP sky maps and noise covariance matrices are analyzed as provided by the WMAP team. Therefore, if there are unknown systematics present in WMAP, those errors will necessarily also propagate into the various BeyondPlanck products. An important near-future goal is therefore also to integrate the WMAP time-ordered data into this framework. This work is already ongoing, as discussed by Watts et al. (2022), but a full WMAP TOD-based analysis lies beyond the scope of the current work.

Methodology
As discussed in Sect. 2.1, our main goal in this paper is to draw samples from P( g | s tot , n corr , d, . . .), the conditional distribution of g given all other parameters. In this section, we describe each of the various steps involved in this process.

Correlated noise degeneracies and computational speed-up
Before we present our main sampling algorithms for g, we recall from Sect. 2.1 that g is additively degenerate with n^corr. In a Gibbs sampling context, strong degeneracies lead to very long Markov correlation lengths as the Gibbs sampler attempts to explore the degenerate space between the two parameters. In order to save computing power and time, it is therefore better to sample g and n^corr jointly, such that for a given iteration of the main Gibbs chain, we instead sample directly from P(g, n^corr | s^tot, d, . . .).6 A joint sample may be produced by means of univariate distributions through the definition of a conditional distribution,

$$P(g, n^{\mathrm{corr}} \mid s^{\mathrm{tot}}, d, \ldots) = P(n^{\mathrm{corr}} \mid g, s^{\mathrm{tot}}, d, \ldots)\, P(g \mid s^{\mathrm{tot}}, d, \ldots).$$

Thus, sampling from the joint distribution P(g, n^corr | s^tot, d, . . .) is equivalent to first sampling g from its marginal distribution with respect to n^corr, P(g | s^tot, d, . . .), and then subsequently sampling n^corr from its conditional distribution with respect to g, P(n^corr | g, s^tot, d, . . .). These two steps must be performed immediately after one another, or else we would introduce an inconsistency in the Gibbs chain with respect to the other parameters. Note that P(n^corr | g, s^tot, d, . . .) is unchanged compared to the original Gibbs prescription, and no changes are required to sample from that particular distribution (see Ihle et al. 2022 for more details on this sampling process). When it comes to P(g | s^tot, d, . . .), we refer to Appendix A.2 of BeyondPlanck (2022), whose sampling equations we will use throughout this paper. We note that the data model used in that appendix has the same general form as Eq. (4), and that sampling from P(g | s^tot, d, . . .) is exactly analogous to what is shown in that appendix, as long as we make the identification n → n^corr + n^wn.
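The marginal-then-conditional scheme can be demonstrated on a toy stand-in for the correlated pair (g, n^corr), here a bivariate Gaussian with strong correlation. One pass through the two univariate draws yields an exact joint sample, with no iteration over the degenerate pair required:

```python
import numpy as np

rng = np.random.default_rng(7)

# Target joint: zero-mean bivariate Gaussian with correlation rho.
rho = 0.9
x = rng.standard_normal(20_000)                    # marginal draw: N(0, 1)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(20_000)  # conditional

emp_rho = np.corrcoef(x, y)[0, 1]                  # empirical correlation ~ rho
```

Had we instead Gibbs-iterated x | y and y | x, the chain's correlation length would grow as 1/(1 − ρ²), which is exactly the cost the joint step avoids.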
As the sum of two independent Gaussian variables (n^corr and n^wn) is itself Gaussian, with a covariance matrix given by the sum of the individual covariance matrices, we can in what follows use the results of the above-mentioned appendix to sample from P(g | s^tot, d, . . .), as long as we let N → N^corr + N^wn.
Computationally speaking, sampling from P(g | s^tot, d, . . .) instead of P(g | s^tot, n^corr, d, . . .) is numerically equivalent to a more expensive noise covariance matrix evaluation.7 To mitigate this additional cost, we note that the gain is assumed to be slowly varying in time, and, in particular, constant within each PID. All time-domain operations may therefore be carried out using co-added low-resolution data with negligible loss of precision. In practice, all TOD operations are in the following carried out at a sample rate of 1 Hz, leading to a computational speed-up of about two orders of magnitude.

6 Although this might seem somewhat counter-intuitive in the context of Gibbs sampling, joint sampling formally corresponds to reparametrizing {g, n^corr} into one parameter in the Gibbs chain.
7 Although not shown here, sampling from P(g | s^tot, n^corr, d, . . .) would follow the exact same procedure, but with a noise covariance matrix given by N^wn instead of N^wn + N^corr. N^wn is a diagonal matrix, while N^corr is not, and since most operations are computationally cheaper when diagonal matrices are involved, the resulting sampling process would also be lighter in that case.
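The co-addition step itself is straightforward bin averaging; a minimal sketch, assuming a native rate of roughly 32.5 Hz (used here purely for illustration):

```python
import numpy as np

def coadd(tod, fsamp_in=32.5, fsamp_out=1.0):
    """Bin-average time-ordered data down to roughly fsamp_out.

    Valid because the fitted gain is constant over timescales much longer
    than one bin, so averaging loses negligible information.
    """
    width = int(fsamp_in // fsamp_out)     # samples per output bin
    n = (len(tod) // width) * width        # drop the incomplete final bin
    return tod[:n].reshape(-1, width).mean(axis=1)

tod = np.ones(325)        # ~10 s of constant fake data at 32.5 Hz
low = coadd(tod)          # co-added ~1 Hz samples
```

Since every downstream operation on the TOD scales with the number of samples, reducing the rate from tens of Hz to 1 Hz directly yields the quoted two-orders-of-magnitude speed-up.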

Absolute gain calibration with the orbital dipole
Next, we also recall from Sect. 2.1 that the gain is multiplicatively degenerate with the total sky signal. At the same time, we note that the orbital CMB dipole is known to exquisite precision, and this particular component is therefore the ideal calibrator for CMB satellite experiments. However, its relatively low amplitude as compared with instrumental noise renders it incapable of tracking short-term gain variations, and, when fitted jointly with astrophysical foregrounds, it is also not sufficiently strong to determine relative calibration differences between detectors at the precision required for large-scale polarization reconstruction. Therefore, to minimize sensitivity to potential residual systematic and modelling errors, it is advantageous to estimate the absolute calibration using the orbital dipole alone, but use the full signal model (including the bright Solar CMB dipole) when estimating relative and time-dependent gain variations.
This motivates splitting the gain into two components,

$$g_{q,i} = g_0 + \gamma_{q,i}, \qquad (9)$$

where g_0 is now independent of both time and detectors, following Planck Collaboration Int. LVII (2020). Our goal is then to use only the orbital CMB dipole to estimate g_0, and later use the full sky signal to estimate γ_{q,i}. Thus, with this reparametrization, we go from sampling from P(g | s^tot, d, . . .) to sampling from P(g_0, γ | s^tot, d, . . .). As usual, drawing samples from this joint distribution can be done by Gibbs sampling, so that we first sample g_0 from P(g_0 | γ, s^tot, d, . . .), and then γ from P(γ | g_0, s^tot, d, . . .). We should note that estimating g_0 using only the orbital dipole formally represents a violation of the Gibbs formalism, as we no longer draw this particular parameter from its exact conditional distribution. This is one of many examples for which we "buy" stability with respect to systematic errors at the price of increased statistical uncertainties. This is similar to the application of a processing mask when estimating the zero-levels of a CMB sky map (e.g., Planck Collaboration Int. XLVI 2016), or to fitting correlated noise parameters using only a sub-range of all available temporal frequencies (e.g., Ihle et al. 2022). In all such cases, parts of the data are disregarded in order to prevent potential systematic errors from contaminating the parameter in question.
For the split in Eq. (9) to be valid, we must ensure that Σ_{q,i} γ_{q,i} = 0, such that γ_{q,i} represents only deviations from the absolute calibration. For technical reasons, it turns out that this is easier to do if we also reparametrize γ_{q,i} as

  γ_{q,i} = ∆g_i + δg_{q,i},    (10)

where ∆g_i represents the detector-specific constant gain, and δg_{q,i} denotes deviations from ∆g_i per scan. We can then separately enforce Σ_i ∆g_i = 0 and Σ_q δg_{q,i} = 0 for each detector i, which is computationally cheaper than enforcing both constraints simultaneously.

(Footnote: [...] matrix given by N_wn instead of N_wn + N_corr; N_wn is a diagonal matrix, while N_corr is not, and since most operations are computationally lighter when diagonal matrices are involved, the resulting sampling process would also be lighter in that case.)
Thus, we split the gain into three nearly independent variables, and explore their joint distribution by Gibbs sampling. The overarching goal for this section, then, is to derive sampling algorithms for each of the three associated conditional distributions,

  P(g_0 | ∆g, δg, s^tot, d, . . .),
  P(∆g | g_0, δg, s^tot, d, . . .),
  P(δg | g_0, ∆g, s^tot, d, . . .).

We now consider each of these in turn.
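To make the three-step Gibbs structure concrete, the following numpy sketch runs the analogous two-step scheme on a toy model with one absolute gain and two detector offsets. All numbers, the white-noise model, and the projection used to enforce the zero-sum constraint are hypothetical illustrations, not the paper's actual implementation (which uses a Lagrange multiplier for the constraint):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy model (hypothetical numbers, not LFI values): two detectors observe the
# same signal s with gains g_i = g0 + dg_i, plus white noise of std `sigma`.
n = 20000
s = np.sin(np.linspace(0, 40 * np.pi, n))            # stand-in for s_tot
g0_true, dg_true = 1.2, np.array([+0.02, -0.02])     # note sum(dg_true) = 0
sigma = 0.5
d = np.array([(g0_true + dg) * s for dg in dg_true])
d += sigma * rng.standard_normal(d.shape)

# Gibbs iterations: draw g0 | dg, then dg | g0, each from its Gaussian
# conditional.  The zero-sum constraint on dg is enforced here by projection.
sts = s @ s                                          # s^T s
g0, dg = 1.0, np.zeros(2)
for _ in range(200):
    # g0 | dg: residual r_i = d_i - dg_i * s obeys the model r_i = g0 * s + n_i
    r = d - dg[:, None] * s
    g0 = (s @ r.sum(axis=0)) / (2 * sts) \
         + sigma / np.sqrt(2 * sts) * rng.standard_normal()
    # dg | g0: residual r_i = d_i - g0 * s obeys the model r_i = dg_i * s + n_i
    r = d - g0 * s
    dg = (r @ s) / sts + sigma / np.sqrt(sts) * rng.standard_normal(2)
    dg -= dg.mean()                                  # enforce sum_i dg_i = 0
```

Even in this toy setting, the degeneracy between the absolute and relative gains is only broken by the zero-sum constraint, which is the same role the constraint plays in the full analysis.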
3.3. Sampling the absolute calibration, g_0

To sample the absolute calibration using the orbital dipole alone, we need to define a data model that depends only on g_0 and s^orb. We do this by first subtracting the full signal model as defined by the current joint parameter state, and then adding back only the orbital dipole term,

  r_i ≡ d_i − (g_0^curr + ∆g_i + δg_{q,i}) s_i^tot + g_0^curr s_i^orb.    (13)

Here g_0^curr denotes the absolute gain at the current iteration in the Gibbs chain, i.e., before drawing a new value for g_0. This residual is then modelled as

  r_i = g_0 s_i^orb + n_i^tot.    (14)
As noted earlier, working with this residual and using the previous sample of g_0 to estimate the current sample does represent a breaking of the Gibbs formalism, since the statistically exact residual for g_0 would be

  r_i ≡ d_i − (∆g_i + δg_{q,i}) s_i^tot.    (15)

However, in this case we would also be calibrating g_0 on the total sky signal instead of just the orbital dipole. Thus, we trade mathematical rigour and statistical uncertainties for stronger robustness with respect to systematic effects. As discussed in Sect. 3.1, the noise term in Eq. (14) includes both correlated and white noise, and the appropriate covariance matrix is therefore N = N_corr + N_wn. Given this, Eq. (14) corresponds to a simple univariate Gaussian model as a function of g_0, and the appropriate sampling algorithm is discussed in Appendix A.2 of BeyondPlanck (2022). Applying that general procedure to our special case, we may write down the following sampling equation for ĝ_0,^8

  ĝ_0 = [Σ_i (s_i^orb)^T N_i^{-1} r_i] / [Σ_i (s_i^orb)^T N_i^{-1} s_i^orb] + η / sqrt(Σ_i (s_i^orb)^T N_i^{-1} s_i^orb),    (16)

where η ∼ N(0, 1) is a random number drawn from a standard normal distribution. Here, and elsewhere, a T superscript indicates the matrix transpose operator. The first term in this equation is the (Wiener-filter) mean of the distribution P(g_0 | r_i, N_i), while the second term ensures that ĝ_0 has the correct variance.
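Once the quantities s^T N^{-1} s and s^T N^{-1} r are accumulated, this sampling step is a one-line Gaussian draw. The sketch below illustrates it for a single detector with diagonal (white) noise, a simplification of the N = N_corr + N_wn covariance used in the paper; all numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_g0(r, s_orb, inv_var):
    """Draw g0 from the univariate Gaussian P(g0 | r) for the model
    r = g0 * s_orb + n, with diagonal noise covariance N = diag(1/inv_var).
    Wiener-filter mean plus a fluctuation term, as in Eq. (16)."""
    stn_s = s_orb @ (inv_var * s_orb)   # s^T N^-1 s
    stn_r = s_orb @ (inv_var * r)       # s^T N^-1 r
    return stn_r / stn_s + rng.standard_normal() / np.sqrt(stn_s)

# Toy check (hypothetical numbers): recover a known gain from noisy data.
n = 50000
s_orb = np.cos(np.linspace(0, 60 * np.pi, n))
noise_std = 2.0
d = 0.8 * s_orb + noise_std * rng.standard_normal(n)
g0 = sample_g0(d, s_orb, np.full(n, noise_std**-2))
```

For the full correlated-noise case, the only change is that the two inner products are evaluated with N^{-1} applied in Fourier space rather than elementwise.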

Sampling detector-dependent calibration
For ∆g_i, we proceed similarly as with g_0, with two exceptions. First, ∆g_i now represents the relative calibration between detectors, and, as discussed in Sect. 2.1, we need to use a stronger calibration signal than the orbital dipole to avoid significant polarization leakage. Secondly, we have to impose the constraint Σ_i ∆g_i = 0. We start by defining the following residual for each detector,

  r_i ≡ d_i − (g_0 + δg_{q,i}) s_i^tot = ∆g_i s_i^tot + n_i^tot.    (17)

This equation is structurally similar to Eq. (14), with the main difference being that the total sky signal, which is dominated by the Solar dipole, is retained on the right-hand side. Otherwise, Eq. (17) still represents a Gaussian model, and we should be able to proceed similarly as for g_0 when drawing from the conditional distribution. We do, however, need to ensure that the constraint Σ_i ∆g_i = 0 is respected while sampling from the target distribution. The numerically most convenient method for imposing such a constraint is through the method of Lagrange multipliers.

^8 Note that we do not apply any priors on g_0 in this paper, which corresponds to S^{-1} = 0, adopting the notation of BeyondPlanck (2022), where S is the prior covariance of g_0. The remaining notational differences between Eq. (16) and Eq. (A.10) in that paper arise from our organizing all vectors and matrices in terms of independent detectors, using the fact that n^tot is assumed to be independent between detectors; this may not be strictly true in practice, as discussed by Ihle et al. (2022), and future analyses may prefer to account for the full joint matrix.
In general, the method of Lagrange multipliers allows the user to minimize a function f(x) under some set of constraints that may be formulated as g(x) = 0. Without these constraints, one would of course determine x by solving df/dx = 0. With additional external constraints, however, it is convenient to instead define the so-called Lagrangian,

  L(x, λ) = f(x) + λ g(x),    (18)

and set the corresponding partial derivatives with respect to x and λ equal to zero. It is readily seen that ∂L/∂λ = 0 corresponds directly to g(x) = 0, which is precisely the desired constraint. Our primary target distribution is

  P(∆g | d, g_0, δg, s^tot, . . .) ∝ P(d | ∆g, g_0, δg, s^tot, . . .) ∝ ∏_i exp( −½ (r_i − ∆g_i s_i^tot)^T N_i^{-1} (r_i − ∆g_i s_i^tot) ),    (19)

where the first line follows from Bayes' theorem, and the second follows from the fact that we assume vanishing covariance between detectors, and that r_i is Gaussian distributed with a mean of ∆g_i s_i^tot and covariance N_i. We are of course free to minimize the logarithm of this function instead of the function itself, which conveniently eliminates the exponential. We may therefore define the following Lagrangian,

  L(∆g_1, . . . , ∆g_n, λ) = Σ_i (r_i − ∆g_i s_i^tot)^T N_i^{-1} (r_i − ∆g_i s_i^tot) + λ Σ_i ∆g_i,    (20)

where λ is the Lagrange multiplier.
To optimize this function, we take the derivatives with respect to ∆g_i and λ to obtain two coupled equations. The first takes the form

  (s_i^tot)^T N_i^{-1} s_i^tot ∆g_i + λ/2 = (s_i^tot)^T N_i^{-1} r_i,    (21)

while the second simply reads

  Σ_i ∆g_i = 0.    (22)

Jointly solving these linear equations for ∆g_i provides estimates with the correct mean. What we require, however, is a sample from the appropriate distribution, and not mean estimates. We must therefore add a fluctuation term, as in Eq. (16). To do so, we note that if it were not for λ, Eq. (21) would have the exact same form as Eq. (16), with s^tot substituted for s^orb. Comparing Eq. (21) with Eq. (16), we then see that the final equation for the desired sample must be

  (s_i^tot)^T N_i^{-1} s_i^tot ∆g_i + λ/2 = (s_i^tot)^T N_i^{-1} r_i + sqrt((s_i^tot)^T N_i^{-1} s_i^tot) η_i,    (23)

where, as usual, η_i ∼ N(0, 1).
Casting this in terms of a linear system with n_det + 1 unknowns, this may be solved straightforwardly with standard numerical libraries. For a two-detector example, the resulting system of equations takes the form

\begin{pmatrix}
(s_1^{tot})^T N_1^{-1} s_1^{tot} & 0 & \tfrac{1}{2} \\
0 & (s_2^{tot})^T N_2^{-1} s_2^{tot} & \tfrac{1}{2} \\
1 & 1 & 0
\end{pmatrix}
\begin{pmatrix} \Delta g_1 \\ \Delta g_2 \\ \lambda \end{pmatrix}
=
\begin{pmatrix}
(s_1^{tot})^T N_1^{-1} r_1 + \sqrt{(s_1^{tot})^T N_1^{-1} s_1^{tot}}\,\eta_1 \\
(s_2^{tot})^T N_2^{-1} r_2 + \sqrt{(s_2^{tot})^T N_2^{-1} s_2^{tot}}\,\eta_2 \\
0
\end{pmatrix}.    (24)
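The two-detector system above can be assembled and solved directly with a standard linear solver. The sketch below does so for white noise and toy numbers (all hypothetical), and the constraint ∆g_1 + ∆g_2 = 0 is then satisfied exactly by the solution:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy two-detector residuals r_i = dg_i * s + noise (hypothetical values).
n = 40000
s = np.sin(np.linspace(0, 50 * np.pi, n))    # stand-in for s_tot
sigma = 1.0
dg_true = np.array([0.05, -0.05])
r = dg_true[:, None] * s + sigma * rng.standard_normal((2, n))

stn_s = (s @ s) / sigma**2                   # s^T N^-1 s (white noise)
stn_r = (r @ s) / sigma**2                   # s^T N^-1 r_i
eta = rng.standard_normal(2)

# (n_det + 1) x (n_det + 1) system: Eq. (23) for each detector, plus the
# zero-sum constraint of Eq. (22) in the last row.
A = np.array([[stn_s, 0.0,   0.5],
              [0.0,   stn_s, 0.5],
              [1.0,   1.0,   0.0]])
b = np.array([stn_r[0] + np.sqrt(stn_s) * eta[0],
              stn_r[1] + np.sqrt(stn_s) * eta[1],
              0.0])
dg1, dg2, lam = np.linalg.solve(A, b)
```

Note that λ comes out of the solve for free; it carries no physical meaning and is simply discarded.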

Sampling temporal gain variations with Wiener filter smoothing
Finally, we consider the temporal gain variations, δg_{q,i}. As before, we write down the following residual,

  r_i ≡ d_i − (g_0 + ∆g_i) s_i^tot    (25)
      = δg_{q,i} s_i^tot + n_i^tot,    (26)

where we again employ the total signal as a calibrator. The only difference with respect to Eq. (17) is that δg_{q,i} now contains multiple elements per detector, and is a vector in PID space. We can make this point more explicit by writing

  r_i = T δg_i + n_i^tot,    (27)

where T is an n_samp × n_scan matrix that contains s_{t,i}^tot in element (t, q) for all values of t in scan q. All other elements are zero. Thus, T projects δg_i into the n_samp-dimensional space of r_i and n_i^tot. Once again following the procedure in Eq. (A.3) of BeyondPlanck (2022), we may write down the following sampling equation,

  (T^T N_i^{-1} T) δĝ_i = T^T N_i^{-1} r_i + (T^T N_i^{-1} T)^{1/2} η,    (28)

where η ∼ N(0, I) is a random Gaussian vector of length n_scan. In its current form, Eq. (28) assumes that δg_{q,i} is uncorrelated between scans. As discussed by Planck Collaboration VIII (2014), Planck Collaboration VIII (2016), Planck Collaboration III (2020), and Planck Collaboration Int. LVII (2020), this is not an efficient model for the Planck LFI data, because the gain is primarily determined by the thermal environment of the instrument, which is quite stable in time. It is therefore advantageous, and in practice necessary, to enforce some form of smoothing between the δg_{q,i} to obtain robust results.
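To make the role of T concrete, the following sketch builds the dense n_samp × n_scan projection for a toy white-noise case, in which T^T N^{-1} T is diagonal and Eq. (28) reduces to independent per-scan draws. All sizes and values are hypothetical, and a realistic implementation would never form T densely:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy layout: n_scan scans of n_per samples each (hypothetical sizes).
n_scan, n_per = 8, 5000
n_samp = n_scan * n_per
s_tot = np.cos(np.linspace(0, 12 * np.pi, n_samp))
scan_of = np.repeat(np.arange(n_scan), n_per)

# T holds s_tot in column q for the samples belonging to scan q, zero elsewhere.
T = np.zeros((n_samp, n_scan))
T[np.arange(n_samp), scan_of] = s_tot

dg_true = 0.1 * rng.standard_normal(n_scan)
sigma = 1.0
r = T @ dg_true + sigma * rng.standard_normal(n_samp)

# With white noise, T^T N^-1 T is diagonal, so Eq. (28) splits scan by scan:
TtNT = np.diag(T.T @ T) / sigma**2
TtNr = (T.T @ r) / sigma**2
dg = TtNr / TtNT + rng.standard_normal(n_scan) / np.sqrt(TtNT)
```

With correlated noise or a smoothness prior, the system no longer separates by scan, which is exactly the situation addressed by the Wiener filter step below.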

Wiener filtering
The smoothing approach we adopt here is Wiener filtering. Although this approach has been applied to other parts of the CMB analysis pipeline, such as CMB and noise estimation, it has not previously been applied to the gain estimation process. In Bayesian terms, we have so far been drawing samples from the likelihood function L(δg_i). Since our actual goal is to draw samples from the posterior distribution of δg_i, this means that we are implicitly assuming a uniform prior on this parameter, and hence that the estimates of δg_i are determined by the data alone.
In addition, this means we assume no correlations between the elements in the δ g i vector, as the likelihood function is separable into independent probability distribution functions -an effect of this is that Eq. (28) is really a set of independent equations that can be solved sequentially.
Wiener filtering the data essentially amounts to applying a Gaussian prior during the estimation process: instead of drawing samples from L(δg_i) alone, we draw samples from L(δg_i)P(δg_i), where P(δg_i) is a Gaussian prior whose form depends on how we expect, a priori, the gain fluctuations to behave. By construction, this ensures that for scans with high signal-to-noise ratio the gain estimates are set mainly by the observed data, whereas in periods of lower signal-to-noise ratio (such as dipole minima) the estimates are prior dominated, and thus fluctuate less than they would without any prior. In addition, this prior explicitly introduces physically motivated correlations between the gain fluctuations of different scans, which are taken into account during the sampling process rather than being artificially imposed after the fact.
In order to sample from the posterior, we need to draw a sample from the product of two Gaussians,

  P(δg_i | d, . . .) ∝ L(δg_i) P(δg_i).    (29)

The proper way to sample from such a distribution is given by Eq. (A.10) in BeyondPlanck (2022). In this case, the procedure amounts to replacing Eq. (28) by

  (T^T N_i^{-1} T + G^{-1}) δĝ_i = T^T N_i^{-1} r_i + T^T N_i^{-1/2} η_1 + G^{-1/2} η_2,    (30)

where G is the covariance matrix of the Gaussian prior on δg_{q,i}, and η_1 and η_2 are two independent vectors drawn from a normal distribution with unit variance. In contrast to Eq. (28), Eq. (30) can no longer be solved scan-by-scan, as the covariance matrix G introduces cross-scan correlations. To invert this system, we use the conjugate gradient solver procedure described in BeyondPlanck (2022).
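For a small number of scans, Eq. (30) can be prototyped with a dense solve standing in for the conjugate gradient solver. In the sketch below, the exponential (Ornstein–Uhlenbeck) prior covariance, the signal-to-noise values, and all other numbers are hypothetical stand-ins for the 1/f prior discussed next:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy version of Eq. (30):
#   (T^T N^-1 T + G^-1) dg = T^T N^-1 r + T^T N^-1/2 eta1 + G^-1/2 eta2
# With white noise, T^T N^-1 T is diagonal with entries s^T N^-1 s per scan,
# and T^T N^-1/2 eta1 reduces to sqrt(s^T N^-1 s) * eta per scan.
n_scan = 64
q = np.arange(n_scan)

# Smoothness prior between scans: an exponential kernel, a well-conditioned
# stand-in for the 1/f prior (hypothetical amplitude and correlation length).
G = 0.05**2 * np.exp(-np.abs(q[:, None] - q[None, :]) / 10.0)

snr = np.full(n_scan, 1.0e4)                  # s^T N^-1 s per scan (toy value)
dg_true = np.linalg.cholesky(G) @ rng.standard_normal(n_scan)
TtNr = snr * dg_true + np.sqrt(snr) * rng.standard_normal(n_scan)

Ginv = np.linalg.inv(G)
lhs = np.diag(snr) + Ginv
rhs = (TtNr
       + np.sqrt(snr) * rng.standard_normal(n_scan)              # T^T N^-1/2 eta1
       + np.linalg.cholesky(Ginv) @ rng.standard_normal(n_scan)) # G^-1/2 eta2
dg = np.linalg.solve(lhs, rhs)                # paper uses CG for the full system
```

In high signal-to-noise scans the data term dominates the left-hand side, while in low signal-to-noise scans G^{-1} takes over, which is precisely the behaviour described above.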
Our prior should reflect our knowledge about the gain fluctuations, namely that such fluctuations are related to fluctuations in the detector electronics. Such fluctuations are generally modelled as having a so-called "pink noise" or 1/f spectrum, which means that the gain fluctuation covariance matrix G can be defined in Fourier space through a power spectrum of the form

  P(f) = σ_0² (f / f_knee)^α.    (31)

Here, α, f_knee and σ_0 are parameters to be determined; for a complete marginalization over these degrees of freedom, they can be made part of the general Gibbs chain in which we are sampling the gains, with an additional sampling step for these parameters inserted after sampling the gain fluctuations. This is similar to the process described in Ihle et al. (2022), except that we only need to sample a single set of parameters per detector, not one set per scan as in the correlated noise case, since the data points in the gain fluctuation case are the scans themselves, not the individual detector samples.
In the case of LFI processing, we have found that fixing α, σ_0 and f_knee, rather than sampling over them, gives the best results in terms of gain stability and CMB solutions. For the current paper, through trial, error, and by-eye inspection, we have found that the prior behaves as desired with the parameters fixed to the following values: α = −2.5, σ_0² = 3 · 10⁻⁴ V²/K², and f_knee = 1 hr⁻¹. Although this might seem like an arbitrary choice, the fine-tuning involved is not substantially different from the fine-tuning of window widths needed for the boxcar smoothing approach used by the Planck DPC.
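For reference, the adopted prior power spectrum with these fixed values can be written down explicitly; the exact normalization convention is an assumption of this sketch:

```python
import numpy as np

# The 1/f prior with the values adopted in the text:
# alpha = -2.5, sigma_0^2 = 3e-4 V^2/K^2, f_knee = 1 hr^-1.
alpha, sigma0_sq, f_knee = -2.5, 3e-4, 1.0   # frequencies in hr^-1

def prior_psd(f):
    """P(f) = sigma_0^2 (f / f_knee)^alpha; the normalization convention is
    an assumption of this sketch, not taken from the paper."""
    return sigma0_sq * (np.abs(f) / f_knee) ** alpha

# The steep negative slope strongly suppresses power above f_knee, i.e., it
# penalizes rapid scan-to-scan gain fluctuations.
f = np.array([0.1, 1.0, 10.0])               # hr^-1
p = prior_psd(f)
```

With α = −2.5, a decade in frequency changes the prior power by a factor of 10^2.5 ≈ 316, so fluctuations on time scales much shorter than an hour are heavily down-weighted.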

Validation by controlled simulations
As described by Brilenkov et al. (2022), the BeyondPlanck analysis framework includes a simulation tool that allows us to input a controlled simulation whose properties are all perfectly known, and we can use this to validate the reconstructed posterior distributions. The main results from a simulation that considers only one year of LFI 30 GHz observations are presented by Brilenkov et al. (2022); here, we reproduce those results that pertain to gain estimation. This simulation includes only CMB (fluctuations, Solar and orbital dipoles), correlated and white noise, and gain fluctuations. We then perform an end-to-end Commander analysis in which we sample a_CMB, n_corr, and g, with no other ancillary data; this is thus a test of the core gain estimation, correlated noise estimation, and mapmaking routines, but not of, say, component separation or sidelobe estimation, which are validated through separate means. The analysis comprises 3000 samples, and the first 1000 are discarded as burn-in (although these are still included in the trace plots below).
First, we show in Fig. 5 the total gain samples as a function of Gibbs iteration for one randomly chosen PID for each of the four 30 GHz detectors; the horizontal line shows the true input. These trace plots show both that the chains fluctuate around the input gain values, and that their scatter provides a meaningful estimate of the uncertainties. However, we also see that the correlation lengths are significant. This is due to a strong degeneracy between the absolute gain and the CMB Solar dipole when analyzing only a small subset of the data, in this case one year of 30 GHz observations. When jointly analyzing all data from all channels, the CMB Solar dipole signal-to-noise ratio increases dramatically, and the correlation length goes down, as shown in Sect. 5. To validate all scans and detectors, Fig. 6 shows the relative reconstruction bias measured in units of standard deviations, i.e.,

  δ_{q,i} = ( ḡ_{q,i} − g_{q,i}^in ) / σ_{q,i},    (32)

where g_{q,i}^in is the true input total gain, ḡ_{q,i} is the total gain averaged over all samples for a given PID and detector, and σ_{q,i} is the sample standard deviation. Thus, the figure shows the deviation of the output gain solution from the input in units of standard deviations. Overall, most samples lie within ±2σ, although there are a few notable outliers. It is also worth noting that the gains are intrinsically correlated in time, and this causes the apparent correlations in this figure.
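A bias statistic of this kind is straightforward to compute from a chain of gain samples; the following small sketch uses arbitrary toy values:

```python
import numpy as np

rng = np.random.default_rng(11)

def reconstruction_bias(samples, g_in):
    """Deviation of the posterior mean from the input, in units of the sample
    standard deviation, per PID/detector. `samples` has shape
    (n_samples, n_pid); `g_in` has shape (n_pid,)."""
    mean = samples.mean(axis=0)
    std = samples.std(axis=0, ddof=1)
    return (mean - g_in) / std

# Toy chain: unbiased Gaussian samples scattered around the input gains
# (arbitrary illustrative values).
g_in = np.array([60.0, 61.5, 59.2])
samples = g_in + 0.3 * rng.standard_normal((2000, 3))
delta = reconstruction_bias(samples, g_in)
```

For an unbiased sampler, each entry of `delta` is centred on zero with a spread set by the Monte Carlo error of the chain mean, so large values flag either bias or under-estimated uncertainties.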
Finally, in Fig. 7 we show corresponding histograms of the same quantity, but this time aggregated over all PIDs for a given detector. Ideally, these should be Gaussian distributed with zero mean and unit standard deviation. Overall, we see that gain values are generally well-recovered with small biases and reasonable uncertainties. The non-Gaussian features are due to the long correlation lengths seen in Fig. 6, which implies that the number of independent samples in these functions is limited.

Results
We are now finally ready to present gain estimates for each Planck LFI radiometer, as estimated within the end-to-end Bayesian BeyondPlanck analysis framework.

Gain posterior distributions
We start by considering the sampling efficiency of the Monte Carlo chains produced with the above algorithm in terms of mixing and Markov chain correlation lengths. Figure 8 shows total gain as a function of Gibbs iteration for some representative radiometers and PIDs. We see that overall, the chains are stable and mix well.
A more quantitative confirmation of the good mixing of the chains can be seen in Fig. 9, which shows the correlation lengths across all PIDs (the black line is the estimated mean correlation length, whereas the blue bands show the estimated standard deviation). All detectors exhibit correlation coefficients of less than 10 % after only a few samples.
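Correlation lengths of this kind can be estimated from the normalized autocorrelation function of the chain; the sketch below demonstrates this on a toy AR(1) chain whose correlation drops below 10 % after a few lags (all parameters are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

def autocorr(chain, max_lag):
    """Normalized empirical autocorrelation function of a 1-D chain."""
    x = chain - chain.mean()
    var = x @ x / len(x)
    return np.array([(x[:len(x) - k] @ x[k:]) / (len(x) * var)
                     for k in range(max_lag + 1)])

# Toy AR(1) chain with lag-one correlation rho = 0.5; its autocorrelation
# decays as rho^k, falling below 10 % after a few samples.
rho, n = 0.5, 50000
chain = np.empty(n)
chain[0] = 0.0
eps = rng.standard_normal(n)
for t in range(1, n):
    chain[t] = rho * chain[t - 1] + eps[t]
ac = autocorr(chain, 10)
```

The lag at which this function first drops below some threshold (here 10 %) is one common operational definition of the correlation length.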
In Fig. 10, we show relative differences between the last sample of the chain and the first drawn sample (red line), and we compare that to the corresponding relative difference with the second-to-last sample in the chain (blue line), as well as an in-between chain comparison (green line). We see that even the first sample of the chain is as close to the final solution as the next-to-last sample, and the conditional burn-in period with respect to the gain does not significantly affect our results. Long-term burn-in therefore does not appear to be a concern for the gain estimates.

Overall, the largest differences between BeyondPlanck and the other two pipelines are observed in the 30 GHz channel. In particular, we find that the BeyondPlanck gain model is systematically lower than the 2018 model by about 0.84 % and than DR4 by about 0.67 % for this channel, which translates into frequency maps that are about 0.84 % (or 0.67 %) brighter. We also see that the DR4 and 2018 models agree very well for three of the radiometers, while 28S is an outlier, for which DR4 is close to BeyondPlanck.
The 30 GHz channel is the most difficult to calibrate among the three LFI channels because of its brighter foreground signal, and the different ways in which the three pipelines handle this fact make the above-mentioned gain solution differences less surprising. The DR4 process treats this channel separately, in that it is analyzed without priors on polarized foregrounds; the resulting map is then used as a spatial polarization prior for the 44 and 70 GHz channels (Planck Collaboration Int. LVII 2020). In comparison, the 2018 approach also assumes vanishing CMB polarization during calibration; however, it makes no distinction between the orbital and Solar dipole with respect to absolute gain calibration (as both DR4 and BeyondPlanck do), but rather assumes that the fitted foreground model is sufficiently accurate. In contrast, the BeyondPlanck pipeline does not treat the 30 GHz channel differently in any way, nor does it assume that the CMB polarization signal vanishes (except for the single quadrupole mode, as discussed in Sect. 2.4). Instead, it uses WMAP information to support the foreground modelling and to constrain the poorly measured modes in LFI. Overall, these algorithmic differences lead to the observed deviations between the various solutions.
The relative differences are smaller at 44 and 70 GHz: −0.04 % (44 GHz) and −0.64 % (70 GHz) between BeyondPlanck and the Planck 2018 model, and 0.12 % (44 GHz) and −0.03 % (70 GHz) between BeyondPlanck and Planck DR4. There is generally good agreement between the three pipelines for these two channels, although Planck 2018 generally has a higher absolute calibration than the two others. Differences between BeyondPlanck and the other pipelines are expected, both because of the joint nature of the BeyondPlanck approach and because of the different ways in which the smoothing of the solutions is performed: BeyondPlanck uses Wiener filter smoothing, Planck 2018 uses boxcar averaging, and DR4 does not smooth the solution after estimation at all.

Effects of different gain smoothing approaches on correlated noise stripes
Since the Wiener filtering approach used in this paper has not previously been used for smoothing gain estimates, we compared it with a more conventional boxcar smoothing approach with smoothing windows similar to those used by Planck DR4; that is, windows that change dynamically with the dipole amplitude, so that the smoothing length is shorter in regions of higher signal-to-noise ratio (i.e., dipole maxima). With boxcar averaging, the gain solutions behave more like the original Planck 2018 solutions than is the case with Wiener filtering. However, we also found that boxcar smoothing results in correlated noise solutions with strong "stripes" in the binned polarization maps, especially in Stokes Q. Reducing the window sizes mitigated this to some degree, although it introduced other issues, such as gain spikes in the low signal-to-noise regime, and we therefore find it likely that the correlated noise stripes are related to an over-smoothing of the gain solution.
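For comparison, a dynamic-window boxcar smoother of the kind described above can be sketched in a few lines; the gain values and window widths here are arbitrary illustrations (DR4 ties the widths to the dipole amplitude):

```python
import numpy as np

def boxcar_smooth(g, widths):
    """Boxcar-average a per-scan gain series with a per-scan half-width.
    Narrow windows track the data closely; wide windows suppress scatter
    but risk the over-smoothing discussed in the text."""
    out = np.empty_like(g, dtype=float)
    for q, w in enumerate(widths):
        lo, hi = max(0, q - w), min(len(g), q + w + 1)
        out[q] = g[lo:hi].mean()
    return out

# Toy per-scan gains (arbitrary values), smoothed with two fixed widths.
g = np.array([1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.05])
smooth_wide = boxcar_smooth(g, np.full(8, 3))
smooth_narrow = boxcar_smooth(g, np.full(8, 1))
```

The wider window yields a visibly flatter series, illustrating the trade-off: scatter is suppressed, but genuine gain structure on scales shorter than the window is smoothed away as well.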
With the Wiener filter process, we are, modulo the exact form of the prior covariance matrix, smoothing the data in an optimal way -in high signal-to-noise areas, the signal is allowed to determine the solution, and in low signal-to-noise areas, the prior ensures that the solution does not become nonphysical. In Figure 14, we show a comparison of the effect of these two approaches on the Stokes Q map. In this figure, the only difference in the underlying sampling algorithm is the choice of boxcar smoothing and Wiener filtering, and it is reassuring to find that Wiener filtering, in line with expectations, is apparently able to find smooth gain solutions that avoid the problem of oversmoothing.

Gain jumps
As noted from the beginning of the Planck mission (see, e.g., Planck Collaboration XXVIII 2014), the physical gain of the instrument exhibits several sharp jumps. These jumps are related to changes in the thermal environment of the instrument; an example of such an event is the turn-off of the 20 K sorption cooler. Not all of these events are well understood, and some can only be traced after an initial gain estimation. Such jumps must be accounted for in any gain estimation approach that is not purely data-driven, i.e., whenever priors impose expected correlated behavior between the gain factors over the mission. We find that the Wiener filter approach is indeed able to account for such jumps without any extra hard-coding (see Fig. 15, where a Wiener filter solution applied globally to the whole PID range is shown along with two jumps). This is another advantage of this smoothing method compared to the boxcar approach, in which such jumps must be specified in the code and explicitly excluded from the averaging.

Comparison with external data
To understand the combined impact of the various gain model differences discussed above, it is useful to compare the final BeyondPlanck frequency maps with externally processed observations, both from WMAP and Planck. In this respect, we note that both the Planck 2018 and WMAP data sets are associated with sets of correction templates that track known systematic effects (or poorly measured modes) in the respective sky maps. For the Planck 2018 30 GHz channel, this template is shown in Fig. 16. As discussed by Planck Collaboration II (2020), this template was produced by iterating between calibration and component separation, and therefore traces uncertainties in the gain model due to foreground uncertainties. Furthermore, due to limited time, only four full iterations of this type were completed for the Planck 2018 analysis, and one must therefore expect residuals of this type to be present in the final sky maps at some level.
With this in mind, we show in Fig. 17 BeyondPlanck−DR4 and BeyondPlanck−Planck 2018 difference maps for Stokes I, Q, and U (columns) for all three LFI frequencies (rows). The first, third, and fifth rows show differences with respect to Planck 2018, while the second, fourth, and sixth rows show differences with respect to DR4. Several features in these difference maps are interesting from the calibration perspective. Starting with the DR4 temperature difference maps, we see that all three channels are dominated by a clean dipole-like residual aligned with the Solar CMB dipole. This shows that the BeyondPlanck and DR4 temperature maps are morphologically very similar, but have different absolute calibrations. We also see that the temperature-map difference between BeyondPlanck and DR4 exhibits a flip in the dipole direction going from 30 and 44 to 70 GHz. This sign change is consistent with the differences in the calibration factors between 44 and 70 GHz reported in Table 10 of Planck Collaboration Int. LVII (2020), which finds a difference of 0.31 % between the absolute calibration of the 44 and 70 GHz channels. Since the CMB Solar dipole has an amplitude of about 3360 µK, this relative difference translates into an absolute temperature difference of roughly 10 µK in the observed sky signal, which is fully consistent with the dipole differences we see in Fig. 17. In comparison, the Planck 2018 versus BeyondPlanck difference maps in temperature (rows 1, 3, and 5) show a more prominent quadrupolar structure, with a morphology resembling the effect of bandpass mismatch leakage (Planck Collaboration X 2016; Planck Collaboration Int. LVII 2020).
For polarization, the most striking differences are seen in the 30 GHz channel, for which variations at the 4 µK level are seen over large fractions of the sky. Furthermore, these residuals correlate very closely with the Planck 2018 gain template shown in Fig. 16, suggesting that they are indeed caused by foreground-induced gain residuals. The same patterns are also seen in the DR4 difference maps, but at a notably lower level.
For the 44 GHz maps, we note that the Stokes Q difference maps show correlated noise stripes similar to those highlighted in the top panel of Fig. 14. However, we also note that these structures have different amplitudes in Planck DR4 and Planck 2018, and these stripes are therefore present in at least one of the other pipelines as well, and possibly in both.

Figure 18 shows a similar comparison between the various Planck 30 GHz maps and the WMAP K-band channel (Bennett et al. 2013). In this case, all maps have been smoothed to a common angular resolution of 3° FWHM, and the K-band map has been scaled by a factor of 0.495 to account for the different center frequencies of the two maps, adopting a synchrotron-like spectral index of β_s = −3.1. From top to bottom, the first three rows show difference maps with respect to Planck 2018, Planck DR4, and BeyondPlanck.
Overall, we see a clear progression in agreement with respect to WMAP K-band, in the sense that BeyondPlanck shows smaller residuals than Planck DR4, which in turn shows smaller residuals than Planck 2018. Furthermore, we note that the strong residuals traced by the LFI gain template in Fig. 16 are most pronounced in the Planck 2018 map.
At the same time, we also observe significant coherent large-scale features in the difference map between BeyondPlanck and K-band. To at least partially understand these, we show the WMAP transmission imbalance templates derived by Jarosik et al. (2007) in the bottom row of Fig. 17. These templates trace poorly measured modes due to the differential nature of the WMAP instrument. Although corrections for this effect are applied to the final K-band sky maps, the uncertainty of the template amplitudes is estimated to be 20 %. Considering the tight correlation between the BeyondPlanck−WMAP difference map and the transmission imbalance template, it seems clear that at least a significant fraction of the remaining residual may be explained in terms of this effect. Of course, this also suggests that a future joint analysis of Planck and WMAP in the time domain will be able to constrain the WMAP transmission imbalance parameters to much higher precision, and Planck data can thereby be used to break an important internal degeneracy in WMAP. As reported by Watts et al. (2022), this work has already started, but a full exploration of time-ordered WMAP data is outside the scope of the current analysis. We also emphasize that the current BeyondPlanck analysis only uses low-resolution WMAP polarization data for which a full covariance matrix is available, and these modes are appropriately down-weighted in those matrices.

Conclusions
We have presented the BeyondPlanck approach to gain calibration within the larger Commander Gibbs sampling framework. This framework relies directly on the Solar and orbital dipoles for relative and absolute calibration, respectively, and accounts for astrophysical foreground and instrumental systematics through global modelling.
One critically important difference with respect to previous Planck LFI analysis efforts is the fact that we actively use external data to break internal Planck degeneracies, and in particular WMAP observations. This significantly alleviates the need for imposing strong algorithmic priors during the calibration process. Most notably, while both the Planck 2018 and Planck DR4 pipelines assumed CMB polarization to be negligible on all angular scales during the calibration phase, we only assume that the CMB quadrupole is negligible. The reason we still make this assumption is that the Planck scanning strategy renders the CMB quadrupole very nearly perfectly degenerate with the CMB Solar dipole coupled to subtle gain fluctuations; a hypothetical future and well-designed satellite mission should not require this prior, as long as its scanning strategy modulates the CMB dipole on sufficiently short time-scales and with good cross-linking.
Overall, we find good agreement between the BeyondPlanck and previous gain models. The biggest differences are observed in the LFI 30 GHz channel, with gain variations of 0.84 % between Planck 2018 and BeyondPlanck. These differences result in subtle but significant temperature and polarization residuals. When comparing these with external WMAP K-band observations, it seems clear that the BeyondPlanck LFI maps are generally cleaner than previous renditions with respect to gain residuals. At the same time, we emphasize that these differences are also consistent with previously published error estimates, as presented by the Planck 2018 and Planck DR4 teams themselves. For instance, the morphology of the Planck 2018 polarization residuals matches previously published LFI DPC gain residual templates (Planck Collaboration II 2020), and the DR4 absolute calibration differences are fully consistent with internal DR4 estimates (Planck Collaboration Int. LVII 2020). These results are thus neither novel nor surprising, but they simply highlight the inherent advantages of global analysis, using complementary data sets to break internal degeneracies.
Finally, we note that even though the procedures outlined in this paper have been aimed at modelling the LFI detectors, there is nothing about the data model or methodology that is unique for LFI. The method should be directly applicable for other data sets and experiments as well, and, indeed, a preliminary WMAP analysis is already underway (Watts et al. 2022).