Background modelling for $\gamma$-ray spectroscopy with INTEGRAL/SPI

The coded-mask spectrometer-telescope SPI on board the INTEGRAL observatory records photons in the energy range between 20 and 8000 keV. A robust and versatile method to model the dominating instrumental background (BG) radiation is difficult to establish for such a telescope in the rapidly changing space environment. From long-term monitoring of SPI's Germanium detectors, we built up a spectral parameter data base, which characterises the instrument response as well as the BG behaviour. We aim to build a self-consistent and broadly applicable BG model for typical science cases of INTEGRAL/SPI, based on this data base. The general analysis method for SPI relies on distinguishing between illumination patterns on the 19-element Germanium detector array from BG and sky in a maximum likelihood framework. We illustrate how the complete set of measurements, even including the exposures of the sources of interest, can be used to define a BG model. We apply our method to different science cases, including point-like, diffuse, continuum, and line emission, and evaluate the adequacy in each case. From likelihood values and the number of fitted parameters, we determine how strong the impact of the unknown BG variability is. We find that the number of fitted parameters, i.e. how often the BG has to be re-normalised, depends on the emission type (diffuse with many observations over a large sky region, or point-like with concentrated exposure around one source), and the spectral energy range and bandwidth. A unique time scale, valid for all analysis issues, is not applicable for INTEGRAL/SPI, but must and can be inferred from the chosen data set. We conclude that our BG modelling method is usable in a large variety of INTEGRAL/SPI science cases, and provides nearly systematics-free and robust results.


Introduction
Since 2002, the spectrometer SPI (Vedrenne et al. 2003) on board the INTErnational Gamma-Ray Astrophysics Laboratory (INTEGRAL; Winkler et al. 2003) satellite has been observing astrophysical high-energy phenomena in the hard X-ray and soft γ-ray range between 20 keV and 8 MeV. In this energy range, measured γ-ray spectra are dominated by instrumental background (BG) photons. These originate mainly from nuclear de-excitation reactions and continuum processes, such as bremsstrahlung, of the instrument and satellite material, being exposed to cosmic rays (CRs). There is no stand-alone BG model available to deal with SPI observations in an extensive, consistent, and elaborate way. In this paper, we show how to construct a self-consistent BG model for SPI data analysis from a database of spectral parameters over the INTEGRAL mission years.
The spectra in each of the 19 high-purity Ge detectors of SPI can be characterised by a large number of instrumental γ-ray lines on top of a broken power-law shaped continuum. The main features of the spectra among detectors, energy, and time stay constant and change gradually according to solar activity and detector degradation. Diehl et al. (2018) illustrated how the SPI spectral BackGround and Response DataBase (BGRDB) was created, maintained, and checked for consistency over the entire mission time. In general, the BGRDB contains fitted spectral parameters per detector and time. The time integration is chosen as either one INTEGRAL orbit of three days, or the time between two detector restoration periods 1 , which is typically half a year. In early 2015, the INTEGRAL spacecraft performed several orbit adjustment manoeuvres to safely bring the satellite back to Earth in 2029. For this reason, the INTEGRAL orbit is now merely 2.7 days long, and will decrease with ongoing mission time. These different timescales are a key issue for the understanding and the construction of a γ-ray telescope BG model in order to record sufficient BG data for reliable fits to the spectra, and to be able to trace gradual changes in the spectral response.
This paper is structured as follows. In Sect. 2 we describe how SPI data analysis is generally treated (Sect. 2.1), and describe the parts of the SPI BGRDB required for BG modelling in more detail. Based on this, we explain how we construct a self-consistent BG model in Sect. 3, with full details of the algorithm (Sect. 3.1) and a discussion about the underlying foundation (Sect. 3.3). In Sect. 4, we show how to evaluate the BG model fit adequacy for different test cases for point-like 2. SPI data analysis 2.1. General method: fitting time series SPI data analysis and source flux extraction is performed for individual energy bins. Each energy bin is treated separately, even though the bins are connected, for example by instrumental resolution. This dependence is taken into account when the BG model is built. The smallest public available bin size in SPI data analysis is 0.5 keV. The chosen bin width for the actual data set, however, is arbitrary. To increase the statistics and to reduce the time for the analysis at the same time (number of bins, for example), the bin width can be increased. Spectral information is lost when large energy bins are used.
The counting of photons into one energy bin over time obeys the Poisson statistics. From this the likelihood, L(θ|D), can be calculated, given the data set, D, and the model parameters, θ. The data in SPI per energy bin is structured as a vector of magnitude, number of detectors × number of pointings = d × p , where d is equal to 19 if all individual detectors 2 of SPI are used and p equals the number of observations, N obs , in a specific data set.
A pointing, p, is an observation unit for a specific amount of time, T p , typically 0.5-1.0 h, and for a specific region in the sky, marked by the galactic longitude and latitude (l p /b p ). The satellite is staring during this time at this position, and is recording data. Depending on the type of observation (diffuse/pointlike; persistent/transient), N obs pointings are observed over larger regions in the sky, or only around the target source. Sub-pointing time intervals (e.g. for gamma-ray bursts) can also be analysed. The likelihood is given by where m p is the model to describe (to be fitted to) the data d p per pointing p. The log-likelihood (C-stat) is then given by We use the likelihood and related values (such as C-stat or χ 2 values) and the degrees of freedom (d.o.f.: number of data points minus the number of fitted parameters) in a maximum likelihood fit as a goodness-of-fit criterion for our BG modelling method in Sect. 4. SPI data are typically dominated by instrumental BG, so that a subtraction in neither spectral nor in the d × p -dimension would provide reasonable results. Only in a few cases during the INTEGRAL mission did transient sources appear that outshined the BG. One example is the microquasar V404 Cygni (e.g. Rodriguez et al. 2015;Roques et al. 2015;Siegert et al. 2016a;Jourdain et al. 2017) during revolutions 1554-1557. Furthermore, there is no absolute BG model for SPI as there are variations in all dimensions which change according to the unpredictable solar activity. The goal is thus to obtain a description of the BG from the data itself, but independent from the celestial source of interest. The key to such a BG model is found in the way the data is modelled, and how the different model terms can influence each other: Here, M k j is the kth of N S sky images (celestial emission models) to which the instrumental response function (IRF, codedmask shadowing), R jp , is applied for each pointing p and image element (e.g. pixel), j. The N B BG models B kp are independent of the IRF in each observation, which means the background is assumed completely independent from the shadowing of the mask and therefore independent of spacecraft repointings. The model parts sky and BG can depend on time, and on different scales, t for celestial sources and t for the BG. While the BG timing depends on the instrument materials, being activated and decaying on various timescales as a consequence of solar activity, the variability of the sky emission is only subject to the physics of the sources themselves. This means that the change of the BG amplitude, θ k,t , depends on the process which resulted in the BG γ-ray photon. This may change on very short timescales (e.g. prompt emission after an intense dose of CR particles from a solar flare) or very long scales (e.g. from a radioactive buildup) when the creation of radioactive material happens on shorter timescales than the decay time.

Spectral background and response database
The SPI BGRDB by Diehl et al. (2018) spans the energy range between 20 and 2000 keV per INTEGRAL orbit, and consists of parameters for 383 γ-ray lines, each modelled with four parameters, and two continuum parameters, depending on the energy. This includes both data formats, the single events (SE) data from 20 to 1392 keV and from 1745 to 2000 keV, and the pulse-shapediscriminated (PSD) data from 490 to 2000 keV. In the energy range between 1 and 8 MeV 3 , the count statistics drops rapidly towards larger energies, so that longer integration times are required with increasing energy to obtain robust fitting results. For the BGRDB above 2 MeV, the time between two annealing periods is used to fit 614 γ-ray lines on top of a broken powerlaw continuum (Weinberger et al., in prep.). The low-and highenergy bands were subdivided into smaller energy regions, e, to limit the number of fitted parameters per fit. As a function of energy (E) the SPI spectra (F(E)) per detector d, orbit r, and energy range e are described by c der = C 0,der , α der , and Here, C(E; c der ) and L(E; l i der ) describe the shape of the γ-ray continuum and lines with the parameter tuples c der = (C 0 , α) der and l i der = (A 0 , E 0 , σ, τ) i der , respectively, where i is the index of the ith γ-ray line in the band e. The parameters C 0 and α are the amplitude and the spectral index of the power-law function C(E), normalised at the pivot energy E C . This shape is motivated from the superposition of many power-law-like continuum processes in the satellite: The parameters A 0 , E 0 , and σ represent the amplitude, centroid, and width of a symmetric Gaussian function, with τ accounting for the detector degradation. The line shape that results from cosmic-ray bombardment is physically motivated (Kretschmer 2011). As the regular lattice structure is deteriorated, the electronic structure changes, and charge carriers may be temporarily trapped. This leads to a time delay in the charge collection process, and thus in the charge pulse of the read-out electronics. The untrapped portion of a charge cloud is proportional to exp(−κx) (Debertin & Helmer 1988), where x is the distance to Ge detector electrode, and κ is the absorption coefficient. Since these parameters cannot be determined independently, a combined degradation parameter τ is used. The convolution of a symmetric Gaussian, G(E; A 0 , E 0 , σ), with an exponential tail function, T (E; τ), leads to an asymmetric line shape, L(E; A 0 , E 0 , σ, τ) (see also Eqs. (3)-(6) in Diehl et al. 2018): Diehl et al. (2018) find that there are families of γ-ray lines in the instrumental background of SPI that are characterised by their detector patterns. The authors define several groups of lines, of which we recall the most prominent: the "Ge-like" lines show higher count rates for inner detectors (00-06) with respect to outer detectors (07-18). These include excitation of nuclei in the SPI camera, such as Ge and Al, but also related and produced isotopes, such as Ga, Zn, and Mg. "Bi-like" lines show the opposite detector pattern, as their main sources are in the materials of the SPI anticoincidence shield, made of bismuth-germanate (BGO). Related isotopes in the vicinity of Bi (e.g. Pb, Ra, Tl) and also isotopes from the actinide alpha-decay chains (e.g. Ac, Th, U) show this behaviour. To a lesser extent, material from mountings (Ti, V), wires (Cu, Co, Fe), and other instruments aboard INTEGRAL (Cd, Te, Cs) also show this pattern.
The most important finding from monitoring the long-term trends in this spectral response database is that the patterns of individual isotopes stay constant on all timescales. This also means that isotopes that produce different line energies show the same pattern. Only detector failures change this behaviour, but then the patterns are again constant. This is the prime assumption of how a self-consistent background model is constructed, and is further explained in Sect. 3.1. Since the spectral shape changes with time (see above), the detectors may respond differently to this degradation, and the prompt and delayed background emission is different for different energies, the SPI BGRDB includes the information to reconstruct the expected spectral response of the SPI camera and predict the instrumental background as a whole at any point in time in high spectral resolution.

General approach
At each specific energy, E, we model the BG using two components: photons from continuum processes, and those from one (or more) γ-ray line-producing processes. In narrow energy bins, i.e. less than a few times the instrumental resolution (between 20 and 8000 keV the instrumental resolution is determined from narrow background lines, and ranges from 1.7 to 8.5 keV), we construct the BG by a combination of the continuum and the lines, i.e. combining all continuum-like processes and line-like processes to two individual BG models. Depending on the energy, either of the BG components may dominate (see Sect. 5.2 and Fig. 8). For each pointing, p, the factor j R jp M kp ≡ S kp changes for each sky component, k, i.e. the detector pattern (which detectors are illuminated and how strong they are) changes with time and/or pointing. This is illustrated in Fig. 1 for the standard INTEGRAL 5 × 5 observation scheme of a point-like source (see also Figs. 2 and 3).
On the other hand, for a specific physical process inside the satellite, the detector patterns from the BG, B kp , do not change. For our two BG model components, k = c for the continuum and k = l for the lines: B cp = const. = B c and B lp = const. = B l . What might change as a function pointing (=time), p, are the amplitudes θ c (p) and θ l (p). Therefore, only these are determined in the maximum likelihood fit, following Eqs. (2) and (3), see also Sects. 3.2 and 4.2.
Following these considerations (see also Diehl et al. 2018, Sect. 2.4), the SPI BG model for a particular detector, d, at a specific pointing, p, and energy bin, e, is written as L i (l i d,e,p(r) ) · t l i;e,p .
In Eq. (9), the BG is factorised into the fitted amplitudes for continuum and lines, θ c e,p and θ l e,p , respectively, the spectral parts and a temporal part. Here individual pointings may be scaled by the same amplitude parameter (see Sect. 4.2). For the lines component, we sum over the individual lines i, which contribute to one energy bin, e. In principle, each line amplitude could be fitted individually, but this is only robust for strong BG lines. The spectral parameters per pointing, c d,e,p(r) and l i d,e,p(r) , depend on the revolution in which the pointing occurred, hence p(r), and are fixed to the values from the SPI BGRDB in the maximumlikelihood fit. This is true because the SPI BGRDB was created on an orbit timescale. This can be generalised to arbitrary time bins in spectral parameter databases, e.g. above 1 MeV where the parameters are determined on the annealing timescale, i.e. the average over half a year.
The detector pattern per energy and process is included as each detector is assigned a specific value according to the spectral response at that time and energy. Thus, C d,e,p(r) and N lines (e) i=i(e) L i;d,e,p(r) completely determine the spectral and the detector ratio information required for BG modelling. The tracer functions t c e,p and t l i;e,p then allow for a relative weighting between pointings for continuum and line processes, respectively. They are parts of the specific BG modelling per science case, and thus are independent from the SPI BGRDB, and thus do not  Fig. 1. Detector patterns of a celestial source near the optical axis of SPI. Shown is the typical 5 × 5 rectangular dithering strategy with SPI, marked with dots and sequenced by number above the insets. Each dot represents one pointing with a field of view of 16 • × 16 • and is 2.1 • away from the next/neighbouring pointing. The celestial source is marked with a blue star symbol at the position (l/b) = (35.8 • / − 3.9 • ). In each inset panel, ranging from 0 to 2.75, the relative detector pattern of how the source would be seen by SPI is shown. The dashed lines in each sub-panel indicate the differences in the expected sky pattern and how it changes from pointing to pointing with respect to a flat 1:1: . . . :1-ratio pattern. From Siegert (2017).  follow p(r). In general, each process can have its own variability, being prompt from cosmic-ray excitation and instantaneous de-excitation, or delayed when there is a longer lifetime of the produced isotopes included. The latter can lead to long-lasting radioactive decays, for example after a strong solar flare that has created a lot of radioactive material (e.g. 48 V with a half-life time of 16 days), or can lead to radioactive build-up when the decaytime is considerably longer than the production rate (e.g. 60 Co with a half-life time of 5.27 years, cf. Diehl et al. 2018). The long-term trends are already traced by the SPI BGRDB as the amplitudes are determined on three-day timescales (Sect. 3.2), so the remaining prompt BG emission processes at each energy can be traced by one temporal function, i.e. t c e,p = t l i;e,p := t e,p .

Temporal behaviour
At individual energies, i.e. for small energy bins (not integrating over the energy range of the physical process), the instrumental γ-ray line BG patterns, B l , become a function of energy because of detector degradation. This does not mean that the physical processes change -the pattern per process is constantbut the pattern at a particular energy can change. This is caused by the degradation of individual detectors on different timescales due to their individual constitutions, behaviours, and reactions to particle irradiation. These changes in detector patterns versus energy and time are taken into account by using the SPI BGRDB. This database traces the small but important changes, as it has been created on a three-day timescale for energies below 2 MeV A73, page 4 of 13 T. Siegert et al.: Background modelling for γ-ray spectroscopy with INTEGRAL/SPI (Diehl et al. 2018). For higher energies, i.e. lower BG rate, the BG count statistics are rarely sufficient to determine the gradual change in degradation, so that a mean value over half a year is determined. The SPI BG amplitude of a certain instrumental process is in general unpredictable. In SPI data analysis, these BG variations are approximated at first order by "tracers", i.e. rates of onboard radiation monitors, such as the SPI anti-coincidence shield count rate, or the rate of saturating Ge detector events ( 15 MeV; GeD-Sat). These rates trace the cosmic-ray particle flux that leads to instrumental γ-ray BG. The variations of these BG time series during one orbit are of the order of 1%, i.e. <1% from pointing to pointing.
These approaches work well for all energy bins, i.e. as small as 0.5 keV or even 1 MeV continuum bands. It is important to note, however, that the described tracing relies on the prompt effect of cosmic-ray irradiation, which is true for many but not all BG-generating processes in the satellite. Any BG process on a longer scale is not represented by such a tracer, but instead is already implemented in the SPI BGRDB. In the special case of mid-term radioactivities, i.e. longer than several pointings, solar flare induced background lines or radioactive build-ups may be traced directly by the respective exponential decay law with the isotope's characteristic decay time (see Fig. 15 of Diehl et al. 2018, showing the examples 48 V and 60 Co). Thus instead of relying on independent rates, the natural timescale of the process can also be used.
This tracing provides a good first-order description of the BG model variation on shorter timescales below one orbit and down to pointing-by-pointing. When the BG is fitted to the data per energy bin, however, the appropriate BG re-scaling has to be determined. This depends on energy because the average BG count rate changes with energy due to different contributions and to strengths of continuum and lines, and the different intrinsic timescales of the processes, be they prompt or delayed. This is discussed in Sect. 4.2.
With the experience of hundreds of analysed X-ray and γ-ray sources during 16 years of the INTEGRAL mission in different sky regions and γ-ray energies, it has become evident that for a broad energy range between ≈200 and 8000 keV, the saturating Ge detector events (GeDSat) are sufficient to trace the interpointing variations. The 511 keV BG line, for example, closely follows the rate of the side-shield assembly of the SPI anticoincidence shield (SSATOTRATE, Skinner et al. 2014;Siegert et al. 2016b). Below 200 keV, the BG rate is very high and can often be determined on a pointing timescale, so that no tracer is required at all if the source is also strong. We note that GeDSat does not necessarily explain the full variation from one pointing to the next, and requires re-scaling in our method. Likewise, at the steps of the SPI BGRDB, the BG must be carefully investigated and re-scaled (Sect. 4.2).

Discussion: why does this work?
In a single pointing, the mask patterns for sources in the field of view can be very strong, so that a determination of detector patterns from the BG (the BG response) is not possible on this timescale. Off-observations, for example at high latitudes, can be used if broad energy bins are analysed (continuum sources). For detailed spectroscopy in small energy bins, however, the BG patterns change smoothly with time due to detector degradation so that the actual spectra should be used to determine the BG. If many pointings of the same observation are combined, the source imprints due to the mask smear out, such that the resulting Variance of the detector patterns as shown for a typical 5 × 5dithering pattern from Fig. 1. For each added pointing we calculate the standard deviation across the 19 Ge detectors, and also consider different ratios of source (right axis) to BG (left axis). For a specific ratio, the variance changes with the number of added pointings to the power of −0.8. Thus, even for strong source contributions, the mask pattern will smear out over the course of one INTEGRAL orbit. spectra per detector for a longer timescale appear as due to BG alone. In Fig. 4 we show how S kp smears out for a point source, observed with a 5 × 5-dithering strategy of INTEGRAL. For a source contribution of 10% (90%) to the total recorded spectrum, the cumulative pattern of BG plus source varies by less than 1% (10%) after only 20 pointings. During one INTEGRAL orbit, 50-90 pointings are performed, so that the contribution of a residual mask pattern from even strong sources would be less than 1%. This means that the SPI BGRDB (Diehl et al. 2018) is mostly free of source contributions of any type, and that in this way, a BG model for single pointings can be constructed.
The detector patterns from instrumental BG continuum, B c , are nearly flat around a value of 1.0 (Diehl et al. 2018). The limit, lim S kp , also evolves to a constant value of 1.0 across all detectors, so that any additional source contribution appears in the time-integrated spectra as an increased BG continuum. For individual pointings, however, as analysed in a data set, the amplitude of any source contribution can thus be determined in the fit because the expected patterns from BG and sky are now clearly different. The maximum likelihood fit including BG continuum, BG lines, and sources will, as a consequence of this procedure, scale down the continuum to "give way" to possible source counts.
As a result, the SPI BGRDB, fitted to the data, already provides a good first-order BG model for any INTEGRAL/SPI observation. For detailed and high-resolution spectroscopy of continuum and of γ-ray line sources, especially at fine energy binning, the characteristics of the parameter values in the database must be investigated further to ensure consistency of the spectral description in order to build a stable and robust BG model (see also Sects. 3.2 and 5).
In the case of partial coding of sources during one INTE-GRAL orbit, the relative counts of detectors at the rim of the SPI camera (parts of the outer detectors) may be increased on a few per cent level (see above), so that the derived BG patterns may be slightly skewed. This can also happen with strong sources in the fully coded field of view, leading to residual average patterns from these sources. The additional pattern, which is mixed with A73, page 5 of 13 A&A 626, A73 (2019)  the true BG pattern in the SPI BGRDB, is the sum of all detector responses from sources seen by SPI in a particular orbit (e.g. the sum of all patterns in Fig. 1). This can introduce weak coding in the partially coded field of view, but has marginal effects on the known sources. It mainly affects the energy range below ≈100 keV, and can be accounted for by adding the expected average pattern as a third BG model component.

Evaluation of the background model performance
If the fluxes of the analysed sources are known, for example from previous analysis, or can be expected from theoretical considerations, it is possible to estimate the contribution of source counts in the data set. Consequently, the fit adequacy can be investigated if the sources are ignored. In general, the covariance between BG and source amplitude parameters is nearly zero; however, it will turn to negative values if the source contribution is weak or absent. This is shown and explained in Appendix A, Figs. A.1-A.2c, for the case of a line signal at 1809 keV (see also Sect. 5.1.2). As a consequence, the BG (time) scaling and the source contributions (also their time scaling if needed) should be optimised simultaneously.
As an absolute figure of merit to determine a fit's quality, we use the "modified χ 2 γ -statistics" (Mighell 1999) defined as As an alternative to χ 2 γ and to distinguish between models, we use the Akaike information criterion (AIC, Akaike 1974;Cavanaugh 1997). The AIC is defined as This figure of merit is similar to the reduced χ 2 values in that it penalises models with more parameters. Using relative likelihoods (absolute AIC differences, ∆AIC), we identify on what time basis the BG has to be re-adjusted in the different cases of Sect. 5. We note that its absolute value has no proper meaning (Burnham & Anderson 2004). In general, the smaller the AIC, the more preferred a model is. It accounts for the appropriate likelihood of the data generating process, so that testing different assumptions on the BG (and source) scaling provides the most adequate number of BG (and source) parameters (Sect. 4.2).
The χ 2 γ and AIC values are calculated from the best fit ln L value for reference. In general, it is better to avoid relying on reduced χ 2 estimations when dealing with Poisson-distributed data; however, the χ 2 γ -statistics is specifically adapted for this case (Mighell 1999), and can provide a general measure.

Data sets
In order to illustrate different applications of the BG modelling method we chose four energy regions to perform our analysis towards point-like and diffuse emission. We demonstrate the performance of our method on line-like emission at 511 keV, binned into one energy bin between 508 and 514 keV, as would be used for imaging. The second strongest γ-ray line at 1809 keV of decaying 26 Al is analysed between 1795 and 1820 keV in 0.5 keV bins to also show a fine-binned but low-statistics case. Analysis of diffuse emission in a continuum band is presented between 2.5 and 3.5 MeV, summed into one bin of 1 MeV. For point-like emission, we also use a low-energy band, from 50 to 100 keV in 2 keV bins. We make use of Crab observations, performing tests in the same energy bands as for diffuse emission. A summary of the data sets and on what time basis the BGRDB is used is found in Table 1.

Background changes with time
As described above, the relative weighting of BG amplitudes between individual pointings can be determined by a tracer, but that tracer may not fully cover the actual variations. In the following, we use pre-defined time steps to re-scale the BG model, as shown in Table 2, and judge which scaling, i.e. how many fit parameters, are preferred.
In the case of diffuse emission, for instance, the expected detector pattern changes more smoothly than that for a point source. In addition, INTEGRAL/SPI data sets as they are typically analysed for diffuse emission cover very large areas in the sky, many of which are expected to include little or no source flux but only instrumental BG. For example, 26 Al is distributed dominantly in the plane of the Galaxy, but high-latitude observations are included in the data sets to get a better leverage on the absolute BG model. The data sets for large-scale diffuse emission thus include as much data and as many pointings as possible. In return this means, that the BG has to be re-scaled whenever emission from the sky (in certain regions given by the emission models) is expected to change its contribution to the total by a significant amount. This could also be when observing A73, page 6 of 13 T. Siegert et al.: Background modelling for γ-ray spectroscopy with INTEGRAL/SPI the same sky region at different mission times with different background levels. As a consequence, the sensitivity for diffuse emission with respect to analyses of point sources is reduced. If a particular point source is investigated, the data sets are typically constrained to the region in which the source is located. When this same source is monitored from time to time, the BG level should be adjusted accordingly, not because the source flux might have changed, but because the BG level varies on longer timescales (e.g. solar cycle). The source flux can also change, for example in the case of X-ray binaries. This just means that the fit is no longer stabilised by a constant source and varying BG; instead, it is independent for individual times.

Test cases
In Sect. 5.1 we present the different diffuse emission cases and explain the choice of parameters. In Sect. 5.2 we illustrate the findings for one point source, and highlight the differences as mentioned above.    of SPI, emission regions out to |b| and |l| ≈ 20 • are also included and have to be modelled. We use the best-fitting emission model from Siegert et al. (2016b) to characterise the 511 keV morphology in the bulge. This model consists of four 2D Gaussian templates to represent the bulge and the superimposed disk. Here we combine the individual components into one map. Clearly, a minimum is found on a BG timescale of three days, i.e. one INTEGRAL orbit (Fig. 5a). Between a timescale of 8 h and the annealing timescale (12p-Anneal), the measured flux shows only slight variations, indicating adequate and stable fits (Fig. 5b). Very long (DetFail/Const) and very short ( 8 h; 1p-6p) timescales, on the other hand, provide bad fits or over-fit the data, respectively, which leads to false flux values. For the optimum AIC at 3 d (476 × 2 = 952 fitted BG parameters; 1n), the 511 keV flux is determined to (1.6 ± 0.1) × 10 −3 ph cm −2 s −1 , which is consistent with previous measurements considering a narrow and a broad bulge component, together with a disk in this region (e.g. Skinner et al. 2014;Siegert et al. 2016b Fig. 6. Study of background scaling in the 26 Al region. Bottom panel: AIC, similar to Fig. 5a, but as a function of energy. From the optimal AIC at each energy bin, the flux is determined (black data points, upper panel, left axis). In addition, we show the fitted background (grey), divided into continuum (blue) and line (green) contributions (upper panel, right axis). To the black data points, a degraded Gaussian line on top of a power-law continuum is fitted, and shown as 1, 2, and 3σ uncertainty bands. The dotted line marks the zero flux level.
thus not over-fitting the data. In general in the 511 keV case, the fit, considering the χ 2 -statistics, is always acceptable. Between the Const BG model with two parameters and a fit per each individual pointing with 25 172 parameters, the reduced χ 2 changes between 1.0111 and 0.9881, i.e. +3.6σ and −3.5σ from an optimal χ 2 fit value of 1.0. The BG scaling closest to χ 2 = 1.0 is found at 2n, i.e. one BG parameter every second INTEGRAL orbit (approximately six days). The minimum reduced χ 2 value is found for a fit per pointing, 1p (0.5-1.0 h), but which is largely penalised by the AIC, and thus represents an over-parametrised fit.

Full-sky emission of radioactive 26 Al
The radioactive isotope 26 Al is produced in massive stars and ejected by winds and core-collapse supernovae. With a halflife time of 717 kyr, 26 Al traces the ongoing nucleosynthesis in the Milky Way. These nuclei decay via β + -decay to an excited state of 26 Mg, which is short lived (476 fs) and de-excites to the stable ground state of 26 Mg by the emission of a E lab = 1808.63 keV γ-ray photon (cf. Oberlack et al. 1996;Plüschke et al. 2001;Diehl et al. 2006Diehl et al. , 2010Kretschmer et al. 2013;Bouchet et al. 2015;Siegert 2017, for example). In this work, we use the Maximum Entropy Map from COMPTEL (Plüschke et al. 2001) for our sky model to determine the BG scaling. Table 3. Spectral parameters of the 26 Al-line region, selected for the optimal χ 2 case and the optimal AIC case, χ 2 opt and AIC opt , respectively. 3.81 +0.14 −0.14 Notes. The units are 10 −6 ph cm −2 s −1 keV −1 , 1, 10 −3 ph cm −2 s −1 , keV, and keV for the continuum amplitude C 0 , the power-law index α, the line flux I L , the line centroid shift ∆E peak = E peak − E lab , and line width (FWHM) Γ L , respectively.
Here, we use a 0.5 keV binning to resolve the 1809 keV γ-ray line between 1795 and 1820 keV. This allows us to show that the optimal BG scaling depends on energy, as the BG intensity rises when strong instrumental BG lines have to be taken into account. In Fig. 6, we show the result of our BG analysis in the 26 Al case for diffuse emission. While the energy bins containing the 26 Alline and a complex of strong BG lines (green), require a BG scaling between 30 d and half a year (30n-Anneal), the neighbouring continuum is sufficiently well fitted by only 1-29 parameters per BG component (Const-Anneal). It is important to note how the weak instrumental BG line around 1797 keV is also captured by our BG modelling method. In Appendix B, we show the same analysis, based on the minimal reduced χ 2 value in each energy bin. The major difference between the AIC and the χ 2 profiles is in the erratic behaviour among bins, and the wide spread in the choice of the optimal number of parameters. The line contribution also favours a larger number of parameters, similar to the AIC analysis. The number of parameters in the 26 Al line is very large (5396), and is the reason why the errors bars on the derived fluxes are much larger in the optimal χ 2 case than in the optimal AIC case. We compare the spectral parameters of the 26 Al region in Table 3. All parameter are consistent, except for the line width, being as small as the instrumental resolution in the optimal χ 2 case, and kinematically broadened for the optimal AIC case. In particular, the 1.8 MeV line flux is consistent with previous studies of the full 26 Al sky (e.g. Bouchet et al. 2015;Siegert 2017).

Diffuse soft γ-ray continuum
Above ≈100 keV and up to several tens of MeV, the diffuse continuum emission in the Milky Way is dominated by inverse Compton scattering (Strong et al. 2005), with contributions from bremsstrahlung and positron annihilation (Beacom & Yüksel 2006;Knödlseder et al. 2007;Strong et al. 2010;Lacki et al. 2014). Here we use a continuum band between 2500 and 3500 keV to illustrate how our BG modelling method performs for diffuse continuum emission. We combine all photon events into one large energy bin of 1 MeV binwidth to increase the photon number statistics, and to illustrate the capabilities of our method in a broad-band analysis.
For our analysis, we use the central part of the Milky Way, as in to Sect. 5.1.1. Because the high-energy data sets of SPI, between 2 and 8 MeV, are very sensitive to solar activity, we removed periods in which solar flares strongly imprint and enhance BG intensities of nuclear de-excitation lines, such as 2 H at 2.2 MeV, 12 C at 4.4 MeV, or 16 O at 6.1 MeV. Consequently, the high-energy data sets are smaller than in the 511 keV case. We use an inverse Compton template from GALPROP (Strong et al. 2011) to fit the morphology in the 2.5-3.5 MeV band.   The optimal AIC solution is found at a re-scaling time of ≈8 h (12p), corresponding to 2630 BG fit parameters for 176 537 d.o.f. (Fig. 7). The reduced χ 2 value at this point is 1.0107, 3.2σ away from the desired χ 2 /ν = 1.0 solution. Based on the reduced χ 2 value alone, the optimal solution would be found at a 1 h BG scaling (1p), but which is clearly disfavoured by the AIC because it penalises too many fit parameters. The fitted flux values at 3.0 ± 0.5 MeV for acceptable fits range between 1.7 and 5.3 × 10 −6 ph cm −2 s −1 keV. From these fits, a detection significance between 3.4 and 7.0σ is found. At the optimum AIC, the diffuse continuum flux, based on the used inverse Compton template, is determined to (2.7±0.7)×10 −6 ph cm −2 s −1 keV −1 . This is consistent with measurements from COMPTEL (Strong et al. 1999), considering the different solid angles covered in the analyses.

Point source emission: small and partitioned data sets
We use the Crab observations of INTEGRAL/SPI from 14 mission years. As a comprehensive example to be compared to the above cases, we show the energy regions from 50-100 keV in 2 keV bins, the positron annihilation line region between 508 and 514 keV in one 6 keV bin, the 26 Al region between 1790 and 1840 keV in 0.5 keV binning, and one 1 MeV bin between 2.5 and 3.5 MeV. Unlike in the diffuse emission examples, the Crab should only show continuum emission and no γ-ray lines. Emission lines from decaying 44 Ti could be expected, but are below the sensitivity limit for SPI. Similar to the high-energy case for diffuse emission, we apply special selection criteria to the Crab observations to avoid strong solar activity, which is also why the Crab data set above 2 MeV is smaller than below this value. The choice of energy regions allows for direct comparison of the diffuse and line emission cases.
In Fig. 8, we show the results of our BG modelling method for the Crab and the time scaling analysis in four different bands with varying energy bin size. In the energy range, between 50 and 100 keV, it is evident that there is a large difference in BG re-scaling with time, as the BG flux changes rapidly with energy. Especially below ≈80 and above ≈90 keV, the instrumental BG line contributions dominate over the instrumental BG continuum, so that the BG scaling time is optimal between 4 and 8 h (6p-12p). In the continuum-dominated region, 3 d (1n) scaling and thus fewer parameters are enough for an optimal BG scaling. The resulting spectrum in this energy band is smooth and power-law like. Similar to the diffuse emission in the centre of the Galaxy, the BG amplitude in the 511 keV line band should be fitted every three days. There is no enhancement of a possible narrow 511 keV line in the spectrum, as determined from the spectral fit over all the analysed energy bands (see below). Likewise, the energy band between 1790 and 1840, in which emission from 26 Al at 1809 keV could be expected, is sufficiently scaled once between each detector failure or annealing period (DetFail-Anneal) in the continuum, and every 60-90 d (20n-30n) in the comparably strong instrumental line complex between 1805 and 1813 keV. No 26 Al line is detected, which reinforces our method, being able to systematically suppress instrumental BG lines, and only reveals the source spectrum. At higher energies, between 2.5 and 3.5 MeV, the Crab flux is about the same order of magnitude as the diffuse emission in the Galactic centre. However, the emission is concentrated into one point, so that the re-scaling must be performed less often than in the galactic diffuse case, here only once every three days (1n). Also here, the difference in the uncertainties is evident between diffuse emission and point-like emission. While the galactic centre shows a flux uncertainty of 7.0 × 10 −7 ph cm −2 s −1 keV −1 , the Crab emission is uncertain by 0.9×10 −7 ph cm −2 s −1 keV −1 , even though the effective exposure times are about three times longer in the Galactic centre (25 Ms) with respect to the Crab (7.5 Ms).
We check for consistency in our analysis by fitting the Crab spectrum, using three different energy bands: (a) between 50 and 100 keV, (b) between 1790 and 1840 keV, and (c) all data points between 50 and 3500 keV. The fitting results are shown in Table 4, demonstrating consistency between the analyses, and further substantiating our BG modelling method. The Crab spectrum is determined to be (10.53 +0.12 −0.17 ) × E −2.073±0.003 ph cm −2 s −1 keV −1 between 50 and 3500 keV with a reduced χ 2 value of 1.32 for 165 d.o.f. This is consistent with previous SPI results reported for the Crab (Jourdain & Roques 2009, with 1.2 Ms of observation time), and reduces the uncertainties by the increased amount of exposure.

Conclusion
In this work we showed how to construct BG models for INTE-GRAL/SPI data from a database of spectral parameters that characterise the response as well as characterise the BG properties of individual components of the instrument. In particular we illustrate our method on a few characteristic examples, including point-like, diffuse, continuum, and γ-ray line emission.
Our method is based on long-term monitoring and understanding of instrumental BG, which is separated into continuum BGs and γ-ray line BGs. These physical components show individual, but characteristic detector patterns on the SPI detector array, and are constant over time. For individual energies (energy bins not integrating over a γ-ray line, for example), the detector patterns change smoothly with time due to the change in their responses, mainly caused by detector degradation and the influence of solar activity. This is taken into account by our detailed spectral fits of INTEGRAL/SPI data per unit time (Diehl et al. 2018).
The time basis for these spectral fits must be chosen to fulfil the following criteria: 1. long enough to smear out residual celestial patterns which are imprinted by the coded mask;   Fig. 6. Bottom panel: AIC as a function of energy. From the optimal AIC at each energy bin, the flux is determined (black data points, upper panel, left axis). The fitted background is shown as a grey histogram, divided into continuum (blue) and line (green) contributions (upper panel, right axis). The black data points are fitted by a power-law continuum, either only using the 50-100 keV range (pink band), only the 1790-1840 keV range (yellow band), or all four ranges together (red band). Notes. We normalise the spectra with A 0 in units of ph cm −2 s −1 keV −1 at different energies E m to avoid numerical problems.
2. short enough to trace gradual variations of detector responses; 3. long enough to accumulate enough statistics for reliable fits. Point 1 is fulfilled after 15-25 INTEGRAL pointings, depending on the strength of the source (Fig. 4). We note that a correlation of our BG model with celestial emission is to be minimised by the way the BG is re-normalised in time, and want to point out that the more pointings are added for determining the BG response, the smaller the correlation will be. For example, a source contribution of 10% (50%) is smeared out to less than 1% (5%) after a standard 5 × 5 dithering pattern. This means one INTEGRAL orbit (50-90 pointings) would smear out the expected detector pattern of a single point source almost completely (<3% for any source strength; <0.5% for typical source contributions of less than 10%), so that the correlation is minimal. On the same timescale, we find significant changes in the detector response (γ-ray line shape parameters) for the strongest γ-ray lines, so that point 2 is also fulfilled using one INTEGRAL orbit, or three days. Depending on the energy, i.e. the BG count rate, the spectral fits are adequate (point 3) on a timescale of one orbit below energies of about 1-2 MeV. Above this energy, a spectral fit per detector requires more statistics (longer integration times) to provide robust results. We choose the time span between SPI detector annealings to determine spectral parameters above 1-2 MeV. This reduces the accuracy for the line broadenings within half a year, but this can be accounted for by correcting for a linear trend using the orbit-by-orbit fits if required. We find, however, that fits per half-year period provide A73, page 10 of 13 T. Siegert et al.: Background modelling for γ-ray spectroscopy with INTEGRAL/SPI accurate results, as shown for the 26 Al-case in (diffuse) line emission. If continuum sources are investigated and broader spectral bands are analysed, a spectral database per orbit is also useful.
We show that our BG modelling method is reproducible and also results in values for celestial sources that are in agreement with the literature values. The optimal choice of BG re-scaling as a function of time, i.e. the number of parameters at each energy, depends on the emission type (diffuse or point-like), the spectral regime (low-energy or high-energy; dominated by instrumental continuum or lines), and the width of the energy bins analysed. We provide examples of how to determine adequate fits for INTEGRAL/SPI analysis, based on the Akaike information criterion. This penalises too many and disfavours too few parameters, and we compare the results to reduced χ 2 values, which are commonly used in such analyses, but should be taken with care when using Poisson-distributed data with a low mean count rate. This BG method can be applied to a multitude of analysis cases, from the short term (e.g. gamma-ray bursts, state-variable X-ray binaries) to the long term (e.g. persistent sources, diffuse emission), continuum emission (e.g. synchrotron, bremsstrahlung, inverse Compton), and γ-ray line searches (e.g. decay, excitation, annihilation), or a combination of all of them at the same time. We show that our method eliminates most of the systematic uncertainties considering BG modelling with INTEGRAL/SPI.  sources are closest to those of the BG continuum (Fig. A.2b). The correlations between line BGs and the source is flat and around zero because they share no similarities in detector space, even though they are centred at the same energies (Fig. A.2c). In general, χ 2 distributions have expectation values equal to the number of degrees of freedom (d.o.f.), ν = N obs − N par (Mighell 2000). Their second moments (standard deviation) are √ 2ν.
Likewise, the reduced χ 2 distribution, i.e. χ 2 /ν, shows an expectation value of 1.0 with a standard deviation of √ 2/ν. We note that these values are only correct for large Poisson mean values and over-or under-estimate the true value in the low-count regime. Andrae et al. (2010) discussed that the variance in χ 2 means that even if the target value of χ 2 /ν = 1.0 is not reached, the fit may still be considered "good". As the variance depends on the number of data points and the number of fitted parameters, allowed regions around the optimal value can be identified: for a data set with N obs data points and ν d.o.f., a fit may be considered good if the reduced χ 2 value falls in the range 1 − n √ 2/ν, 1 + n √ 2/ν . Here n should be chosen according to the size of the data set, i.e. the number of data points N obs . For large data sets, the chance of having individual 5σ outliers is higher than for small data sets, so that larger n are adequate for larger data sets. This means that a reduced χ 2 value of 1.2 can still be considered good for a data set with only 1000 d.o.f., for instance, within 5σ; however, extremely bad for a data set with 10 6 d.o.f., because it would be off by ≈140σ.
In Sect. 4.2, we show the resulting fluxes for different data sets as derived from the optimal AIC solution. For our chosen examples with one broad energy bin each, Figs. 5b and 7 already contain the respective χ 2 values for the different number of BG scaling parameters. It is worth noting that there is no trend for decreased flux values when choosing the optimal AIC or optimal χ 2 solution: in the 511 keV case, the fluxes are F 511 χ 2 = (1.43 ± 0.11) × 10 −3 ph cm −2 s −1 and F 511 AIC = (1.60 ± 0.10) × 10 −3 ph cm −2 s −1 for 25 172 and 952 BG fit parameters, respectively. For the broad bin between 2.5 and 3.5 MeV, the solutions yield F 3000 χ 2 = (4.7 ± 0.7) × 10 −6 ph cm −2 s −1 keV −1 and