Compressed convolution

Franz Elsner; Benjamin D. Wandelt

doi:10.1051/0004-6361/201322177

Volume 561 (January 2014)

A&A, 561 (2014) A88

Issue		A&A Volume 561, January 2014


Article Number		A88
Number of page(s)		7
Section		Numerical methods and codes
DOI		https://doi.org/10.1051/0004-6361/201322177
Published online		07 January 2014

Franz Elsner¹,² and Benjamin D. Wandelt²,³

¹ Department of Physics and Astronomy, University College London, London WC1E 6BT, UK
e-mail: f.elsner@ucl.ac.uk
² Institut d’Astrophysique de Paris, UMR 7095, CNRS - Université Pierre et Marie Curie (Univ Paris 06), 98 bis blvd Arago, 75014 Paris, France
³ Departments of Physics and Astronomy, University of Illinois at Urbana-Champaign, Urbana IL 61801, USA

Received: 29 June 2013
Accepted: 11 December 2013

Abstract

We introduce the concept of compressed convolution, a technique to convolve a given data set with a large number of non-orthogonal kernels. In typical applications our technique drastically reduces the effective number of computations. The new method is applicable to convolutions with symmetric and asymmetric kernels and can be easily controlled for an optimal trade-off between speed and accuracy. It is based on linear compression of the collection of kernels into a small number of coefficients in an optimal eigenbasis. The final result can then be decompressed in constant time for each desired convolved output. The method is fully general and suitable for a wide variety of problems. We give explicit examples in the context of simulation challenges for upcoming multi-kilo-detector cosmic microwave background (CMB) missions. For a CMB experiment with detectors with similar beam properties, we demonstrate that the algorithm can decrease the costs of beam convolution by two to three orders of magnitude with negligible loss of accuracy. Likewise, it has the potential to allow the reduction of disk space required to store signal simulations by a similar amount. Applications in other areas of astrophysics and beyond are optimal searches for a large number of templates in noisy data, e.g. from a parametrized family of gravitational wave templates; or calculating convolutions with highly overcomplete wavelet dictionaries, e.g. in methods designed to uncover sparse signal representations.

Key words: methods: data analysis / methods: statistical / methods: numerical / cosmic background radiation

© ESO, 2014

1. Introduction

Convolution is a very common operation in processing pipelines of scientific data sets. For example, in the analysis of cosmic microwave background (CMB) radiation experiments, convolutions are used to improve the detection of point sources (e.g., Tegmark & de Oliveira-Costa 1998; Cayón et al. 2000), in the search for non-Gaussian signals on the basis of wavelets (e.g., Barreiro & Hobson 2001; Martínez-González et al. 2002), during mapmaking (e.g., Tegmark 1997; Natoli et al. 2001), or Wiener filtering (Elsner & Wandelt 2013).

Convolution for data simulation presents similar if not greater challenges: the current and next generations of CMB experiments are nearly photon-noise limited. The only way to reach the sensitivity required to detect and resolve B-modes or to resolve the Sunyaev-Zel’dovich effect of clusters of galaxies over large fractions of sky is to build detector arrays with N ~ 10²−10⁴ detectors. Simulating the signal for these experiments requires convolving the same input sky with N different and often quite similar kernels.

In the simplest case, when the convolution kernel is azimuthally symmetric, convolution involves the computation of spherical harmonic transformations. Although highly optimized implementations exist (e.g., libsharp, Reinecke & Seljebotn 2013; the default back end in the popular HEALPix library, Górski et al. 2005), spherical harmonic transformations are numerically expensive and may easily become the bottleneck in data simulation and processing pipelines.

Even more critical is the more realistic setting when the kernels are anisotropic (e.g., when modeling the physical optics of a CMB experiment or when performing edge or ridge detection with curvelets or steerable filters, e.g., Wiaux et al. 2006; McEwen et al. 2007). In this case, the cost of convolution additionally scales with the degree of azimuthal structure in the kernel (Wandelt & Górski 2001) and the convolution output is parametrized in terms of three Euler angles each taking ~L distinct values, where L is the bandlimit of the convolution output. Storing thousands of such objects, one for each beam, requires storage capacity approaching the petabyte scale.

In this paper, we show that regardless of the details of the convolution problem, or the algorithm used for performing the convolution, the computational costs and storage requirements associated with multiple convolutions can be considerably reduced as long as the set of convolution kernels contains linearly compressible redundancy. Our approach exploits the linearity of the convolution operation to represent the set of convolution kernels in terms of an often much smaller set of optimal basis kernels. We demonstrate that this approach can greatly accelerate several examples taken from CMB data simulation and analysis.

Approaches based on singular value decompositions (SVD) have already proven very successful in observational astronomy to correct imaging data for spatially varying point spread functions (e.g., Lupton et al. 2001; Lauer 2002). Likewise, SVDs have been used to accelerate the search for gravitational wave signatures (e.g., Cannon et al. 2010) using precomputed templates (Jaranowski & Królak 2005). In this paper, we show these methods to be special cases of a more general approach that returns a signal-to-noise eigenbasis that achieves optimal acceleration and compression for a given accuracy goal.

The paper is organized as follows. In Sect. 2, we introduce the mathematical foundations of our method. Using existing spaceborne and ground based CMB experiments as an example, we then analyze the performance of the compressed convolution scheme when applied to the beam convolution problem (Sect. 3). After outlining the scope of our algorithm in Sect. 4, we summarize our findings in Sect. 5.

2. Method

Starting from the defining equation of the convolution integral, we first review the basic concept of the algorithm. Given a raw data map d(x), the convolved object (time stream, map) s(x) is derived by convolution with a kernel K(x,y),

$\begin{eqnarray} \label{eq:conv} s(\boldsymbol{x}) = \int K(\boldsymbol{x}, \boldsymbol{y}) d(\boldsymbol{y}) {\rm d}\boldsymbol{y} . \end{eqnarray}$ (1)

Note that, without loss of generality, we focus on the convolution of two-dimensional data sets in this paper.

2.1. Overview

In practice, a continuous signal is usually measured only on a finite number of discrete pixels. We therefore approximate the integral in Eq. (1) by a sum in what follows,

$\begin{eqnarray} \label{eq:conv_discrete} s_i = \sum_j K_{i, j} d_j = \left(R_{i} k\right)^{\dagger} d . \end{eqnarray}$ (2)

For our subsequent analysis, we introduced the operator R in the latter equation such that R_i k is the ith row of the convolution matrix, constructed from the convolution kernel K.

For any complete set of basis functions { φ¹,...,φ^N }, there exists a unique set of coefficients { λ¹,...,λ^N }, such that

$\begin{eqnarray} K_{i, j} = \sum_k \lambda^k \widehat{K}(\phi^k)_{i, j} , \end{eqnarray}$ (3)

i.e., we do a basis transformation of the kernel from the standard basis to the basis given by the { φ^k }.

Taking advantage of the linearity of the convolution operation, Eq. (2) can then be transformed to read

$\begin{eqnarray} \label{eq:conv_decomposition} s_i &=& \sum_j \left( \sum_k \lambda^k \widehat{K}(\phi^k)_{i, j} \right) d_j = \sum_k \lambda^k \left( \sum_j \widehat{K}(\phi^k)_{i, j} \, d_j \right) \nonumber \\ &=& \sum_k \lambda^k s_i^k , \end{eqnarray}$ (4)

where the s^k are the raw input map convolved with the kth mode of the basis functions themselves. That is, the final convolution outputs are now expressed in terms of a weighted sum of individually convolved input maps with a set of basis kernels.

We note that for a single convolution operation, the decomposition of the convolution kernel into multiple basis functions in Eq. (4) cannot decrease the numerical costs of the operation. However, potential performance improvements can be realized if multiple convolutions are to be calculated, as we will discuss in the following.

Consider the particular problem where a single raw map d should be convolved with n_tot different convolution kernels, i.e., we want to compute

$\begin{eqnarray} s_i^{(n)} = \sum_j K_{i, j}^{(n)} d_j , \end{eqnarray}$ (5)

where we introduced the kernel ID n ∈ { 1,...,n_tot } as a running index.

Applying the kernel decomposition into a common set of basis functions, Eq. (4) now reads

$\begin{eqnarray} \label{eq:kernel_expansion} s_i^{(n)} = \sum_k \lambda^{(n), k} s_i^k . \end{eqnarray}$ (6)

This finding builds the foundation of our fast algorithm: the numerically expensive convolution operations are applied only to a limited number of basis modes used in the expansion. The computational cost is therefore largely independent of the total number of kernels, n_tot, since each individual solution is constructed very efficiently via a simple linear combination out of a set of precomputed convolution outputs.
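
The identity in Eq. (6) can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes circular 2D convolution via FFT as a stand-in for whatever convolution routine is actually used, and the basis kernels, coefficients, and names (`circular_convolve`, `basis`, `lam`) are hypothetical random test inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def circular_convolve(kernel, data):
    """Circular 2D convolution via FFT (stand-in for any convolution routine)."""
    return np.real(np.fft.ifft2(np.fft.fft2(kernel) * np.fft.fft2(data)))

n_pix, n_tot, n_basis = 32, 50, 3

# Hypothetical basis kernels phi^k and expansion coefficients lambda^{(n),k}.
basis = rng.normal(size=(n_basis, n_pix, n_pix))
lam = rng.normal(size=(n_tot, n_basis))

# Kernels built as K^{(n)} = sum_k lambda^{(n),k} phi^k, cf. Eq. (3).
kernels = np.tensordot(lam, basis, axes=1)

data = rng.normal(size=(n_pix, n_pix))

# Expensive step: only n_basis convolutions, giving the s^k of Eq. (4).
s_modes = np.array([circular_convolve(phi, data) for phi in basis])

# Cheap step: one linear combination per kernel, Eq. (6).
s_compressed = np.tensordot(lam, s_modes, axes=1)

# Brute force for comparison: n_tot convolutions, Eq. (5).
s_direct = np.array([circular_convolve(k, data) for k in kernels])

max_diff = np.max(np.abs(s_compressed - s_direct))
```

In this toy setup the kernels lie exactly in the span of three basis kernels, so the two results agree to floating point precision while the number of convolutions drops from 50 to 3.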

2.2. Optimal kernel expansion

For the kernel decomposition in Eq. (6) to be useful in practice, we have to restrict the total number of basis modes for which the convolution is calculated explicitly. To find the optimal expansion, i.e., the basis set with the smallest number of modes for a predefined truncation error, we first define the weighted sum of the expected covariance of all the elements of the convolution output

$\begin{eqnarray} \label{eq:error_measure} \sigma^2 = \left \langle \sum_{(n)} \sum_{i, \, i^{\prime}} s_i^{(n)} N^{(n)\,-1}_{i \, i^{\prime}} s_{i^{\prime}}^{(n)} \right \rangle . \end{eqnarray}$ (7)

Here, we have introduced a real symmetric weighting matrix, N⁽ⁿ⁾, which allows us to specify what aspects of the convolved maps we require to be accurate. For the case of convolving to simulate CMB data, a natural choice for N⁽ⁿ⁾ would be the noise covariance for the nth channel. It ensures that any given channel will be simulated at sufficient accuracy and that after the addition of instrumental noise, the statistics of the resulting simulation are indistinguishable from an exact simulation.

It is now easy to see how to decompose the kernels into a basis in such a way as to concentrate the largest amount of variance in the first basis elements. Define the Hermitian matrix

$\begin{eqnarray} \label{eq:weighted_covariance} M_{nm} &=& \left \langle \sum_{i, \, i^{\prime}, \, i^{\prime \prime}} N^{(n) \, -\frac{1}{2}}_{i^\prime \, i} s_i^{(n)} N^{(m) \, -\frac{1}{2}}_{i^{\prime} \, i^{\prime \prime}} s_{i^{\prime \prime}}^{(m)} \right \rangle \nonumber \\ &=& \sum_{i, \, i^{\prime}} \left[ \left( N^{(n) \, -\frac{1}{2}} \right)^{\dagger} N^{(m) \, -\frac{1}{2}} \right]_{i \, i^{\prime}} \left( R_{i^{\prime}} k^{(m)} \right) \mathbf{C} \left( R_{i} k^{(n)} \right)^{\dagger} , \end{eqnarray}$ (8)

where C is the covariance of the input signal and $\mathbf{N}^{(n)\,\frac{1}{2}}$ is any matrix such that $(\mathbf{N}^{(n) \, \frac{1}{2}})^{\dagger} \mathbf{N}^{(n) \, \frac{1}{2}} = \mathbf{N}^{(n)}$.

Then we can rewrite the scalar Eq. (7) as a matrix trace over the kernel IDs

$\begin{eqnarray} \label{eq:error_criterion} \sigma^2 &=& \left\langle \mathrm{tr} \left( M \right) \right\rangle = \sum_{n, \, i, \, i^{\prime}} N^{(n)\,-1}_{i \, i^{\prime}} \left( R_{i^{\prime}} k^{(n)} \right) \mathbf{C} \left( R_{i} k^{(n)} \right)^{\dagger} . \end{eqnarray}$ (9)

Since the matrix in Eq. (9) is Hermitian, its ordered diagonal elements cannot decrease faster than its ordered eigenvalues by Schur’s theorem. Finding the eigensystem of M therefore results in the kernel decomposition that converges faster than any other decomposition to the result of the direct computation. In other words, the decomposition is optimal because discarding the eigenmodes with the smallest eigenvalues results in the smallest possible change in the overall signal power.

If we denote the eigenvectors of M by u^r, with corresponding eigenvalues ν_r, the optimal compression kernel eigenmodes are given by $\phi^{(n)}_{i} = \sum_{m} u^{(n)}_{(m)} k_{i}^{(m)}$, and the mean square truncation error is the sum of the truncated eigenvalues.

Considering the CMB case of a convolution on the sphere with azimuthally symmetric convolution kernels and multipole-dependent diagonal weights, N_ℓ, Eq. (8) simplifies to

$\begin{eqnarray} M_{nm} = \sum_{\ell} \frac{2\ell+1}{4\pi} \left( \frac{\mathcal{C}_{\ell}}{\sqrt{N^{(n)}_{\ell} N^{(m)}_{\ell}}} \right) K^{(n)}_{\ell} K^{(m)}_{\ell} , \end{eqnarray}$ (10)

and Eq. (9) becomes

$\begin{eqnarray} \sigma^{2} = \sum_{\ell, \, n} \frac{2\ell+1}{4\pi} \left( \frac{\mathcal{C}_{\ell}}{N^{(n)}_{\ell}} \right) K^{(n)}_{\ell} K^{(n)}_{\ell} , \end{eqnarray}$ (11)

which clearly shows the signal-to-noise weighting at work.
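
As an illustration of Eqs. (10) and (11), the sketch below builds M for a small set of azimuthally symmetric kernels and diagonalizes it. All inputs are assumptions made for the example: Gaussian beams with FWHM between 4 and 6 arcmin (toy values, not instrument beams), a signal spectrum C_ℓ ∝ 1/(ℓ(ℓ+1)), and white noise N_ℓ = 1.

```python
import numpy as np

ell = np.arange(2, 2001)
n_tot = 6

# Assumed toy inputs: Gaussian beams of slightly different width.
fwhm_arcmin = np.linspace(4.0, 6.0, n_tot)
sigma = np.radians(fwhm_arcmin / 60.0) / np.sqrt(8.0 * np.log(2.0))
K = np.exp(-0.5 * ell * (ell + 1) * sigma[:, None] ** 2)  # K^(n)_ell, shape (n_tot, n_ell)

C = 1.0 / (ell * (ell + 1.0))     # assumed signal power spectrum
N = np.ones((n_tot, ell.size))    # assumed white noise weights N^(n)_ell

# A[n, ell] = sqrt((2 ell + 1) / (4 pi) * C_ell / N^(n)_ell) * K^(n)_ell,
# so that M = A A^T reproduces Eq. (10).
A = np.sqrt((2 * ell + 1) / (4 * np.pi) * C / N) * K
M = A @ A.T

# Eq. (11): the weighted variance, which equals the trace of M.
sigma2 = np.sum((2 * ell + 1) / (4 * np.pi) * C / N * K**2)

# Eigen-decomposition, sorted so that the most important mode comes first.
nu, u = np.linalg.eigh(M)
nu, u = nu[::-1], u[:, ::-1]

# Eigenmodes phi^(r)_ell = sum_m u^(r)_(m) K^(m)_ell.
phi = u.T @ K
```

The fraction `nu / nu.sum()` then indicates how much of the weighted variance each mode carries.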

Note that the expression for the variance can be promoted to a matrix in a dual way,

$\begin{eqnarray} \label{eq:dual_covmatrix} M_{\ell \ell^{\prime}} = \sum_{n}\sqrt{ \frac{2 \ell + 1}{4 \pi} \frac{\mathcal{C}_{\ell}}{ N^{(n)}_{\ell}}}\sqrt{\frac{2 \ell^{\prime} + 1}{4 \pi} \frac{\mathcal{C}_{\ell^{\prime}}}{ N^{(n)}_{\ell^{\prime}}}} K_{\ell}^{(n)} K_{\ell^{\prime}}^{(n)} , \end{eqnarray}$ (12)

which gives rise to an alternative way to calculate the optimal compression basis.

This dual approach will be computationally more convenient than the other approach if the number of kernels is larger than the number of multipoles in the ℓ-range considered. The resulting compression scheme will be identical in both cases. This is so because both approaches are optimal by Schur’s theorem and each gives a unique answer if none of the eigenvalues are degenerate¹.
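
The equivalence of the two formulations can be illustrated numerically: writing A_{nℓ} = sqrt((2ℓ+1)/(4π) C_ℓ / N^(n)_ℓ) K^(n)_ℓ, Eq. (10) is A A^T and Eq. (12) is A^T A, which share their non-zero eigenvalues. The sketch below uses hypothetical random kernels and weights, chosen only so that there are more kernels than multipoles.

```python
import numpy as np

rng = np.random.default_rng(1)
n_tot, n_ell = 40, 12                 # more kernels than multipoles

ell = np.arange(2, 2 + n_ell)
K = rng.normal(size=(n_tot, n_ell))   # hypothetical kernels K^(n)_ell
C = 1.0 / (ell * (ell + 1.0))         # assumed signal power spectrum
N = rng.uniform(0.5, 2.0, size=(n_tot, n_ell))  # assumed weights N^(n)_ell

A = np.sqrt((2 * ell + 1) / (4 * np.pi) * C / N) * K

M_kernel = A @ A.T   # Eq. (10): n_tot x n_tot
M_dual = A.T @ A     # Eq. (12): n_ell x n_ell, cheaper when n_ell < n_tot

nu_kernel = np.sort(np.linalg.eigvalsh(M_kernel))[::-1]
nu_dual = np.sort(np.linalg.eigvalsh(M_dual))[::-1]
```

Here `nu_kernel[:n_ell]` matches `nu_dual`, and the remaining eigenvalues of the larger matrix vanish up to rounding, so only the smaller of the two matrices needs to be diagonalized.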

2.3. Truncation error estimates

In case the kernels are of similar shape, or differ only in regimes that are irrelevant due to low signal-to-noise, the eigenvalues of the individual modes will decrease quickly. As a result, we can truncate the expansion in Eq. (6) at n_modes ≪ n_tot. This will induce a mean square truncation error in the weighted variance of the convolution products of $\sum_{r\,=\,n_{\mathrm{modes}}\,+\,1}^{n_{\mathrm{tot}}} \nu_{r}$.

The error ΔK introduced by the truncation can be calculated for each kernel explicitly,

$\begin{eqnarray} \label{eq:error} \Delta K_{i, j} = \sum_{k \,=\, n_{\mathrm{modes}} \,+\, 1}^{n_{\mathrm{tot}}} \lambda^k \widehat{K}(\phi^k)_{i, j} . \end{eqnarray}$ (13)

For the convolution of a data set with power spectrum $\mathcal{C}_{\ell}$ on the sphere, for example, the mean square error will then amount to

$\begin{eqnarray} \label{eq:total_error} \sigma^2_{\mathrm{total}} = \sum_{\ell \,=\, 0}^{\ell_{\max}} \frac{(2\ell + 1)}{4\pi} \, \Delta K_{\ell}^2 \, \mathcal{C}_{\ell} , \end{eqnarray}$ (14)

where ΔK_ℓ is the expansion of the beam truncation error into Legendre polynomials.
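
The link between Eq. (14) and the discarded eigenvalues can be verified in a toy setting. The sketch below assumes uniform noise weights (N_ℓ = 1 for every kernel), a spectrum C_ℓ ∝ 1/(ℓ(ℓ+1)), and Gaussian test beams with FWHM between 4 and 6 arcmin; none of these values come from an instrument model.

```python
import numpy as np

ell = np.arange(2, 2001)
n_tot, n_modes = 6, 3

# Assumed toy inputs: Gaussian beams, power-law spectrum, N_ell = 1.
fwhm_arcmin = np.linspace(4.0, 6.0, n_tot)
sigma = np.radians(fwhm_arcmin / 60.0) / np.sqrt(8.0 * np.log(2.0))
K = np.exp(-0.5 * ell * (ell + 1) * sigma[:, None] ** 2)
C = 1.0 / (ell * (ell + 1.0))
w2 = (2 * ell + 1) / (4 * np.pi) * C      # weight appearing in Eq. (14)

# Eq. (10) with N_ell = 1, and its eigen-decomposition (largest mode first).
M = (K * w2) @ K.T
nu, u = np.linalg.eigh(M)
nu, u = nu[::-1], u[:, ::-1]

# Projector onto the retained modes, acting on the kernel index.
P = u[:, :n_modes] @ u[:, :n_modes].T

# Harmonic-space truncation error of each kernel, cf. Eq. (13).
delta_K = K - P @ K

# Eq. (14), evaluated separately for every kernel.
sigma2_total = np.sum(w2 * delta_K**2, axis=1)
```

Summed over all kernels, `sigma2_total` agrees with `nu[n_modes:].sum()`, the sum of the truncated eigenvalues, while the individual entries show which kernel is reconstructed worst.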

2.4. Connection to the SVD

While Eq. (9) provides us with the optimal kernel decomposition, the power spectrum of the data or their noise properties to construct the kernel weights may not necessarily be known in advance. For uniform weightings, N ∝1, and assuming a flat signal power spectrum, the equation simplifies and we obtain the mode expansion from a singular value decomposition of the collection of kernels.

Although not strictly optimal, we note that it is possible to obtain good results with this simplified approach in practice. To compute the kernel expansion, we reshape the convolution kernels into one-dimensional arrays of length m and arrange them into a common matrix T, with size n_tot × m. The singular value decomposition of this matrix,

$\begin{eqnarray} \label{eq:svd} \mathbf{T} = \mathbf{U} \mathbf{D} \mathbf{V}^{\dagger} , \end{eqnarray}$ (15)

computes the n_tot × n_tot matrix U, the n_tot × m matrix D, and the m × m matrix V. The decomposition then provides us with a set of basis functions, returned in the columns of V. Their relative importance is indicated by the entries of the diagonal matrix D, and their individual coefficients λ are stored in U.
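
A minimal sketch of this SVD step with `numpy.linalg.svd` is given below. The kernel family (circular Gaussians of slightly varying width on a 16 × 16 grid) and the choice `n_modes = 4` are assumptions made for illustration. In this sketch the singular values are absorbed into the coefficients, i.e., λ is taken as the product of U and D restricted to the retained modes.

```python
import numpy as np

rng = np.random.default_rng(2)
n_tot, side = 200, 16
m = side * side

# Hypothetical kernel family: circular Gaussians of slightly varying width.
y, x = np.mgrid[:side, :side] - (side - 1) / 2.0
widths = rng.uniform(2.0, 2.5, size=n_tot)
kernels = np.exp(-0.5 * (x**2 + y**2) / widths[:, None, None] ** 2)

# Flatten each kernel into one row of T (n_tot x m).
T = kernels.reshape(n_tot, m)

# Thin SVD: rows of Vh are the basis functions (columns of V),
# d holds the singular values in decreasing order.
U, d, Vh = np.linalg.svd(T, full_matrices=False)

n_modes = 4
lam = U[:, :n_modes] * d[:n_modes]   # coefficients lambda^{(n),k}
basis = Vh[:n_modes]                 # retained basis kernels
T_approx = lam @ basis

# Squared Frobenius error of the truncated expansion.
err2 = np.sum((T - T_approx) ** 2)
```

The squared error `err2` equals the sum of the squared discarded singular values, `np.sum(d[n_modes:] ** 2)`, which is the uniform-weight analog of summing the truncated eigenvalues.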

2.5. Summary

In summary, the individual steps of the algorithm are as follows: we first find the eigenmode decomposition of the set of convolution kernels using either the optimal expansion criterion or a simplified singular value decomposition. Then, we identify the number of modes to retain to comply with the accuracy goal. As a next step, we perform the convolution of the input map for each eigenmode separately. To obtain the final results, we compute the linear combination of the convolved maps with optimal weights for each kernel.
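
The four steps can be put together in a short sketch. It is an illustration under simplifying assumptions rather than the implementation used for the results in this paper: it uses the simplified SVD variant, one-dimensional circular convolution via FFT in place of spherical harmonic transforms, and a hypothetical function name `compressed_convolve` with a hypothetical relative power tolerance `rel_tol`.

```python
import numpy as np

def compressed_convolve(kernels, data, rel_tol=1e-6):
    """Convolve `data` (n_pix,) with all `kernels` (n_tot, n_pix).

    Returns the convolved outputs (n_tot, n_pix) and the number of modes used.
    """
    # Step 1: eigenmode decomposition (simplified SVD variant).
    U, d, Vh = np.linalg.svd(kernels, full_matrices=False)

    # Step 2: keep the fewest modes whose discarded power is <= rel_tol.
    power = d**2
    tail = np.cumsum(power[::-1])[::-1]   # tail[i] = power in modes i, i+1, ...
    n_modes = int(np.sum(tail > rel_tol * tail[0]))

    # Step 3: convolve the input once per retained eigenmode.
    data_ft = np.fft.rfft(data)
    modes_ft = np.fft.rfft(Vh[:n_modes], axis=1)
    s_modes = np.fft.irfft(modes_ft * data_ft, n=data.size, axis=1)

    # Step 4: linear combination with the weights of each kernel.
    lam = U[:, :n_modes] * d[:n_modes]
    return lam @ s_modes, n_modes

# Usage with hypothetical Gaussian kernels centered on pixel 0.
rng = np.random.default_rng(3)
n_tot, n_pix = 300, 256
x = np.arange(n_pix)
dist = np.minimum(x, n_pix - x)           # circular distance from pixel 0
widths = rng.uniform(4.0, 5.0, size=n_tot)
kernels = np.exp(-0.5 * (dist / widths[:, None]) ** 2)
data = rng.normal(size=n_pix)

s_fast, n_modes = compressed_convolve(kernels, data)

# Brute-force reference: one convolution per kernel.
s_exact = np.fft.irfft(np.fft.rfft(kernels, axis=1) * np.fft.rfft(data),
                       n=n_pix, axis=1)
rel_err = np.linalg.norm(s_fast - s_exact) / np.linalg.norm(s_exact)
```

The tolerance is expressed as a fraction of the total kernel power, so tightening `rel_tol` trades speed for accuracy in the way described above.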

It is worth noting that compressed convolution can never increase the computational time required for convolution, except possibly for some overhead of sub-leading order, attributed to the calculation of the optimal kernel expansion (this computation has to be done only once for a given set of kernels). This can be seen explicitly in the worst case scenario of strictly orthogonal kernels: all modes must be retained and the method becomes equivalent to the brute force approach.

Fig. 1

All six Planck beams at 217 GHz (left panel) have very similar shapes. As a result, the eigenvalues of their singular value decomposition decrease quickly (right panel), allowing half of the modes to be safely discarded.

Fig. 2

Left panel: retaining the first three out of six Planck 217 GHz beam eigenmodes reduces the relative truncation error of all convolution kernels to the order $\mathcal{O}(10^{-3})$. Right panel: we compare the eigenmodes used in the convolution (solid lines) to the discarded modes (dotted lines). Results in this plot have been obtained from an SVD, i.e., using kernel weights $(2 \ell + 1) \mathcal{C}_{\ell} / N^{(n)}_{\ell} = {\rm const.}$

Fig. 3

Kernel weights allow for a full control over truncation errors. Same as Fig. 2, but for a $(2 \ell + 1) \mathcal{C}_{\ell} / N^{(n)}_{\ell} \propto (2 \ell + 1) / (\ell \, (\ell + 1))$ weighting scheme, enforcing a more precise kernel reconstruction on large angular scales at the cost of increased errors at high multipoles.

3. Application to CMB experiments

After having outlined the basic principle of the algorithm, we now analyze the performance of the method when applied to the beam convolution operation of current CMB experiments.

3.1. Planck

We use the third generation CMB satellite experiment Planck (Planck Collaboration 2011) as a first example to illustrate the application of the algorithm. We make use of the 217 GHz HFI instrument (Planck HFI Core Team 2011) and consider the beam convolution problem of CMB simulations. Azimuthally symmetrized beam functions for the six individual detectors at that frequency are available from the reduced instrument model (Planck Collaboration 2014).

A comparison of the eigenvalues of a singular value decomposition reveals that the beam shapes are sufficiently similar to be represented with only a limited number of basis functions (Fig. 1). As shown in Fig. 2, selecting the first three eigenmodes for a reconstruction is sufficient to represent the beams to an accuracy of the order $\mathcal{O}(10^{-3})$, better than the typical precision to which the beams are known.

To illustrate the impact of the weighting scheme, we also show the resulting eigenmodes using the optimal kernel expansion (Eq. (12)) in Fig. 3. Here, we assumed a white noise power spectrum, N_ℓ = const., in combination with a signal covariance of $\mathcal{C}_{\ell} \propto 1/(\ell \, (\ell + 1))$, reflecting the approximate scaling behavior of the CMB power spectrum.

We chose the beam with the largest reconstruction error for an explicit test on simulated CMB signal maps. In Fig. 4, we plot the difference map computed from the brute force beam convolution and the compressed convolution with three eigenmodes. A power spectrum analysis confirms that the truncation induced errors are clearly subdominant on all angular scales.

Fig. 4

Truncation errors are negligible. Using the kernel with the largest truncation error as worst case scenario, we plot the beam convolved simulated CMB map used in this test of the Planck 217 GHz channels (left panel, we show a 10° × 10° patch). Middle panel: the difference map between the results obtained with the exact convolution and the compressed convolution with three beam modes. Right panel: compared to the fiducial power spectrum of the input map (dashed line), the power spectrum of the difference map is subdominant by a large margin on all angular scales.

The test demonstrates that the algorithm can be applied straightforwardly to the beam convolution problem. In case of the six Planck 217 GHz detectors, we reduce the number of computationally expensive spherical harmonic transformations by a factor of two. This finding is characteristic of the scope of the algorithm: for a small total number of convolution kernels, the reductions in computational costs can only be modest. However, already for the latest generation of CMB instruments, the compressed convolution scheme can offer very large performance improvements, as we will demonstrate explicitly in the next section.

3.2. Keck

As an example of modern ground based and balloon-borne CMB experiments, we now discuss the application of the algorithm to the Keck array, a polarization sensitive experiment located at the South Pole that started data taking in 2010 (Sheehy et al. 2010). Its instrument currently consists of five separate receivers, each housing 496 detectors and scanning the sky at a common frequency of 150 GHz.

Measurements have shown that the 2480 Keck beams can be described by elliptic Gaussian profiles to good approximation (Vieregg et al. 2012),

$\begin{eqnarray} \label{eq:k_beams} K(\boldsymbol{x}) \propto \mathrm{e}^{-\frac{1}{2}(\boldsymbol{x} - \boldsymbol{x_0})^{\dagger} \mathbf{C}^{-1} (\boldsymbol{x} - \boldsymbol{x_0})} , \end{eqnarray}$ (16)

where the beam center is located at x₀. Here, the beam size and ellipticity are parametrized by the covariance matrix,

$\begin{eqnarray} \mathbf{C} = \sigma^2 \left( \begin{array}{cc} 1 + \epsilon & 0 \\ 0 & 1 - \epsilon \end{array} \right) , \end{eqnarray}$ (17)

with the receiver specific parameters σ and ϵ reproduced in Table 1.

To simulate the optical system of the full Keck array, we drew 2480 realizations of beam size and ellipticity according to the receiver specifications and then used Eq. (16) to construct individual beams. We finally rotated the beams around their axes by randomly chosen angles 0 ≤ φ < 2π. Applying fully random rotations is conservative since beams of bolometers in the same receiver are known to have similar orientations.
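
The construction of such a beam set can be sketched as follows. The function name `elliptical_gaussian_beam`, the grid size, the number of beams, and the ranges for σ and ϵ (in pixel units) are hypothetical choices for illustration; the receiver values of Table 1 are not reproduced here, and the sketch uses a plain SVD of the beam collection.

```python
import numpy as np

def elliptical_gaussian_beam(side, sigma, eps, angle):
    """Peak-normalized beam of Eq. (16) with the covariance of Eq. (17),
    rotated by `angle` (radians) about the beam center."""
    y, x = np.mgrid[:side, :side] - (side - 1) / 2.0
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    cov = sigma**2 * rot @ np.diag([1.0 + eps, 1.0 - eps]) @ rot.T
    cov_inv = np.linalg.inv(cov)
    r = np.stack([x, y], axis=-1)                       # shape (side, side, 2)
    chi2 = np.einsum("...i,ij,...j->...", r, cov_inv, r)
    return np.exp(-0.5 * chi2)

rng = np.random.default_rng(4)
n_beams, side = 500, 24

# Hypothetical parameter ranges in pixel units (not the Table 1 values).
sigmas = rng.uniform(3.9, 4.1, size=n_beams)
epsilons = rng.uniform(0.0, 0.04, size=n_beams)
angles = rng.uniform(0.0, 2.0 * np.pi, size=n_beams)   # fully random rotations

beams = np.array([elliptical_gaussian_beam(side, sig, eps, ang)
                  for sig, eps, ang in zip(sigmas, epsilons, angles)])

# SVD of the beam collection and reconstruction from the leading modes.
T = beams.reshape(n_beams, -1)
U, d, Vh = np.linalg.svd(T, full_matrices=False)

n_modes = 8
T_approx = (U[:, :n_modes] * d[:n_modes]) @ Vh[:n_modes]

# Largest absolute reconstruction error; beams have unit peak amplitude.
max_abs_err = np.max(np.abs(T - T_approx))
```

Inspecting `d` and `max_abs_err` for different `n_modes` shows how quickly the expansion converges for a given spread in beam parameters.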

We found that only the first eight common eigenmodes are necessary to approximate all 2480 individual beams to a precision of at least the order $\mathcal{O}(10^{-3})$. We illustrate this set of eigenmodes in Fig. 5. In Fig. 6, we show as an example the beam with the largest reconstruction error. For about 90% of the detectors, the truncation errors are below $\mathcal{O}(10^{-4})$.

We verified the results with a CMB simulation in flat sky approximation, high-pass filtered to suppress signal at ℓ < 50. We plot the difference map computed from a direct convolution and the compressed convolution with eight eigenmodes in Fig. 7. The error is subdominant on all angular scales.

The example outlined here demonstrates the full strength of the algorithm. Computing beam convolutions for the Keck array, we are able to reduce the number of computationally expensive convolution operations from 2480 to only eight, an improvement by a factor as high as 310.

Table 1

Keck beam parameters as provided by Vieregg et al. (2012).

Fig. 5

Simulated 2480 asymmetric Keck beams at 150 GHz are similar enough to be represented by only eight distinct beam eigenmodes to high precision.

Fig. 6

Left panel: we show the beam with the largest reconstruction error for the simulated Keck array. Right panel: using the first eight beam eigenmodes, the truncation error is at most of the order $\mathcal{O}(10^{-3})$.

Fig. 7

Same as Fig. 4, but for the worst case of the simulated Keck experiment. Using eight beam modes for the convolution is sufficient to reduce the truncation error to negligible levels.

4. Scope of the algorithm

As shown in Sect. 3, the algorithm has the potential to provide huge speedups for the beam convolution operation of modern experiments with a large number of detectors, necessary to improve the sensitivity of CMB measurements in the photon noise limited regime. Fast beam convolutions are not only important for the simulation of signal maps for individual detectors. They also play a crucial role in the mapmaking process, the iterative construction of a common sky map out of the time ordered data from different detectors observing at the same frequency.

Current experiments already deploy several hundreds to thousands of detectors, making them ideal candidates for the algorithm, e.g., SPTpol (about 800 pixels, Austermann et al. 2012), POLARBEAR (about 1300 pixels, Kermish et al. 2012), EBEX (about 1400 pixels, Reichborn-Kjennerud et al. 2010), Spider (about 2600 pixels, Filippini et al. 2010), ACTPol (about 3000 pixels, Niemack et al. 2010). For future experiments, the number of detectors can be expected to increase further, e.g., for PIPER (about 5000 pixels, Lazear et al. 2013), the Cosmic Origins Explorer (about 6000 pixels, The COrE Collaboration 2011), or POLARBEAR-2 (about 7500 pixels, Tomaru et al. 2012), making the application of the algorithm even more rewarding.

The new method also allows a fast implementation of matched filtering on the sphere (or other domains) if the size of the target is unknown (e.g., to detect signatures of bubble collisions in the CMB, McEwen et al. 2012), or analogously for continuous wavelet transforms, frequently used in the context of data compression or pattern recognition (e.g., Mallat 1989). Here, the input signal is convolved with a large set of scale dilations of an analyzing filter or wavelet. Since the resulting convolution kernels are of similar shape by construction, the decomposition into only a few eigenmodes can be done efficiently. Our new method therefore has the potential to increase the numerical performance of such computations by a substantial factor.

Finally, besides the reduction in computational costs, we note that compressed convolution may also offer the possibility to reduce the disk space required to store convolved data sets. Instead of saving the convolved signal for each kernel separately, it now becomes possible to just keep the compressed output for the most important eigenmodes, and efficiently decompress it with their proper weights for each individual kernel on the fly as needed.
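
The storage argument amounts to simple bookkeeping, sketched below. The helper names `storage_bytes` and `decompress`, the 8-byte values, and the map size (12 × 2048² pixels, as for a HEALPix map at N_side = 2048) are assumptions for illustration; the kernel and mode counts are those of the Keck example in Sect. 3.2.

```python
import numpy as np

def storage_bytes(n_tot, n_modes, n_pix, bytes_per_value=8):
    """Bytes needed to store all convolved maps directly versus the
    compressed form (n_modes convolved mode maps plus the coefficients)."""
    full = n_tot * n_pix * bytes_per_value
    compressed = (n_modes * n_pix + n_tot * n_modes) * bytes_per_value
    return full, compressed

def decompress(lam_n, s_modes):
    """Recover the convolved map of one kernel from the stored mode maps.

    lam_n: coefficients of that kernel, shape (n_modes,)
    s_modes: stored convolved mode maps, shape (n_modes, n_pix)
    """
    return lam_n @ s_modes

# Hypothetical map size; kernel and mode counts as in the Keck example.
n_pix = 12 * 2048**2
full, compressed = storage_bytes(n_tot=2480, n_modes=8, n_pix=n_pix)
ratio = full / compressed
```

Because the coefficient table is tiny compared to the maps, the ratio approaches n_tot / n_modes, and decompressing one output costs a single weighted sum over the stored mode maps.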

5. Summary

In signal processing, a single data set often has to be convolved with many different kernels. With increasing data size, this operation quickly becomes numerically expensive to evaluate, possibly even dominating the execution time of analysis pipelines.

To increase the performance of such convolution operations, we introduced the general method of compressed convolution. Using an eigenvector decomposition of the convolution kernels, we first obtain their optimal expansion into a common set of basis functions. After ordering the modes according to their relative importance, we identify the minimal number of basis functions to retain to satisfy the accuracy requirements. Then, the convolution operation is executed for each mode separately, and the final result obtained for each kernel from a linear combination.

This algorithm offers particularly large performance improvements if

the total number of kernels to consider is large, and
the kernels are sufficiently similar in shape, such that they can be approximated to good precision with only a few eigenmodes.

In the analysis of CMB data, we used the beam convolution problem as an example application of the compressed convolution scheme. On the basis of simulations of the Keck array with 2480 detectors (Vieregg et al. 2012), we demonstrated that the compressed convolution scheme reduces the number of beam convolution operations by a factor of about 300, offering the possibility to cut the runtime of convolution pipelines by orders of magnitude. Additional improvements are possible when used in combination with efficient convolution algorithms (e.g., Elsner & Wandelt 2011).

¹ If some eigenvalues do happen to be degenerate then the solutions will differ in ways that are not relevant to the compression efficiency.

Acknowledgments

The authors thank Clem Pryke for highlighting beam convolution for kilo-detector experiments as an outstanding problem, and Guillaume Faye and Xavier Siemens for conversations regarding the applications to searches for gravitational wave signals. B.D.W. was supported by the ANR Chaire d’Excellence and NSF grants AST 07-08849 and AST 09-08902 during this work. F.E. gratefully acknowledges funding by the CNRS. Some of the results in this paper have been derived using the HEALPix (Górski et al. 2005) package. Based on observations obtained with Planck (http://www.esa.int/Planck), an ESA science mission with instruments and contributions directly funded by ESA Member States, NASA, and Canada. The development of Planck has been supported by: ESA; CNES and CNRS/INSU-IN2P3-INP (France); ASI, CNR, and INAF (Italy); NASA and DoE (USA); STFC and UKSA (UK); CSIC, MICINN and JA (Spain); Tekes, AoF and CSC (Finland); DLR and MPG (Germany); CSA (Canada); DTU Space (Denmark); SER/SSO (Switzerland); RCN (Norway); SFI (Ireland); FCT/MCTES (Portugal); and PRACE (EU). A description of the Planck Collaboration and a list of its members, including the technical or scientific activities in which they have been involved, can be found at http://www.sciops.esa.int/index.php?project=planck&page=Planck_Collaboration.

References

Austermann, J. E., Aird, K. A., Beall, J. A., et al. 2012, in Proc. SPIE, 8452, 1E
Barreiro, R. B., & Hobson, M. P. 2001, MNRAS, 327, 813
Cannon, K., Chapman, A., Hanna, C., et al. 2010, Phys. Rev. D, 82, 044025
Cayón, L., Sanz, J. L., Barreiro, R. B., et al. 2000, MNRAS, 315, 757
Elsner, F., & Wandelt, B. D. 2011, A&A, 532, A35
Elsner, F., & Wandelt, B. D. 2013, A&A, 549, A111
Filippini, J. P., Ade, P. A. R., Amiri, M., et al. 2010, in Proc. SPIE, 7741, 46
Górski, K. M., Hivon, E., Banday, A. J., et al. 2005, ApJ, 622, 759
Jaranowski, P., & Królak, A. 2005, Liv. Rev. Relativity, 8, 3
Kermish, Z. D., Ade, P., Anthony, A., et al. 2012, in Proc. SPIE, 8452, 1C
Lauer, T. 2002, in Proc. SPIE 4847, eds. J.-L. Starck, & F. D. Murtagh, 167
Lazear, J., Ade, P., Benford, D. J., et al. 2013, in American Astronomical Society Meeting Abstracts, 221, 229.04
Lupton, R., Gunn, J. E., Ivezić, Z., et al. 2001, in Astronomical Data Analysis Software and Systems X, eds. F. R. Harnden, Jr., F. A. Primini, & H. E. Payne, ASP Conf. Ser., 238, 269
Mallat, S. G. 1989, IEEE Trans. Pattern Anal. Mach. Intell., 11, 674
Martínez-González, E., Gallegos, J. E., Argüeso, F., Cayón, L., & Sanz, J. L. 2002, MNRAS, 336, 22
McEwen, J. D., Hobson, M. P., Mortlock, D. J., & Lasenby, A. N. 2007, IEEE Trans. Signal Process., 55, 520
McEwen, J. D., Feeney, S. M., Johnson, M. C., & Peiris, H. V. 2012, Phys. Rev. D, 85, 103502
Natoli, P., de Gasperis, G., Gheller, C., & Vittorio, N. 2001, A&A, 372, 346
Niemack, M. D., Ade, P. A. R., Aguirre, J., et al. 2010, in Proc. SPIE, 7741, 51
Planck Collaboration 2011, A&A, 536, A1
Planck Collaboration 2014, A&A, in press [arXiv:1303.5068]
Planck HFI Core Team 2011, A&A, 536, A4
Reichborn-Kjennerud, B., Aboobaker, A. M., Ade, P., et al. 2010, in Proc. SPIE, 7741, 37
Reinecke, M., & Seljebotn, D. S. 2013, A&A, 554, A112
Sheehy, C. D., Ade, P. A. R., Aikin, R. W., et al. 2010, in Proc. SPIE, 7741, 50
Tegmark, M. 1997, ApJ, 480, L87
Tegmark, M., & de Oliveira-Costa, A. 1998, ApJ, 500, L83
The COrE Collaboration 2011 [arXiv:1102.2181]
Tomaru, T., Hazumi, M., Lee, A. T., et al. 2012, in Proc. SPIE, 8452, 1H
Vieregg, A. G., Ade, P. A. R., Aikin, R., et al. 2012, in Proc. SPIE, 8452, 26
Wandelt, B. D., & Górski, K. M. 2001, Phys. Rev. D, 63, 123002
Wiaux, Y., Jacques, L., Vielva, P., & Vandergheynst, P. 2006, ApJ, 652, 820

All Tables

Table 1

Keck beam parameters as provided by Vieregg et al. (2012).

All Figures

Fig. 1

All six Planck beams at 217 GHz (left panel) have very similar shapes. As a result, the eigenvalues of their singular value decomposition decrease quickly (right panel), allowing half of the modes to be safely discarded.

Fig. 2

Left panel: retaining the first three out of six Planck 217 GHz beam eigenmodes reduces the relative truncation error of all convolution kernels to the order $\mathcal{O}(10^{-3})$. Right panel: we compare the eigenmodes used in the convolution (solid lines) to the discarded modes (dotted lines). Results in this plot have been obtained from an SVD, i.e. using kernel weights $(2 \ell + 1) \, \mathcal{C}_{\ell} / N^{(n)}_{\ell} = \mathrm{const.}$

Fig. 3

Kernel weights allow for full control over truncation errors. Same as Fig. 2, but for a $(2 \ell + 1) \, \mathcal{C}_{\ell} / N^{(n)}_{\ell} \propto (2 \ell + 1) / (\ell \, (\ell + 1))$ weighting scheme, enforcing a more precise kernel reconstruction on large angular scales at the cost of increased errors at high multipoles.

Fig. 4

Truncation errors are negligible. Using the kernel with the largest truncation error as a worst-case scenario, we plot the beam-convolved simulated CMB map used in this test of the Planck 217 GHz channels (left panel, we show a 10° × 10° patch). Middle panel: the difference map between the results obtained with the exact convolution and the compressed convolution with three beam modes. Right panel: compared to the fiducial power spectrum of the input map (dashed line), the power spectrum of the difference map is subdominant by a large margin on all angular scales.

Fig. 5

Simulated 2480 asymmetric Keck beams at 150 GHz are similar enough to be represented by only eight distinct beam eigenmodes to high precision.

Fig. 6

Left panel: we show the beam with the largest reconstruction error for the simulated Keck array. Right panel: using the first eight beam eigenmodes, the truncation error is at most of the order $\mathcal{O}(10^{-3})$.

Fig. 7

Same as Fig. 4, but for the worst case of the simulated Keck experiment. Using eight beam modes for the convolution is sufficient to reduce the truncation error to negligible levels.
