A&A, Volume 620, December 2018, Article Number A151
Section: Numerical methods and codes
DOI: https://doi.org/10.1051/0004-6361/201732388
Published online: 11 December 2018

© ESO 2018

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

The newly launched Hard X-ray Modulation Telescope (Insight-HXMT) is China’s first X-ray astronomical satellite. Insight-HXMT carries three main payloads onboard: the High Energy telescope (HE), the Medium Energy telescope (ME), and the Low Energy telescope (LE) (Zhang et al. 2014). One of its primary scientific objectives is to scan the Galactic plane in order to monitor time-varying objects and to discover X-ray transients (Zhang et al. 2014). Imaging with Insight-HXMT relies on the direct demodulation (DD) method: because all telescopes onboard are position-insensitive collimated detectors, images can only be reconstructed from the observed data by offline data analysis (Li & Wu 1993, 1994; Li 2007; Zhang 2009).

In the analysis of astronomical data, observed data can be modelled as object functions modulated by kernel functions, which characterise the observation process, mainly the instrument response. For example, an observed image is a spatial distribution of objects modulated by an imaging system, where a point spread function (PSF) serves as the modulation kernel function. Observed spectra are object spectra modulated by energy response matrices. The relations among the observed data, the object, and the observation process can be formulated with a Fredholm integral equation of the first kind:

$$ d\left(\boldsymbol{\omega}\right) = \int h\left(\boldsymbol{x}, \boldsymbol{\omega}\right) f\left(\boldsymbol{x}\right) \mathrm{d}\boldsymbol{x}, $$(1)

where d(ω), h(x, ω), and f(x) represent models of the observed data, the kernel, and the object, respectively, and ω and x are coordinate variables for the observed data domain (e.g. pixel indices, energy channels, possible state parameters of instruments, and so on) and the object domain (e.g. WCS celestial sphere coordinates), respectively. The object model f(x) represents both the sources and the sky background. The kernel h(x, ω) characterises the observation process, including the instrument response as well as non-sky backgrounds, for example dark currents, or cosmic rays for high-energy detectors. In order to solve the equation, both periodic calibration of the instrument response and non-sky background estimation are necessary to estimate the kernel. Differences between the true kernel and its estimate lead to systematic errors in the solution, that is, in the estimated object. Discussions of instrument response calibration, non-sky background modelling and estimation, and the systematic errors due to these issues are beyond the scope of this work.

The DD method was introduced by Li & Wu (1993) to solve the modulation equation and thus reconstruct the unknown object from the observed data. Note that Eq. (1) contains a model of the observed data rather than the data themselves. Since the data are a random outcome of the observation process, the DD method solves the modulation equation statistically, according to the probability distribution of the observed data implied by the model. For observed data dominated by Poisson fluctuations, the simplest implementation of the DD method degenerates into Richardson–Lucy (RL) iteration (Richardson 1972; Lucy 1974). As a non-parametric approach, reconstruction through RL iteration allows a larger pool of solutions that fit the observed data than parametric fitting approaches, which require explicit and known models of both signals and backgrounds (Puetter et al. 2005). However, a larger pool of solutions often results in an ill-posed problem, which is unstable or divergent. The DD method overcomes this problem by repeatedly cross-correlating both sides of Eq. (1), transforming the original modulation equation into its L-fold equivalents to improve its positive definiteness, and by applying non-linear physical constraints to regularize the iteration, thereby shrinking the pool of solutions so that the object reconstruction problem becomes tractable.

In real-world data analysis it is often necessary to discard data with a poor signal-to-noise ratio (S/N) through screening and selection; for example, the Good Time Interval (GTI) auxiliary record (Blackburn 1995) is used to select time-series data in Rossi X-ray Timing Explorer (RXTE) as well as Insight-HXMT data analysis. It is equally often desirable to include relevant data from other instruments or epochs in order to improve the statistics. Meanwhile, the dimensionality and irregularity of the corresponding modulation equation increase, which imposes growing complexity on the reconstruction, especially for Insight-HXMT data analysis, where the modulation kernels lack circular symmetry. The computational complexity introduced by rotation modulation was reduced by an angular clustering method specifically designed for the Insight-HXMT all-sky survey (Huo & Zhou 2013), since the rotation modulation can be represented much more sparsely with explicit angular coordinates.

Although the DD method addresses the ill-posed object reconstruction problem, the lack of a modulation-kernel-constructing process and of an accelerated implementation for general cases has prevented the method from being applied to data analysis tasks with high dimensionality or large datasets. In this article we present the synthetic direct demodulation method (synDD), which features

  1. a modulation-kernel-constructing process to combine kernels characterizing individual instruments, observations and/or data screening/selection into one synthetic kernel (modulation equation synthesis process), and

  2. an accelerated implementation for general cases by decomposing an arbitrary kernel matrix into a weighted sum of a series of circulant matrices so that the fast Fourier transform (FFT) can help (modulation kernel matrix analysis process).

Consequently, we can not only deal with the previously mentioned rotation modulation without losing important information sampled at different position angles, and with the projection distortion that occurs on the tangential plane of the celestial sphere, both of which we put aside in previous work (Huo & Zhou 2013), but we can also fold a more complicated kernel (e.g. with screening or weighting matrices) and additional observed data into our representation of the modulation equation.

2. Method

2.1. Modulation equation

In this article, vectors and matrices are set in bold, while regular typeface is used for scalars, plain sequences (i.e. 1D arrays), and multidimensional arrays. Operations on plain arrays are element-wise unless specifically indicated. For instance, the bold $\boldsymbol{x}$ indicates a vector of elements x1, x2, …, xn, while the plain x simply denotes a sequence of these elements. Modular arithmetic is used for subscripts throughout this article.

Consider the algebraic form of Eq. (1)

$$ \boldsymbol{d} = \boldsymbol{H}\boldsymbol{f}, $$(2)

where the N × 1 column vector d and the M × 1 column vector f are discrete samplings of the models of the observed data and the object, respectively, while the N × M matrix H is the modulation kernel matrix, a discrete sampling of the modulation kernel.

2.2. Analysis and synthesis of modulation equation

Both N and M in Eq. (2) indicate the size of the object reconstruction problem. The time cost of solving the modulation equation increases rapidly with the size of the problem unless the naive matrix-matrix and matrix-vector multiplications are treated carefully. For N × N matrices and N × 1 vectors the time costs of matrix-matrix and matrix-vector multiplications are proportional to $N^3$ and $N^2$, respectively. In computer science such costs are measured with computational complexities, denoted $O(N^3)$ (cubic complexity) and $O(N^2)$ (quadratic complexity). If the complexity could be reduced from polynomial (cubic or quadratic) to quasilinear, for example $O(N\log N)$, by replacing all the naive matrix multiplications with FFTs, the time cost of a typical 512 × 512 Insight-HXMT image reconstruction (N = 512 × 512) could be reduced from $10^5$ s to a few seconds (Huo & Zhou 2013). In order to achieve such a treatment, we decompose the kernel matrix H, taken here to be an arbitrary matrix, into the sum of a finite sequence of circulant kernel matrices multiplied by diagonal coefficient matrices, as

$$ \boldsymbol{H} = \sum_k \boldsymbol{A}_k \boldsymbol{H}_k, $$(3)

where $\boldsymbol{A}_k$ is an N × N diagonal matrix serving as the coefficient of the kth N × M kernel matrix $\boldsymbol{H}_k$, which is row-circulant. The diagonal entries of $\boldsymbol{A}_k$ take values of either 0 or 1 only. A matrix $\boldsymbol{H}_k$ is row-circulant if each of its row vectors is circularly shifted by one element to the right relative to its preceding row; that is, $H_{k,i,j} = H_{k,i+1,j+1} = h_{k,j-i+1}$ for all i and j, where $\boldsymbol{h}_k$ is the first row vector of $\boldsymbol{H}_k$. A row-circulant matrix is not necessarily square: if the number of its rows exceeds the number of its columns (N > M), the excess rows are circularly appended to the first M rows of the matrix, while if N < M the matrix consists of the first N rows truncated from an M × M square circulant matrix. In addition, it is required that

$$ \boldsymbol{A}_k \boldsymbol{A}_l = \begin{cases} \boldsymbol{A}_k, & \text{if } k = l \\ \boldsymbol{0}, & \text{otherwise,} \end{cases} $$(4)

and

$$ \sum_k \boldsymbol{A}_k = \boldsymbol{I}_N, $$(5)

where $\boldsymbol{I}_N$ is the N × N identity matrix.
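For concreteness, the following NumPy sketch builds such a row-circulant matrix from its first row; the helper name row_circulant is ours, and rows wrap circularly when N > M and truncate when N < M, as described above.

```python
import numpy as np

def row_circulant(h, N):
    # H[i, j] = h[(j - i) mod M] (0-indexed): each row is the previous
    # one circularly shifted one element to the right; np.roll wraps
    # with period M, so both N > M and N < M are handled.
    return np.stack([np.roll(h, i) for i in range(N)])

# row_circulant(np.array([1., 2., 3.]), 4) gives
# [[1, 2, 3], [3, 1, 2], [2, 3, 1], [1, 2, 3]]
```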

Letting $\boldsymbol{a}_k$ be the column vector of diagonal elements of $\boldsymbol{A}_k$, Eq. (2) becomes

$$ \boldsymbol{d} = \sum_k \boldsymbol{a}_k \cdot \left( \boldsymbol{f} * \boldsymbol{h}_k^{\star} \right). $$(6)

The operator ⋅ denotes element-wise multiplication, the operator ∗ denotes circular convolution, and $\boldsymbol{h}_k^{\star}$ is the reverse of $\boldsymbol{h}_k$, that is, $h_{k,i}^{\star} = h_{k,M+1-i}$. We note that $\boldsymbol{a}_k$ has N elements while $\boldsymbol{f} * \boldsymbol{h}_k^{\star}$ has M elements, so before calculating their element-wise multiplication the right operand $\boldsymbol{f} * \boldsymbol{h}_k^{\star}$ is truncated (if N > M) or padded (if N < M) to the same size as the left operand $\boldsymbol{a}_k$. Such truncation or padding is applied wherever necessary in this article.

Decomposition of d is naturally derived as

$$ \boldsymbol{d} = \sum_k \boldsymbol{a}_k \cdot \boldsymbol{d}_k, $$(7)

where the kth component of the observed data is $\boldsymbol{d}_k = \boldsymbol{f} * \boldsymbol{h}_k^{\star}$.

The term analysis refers to separating the original modulation equation into a finite sequence of equations by decomposing the kernel matrix as well as the observed data, while the term synthesis refers to combining a finite sequence of modulation equations to form an equivalent one. Analysis of a modulation equation is useful for reducing the computational complexity of the corresponding inverse problem. Data obtained from different observations of the same object can be synthesised as in Eq. (7) and demodulated as in Eq. (2) or Eq. (6).

2.3. L-fold correlation

The one-fold correlation of a modulation equation such as Eq. (2) is achieved by left-multiplying both sides of the equation by the transpose of the kernel matrix, as

$$ \boldsymbol{H}^{\mathrm{T}} \boldsymbol{d} = \boldsymbol{H}^{\mathrm{T}} \boldsymbol{H} \boldsymbol{f}. $$(8)

In the correlated equation the unknown object image remains f, while the observed data and the kernel matrix are transformed into the one-fold correlated data $\boldsymbol{c}_1 = \boldsymbol{H}^{\mathrm{T}}\boldsymbol{d}$ and kernel $\boldsymbol{P}_1 = \boldsymbol{H}^{\mathrm{T}}\boldsymbol{H}$. The L-fold correlation is achieved recursively as

$$ \begin{cases} \boldsymbol{P}_L = \boldsymbol{P}_{L-1}^{\mathrm{T}} \boldsymbol{P}_{L-1} \\ \boldsymbol{c}_L = \boldsymbol{P}_{L-1}^{\mathrm{T}} \boldsymbol{c}_{L-1}, \end{cases} \quad \forall L \ge 1, $$(9)

provided that c0 = d and P0 = H (Li & Wu 1994), so the L-fold correlated equation is

$$ \boldsymbol{c}_L = \boldsymbol{P}_L \boldsymbol{f}. $$(10)
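For a small kernel matrix this recursion can be transcribed directly; the sketch below (function name ours) uses naive matrix products and is only feasible for small problems, for the reasons discussed next.

```python
import numpy as np

def l_fold(H, d, L):
    # Recursion of Eq. (9) with P0 = H and c0 = d; both right-hand
    # sides of the tuple assignment use the previous P.
    P, c = H, d
    for _ in range(L):
        P, c = P.T @ P, P.T @ c
    return P, c
```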

Although L-fold correlation is preferred when using the DD method, it is difficult to compute with naive matrix multiplication when the kernel matrix is very large, for example a $2^{20} \times 2^{20}$ kernel matrix, since the computational complexity of multiplying two N × N matrices is $O(N^3)$ if performed naively. With a more sophisticated algorithm the complexity can be reduced to $O(N^{2.376})$ (Coppersmith & Winograd 1990); however, it would still take hundreds of hours to multiply two $2^{20} \times 2^{20}$ matrices.

We derived a simple but efficient approach to this matrix multiplication based on Eq. (3). First we compute the first column vector of $\boldsymbol{H}^{\mathrm{T}}\boldsymbol{H}$, as

$$ \begin{aligned} \boldsymbol{H}^{\mathrm{T}}\boldsymbol{H} \begin{pmatrix} 1\\ 0\\ \vdots\\ 0 \end{pmatrix} &= \sum_{k,l} \boldsymbol{H}_l^{\mathrm{T}} \boldsymbol{A}_l^{\mathrm{T}} \boldsymbol{A}_k \boldsymbol{H}_k \begin{pmatrix} 1\\ 0\\ \vdots\\ 0 \end{pmatrix} \\ &= \sum_k \boldsymbol{H}_k^{\mathrm{T}} \boldsymbol{A}_k \boldsymbol{H}_k \begin{pmatrix} 1\\ 0\\ \vdots\\ 0 \end{pmatrix} \\ &= \sum_k \boldsymbol{h}_k * \left(\boldsymbol{a}_k \cdot \boldsymbol{h}_k^{\star}\right). \end{aligned} $$(11)

We can compute the remaining column vectors in the same way. The time cost of computing each column vector is proportional to $KN\log N$, where K is the number of circulant matrices, provided the circular convolutions are calculated with the FFT algorithm. For rotation modulation K is the number of position-angle clusters, for wide-field image reconstruction K is the number of field-of-view (FoV) clusters, and for a synthetic kernel K is the number of observations. Because both the position angle and the FoV distortion vary gradually in practical observations, the number of clusters is always far smaller than the number of pixels or bins of the observed data samples. We found that 20–100 clusters are sufficient to approximate the kernel variation, in the sense that the difference between the simulated true kernel and the kernel approximated with the weighted sum of circulant matrices is negligible compared to the required kernel calibration accuracy. Therefore K is smaller than N by several orders of magnitude and is independent of N. As a result, the complexity is reduced to $O(N^2 \log N)$.
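A minimal sketch of Eq. (11) with NumPy FFTs, assuming N = M so that $\boldsymbol{a}_k$ and the reversed $\boldsymbol{h}_k$ can be multiplied element-wise without padding (function name ours):

```python
import numpy as np

def hth_first_column(a_list, h_list):
    # sum_k h_k * (a_k . h_k_star), Eq. (11), where * is the circular
    # convolution; h_star reverses h, i.e. h_star[i] = h[M-1-i].
    M = len(h_list[0])
    col = np.zeros(M)
    for a_k, h_k in zip(a_list, h_list):
        weighted = a_k * h_k[::-1]  # a_k . h_k_star
        col += np.real(np.fft.ifft(np.fft.fft(h_k) * np.fft.fft(weighted)))
    return col
```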

2.4. Screening and weighting

We can discard data with a poor S/N (screening) or adjust their weights (weighting) by introducing a weight matrix M into Eq. (2),

$$ \boldsymbol{M}\boldsymbol{d} = \boldsymbol{M}\boldsymbol{H}\boldsymbol{f}, $$(12)

where the weight matrix M is an N × N diagonal matrix whose ith main-diagonal element $m_i$ serves as the weight of the ith observed datum $d_i$. Specific observed data can be excluded by assigning them zero weights.
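A minimal sketch of such a screening weight matrix, built from a hypothetical GTI-style mask:

```python
import numpy as np

# 1 keeps a datum, 0 screens it out; the mask values are hypothetical.
good = np.array([1., 1., 0., 1., 0., 1.])
M_weight = np.diag(good)  # the diagonal weight matrix M of Eq. (12)
```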

Modulated estimates of the object, Hf′ (where f′ is an estimate of the unknown object f), appear as denominators in reconstructions with the DD method as well as in RL iterations. The original kernel in Eq. (1) contains both the instrument response and the non-sky background, while the object contains the sources and the sky background; hence division by zero cannot occur. However, possible zero weights in M of Eq. (12) act like holes and may cause such a problem. To prevent this in later reconstructions, L-fold correlation is applied so that zero-holes in MHf′ are smoothed to non-zero values by the kernel and their non-zero neighbours.

2.5. Additional observation

Observations at multiple epochs or with different instruments can be joined into a single modulation equation by using a partitioned kernel matrix, as

$$ \begin{pmatrix} \boldsymbol{d}_1 \\ \boldsymbol{d}_2 \\ \vdots \\ \boldsymbol{d}_\Lambda \end{pmatrix} = \begin{pmatrix} \boldsymbol{H}_1 \\ \boldsymbol{H}_2 \\ \vdots \\ \boldsymbol{H}_\Lambda \end{pmatrix} \boldsymbol{f}, $$(13)

where dλ and Hλ (λ = 1, 2, …, Λ) are the observed data and the kernel matrix of the λth observation, respectively.
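A minimal sketch of Eq. (13): per-observation data vectors and kernel matrices sharing the same object grid are simply stacked (function name ours):

```python
import numpy as np

def join_observations(d_list, H_list):
    # Stack data vectors and kernel matrices from Lambda observations
    # into one partitioned modulation equation, Eq. (13).
    return np.concatenate(d_list), np.vstack(H_list)
```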

2.6. Iterative solution of synthetic modulation equation

A model of the observed data, rather than the observed data themselves, appears on the left side of Eqs. (1) and (6): the model characterizes the statistically expected value of the observed data, while the data themselves are a random outcome of a specific observation. For data dominated by Poisson fluctuations (photon noise), the likelihood function of the object image f as an unknown parameter, given the observed data d, is

$$ \mathcal{L}(\boldsymbol{f}\,|\,\boldsymbol{d}) = \prod_j \frac{\left(\sum_{k,i} a_{k,j} f_i h_{k,i-j+1}\right)^{d_j} \mathrm{e}^{-\sum_{k,i} a_{k,j} f_i h_{k,i-j+1}}}{d_j!}. $$(14)

Once we find the f that maximizes the likelihood function, the maximum likelihood (ML) solution of the modulation equation is obtained. The object reconstruction problem is therefore transformed into solving the following equation:

$$ \frac{\partial \ln \mathcal{L}(\boldsymbol{f}\,|\,\boldsymbol{d})}{\partial f_i} = 0, \quad \forall i. $$(15)

Therefore,

$$ \begin{aligned} &\sum_j \left( \frac{d_j}{\sum_{k,i} a_{k,j} f_i h_{k,i-j+1}} - 1 \right) \sum_k a_{k,j} h_{k,i-j+1} \\ =\ &\sum_{k,j} \frac{a_{k,j} h_{k,i-j+1} d_j}{\sum_{k,i} a_{k,j} f_i h_{k,i-j+1}} - \sum_{k,j} a_{k,j} h_{k,i-j+1} = 0, \quad \forall i, \end{aligned} $$(16)

and

$$ \frac{\boldsymbol{f}}{\sum_k \boldsymbol{a}_k * \boldsymbol{h}_k} \sum_k \frac{\boldsymbol{a}_k \cdot \boldsymbol{d}}{\sum_k \boldsymbol{a}_k \cdot \left(\boldsymbol{f} * \boldsymbol{h}_k^{\star}\right)} * \boldsymbol{h}_k = \boldsymbol{f}. $$(17)

Fixed-point iteration is a numerical method for finding a fixed point of a given function: x is a fixed point of the function g(x) if x = g(x), and the iteration x(n + 1) = g(x(n)) is expected to converge to x if g is continuous. The ML estimate of the true image f is a fixed point of the function

$$ \varphi(\boldsymbol{f}) = \frac{\boldsymbol{f}}{\sum_k \boldsymbol{a}_k * \boldsymbol{h}_k} \sum_k \frac{\boldsymbol{a}_k \cdot \boldsymbol{d}}{\sum_k \boldsymbol{a}_k \cdot \left(\boldsymbol{f} * \boldsymbol{h}_k^{\star}\right)} * \boldsymbol{h}_k. $$(18)

We therefore expect to find the ML estimate iteratively. Iteration at the lth step is

$$ \boldsymbol{f}^{(l)} = \frac{\boldsymbol{f}^{(l-1)}}{\boldsymbol{w}} \sum_k \frac{\boldsymbol{a}_k \cdot \boldsymbol{d}}{\sum_k \boldsymbol{a}_k \cdot \left(\boldsymbol{f}^{(l-1)} * \boldsymbol{h}_k^{\star}\right)} * \boldsymbol{h}_k, $$(19)

where the normalization factor $\boldsymbol{w} = \sum_k \boldsymbol{a}_k * \boldsymbol{h}_k$. RL iteration can likewise be considered a fixed-point iteration that achieves an ML solution when restoring an image blurred by a convolution kernel. For a perfectly known modulation kernel and statistical model of the observed data, a proper criterion to terminate the fixed-point iteration can be found by monitoring the residuals. However, when the modulation kernel is not perfectly known or the statistical model does not perfectly describe the random nature of the data, one should terminate the iteration as soon as the required source is resolved from the data. In addition, cross-validation and sensitivity assessment (Huo et al. 2015) are necessary to prevent overfitting or noise amplification. For Insight-HXMT survey data analysis, which mainly focuses on point-like source detection and monitoring, the stopping criterion is not critical, since the demodulated image is not the final result of object reconstruction but only serves as a hint for the subsequent parametric fitting procedures.

Provided both the object image f and the observed data d are N × 1 vectors, and the kernel matrix is then N × N, each RL iteration of the original DD takes $\sim 2N^2$ scalar multiplications. In contrast, each iteration of the demodulation synthesised from K circulant kernel matrices takes $\sim 2KN\log N$ scalar multiplications with the aid of the FFT algorithm. The computational complexity is therefore reduced, provided that $K \ll N$.
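The following sketch transcribes the iteration of Eq. (19) with NumPy FFTs, assuming N = M and Poisson-dominated data; the function name and the small guard eps are ours, not part of the method.

```python
import numpy as np

def syndd(d, a_list, h_list, f0, n_iter=30, eps=1e-12):
    # Fixed-point iteration of Eq. (19); all convolutions are circular
    # and computed with FFTs in O(K N log N) per iteration.
    conv = lambda x, y: np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))
    # normalization w = sum_k a_k * h_k; assumed non-zero everywhere
    w = sum(conv(a, h) for a, h in zip(a_list, h_list))
    f = f0.copy()
    for _ in range(n_iter):
        # modulated estimate: sum_k a_k . (f * h_k_star)
        blur = sum(a * conv(f, h[::-1]) for a, h in zip(a_list, h_list))
        ratio = d / np.maximum(blur, eps)
        # back-projection: sum_k (a_k . ratio) * h_k, then rescale
        f = f / w * sum(conv(a * ratio, h) for a, h in zip(a_list, h_list))
    return f
```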

Let us take Insight-HXMT as an example, where the level-1 scientific data products consist of lists of X-ray photon arrival events detected by each scientific instrument. Each event is described by properties such as arrival time, detector identifier, energy channel identifier, anti-coincidence detector counts, and so on. The satellite orbit coordinates and telescope attitude (pointing angles and position angle) at the arrival of each event are interpolated from the housekeeping data provided by the satellite platform. Two-dimensional discrete samples of the observed image, 1D binned samples of the observed light curve, and/or a 1D binned energy spectrum are accumulated from these events together with the auxiliary housekeeping data. For object image reconstruction, each image contains too many pixels for the naive matrix multiplication in Eq. (2), and the modulation cannot be computed with spherical harmonics or Fourier transforms because the kernel, that is, the PSF, lacks circular symmetry (Huo & Zhou 2013). With the approach described here, the time cost of Insight-HXMT image reconstruction is reduced by orders of magnitude.

2.7. Cluster analysis of the modulation kernel

Cluster analysis of the modulation kernel is the key building block of the method we present in this article. A kernel matrix is an ordered set of row vectors, which are classified into K groups through cluster analysis; we refer to each group as a cluster. Row vectors in the same cluster are similar to each other, as if they were taken from a circulant matrix. A generalized Euclidean distance ∥xi − xj∥ is introduced to measure the similarity between a pair of row vectors xi and xj; the two vectors are considered sufficiently similar if the distance is less than the kernel calibration accuracy. Cluster analysis is thus the underlying procedure through which a kernel matrix H is decomposed into a sequence of circulant matrices Hk and their coefficients Ak. The number of clusters, K, determines the extent to which the computational complexity can be reduced: the smaller K, the more the complexity is reduced when we approximate H with $\sum_{k=1}^K \boldsymbol{A}_k \boldsymbol{H}_k$. The accuracy of the approximation is determined by the similarities between row vectors in the same cluster. Therefore, to perform a cluster analysis of a modulation kernel is to classify all row vectors of the kernel matrix into as few clusters as possible while keeping the similarities between any two vectors in the same cluster acceptable.

Principal component analysis (PCA) is also used here to decompose a kernel in order to reduce its dimensionality (Jolliffe 1986). We nevertheless categorize the analysis of the modulation kernel in this article as a cluster analysis, because PCA only serves to reduce the dimensionality when calculating the similarities between row vectors: instead of calculating the generalized Euclidean distance between two M × 1 row vectors, we calculate the distance between two representative vectors that each contain far fewer components, namely the principal component scores. PCA therefore serves as preprocessing for the cluster analysis. Since each row vector of the kernel matrix has the same size as the object image, the expense of computing similarities between row vectors increases sharply with the image resolution, especially if the image has more than one dimension. Fortunately, a row vector of a kernel can be much sparser in a suitable representation than it appears in a naive pixel-wise representation. PCA is used to find the basis of such a sparse representation for the row vectors of a specific kernel; in PCA the basis vectors are termed principal components. By expressing a row vector as a finite linear combination of the basis vectors, its dimensionality is effectively reduced, since the number of non-zero coefficients of a sparse representation is usually far smaller than the number of pixels in the image.

Each row of the given kernel matrix H is circularly shifted leftwards according to its row number; that is, the ith row is shifted by i elements to the left. In this way all row vectors of a circulant matrix become aligned, so that the shifted matrix appears as N vertically stacked copies of the same row vector, ready for the subsequent PCA processing.

Letting xi be the shifted ith row vector in the pixel-wise representation, we use the NIPALS-PCA algorithm (Geladi & Kowalski 1986) to find the principal components of the set of shifted row vectors {xi} iteratively, and we transform each row vector xi into a new vector ti in a space of reduced dimensionality defined by the principal components.

The similarity between two row vectors xi and xj is then measured by the Euclidean distance between their sparse representations ti and tj, namely ∥ti − tj∥, and k-means clustering (Lloyd 1982) is used to classify {ti} into clusters. A drawback of k-means clustering is that the number of clusters is an input parameter; the minimum number of clusters must therefore be determined through extra runs of the clustering beforehand, to make sure the similarities between vectors in the same cluster are acceptable.

Once the vectors {ti} are classified into K clusters, a central vector of each cluster is calculated according to the corresponding row vectors {xi}, as

$$ \boldsymbol{h}_k = \frac{1}{N_k} \sum_{i \in S_k} \boldsymbol{x}_i, $$(20)

where Nk is the number of vectors in the kth cluster, while Sk is the set of indices of vectors in this cluster. A circulant matrix Hk is then constructed from hk as its first row vector. Its coefficient matrix Ak is constructed from its diagonal elements ak, as

$$ a_{k,i} = \begin{cases} 1, & \text{if } i \in S_k \\ 0, & \text{otherwise.} \end{cases} $$(21)
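As an illustration, the whole pipeline of this subsection can be sketched with scikit-learn's stock PCA and KMeans standing in for the resumable NIPALS-PCA and the hand-rolled k-means of Appendix A; the function name and parameter defaults are ours.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def decompose_kernel(H, K, n_components=20):
    N = H.shape[0]
    # align rows by circular shifting (Subroutine 1, 0-indexed)
    X = np.stack([np.roll(H[i], -i) for i in range(N)])
    # reduced-dimensionality scores for the similarity computation
    T = PCA(n_components=n_components).fit_transform(X)
    labels = KMeans(n_clusters=K, n_init=10).fit_predict(T)
    # cluster centres h_k (Eq. (20)) and coefficients a_k (Eq. (21))
    h = np.stack([X[labels == k].mean(axis=0) for k in range(K)])
    a = np.stack([(labels == k).astype(float) for k in range(K)])
    return h, a
```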

3. Test and results

3.1. Object, modulation kernels, and observed data

We produced a model with a series of point sources located on a spiral as the object (Fig. 1). The distance between adjacent sources increases from the interior to the exterior of the spiral, while the intensity decreases. We can therefore use this model object to assess both the spatial resolution and the sensitivity of a given observation and the corresponding reconstruction.

Fig. 1. Model object.

The aperture PSF model of the High Energy (HE) telescope of Insight-HXMT (shown in Fig. 2), which reflects the geometrical effects of all the collimators of the HE detectors, that is, the detection efficiency of the telescope for a test point source in its FoV, is used here to build the modulation kernels. The spatial coordinates of the test point source are the variables of the PSF.

Fig. 2. HE/Insight-HXMT aperture PSF model, all collimated detectors combined.

We simulated observations in which the sky region is scanned in different directions as shown in Fig. 3. Scanning speed along each row is 0.10° s−1, while the interval between adjacent scanning rows is 3.0°.

Fig. 3. Simulated scanning paths to the same object sky region along different directions.

With the simulated scanning paths, the PSF, the object, and a uniform background of 0.3 counts s−1 deg−2, consistent with the scanning paths and rates and the HE background (Li et al. 2009), we calculated the modulation kernel matrices and the corresponding modulated light curves. As for the detectors, we simulated Poisson fluctuations by generating pseudo-random Poisson numbers with the modulated light curves as the expected values. The simulated observed light curves are shown in Fig. 4.
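A minimal sketch of this simulation step, with hypothetical names; expected holds the modulated light curve (kernel applied to the object, plus background) integrated over each time bin.

```python
import numpy as np

def simulate_lightcurve(expected, seed=42):
    # Draw Poisson-fluctuated counts with the modulated light curve
    # as the expectation, one draw per time bin.
    rng = np.random.default_rng(seed)
    return rng.poisson(expected)
```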

Fig. 4. Simulated observed light curves in different scanning observations.

3.2. Reconstruction and comparison

First we reconstructed the object from a single-epoch observed light curve (top left panel of Fig. 4) on a low-resolution pixel grid (64 by 64 pixels). The reconstructed image is shown in Fig. 5. From the marginal null region of the reconstructed image we estimated a 3σ pixel-wise sensitivity of 2.4 counts s−1 deg−2; pixels with image values below this sensitivity were discarded.

Fig. 5. Low-resolution image reconstructed from simulated single-epoch light curve.

Similarly, we reconstructed the object again from the simulated light curves of the four different scanning paths shown in Fig. 3, synthesizing all modulation kernel matrices and all light curves together according to Sect. 2.5. The reconstructed image is shown in Fig. 6. The estimated 3σ sensitivity is 1.7 counts s−1 deg−2.

Fig. 6. Low-resolution image reconstructed from simulated multiple-epoch light curves.

In the above reconstructions we employed the original DD iterations with vector extrapolation between iterations to speed up convergence (Biggs & Andrews 1997). Vector extrapolation is a technique in numerical analysis that predicts the subsequent step from previous steps instead of computing the iteration, so that fewer iterations are required. For the low-resolution reconstruction we find that after 30 iterations the solutions no longer improve, owing to significant artefacts. For each iteration the modulation was calculated through matrix multiplication routines provided by OpenBLAS, a widely used open-source implementation of the Basic Linear Algebra Subprograms (BLAS; Xianyi et al. 2014). This is referred to as the original DD, while the solution specified in Eq. (19) is referred to as the synthetic DD. The time cost per original DD iteration is 5.3 s on a dual-core PC. For comparison, we reconstructed the object from the simulated light curves of the four different scanning paths shown in Fig. 3 on the same pixel grid with synthetic DD, as shown in Fig. 7.

Fig. 7. Low-resolution image reconstructed from simulated multiple-epoch light curves with synthetic DD.

The time cost per synthetic DD iteration is 15 ms on a PC. As we can see, the image reconstructed with synthetic DD is similar to that reconstructed with the original DD, but the time cost is reduced by orders of magnitude.

Subsequently, we tested a higher-resolution (512 by 512 pixels) reconstruction with synthetic DD. Original DD was not tested because of its expected time cost (at least 10 h); the time cost per original DD iteration increases polynomially, and more iterations are required at higher resolution. The image reconstructed from the simulated light curves of the four scanning paths illustrated in Fig. 3 is shown in Fig. 8. The time cost per iteration on this pixel grid is 0.6 s.

Fig. 8. High-resolution image reconstructed from simulated multiple-epoch light curves with synthetic DD.

We also examined the convergence of the cluster analysis, which is a prerequisite of synthetic DD. The residual of the sparse representation of the kernel matrix, $\left|\boldsymbol{H} - \sum_k \boldsymbol{A}_k \boldsymbol{H}_k\right| / \left|\boldsymbol{H}\right|$, decreases rapidly with the first components; the marginal improvement from involving more components subsequently diminishes, as illustrated in Fig. 9.
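A sketch of how this residual can be evaluated, assuming N = M, the Frobenius norm as one possible choice of matrix norm, and a, h taken from the cluster decomposition (function name ours):

```python
import numpy as np

def representation_residual(H, a, h):
    # |H - sum_k A_k H_k| / |H|: rebuild each row-circulant H_k from its
    # first row h_k and scale its rows by the 0/1 coefficients a_k.
    N = H.shape[0]
    approx = np.zeros_like(H)
    for a_k, h_k in zip(a, h):
        Hk = np.stack([np.roll(h_k, i) for i in range(N)])
        approx += a_k[:, None] * Hk  # A_k H_k
    return np.linalg.norm(H - approx) / np.linalg.norm(H)
```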

Fig. 9. Principal component representation residuals.

Finally, the time costs of reconstructions with original DD iteration and with synthetic DD iteration are shown in Fig. 10.

Fig. 10. Time costs of reconstructions.

4. Conclusion

The synDD method, comprising modulation kernel matrix analysis and modulation equation synthesis, reduces the polynomial computational complexity of a time-consuming step in solving the modulation equation to linear-logarithmic complexity, which in turn reduces the time cost of object reconstruction in Insight-HXMT data analysis by orders of magnitude. This encourages us to include more observed data, for example from multiple epochs or multiple energy bands, and to reconstruct the object at higher resolution. It is therefore possible to achieve both improved sensitivity and higher resolution with even fewer resources. We suggest using the proposed method in Insight-HXMT data analysis, especially with scanning observations from multiple payloads, different energy bands, and/or multiple epochs, for transient detection, monitoring of time-varying objects, and so on.

Acknowledgments

The pseudocode in this article is typeset with the algorithms LaTeX bundle. In this work we made use of SciPy (Jones et al. 2001) and PyTables (Alted et al. 2002) for numerical computing and for dealing with large datasets. We thank the anonymous referee, whose comments and suggestions helped improve and clarify this manuscript.

References

  1. Alted, F., Vilata, I., Prater, S., et al. 2002, PyTables: Hierarchical Datasets in Python
  2. Biggs, D. S., & Andrews, M. 1997, Appl. Opt., 36, 1766
  3. Blackburn, J. 1995, Astronomical Data Analysis Software and Systems IV, 77, 367
  4. Coppersmith, D., & Winograd, S. 1990, J. Symbol. Comput., 9, 251
  5. Geladi, P., & Kowalski, B. R. 1986, Anal. Chim. Acta, 185, 1
  6. Huo, Z.-X., & Zhou, J.-F. 2013, Res. Astron. Astrophys., 13, 991
  7. Huo, Z.-X., Li, Y.-M., Li, X.-B., & Zhou, J.-F. 2015, Res. Astron. Astrophys., 15, 1905
  8. Jolliffe, I. T. 1986, Principal Component Analysis (Springer), 115
  9. Jones, E., Oliphant, T., Peterson, P., et al. 2001, SciPy: Open Source Scientific Tools for Python
  10. Li, T. P. 2007, in Proceedings of the Third International Conference on Particle and Fundamental Physics in Space, Nuclear Physics B - Proceedings Supplements, 166, 131
  11. Li, T.-P., & Wu, M. 1993, Ap&SS, 206, 91
  12. Li, T.-P., & Wu, M. 1994, Ap&SS, 215, 213
  13. Li, G., Wu, M., Zhang, S., & Jin, Y.-K. 2009, ChA&A, 33, 333
  14. Lloyd, S. 1982, IEEE Trans. Inform. Theory, 28, 129
  15. Lucy, L. B. 1974, AJ, 79, 745
  16. Puetter, R., Gosnell, T., & Yahil, A. 2005, ARA&A, 43, 139
  17. Richardson, W. H. 1972, JOSA, 62, 55
  18. Xianyi, Z., Qian, W., & Chothia, Z. 2014, OpenBLAS, http://xianyi.github.io/OpenBLAS
  19. Zhang, S.-N. 2009, BAAS, 41, 474
  20. Zhang, S., Lu, F. J., Zhang, S. N., & Li, T. P. 2014, in Space Telescopes and Instrumentation 2014: Ultraviolet to Gamma Ray, Proc. SPIE, 9144, 914421

Appendix A: Algorithm and codes

The methods explained above are specified as several separate subroutines in pseudocode. Table A.1 summarises all the symbols and their definitions used in the following subroutines.

Vectors are implemented by 1D arrays. Matrices are implemented by 2D arrays, where rows and columns are numbered by the first and second indices respectively, that is, A[i][j] represents the matrix element Ai, j in the following algorithm descriptions, A[i] denotes the ith row, and A[⋯][j] denotes the jth column. Here ellipsis ⋯ between a pair of square brackets [] denotes the sequence of all indices. Similar notations are used for 3D arrays. For example, provided that A is an L × M × N array of scalar elements, A[i][j][k] represents the element Ai, j, k. A[⋯][j][k], A[i][⋯][k] and A[i][j] represent vectors. A[⋯][k], A[i][⋯] and A[⋯][j][⋯] represent matrices.

The circularly-shifting operation mentioned in Sect. 2.7 is specified with Subroutine 1, which is usually implemented by built-in functions in numerical computing languages such as MATLAB, NumPy, and so on.

Subroutine 1

Circularly-shift row vectors.

Require: M, N, H

for i = 1 to N do

  for j = 1 to M do

   X[i][j]←H[i][(j + i − 2)%M + 1]

  end for

end for

return X
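In NumPy, for example, Subroutine 1 reduces to a single line built on numpy.roll (a sketch; 0-indexed, function name ours):

```python
import numpy as np

def shift_rows(H):
    # Subroutine 1, 0-indexed: circularly shift row i leftwards by i
    # elements, so rows of a circulant matrix become identical copies.
    return np.stack([np.roll(H[i], -i) for i in range(H.shape[0])])
```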

A resumable implementation of the NIPALS method for PCA is specified in Subroutine 2. If the principal component scores matrix T is not initialized for writing, for example when no memory or file-system resources have been allocated to it, the current call to this subroutine is considered the first run; in this case the necessary resources are allocated to the score matrix T, the loading matrix P, and the residual R inside the subroutine during the current call. Otherwise the call is considered to resume a previous run, in which case the resources that T, P, and R refer to must have been preserved. The expected number of principal components J is provided as an input parameter. The elements of each principal component score are computed iteratively; an iteration stops when the increment of the estimate of the eigenvalue becomes negligible (Geladi & Kowalski 1986).

The k-means clustering is specified in Subroutine 3. We randomly choose K vectors as the initial central vectors of the K clusters. The loop stops when none of the central vectors changes any more.

The iterative solution of the synthetic modulation equation formulated in Eq. (19) is specified in Subroutine 4. If M = N, the observed data d themselves can serve as the initial estimate; otherwise their one-fold correlation $\boldsymbol{H}^{\mathrm{T}}\boldsymbol{d}$ is provided as the initial estimate of the object. Convolutions in Eq. (19) are implemented with FFTs. The iteration stops when the difference between the current estimate of the image and the previous one is negligible. The convergence of the iteration can be accelerated via vector extrapolation (Biggs & Andrews 1997).

The workflow of algorithms used to decompose a kernel matrix H is summarised in Fig. A.1.

Subroutine 2

NIPALS-PCA

Require: M, N, J, X, R, P, T, ϵ

j ← 1

if resumed from previous run then

  while jJ and ∥T[⋯][j]∥ ≤ ϵ do

   jj + 1

  end while

else // this is the first run

  for i = 1 to N do

   X0[i]←0

  end for

  for i = 1 to M do

   X0X0 + X[⋯][i]

  end for

  X0X0/M

  for i = 1 to M do

   R[⋯][i]←X[⋯][i]−X0

  end for

end if

if j > J then // all expected PC scores have been calculated

  return R, P, T

end if

for j = j to J do

  λ ← 0

  T[⋯][j]←R[⋯][j]

  repeat

   for i = 1 to M do

    P[i][j]← sum of R[⋯][i]⋅T[⋯][j]

   end for

   P[⋯][j]←P[⋯][j]/∥P[⋯][j]∥

   for i = 1 to N do

    T[i][j]← sum of R[i]⋅P[⋯][j]

   end for

   λpλ

   λ ← ∥T[⋯][j]∥

  until ∥λp − λ∥ ≤ ϵ/2⋅∥λp + λ∥

  for i = 1 to M do

   R[⋯][i]←R[⋯][i]−T[⋯][j]⋅P[i][j]

  end for

end for

return R, P, T

Table A.1.

Symbols in algorithm descriptions.

Subroutine 3

k-means clustering on sparse representation.

Require: K, N, T, X, ϵ

for k = 1 to K do

  i ← a random integer from 1 to N

  C[k]←T[i]

  Y[k]←0

end for

repeat

  δ ← 0

  for k = 1 to K do

   S[k]← empty sequence

  end for

  for i = 1 to N do

   for k = 1 to K do

    D[k][i]←∥T[i]−C[k]∥

   end for

   k ← index of the least element of D[⋯][i]

   append i to S[k]

  end for

  for k = 1 to K do

   L[k]← number of elements in S[k]

   Cp[k]←C[k]

   C[k]←0

   for i = 1 to L[k] do

    C[k]←C[k]+T[S[k][i]]

   end for

    C[k]←C[k]/L[k]

   δ ← max[δ, ∥Cp[k]−C[k]∥]

  end for

until δϵ

for k = 1 to K do

  for i = 1 to L[k] do

   Y[k]←Y[k]+X[S[k][i]]

  end for

   Y[k]←Y[k]/L[k]

end for

return S, C, Y

Fig. A.1. Algorithms used to decompose a kernel matrix.

Subroutine 4

Iterative ML demodulation.

Require: K, M, N, d, A, Y, ϵ

f ← initial estimate

for k = 1 to K do

  for i = 1 to N do

   a[k][i]←A[k][i][i]

  end for

end for

for i = 1 to M do

  w[i]←0

end for

for k = 1 to K do

  w ← w + IFFT(FFT(a[k])⋅FFT(Y[k]))

end for

repeat

  fp ← f

  for i = 1 to M do

   q[i]←0

  end for

  for i = 1 to N do

   r[i]←0

  end for

  for k = 1 to K do

    r ← r + a[k]⋅IFFT(FFT(fp)⋅conj(FFT(Y[k])))

  end for

  r ← d/r

  for k = 1 to K do

   q ← q + IFFT(FFT(a[k]⋅r)⋅FFT(Y[k]))

  end for

  f ← fp⋅q/w

until ∥f − fp∥ ≤ ϵ

return f


