Source detection using a 3D sparse representation: application to the Fermi Gamma-ray Space Telescope
J.L. Starck^{1}  J. M. Fadili^{2}  S. Digel^{3}  B. Zhang^{4}  J. Chiang^{3}
1  CEA, IRFU, SEDI-SAP, Laboratoire Astrophysique des Interactions Multi-échelles (UMR 7158), CEA/DSM-CNRS-Université Paris Diderot, Centre de Saclay, 91191 Gif-sur-Yvette, France
2  GREYC CNRS UMR 6072, Image Processing Group, ENSICAEN, 14050 Caen Cedex, France
3  Stanford Linear Accelerator Center & Kavli Institute for Particle Astrophysics and Cosmology, Stanford, CA 94075, USA
4  Quantitative Image Analysis Unit URA CNRS 2582, Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris Cedex 15, France
Received 20 November 2008 / Accepted 25 February 2009
Abstract
The multiscale variance stabilization transform (MSVST) has recently been proposed for Poisson data denoising (Zhang et al. 2008a). This procedure, which is nonparametric, is based on thresholding wavelet coefficients. The restoration algorithm applied after thresholding provides good conservation of source flux. We present in this paper an extension of the MSVST to 3D data, in fact 2D-1D data, where the third dimension is not a spatial dimension but the wavelength, the energy, or the time. We show that the MSVST can be used for detecting and characterizing astrophysical sources of high-energy gamma rays, using realistic simulated observations with the Large Area Telescope (LAT). The LAT was launched in June 2008 on the Fermi Gamma-ray Space Telescope mission. Source detection in the LAT data is complicated by the low fluxes of point sources relative to the diffuse celestial foreground, the limited angular resolution, and the tremendous variation of that resolution with energy (from tens of degrees at 30 MeV to 0.1° at 10 GeV). The high-energy gamma-ray sky is also quite dynamic, with a large population of sources, such as active galaxies with accretion-powered black holes producing high-energy jets, that flare episodically. The fluxes of these sources can change by an order of magnitude or more on time scales of hours. Perhaps the majority of blazars will have average fluxes too low to be detected, but they could be found during the hours or days that they are flaring. The MSVST algorithm is very fast relative to traditional likelihood model fitting, and permits efficient detection across the time dimension and immediate estimation of spectral properties. Astrophysical sources of gamma rays, especially active galaxies, are typically quite variable, and our current work may lead to a reliable method to quickly characterize the flaring properties of newly-detected sources.
Key words: methods: data analysis – techniques: image processing
1 Introduction
The high-energy gamma-ray sky will be studied with unprecedented sensitivity by the Large Area Telescope (LAT), which was launched by NASA on the Fermi mission in June 2008. The catalog of gamma-ray sources from the previous mission in this energy range, EGRET on the Compton Gamma-Ray Observatory, has approximately 270 sources (Hartman et al. 1999). The LAT is expected to detect several thousand gamma-ray sources, with much more accurately determined locations, spectra, and light curves.
We would like to reliably detect as many celestial sources of gamma rays as possible. The question is not simply one of building up adequate statistics by increasing exposure times. The majority of the sources that the LAT will detect are likely to be gamma-ray blazars (distant galaxies whose gamma-ray emission is powered by accretion onto supermassive black holes), which are intrinsically variable: they flare episodically in gamma rays. The time scales of flares, which can increase the flux by a factor of 10 or more, can be minutes to weeks. The duty cycle of flaring in gamma rays is not yet well determined, but individual blazars can go months or years between flares, and in general we will not know in advance where on the sky the sources will be found.
The fluxes of celestial gamma rays are low, especially relative to the 1 m^{2} effective area of the LAT (by far the largest effective collecting area ever in the GeV range). An additional complicating factor is that diffuse emission from the Milky Way itself (which originates in cosmic-ray interactions with interstellar gas and radiation) makes a relatively intense, structured foreground emission. The few very brightest gamma-ray sources will provide approximately 1 detected gamma ray per minute when they are in the field of view of the LAT. The diffuse emission of the Milky Way will provide about 2 gamma rays per second, distributed over the 2 sr field of view.
For previous high-energy gamma-ray missions, the standard method of source detection has been model fitting: maximizing the likelihood function while moving trial point sources around in the region of the sky being analyzed. This approach has been driven by the limited photon counts and the relatively limited resolution of gamma-ray telescopes. However, at the sensitivity of the LAT, even a relatively ``quiet'' part of the sky may have 10 or more point sources close enough together to need to be modeled simultaneously when maximizing the (computationally expensive) likelihood function. For this reason, and because of the need to search in time, nonparametric algorithms for detecting sources are being investigated.
Literature overview for Poisson denoising using wavelets
A host of estimation methods have been proposed in the literature for nonparametric Poisson noise removal. Major contributions rely on variance stabilization: a classical solution is to preprocess the data by applying a variance stabilizing transform (VST) such as the Anscombe transform (Anscombe 1948; Donoho 1993). It can be shown that the transformed data are approximately stationary, independent, and Gaussian. However, these transformations are only valid for a sufficiently large number of counts per pixel (and, of course, for even more counts, the Poisson distribution becomes Gaussian with equal mean and variance) (Murtagh et al. 1995). The necessary average number of counts is about 20 if bias is to be avoided.
As an alternative, a filtering approach for very small numbers of counts, including frequent zero cases, was proposed in Starck & Pierre (1998); it is based on the popular isotropic undecimated wavelet transform (implemented with the so-called à trous algorithm) (Starck & Murtagh 2006) and the autoconvolution histogram technique for deriving the probability density function (pdf) of the wavelet coefficients (Bijaoui & Jammal 2001; Starck & Murtagh 2006; Slezak et al. 1993). This method is part of the data reduction pipeline of the XMM-LSS project (Pierre et al. 2004) for the detection of clusters of galaxies (Pierre et al. 2007). This algorithm is obviously a good candidate for Fermi LAT 2D map analysis, but its extension to 2D-1D data sets does not exist; it would be far from trivial and, even if it were possible, the computation time would certainly be prohibitive for Fermi LAT 2D-1D data sets. An alternative approach is therefore needed. Several authors (Bijaoui & Jammal 2001; Fryzlewicz & Nason 2004; Kolaczyk 1997; Zhang et al. 2008b; Timmermann & Nowak 1999; Nowak & Baraniuk 1999) have suggested that the Haar wavelet transform is very well suited for treating data with Poisson noise. Since a Haar wavelet coefficient is just the difference between two random variables following a Poisson distribution, it is easier to derive mathematical tools for removing the noise than with any other wavelet method. The study of Starck & Murtagh (2006) shows, however, that the Haar transform is less effective for restoring X-ray astronomical images than the à trous algorithm. The reason is that the wavelet shape of the isotropic wavelet transform is much better adapted to astronomical sources, which are more or less Gaussian-shaped and isotropic, than the Haar wavelet. Other papers (Scargle 1998; Willett & Nowak 2005; Kolaczyk & Nowak 2004; Willett 2006) proposed a spatial partitioning of the image, possibly dyadic, for the recovery of complicated geometrical content. This dyadic partitioning concept is, however, again not very well suited to astrophysical data.
The MSVST alternative
In a recent paper, Zhang et al. (2008a) proposed merging a variance stabilization technique and the multiscale decomposition, leading to the MultiScale Variance Stabilization Transform (MSVST). In the case of the isotropic undecimated wavelet transform, as the wavelet coefficients w_{j} are derived by a simple difference of two consecutive dyadic scales of the input image (see Sect. 3.2), w_{j} = a_{j-1} - a_{j}, the stabilized wavelet coefficients are obtained by applying a stabilization to both a_{j-1} and a_{j}, i.e., w_{j} = A_{j-1}(a_{j-1}) - A_{j}(a_{j}), where A_{j-1} and A_{j} are nonlinear transforms that can be seen as a generalization of the Anscombe transform; see Sect. 3 for details. This new method is fast and easy to implement and, more importantly, works very well in very low count situations, down to 0.1 photons per pixel.
This paper
In this paper, we present a new multiscale representation, derived from the MSVST, which allows us to remove the Poisson noise in 3D data sets when the third dimension is not a spatial dimension but the wavelength, the energy, or the time. Such 3D data are called 2D-1D data sets in the sequel. We show that this representation can be very useful for analyzing Fermi LAT data, especially when looking for rapidly time-varying sources. Section 2 describes the Fermi LAT simulated data. Section 3 reviews the MSVST method applied with the isotropic undecimated wavelet transform, and Sect. 4 shows how it can be extended to the 2D-1D case. Section 5 presents experiments on simulated Fermi LAT data. Conclusions are given in Sect. 6.
Definitions and notations
For a real discrete-time filter whose impulse response is h[i], h̄[i] := h[-i] is its time-reversed version. For the sake of clarity, the notation h[i] is used instead of h_{i} for the location index; this lightens the notation by avoiding multiple subscripts in the derivations of the paper. The discrete circular convolution product of two signals will be written ⊛, and the continuous convolution of two functions *. The term circular stands for periodic boundary conditions. The symbol δ[i] is the Kronecker delta, i.e., δ[0] = 1 and δ[i] = 0 for i ≠ 0.
For the octave band wavelet representation, analysis (respectively, synthesis) filters are denoted h and g (respectively, h̃ and g̃). The scaling and wavelet functions used for the analysis (respectively, synthesis) are denoted φ (associated with h) and ψ (associated with g) (respectively, φ̃ and ψ̃). We also define the scaled, dilated and translated version of φ at scale j and position k as φ_{j,k}(x) = 2^{-j} φ(2^{-j} x - k), and similarly for ψ, φ̃ and ψ̃. A function f(x,y) is isotropic if it is constant along all points (x,y) that are equidistant from the origin.
A distribution is stabilized if its variance is made constant, typically equal to 1, independently of its mean. A transformation applied to a random variable is called a variance stabilizing transform (VST), if the distribution of the transformed variable is stabilized and is approximately Gaussian.
2 Data description
Figure 1: Cutaway view of the LAT. The LAT is modular; one of the 16 towers is shown with its tracking planes revealed. High-energy gamma rays convert to electron-positron pairs on tungsten foils in the tracking layers. The trajectories of the pair are measured very precisely using silicon strip detectors in the tracking layers, and the energies are determined with the CsI calorimeter at the bottom. The array of plastic scintillators that covers the towers provides an anticoincidence signal for cosmic rays. The outermost layers are a thermal blanket and micrometeoroid shield. The overall dimensions are 1.8 × 1.8 × 0.75 m.

2.1 Fermi Large Area Telescope
The LAT (Fig. 1) is a photon-counting detector, converting gamma rays into electron-positron pairs for detection. The trajectories of the pair are tracked and their energies measured in order to reconstruct the direction and energy of the gamma ray.
The energy range of the LAT is very broad, approximately 20 MeV-300 GeV. At energies below a few hundred MeV, the reconstruction and tracking efficiencies are lower, and the angular resolution is poorer, than at higher energies. The point spread function (PSF) width varies from about 3.5° at 100 MeV to better than 0.1° (68% containment) at 10 GeV and above. Owing to large-angle multiple scattering in the tracker, the PSF has broad tails; the 95%/68% containment ratio may be as large as 3.
Wavelet denoising of LAT data has application as part of an algorithm for quickly detecting celestial sources of gamma rays. The fundamental inputs to high-level analysis of LAT data will be the energies, directions, and times of the detected gamma rays. (Pointing history and instrument live times are also inputs for exposure calculations.) For the analysis presented here, we consider the LAT data for some range of time to have been binned into ``cubes'' v(x,y,t) of spatial coordinates and time, or v(x,y,E) of spatial coordinates and energy, because, as we shall see, the wavelet denoising can be applied in multiple dimensions and so permits estimation of counts spectra. The motivations for filtering data with Poisson noise in the wavelet domain are well known: sources of small angular size are localized in wavelet space.
2.2 Simulated LAT data
The application of MSVST to problems of detection and characterization of LAT sources was investigated using simulated data. The simulations included a realistic observing strategy (sky survey with the proper orbital and rocking periods) and response functions for the LAT (effective area and angular resolution as functions of energy and angle). Point sources of gamma rays were defined with systematically varying fluxes, spectral slopes, and/or flare intensities and durations. The simulations also included a representative level of diffuse ``background'' (celestial plus residual charged-particle) for regions of the sky well removed from the Galactic equator, where the celestial diffuse emission is particularly intense. The denoising results reported in Sect. 5 use a data cube obtained according to this simulation scenario.
3 The 2D multiscale variance stabilization transform (MSVST)
In this section, we review the MSVST method (Zhang et al. 2008a), restricted to the Isotropic Undecimated Wavelet Transform (IUWT). Indeed, the MSVST can use other transforms such as the standard three-orientation undecimated wavelet transform, the ridgelet or the curvelet transforms; see Zhang et al. (2008a). In our specific case here, only the IUWT is of interest.
3.1 VST of a filtered Poisson process
Given a sequence X = (X_1, …, X_n) of n independent Poisson random variables X_i, each of mean λ_i, let Y = h ⊛ X be the filtered process obtained by convolving the sequence X with a discrete filter h. Y denotes any one of the Y_{i}'s, and τ_k := ∑_i (h[i])^k for k = 1, 2, 3.
If h = δ (no filtering), then we recover the Anscombe VST (Anscombe 1948) of Y_{i} (hence X_{i}), which acts as if the stabilized data arose from a Gaussian white noise with unit variance, under the assumption that the intensity is large. This is why the Anscombe VST performs poorly in low-count settings. But if the filter h acts as an ``averaging'' kernel (more generally, a low-pass filter), one can reasonably expect that stabilizing Y_{i} will be more beneficial, since the signal-to-noise ratio measured at the output of h is expected to be higher.
Using a local homogeneity assumption, i.e. λ_j = λ for all j within the support of h, it has been shown (Zhang et al. 2008a) that, for a nonnegative filter h, the transform

Z = b · sign(Y + c) √(|Y + c|),

with b > 0 and c > 0 defined as

b = 2 √(|τ_1| / τ_2),   c = (7 τ_2)/(8 τ_1) − τ_3/(2 τ_2),   (1)

is a second-order accurate variance stabilization transform, with asymptotic unit variance. By second-order accurate, we mean that the error term in the variance of the stabilized variable Z decreases rapidly as λ increases. From (1), it is obvious that when h = δ, we obtain the classical Anscombe VST parameters b = 2 and c = 3/8. The authors in Zhang et al. (2008a) have also proved that Z is asymptotically distributed as a Gaussian variate with mean b√(τ_1 λ) and unit variance. A nonpositive h with a negative c could also be considered; see Zhang et al. (2008a) for more details.
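To make the definition concrete, here is a small numerical sketch (our own illustration, not code from the paper) of the VST of Eq. (1): the constants b and c are computed from the filter moments τ_k, and a Monte-Carlo check confirms that the stabilized variance is close to 1 after low-pass filtering, even at a fairly low intensity.

```python
import numpy as np

def vst_constants(h):
    """Return (b, c) of Eq. (1) for a filter h, from tau_k = sum_i h[i]**k."""
    tau1, tau2, tau3 = (np.sum(h**k) for k in (1, 2, 3))
    c = 7.0 * tau2 / (8.0 * tau1) - tau3 / (2.0 * tau2)
    b = 2.0 * np.sqrt(np.abs(tau1) / tau2)
    return b, c

def vst(y, b, c):
    """Square-root VST: Z = b * sign(Y + c) * sqrt(|Y + c|)."""
    return b * np.sign(y + c) * np.sqrt(np.abs(y + c))

# With h = delta (no filtering) we must recover Anscombe's b = 2, c = 3/8.
b0, c0 = vst_constants(np.array([1.0]))

# Monte-Carlo check with a 1D B3-spline low-pass kernel at low intensity.
rng = np.random.default_rng(0)
h = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
b, c = vst_constants(h)
lam = 2.0
x = rng.poisson(lam, size=(100_000, h.size))
y = x @ h                     # each row yields one sample of Y = sum h[j] X_j
z = vst(y, b, c)              # stabilized samples; variance should be near 1
```

The empirical variance of `z` comes out close to 1, illustrating the stabilization, whereas the raw filtered variable has variance λτ_2.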
Figure 2: Behavior of the expectation ( left) and variance ( right) of the stabilized variable as a function of the underlying intensity, for the Anscombe VST, the 2D Haar-Fisz VST, and our VST with the 2D B_{3}-Spline filter as the low-pass filter h.

Figure 2 shows the Monte-Carlo estimates of the expectation (left) and the variance (right) obtained from 2 × 10^{5} Poisson noise realizations of X, plotted as a function of the intensity λ, for the Anscombe VST (Anscombe 1948) (dashed-dotted), the Haar-Fisz VST (dashed) (Fryzlewicz & Nason 2004), and our VST with the 2D B_{3}-Spline filter as the low-pass filter h (solid). The asymptotic bounds (dots) (i.e. 1 for the variance and b√(τ_1 λ) for the expectation) are also shown. It can be seen that, as the intensity increases, the expectation and variance approach the theoretical bounds at rates that depend on the VST used. Quantitatively, Poisson variables transformed using the Anscombe VST can reasonably be considered unbiased and stabilized only for relatively large intensities, using Haar-Fisz for intermediate intensities, and using our VST (after low-pass filtering with the chosen h) down to intensities of the order of 0.1.
3.2 The isotropic undecimated wavelet transform
The undecimated wavelet transform (UWT) uses an analysis filter bank (h, g) to decompose a signal a_{0} into a coefficient set W = {d_1, …, d_J, a_J}, where d_{j} is the wavelet (detail) coefficients at scale j and a_{J} is the approximation coefficients at the coarsest resolution J. The passage from one resolution to the next is obtained using the ``à trous'' algorithm (Shensa 1992; Holschneider et al. 1989)

a_{j+1}[l] = (h̄^{(j)} ⊛ a_j)[l],   d_{j+1}[l] = (ḡ^{(j)} ⊛ a_j)[l],

where h^{(j)}[l] = h[l] if l / 2^{j} is an integer and 0 otherwise (the filter h with 2^{j} − 1 zeros inserted between each pair of taps), and ``⊛'' denotes discrete circular convolution. The reconstruction is given by a_j = h̃^{(j)} ⊛ a_{j+1} + g̃^{(j)} ⊛ d_{j+1}. The filter bank needs to satisfy the so-called exact reconstruction condition (Starck & Murtagh 2006; Mallat 1998).
The Isotropic UWT (IUWT) (Starck et al. 2007) uses the filter bank (h, g = δ − h, h̃ = δ, g̃ = δ), where h is typically a symmetric low-pass filter such as the B_{3}-Spline filter. The reconstruction is trivial, i.e., a_0 = a_J + ∑_{j=1}^{J} d_j. This algorithm is widely used in astronomical applications (Starck et al. 1998) and biomedical imaging (Olivo-Marin 2002) to detect isotropic objects.
The IUWT filter bank in q dimensions (q ≥ 2) becomes (h_{qD}, g_{qD} = δ − h_{qD}, h̃ = δ, g̃ = δ), where h_{qD} is the tensor product of q 1D filters h_{1D}. Note that g_{qD} is in general nonseparable.
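As an illustration, the à trous recursion and the trivial IUWT reconstruction can be sketched in a few lines (our own sketch, assuming the B_{3}-Spline filter; this is not the paper's implementation):

```python
import numpy as np
from scipy.ndimage import convolve1d

B3 = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # 1D B3-spline filter h

def atrous_smooth(img, j):
    """Convolve with the B3 kernel upsampled by 2**j (zeros between taps)."""
    k = np.zeros((len(B3) - 1) * 2**j + 1)
    k[::2**j] = B3
    out = convolve1d(img, k, axis=0, mode="wrap")   # "wrap" = circular conv.
    return convolve1d(out, k, axis=1, mode="wrap")

def iuwt(img, n_scales):
    """2D IUWT: details d_{j+1} = a_j - a_{j+1}, plus coarse approximation."""
    a, details = img, []
    for j in range(n_scales):
        a_next = atrous_smooth(a, j)
        details.append(a - a_next)
        a = a_next
    return details, a

rng = np.random.default_rng(1)
image = rng.poisson(3.0, size=(64, 64)).astype(float)
details, coarse = iuwt(image, 4)
recon = coarse + sum(details)   # trivial reconstruction a_0 = a_J + sum d_j
```

Since the details are defined as successive differences, the reconstruction telescopes and is exact by construction.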
Figure 3: Diagrams of the MSVST combined with the IUWT. The notations are the same as those of (4) and (7). The left dashed frame shows the decomposition part. Each stage of this frame corresponds to a scale j and an application of (4). The right dashed frame illustrates the direct inversion (7). 

3.3 MSVST with the IUWT
Now the VST can be combined with the IUWT in the following way: since the filters h^{(j)} at all scales j are low-pass filters (and so have nonzero means), we can first stabilize the approximation coefficients a_{j} at each scale using the VST, and then compute in the standard way the detail coefficients from the stabilized a_{j}'s. Given the particular structure of the IUWT analysis filters (h, g), the stabilization procedure is given by

a_j = h̄^{(j−1)} ⊛ a_{j−1},   w_j = A_{j−1}(a_{j−1}) − A_j(a_j).   (4)

Note that the VST is now scale-dependent (hence the name MSVST). The filtering step on a_{j−1} can be rewritten as a filtering on a_0 = X, i.e., a_j = h^{(j)} ⊛ a_0, where h^{(j)} = h̄^{(j−1)} ⊛ ⋯ ⊛ h̄^{(1)} ⊛ h̄ for j ≥ 1 and h^{(0)} = δ. A_j is the VST operator at scale j:

A_j(a_j) = b^{(j)} sign(a_j + c^{(j)}) √(|a_j + c^{(j)}|).

Let us define τ_k^{(j)} := ∑_i (h^{(j)}[i])^k. Then, according to (1), the constants b^{(j)} and c^{(j)} associated with h^{(j)} must be set to

b^{(j)} = 2 √(|τ_1^{(j)}| / τ_2^{(j)}),   c^{(j)} = (7 τ_2^{(j)})/(8 τ_1^{(j)}) − τ_3^{(j)}/(2 τ_2^{(j)}).
The constants b^{(j)} and c^{(j)} only depend on the filter h and the scale level j. They can all be precomputed once for any given h. A schematic overview of the decomposition and the inversion of MSVST+IUWT is depicted in Fig. 3.
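The precomputation of these constants can be sketched as follows (our own illustration; `iterated_filter` and `msvst_constants` are our names, and the time reversal is omitted since the B3 filter is symmetric):

```python
import numpy as np

B3 = np.array([1, 4, 6, 4, 1]) / 16.0

def iterated_filter(h, j):
    """Equivalent filter h^(j): j-fold 'a trous' iteration of h; h^(0) = delta."""
    out = np.array([1.0])
    for s in range(j):
        up = np.zeros((len(h) - 1) * 2**s + 1)
        up[::2**s] = h                      # h upsampled by 2**s
        out = np.convolve(out, up)
    return out

def msvst_constants(h, j):
    """Scale-dependent VST constants (b^(j), c^(j)) from Eq. (1)."""
    hj = iterated_filter(h, j)
    t1, t2, t3 = (np.sum(hj**k) for k in (1, 2, 3))
    b = 2.0 * np.sqrt(np.abs(t1) / t2)
    c = 7.0 * t2 / (8.0 * t1) - t3 / (2.0 * t2)
    return b, c

# The table depends only on h and j, so it is computed once and reused.
table = {j: msvst_constants(B3, j) for j in range(5)}
```

At scale j = 0 the equivalent filter is the Kronecker delta, so the table reproduces the Anscombe parameters; at coarser scales τ_2^{(j)} shrinks and b^{(j)} grows accordingly.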
In summary, IUWT denoising with the MSVST involves the following three main steps:
 1.
 Transformation: compute the IUWT in conjunction with the MSVST as described above.
 2.
 Detection: detect significant detail coefficients by hypothesis testing. The appeal of a binary hypothesis testing approach is that it allows quantitative control of significance. Here, we benefit from the asymptotic Gaussianity of the stabilized a_{j}'s, which is transferred to the w_{j}'s, as shown by Zhang et al. (2008a). Indeed, these authors have proved that, under the null hypothesis H_{0}: w_{j}[k] = 0, corresponding to the case where the signal is homogeneous (smooth), the stabilized detail coefficients w_{j} asymptotically follow a centered normal distribution with an intensity-independent variance; see Zhang et al. (2008a, Theorem 1) for details. This variance depends only on the filter h and the current scale, and can be tabulated once for any h. Thus, the distribution of the w_{j}'s being known (Gaussian), we can detect the significant coefficients by classical binary hypothesis testing.
 3.
 Estimation: reconstruct the final estimate using the knowledge of the detected coefficients. This step requires inverting the MSVST after the detection step. For the IUWT filter bank, there is a closed-form inversion expression, as we have

a_0 = A_0^{−1}( A_J(a_J) + ∑_{j=1}^{J} w_j ).   (7)
3.3.1 Example
Figure 4: Top: XMM simulated data and the Haar-Kolaczyk (Kolaczyk 1997) filtered image. Bottom: the Haar-Jammal-Bijaoui (Bijaoui & Jammal 2001) and MSVST filtered images. Intensities are logarithmically transformed.

Figure 4 upper left shows a set of objects of different sizes and intensities contaminated by Poisson noise. Each object along any radial branch has the same integrated intensity within its support, and the support becomes more and more extended as we move away from the center. The integrated intensity decreases as the branches turn clockwise. Denoising such an image is challenging. Figure 4 top right, bottom left, and bottom right show, respectively, the images filtered by Haar-Kolaczyk (Kolaczyk 1997), Haar-Jammal-Bijaoui (Bijaoui & Jammal 2001), and the MSVST.
As expected, the relative merits (sensitivity) of the MSVST estimator become increasingly salient as we move away from the center and as the branches turn clockwise. That is, the MSVST estimator outperforms its competitors as the intensity becomes low. Most sources were detected by the MSVST estimator even in very low count situations; see the last branches clockwise in Fig. 4 bottom right, and compare with Fig. 4 top right and bottom left.
4 2D-1D MSVST denoising
4.1 2D-1D wavelet transform
In the previous section, we have seen how Poisson noise can be removed from a 2D image using the IUWT and the MSVST. The extension to q-D data sets is straightforward, and the denoising will be nearly optimal as long as each object belonging to this q-dimensional space is roughly isotropic. In the case of 3D data where the third dimension is either time or energy, we are clearly not in this configuration, and a naive 3D isotropic wavelet analysis does not make sense. Therefore, we want to analyze the data with a nonisotropic wavelet, where the time or energy scale is not connected to the spatial scale. Hence, an ideal wavelet function would be defined by

ψ(x, y, z) = ψ^{(xy)}(x, y) ψ^{(z)}(z),   (8)

where ψ^{(xy)} is the spatial wavelet and ψ^{(z)} is the temporal (or energy) wavelet. In the following, we consider only isotropic and dyadic spatial scales, and we denote by j_1 the spatial resolution index (i.e. scale = 2^{j_1}) and by j_2 the time (or energy) resolution index. Thus, we define the scaled spatial and temporal (or energy) wavelets

ψ^{(xy)}_{j_1}(x, y) = (1/2^{j_1}) ψ^{(xy)}(x/2^{j_1}, y/2^{j_1}),   ψ^{(z)}_{j_2}(z) = (1/√(2^{j_2})) ψ^{(z)}(z/2^{j_2}).

Hence, we derive the wavelet coefficients w_{j_1,j_2}[k_x, k_y, k_z] of a given data set D (k_x and k_y are spatial indices and k_z a time (or energy) index). In continuous coordinates, this amounts to the formula

w_{j_1,j_2}(x, y, z) = (D * ψ̄_{j_1,j_2})(x, y, z),   (9)

where * is the continuous convolution, ψ_{j_1,j_2}(x, y, z) = ψ^{(xy)}_{j_1}(x, y) ψ^{(z)}_{j_2}(z), and ψ̄(x, y, z) := ψ(−x, −y, −z).
Figure 5: Overview of MSVST with the 2D-1D IUWT. The diagram summarizes the main steps for computing the detail coefficients w_{j1,j2} in (19). The notations are exactly the same as those of Sect. 4.2.

Fast undecimated 2D-1D decomposition/reconstruction
In order to have a fast algorithm for discrete data, we use wavelet functions associated with filter banks. Hence, our wavelet decomposition consists in first applying a 2D IUWT to each frame k_z. Using the 2D IUWT, we have the reconstruction formula

a_0[k_x, k_y, k_z] = a_{J_1}[k_x, k_y, k_z] + ∑_{j_1=1}^{J_1} w_{j_1}[k_x, k_y, k_z],   (10)

where J_1 is the number of spatial scales. Then, for each spatial location (k_x, k_y) and each 2D wavelet scale j_1, we apply a 1D wavelet transform along z to the spatial wavelet coefficients w_{j_1}[k_x, k_y, k_z], such that

w_{j_1}[k_x, k_y, k_z] = w_{j_1, J_2}[k_x, k_y, k_z] + ∑_{j_2=1}^{J_2} w_{j_1, j_2}[k_x, k_y, k_z],   (11)

where J_2 is the number of scales along z and w_{j_1, J_2} denotes the coarse approximation along z of w_{j_1}. The same processing is also applied to the coarse spatial scale a_{J_1}[k_x, k_y, k_z], and we have

a_{J_1}[k_x, k_y, k_z] = a_{J_1, J_2}[k_x, k_y, k_z] + ∑_{j_2=1}^{J_2} w_{J_1, j_2}[k_x, k_y, k_z].   (12)

Hence, we have a 2D-1D undecimated wavelet representation of the input data D:

D[k_x, k_y, k_z] = a_{J_1, J_2} + ∑_{j_1=1}^{J_1} w_{j_1, J_2} + ∑_{j_2=1}^{J_2} w_{J_1, j_2} + ∑_{j_1=1}^{J_1} ∑_{j_2=1}^{J_2} w_{j_1, j_2},   (13)

with the convention that the index J_1 (resp. J_2) in a w-coefficient refers to the spatial (resp. z) approximation.
From this expression, we distinguish four kinds of coefficients:

 Detail-Detail coefficients (j_1 ≤ J_1 and j_2 ≤ J_2), obtained with the two wavelets:

w_{j_1,j_2}[k_x, k_y, k_z] = ⟨D, ψ^{(xy)}_{j_1,k_x,k_y} ψ^{(z)}_{j_2,k_z}⟩;   (14)

 Approximation-Detail coefficients (j_1 = J_1 and j_2 ≤ J_2), obtained with the spatial scaling function φ^{(xy)}:

w_{J_1,j_2}[k_x, k_y, k_z] = ⟨D, φ^{(xy)}_{J_1,k_x,k_y} ψ^{(z)}_{j_2,k_z}⟩;   (15)

 Detail-Approximation coefficients (j_1 ≤ J_1 and j_2 = J_2), obtained with the scaling function φ^{(z)} along z:

w_{j_1,J_2}[k_x, k_y, k_z] = ⟨D, ψ^{(xy)}_{j_1,k_x,k_y} φ^{(z)}_{J_2,k_z}⟩;   (16)

 Approximation-Approximation coefficients (j_1 = J_1 and j_2 = J_2):

a_{J_1,J_2}[k_x, k_y, k_z] = ⟨D, φ^{(xy)}_{J_1,k_x,k_y} φ^{(z)}_{J_2,k_z}⟩.   (17)

A denoised version of the data can then be obtained by applying a nonlinear operator ρ to the detail coefficients before reconstruction:

D̃ = a_{J_1,J_2} + ∑_{j_1=1}^{J_1} ρ(w_{j_1,J_2}) + ∑_{j_2=1}^{J_2} ρ(w_{J_1,j_2}) + ∑_{j_1=1}^{J_1} ∑_{j_2=1}^{J_2} ρ(w_{j_1,j_2}).   (18)

A typical choice of ρ is the hard thresholding operator, i.e. ρ(x) = 0 if |x| is below a given threshold τ, and ρ(x) = x otherwise. The threshold τ is generally chosen between 3 and 5 times the noise standard deviation (Starck & Murtagh 2006).
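A minimal numerical sketch (ours, not the paper's code) of this separable 2D-1D decomposition, checking that the reconstruction formula (13) holds exactly:

```python
import numpy as np
from scipy.ndimage import convolve1d

B3 = np.array([1, 4, 6, 4, 1]) / 16.0

def smooth(cube, j, axes):
    """A trous smoothing: B3 kernel upsampled by 2**j, applied along axes."""
    k = np.zeros((len(B3) - 1) * 2**j + 1)
    k[::2**j] = B3
    out = cube
    for ax in axes:
        out = convolve1d(out, k, axis=ax, mode="wrap")
    return out

def uwt(cube, n_scales, axes):
    """Undecimated transform along the given axes: details + approximation."""
    a, details = cube, []
    for j in range(n_scales):
        a_next = smooth(a, j, axes)
        details.append(a - a_next)
        a = a_next
    return details, a

rng = np.random.default_rng(2)
D = rng.poisson(0.5, size=(32, 32, 16)).astype(float)

w_spatial, a_J1 = uwt(D, 3, axes=(0, 1))        # 2D IUWT frame by frame
dd = [uwt(w, 2, axes=(2,)) for w in w_spatial]  # detail-detail + detail-approx
ad, aa = uwt(a_J1, 2, axes=(2,))                # approx-detail + approx-approx

# Eq. (13): approx-approx + approx-details + (detail-approx + detail-details)
recon = aa + sum(ad) + sum(w_dz + sum(w_dd) for w_dd, w_dz in dd)
```

Because each 1D/2D stage is a telescoping difference, the four coefficient families sum back exactly to the input cube.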
4.2 Variance stabilization
Putting all the pieces together, we are now ready to plug the MSVST into the 2D-1D undecimated wavelet transform: each approximation coefficient involved in a detail coefficient is first stabilized by the VST associated with its own equivalent low-pass filter, exactly as in (4). Denoting A_{j_1,j_2} the VST operator whose constants (b^{(j_1,j_2)}, c^{(j_1,j_2)}) are computed according to (1) from the equivalent low-pass filter joining spatial scale j_1 and z scale j_2, and a_{j_1,j_2} the data approximation at those scales (a_{0,0} = D), the four kinds of coefficients take the following forms:

 Detail-Detail coefficients (j_1 ≤ J_1 and j_2 ≤ J_2):

w_{j_1,j_2} = [A_{j_1−1,j_2−1}(a_{j_1−1,j_2−1}) − A_{j_1,j_2−1}(a_{j_1,j_2−1})] − [A_{j_1−1,j_2}(a_{j_1−1,j_2}) − A_{j_1,j_2}(a_{j_1,j_2})].   (19)

The schematic overview of the way the detail coefficients w_{j_1,j_2} are computed is illustrated in Fig. 5.

 Approximation-Detail coefficients (j_1 = J_1 and j_2 ≤ J_2):

w_{J_1,j_2} = A_{J_1,j_2−1}(a_{J_1,j_2−1}) − A_{J_1,j_2}(a_{J_1,j_2}).   (20)

 Detail-Approximation coefficients (j_1 ≤ J_1 and j_2 = J_2):

w_{j_1,J_2} = A_{j_1−1,J_2}(a_{j_1−1,J_2}) − A_{j_1,J_2}(a_{j_1,J_2}).   (21)

 Approximation-Approximation coefficients (j_1 = J_1 and j_2 = J_2): these coefficients are left unstabilized,

a_{J_1,J_2}.   (22)
4.3 Detection-reconstruction
As the noise on the stabilized coefficients is Gaussian and, without loss of generality, we let its standard deviation equal 1, we consider that a wavelet coefficient w_{j_1,j_2}[k_x,k_y,k_z] is significant, i.e., not due to noise, if its absolute value is larger than a critical threshold κ, where κ is typically between 3 and 5.
The multiresolution support is obtained by detecting the significant coefficients at each scale. For j_1 ≤ J_1 and j_2 ≤ J_2, it is defined as

M[j_1, j_2, k_x, k_y, k_z] = 1 if w_{j_1,j_2}[k_x, k_y, k_z] is declared significant, and 0 otherwise.   (23)
In words, the multiresolution support M indicates at which scales (spatial and time/energy) and at which positions we have significant signal. We denote by W the 2D-1D undecimated wavelet transform described above, by R the inverse wavelet transform, and by Y the input noisy data cube.
We want our solution X to preserve the significant structures of the original data by reproducing exactly the same coefficients as the wavelet coefficients of the input data Y, but only at scales and positions where significant signal has been detected (i.e. M W X = M W Y). At other scales and positions, we want the smoothest solution with the lowest budget in terms of wavelet coefficients. Furthermore, as Poisson intensity functions are positive by nature, a positivity constraint is imposed on the solution. It is clear that there are many solutions satisfying the positivity and multiresolution support consistency requirements, e.g. Y itself. Thus, our reconstruction problem based solely on these constraints is an ill-posed inverse problem that must be regularized. Typically, the solution in which we are interested must be sparse, involving the lowest budget of wavelet coefficients. Therefore, our reconstruction is formulated as a constrained sparsity-promoting minimization problem, which can be written as follows:

min_X ‖W X‖_1   subject to   M W X = M W Y and X ≥ 0,   (24)

where ‖·‖_1 is the ℓ_1 norm, playing the role of regularization, which is well known to promote sparsity (Donoho 2004). This problem can be solved efficiently using the hybrid steepest descent algorithm (Yamada 2001; Zhang et al. 2008a), and requires about 10 iterations in practice. Transposed into our context, its main steps can be summarized as follows:
X^{(t+1)} = P_+ ( R [ M W Y + (1 − M) ST_{β_t}( W X^{(t)} ) ] ),

where P_+ is the projector onto the positive orthant, i.e. (P_+ x)[k] = max(x[k], 0), and ST_{β_t} is the soft-thresholding operator with threshold β_t, i.e. ST_{β_t}(x) = sign(x)(|x| − β_t) if |x| ≥ β_t, and 0 otherwise; the threshold β_t decreases with the iteration count t.
4.4 Algorithm summary
The final MSVST 2D-1D wavelet denoising algorithm chains the three steps detailed above: (i) 2D-1D MSVST transformation of the input cube; (ii) detection of the significant stabilized coefficients, yielding the multiresolution support M; and (iii) iterative reconstruction under the support-consistency, sparsity, and positivity constraints.
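The chain can be summarized in pseudocode (our own hedged rendering, with illustrative helper names such as `msvst_2d1d_transform`; this is not the paper's original listing):

```
# Pseudocode sketch (ours); helper names are illustrative, not from the paper.
# Step 1 -- Transformation: 2D-1D IUWT with MSVST-stabilized coefficients.
(w, a) = msvst_2d1d_transform(Y, J1, J2)       # Sect. 4.2, Eqs. (19)-(22)
# Step 2 -- Detection: build the multiresolution support, Eq. (23).
M[s] = 1 where |w[s]| > kappa                  # stabilized noise has std 1
# Step 3 -- Estimation: hybrid steepest descent, Sect. 4.3, Eq. (24).
X = 0
repeat ~10 times:
    wX = wavelet_2d1d_transform(X)             # unstabilized 2D-1D IUWT
    wX = M * wavelet_2d1d_transform(Y) + (1 - M) * soft_threshold(wX, beta)
    X = max(reconstruct(wX), 0)                # positivity projection
    decrease beta towards 0
return X
```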
5 Experimental results and discussion
5.1 MSVST-2D1D versus MSVST-2D
Figure 6: Image obtained by integrating the simulated data cube along the z-axis.

Figure 7: Top: 2D-MSVST filtering of the integrated image at two different detection levels. Bottom: integrated image after a 2D1D-MSVST denoising of the simulated data cube, at two correspondingly higher detection levels.

We simulated a data cube according to the procedure described in Sect. 2.2. The cube contains several sources, with spatial positions on a regular grid: seven columns and five rows of LAT sources (i.e. 35 sources) with different power-law spectra. The total number of photons is 25 948, i.e. an average of 0.032 photons per pixel. Figure 6 shows the 2D image obtained after integrating the simulated data cube along the z-axis. Figure 7 shows a comparison between the 2D-MSVST denoising of this image and the image obtained by first applying a 2D1D-MSVST denoising to the input cube and integrating afterward along the z-axis. Figure 7 upper left and right show the denoising results for the 2D-MSVST at two threshold values, and Fig. 7 bottom left and right show the results for the 2D1D-MSVST at two (higher) detection levels. The reason for using a higher threshold level for the 2D-1D cube is to correct for multiple hypothesis tests and to keep the same control over global statistical error rates. Roughly speaking, the number of false detections increases with the number of coefficients being tested simultaneously. Therefore, one must correct for multiple comparisons, using e.g. the conservative Bonferroni correction or the false discovery rate (FDR) procedure (Benjamini & Hochberg 1995). As the number of coefficients is much higher for the whole 2D-1D cube, the critical detection threshold of the 2D-1D denoising must be higher to achieve a false detection rate comparable to the 2D denoising. As we can clearly see from Fig. 7, the results are very close. This means that applying a 2D-1D denoising to the cube instead of a 2D denoising to the integrated image does not degrade the detection power of the MSVST. The main advantage of the 2D1D-MSVST is that we recover the spectral (or temporal) information for each spatial position.
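The effect of the multiple-testing correction on the detection level can be illustrated with a small self-contained computation (ours; the image and cube sizes below are purely illustrative, not the paper's):

```python
# Illustration (ours) of the threshold correction discussed above: with a
# conservative Bonferroni correction, the per-coefficient Gaussian threshold
# (in units of sigma) grows with the number of coefficients tested, so the
# 2D-1D cube needs a higher detection level than the integrated 2D image.
from statistics import NormalDist

def bonferroni_threshold(n_tests, alpha=0.05):
    """Two-sided per-test Gaussian threshold for a global error rate alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * n_tests))

t_2d = bonferroni_threshold(256 * 256)          # coefficients of a 2D image
t_2d1d = bonferroni_threshold(256 * 256 * 64)   # coefficients of a 2D-1D cube
```

With these illustrative sizes the 2D threshold comes out near 5σ and the 2D-1D one noticeably higher, matching the qualitative argument above; the FDR procedure would yield less conservative levels.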
Figure 8 shows two frames of the input cube (frame 16, top left, and frame 25, bottom left) and the same frames after the 2D1D-MSVST denoising (top right and bottom right). Figure 9 displays the spectra obtained at the two spatial positions (112, 47) and (126, 79), which correspond to the centers of two distinct sources.
Figure 8: Top: frame number 16 of the input cube and the same frame after the 2D1D-MSVST filtering. Bottom: frame number 25 of the input cube and the same frame after the 2D1D-MSVST filtering.

Figure 9: Pixel spectra at two different spatial locations after the 2D1DMSVST filtering. 

Figure 10: Time-varying source. From left to right: the simulated source, its temporal flux, and the image obtained by coadding the noisy data cube along the time axis.

Figure 11: Recovered time-varying source. Left: one frame of the denoised cube. Right: flux per time frame for the noisy data after background subtraction (solid line), for the original noise-free cube (thick-solid line), and for the recovered source (dashed line).

5.2 Time-varying source detection
We have simulated a time varying source in a cube of size . The source has a Gaussian shape both in space and time. It is centered in the middle of the cube at (32,32,64); i.e. its brightest point is at this location. The standard deviation of the Gaussian is 1.8 in space (pixel unit), and 1.2 along time (frame unit). The total flux of the source (i.e. spatial and temporal integration) is 100. We have added a background level of 0.1. Finally, Poisson noise was generated. Figure 10 shows respectively from left to right an image of the original source, the flux per time frame and the integration of all noisy frames along the time axis. As it can be seen, the source is hardly detectable in Fig. 10 right. By running the 2DMSVST denoising method on the timeintegrated image, we were not able to detect it. Then we applied the 2D1DMSVST denoising method on the noisy 3D data set. This time, we were able to restore the source with a threshold level . Figure 11 left depicts one frame (frame 64) of the denoised cube, and Fig. 11 right shows the flux of the recovered source per frame (dotted line). The solid and thicksolid lines show respectively the flux per time frame after background subtraction in the noisy data and the original noisefree data set. We can conclude from this experiment that the 2D1DMSVST is able to recover rapidly timevarying sources in the spatiotemporal data set, whereas even a robust algorithm such as the 2DMSVST method will completely fail if we integrate along the time axis. This was expected since the coaddition of all frames mixes the few frames containing the source with those which contain only the noisy background. Coadding followed by a 2D detection is clearly suboptimal, except if we repeat the denoising procedure with many temporal windows with varying size. 
We also note that the 2D-1D MSVST recovers very well the times at which the source flares. The source is, however, slightly spread out along the time axis and its flux is not estimated very accurately; other methods, such as maximum likelihood, should therefore be preferred for accurate flux estimation once the sources have been detected.
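As a sketch of the maximum-likelihood alternative: if the source profile and background are known (both are assumptions here, as is the function name), the Poisson log-likelihood in the flux amplitude can be maximized with a few Newton iterations. This is an illustrative stand-in, not the paper's pipeline:

```python
import numpy as np

def poisson_mle_flux(counts, profile, background, n_iter=30):
    """Maximum-likelihood amplitude a for the model a*profile + background
    with Poisson-distributed counts, found by Newton's method."""
    # moment-based starting point
    a = max(counts.sum() - background.sum(), 1e-6) / profile.sum()
    for _ in range(n_iter):
        mu = a * profile + background
        grad = np.sum(profile * (counts / mu - 1.0))   # d logL / da
        hess = -np.sum(profile**2 * counts / mu**2)    # d2 logL / da2
        a = max(a - grad / hess, 1e-12)                # keep a positive
    return a
```

For counts equal to the noise-free model this estimator returns the true amplitude exactly; with real Poisson noise it is asymptotically efficient, unlike a simple per-frame sum over the denoised cube.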
5.3 Diffuse emission of the Galaxy
Figure 12: Left, from top to bottom: simulated data of the diffuse gamma-ray emission of the Milky Way in the energy band 171-181 MeV, noisy simulated data, and filtered data using the MSVST. Right: same images for the energy band 9.87-10.4 GeV. 

In this experiment, we have simulated a cube using the Galprop code (Strong et al. 2007), which provides a model of the diffuse gamma-ray emission of the Milky Way. The units of the pixels are photons . The gridding in Galactic longitude and latitude is 0.5 degrees, and the 128 energy planes are logarithmically spaced from 30 MeV to 50 GeV. A six-month LAT data set was created by multiplying the simulated cube by the exposure (6 months) and convolving each energy band with the point spread function of the LAT instrument; the PSF varies strongly with energy. Finally, we created the noisy observations assuming a Poisson noise distribution.
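This forward model can be sketched as follows. The diffuse model, grid size, amplitude, and PSF law below are placeholders (the real Galprop cube is not reproduced, and the exposure is absorbed into the arbitrary amplitude); the Gaussian PSF width is merely anchored near the 0.1° at 10 GeV quoted for the LAT:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(1)

# Illustrative grid only; the paper uses 0.5-degree pixels over the full
# sky and 128 logarithmically spaced energy planes from 30 MeV to 50 GeV.
n_lon, n_lat, n_e = 64, 32, 16
energies = np.logspace(np.log10(0.03), np.log10(50.0), n_e)   # GeV

# Placeholder diffuse model: a power-law spectrum times a Gaussian
# "Galactic plane" stripe; the 6-month exposure is folded into the
# arbitrary amplitude.
spectrum = 5.0 * (energies / 0.03) ** -1.5
plane = np.exp(-((np.arange(n_lat) - n_lat // 2) ** 2) / 18.0)
model = spectrum[None, None, :] * plane[None, :, None] * np.ones((n_lon, 1, 1))

# Crude Gaussian stand-in for the LAT PSF: width shrinking with energy,
# anchored near 0.1 deg at 10 GeV; the E^-0.8 slope is an assumption.
sigma_pix = (0.1 * (energies / 10.0) ** -0.8) / 0.5   # 0.5-degree pixels

expected = np.empty_like(model)
for k in range(n_e):   # energy-dependent blurring, band by band
    expected[:, :, k] = gaussian_filter(model[:, :, k], sigma_pix[k],
                                        mode="wrap")

counts = rng.poisson(expected)   # noisy "observed" cube
```

The band-by-band convolution with an energy-dependent kernel is the essential point: the same sky is blurred by tens of pixels at the lowest energies and by a fraction of a pixel at the highest.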
Figure 12 left shows, from top to bottom, the original simulated data, the noisy data, and the filtered data for the 171-181 MeV energy band. The same figures for the 9.87-10.4 GeV band are shown in Fig. 12 right.
6 Conclusion
The motivations for a reliable nonparametric source detection algorithm to apply to Fermi LAT data are clear. Especially for the relatively short time ranges over which we will want to study sources, the data will be squarely in the low-counts regime, with widely varying response functions and significant celestial foregrounds. In this paper, we have shown that the MSVST, associated with a 2D-1D wavelet transform, is a very efficient way to detect time-varying sources. The proposed algorithm is as powerful as the 2D-MSVST applied to coadded frames when the source is slowly varying or constant over time. But when the source is rapidly varying, coadding frames that contain no source together with the frames that do loses detection power. Our approach thus provides an alternative to frame coaddition and outperforms 2D algorithms applied to the coadded frames. Unlike 2D denoising, our method fully exploits the information in the 3D data set and allows us to recover the source dynamics by detecting temporally varying sources.
Acknowledgements
We thank Jean-Marc Casandjian for providing us with the simulated data set of the diffuse emission of the Galaxy, and Jeff Scargle for his helpful comments and criticism. This work was partially supported by the French National Agency for Research (ANR 08EMER00901).
References
 Anscombe, F. 1948, Biometrika, 35, 246
 Benjamini, Y., & Hochberg, Y. 1995, J. R. Stat. Soc. B, 57, 289 (In the text)
 Bijaoui, A., & Jammal, G. 2001, Signal Processing, 81, 1789 [CrossRef]
 Donoho, D. L. 1993, Proc. Symp. Applied Mathematics: Different Perspectives on Wavelets, 47, 173
 Donoho, D. L. 2004, For Most Large Underdetermined Systems of Linear Equations, the minimal norm solution is also the sparsest solution, Tech. rep., Department of Statistics of Stanford Univ. (In the text)
 Fryzlewicz, P., & Nason, G. P. 2004, J. Comp. Graph. Stat., 13, 621 [CrossRef]
 Hartman, R. C., Bertsch, D. L., Bloom, S. D., et al. 1999, VizieR Online Data Catalog, 212, 30079 [NASA ADS] (In the text)
 Holschneider, M., Kronland-Martinet, R., Morlet, J., & Tchamitchian, P. 1989, in Wavelets: Time-Frequency Methods and Phase-Space (Springer-Verlag), 286
 Kolaczyk, E. 1997, ApJ, 483, 349 [NASA ADS] [CrossRef]
 Kolaczyk, E., & Nowak, R. 2004, Ann. Stat., 32, 500 [CrossRef]
 Mallat, S. 1998, A Wavelet Tour of Signal Processing (Academic Press)
 Murtagh, F., Starck, J.-L., & Bijaoui, A. 1995, A&AS, 112, 179 [NASA ADS] (In the text)
 Nowak, R., & Baraniuk, R. 1999, IEEE Trans. Image Processing, 8, 666 [NASA ADS] [CrossRef]
 Olivo-Marin, J. C. 2002, Pattern Recognition, 35, 1989 [CrossRef] (In the text)
 Pierre, M., Valtchanov, I., Altieri, B., et al. 2004, J. Cosmol. AstroPart. Phys., 9, 11 [NASA ADS] [CrossRef] (In the text)
 Pierre, M., Chiappetti, L., Pacaud, F., et al. 2007, MNRAS, 382, 279 [NASA ADS] [CrossRef] (In the text)
 Scargle, J. D. 1998, ApJ, 504, 405 [NASA ADS] [CrossRef]
 Shensa, M. J. 1992, IEEE Trans. Signal Processing, 40, 2464 [NASA ADS] [CrossRef]
 Slezak, E., de Lapparent, V., & Bijaoui, A. 1993, ApJ, 409, 517 [NASA ADS] [CrossRef]
 Starck, J.-L., & Pierre, M. 1998, A&AS, 128 (In the text)
 Starck, J.-L., & Murtagh, F. 2006, Astronomical Image and Data Analysis, Astronomy and Astrophysics Library (Berlin: Springer) (In the text)
 Starck, J.-L., Murtagh, F., & Bijaoui, A. 1998, Image Processing and Data Analysis: The Multiscale Approach (Cambridge University Press) (In the text)
 Starck, J.-L., Fadili, M., & Murtagh, F. 2007, IEEE Trans. Image Processing, 16, 297 [NASA ADS] [CrossRef] (In the text)
 Strong, A. W., Moskalenko, I. V., & Ptuskin, V. S. 2007, Ann. Rev. Nucl. Part. Sci. 57, 285 (In the text)
 Timmermann, K. E., & Nowak, R. 1999, IEEE Trans. Signal Processing, 46, 886
 Willett, R., & Nowak, R. 2005, IEEE Trans. Information Theory, submitted
 Willett, R. 2006, SCMA IV, in press
 Yamada, I. 2001, in Inherently Parallel Algorithms in Feasibility and Optimization and Their Applications, ed. D. Butnariu, Y. Censor, & S. Reich (Elsevier)
 Zhang, B., Fadili, M., & Starck, J.-L. 2008a, IEEE Trans. Image Processing, 17, 1093 [NASA ADS] [CrossRef] (In the text)
 Zhang, B., Fadili, M. J., Starck, J.-L., & Digel, S. W. 2008b, Stat. Methodol., 5, 387 [NASA ADS] [CrossRef]
All Figures
Figure 1: Cutaway view of the LAT. The LAT is modular; one of the 16 towers is shown with its tracking planes revealed. High-energy gamma rays convert to electron-positron pairs on tungsten foils in the tracking layers. The trajectories of the pair are measured very precisely using silicon strip detectors in the tracking layers, and the energies are determined with the CsI calorimeter at the bottom. The array of plastic scintillators that covers the towers provides an anticoincidence signal for cosmic rays. The outermost layers are a thermal blanket and micrometeoroid shield. The overall dimensions are 1.8 × 1.8 × 0.75 m. 

Figure 2: Behavior of the expectation (left) and variance (right) as a function of the underlying intensity, for the Anscombe VST, the 2D Haar-Fisz VST, and our VST with the 2D B_{3}-spline filter as the low-pass filter h. 

Figure 3: Diagrams of the MSVST combined with the IUWT. The notations are the same as those of (4) and (7). The left dashed frame shows the decomposition part. Each stage of this frame corresponds to a scale j and an application of (4). The right dashed frame illustrates the direct inversion (7). 

Figure 4: Top: XMM simulated data and the Haar-Kolaczyk (Kolaczyk 1997) filtered image. Bottom: the Haar-Jammal-Bijaoui (Bijaoui & Jammal 2001) and MSVST filtered images. Intensities are logarithmically transformed. 

Figure 5: Overview of the MSVST with the 2D-1D IUWT. The diagram summarizes the main steps for computing the detail coefficients w_{j1,j2} in (19). The notations are exactly the same as those of Sect. 4.2 with . 

Figure 6: Image obtained by integrating the simulated data cube along the z-axis. 

Figure 7: Top: 2DMSVST filtering on the integrated image with respectively a and a detection level. Bottom: integrated image after a 2D1DMSVST denoising of the simulated data cube, with respectively a and a detection level. 

Figure 8: Top: frame number 16 of the input cube and the same frame after the 2D1DMSVST filtering at . Bottom: frame number 25 of the input cube and the same frame after the 2D1DMSVST filtering at . 

Figure 9: Pixel spectra at two different spatial locations after the 2D1DMSVST filtering. 

Figure 10: Time-varying source. From left to right: the simulated source, the temporal flux, and the coadded image along the time axis of the noisy data cube. 
Copyright ESO 2009