Issue 
A&A
Volume 549, January 2013



Article Number  A83  
Number of page(s)  13  
Section  Numerical methods and codes  
DOI  https://doi.org/10.1051/00046361/201219950  
Published online  21 December 2012 
Estimating hyperparameters and instrument parameters in regularized inversion Illustration for Herschel/SPIRE map making
^{1} Institut d’Astrophysique de Paris (CNRS – Univ. Paris 6), 75014 Paris, France
email: orieux@iap.fr
^{2} Univ. Bordeaux, IMS, UMR 5218, 33400 Talence, France
email: Giova@IMSBordeaux.fr
^{3} Laboratoire des Signaux et Systèmes (CNRS – Supélec – Univ. ParisSud 11), 91192 GifsurYvette, France
email: orieux,rodet@lss.supelec.fr
^{4} Institut d’Astrophysique Spatiale (CNRS – Univ. ParisSud 11), 91405 Orsay, France
email: abergel@ias.upsud.fr
Received: 5 July 2012
Accepted: 19 October 2012
We describe regularized methods for image reconstruction and focus on the question of hyperparameter and instrument parameter estimation, i.e. unsupervised and myopic problems. We developed a Bayesian framework that is based on the posteriordensity for all unknown quantities, given the observations. This density is explored by a Markov chain MonteCarlo sampling technique based on a Gibbs loop and including a MetropolisHastings step. The numerical evaluation relies on the SPIRE instrument of the Herschel observatory. Using simulated and real observations, we show that the hyperparameters and instrument parameters are correctly estimated, which opens up many perspectives for imaging in astrophysics.
Key words: methods: data analysis / methods: statistical / methods: numerical / techniques: image processing
© ESO, 2013
1. Unsupervised myopic inversion
The agreement of physical models and observations is a crucial question in astrophysics, however, observation instruments inevitably have defects and limitations (limited passband, nonzero response time, attenuation, error and uncertainty, etc.). Their inversion by numerical processing must, as far as possible, be based on an instrument model that includes a description of these defects and limitations. The difficulties of such inverse problems, and notably their often illposedness, were well identified several decades ago in various communities: signal and image processing and statistics, and also mathematical physics and astrophysics. It seems pertinent to take advantage of the knowledge amassed by these communities concerning both the analysis of the problems and their solutions.
The illposedness comes from a deficit of available information (and not only from a “simple numerical problem”), which becomes all the more marked as resolution requirements increase. The inversion methods must therefore take other information into account to compensate for the deficits in the observations: this is known as regularization. Each reconstruction method is thus specialised for a certain class of objects (point sources, diffuse emission, superposition of the two, etc.) according to the information accounted for. Consequently, in as much as it relies on various sources of information, each method is based on a tradeoff, which usually requires the setting of hyperparameters, denoted by ξ in the following. The question of their automatic tuning, namely unsupervised inversion, has been extensively studied and numerous attempts investigate statistical approaches: approximated, pseudo or marginal likelihood, in a Bayesian or nonBayesian sense, EM, SEM and SAEM algorithms, etc. The reader may consult papers such as (Zhou et al. 1997; de Figueiredo & Leitao 1997; Saquib et al. 1998; Descombes et al. 1999; Molina et al. 1999; Lanterman et al. 2000; Pascazio & Ferraiuolo 2003; Blanc et al. 2003; Chantas et al. 2007; Giovannelli 2008; Babacan et al. 2010; Orieux et al. 2010a) and reference books such as (Winkler 2003, Part.VI), (Li 2001, Chap. 7) or (Idier 2008, Chap. 8). Alternative methods are based on the Lcurve (Hansen 1992; Wiegelmann & Inhester 2003) or on generalised crossvalidation (Golub et al. 1979; Fortier et al. 1993; Ocvirk et al. 2006).
The construction of maps of high resolution and accuracy relies on increasingly complex instruments. So, inversion methods require instrument models that faithfully reflect the physical reality to distinguish, in the observations, between what is caused by the instrument and what is due to the actual sky. Then, a second set of parameters comes into play: the instrument parameters, denoted by η in the following, such as lobe width, amplitude of secondary lobes, response time, or gain. Their values are of prime importance and their settings are generally based on dedicated observation and rely on models and/or calibrations that inevitably contain errors. For example, the lobe widths are usually determined from a specific observation in a spectral band of nonzero width; consequently the result depends on the source spectrum. Correction factors can be applied but, naturally, they also contain errors when the source spectrum is poorly known or unknown. In contrast, our aim is to achieve myopic inversion, i.e. to estimate the instrument parameters without dedicated observation. The question arises in various fields: optical imaging (Pankajakshani et al. 2009), interferometry (Thiébaut 2008), satellite observation (Jalobeanu et al. 2002), magnetic resonance force microscopy (Dobigeon et al. 2009), fluorescence microscopy (Zhang et al. 2007), deconvolution (Orieux et al. 2010b), etc. A similar problem deals with nonparametric intrument response (blind inversion), for which the literature is also very abundant: (Mugnier et al. 2004; Thiébaut & Conan 1995; Fusco et al. 1999; Conan et al. 1998) in astronomy and (Lam & Goodman 2000; Likas & Galatsanos 2004; Molina et al. 2006; Bishop et al. 2008; Xu & Lam 2009) in the signalimage literature represent examples. The present paper is devoted to parameter estimation for the instrument model developed in our previous paper (Orieux et al. 2012b), based on an accurate instrument model.
A threefold problem has to be solved: from a unique observation, estimate the hyperparameters the instrument parameters and the map. This is referred to as unsupervised and myopic inversion. From the methodological point of view, the proposed inversion method comes within a Bayesian approach (Idier 2008). In this family, we find the classic Wiener and Kalman methods that calculate the expectation or the maximizer of a posteriordensity. In an equivalent way, the PhillipsTwomeyTikhonov methods calculate the minimizer of a leastsquares criterion with quadratic penalization. These methods are based on a secondorder analysis (Gaussian models, quadratic criteria) and lead to linear processing. The work proposed here is in a similar methodological vein as far as estimating the map goes; however, the contribution concerns the estimation of the hyperparameters and instrument parameters. We resort to an entirely Bayesian approach (also called fullBayes) that models the information for each variable (observations, unknown map as well as hyperparameters and instrument parameters) through a probability density. Based on an a posterioridistribution for all the unknown variables, the proposed method jointly estimates the instrument parameters, the hyperparameters, and the map of interest. Regarding experimental data processing, the present paper follows (Orieux et al. 2012b) on inversion for the SPIRE instrument onboard Herschel, which requires the hyperparameters to be fixed by hand and the instrument parameters to be known. The proposed method can automatically tune these parameters and may permit the systematic and automatic processing of large information streams coming from present and future spacebased instruments (e.g. Herschel, Planck, JWST, etc.).
The paper is structured as follows. Section 2 introduces the notation and sets out the problem. Section 3 presents the inversion method: it introduces the priordensities and leads to the posteriordensity. Section 4 describes the computing method based on Markov chain MonteCarlo (Gibbs) stochastic sampling algorithms. The work is essentially evaluated on simulated observations and on a first set of real observations in the context of the SPIRE instrument onboard Herschel. The results are presented in Sect. 5. Finally, some conclusions and perspectives are provided in Sect. 6.
2. Notation, instrument, and map models
To produce accurate and reliable maps, the inversion must exploit a description that represents the acquisition process as faithfully as possible. In this sense, the instrument model

is based on a map of the sky noted , which is naturally a function of continuous spatial variables (α,β) ∈ R^{2} (and possibly a spectral variable λ ∈ R_{+});

and accurately describes the formation of a set of N discrete observations grouped together in a vector y ∈ R^{N}.
A general description of the map of the sky as a function of continuous spatial variables can be written starting from a basic function ψ by combination and regular shifting: (1)The function ψ must be chosen so that this decomposition can describe the maps of interest and is easy to handle. It may be, among other choices, a pixel indicator, a cardinal sine function, or a wavelet (although in the last case the function ψ and the coefficients also depend on a scaling parameter). Whatever the choice, the map of interest is finally represented by its coefficients x_{ij}, the number of which is arbitrarily large and collected in x ∈ R^{M} in what follows. In practice, we choose the Gaussian family as this greatly simplifies the (theoretical and numerical) calculations of the model outputs, including for complex models (Orieux et al. 2012b; Rodet et al. 2008).
The presented work is quite generic in the sense that it is not a prioriattached to a specific instrument. It deals with a general linear instrument model that describes, at least to a fair approximation, the physics of the processes in play: optics, electrics, and thermodynamics. It also includes the passage from a continuous physical reality to a finite number of discrete observations. The instrument is then described by (2)i.e. a general linear model w.r.t.x (a special case of which is the convolutive model). This model shows the instrument parameters η ∈ R^{K} that define the form of the instrument response. The component represents the measuring and modelling errors additively. For the SPIRE instrument (Griffin et al. 2010) of the Herschel Space Observatory (Pilbratt et al. 2010) launched in May 2009, the paper of Orieux et al. (2012b) gives the details of the instrument model construction. The results of Sect. 5 are based on this instrument.
3. Probabilistic models and inversion
The proposed inversion is developed in the framework of Bayesian statistics. It relies on the posteriordensity p(x,ξ,η  y) for the unknown quantities x (image), ξ (hyperparameters), and η (instrument parameters) given y (observations). This density brings together the information about the unknowns in the sense that it attaches more or less confidence to each value of the triplet (x,ξ,η). A summary of this density in the form of a mean and a standard deviation will provide (1) a point estimate (the posteriormean) for the map of interest and the parameters, and (2) an indication of the associated uncertainty (the posteriorstandard deviation).
Remark 1. In statistical terms (Robert 2005), the posteriormean is an optimal estimator. More precisely, of all the possible estimators (whether Bayesian or not, empirical or not, a computation code, etc.), the posteriormean yields the minimum mean square error (MMSE)^{1}. Regarding firstorder statistics, this estimator has, moreover, a zero mean bias.
The posteriordensity is deduced as the ratio of the joint density for all considered quantities p(x,ξ,η,y) and the marginal density for the observations p(y) by application of Bayes’ rule (3)Seen as a function of the unknowns (x,ξ,η), this posteriordensity is proportional to the joint density: (4)This joint density is essential as all the other densities (marginal, conditional, prior, posterior, etc.) can be deduced from it. It can be factorised in various forms and, in preparatio for the developments to follow, we write (5)including the fact that (1) the hyperparameters ξ and the instrument parameters η are a prioriindependent and (2) the object x and the instrument parameters η are also a prioriindependent.
The different probability densities will be defined in the following sections according to the information available on each set of variables and according to practical concerns about dealing with the probability densities and numerical computation time.
3.1. Modelling of errors and likelihood
The factor p(y  x,ξ,η) in Eq. (5)is the density for the observations y given the map x, the instrument parameters η, and the hyperparameters ξ, i.e. the likelihood of the unknowns attached to the observations.
Given Eq. (2), the construction of this likelihood is based on the model for the error n. The analysis developed in this paper is essentially founded on its mean m_{n} and its covariance matrix , and the proposed model is Gaussian: (6)The choice of the Gaussian model is also justified via information property: based on the sole information of finite mean and covariance, the Gaussian density is the model that introduces the least information (Kass & Wasserman 1996). This property is also mentioned as a maximum entropy property of the Gaussian density.
m_{n} is a scalar that models a possible nonzero mean for the noise, such as an offset (in the proposed numerical evaluation of Sect. 5, one offset for each bolometer is introduced). Regarding the covariance matrix, to lighten the notation, we set : γ_{n} is a scale factor (called precision, homogeneous to an inverse variance) and contains the structure itself. For a stationary model has a Tœplitzstructure, for an autoregressive model is a band matrix, for a independent model is diagonal, for a white and stationary model , the identity matrix. In the developments below, the structure of is given while the scale factor γ_{n} and the mean m_{n} are unknown and included in the vector ξ. The results in Sect. 5 are presented for the case ; hence γ_{n} is the inverse of the noise power.
Remark 2. The proposed developments account for characteristics of the error n that may differ from channel to channel, sensor to sensor, etc. This will be the case in Sect. 5: a mean and power of the noise will be assigned and estimated for each bolometer.
As the error n is Gaussian and additive (Eqs. (6)and (2)), the vector of observations y, given x,ξ,η, is also Gaussian with mean (7)and with the same covariance as n: . So, the likelihood of the unknowns attached to the observations reads (8)It includes the information provided by the observations as the transform of a map x by the instrument, taking its parameters η and the noise parameters γ_{n} and m_{n} into consideration.
3.2. Prior density for the map and spatial regularity
The aim of this section is to introduce a priordensity p(x  ξ) for the unknown map coefficients x based upon available information about the map . The present work is mainly devoted to extended emissions. From a spatial standpoint, such maps are relatively regular, i.e. they involve positive correlation. From the spectral standpoint, the power is mainly located at relatively low frequencies. The Gaussian density includes these secondorder properties in a simple way. This choice can also be justified based on a maximum entropy principle. Its main interest here is to result in a linear processing method. It is written in the form (9)where γ_{x} is a precision parameter (homogeneous to an inversevariance that controls the regularity strength) and is a precision matrix (homogeneous to an inversecovariance matrix that controls the regularity structure). When the precision γ_{x} is low (strong priorvariance), the regularity information is weakly taken into account. Conversely, when the precision γ_{x} is high (weak priorvariance), the penalization of nonregular maps is high, i.e. the regularity is strongly imposed.
The subsequent developments are devoted to the design of to account for the desired regularity of the map. A simple regularity measure of the map is the energy of some of its derivatives. These derivatives can address the spatial variables (α,β) separately, can rely on cross derivatives and can intervene at various orders. This is the classical PhilippsTwomeyThikonov penalization idea (Tikhonov & Arsenin 1977). It can also embed directional derivatives or any differential operator (Mallat 2008). In the simplest and natural case, we choose where ∥u∥ is the standard function norm ^{2}. Given the decomposition (1), it is easy to establish the partial derivatives of from the derivatives of ψ. In the direction α, by noting , we have which brings out the autocorrelation of the derivative of ψ. We then have a quadratic form in xAs the coefficients Ψ_{α}^{ [}(i′ − i) δ_{α},(j′ − j) δ_{β}^{] } depend only on the difference between indices, the matrix has a Tœplitzstructure and the computations amounts to a discrete convolution that can be efficiently implemented by the use of fast Fourier transform (FFT). Finally, by performing the same development in the β dimension, a global quadratic norm appears: and designs the precision matrix . For more details and for a spectral interpretation, see Sect. 2.1 and Appendix A of (Orieux et al. 2012b).
3.3. Prior distribution for hyperparameters (hyperprior)
The hyperparameters are the unknown parameters of the densities for the error and for map Eqs. (8), (9)and they are collected in the vector ξ = [m_{n},γ_{n},γ_{x}] . It has been said that γ_{x} and γ_{n} are the precisions (scale parameters) and m_{n} is a mean (position parameter) of Gaussian densities.
The choice of the priordistributions for these hyperparameters is driven by two requirements: (i) little information is available a priorion their values and their relations and (ii) the chosen distributions must lead to efficient algorithms (see Sect. 4.2). Following this line of thought, we choose a prior distribution determined by Jeffreys’ principle^{3}: p(γ) = 1/γ for γ_{x} and γ_{n} and p(m_{n}) = 1. Moreover, regarding the triplet of hyperparameters ξ = [m_{n},γ_{x},γ_{n}] , they are modelled as independent variables, since no information is available about their eventual relations. Finally (10)has two advantages

1.
First of all, the posteriorconditional densities forγ_{x} and γ_{n} (resp. for m_{n}), as shown in Sect. 4.2, will be gamma densities (resp. Gaussian density), which will make the implementation easier.

2.
This prior distribution is noninformative (which introduces a minimum of information on the value of the hyperparameters) in the sense that it is invariant by certain parameterization changes (Robert 2005; Kass & Wasserman 1996).
3.4. Prior density for the instrument parameters
The instrument parameter η operates in a complex nonlinear way in the description of the observations. In consequence, whatever the priordensity, the conditional posteriordensity for η (see Sect. 4.3) will not have a standard form. The choice is thus purely oriented by the information on the instruments and the question that arises concerns the encoding of the available information in the form of a probability density. If we have no information except a minimum and a maximum value for a given parameter, the choice of a uniform density over the interval is a reasonable one. If we have a nominal value with an associated uncertainty and no other information, the most suitable choice is a Gaussian density. The rest of the development is valid whatever the choice, and we consider the Gaussian case in the following.
In addition, having no information available about possible links among the various parameters, we take it that the parameters are, a priori, independent and thus (11)In practice and for the first results presented in Sect. 5, the means μ_{k} and variances ρ_{k} were taken from the SPIRE observer manual or were fixed adhoc at plausible values.
Fig. 1 Graphical dependency representation (hierarchical structure). The round (square) nodes correspond to unknown (fixed) quantities. The directions of the arrows indicate the dependencies. 
3.5. Posteriordensity: histograms, mean and standard deviation
The posteriordensity (3)for all the unknowns x,ξ,η, is deduced from the joint density (5)for all quantities concerned as follows: In this expression,

the density p(x  γ_{x}) for the unknown map x ∈ R^{M} is Gaussian (Eq. (9));

the distribution p(ξ) (for ξ = [m_{n},γ_{n},γ_{x}] ) is a Jeffreys’ distribution (Eq. (10));

the instrument parameters η ∈ R^{K} is modelled by a Gaussian density p(η) (Eq. (11));

the density p(y  x,ξ,η) for the observations y ∈ R^{N} given the rest of the variables (i.e. likelihood) is a Gaussian density (Eq. (8)) and it is a function of x through given by Eq. (7).
Finally, from Eqs. (8), (9), (10), and (11), the posteriordensity can be written (12)This density brings together all information about the unknowns, and the estimators and algorithms presented below are entirely based on it. However, it is too complex to be analyzed directly as a whole and the difficulty stems from (i) the dimension of x (of size M ~ 10^{5} in practice) and (ii) the joint presence of other parameters (hyperparameters ξ and instrument parameters η). Moreover, for the latter in particular, the dependence is complicated and cannot be identified with a standard form. The proposed approach is to explore the posteriordensity by means of stochastic sampling (Robert & Casella 2004; Gilks et al. 1996). The idea is to produce a set of samples , for q = 1,2,...,Q, drawn at random under the posteriordensity. It is then possible, for example, to deduce histograms that approximate marginal densities, together with means and standard deviations. This strategy is by no means new but interest in its practical use has revived in recent years as new forms of algorithms have been developed and computer power has increased.
Concerning the estimates themselves for the map, the hyperparameters and the instrument parameters, we choose the posteriormean (PM), as indicated at the beginning of Sect. 3 (see also Remark 1). We will also look at the dispersion around the mean through the posteriorstandard deviation (PSD) and the links among components through posteriorcorrelations. Using the set of samples , for q = 1,2,...,Q, the posterior mean and the posterior covariance matrix are computed by where denotes the column concatenation . Practically, it is not possible to compute the entire covariance , but it is possible to compute its diagonal elements to characterize the marginal errors for each component (each pixel, hyperparameters, instrument parameters) and to compute a few nondiagonal elements to measure the correlations between components.
4. Exploration of posteriordensity and the computation algorithm
We have introduced an instrument model and various probability densities to define the posteriordensity that brings together the information on the map and the parameters (hyperparameters and instrument parameters). We have also defined the posteriormean (PM) as an estimate and the posteriorstandard deviation (PSD) as a measure of the uncertainty. We have then introduced the idea of computations via stochastic sampling. The developments in the present section concern the algorithm for computing these samples.
Gibbs algorithm.
The production of samples of the posteriordensity for the set (x,ξ,η) is not possible directly because of the complexity of the density. We therefore use a Gibbs algorithm (Robert & Casella 2004; Gilks et al. 1996), which breaks the problem down into three simpler subproblems: sampling x, ξ, and η separately. This is an iterative algorithm, described in Table 1: each variable x, ξ, and η is drawn under its conditional posteriordensity given the current value of the other two variables. For each of the three steps, this conditional posteriordensity can be deduced directly (up to a multiplying factor) from the posteriordensity (12): all we have to do is to keep only the factors depending on the variable of interest. This algorithm is a Markov chain MonteCarlo (MCMC) algorithm (Robert & Casella 2004; Gilks et al. 1996) and is known to give (after a certain time, called the burnin time) samples under the posteriordensity.
The conditional density of the map coefficients x (Step (1), Table 1) is Gaussian (Sect. 4.1). For the precisions ξ (Step (2), Table 1), the conditional densities are gamma densities (Sect. 4.2). They will be sampled using standard existing numerical routines (e.g. in Matlab). In contrast, the conditional density of the instrument parameters η (Step (3), Table 1) has a much more complex nonstandard form, so that it cannot be directly sampled by existing routines. To overcome this difficulty, sampling was carried out by means of a MetropolisHastings step (Sect. 4.3).
4.1. Map sampling
The density for the map x conditionally on the other variables is deduced from (12) by extracting the factors depending on x: (15)Considering the expression for given by Eq. (7), the argument of this exponential is quadratic in x. We deduce that we have a Gaussian density and, by rearranging the argument, we can determine the covariance and the mean
Remark 3. For a fixed value of the hyperparameters and the instrument parameters, the map defined by Eqs. (16), (17)is the maximizer (and the mean) of the conditional posteriordensity (15). This is the regularized leastsquares solution denoted parameterized by μ = γ_{x}/γ_{n}. This corresponds to the solution defined in our previous paper (Orieux et al. 2012b). For a convolutive instrument model, it is the Wiener solution (also called WienerHunt solution; Orieux et al. 2010b).
Step (1) of Table 1 consists of sampling this Gaussian but this operation is made very difficult by three elements: (i) the large size of the map; (ii) the correlation introduced by the instrument model and the priordensity; and (iii) the absence of structure of the instrument model (invariance, sparse nature). This problem can be solved using several approaches. For a convolutive instrument model (2), it is possible to approximately diagonalise the correlation matrix by FFT, thus producing a sample for the cost of an FFT (Chellappa & Chatterjee 1985; Chellappa & Jain 1992; Geman & Yang 1995; Giovannelli 2008; Orieux et al. 2010b). If where the inverse of the correlation matrix is sparse, a partially parallel Gibbs sampler may be particularly efficient (Winkler 2003, Chap. 8). In the present case, neither the correlation nor its inverse possess the required properties. A general solution relies on factorizing the correlation matrix (Cholesky decomposition, diagonalization, etc.) but the large size of the matrix (M × M with M ~ 10^{5}) does not permit the required calculations to be performed here.
The proposed solution consists of constructing a criterion such that its minimizer is a sample under the desired posteriorconditional density. To do this, we perturb the means of the noise component and of the map component by an additive component with covariance and . A perturbed regularized least squares criterion is then introduced and it can be shown (see Orieux et al. 2012a) that its minimizer (18)is Gaussian and does indeed have the correlation and mean defined by (16)and (17). This very powerful result has already been used by (Féron 2006; Orieux et al. 2012b). In different forms, similar ideas have been introduced and used by (Rue 2001; Rue & Held 2005; Lalanne et al. 2001; Tan et al. 2010).
Remark 4. For the non perturbed criterion ( and ), we have the regularized leastsquares solution Eqs. (16), (17), that was mentioned in Remark 3.
Remark 5. The approach described in Orieux et al. (2012a) involves the sampling of the prior density (9)that is not properly defined here: the matrix does not penalise the mean of the map (it is of deficient rank). But, for the same reason, the solution (18)does not depend on the mean of the realization of the prior density. Therefore, the simulated sample can have an arbitrary mean value.
4.2. Hyperparameter sampling
To determine the posteriorconditional density for γ_{x}, we examine the posteriordensity (12), and only keep the factors where γ_{x} appears, which gives and we recognize a Gamma density (see Appendix A) (19)Concerning γ_{n}, we also refer to the posteriordensity (12)and find which is also a Gamma density (20)For both γ_{x} and γ_{n} the second parameter of the Gamma density introduces a quadratic norm (regularity of the map in Eq. (19), and goodnessoffit in Eq. (21)), which can be easily computed.
Remark 6. An intuitive interpretation can be given to these results starting from the fact that the mean of the Gamma density is equal to the product of its parameters (see Appendix A), here for (21). In this sense, the conditional posterior mean is the inverse of the empirical variance of the residuals. Consequently, when the goodnessoffit term is small, the mean of the density is large and so the sampled value of γ_{n} is also high reporting a high precision, i.e. a weak variance (and vice versa). The same holds for the map regularity in relation with the mean of the density (19)given by . These observations support the coherence of the model and reinforce the prior choice for these hyperparameters as a convenient one.
Regarding the mean of the noise, m_{n}, it is a scalar whose posteriorconditional density is also deduced from the posteriordensity (12) and from (7)where m_{r} is the empirical mean of the residuals . We then have a Gaussian density (21)In the numerical evaluation of Sect. 5, one such mean is estimated for each bolometer.
Remark 7. If we examine the relationships above, we see that p(γ_{x}  y,x,γ_{n},η) = p(γ_{x}  x), in other words, γ_{x} and (y,γ_{n},η) are independent conditionally on x. Similarly, we note that p(γ_{n}  y,x,γ_{x},η) = p(γ_{n}  y,x,η), which means that γ_{n} and γ_{x} are independent conditionally on (y,x,η). In addition, m_{n} is independant of γ_{x}, given y,x,γ_{n},η.
4.3. Instrument parameter sampling
The last step (Step (3) of Table 1) is more complex. As for the other variables, the posteriorconditional density can be deduced from the posteriordensity (12)by keeping the factors that bring in η. There are two of these: the likelihood and the priordensity, and we thus have (22)However, this is not a usual density, notably because there is no simple mathematical form to represent the dependence of the observation w.r.t.η. Thus, Step (3) of Table 1 cannot be carried out directly with standard sampling routines and we resort to a MetropolisHastings step in a randomwalk version (Robert & Casella 2004; Gilks et al. 1996) described in Table 2. It can be briefly explained as follows. Because it is impossible to draw a sample directly under the conditional posteriordensity (22), a sample is drawn under another density (namely the proposal density), but is not systematically accepted. Acceptance or rejection is also random with a precisely defined probability (see Eq. (23)) to ensure that, at convergence, we have samples under the target density (Robert & Casella 2004; Gilks et al. 1996). The algorithm is divided into three substeps summarized in Table 2 and detailed here.
 (a)
Draw a proposal as a perturbation of the current value: , deduce the instrument matrix , and the corresponding model output .
 (b)
Compute the acceptation ratio (23)based on the conditionnal posterior law ratio that compares the goodnessoffit for the current parameter and the proposed one.
 (c)
Accept or reject the proposal, at random, with probability min(1,ρ). To do so, draw u uniformly in [0,1] and take
The algorithm can be explained as follows. Starting with a current value , the algorithm proposes a new value and compares the goodnessoffit for the two values. When the proposed value improves the fit, ρ > 1 and is accepted. When the proposed value degrades the fit, can be accepted or rejected, with a probability that is higher or lower, depending on how weak the degradation is.
Remark 8. There are other more complex (and potentially more efficient) approaches for MetropolisHastings sampling. In particular, the proposal density can be adapted, e.g. directional random walk (Vacar et al. 2011). They are not exploited here but are considered in the development perspectives.
5. Numerical results
The previous sections presented the approach for building the posteriordensity and for its exploration by stochastic sampling using a Gibbs algorithm including a MetropolisHastings step. The mean and standard deviation (SD) of the posteriordensity are numerically computed as empirical averages based on simulated samples, from relations (13)and (14). The developments below show the practicability of the proposed method (models, estimate and algorithm), and provide a first numerical evaluation.
5.1. Evaluation methodology
The evaluation is based on the SPIRE instrument (Griffin et al. 2010) of the Herschel Space Observatory (Pilbratt et al. 2010) launched in May 2009. It focuses on the PMW channel (centred around 350 μm) and the Large Map protocol in the nominal operating conditions: scan back and forth with constant speed (30″/s) over two almost perpendicular directions (88°). The scans are associated with a high sampling frequency (F_{s} ≈ 30 Hz) providing spatially redundant observation and Fig. 2 shows the corresponding redundancy/pointing map. The spatial shift between basis functions (see Eq. (1)) is fixed at δ_{α} = δ_{β} = 2″, based on our earlier work (Orieux et al. 2009, 2012b), to obtain the best gain in resolution without important increase of the computational cost. The angular size of the reconstructed map is 20′ × 20′, i.e. a map of 600 × 600 coefficients. The associated direct model, including the whole acquisition chain (scanning strategy, mirror, horns, wavelength filters, bolometers, and electronics) is detailed in our previous paper (Orieux et al. 2012b) and represented by Eq. (2)of the present paper.
Fig. 2 Redundancy/pointing map associated with our experiment of five crossed scans. 
Fig. 3 Comparison of reconstructed map for the Galactic Cirrus: proposed map (Fig. 3a), best map (Fig. 3b), naive map (Fig. 3d), and true map (Fig. 3e). Figure 3c shows a profile (marked by the white line in Fig. 3e) and Fig. 3f the spectrum (circular means of power spectra). Uncertainties are given in Fig. 4 and quantitative results are given in Table 3. Comments are given in Sect. 5.3.1. 
The unsupervised method is assessed based on two synthetic maps of extended emission (the Galactic Cirrus (Fig. 3e) and a realization of the priordensity Eq. (9)) as well as based on a real observation (reflection nebula NGC 7023, Fig. 7). The paper also proposes a first assessment of the unsupervised and myopic approach based on a synthetic map with broader spectral content (the Galactic Cirrus with point sources (Fig. 3e)).
In the simulated cases, a zeromean white Gaussian noise is added to the model output. Moreover, in these cases, since the original map (the “sky truth” denoted by ) is known, the quality of the reconstruction (denoted by ) can be quantified through an error: (24)where only coefficients in the observed area are taken into account, allowing assessments and comparisons between methods.
5.2. Algorithm behaviour and general comments
As explained in Sect. 4, the algorithm provides a series of samples that form a Markov chain for hyperparameters, instrument parameter and map. The MCMC theory then ensures that it correctly explores the parameter space and produces a density of samples reflecting the posterior density. Practically, the algorithm has been executed for the unsupervised problem as well as for the unsupervised and myopic problem. It has been run several times (1) using identical initial conditions and (2) using different initial conditions. In both cases, the same qualitative and quantitative behaviour as presented here has been systematically observed.
The computation time takes about one hour for the unsupervised (nonmyopic) case and about ten hours for the unsupervised and myopic case. The main computational cost is due to computating the instrument model output given by Eq. (2).
Figures 6–8 present some typical elements of the algorithm operation: visualization of progression, convergence phase (burnin period), stable phase, etc. The evolution of the chain is shown for hyperparameters (Figs. 6 and 7) and for instrument parameter (Fig. 8). It is thus possible to grasp how the parameter space is explored.
5.3. Unsupervised approach
5.3.1. Assessment of map estimation
The qualitative and quantitative assessment of the reconstructed maps is presented here for the Galactic Cirrus; the first results are shown in Fig. 3.

The unsupervised method (proposed method) is outlined inFig. 3a. The hyperparameters are automaticallyset (without knowing of the sky truth).

The bestsupervised method is outlined in Fig. 3b. The hyperparameters are set by hand to minimize the error ℰ (knowing the sky truth). It is referred to as the best map and was previously presented in our paper (Orieux et al. 2012b).

The naive map (coaddition) and the true map are shown in Figs. 3d and e.

In addition, Fig. 3c gives spatial profiles (vertical in the middle of the map) and Fig. 3f gives the spectral^{4} profiles.
As expected and shown in (Orieux et al. 2012b), the inversion based on an accurate instrument model considerably improves the quality of the map: see the proposed map and the best map compared to the naive map and to the true map. The proposed map is visually very similar to the true map. In particular, our method restores details of small spatial scales (with spectral extension from null to high frequency) that are invisible on the naive map but are present on the true map (see Fig. 3c). In addition, our method correctly restores the structures of large spatial scales (low frequencies) and also the mean level of the map (null frequency), i.e. the photometry.
To assess the pertinence of estimating γ_{n} and γ_{x} in terms of map quality, we compared the proposed map with the best map. They are visually very similar (see Figs. 3a and b). In the quantitative terms given by Table 3, the best map produces an error ℰ of 0.0129% and the proposed map produces an error ℰ of 0.016%, which is only slightly higher. In other words, the proposed unsupervised method automatically (without knowing the sky truth) determines hyperparameters that produce a map almost as good as the best map (which requires knowing the sky truth).
Fig. 4 PSD and quantification of uncertainties. Fig. 4a shows the map of the PSD and Figs. 4b and c show the interval around the estimated map ± PSD and the true map. In the spatial domain, Fig. 4b is a profile (marked by the white line on 3e) and in the spectral domain Fig. 4c is the circular means of the power spectra. 
Fig. 5 Residuals of maps in the central part of measured coefficiens. This illustrates the grainy structure of the proposed map wrt. best map. The naive map residuals suffer twice more errors and a squared feature caused by the pixel model. 
However, the proposed map shows a fine grainy texture that is visible neither in the true nor in the best map. This feature is also visible on the residual map of the coefficiens (Fig. 5). This is also observable in Fig. 3f: the spectrum of the proposed map passes above the spectrum of the true map in the spectral band 0.025–0.035 arsec^{1}. This defect is related to a slight overevaluation of the observation contribution with respect to the priorcontribution. It is referred to as underregularization and yields an overamplification of the observation in this spectral band. This confirms the behaviour previously observed in deconvolution (Orieux et al. 2010b; Giovannelli 2008) or noted for the maximum likelihood (Fortier et al. 1993). Nevertheless, it is remarkable that this defect is correctly notified by the PSD, as explained in the next paragraph.
Indeed, the approach naturally provides a measure of reliability through the PSD shown in Fig. 4. Two zones can be seen in Fig. 4a, in accordance with Fig. 2: the central zone (where observations are available) and the peripheral zone (extrapolated from observations of the central zone and based on the priorregularity). The boundary between the two zones also exhibits the variation of the observation hit and scanning strategy notably well. In addition, the posterior standard deviation also illustrates the difference between the zones observed with our without cross scan. From a spatial standpoint, Fig. 4b shows an interval around the estimated map with plus/minus PSD. The main result is that the true map is clearly within the interval. In a similar way, from a spectral standpoint the results are given by Fig. 4c (in relation with Fig. 3f): the true spectrum is also within the interval. More specifically, incorrectly reconstructed in the spectral band (above 0.025 arcsec^{1}), the stronger PSD clearly shows that the estimated spectrum is certainly submitted to unsatisfactory errors or confidence.
5.3.2. Assessment of hyperparameter estimation
Fig. 6 Chains and histograms for γ_{n}, Fig. 6a, and γ_{x}, Fig. 6b, for the Galactic Cirrus. The chains show the burnin period (about 1000 iterations) and the steady state. The corresponding histograms are computed on steady state only. 
Hyperparameter estimation: true values, estimates, PSD and best values.
Fig. 7 Results for real observation processing (reflection nebula NGC 7023). Chains for the noise parameter (γ_{n}) and for the image parameter (γ_{x}) in Figs. 7a and b. The stationary state is attained after a burnin time of about 500 iterations. Figure 7c shows the corresponding map. Figures 7d and e illustrate chains and marginal histogram for two bolometer offsets. 
This section assesses the unsupervised capabilities through evaluating the hyperparameter estimation using the Galactic Cirrus and a realization of the priorfor the map (which makes a true value available). Fig. 6 shows the chains and the histograms that approximate the marginal posteriordensities p(γ_{n}  y) and p(γ_{x}  y). In both cases, the histogram is relatively narrow although the prioris a wide noninformative Jeffreys’ distribution (see Eq. (10)). In other words, the observations are sufficiently informative to quantify noise and regularity level and the method is able to capture this information.
From a quantitative standpoint, results are given in Table 4. For the Galactic Cirrus and priorrealization, the estimated values are very similar to the true value (error is less than 1%). Moreover, the PSD are very low (0.40%). In the case of priorrealization, the estimated value is in the correct range but the error is larger (about 17%) and the PSD is 1.7%. This difference can be naturally explained by two elements: (i) the noise is added at the system output, so it is directly observed, whereas the map is at the system input, i.e. indirectly observed; (ii) the added noise is a realization of the priordensity for the noise while the Galactic Cirrus is not a realization of the priordensity for the map.
5.3.3. Real observation processing
This section proposes a first assessment for a real observation. It is based on the reflection nebula NGC 7023 acquired during the science demonstration phase of Herschel, which as been presented in (Abergel et al. 2010) and was processed in our previous paper (Orieux et al. 2012b). There and here, computations are made on the level1 files processed using HIPE. In our previous paper (Orieux et al. 2012b), the offsets were removed in a preprocessing step and the regularization parameter was tuned by hand compromise between gain in resolution and overamplification of the observations. In contrast, here both are automatically tuned.
Quantitative evaluation of the estimation of the instrument parameter using the Galactic Cirrus with point sources.
Figure 7 presents the evolution of the chains for the hyperparameters: Fig. 7a for the noise parameter γ_{n} and Fig. 7b for the image parameter γ_{x}. It is important to notice that the algorithm behaves in a very similar manner for the real observation and for the simulated observation (see Fig. 7 compared to Fig. 6). The figures also give an empirical indication of the algorithm operation: after a burnin time (empirically less than about 500 iterations) the stationary state is attained and the chain remains in a steady state: the samples are drawn under the posterior density. Concerning the offsets, the chains begin in the steady state, thanks to a good initialization based on (Orieux et al. 2012b) results. All bolometer offsets behave in the same manner with two example illustrated Figs. 7d and e. The empircal mean of the offsets sample start to stabilize at approximately 500 samples.
Figure 7c shows the corresponding reconstructed map. Its quality is equivalent to the quality of the map restored by empirically tuning the hyperparameter presented in Orieux et al. (2012b), Fig. 8. In other words, the proposed unsupervised method automatically determines hyperparameters (noise power and offsets as well as sky power) that produce a map almost as good as the map produced by a handmade hypermarameter tuning. In addition, the map remains far better than the naive map shown in Fig. 7f.
5.4. Myopic and unsupervised approach
The myopic and unsupervised question is a threefold problem that is much more ambitious: estimate the instrument parameter, the hyperparameters, and the map itself from a unique observation. In addition, the instrument parameter intervenes in a complex way in the description of the observations, and moreover, the problem is stated in a context that is doubly delicate: illposedness and highresolution.
Fig. 8 Instrument parameter chain (myopic and unsupervised approach) for the of Galactic Cirrus with point sources. Left (right) part of the figure deals with case 1, i.e. (case 2, i.e. ). The horizontal axis gives the iteration index and the vertical range is the prior interval in a twostandarddeviations sense. The true value is shown by the straight line. 
In Orieux et al. (2012b), the equivalent PSF has a Gaussian shape whose standard deviation is proportional to the wavelength: σ_{o}(λ) = ηλ. It is then integrated w.r.t.the wavelength (to include the spectral extend) and w.r.t.the time parameter (to account for the bolometer response) to form the global instrument response. To test the method, we consider the instrument parameter η to be poorly known and introduce elements of the feasibility to estimate it.
The prioris the Gaussian density given by Eq. (11), with K = 1. Its mean is taken from the SPIRE observer manual μ = 2.96 × 10^{4} [′′/m] and its standard deviation is set to ρ = 10^{4}, i.e. a relatively large uncertainty. It is about 33% of the mean and an equivalent prior interval is [0.96 × 10^{4},4.96 × 10^{4}] in a twostandarddeviations sense. Two cases are investigated for the true value (used to simulated observations): and . The conditional posteriorfor η (Sect. 4.3) does not have a standard form and its sampling (step (3) of Table 1) relies on a MetropolisHastings sampler, itself based on a randomwalk with a Gaussian excursion. The size of the excursion was chosen so that the acceptation rate is around 50%. Two maps are used for the observation: the Galactic Cirrus and the Galactic Cirrus with point sources. In each case, the algorithm was run several times from identical and different initializations, and shows similar qualitative and quantitative behaviours as those in Fig. 8.
Nevertheless, as expected, the spectral content of the Galactic Cirrus is not sufficiently extended towards high frequencies to provide an excitation that is adequate for instrument identification. In contrast, the Galactic Cirrus with point sources is more extended and estimations are more accurate. Table 5 presents quantitative assessments. The main result is that the estimation error is about 6%. It is a remarkable result given the difficulty of the problem (triple problem, complex relations, illpossedness, and high resolution) and given that the prior uncertainty is about 33%. In other words, the method is able to capture information about instrument parameter, jointly with noise level, regularity level, and map from a unique observation. However, the parameter η seems to be slightly underestimated which, we explain as follows. The input map (with point sources) presents a broad spectral extent whereas the priorfavours spatially extended maps (dominated by relatively low frequencies), so the posterior advocates a narrower PSF to compensate for this spectral discrepancy.
Fig. 9 Restoration of cirrus superimposed on point sources. The proposed maps must be compared to the maps restored with the true instrument parameter and the best hyperparameter and with the naive map. 
Figures 9a and b show the related maps. They must be compared to the map restored with the true instrument parameter and the best hyperparameter presented in Fig. 7b of Orieux et al. (2012b) and in Fig. 9d here. They must also be compared to the true map and the naive map also given in Orieux et al. (2012b) and in Figs. 9c–e here. As previously, the proposed maps show a fine grainy texture but despite this defect, they remain similar to the true map. The quality of the proposed maps is similar to the quality of the map restored with the true instrument parameter and the best hyperparameter. In addition, several point sources of the true map are visible on the proposed maps but not on the naive map. In other words, the proposed method automatically determines instrument parameter and hyperparameters that produce a map almost as good as the best one and better than the naive map.
6. Conclusion
We described regularized methods for image reconstruction and focused on parameter estimation:

hyperparameters, which guide the tradeoff betweenpriorbased and observationbased information;

instrument parameter, which tunes the physical characteristics of the model of the acquisition system.
They were jointly estimated with the map of interest. We were therefore dealing with an unsupervised and myopic inverse problem.
The most delicate point is jointly handling the different types of variables and their interactions in direct terms but, above all, in inverse terms. From a methodology point of view, we worked in the framework of hierarchical full Bayes strategies that model the available information for each set of variables (map, hyperparameters, instrument parameter, and observations) under a probability density. We defined the posteriordensity, which gathers the information on the map of interest and the parameters, given the observations. We then defined the posteriormean as an estimate of the map and the posteriorstandard deviation as a measure of uncertainty, which gives an uncertainty map. This approach makes it possible to work in a global and consistent framework to solve the problem as a whole. It draws its inspiration from our earlier works on deconvolution (Orieux et al. 2010b) and adapts them to the case at hand.
The posteriordensity was explored by stochastic sampling using a Gibbs algorithm. The sampling of the map was difficult: we are dealing with a largesized multivariate normal density for which classical techniques do not apply. We overcame this difficulty by constructing a sample as the minimizer of a wellchosen perturbated criterion (Orieux et al. 2012a). Another problematic point is the instrument parameter sampling: we are dealing with a very complex, nonstandard density. This difficulty was overcome by means of a MetropolisHastings step. The estimate of the map as well as the parameters (posteriormean) and the uncertainties (posteriorstandard deviation) were calculated numerically as empirical averages based on the simulated samples.
We presented a first application of the developments (Bayesian estimation method and stochastic sampling algorithm) in a real context: the SPIRE instrument of the Herschel Space Observatory. The study was essentially performed on simulated observations and has also yielded some initial results on real observations. We concluded that the approach is applicable and enables joint estimation of the map, the hyperparameters, and the instrument parameter from a unique observation. We showed, among other results, that the quality of the proposed map is similar to that obtained when the instrument parameter is known and the hyperparameters are fixed by hand in a supervised way (using the sky truth). The method shows remarkable results given the difficulty of the problem. It seems to us that these initial results are particularly promising and worth developing. They may open up many new perspectives for imaging in astrophysics in a myopic and unsupervised framework.
The mean square error is the expected value of the squared norm of the difference between estimated value and true value. The expectation is under the distribution of the observation and the unknown. The MSE is the sum of the variance and the squared bias of the estimator (under the distribution of the observation and the unknown).
References
 Abergel, A., Arab, H., Compiègne, M., et al. 2010, A&A, 518, L96 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Babacan, S., Molina, R., & Katsaggelos, A. 2010, IEEE Trans. Image Process., 19, 53 [Google Scholar]
 Bishop, T., Molina, R., & Hopgood, J. 2008, in Proc. IEEE ICIP [Google Scholar]
 Blanc, A., Mugnier, L., & Idier, J. 2003, J. Opt. Soc. Amer. (A), 20, 1035 [Google Scholar]
 Chantas, G. K., Galatsanos, N. P., & Woods, N. A. 2007, IEEE Trans. Image Process., 16, 1821 [NASA ADS] [CrossRef] [Google Scholar]
 Chellappa, R., & Chatterjee, S. 1985, IEEE Trans. Acoust. Speech Signal Process., ASSP33, 959 [Google Scholar]
 Chellappa, R., & Jain, A. 1992, Markov Random Fields: Theory and Application (Academic Press Inc) [Google Scholar]
 Conan, J.M., Mugnier, L., Fusco, T., Michau, V., & Rousset, G. 1998, App. Opt., 37, 4614 [CrossRef] [Google Scholar]
 de Figueiredo, M. T., & Leitao, J. M. N. 1997, IEEE Trans. Image Process., 6, 1089 [NASA ADS] [CrossRef] [Google Scholar]
 Descombes, X., Morris, R., Zerubia, J., & Berthod, M. 1999, IEEE Trans. Image Process., 8, 954 [NASA ADS] [CrossRef] [Google Scholar]
 Dobigeon, N., Hero, A., & Tourneret, J.Y. 2009, IEEE Trans. Image Process., 18 [Google Scholar]
 Féron, O. 2006, Ph.D. Thesis, Université de ParisSud, Orsay, France [Google Scholar]
 Fortier, N., Demoment, G., & Goussard, Y. 1993, J. Visual Comm. Image Repres., 4, 157 [CrossRef] [Google Scholar]
 Fusco, T., Véran, J.P., Conan, J.M., & Mugnier, L. M. 1999, Astron. Astrophys. Suppl. Ser., 134, 193 [Google Scholar]
 Geman, D., & Yang, C. 1995, IEEE Trans. Image Process., 4, 932 [NASA ADS] [CrossRef] [Google Scholar]
 Gilks, W. R., Richardson, S., & Spiegelhalter, D. J. 1996, Markov Chain Monte Carlo in practice (Boca Raton: Chapman & Hall/CRC) [Google Scholar]
 Giovannelli, J.F. 2008, IEEE Trans. Image Process., 17, 16 [NASA ADS] [CrossRef] [Google Scholar]
 Golub, G. H., Heath, M., & Wahba, G. 1979, Technometrics, 21, 215 [CrossRef] [Google Scholar]
 Griffin, M. J., Abergel, A., Abreu, A., et al. 2010, A&A, 518, L3 [Google Scholar]
 Hansen, P. 1992, SIAM Rev., 34, 561 [CrossRef] [MathSciNet] [Google Scholar]
 Idier, J., ed. 2008, Bayesian Approach to Inverse Problems (London: ISTE Ltd and John Wiley & Sons Inc.) [Google Scholar]
 Jalobeanu, A., BlancFeraud, L., & Zerubia, J. 2002, in Proc. IEEE ICASSP, 4, 3580 [Google Scholar]
 Kass, R. E., & Wasserman, L. 1996, J. Amer. Statist. Assoc., 91, 1343 [Google Scholar]
 Lalanne, P., Prévost, D., & Chavel, P. 2001, Appl. Opt., 40 [Google Scholar]
 Lam, E. Y., & Goodman, J. W. 2000, J. Opt. Soc. Am. A, 17, 1177 [NASA ADS] [CrossRef] [Google Scholar]
 Lanterman, A. D., Grenander, U., & Miller, M. I. 2000, IEEE Trans. Pattern Anal. Mach. Intell., 22, 337 [CrossRef] [Google Scholar]
 Li, S. Z. 2001, Markov Random Field Modeling in Image Analysis (Tokyo: SpringerVerlag) [Google Scholar]
 Likas, A. C., & Galatsanos, N. P. 2004, IEEE Trans. Image Process., 52, 2222 [Google Scholar]
 Mallat, S. 2008, A Wavelet Tour of Signal Processing: The Sparse Way (Academic Press Inc.) [Google Scholar]
 Molina, R., Katsaggelos, A. K., & Mateos, J. 1999, IEEE Trans. Image Process., 8, 231 [NASA ADS] [CrossRef] [Google Scholar]
 Molina, R., Mateos, J., & Katsaggelos, A. K. 2006, IEEE Trans. Image Process., 15, 3715 [NASA ADS] [CrossRef] [Google Scholar]
 Mugnier, L., Fusco, T., & Conan, J.M. 2004, J. Opt. Soc. Amer., 21, 1841 [Google Scholar]
 Ocvirk, P., Pichon, C., Lançon, A., & Thiébaut, E. 2006, MNRAS, 365, 46 [NASA ADS] [CrossRef] [Google Scholar]
 Orieux, F., Rodet, T., & Giovannelli, J.F. 2009, in Proc. IEEE ICIP, Le Caire, Egypt [Google Scholar]
 Orieux, F., Giovannelli, J.F., & Rodet, T. 2010a, J. Opt. Soc. Amer., 27, 1593 [NASA ADS] [CrossRef] [Google Scholar]
 Orieux, F., Giovannelli, J.F., & Rodet, T. 2010b, in Proc. IEEE ICASSP, Dallas [Google Scholar]
 Orieux, F., Féron, O., & Giovannelli, J.F. 2012a, IEEE Signal Process. Lett. [Google Scholar]
 Orieux, F., Giovannelli, J.F., Rodet, T., et al. 2012b, A&A, 539, A38 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Pankajakshani, P., Zhang, B., BlancFéraud, L., et al. 2009, Applied Optics, 48, 4437 [NASA ADS] [CrossRef] [Google Scholar]
 Pascazio, V., & Ferraiuolo, G. 2003, IEEE Trans. Image Process., 12, 572 [NASA ADS] [CrossRef] [Google Scholar]
 Pilbratt, G. L., Riedinger, J. R., Passvogel, T., et al. 2010, A&A, 518, L1 [CrossRef] [EDP Sciences] [Google Scholar]
 Robert, C. P. 2005, The Bayesian Choice, Statistiques et probabilités appliquées (Paris, France: Springer) [Google Scholar]
 Robert, C. P., & Casella, G. 2004, MonteCarlo Statistical Methods, Springer Texts in Statistics (New York: Springer) [Google Scholar]
 Rodet, T., Orieux, F., Giovannelli, J.F., & Abergel, A. 2008, IEEE J. of Selec. Topics in Signal Proc., 2, 802 [Google Scholar]
 Rue, H. 2001, J. R. Statist. Soc. B, 63 [Google Scholar]
 Rue, H., & Held, L. 2005, Monographs on Statistics and Applied Probability, 104, Gaussian Markov Random Fields: Theory and Applications (Chapman & Hall) [Google Scholar]
 Saquib, S. S., Bouman, C. A., & Sauer, K. D. 1998, IEEE Trans. Image Process., 7, 1029 [NASA ADS] [CrossRef] [Google Scholar]
 Tan, X., Li, J., & Stoica, P. 2010, in Proc. IEEE ICASSP, 3634 [Google Scholar]
 Thiébaut, E. 2008, in Proc. SPIE: Astronomical Telescopes and Instrumentation, 7013, 70131 [Google Scholar]
 Thiébaut, E., & Conan, J.M. 1995, J. Opt. Soc. Amer. (A), 12, 485 [CrossRef] [Google Scholar]
 Tikhonov, A., & Arsenin, V. 1977, Solutions of IllPosed Problems (Washington DC: Winston) [Google Scholar]
 Vacar, C., Giovannelli, J.F., & Berthoumieu, Y. 2011, in Proc. IEEE ICASSP, Prague, Czech Republic [Google Scholar]
 Wiegelmann, T., & Inhester, B. 2003, Sol. Phys., 214, 287 [Google Scholar]
 Winkler, G. 2003, Image Analysis, Random Fields and Markov Chain Monte Carlo Methods (Springer Verlag, Berlin Germany) [Google Scholar]
 Xu, Z., & Lam, E. Y. 2009, Opt. Lett., 34, 1453 [NASA ADS] [CrossRef] [Google Scholar]
 Zhang, B., Zerubia, J., & OlivoMarin, J.C. 2007, App. Opt., 46, 1819 [NASA ADS] [CrossRef] [Google Scholar]
 Zhou, Z., Leahy, R. M., & Qi, J. 1997, IEEE Trans. Image Process., 6, 844 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Gamma probability density
The gamma pdf for γ > 0, with given parameters a > 0 and b > 0, is written (A.1)The following properties hold: mean is , variance is and maximizer is b(a − 1) if and only if a > 1.
All Tables
Quantitative evaluation of the estimation of the instrument parameter using the Galactic Cirrus with point sources.
All Figures
Fig. 1 Graphical dependency representation (hierarchical structure). The round (square) nodes correspond to unknown (fixed) quantities. The directions of the arrows indicate the dependencies. 

In the text 
Fig. 2 Redundancy/pointing map associated with our experiment of five crossed scans. 

In the text 
Fig. 3 Comparison of reconstructed map for the Galactic Cirrus: proposed map (Fig. 3a), best map (Fig. 3b), naive map (Fig. 3d), and true map (Fig. 3e). Figure 3c shows a profile (marked by the white line in Fig. 3e) and Fig. 3f the spectrum (circular means of power spectra). Uncertainties are given in Fig. 4 and quantitative results are given in Table 3. Comments are given in Sect. 5.3.1. 

In the text 
Fig. 4 PSD and quantification of uncertainties. Fig. 4a shows the map of the PSD and Figs. 4b and c show the interval around the estimated map ± PSD and the true map. In the spatial domain, Fig. 4b is a profile (marked by the white line on 3e) and in the spectral domain Fig. 4c is the circular means of the power spectra. 

In the text 
Fig. 5 Residuals of maps in the central part of measured coefficiens. This illustrates the grainy structure of the proposed map wrt. best map. The naive map residuals suffer twice more errors and a squared feature caused by the pixel model. 

In the text 
Fig. 6 Chains and histograms for γ_{n}, Fig. 6a, and γ_{x}, Fig. 6b, for the Galactic Cirrus. The chains show the burnin period (about 1000 iterations) and the steady state. The corresponding histograms are computed on steady state only. 

In the text 
Fig. 7 Results for real observation processing (reflection nebula NGC 7023). Chains for the noise parameter (γ_{n}) and for the image parameter (γ_{x}) in Figs. 7a and b. The stationary state is attained after a burnin time of about 500 iterations. Figure 7c shows the corresponding map. Figures 7d and e illustrate chains and marginal histogram for two bolometer offsets. 

In the text 
Fig. 8 Instrument parameter chain (myopic and unsupervised approach) for the of Galactic Cirrus with point sources. Left (right) part of the figure deals with case 1, i.e. (case 2, i.e. ). The horizontal axis gives the iteration index and the vertical range is the prior interval in a twostandarddeviations sense. The true value is shown by the straight line. 

In the text 
Fig. 9 Restoration of cirrus superimposed on point sources. The proposed maps must be compared to the maps restored with the true instrument parameter and the best hyperparameter and with the naive map. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.