Online Material
Appendix A: The inverse problem
A.1 The model
As argued in the main text (Sect. 2.1), the formal equation relating the number counts of galaxies with flux $S_k$ (within $\mathrm{d}S$) at wavelength $\lambda_k$ (within $\mathrm{d}\lambda$) to the number of galaxies, $\mathrm{d}^2N/(\mathrm{d}z\,\mathrm{d}L_{\rm IR})$, at redshift $z$ (within $\mathrm{d}z$) and IR luminosity $L_{\rm IR}$ (within $\mathrm{d}L_{\rm IR}$) is given by
$$\frac{\mathrm{d}N}{\mathrm{d}S}(S_k) = A \int\!\!\int \frac{\mathrm{d}^2N}{\mathrm{d}z\,\mathrm{d}L_{\rm IR}}\,\delta\!\left(S_k - F(L_{\rm IR},z)\right)\,\mathrm{d}z\,\mathrm{d}L_{\rm IR}, \qquad (\mathrm{A.1})$$
where $\delta$ is the standard Dirac distribution and $F$ the flux observed in a photometric band centered on wavelength $\lambda_k$ for a galaxy at redshift $z$ with luminosity $L_{\rm IR}$:
$$F(L_{\rm IR},z) = \frac{(1+z)\,L_{\rm IR}\,K(L_{\rm IR},z)}{4\pi\,d_L^2(z)}. \qquad (\mathrm{A.2})$$
Here, $d_L(z)$ is the luminosity distance of an object at redshift $z$ for the standard cosmology used in this paper, $A$ is the solid angle corresponding to one square degree, and $K$ corresponds to the k-correction:
$$K(L_{\rm IR},z) = \frac{\int T_k(\lambda)\,S\!\left(\lambda/(1+z),\,L_{\rm IR}\right)\mathrm{d}\lambda}{\int T_k(\lambda)\,S(\lambda,\,L_{\rm IR})\,\mathrm{d}\lambda}, \qquad (\mathrm{A.3})$$
where $T_k(\lambda)$ is the transmission curve of the filter centered on $\lambda_k$, and $S(\lambda, L_{\rm IR})$ is the underlying library of SEDs (CE01), in which the SED of a galaxy depends only on its total IR luminosity $L_{\rm IR}$.
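The flux-luminosity relation of Eqs. (A.2)-(A.3) can be sketched numerically as follows. This is a minimal illustration, not the paper's implementation: `toy_sed` is a made-up placeholder for the CE01 library (it only mimics an SED shape that depends on the total luminosity $L_{\rm IR}$), and units are schematic.

```python
import numpy as np

# Hedged sketch of Eqs. (A.2)-(A.3). `toy_sed` is a hypothetical stand-in
# for the CE01 library: a rest-frame SED whose shape depends only on the
# total IR luminosity L_IR, as assumed in the text. Units are schematic.

def toy_sed(lam_um, L_ir):
    """Toy rest-frame SED S(lambda, L_IR); the peak shifts weakly with L_IR."""
    lam0 = 80.0 * (L_ir / 1e11) ** -0.1
    return (lam_um / lam0) ** 2 * np.exp(-lam_um / lam0)

def k_correction(lam_um, T, z, L_ir):
    """Eq. (A.3): redshifted over rest-frame flux through filter T (uniform grid)."""
    num = np.sum(T * toy_sed(lam_um / (1.0 + z), L_ir))
    den = np.sum(T * toy_sed(lam_um, L_ir))
    return num / den

def observed_flux(L_ir, z, lam_um, T, d_L):
    """Eq. (A.2): F = (1+z) L_IR K(L_IR, z) / (4 pi d_L(z)^2)."""
    return (1.0 + z) * L_ir * k_correction(lam_um, T, z, L_ir) / (4.0 * np.pi * d_L ** 2)
```

By construction $K(L_{\rm IR}, z{=}0) = 1$, so the k-correction only rescales the flux once the band samples a redshifted part of the SED.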
As mentioned in the main text, from the point of view of the conditioning of the inverse problem, it is preferable to reformulate Eq. (A.1) in terms of $\log S$, $\log L_{\rm IR}$, and $z$:
$$\Sigma(\log S_k) = \int\!\!\int \mathcal{K}(\log S_k, \log L_{\rm IR}, z)\,\Phi(\log L_{\rm IR}, z)\,\mathrm{d}z\,\mathrm{d}\log L_{\rm IR}, \qquad (\mathrm{A.4})$$
where the kernel of Eq. (A.4) reads
$$\mathcal{K}(\log S_k, \log L_{\rm IR}, z) = A\,S_k^{2.5}\,\delta\!\left(\log S_k - \log F(L_{\rm IR}, z)\right), \qquad (\mathrm{A.5})$$
with
$$\Sigma(\log S) \equiv S^{2.5}\,\frac{\mathrm{d}N}{\mathrm{d}\log S}, \qquad \Phi(\log L_{\rm IR}, z) \equiv \frac{\mathrm{d}^2N}{\mathrm{d}z\,\mathrm{d}\log L_{\rm IR}}. \qquad (\mathrm{A.6})$$
Here we have introduced the Euclidean-normalized number count, $\Sigma$, by multiplying the number count by the expected $S^{2.5}$ power law.
A.2 Discretization
Let us project $\Phi(\log L_{\rm IR}, z)$ onto a complete basis of functions of finite (asymptotically zero) support, chosen here to be piecewise-constant step functions $e_j(\log L_{\rm IR})\,e_l(z)$:
$$\Phi(\log L_{\rm IR}, z) = \sum_{j,l} n_{jl}\, e_j(\log L_{\rm IR})\, e_l(z). \qquad (\mathrm{A.7})$$
The parameters to fit are the weights $n_{jl}$. Calling $\mathbf{X} = \{n_{jl}\}$ (the parameters) and $\mathbf{Y} = \{\Sigma(\log S_k)\}$ (the measurements), Eq. (A.4) then becomes formally
$$\mathbf{Y} = \mathbf{M}\cdot\mathbf{X}, \qquad (\mathrm{A.8})$$
where $\mathbf{M}$ is a matrix with entries given by
$$M_{k,(jl)} = \int\!\!\int \mathcal{K}(\log S_k, \log L_{\rm IR}, z)\, e_j(\log L_{\rm IR})\, e_l(z)\,\mathrm{d}z\,\mathrm{d}\log L_{\rm IR}.$$
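In practice a matrix of this kind can be assembled by integrating the delta-function kernel numerically: sample each $(\log L_{\rm IR}, z)$ basis cell on a fine sub-grid, map each sample to an observed flux bin, and accumulate the weights. The sketch below illustrates the idea on a toy problem; `log_flux` is a hypothetical stand-in for $\log_{10} F(L_{\rm IR}, z)$ from Eq. (A.2), not the CE01-based flux of the paper.

```python
import numpy as np

# Sketch of the discretization of Eq. (A.4): the delta-function kernel is
# integrated by histogramming a fine (log L, z) sub-grid into the observed
# flux bins. `log_flux` is a toy placeholder for log10 F(L_IR, z).

def log_flux(logL, z):
    """Toy log10 of the observed flux (placeholder for Eq. (A.2))."""
    return logL - 2.0 * np.log10(1.0 + 10.0 * z) - 11.0

def build_M(logS_edges, logL_edges, z_edges, nsub=8):
    """M[k, j*nz + l]: response of flux bin k to a unit weight n_{jl}."""
    nS = len(logS_edges) - 1
    nL, nz = len(logL_edges) - 1, len(z_edges) - 1
    M = np.zeros((nS, nL * nz))
    for j in range(nL):
        for l in range(nz):
            # fine sub-grid inside basis cell (j, l)
            ll = np.linspace(logL_edges[j], logL_edges[j + 1], nsub)
            zz = np.linspace(z_edges[l], z_edges[l + 1], nsub)
            LL, ZZ = np.meshgrid(ll, zz)
            k = np.digitize(log_flux(LL, ZZ).ravel(), logS_edges) - 1
            w = 1.0 / (nsub * nsub)      # equal quadrature weights
            for ki in k[(k >= 0) & (k < nS)]:
                M[ki, j * nz + l] += w
    return M
```

Each column then records which flux bins a given $(\log L_{\rm IR}, z)$ cell contributes to, which is exactly the locality property exploited by the non-parametric inversion.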
A.3 Penalties
Assuming that the noise in $\mathbf{Y}$ can be approximated as Normal, we can estimate the error between the measured counts and the non-parametric model by
$$L(\mathbf{X}) = (\mathbf{Y} - \mathbf{M}\cdot\mathbf{X})^{\rm T}\cdot\mathbf{W}\cdot(\mathbf{Y} - \mathbf{M}\cdot\mathbf{X}), \qquad (\mathrm{A.9})$$
where the weight matrix $\mathbf{W}$ is the inverse of the covariance matrix of the data (diagonal for uncorrelated noise, with diagonal elements equal to one over the data variance). Since we are interested here in a non-parametric inversion, the decomposition in Eq. (A.7) typically involves many more parameters than constraints, so that each parameter controls the shape of the function, $\Phi$, only locally. As mentioned in the main text, a trade-off must therefore be found between the level of smoothness imposed on the solution, which deals with the artefacts induced by the ill-conditioning, and the level of fluctuations consistent with the amount of information in the counts. Between two solutions yielding equivalent likelihood, the smoothest is chosen on the basis of the quadratic penalty
$$R(\mathbf{X}) = \mathbf{X}^{\rm T}\cdot\mathbf{K}\cdot\mathbf{X}, \qquad (\mathrm{A.10})$$
where $\mathbf{K}$ is a positive-definite matrix, chosen so that $R(\mathbf{X})$ in Eq. (A.10) is non-zero when $\mathbf{X}$ varies strongly as a function of its indices. In practice, we use a square-Laplacian penalization (the D2 norm defined by Eq. (30) of Ocvirk et al. 2006b). Indeed, a Tikhonov penalization does not explicitly enforce smoothness of the solution, and a square-gradient penalization favors flat solutions that are unphysical in our problem.
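A square-Laplacian penalty of this kind can be built explicitly. The sketch below is a minimal one-dimensional illustration in the spirit of the D2 norm of Ocvirk et al. (2006b), not the paper's actual two-dimensional operator: $\mathbf{K} = \mathbf{D}_2^{\rm T}\mathbf{D}_2$, so that $\mathbf{X}^{\rm T}\mathbf{K}\mathbf{X}$ sums the squared second differences of $\mathbf{X}$.

```python
import numpy as np

# Minimal 1-D square-Laplacian (D2) penalty: K = D2^T D2, where D2 is the
# second-difference operator. X^T K X penalizes curvature, so flat and
# linear profiles cost nothing, unlike a Tikhonov (X^T X) penalty.

def laplacian_penalty(n):
    """Return K = D2^T D2 for n parameters."""
    D2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        D2[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D2.T @ D2
```

A linear ramp gives $\mathbf{X}^{\rm T}\mathbf{K}\mathbf{X} = 0$ while an oscillating vector is strongly penalized, which is precisely why this norm enforces smoothness without driving the solution flat.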
As mentioned in the main text, for a range of redshifts a direct measurement, $\mathbf{X}_0$, which can be used as a prior for $\mathbf{X}$, is available. We may therefore add as a supplementary constraint that
$$P(\mathbf{X}) = (\mathbf{X} - \mathbf{X}_0)^{\rm T}\cdot\mathbf{W}_2\cdot(\mathbf{X} - \mathbf{X}_0) \qquad (\mathrm{A.11})$$
should remain small, where the weight matrix, $\mathbf{W}_2$, is the inverse of the covariance matrix of the prior, $\mathbf{X}_0$, and is non-zero only over the appropriate redshift range. In short, the penalized non-parametric solution of Eq. (A.8), accounting for both penalties, is found by minimizing the so-called objective function
$$Q(\mathbf{X}) = L(\mathbf{X}) + \lambda\,R(\mathbf{X}) + \mu\,P(\mathbf{X}), \qquad (\mathrm{A.12})$$
where $L(\mathbf{X})$, $R(\mathbf{X})$, and $P(\mathbf{X})$ are the likelihood and regularization terms given by Eqs. (A.9)-(A.11), respectively. The Lagrange multipliers $\lambda$ and $\mu$ allow us to tune the level of smoothness of the solution (their values are set in practice for the reasons given below) and the requirement that $\mathbf{X}$ should remain close to its prior over the range of redshifts for which data are available. The introduction of the Lagrange multipliers is formally justified by our wanting to minimize the objective function $Q(\mathbf{X})$ subject to the constraints that $L(\mathbf{X})$ and $P(\mathbf{X})$ each remain within a prescribed range.
The minimum of the objective function, $Q(\mathbf{X})$, given by Eq. (A.12) reads formally
$$\mathbf{X} = \left(\mathbf{M}^{\rm T}\mathbf{W}\mathbf{M} + \lambda\,\mathbf{K} + \mu\,\mathbf{W}_2\right)^{-1}\left(\mathbf{M}^{\rm T}\mathbf{W}\,\mathbf{Y} + \mu\,\mathbf{W}_2\,\mathbf{X}_0\right). \qquad (\mathrm{A.13})$$
This equation clearly shows that the solution tends towards $\mathbf{X}_0$ when $\mu \rightarrow \infty$, while the smoothing Lagrange multiplier, $\lambda$, damps the components of $\mathbf{Y}$ corresponding to the highest singular vectors of $\mathbf{M}$ (Ocvirk et al. 2006b). When dealing with noisy datasets, the non-parametric inversion technique may produce negative coefficients for the reconstructed luminosity function. To avoid such effects, positivity is imposed on the coefficients $n_{jl}$ of Eq. (A.7); see for instance Ocvirk et al. (2006b) or Pichon et al. (2002). In practice, the minimum of the objective function is found iteratively, using OptimPack (Thiébaut 2005). The relative weight of the likelihood and the two penalties is chosen so that the three quantities contribute comparably to the total objective function after convergence. This corresponds to a reasonably smooth variation in the LF in both redshift and $\log L_{\rm IR}$, and imposes a solution that is always within 1$\sigma$ of the observed low-redshift LF, $\mathbf{X}_0$, when $\mu$ is not set to zero in Eq. (A.12).
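The unconstrained minimum of Eq. (A.13) can be sketched in a few lines. This is a toy illustration only, assuming the positivity constraint is dropped (the paper imposes it iteratively with OptimPack); it shows the two limiting behaviors described above.

```python
import numpy as np

# Sketch of Eq. (A.13): closed-form minimum of Q(X) = L + lam*R + mu*P,
# without the positivity constraint used in the actual inversion.

def penalized_solution(M, W, Y, K, lam, W2, X0, mu):
    """X = (M^T W M + lam K + mu W2)^{-1} (M^T W Y + mu W2 X0)."""
    A = M.T @ W @ M + lam * K + mu * W2
    b = M.T @ W @ Y + mu * W2 @ X0
    return np.linalg.solve(A, b)
```

As expected, a very large $\mu$ pins the solution to the prior $\mathbf{X}_0$, while a very large $\lambda$ damps the data-driven components toward the null space of the penalty.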
Appendix B: Test of robustness
To quantify the confidence level of the inversion technique, we test its robustness. Starting from an arbitrary LF, we produce IR counts in the bands and flux ranges corresponding to the observations. We then add random Gaussian noise to the simulated counts, using the real uncertainty on the observations as the standard deviation of the error distribution in each flux bin. Finally, we apply the inversion technique described in Sect. 2.1 to these noisy counts and obtain an output LF. The comparison of the input and output LFs is shown in Fig. B.1. The absolute value of the difference is represented in gray levels and contours. The difference is generally less than 0.4 dex (a factor of 2.5) in the range where the LF can be constrained from the observed counts (the region of the $z$-$L_{\rm IR}$ plane encompassed by the dashed lines). A noticeable exception is the very low-redshift range ($z<0.1$), which corresponds to bright sources. For such large fluxes, the considerable noise in the observed counts produces large errors in the recovered LF. At high redshift, recall that the ultraluminous population of galaxies appears as rare and very bright objects, in a flux range where the number counts are poorly known.
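This Monte Carlo procedure can be illustrated on a toy linear problem. The sketch below is a schematic stand-in, not the paper's pipeline: a simple ridge-regularized least squares replaces the full penalized inversion of Appendix A, and all dimensions and noise levels are invented for illustration.

```python
import numpy as np

# Toy version of the Appendix B robustness test: generate mock "counts"
# from a known solution, perturb them with Gaussian noise at the level of
# the quoted uncertainties, re-invert each realization, and measure the
# scatter of the recovered parameters.

rng = np.random.default_rng(1)
M = rng.uniform(0.0, 1.0, size=(40, 10))   # toy response matrix
X_true = np.full(10, 100.0)                # "input LF"
Y_clean = M @ X_true                       # noise-free mock counts
sigma = 5.0                                # per-bin Gaussian uncertainty

def invert(Y, lam=1e-3):
    """Ridge-regularized least squares (stand-in for the full inversion)."""
    A = M.T @ M / sigma ** 2 + lam * np.eye(M.shape[1])
    return np.linalg.solve(A, M.T @ Y / sigma ** 2)

recovered = np.array([invert(Y_clean + rng.normal(0.0, sigma, size=40))
                      for _ in range(200)])
rel_scatter = np.abs(recovered / X_true - 1.0).mean(axis=0)
```

The map of `rel_scatter` over the parameter grid is the toy analogue of the gray-scale difference map of Fig. B.1: parameters poorly constrained by the mock counts show the largest input-output differences.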
Figure B.1: Estimated robustness of the LF inversion used in this paper. The relative difference between the input LF and the recovered LF (when realistic noise is added to the corresponding input counts) is larger in the darker parts of the diagram. This difference is relatively small (<0.4 dex) in the region of the $z$-$L_{\rm IR}$ space effectively constrained by observations: the dotted and dashed lines correspond to the extreme fluxes considered at 24 μm and 850 μm, respectively, for this study. See main text for details.

Appendix C: Model predictions for Herschel
In Sect. 4, we have inverted the known IR counts to obtain constraints on the evolving total IR LF. We have seen that the LF obtained through this inversion is realistic and matches most of the recent observations (counts, CIRB, mid-IR LF at low redshift). Then, in Sect. 5.2, we have shown how we can directly measure a part of this LF with good confidence, and that the LF resulting from the inversion is in good agreement with this solid measurement, validating the LF obtained by this empirical modeling approach. In this section, we use the median LF obtained from the inversions to predict counts that should be observed by future far-IR observations with Herschel or SCUBA-2. At the time of publication, several new facilities are in preparation to observe the Universe in the far-IR to submm regimes. The differential counts (normalized to Euclidean) at wavelengths ranging from 16 to 850 μm, which we derived from the inversion technique, are presented in Fig. C.3. The separation of the contributions of local, intermediate, and distant galaxies in different colors illustrates the expected trend that longer wavelengths are sensitive to higher redshifts, hence the relative complementarity of all IR wavelengths. There will be a bias towards more luminous and distant objects with increasing wavelength, illustrated here for the Herschel passbands (see Fig. C.4), but this may be used to preselect the most distant candidates, expected to be detected only at the longest wavelengths. In the following, we discuss the predictions of the inversion technique for those instruments, as well as their respective confusion limits, confusion being the main limitation of far-IR extragalactic surveys.
The ESA satellite Herschel is scheduled for launch within the next year, while SPICA, the next-generation IR astronomical satellite of the Japanese space agency, is scheduled for 2010, with a contribution by ESA under discussion, including a mid-IR imager named SAFARI. Both telescopes share the same diameter of 3.5 m, but the lower telescope temperature of SPICA, combined with projected competitive sensitivities, will make it possible to reach the confusion limit around 70 μm (where Herschel is instead limited by integration time). The 5σ-1 h limits of the instruments SAFARI onboard SPICA (50 μJy, 33-210 μm, dashed line), and PACS (3 mJy, 55-210 μm, light blue line) and SPIRE (2 mJy, 200-670 μm, blue line) onboard Herschel are compared in Fig. C.1 to the confusion limits that we derive from the best-fit model of the inversion, at all wavelengths between 30 and 850 μm, assuming the confusion limit definition given below.
Figure C.1: Confusion limit for a 3.5 m telescope. The 5σ-1 h limits of SPICA-SAFARI (50 μJy, 33-210 μm, dashed line), Herschel PACS (3 mJy, 55-210 μm, light blue line) and SPIRE (2 mJy, 200-670 μm, blue line) are shown together with their wavelength ranges. The blue part of the curve is determined by the source-density criterion (i.e. the requirement that less than 30% of the sources be closer than 0.8 FWHM); the red part is defined by the photometric criterion, i.e. sources must be brighter than 5 times the rms fluctuations due to very faint sources below the detection limit.

The definition of the confusion limit is not trivial, in particular because it depends on the level of clustering of galaxies; the optimum way to define it would be to perform simulations to compute the photometric error as a function of flux density, and then decide that the confusion limit is e.g. the depth above which 68% of the detected sources are measured with a photometric accuracy better than 20%. In the following, we only consider a simpler approach that involves computing the two sources of confusion that were discussed in Dole et al. (2003):
- the photometric confusion noise: the noise produced by sources fainter than the detection threshold. The photometric criterion corresponds to the requirement that sources be detected with a photometric S/N > 5;
- the fraction of blended sources: a requirement for the quality of the source catalog is that less than N% of the sources be closer than 0.8 FWHM, i.e. close enough not to be separated.
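The source-density criterion above can be sketched under the simplifying assumption of unclustered, Poisson-distributed positions: the probability that a source has a neighbour within $r = 0.8\,$FWHM is $1 - \exp(-\rho\pi r^2)$, and requiring this to stay below 30% fixes a maximum source density, hence a confusion flux once a counts model is specified. The power-law cumulative counts below are a toy assumption, not the counts derived in this paper.

```python
import math

# Sketch of the source-density confusion criterion, assuming random
# (unclustered) source positions. N(>S) = N0 * (S/S0)^-alpha is a toy
# power law standing in for the cumulative counts of the paper.

def max_density(fwhm_arcsec, frac=0.30, k=0.8):
    """Max density (deg^-2) so that P(neighbour closer than k*FWHM) < frac."""
    r_deg = k * fwhm_arcsec / 3600.0
    return -math.log(1.0 - frac) / (math.pi * r_deg ** 2)

def confusion_flux(fwhm_arcsec, N0=1e4, S0=1.0, alpha=1.5):
    """Flux (in units of S0) where the toy counts reach the max density."""
    rho = max_density(fwhm_arcsec)
    return S0 * (rho / N0) ** (-1.0 / alpha)
```

A larger beam allows fewer sources per square degree before blending, so the confusion flux rises steeply with FWHM, which is why the long-wavelength (large-beam) bands are the most confusion-limited.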
Figure C.2: Detection limits for confusion-limited surveys from 70 to 850 μm. The curves show the minimum IR luminosity (8-1000 μm), or equivalently SFR, that can be detected for a star-forming galaxy, assuming an SED similar to the Chary & Elbaz (2001) ones. The 70, 100, 160, 250, 350 and 500 μm limits correspond to a 3.5 m telescope diameter, such as Herschel or SPICA, while the 400 μm limit is for a 12 m-class telescope such as APEX (e.g. ARTEMIS; we show the average between the two bands at 350 and 450 μm to avoid confusion with Herschel) and the 850 μm limit is for a 15 m telescope such as the JCMT (SCUBA).

Table C.1: Fraction of the CIRB resolved by confusionlimited Herschel surveys.
We note that the confusion limit for a 3.5 m-class telescope, such as Herschel, is ten times higher than the depth it can reach in one hour (5σ). With a source density of 12.8 sources per square degree at the 500 μm confusion limit (25 mJy), or equivalently one source in a field of 17 arcmin on a side, the best strategy at this wavelength is therefore to carry out very large, moderately shallow surveys, in order to identify the rare and very luminous distant galaxies.
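The quoted survey geometry follows from simple arithmetic, checked below: at 12.8 sources per square degree, the area per source corresponds to a square of about 17 arcmin on a side.

```python
import math

# Check of the quoted geometry: 12.8 sources per square degree is one
# source per square field of side sqrt(1/12.8) deg, converted to arcmin.
area_per_source_deg2 = 1.0 / 12.8
side_arcmin = math.sqrt(area_per_source_deg2) * 60.0   # about 16.8 arcmin
```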
Figure C.3: Counts predicted from the inversion in the far-infrared and submm (solid line). The counts are decomposed into redshift bins (blue: z<0.5; green: 0.5<z<1.5; orange: 1.5<z<2.5; red: z>2.5). The oblique dashed line corresponds to the statistical limit set by the small size of a field like GOODS North+South, i.e. 0.07 square degrees: fewer than 2 galaxies are expected per flux bin below this limit.

Figure C.4: Differential counts predicted from the non-parametric inversion for future Herschel observations: PACS 100 μm (solid black), SPIRE 250 μm (dotted blue), SPIRE 350 μm (dashed green), SPIRE 500 μm (dot-dashed red), decomposed simultaneously in redshift and $L_{\rm IR}$.

For comparison, we also illustrate the ground-based capabilities of ARTEMIS, built by CEA-Saclay, which will operate at 200, 350 and 450 μm on APEX (Atacama Pathfinder EXperiment), the ESO 12 m telescope facility, and of SCUBA-2, which will operate at 450 and 850 μm on the 15 m telescope JCMT. To avoid confusion between all instruments, we only show the average wavelength of 400 μm for a 12 m-class telescope and 850 μm for a 15 m-class telescope (Fig. C.1). Although the confusion limit in the 850 μm passband is ten times below that of Herschel at the longest wavelengths, this band is not competitive with the 400 μm one, which should be a priority for ARTEMIS and SCUBA-2 for the study of distant galaxies, or with the 70 and 100 μm ones for a 3.5 m space experiment such as SPICA and Herschel at lower redshifts. We also note that only in these two passbands will the cosmic IR background be resolved by these future experiments (see Table C.1), which suggests that a larger telescope size should be considered for a future experiment observing the far-IR Universe above 100 μm and below the wavelength domain of ALMA. We have not discussed ALMA here since it will not be affected by these confusion issues: thanks to its very good spatial resolution, it will be limited either to small ultra-deep surveys, hence missing rare objects, or to follow-ups of fields observed with single-dish instruments, e.g. ARTEMIS. Finally, the JWST, which will operate in the mid-IR, will be a very powerful instrument for probing the faintest star-forming galaxies in the distant Universe, but predictions are difficult to produce at the present stage, since extrapolations from the mid- to far-IR have already been found to become less robust at high redshift (e.g. Papovich et al. 2007; Pope et al. 2008; Daddi et al. 2007b).