A&A 451, 475-498 (2006)
DOI: 10.1051/0004-6361:20053283

Confidence limits of evolutionary synthesis models

IV. Moving forward to a probabilistic formulation

M. Cerviño1 - V. Luridiana1

Instituto de Astrofísica de Andalucía (CSIC), Camino bajo de Huétor 50, Apdo. 3004, Granada 18080, Spain

Received 21 April 2005 / Accepted 30 December 2005

Abstract
Context. Synthesis models predict the integrated properties of stellar populations. Several problems exist in this field, mostly related to the fact that integrated properties are distributed. To date, this aspect has been either ignored (as in standard synthesis models, which are inherently deterministic) or interpreted phenomenologically (as in Monte Carlo simulations, which describe distributed properties rather than explain them).
Aims. This paper presents a method of population synthesis that accounts for the distributed nature of stellar properties.
Methods. We approach population synthesis as a problem in probability theory, in which stellar luminosities are random variables extracted from the stellar luminosity distribution function (sLDF).
Results. With standard distribution theory, we derive the population LDF (pLDF) for clusters of any size from the sLDF, obtaining the scale relations that link the sLDF to the pLDF. We recover the predictions of standard synthesis models, which are shown to compute the mean of the luminosity function. We provide diagnostic diagrams and a simplified recipe for testing the statistical richness of observed clusters, thereby assessing whether standard synthesis models can be safely used or a statistical treatment is mandatory. We also recover the predictions of Monte Carlo simulations, with the additional bonus of being able to interpret them in mathematical and physical terms. We give examples of problems that can be addressed through our probabilistic formalism: calibrating the SBF method, determining the luminosity function of globular clusters, comparing different isochrone sets, tracing the sLDF by means of resolved data, including fuzzy stellar properties in population synthesis, among others. Additionally, the algorithmic nature of our method makes it suitable for developing analysis tools for the Virtual Observatory.
Conclusions. Though still under development, ours is a powerful approach to population synthesis. In an era of resolved observations and pipelined analyses of large surveys, this paper is offered as a signpost in the field of stellar populations.

Key words: galaxies: star clusters - galaxies: stellar content - Hertzsprung-Russell (HR) and C-M diagrams - methods: data analysis

1 Introduction and motivation

The study of stellar populations is one of the most fecund topics in today's astronomy. Understanding the properties of stellar populations is a key element in the solution of a host of fundamental problems, such as the calibration of distances to extragalactic objects, the age determination of clusters and galaxies through color fitting, the characterization of the star-formation history of composite populations, the modeling of the chemical evolution of galaxies, and several more. When tackling these problems, the interaction between theory and observations goes both ways: one may want to predict the properties of a stellar population with given properties, or to recover the basic properties of an observed population from the available observables. In either case, to accomplish the task one needs feasible theoretical models that serve as reliable diagnostic tools.

A traditional approach to building such diagnostic tools has been the computation of synthesis models. Synthesis models allow one to predict the features and the evolution of stellar populations in a highly structured way, one that is apt for routine analysis and quantitative assessments. For example, synthesis models can be used, in combination with other methods, for the determination of stellar population properties of large samples of galaxies in a reasonable time; e.g. the analysis of 50 362 galaxies of the Sloan Digital Sky Survey based on their integrated properties performed by Cid Fernandes et al. (2005).

However, in recent years there has been a growing awareness that synthesis modeling also suffers from severe limitations. In some cases, we know a priori that standard models cannot be applied to a given system, because the properties of the system fall outside the hypothesis space of the models; this is the case, for example, of undersampled populations (Cerviño & Valls-Gabaud 2003; Chiosi et al. 1988). In other cases, we observe a mismatch between the properties of the system inferred from observations of their individual components and those derived from their integrated properties analyzed with synthesis models: e.g. the discrepancy between the age determined from the color-magnitude diagram and the spectroscopic age in NGC 588 (Jamet et al. 2004), or the IMF slope inferred by Pellerin (2006) in undersampled giant H II regions. In these cases, we are facing a crisis in the explanatory power of synthesis models.

Previous attempts to solve this crisis and to understand the limitations of synthesis models have repeatedly pointed at the necessity of including statistical considerations in the analysis. According to this view, the predictive power breaks down because the traditional synthesis model is essentially a deterministic tool, whereas the nature of the problem is inherently stochastic. The clearest example of stochasticity is the mass spectrum of stellar populations, in which fluctuations (possibly random, although not necessarily so) in the number of stars of each type appear around the mean expected number. Until now, the efforts to take stochasticity into account in the modeling of stellar populations have moved in essentially two directions: the use of Monte Carlo techniques (e.g. Girardi 2002; Bruzual 2002; Santos & Frogel 1997; Cantiello et al. 2003; Cerviño et al. 2000a, among others) and the reinterpretation of traditional models in statistical terms (e.g. González et al. 2004; Cerviño et al. 2002; Buzzoni 1989). Both methods have proved able to solve some of the problems, or at least point toward possible solutions. However, they suffer from practical difficulties. Monte Carlo simulations are expensive (in terms of disk space, CPU time, and human time required to analyze the results), while, to date, the statistical reinterpretation of standard models has only served to establish limits to the use of synthesis models for clusters with moderately undersampled populations (Cerviño et al. 2003; Cerviño & Luridiana 2004), i.e. clusters with total initial masses on the order of $10^5~M_\odot$.

This limitation brings about a serious impasse for the study of stellar populations by means of their integrated light, since the properties of clusters with lower masses cannot be reliably obtained. This class includes, for example, all of the clusters in our Galaxy (including globular clusters), clusters in the Large Magellanic Cloud (LMC), as well as many clusters in external galaxies. For example, many of the clusters in the "Antennae'' studied by Zhang & Fall (1999) have masses lower than this limit: in Sect. 8.3 we will show that explicit consideration of stochastic effects could alter the conclusions that are drawn on the cluster luminosity function based on these clusters. Furthermore, these limitations will become even more dramatic with the development of Virtual Observatory (VO) technologies, which will make the automatic analysis of large amounts of data possible. It is therefore mandatory to adapt evolutionary synthesis models to the present needs, so that they can be applied to observations.

In the present paper we introduce a theoretical formalism for the probabilistic analysis of single stellar populations (SSPs). Our formalism yields results that are as accurate as those of large Monte Carlo simulations, but it bypasses the need to perform these simulations: that is, the method is both accurate and economic. By means of our formalism, synthesis models can be applied to clusters of any size and the confidence intervals of the results can be evaluated easily. This makes it possible to estimate the relative weights of different bands for the realistic application of goodness-of-fit criteria like the $\chi^2$ test. Finally, the algorithmic nature of our method makes it feasible for implementation in the VO environment.

This paper is the fourth in a series (Cerviño et al. 2000a, 2002, 2001) dealing with the statistical analysis of stellar populations. Although it only deals with SSPs, a fifth paper, in preparation, will be devoted to the extension of this formalism to any star formation history scenario. Finally, in Cerviño & Luridiana (2005) we give an extensive review of the uncertainties affecting the results of synthesis models. In addition to these papers, we also recommend the work by Gilfanov et al. (2004): statistical properties of the combined emission of a population of discrete sources: astrophysical implications. This paper, although not directly focused on synthesis models, suggests an alternative point of view of the problem. In some aspects, it has inspired the present paper.

In this work we restate the problem of stellar population synthesis from a new perspective, the one of luminosity distributions, which is a powerful and elegant way to understand stellar populations. We provide several examples of applications to illustrate such power and indicate potential areas of future development. The starting point of the new formalism is the definition of the stellar (i.e. individual) luminosity distribution function (LDF) and of its relation to the central problem of population synthesis (Sect. 2). Section 3 describes the two main variants of synthesis models by means of which the problem has traditionally been tackled: Monte Carlo and standard models. Next, we take the reader on a journey through the no man's land of population synthesis' pitfalls (Sect. 4), and show how these are faced by existing techniques (Sect. 5). This account is necessary to introduce, in Sect. 6, our suggested solution and show its comparative power. Finally, in Sect. 7, we give a few examples of applications of our formalism to current problems. Future developments of the present work are described in Sect. 8, and our main conclusions are summarized in Sect. 9.

  
2 Overview of the problem

This section starts by introducing a few basic definitions. An exhaustive summary of the notation used and its rationale is offered in Appendix A.

The general problem approached by synthesis models is the computation of the luminosity[*] $L_{{\rm tot}}$ emitted by an ensemble of  $N_{\rm tot}$ sources - a stellar population. From a theoretical point of view, this problem can be characterized in three basic ways.

If the luminosities li of the individual sources are known, the total luminosity  $L_{{\rm tot}}$ is obtained trivially as the sum of all the li values:

 \begin{displaymath}%
L_{{\rm tot}} = \sum_{i=1}^{N{_{\rm tot}}} ~ l_i.
\end{displaymath} (1)

This circumstance is not very frequent. Its most common examples are Monte Carlo synthetic clusters (Sect. 3) in the theoretical domain and resolved stellar populations in the observational one.

In the more usual situation in which the luminosities of individual objects cannot be specified on an individual basis, a different approach must be adopted: in this case a luminosity distribution function (LDF) $\varphi_{\rm L}(\ell)$ is assumed that describes the distribution of luminosity values in a generic ensemble. Traditionally, the LDF has been seen as the asymptotic limit of a differential count of stars:

 \begin{displaymath}%
\varphi_{\rm L}(\ell) = \lim_{N_{{\rm tot}} \rightarrow \in...
...Delta \ell \rightarrow 0}\frac{\Delta N}{\Delta \ell}}\right),
\end{displaymath} (2)

where $\Delta N$ is the number of stars in a luminosity bin of width  $\Delta \ell$, and  $N_{\rm tot}$ the total number of stars. The integral of the LDF is normalized to 1:

 \begin{displaymath}%
\int_{0}^{\infty} \varphi_{\rm L}(\ell){\rm d}\ell = 1.
\end{displaymath} (3)

The integrated luminosity of an ensemble is traditionally obtained by means of the expression:

 \begin{displaymath}%
L_{{\rm tot}} = N_{{\rm tot}} \int_{0}^{\infty} \ell ~ \varphi_{\rm L}(\ell){\rm d}\ell.
\end{displaymath} (4)

The bottom line of the present work is that this approach is conceptually wrong and operationally sterile. The crucial point where we part company with this approach is the definition and interpretation of the LDF: to us, $\varphi_{\rm L}(\ell)$ is a probability density function (PDF) from which the luminosity of an actual system is drawn. If the system is an individual star, its PDF is the stellar LDF (sLDF). If it is a stellar population, its PDF is the population LDF (pLDF). Hereafter, we will indicate the sLDF with the symbol  $\varphi_{\rm L}(\ell)$and the pLDF with the symbol  $\varphi_{\rm L_{tot}}({\cal L})$.

To avoid confusion, note that the pLDF is not the same as the cluster or galaxy luminosity function (LF) as commonly defined, because galaxy LFs include the effect of a spread in ages and number of stars (or, equivalently, ages and mass), whereas the pLDF defined in this paper represents the luminosity distribution of ensembles of stars with the same physical parameters (age and total number of stars).

In this paper, we are concerned with the properties of the sLDF and its relation to the pLDF. In particular, we will show that the total luminosity of a stellar population, ${\cal L}$, is a distributed quantity, whose mean value M1' is given by:

 \begin{displaymath}%
M_1' \equiv \langle {\cal L}\rangle = N_{{\rm tot}} \int_{0}^{\infty} \ell ~ \varphi_{\rm L}(\ell){\rm d}\ell.
\end{displaymath} (5)

The integral on the right-hand side of this equation is the mean value of the sLDF, $\mu _1'$, that is:

 \begin{displaymath}%
M_1' = N_{{\rm tot}} ~ \mu_1'.
\end{displaymath} (6)

Although Eqs. (4) and (5) yield similar expressions, it is important to distinguish between the two approaches. The historical origin of our sLDF lies in star counts, but it is a step further from them, in the same sense that the frequentist and the objectivist definitions of probability differ: the frequentist definition depends on the realization of trials, while the objectivist definition assumes that the probability properties are built-in. The implications of either approach will be discussed at length throughout the paper. For the time being, note that Eq. (1) involves a discrete sum, while Eqs. (4) and (5) involve an integral, whose numerical solution requires binning the independent variable. In Sect. 4.2 we discuss the consequences of binning distribution functions for numerical applications.

The main goal of synthesis models is to obtain the luminosity of a model stellar population, either by direct count (Eq. (1)) or by an integral including the sLDF (Eqs. (4) and (5)). In the following, we will see how this can be carried out in practice, under the assumption that there has been a star-forming episode in which all the stars have formed simultaneously; this is the scenario assumed by SSP models.

Because stellar luminosities evolve with time, the sLDF is a function of the star's age. Since the luminosity evolution of a star depends on its initial mass, we can express the time dependence of the sLDF explicitly by writing the sLDF as a function of two other functions: the isochrone $\ell(m;t)$ and the initial mass function (IMF) $\varphi_{{\rm M}}(m)$. The isochrone $\ell(m;t)$ gives the luminosity of a star as a function of its initial mass m at a given value of the age t. The IMF gives the probability distribution of initial stellar masses. The IMF has a status similar to that of the sLDF, in that it can be either interpreted as the result of a star count:

 \begin{displaymath}%
\varphi_{\rm M}(m) = \lim_{N_{{\rm tot}} \rightarrow \infty...
...lim_{\Delta m \rightarrow 0}\frac{\Delta N}{\Delta m}}\right),
\end{displaymath} (7)

or, in probabilistic terms, as the probability density for a star of being born with mass m: for reasons similar to those discussed above, we support the second interpretation (which, by the way, implies that the IMF should better be called Initial Mass Probability Density Function). According to its definition, $\varphi_{{\rm M}}(m)$ fulfills the normalization condition:

 \begin{displaymath}%
\int_{0}^{\infty} \varphi_{\rm M}(m){\rm d}m = 1.
\end{displaymath} (8)

Summing up, the sLDF can be rewritten in terms of the IMF and the isochrone as follows:

 \begin{displaymath}%
\varphi_{{\rm L}}(\ell;t) = \varphi_{\rm M}(m) \times \left(\frac{{\rm d}\ell(m;t)}{{\rm d}m} \right)^{-1}.
\end{displaymath} (9)

Equation (9) can now be used to rewrite the mean value of the sLDF in terms of the isochrone and the IMF[*]:
 
                                     $\displaystyle %
\mu_1'(t)$ = $\displaystyle \int_{m^{\rm low}}^{m^{{\rm up}}} \ell(m;t) ~ \varphi_{\rm M}(m) ...
...}\ell(m;t)}{{\rm d}m}\right)^{-1} ~ \frac{{\rm d}\ell(m;t)}{{\rm d}m}~ {\rm d}m$  
  = $\displaystyle \int_{m^{\rm low}}^{m^{{\rm up}}} \ell(m;t) ~ \varphi_{\rm M}(m)~ {\rm d}m,$ (10)

where the integration variable in Eq. (5) has been changed from $\ell$ to m, and ${m^{\rm low}}$, ${m^{{\rm up}}}$ are the lower and upper mass limits respectively of the integration domain.

Solving this integral is the main task of stellar population synthesis modeling. In the following section we will describe the main types of synthesis models and how they perform this integral.

  
3 Basic strategies

Once the physical problem is framed, we must translate it into actual computations, a task carried out by evolutionary synthesis codes. These come in two basic flavors, standard and Monte Carlo. Standard simulations are models in which the initial mass mixture is analytically described by the IMF, whereas in Monte Carlo simulations the ensemble of stars is selected by random sampling the mass of each star to be included in the population, and the IMF is used as the weighting function. Therefore, the mass distribution obtained with a standard simulation is univocally determined by the population's parameters, while in Monte Carlo simulations it is not; additionally, the standard distribution is smoother than the Monte Carlo one. Both methods, however, operate on a binned mass spectrum, due to the limited resolution of numerical computation and to the discreteness of the available stellar models. This fact has important consequences, which will be discussed in Sect. 4.2.

Using either of these two approaches, evolutionary synthesis codes aim to characterize the integrated emission properties of an ensemble of stars as a function of its physical parameters, such as the age and number of the individual stars of the ensemble[*]. Standard codes perform this task by carrying out the integration of Eq. (10). As described in Sect. 2, the result can be interpreted in two alternative ways: either a deterministic one - $\mu _1'$ is the sum of the luminosities of all the stars included in the ensemble modeled, normalized to one star -, or a probabilistic one - $\mu _1'$ is the mean value of the sLDF. Although the two interpretations may seem close at first sight, they are fundamentally different, both from a conceptual and from a practical perspective: this point will be furthered in Sect. 6. We call these two alternative interpretations of standard synthesis models the deterministic and the statistical one. Note that these labels do not identify different classes of models, but rather different interpretations within the same class of models - standard models. In practice, some codes do not explore this interpretation, while others acknowledge the distributed nature of luminosity and compute, in addition to the mean luminosity, the variance of the distribution. In either case, $\mu _1'$ can eventually be scaled to the size of the ensemble by multiplying by  $N_{\rm tot}$; this property will be formally demonstrated in Sect. 6.2.

Note that many codes do not use $\varphi_{{\rm M}}(m)$in Eq. (10), but rather a proportional function $\varphi'_{{\rm M}}(m) = {\rm const.}~ \varphi_{{\rm M}}(m)$ normalized in such a way that:

 \begin{displaymath}%
\int_{0}^{\infty} m ~ \varphi'_{\rm M}(m){\rm d}m = 1.
\end{displaymath} (11)

Since $\int_{0}^{\infty} m ~ \varphi_{\rm M}(m){\rm d}m$ is the mean mass value  $\langle m \rangle$ of the IMF, using $\varphi'_{{\rm M}}(m)$ instead of  $\varphi_{{\rm M}}(m)$ in Eq. (10) yields the mean luminosity of one star divided by  $\langle m \rangle$: in this case, the mean total luminosity is found by multiplying by the total mass of the ensemble  $M_{{\rm tot}}$ instead of  $N_{\rm tot}$.

As for Monte Carlo models, their task is essentially the computation of the sum in Eq. (1). Each time a simulation is run, a particular realization is drawn from the underlying distribution. The result of the simulation is the integrated luminosity ${\cal L}$of that particular realization. If many Monte Carlo simulations are available for a fixed set of input parameters, an estimate for M1' can be obtained as the mean of the ${\cal L}$ values. This implies that the results of Monte Carlo simulations depend not only on the underlying luminosity distribution, but also on the number of simulations used to sample such distribution. If the set of simulations is sufficiently large, the Monte Carlo method has also the potential to provide the actual distribution function of the luminosities of the ensemble. Hence, an important drawback of the method is that it is intrinsically expensive, because the accuracy of the results increases with the number of simulations.

In summary, the goal of determining the luminosity of stellar populations reduces to the computation of a sum (Eq. (1)) or an integral (Eq. (10)). However straightforward this may seem, this computation is hindered in the practice by several intrinsic features of the problem: these will be the topic of next section.

  
4 Pitfalls in handling the LDF

Our previous discussion has taken place on an abstract level. In the practice of synthesis modeling it is necessary to translate the concepts discussed above into specific prescriptions for the handling of equations, and deal with the limitations imposed by the input ingredients, which have a finite resolution in the parameter space. In this section we will discuss a specific aspect of this task, namely the difficulties inherent to the determination of the mean value of the sLDF. As a first step, let us revisit a few well-known results in terms of the sLDF.

  
4.1 Domain limits and average properties


  \begin{figure}
\par\includegraphics[width=8.2cm,clip]{3283f1.eps}\end{figure} Figure 1: Schematic representation of the of the sLDF for three different cases: main sequence only ( top), main sequence plus a constant post main sequence ( middle), and main sequence plus a bumpy post main sequence ( bottom). The position of the mean ($\mu _1'$) and the region corresponding to $\mu _1' \pm 1\sigma $ are explicitly marked.
Open with DEXTER

In synthesis modeling it is a well established fact that the integrated luminosity of the ensemble is dominated by the most massive stars in the cluster. In the following we will illustrate this point by means of three simplified scenarios that are representative of real sLDFs. These are shown graphically in Fig. 1. In each panel, the position of the mean $\mu _1'$ and the region corresponding to $\mu _1' \pm 1\sigma $ are explicitly marked.

Case 1: Main sequence luminosity function

In the first case let us assume that all the stars are in the main sequence (MS), and that $\ell \propto m^{\beta}$. Assuming a power-law IMF with index $-\alpha$, the sLDF can be expressed as:

\begin{displaymath}%
\varphi_{\rm L} \propto \ell^{-\frac{\alpha}{\beta}} \cdot ...
...eta}} = \frac{1}{\beta} ~ \ell^{\frac{1-\alpha-\beta}{\beta}}.
\end{displaymath} (12)

The constant factor can be determined by imposing the normalization condition:

\begin{displaymath}%
\varphi_{\rm L} = \frac{1-\alpha}{\beta}\frac{1}{\ell_{\rm ...
...beta}} = \frac{A}{\beta} ~\ell^{\frac{1-\alpha-\beta}{\beta}}.
\end{displaymath} (13)

The mean value of the luminosity is then:

\begin{displaymath}%
\mu_1' = \frac{A}{\beta} \int_{\ell_{\rm min}}^{\ell_{\rm m...
...beta}} - \ell_{\rm min}^{\frac{1+\beta-\alpha}{\beta}}\right).
\end{displaymath} (14)

If $1+\beta-\alpha > 0$, the mean luminosity is driven by  $\ell_{{\rm max}}$. In a typical situation with $\beta \approx 3$, the most luminous stars will dominate the luminosity if $\alpha < 4$: this is the case of Salpeter's IMF (Salpeter 1955). If the IMF departs from a simple power law, the conclusions are less straightforward. For example, if we consider the log-normal IMF by Miller & Scalo (1979) and approximate it as a power-law series, a value of $\alpha > 4$ is reached near 100 $M_\odot $ (see Fig. 1 in Kroupa 2001). In this case, the dominant stars will not be the most massive ones.

Case 2: Luminosity function with a constant post main-sequence contribution

Let us go a bit further and add a post-main sequence (PMS) contribution to the sLDF. The sLDF can now be divided in two regimes, corresponding to the MS and to the PMS respectively. As a first approximation, we will assume that the PMS phase gives a constant contribution to the sLDF, implying that any point of the PMS portion of the isochrone is equally probable. The sLDF can be described by the expression:

\begin{displaymath}%
\varphi_{\rm L} = \Biggl\{
\begin{array}{ll}
\frac{A}{\bet...
...if} \;\ell ~ \in (\ell_{\rm TO},\ell_{\rm max}),\\
\end{array}\end{displaymath} (15)

where $\ell_{\rm TO}$ is the turn-off luminosity. The absolute value of the PMS contribution is chosen so as to yield the same coefficient A for the two regimes, while preserving the overall normalization; that is, we are forcing the PMS to have a fixed weight with respect to the MS phase. This choice allows us to keep the complexity of the expressions to a minimum, but it has no influence on the general conclusions, which would be reached even if we dropped this assumption.

If we further assume that $\ell_{\rm TO}^{\frac{1+\beta-\alpha}{\beta}} \gg \ell_{\rm min}^{\frac{1+\beta-\alpha}{\beta}}$, the mean value of the sLDF is:

\begin{displaymath}%
\mu_1' \approxeq \frac{A}{1+\beta-\alpha} \ell_{\rm TO}^{\f...
...\beta}}\right)\left(\ell_{\rm max} \!+\! \ell_{\rm TO}\right).
\end{displaymath} (16)

That is, the mean depends on both $\ell_{\rm TO}$ and $\ell_{{\rm max}}$. Note that, since in the PMS phase the relation between the mass and the luminosity is not simple anymore, the above result cannot be easily expressed as a function of the initial mass. As a trivial example, if the age t is larger than the lifetime of a star of initial mass  ${m^{{\rm up}}}$, then $\ell(m^{\rm up},t) = 0$.

Case 3: sLDF with a bumpy post main-sequence contribution

As a final example, let us assume that the sLDF has a narrow peak in addition to the MS contribution. This scenario describes the case in which PMS stars have all the same luminosity, as in the horizontal branch or in the red giant clump phase; this case can be modeled by adding a pulse function to the MS section of the sLDF. Since PMS luminosities are larger than MS luminosities, the pulse function is located at $\ell_{{\rm max}}$. The sLDF is therefore:

\begin{displaymath}%
\varphi_{\rm L} = \Biggl\{
\begin{array}{ll}
\frac{A}{\bet...
...~ \in \left(\ell_{\rm TO},\ell_{\rm max}\right).\\
\end{array}\end{displaymath} (17)

Again, we are forcing the PMS to have a fixed weight with respect to the MS phase for simplicity reasons. In this case, assuming again that $\ell_{\rm TO}^{\frac{1+\beta-\alpha}{\beta}} \gg \ell_{\rm min}^{\frac{1+\beta-\alpha}{\beta}}$:

\begin{displaymath}%
\mu_1' \approxeq \frac{A}{1+\beta-\alpha} \ell_{\rm TO}^{\f...
...ell_{\rm TO}^{\frac{1-\alpha}{\beta}}\right) ~ \ell_{\rm max}.
\end{displaymath} (18)

So, again, the mean depends on both $\ell_{\rm TO}$ and  $\ell_{{\rm max}}$.

These three examples illustrate the general fact that the mean luminosity depends strongly on the value of  $\ell_{{\rm max}}$. The dependence on  $\ell_{{\rm max}}(t)$ would be even stronger for higher-order moments of the distribution. As Gilfanov et al. (2004) point out, the dependence of the results of synthesis models on the high-luminosity end of the sLDF is so strong that, without such a limit, synthesis models could not even be computed! As a further example of the relevance of the upper limit of the sLDF, we have shown in a previous paper that it also defines a lower limit for the luminosity of real clusters to be described by standard synthesis models (Cerviño & Luridiana 2004).

These examples also illustrate the following interesting facts: (i) The mean value does not necessarily give information on the distribution of luminosities, to the extreme that there can be situations in which the probability to find a source in the region around the mean value is zero (Fig. 1, bottom panel). This is the opposite of a Gaussian distribution, in which the mean is also the most probable value. (ii) Different distributions can have the same mean value but different variances. In fact, this circumstance permits to use surface brightness fluctuations (SBF: the ratio between the variance and the mean value of the luminosity function) to break the age-metallicity degeneracy, which makes clusters with different ages and metallicities have the same $\mu _1'$. (iii) The value of $\mu_1' - \sigma$ is negative in our three examples. This shows clearly that assuming, e.g., that $\mu _1' \pm 1\sigma $ includes $\sim$68% of the elements of the distribution can be grossly mistaken in the case of non-Gaussian distributions. As we will see later, this can also be the case with the distribution function of the integrated luminosity of an ensemble: this limits the use of goodness-of-fit methods based on the comparison with the mean, such as the $\chi^2$ test.

Summing up, our schematic but realistic sLDFs show that knowledge of the mean and the variance does not provide a handle on the problem if the shape of the distribution is not known. As Gilfanov et al. (2004) argue, the use of the mean value instead of the most probable one produces a systematic bias in the determination of integrated properties.

  
4.2 Implications of mass binning

Although Eq. (10) is expressed as an integral, synthesis codes approach it through numerical approximations. There are several reasons for this: first, the input data are available in tabular format rather than as analytical relations; e.g., given the enormous complexity of stellar evolution it is not possible to derive stellar properties as analytical functions of, say, initial mass and stellar age. Rather, stellar tracks are computed for a discrete and finite set of mass values, and their properties are conveniently interpolated a posteriori. Furthermore, calculated stellar tracks are generally expressed in tabular form, although there have been sporadic attempts at expressing them in analytical form (e.g. Tout et al. 1996). Second, even if analytical relations existed, their integration would plausibly require numerical methods. Third, the numerical precision of computers is limited. These circumstances force the actual calculations of synthesis models to be numerical rather than analytical: as a consequence, the mass variable must be binned. We will focus here on the implications of mass binning for Eq. (10).

Introducing binning, Eq. (10) can be expressed as:

 \begin{displaymath}%
\mu_1'(t) = \int_{m^{\rm low}}^{m^{{\rm up}}} \ell(m;t) ~ \varphi_{\rm M}(m) ~ {\rm d}m \simeq \sum_i w_i ~\ell_i(t),
\end{displaymath} (19)

where the index i identifies the mass bin and $\ell_i(t)$ is the (time-dependent) luminosity of the ith bin; the approximation holds only if the luminosity $\ell_i(t)$ is indeed representative of the whole mass bin. The coefficient wi is computed by means of the expression:

 \begin{displaymath}%
w_i = \int_{m_i^{{\rm up}}}^{m_{i}^{{\rm low}}} \varphi_{\rm M}(m) ~ {\rm d}m,
\end{displaymath} (20)

where $m_i^{\rm low}$ and $m_{i}^{{\rm up}}$ are the lower and upper limits of the ith mass bin (the specific way in which these limits are defined varies from code to code). In the framework of deterministic synthesis models, wi is deterministically interpreted as the fraction of the total number of stars enclosed in the given bin. In the probabilistic approach, however, such number is not fixed. Each star is either born within a given mass bin i, with a probability wi, or outside it. When  $N_{\rm tot}$ stars are selected, the number of stars in each mass bin follows a binomial distribution. Furthermore, since all the bins share the same number  $N_{\rm tot}$ of stars, there is a finite covariance among bins: that is, the collection of the mutually covariating binomial distributions constitutes a multinomial distribution. According to this approach, wi also represents the mean (as opposed to exact) contribution of each bin to the total number of stars. This interpretation is shared by statistical standard codes and Monte Carlo codes.

4.2.1 Distribution of stars in each bin

The statistical strand of standard models was born with the goal of devising statistical tools to be applied to synthesis models. In this context, the binomial probability distribution of stars in individual bins had been approximated by a Poisson distribution to simplify its handling (Cerviño et al. 2002). As is known, the Poisson approximation is accurate only when the number of events N is large and the probability p is vanishingly small, in such a way that Np stays finite: we will show in the following that such approximation is not always accurate enough for our problem.

A first point to consider is that speaking of a distribution of stars in bins only makes sense if a value of  $N_{\rm tot}$ is considered. In both cases considered, Poisson and binomial, the mean value of the number of stars in a bin is $\mu_i(n_i) = N_{\rm tot}$ $\times$ wi. Let us now illustrate the difference between the two distributions for the case of $N_{\rm tot} = 1$. The probability for the source to fall in the ith mass bin is, in the Poisson approximation, $p_i^{P}(1)=w_i ~ {\rm e}^{-w_i}$. In the binomial case such probability is pib(1)=wi, which is the correct value. The difference is non-negligible for large wi values, which is typical for bins at small masses.

Furthermore, when $N_{\rm tot}$ stars are considered, the Poisson approximation predicts the ratio M2/M1' of the variance to the mean to be unity, whereas in Monte Carlo simulations this ratio shows a trend toward smaller values for small bin masses (Fig. 1 of Cerviño et al. 2002), revealing the underlying binomial distributions, in which:

\begin{displaymath}%
\frac{M_2}{M_1'} = \frac{N_{\rm tot} w_i(1-w_i)}{N_{\rm tot} w_i }= 1 - w_i.
\end{displaymath} (21)

Note that, as expected, the discrepancy with respect to 1 grows larger for larger wi values - i.e., bins at lower masses. Unfortunately, Cerviño et al. (2002) failed to interpret this result in term of a binomial distribution, and proposed to solve the discrepancy by reducing the size of the bins, while keeping the Poisson approximation. Of course, reducing the size of the bin (or, equivalently, wi) leads to a closer similarity between the two distributions, but this solution is not always viable due to the limited resolution of the input ingredients. A further point is that, because  $N_{\rm tot}$ is shared by all the bins, the numbers of stars in different bins covariate: while this effect is neglected when the bin distributions are modeled by independent Poisson distributions, it arises naturally when a multinomial distribution is used.

The difference between the Poissonian approximation and the multinomial description is particularly clear if we compute the variance of the pLDF. Consider the expression for the variance obtained assuming that the number of stars in each bin is described by a Poisson distribution, and that there is no correlation between bins (e.g. Cerviño et al. 2002, among others):

 
$\displaystyle %
M_2 = \sum_i \mu_2(n_i) ~\ell_i^2 = N_{\rm tot} ~ \times ~ \sum_i w_i ~\ell_i^2.$     (22)

Now, consider the expression corresponding to a multinomial distribution, where, by definition, ${\rm cov}(n_i,n_j) = - N_{\rm tot} ~ w_i~ w_j$:
 
                           M2 = $\displaystyle \sum_i \mu_2(n_i) \ell_i^2 + \sum_i \sum_{j \neq i} \ell_i \ell_j ~ {\rm cov}\left(n_i,n_j\right)$  
  = $\displaystyle N_{\rm tot} ~ \times ~ \left( \sum_i w_i \ell_i^2 - \sum_i \left(w_i \ell_i\right)^2 \right)$  
    $\displaystyle - N_{\rm tot} ~ \times ~ \left( \sum_i \sum_{j\neq i} \ell_i \ell_j ~ w_i w_j \right).$ (23)

Comparing Eq. (22) to Eq. (23), it can be seen that the latter contains two additional terms: the term $N_{\rm tot}$ $\times$ $\sum_i (w_i \ell_i)^2$ arises as a result of dropping the Poisson approximation, and the term $\sum_i \sum_{j \neq i} \ell_i \ell_j ~ {\rm cov}(n_i,n_j)$represents the mutual dependence of bins, and hence arises as a direct consequence of binning.

These considerations show that the Poisson approximation, motivated by simplicity of handling, may break down in our problem (see also Lucy 2000). While the distinction between the Poisson approximation and the multinomial description is important in order to make sense of Monte Carlo simulations, it is fundamental in view of the discussion on our proposed probabilistic formalism (Sect. 6).


  \begin{figure}
\par\includegraphics[width=8.2cm,clip]{3283f2.eps}\end{figure} Figure 2: Top: isochrones in the $\ell _{\rm V}$ vs. B-V color-magnitude diagram for three different age values computed by Girardi et al. (2002) based on the evolutionary tracks by Marigo & Girardi (2001). Bottom: same as above, for  $\ell _{\rm K}$.
Open with DEXTER

  
4.3 Fast evolutionary phases

In writing Eq. (9), we have tacitly assumed that the luminosity is a well-behaved function of the mass so that the isochrone is always defined. However, this condition is in fact often violated, since a typical isochrone features shallow sections as well as peaks and discontinuities in the $(\ell, m)$ plane. Shallow sections correspond to quiescent phases of stellar evolution, where evolution is slow (e.g. the MS); peaks correspond to faster phases (e.g. the asymptotic giant branch, AGB); and discontinuities correspond to abrupt transitions between phases (e.g. the onset of central helium burning in intermediate mass stars) or jumps in stellar behavior (the transition between WR and non-WR-type structures). In the following, peaks and discontinuities will generically be referred to as fast evolutionary phases. Some of these cases are illustrated by the comparison of Figs. 2 with 3. In Fig. 2 the isochrones computed by Girardi et al. (2002) based on the evolutionary tracks by Marigo & Girardi (2001)[*] are shown, at selected age values. Figure 3 shows the relation between the initial mass and the luminosity in the V and K bands for the same isochrones, focusing on fast evolutionary phases.


  \begin{figure}
\par\includegraphics[width=16.33cm,clip]{3283f3.eps}\end{figure} Figure 3: Details of fast evolutionary phases in the V (solid line, left axis) and K (dotted-line, right axis) bands. Bottom panel: complete isochrones. Middle panels: blow-up of the mass axis, showing details of fast evolutionary phases at three different ages. Top panels: same as middle panels, with a more extreme blow up. The isochrone set is the same of Fig. 2.
Open with DEXTER

If the mean value of the sLDF is determined through the numerical approximation of Eq. (19), fast evolutionary phases are difficult to handle, because a small difference in initial stellar mass yields a large difference in luminosity, so that the result of the numerical integration crucially depends on which luminosity is chosen to be representative of a given mass bin. In principle, this difficulty can be dealt with by decreasing the width of the mass bin and choosing mass bins that do not go across discontinuities; however, the available resolution in mass is governed by the evolutionary tracks used by the code, and is generally too low to resolve adequately such phases in synthesis models.

This is a severe problem, since fast evolutionary phases are ubiquitous in post main-sequence evolution, and at certain frequencies they bear a major weight in the luminosity balance. Unfortunately, the way in which this problem is tackled is often labeled a "technical detail'' of the computation and dismissed as unimportant, and thus the papers describing evolutionary synthesis models do not generally make any reference to its solution - in spite of its difficulty and of the potentially disastrous consequences of incorrect assumptions. Here are a few examples of the ways in which this problem has been approached: a) in the Starburst99 synthesis code (Leitherer et al. 1999), for certain metallicity values, an undocumented stellar track at 1.701 $M_\odot $ is added to the tabulated track of 1.70 $M_\odot $ by Schaller et al. (1992) and Schaerer et al. (1993), to avoid the mass bins going across the discontinuity of the stellar models' behavior at such mass (C. Leitherer, D. Schaerer, & G. Meynet, private communication); b) to deal with the same problem, additional evolutionary tracks around the same mass range are used in the computation of the isochrones by Castellani et al. (2003) and Cariulo et al. (2004) (S. Degl'Innocenti, private communication) and Brocato et al. (1999) (E. Brocato, private communication); c) to avoid the intrinsic discontinuity in the isochrones, the same mass is used twice in the isochrones by Girardi et al. (2002), namely at the end of the red giant branch and at the beginning of the horizontal branch (S. Bressan, private communication).

As a final point, consider that fast evolutionary phases are those stages in which the evolution is rapid in any of the relevant luminosity bands, and not only the bolometric luminosity. Therefore, an isochrone that is adequately sampled in one band is not necessarily so in all the bands.

  
4.4 Transient phases and fuzzy stellar behavior

Traditional synthesis models rely on the possibility to express the luminosity as a function of mass (Eq. (9)), so that the integrals over luminosity are transformed into integrals over mass. However, there are stellar phases in which this is impossible to do, because the luminosity at a given age is not univocally determined by the mass. A few examples are: the thermal pulse phase in AGB stars, which is poorly resolved by stellar evolutionary models; supernova (SN) light curves, in which a star of fixed mass spans a large range of luminosity almost instantaneously; variable stars; rotating stars, in which the luminosity emitted depends on both the rotation velocity and the inclination angle. Partly because of this difficulty, these phenomena are generally not included in synthesis models; whereas approaching the problem from the point of view of distribution functions permits to account for them by including them in the sLDF - provided they can be modeled quantitatively. This point will be developed further in Sect. 7.2. See also Gilfanov et al. (2004) for an example on the inclusion of variability.

Of course, in many cases of interest the modeling of the phenomenon may be a difficulty in itself. For example, modeling rotation is in itself problematic. Rotating stellar models are just beginning to appear on the market, and the distribution of velocities in a stellar population is largely unknown. Stellar rotation is not yet included in synthesis models primarily because of these uncertainties, and not only because synthesis models are not flexible enough. But if stellar rotation or any other distributed stellar behavior is properly understood, or at least satisfactorily modeled, our method will permit to include it in synthesis modeling.

  
4.5 Implementation of model atmospheres

Usually, the available model atmospheres form a coarse grid in the (log g, log  $T_{\rm eff}$) plane, whereas isochrones are generally continuous in the plane. In order to assign a model atmosphere to each isochrone location, one can either choose the nearest atmosphere of the grid, or interpolate between nearby atmospheres. Assigning the nearest atmosphere implies a further binning of data and may originate jumps in the results, although it is often assumed that these jumps cancel on average when a whole population is considered. This problem will not be further discussed here; a more extensive discussion can be found in Cerviño & Luridiana (2005).

In the next section, we will review the ways in which the basic task of synthesis codes is accomplished, given the difficulties outlined here.

  
5 Computational algorithms

This section outlines the basic ways in which the task of computing the luminosity is specifically performed by standard codes. The first two parts, 5.1 and 5.2, describe techniques implemented in standard models. The third part, 5.3, describes how the Monte Carlo method is put into practice.

  
5.1 Isochrone synthesis

The commonest technique to compute the luminosity of stellar populations is known as isochrone synthesis. Isochrone synthesis is the numerical integration of the product between the IMF and the isochrone approximated as in the rightmost part of Eq. (19): $\mu_1'(t) = \sum_i w_i ~\ell_i(t)$. The strongest assumption of this numerical approximation is that each $\ell_i(t)$ is representative of all the stages included in it. Fast evolutionary phases evidently pose a problem in the integration, since a small interval in mass can correspond to very different luminosity values. In isochrone synthesis, the problem is solved increasing the number of mass bins that map fast evolutionary phases. However, this strategy eventually requires assuming mass bins narrower than the precision of the published tracks and isochrones. It is interesting to note that one of the main advantages of isochrone synthesis mentioned by Charlot & Bruzual (1991) is that it produces "smooth'' results; however, this advantage only hides the problem of fast evolutionary phases, but does not solve it.

  
5.2 The fuel consumption theorem

As an alternative method, one can avoid using the expression of Eq. (19), and perform instead the integration in the luminosity domain, that is integrate $\ell ~ \varphi_{\rm L}(\ell)$: in this way, the fast evolution of luminosity is automatically taken care of. This is the basis of the so-called fuel consumption theorem (FCT): to understand fully the FCT and its assumptions, we refer the reader to the original paper (Renzini & Buzzoni 1986; Buzzoni 1989; see also Maraston 1998).

Note that isochrone synthesis and the FCT can be coupled: the computation of isochrones can be done so as to fulfill the FCT requirements. We refer to Marigo & Girardi (2001) for an extended discussion on the subject (see also Bressan et al. 1994).

  
5.3 Monte Carlo methods

The Monte Carlo method can be implemented in different ways, which take advantage of its potential to varying degrees, so they are not completely equivalent. To understand this point, consider how a sample of stars selected through a Monte Carlo mass assignment can be handled: first, the stars can be either followed individually throughout their evolution, or grouped in mass bins characterized by average properties; second, in either case an atmosphere, used to transform the bolometric luminosity into observable quantities, can be assigned by either choosing the nearest model in the available grid, or interpolating between nearby atmospheres (see Sect. 4.5). Therefore there is a total of four ways in which a Monte Carlo-selected stellar population[*] can be treated. In the literature, there are examples of at least three of these:

(i)
The stars obtained through the Monte Carlo selection are grouped in bins; to each bin, the nearest available atmosphere model is assigned, which depends on the age considered. That is, the total luminosity is computed by means of the expression

 \begin{displaymath}%
{\cal L} = \sum_i n_i ~\ell_i(t),
\end{displaymath} (24)

where ni is the number of stars that have fallen into the ith mass bin, and $\ell_i$ is the luminosity of the atmosphere assigned to the bin; Eq. (24) is analogous to Eq. (19), with the only difference that the ni values vary from simulation to simulation. This strategy is followed by Bruzual (2002); Bruzual & Charlot (2003) (G. Bruzual, private communication). This implementation of the MC method is not very time-consuming, so it is quite effective in obtaining an overview of the effect of sampling in different observables, which is in fact the goal of the quoted papers. It has, however, two important disadvantages: the mass binning hinders a proper sampling of the discontinuities in the isochrones, and the $\ell_i$ assignation may either smooth out or spuriously amplify transient spectral features (see Sect. 4.5).

(ii)
The stars obtained through the Monte Carlo procedure are followed individually throughout their evolution, but the luminosities are assigned as in the previous method, that is choosing the nearest model in the grid. This approach, although similar to isochrone synthesis, is fundamentally different in that it avoids the additional rebinning implicit in the isochrone synthesis method. Models of this kind have been presented, for example, in Kurth et al. (1999); Cerviño & Mas-Hesse (1994); Cerviño et al. (2000b); Mas-Hesse & Kunth (1991). These models work better than the previous at mapping discontinuities in the isochrones (i.e., the value of ${\rm d}\ell/{\rm d}m$ is better evaluated: see, e.g. Fig. 4 of Mas-Hesse & Kunth 1991), but present the same disadvantages with respect to the assignment of $\ell_i$.

(iii)
A further possibility is to perform Monte Carlo simulations over the luminosity function: the Monte Carlo-sampled stars are followed individually, and each is assigned a tailored atmosphere. Doing so requires performing interpolations in the $\ell_i$ grid, and permits reproducing the path in the HR diagram of individual stars. The individual luminosity values obtained in this way are eventually summed to yield the total luminosity. These models exploit the potential of Monte Carlo method to its maximum and are the only ones able to map correctly the luminosity function without the necessity of binning. A sufficiently numerous set of Monte Carlo simulations of this kind provides directly the distribution function of the ensemble luminosity. The bleeding edge of this method lies, of course, in the interpolations techniques used both in the stellar evolution and in the atmosphere assignment. Examples of applications of this method can be found in Cantiello et al. (2003), Cerviño & Valls-Gabaud (2003) and Girardi (2000), among others.

  
6 Old tools for a new approach: the probabilistic formulation

The conclusions arising from this brief overview of population synthesis can be summarized in the following points:

In this section, we will describe a probabilistic formalism that automatically frames the problem in its most natural interpretation, and opens the way to a deep understanding of the underlying physical problem. The first subsection is devoted to recall standard statistical concepts for those readers without a background in statistics and probability theory. Those who are already expert in this field can skip this part.

A note on definitions is due here. Throughout this paper, we adopt the conventional (though often overlooked) distinction between probability theory and statistics. A statistical formulation is one that seeks to intepret a sample of experimental data in terms of an underlying distribution. A probabilistic formulation is one that assumes an underlying distribution and uses it to predict the resulting distribution of experimental data. The traditional Monte Carlo method has followed a statistical approach: conclusions were drawn from the observation of a spread in the results of simulations. The present paper lays the foundations for a probabilistic approach, in that it seeks to give a formal description of the underlying distributions, and to make quantitative predictions based on them. Obviously, the two approaches complement each other and partially overlap; concepts like the one of distribution function belong to both probability theory and statistics, thus we will refer to them as probabilistic or statistical alike. When it comes to define our method with respect to existing ones, however, we will systematically draw a distinction and refer to it as a probabilistic formalism.

6.1 Basic statistical concepts

As we have seen in Sect. 2, the luminosity of stars can be characterized in terms of an underlying sLDF (Eq. (5)). The exploitation of this approach permits finding an effective solution to the problem of computing the integrated properties of stellar populations. The formalism presented here to this scope is not new in terms of theory of distributions and can be found in any advanced textbook of statistics (e.g. Kendall & Stuart 1977). For completeness, we briefly review here the relevant concepts.

The properties of the sLDF as a distribution can be studied by computing its moments; the first moment is the mean, which we already encountered in Eq. (5):

 \begin{displaymath}%
\mu_1' = \int_0^\infty \ell ~ \varphi_{\rm L}(\ell) {\rm d}\ell.
\end{displaymath} (25)

The general definition of the nth moment of the sLDF is the following:

\begin{displaymath}%
\mu_n(a) = \int_0^\infty (\ell-a)^n ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell.
\end{displaymath} (26)

If a=0, we call it "raw moment'', while if $a=\mu_1'$ we call it "central moment''; in particular, the mean luminosity, which is the main output of synthesis models, is the raw moment of 1st order.

In the following, we will adopt the notation $\mu_n$ to indicate central moments and the notation $\mu'_n$ to indicate raw moments. Let us write down explicitly the expressions for the second central moment (or variance) and the second raw moment of the sLDF, and their mutual relation:

  
                                   $\displaystyle \mu_2' = \int_0^\infty \ell^2 ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell,$ (27)
    $\displaystyle \mu_2 = \int_0^\infty \left(\ell - \mu_1\right)^2 ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell$  
    $\displaystyle \quad ~ = \mu_2' - \mu_1'^2;$ (28)

and, analogously, for the third and fourth moments:
 
                                   $\displaystyle \mu_3' = \int_0^\infty \ell^3 ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell,$ (29)
    $\displaystyle \mu_3 = \int_0^\infty \left(\ell - \mu_1'\right)^3 ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell$  
    $\displaystyle \quad ~ = \mu_3' - 3 \mu_1' \mu_2' + 2 \mu_1'^3,$ (30)
    $\displaystyle \mu_4' = \int_0^\infty \ell^4 ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell,$ (31)
    $\displaystyle \mu_4 = \int_0^\infty \left(\ell - \mu_1'\right)^4 ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell$  
    $\displaystyle \quad ~ = \mu_4' - 4 \mu_1' \mu_3' + 6 \mu_1'^2 \mu_2' - 3 \mu_1'^4.$ (32)

Let us also introduce the characteristic function of the sLDF, $\phi(p)$, that is its Fourier transform:

 \begin{displaymath}%
\phi_{\rm L}(p) = \int_{0}^\infty {\rm e}^{{\rm i}p\ell} ~ \varphi_{{\rm L}}(\ell) {\rm d}\ell.
\end{displaymath} (33)

The characteristic function has the following properties: the coefficients of its Taylor expansion in ${\rm i}p$ are the raw moments of the distribution, while the coefficients of the Taylor expansion in ${\rm i}p$ of its logarithm $\ln \phi_{\rm L}(p)$ are the so-called cumulants of the distribution, $\kappa_n$:

\begin{displaymath}%
\ln \phi_{\rm L}(p) = \sum_{r=0} \kappa_r ~ \frac{({\rm i}p)^r}{r!}\cdot
\end{displaymath} (34)

It follows that moments and cumulants are related by the expression:

 \begin{displaymath}%
\sum_{n=0} \mu_n' ~ \frac{({\rm i}p)^n}{n!} = \exp ~ \left({\sum_{r=0} \kappa_r ~ \frac{({\rm i}p)^r}{r!}}\right).
\end{displaymath} (35)

In particular, the relations between the first four moments and cumulants are the following:
  
                                   $\displaystyle \kappa_1 = \mu_1',$ (36)
    $\displaystyle \kappa_2 = \mu_2 = \mu_2' - \mu_1'^2,$ (37)
    $\displaystyle \kappa_3 = \mu_3 = \mu_3' - 3 \mu_1' \mu_2' + 2 \mu_1'^3,$ (38)
    $\displaystyle \kappa_4 = \mu_4 - 3 \mu_2^2 = \mu_4' - 4 \mu_1' \mu_3' -3 \mu_2'^2 +12 \mu_1'^2 \mu_2' - 6 \mu_1'^4.$ (39)

An important property of cumulants is that they are independent of the assumed origin of the distribution, except for $\kappa_1$: they are also called sometimes "semi-invariants'' due to this property. If the origin is taken at the mean of the distribution, $\kappa_1 = 0$.

Finally, the skewness, $\gamma_1$, and the kurtosis, $\gamma_2$, of the distribution are defined through ratios of the third and the fourth central moments respectively to appropriate powers of the variance:

  
    $\displaystyle \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{\kappa_3}{\kappa_2^{3/2}},$ (40)
    $\displaystyle \gamma_2 = \frac{\mu_4}{\mu_2^{2}} - 3= \frac{\kappa_4}{\kappa_2^{2}}\cdot$ (41)

(Note that the definition of skewness and kurtosis may vary from author to author; alternative definitions can be found at http://mathworld.wolfram.com.) These two quantities enclose information on the shape of the distribution: the skewness gives an idea of how asymmetric the distribution is, and it can be related to the difference between the value of the mean and the mode (the most probable value). The kurtosis is a measure of peakedness, i.e. of the symmetric deformation of the distribution with respect to a Gaussian. In particular, Gaussian distributions have $\gamma_1=0$and $\gamma_2 = 0$; flatter distributions have negative kurtosis values, while peaked distributions have positive kurtosis values.

  
6.2 From stellar to population luminosity functions

In the previous section we have characterized the properties of the luminosity function of individual stars. Let's see now how the luminosity function of an ensemble of  $N_{\rm tot}$sources can be computed.

  
6.2.1 Obtaining the pLDF: exact solution

As a general rule, the PDF of an ensemble of variables is obtained as the convolution of the PDFs of the individual variables. For example, let  $\varphi_{x}(x)$ be the PDF of a variable x and  $\varphi_{y}(y)$ the PDF of a variable y independent of x. The probability density of a variable u=x+y is given by the product of the probabilities of  $\varphi_{x}(x)$ and  $\varphi_{y}(y)$ summed over all the combinations of x and y such that u=x+y:

\begin{displaymath}%
\varphi_{u}(u) = \int_{-\infty}^\infty \varphi_{x}(z) ~ \va...
...{y}(u-z) ~ {\rm d}z = \varphi_{x}(x) ~\otimes ~\varphi_{y}(y),
\end{displaymath} (42)

which is the definition of convolution. In our case, we are assuming that all the stars have luminosities distributed following the same distribution function, $\varphi_{\rm L}(\ell)$, and that the stars are independent on each other. Therefore, the PDF of an ensemble of  $N_{\rm tot}$ stars is obtained by convolving $\varphi_{\rm L}(\ell)$ with itself $N_{\rm tot}$ times:

\begin{displaymath}%
\varphi_{{\rm L_{tot}}}({\cal L}) = \overbrace{ \varphi_{{\...
...\otimes~ ... ~ \otimes \varphi_{{\rm L}}(\ell)}^{N_{\rm tot}}.
\end{displaymath} (43)

Hence, if the sLDF is known, the pLDF of an ensemble of $N_{\rm tot}$ stars can be computed by means of a convolution. The convolution is conceptually straightforward, but it poses severe numerical problems. The reason is the following. The convolution must be performed linearly in luminosity and, in the general case, the dynamic range in luminosities spans eight orders of magnitude, from  $10^{-2}~L_\odot$ to  $10^{6}~L_\odot$. On the other hand, most programs that perform convolutions are based on Fourier transform routines that require a set of points ordered regularly on the x-axis; each time a convolution is performed the number of points on the x-axis is doubled. So, for a resolution of, say, 0.01 $L_\odot$ (necessary to resolve the low end of the luminosity function) 108 points would be needed to define the luminosity function, and this number would be doubled each time a convolution is performed. Therefore, the points necessary to compute even a very undersampled population (a moderate number of convolutions) diverge rapidly, making it numerically unfeasible. We do not know any computational routine powerful enough to perform this task, and any feedback from the community about this subject is highly welcome.

An alternative solution is making the convolution logarithmically in the Fourier space (see next section). However, the numerical computation of the Fourier transform of a function with an irregular sampling is also difficult and, again, we haven't find any routine to perform it satisfactorily.

  
6.2.2 Scaling properties of the LDF

Convolutions in the normal space are equivalent to products in the Fourier space, so that:

  
    $\displaystyle \phi_{{\rm L_{tot}}}(P) = \phi_{{\rm L}}(p)^{N_{\rm tot}},$ (44)
    $\displaystyle \ln \phi_{{\rm L_{tot}}}(P) = N_{\rm tot} \times \ln \phi_{{\rm L}}(p) = N_{\rm tot} \times \sum_{r=0} \kappa_r ~ \frac{({\rm i}p)^r}{r!}\cdot$ (45)

Hence, the cumulants of the luminosity distribution of an ensemble of $N_{\rm tot}$ sources, Kn, can be easily obtained from the cumulants of the sLDF, $\kappa_n$, through the simple scale relation:

 \begin{displaymath}%
K_n = N_{{\rm tot}} ~ \times ~ \kappa_n.
\end{displaymath} (46)

Since $\kappa_1=\mu_1'$, this also implies that:

 \begin{displaymath}%
M_1' = N_{{\rm tot}} ~ \times ~ \mu_1',
\end{displaymath} (47)

where M1' is the mean value of the distribution that describes the luminosity of an ensemble of $N_{\rm tot}$ sources: in other words, the mean luminosity obtained by synthesis models is scalable to a cluster of any size - including one with only 1 source!

Thus, probabilistic reasoning confirms the intuitive expectation that the properties of an ensemble of $N_{\rm tot}$ stars can be obtained by direct scaling of the properties of the sLDF. This result is fundamental for the interpretation of the output of synthesis models in terms of real clusters. However, it is clear from the above derivation that this simple scaling rule only applies to cumulants, not to moments; it is cumulants that hold the scale relations between the properties of the sLDF and the distribution of the total luminosity of an ensemble. But, by virtue of Eqs. (36)-(39), cumulants can be related to moments: $\kappa_1$ is equal to the first raw moment, and the following $\kappa$s are equal to the central moments of the same order, which can in turn be expressed in terms of the raw moments: therefore, to characterize a distribution we can refer either to the moments or to the cumulants.

It is immediately seen from Eqs. (40) and (41) that the shape of the distribution of the ensemble, when expressed in normal form (through a transformation of the distribution function to one with zero mean and unit variance: ${\cal L} \rightarrow x = {({\cal L}-M_1')}/{\sqrt{M_2}}$), can be easily related to the shape of the sLDF:

 \begin{displaymath}%
\Gamma_1 = \frac{1}{\sqrt{N_{{\rm tot}}}}~ \gamma_1,
\\
\end{displaymath} (48)

and

 \begin{displaymath}%
\Gamma_2 = \frac{1}{N_{{\rm tot}}} ~\gamma_2,
\\
\end{displaymath} (49)

where $\Gamma _1$ and $\Gamma _2$ are the skewness and the kurtosis of the distribution of the ensemble. Note that, in agreement with the central limit theorem, $\Gamma_1 \rightarrow 0$ and $\Gamma_2 \rightarrow 0$ for large enough  $N_{\rm tot}$ values, i.e. the distribution tends to a Gaussian.

Although the previous relations are useful to unveil the scale properties of LDFs, knowledge of the moments of a distribution is useful but not sufficient to analyze it if its shape is unknown. For most application, one needs to know whether the distribution can be approximated by a Gaussian, and in case it is not, which its shape is. The following section will deal with the problem of characterizing a distribution by means of its cumulants. The technique suggested can be used to solve two different kind of problems: on one hand, it can be used to generate a theoretical pLDF from a sLDF when the convolution is not feasible. On the other, it can be used to infer the pLDF of an observed population.

  
6.2.3 Obtaining the pLDF: approximate solution

Alternative solutions go through obtaining an approximate expression for the pLDF. A quantitative characterization of the pLDF by means of its cumulants can be obtained by means of approximate expressions. To this aim, we suggest using the Edgeworth's series, which can be written schematically as:

$\displaystyle %
\varphi(x) = Z(x) ~ \left[ 1 + \sum_{i=1}^\infty t_i \right],$     (50)

where x is the normalized luminosity defined above, Z(x) is the Gaussian distribution function, and the terms ti are obtained by the Chebyshev-Hermite polynomials multiplied by powers of the cumulants (Blinnikov & Moessner 1998). This series is a true asymptotic series, i.e. the error is controlled when the series is truncated to a finite number of terms n. As Blinnikov & Moessner (1998) demonstrate, the error is on the same order of the last term of the sum, tn. When the error is small, i.e. the approximation is satisfactory, the term $\Sigma_n \equiv \sum_{i=1}^n t_i$ measures the deviation of the LDF from gaussianity.

These properties can be used to obtain an explicit description of the distribution and to estimate its degree of gaussianity. The algorithm to be followed is described here and represented in Fig. 4:

(i)
The range of interest in x (i.e. the normalized luminosity) must be defined. This is necessary because convergence is first reached at small x, and propagates outward as more terms are included in the truncated series.

(ii)
The maximum deviation $\delta $ from gaussianity must be chosen, in order to discriminate between non-Gaussian and quasi-Gaussian behavior.

(iii)
The maximum discrepancy $\epsilon $ admissible between the truncated series and the LDF must be chosen.

(iv)
A truncated expression is computed with the first n terms.

(v)
At this point, |tn| provides an estimate of the error. If $\vert t_n\vert/\vert 1+\sum_n\vert \ga \epsilon$, the error is too large, i.e. the truncated series is not a good approximation to the LDF: a further term must be added and the process resumed at step (iv). This step might require computing further cumulants, as higher-order terms of the series include progressively higher-order cumulants.

(vi)
As the number of terms retained increases, |tn| becomes smaller than $\epsilon $ and the expression progressively approaches the LDF, until it eventually becomes an acceptable approximation. At this point, if $\vert\Sigma_n\vert < \delta$ the pLDF is quasi-Gaussian; otherwise, it is strongly non-Gaussian, a fact that must be taken into account when the distribution is analyzed. In either case, the approximated expression can be used.


  \begin{figure}
\par\includegraphics[width=8cm,clip]{3283f4.eps}\end{figure} Figure 4: Algorithm to obtain an approximate expression for a pLDF based on the Edgeworth's series.
Open with DEXTER

The algorithm is summarized in the flux diagram of Fig. 4, which permits to find an explicit analytical expression for a pLDF of unknown shape but known cumulants.

As an example, we give here the explicit expression of the Edgeworth's series truncated to include terms up to n=2:

 
                               $\displaystyle %
\varphi_{{\rm L_{tot}}}(x)$ = $\displaystyle \frac{1}{\sqrt{2\pi}} {\rm e}^{-\frac{1}{2} x^2}$  
    $\displaystyle \times \Biggl( 1 + \frac{1}{6}~ \Gamma_1 ~ \left(x^3 -3 x\right) + \frac{1}{24}~\Gamma_2 ~\left(x^4 - 6x^2 +3\right)$  
    $\displaystyle + \frac{1}{72}~ \Gamma_1^2 ~\left(x^6-15 x^4 +45 x^2 -15\right)\Biggl).$ (51)

In the top panel of Fig. 5 we represent the region where Eq. (51) satisfies the first test of Fig. 4, that is the range of $\Gamma _1$ and $\Gamma _2$ values in which Eq. (51) approximates the pLDF with an accuracy of 10% or better in a given interval of normalized luminosity of x. The dependence on x is represented in Fig. 5 by different shades of gray.


  \begin{figure}
\par\includegraphics[width=8.8cm,clip]{3283f5.eps}\end{figure} Figure 5: Top: $\Gamma _1$ and $\Gamma _2$ values for which the Edgeworth's expression truncated to n=2 approximates the pLDF with a 10% accuracy or better (Eq. (51)). Different shades correspond to different ranges in normalized luminosity. Bottom: $\Gamma _1$ and $\Gamma _2$ values for which the pLDF can be approximated by a Gaussian to better than 10%, within different ranges in normalized luminosity.
Open with DEXTER

Similarly, the bottom panel of Fig. 5 describes the region satisfying the second test, i.e. the region where the pLDF can be approximated by a Gaussian within a 10%. As expected, this region is centered around the point [ $\Gamma_1=0, \Gamma_2=0$], where a true Gaussian would lie. Again, this depends on the range of x considered: the wider this range, the narrower the range of acceptable $\Gamma _1$ and $\Gamma _2$ values.

The extension of the dark-gray region can be used to define a simplified diagnostic test for luminosity functions (Fig. 6). This test is based on the algorithm of Fig. 4, but only the first four moments are used, and conventional figures of 10% are used to define whether Edgeworth's approximation is acceptable and the LDF is quasi-Gaussian. Since $\Gamma _1$ and $\Gamma _2$ are relatable to the skewness and kurtosis of the sLDF via $N_{\rm tot}$ (Eqs. (48) and (49)), this test can also be used to determine the minimum number of stars necessary to ensure gaussianity in a given band. An example of this technique will be given in Sect. 7.1.


  \begin{figure}
\par\includegraphics[width=8cm,clip]{3283f6.eps}\end{figure} Figure 6: Characterization of a pLDF based on Edgeworth's approximation to the second order and a gaussianity tolerance interval of $\pm $10%.
Open with DEXTER

  
6.2.4 Computation of the variance of the pLDF

We have seen in Sect. 4.2 that the Poisson distribution is only an approximation to the distribution of the number of stars in a bin, and that it is not a safe one. The correct alternative is to use the multinomial distribution, which has the additional advantage of keeping track of covariance effects across bins. However, the multinomial distribution by itself does not tell the whole story. Additionally, its handling is difficult. We will show here that the adoption of the probabilistic formulation permits bypassing this problem, because it provides the tools to compute the relevant quantities without the need of making assumptions on the distribution of stars in bins, except its randomness[*].

As an example, let's use our probabilistic formalism to compute the variance of the distribution. To do so, we need not make any assumption on the distribution of the number of stars in bins: we just need to take into account the scale relations of luminosity functions (Eq. (46)) and the properties of cumulants (Eqs. (36)-(39)):

\begin{displaymath}%
M_2 ~ = ~ K_2 ~ = ~ N_{\rm tot} ~ \times ~ \kappa_2 = ~ N_{\rm tot} ~ \times ~ \mu_2.
\end{displaymath} (52)

If we apply the approximation of Eq. (19), we obtain:
 
                                M2 = $\displaystyle N_{\rm tot} ~ \times ~ \biggr( \int_0^\infty \ell^2 ~ \varphi_{{\...
...gr( \int_0^\infty \ell ~ \varphi_{{\rm L}}(\ell) ~ {\rm d}\ell \biggl)^2\biggl)$  
  = $\displaystyle N_{\rm tot} ~ \times ~ \biggr(\sum_i w_i ~\ell_i^2 ~ -~ \biggr(\sum_i w_i ~\ell_i \biggl)^2 \biggl)$  
  = $\displaystyle N_{\rm tot} ~ \times ~\biggr(\sum_i w_i \ell_i^2 - \sum_i (w_i \ell_i)^2 - \sum_i \sum_{j \neq i} \ell_i \ell_j ~ w_i ~ w_j\biggl),$ (53)

which is the same as the expression of Eq. (23). This simple example shows the power and, at the same time, the simplicity of our probabilistic formalism. Note, however, that this result does not imply that the multinomial description is wrong: on the contrary, it is implicit in the probabilistic treatment. Our point here is that the probabilistic treatment is simpler and more powerful than an analysis explicitly based on the multinomial distribution.


  \begin{figure*}
\par\includegraphics[width=12.9cm,clip]{3283f7.eps}\end{figure*} Figure 7: Main parameters of the luminosity function in several photometric bands, obtained from the isochrones by Marigo & Girardi (2001) and adopting the IMF by Salpeter (1955) in the mass range 0.15-120 $M_\odot $.
Open with DEXTER

Finally, note that the value of the variance of the pLDF customarily assumed in the literature (Eq. (22)) is biased, with the work by Cantiello et al. (2003) as the only exception we are aware of. Note that in fact, the variance obtained from Eq. (22) is the second raw moment of the pLDF, and not its second central moment.

  
6.3 Technical problems in the computation of the pLDF

Unfortunately, the determination of the LDF of an ensemble is hindered by a few technical problems. The first one concerns the stellar LDF: although some aspects of stellar populations (like variability or transient events) would be more easily treated via the luminosity function as compared to the standard method, the modeling of the luminosity evolution during fast evolutionary phases poses the same problems as in standard codes.

A second issue is the contribution of dead stars to the luminosity function. Those stars have no influence on the computation of the moments of the distribution, except for a "normalization factor'' that depends on the age of the population. In terms of the shape of the luminosity distribution, dead stars show up as a pulse function located at zero luminosity. This can be easily understood in the following terms: let us assume an ensemble with  $N_{\rm tot}$ initial stars. At a given age there is a non-zero probability that all stars are dead; the pLDF, marginalized to such a condition, is a delta centered on zero. There is also a non-zero probability that all but 1 stars are dead: the pLDF corresponding to this case is the sum of a pulse function and the stellar luminosity function, the pulse function having in this case a strength smaller than the delta. Similarly, all but 2 stars can be dead, and the corresponding pLDF is a pulse function plus the convolution of two stellar luminosity functions. In total, there are $N_{\rm tot}$ + 1 marginalized cases, corresponding to  $N_{\rm tot}$, $N_{\rm tot} -1$, $N_{\rm tot} -2$, ..., 2 and 1 dead stars. The LDF of the ensemble should include all these cases, each with a weight given by its relative probability, so that it will have a pulse contribution centered on zero. Although this result seems to contradict the central limit theorem, which states that the asymptotic shape of the LDF of the ensemble is a Gaussian, this is not the case, since the relative pulse contribution becomes smaller and smaller as $N_{\rm tot}$ increases.

  
7 Applications of the probabilistic treatment

  
7.1 Characterization of a sLDF by means of its cumulants

In Sect. 4.1 we showed that knowledge of the mean and the variance is not enough to characterize a LDF. As a rule of thumb, at least the first four cumulants should be taken into consideration, which describe the mean, the dispersion, the asymmetry and the peakedness of the distribution. As a first example of application of the probabilistic treatment, we will derive a few properties of stellar population from the analysis of its cumulants. Figure 7 shows the first four cumulants of the sLDF for different bands and ages, obtained from the isochrones by Marigo & Girardi (2001). We have assumed a Salpeter (1955) IMF in the mass range 0.15-120 $M_\odot $ normalized by the total number of stars. For a normalization in mass rather than in number of stars, these results must be multiplied by the mean mass, $\langle m \rangle = 0.52~M_\odot$. The figure illustrates several interesting facts:

  
7.2 Inclusion of uncertainties in the input parameters

The input ingredients used in population synthesis are generally assumed to be fully known but are in fact affected by uncertainties (Cerviño & Luridiana 2005). These uncertainties may either reflect incomplete knowledge or an intrinsic spread in the features of the quantities considered. In either case, they can be included in the sLDF provided they can be modeled quantitatively.

As an example, let's suppose that the mass distribution function depends on a parameter $\theta$ that is itself distributed, that is let's replace the univocally determined function  $\varphi_m(m)$ with a parametric function  $\psi(m;\theta)$. The parameter $\theta$ can be characterized by a probability distribution function such that:

\begin{displaymath}%
\int p(\theta) ~ {\rm d}\theta = 1,
\end{displaymath} (54)

where the integration interval is the range where $\theta$ is defined. To apply Eq. (9), it is necessary to eliminate the parametric dependence. This is done by integrating  $\psi(m;\theta)$ over all possible $\theta$ values, weighting it by $p(\theta)$:

\begin{displaymath}%
\varphi_m(m) = \int \psi(m;\theta) p(\theta) ~ {\rm d}\theta.
\end{displaymath} (55)

The resulting distribution, which is technically called a mixture of $p(\theta)$ and  $\psi(m;\theta)$ (Kendall & Stuart 1977), can be used in Eq. (9).

Let's consider two hypothetical examples. In the first, assume that $\psi(\theta;m) \propto m^{-\theta}$ and that the power-law index $\theta$ has a rectangular distribution centered on Salpeter's index  $\alpha=2.35$:

\begin{displaymath}%
p(\theta) = \Biggl\{
\begin{array}{lr}
\frac{1}{2\delta} ...
...ta -2.35 \le + \delta,\\
0 & {\rm otherwise},\\
\end{array}\end{displaymath} (56)

with $\delta>0$. If we mix $\psi(\theta;m)$ with $p(\theta)$, we find:
                 $\displaystyle %
\varphi_m(m)$ = $\displaystyle \int \psi(\theta;m) ~ p(\theta)~ {\rm d}\theta$  
  $\textstyle \propto$ $\displaystyle \frac{\left(m^{\delta}-m^{-\delta}\right)}{2\delta~ {\rm ln}~ m} ~ m^{-2.35}.$ (57)

When $\delta \rightarrow 0$, the expression ${(m^{\delta}-m^{-\delta})}/{(2 \delta~ {\rm ln}~ m)} \rightarrow 1$ and Salpeter's law is recovered. When $\delta $ is non-negligible, the sLDF is distorted with respect to the simple Salpeter's case (Fig. 8).

As a further example, let's assume that $\psi(\theta;m) \propto m^{-\theta}$ as above, but that now $p(\theta)$ is Gaussian:

\begin{displaymath}%
p(\theta) = \frac{1}{\sigma_\theta \sqrt{2\pi}} {\rm e}^{-\frac{(\theta - 2.35)^2}{2\sigma_\theta^2}}.
\end{displaymath} (58)

In this case, the $\theta$-weighted IMF is given by $\varphi_m(m)\propto m^{-2.35} \exp(\sigma_\theta^2 ln^2 m)/2$ and, again, the IMF is distorted with respect to the limiting case given by Salpeter's law (Fig. 8).


  \begin{figure}
\par\includegraphics[width=8.2cm,clip]{3283f8.eps}\end{figure} Figure 8: Salpeter's IMF (solid line) compared to two examples of distributed IMFs: rectangular (dashed, $\delta =0.5$) and Gaussian (dot-dashed, $\sigma =0.5$).
Open with DEXTER

Following this example, any spread in the input ingredients can be included in the modeling and contribute to the shape of the pLDF. In particular, the method outlined above can be used to include transient phases and fuzzy stellar behavior in the modeling (Sect. 4.4). For this to be done, it is just necessary that the phenomenon considered can be described in terms of a parameter of known distribution function. Other uncertainties that can be incorporated in the modeling are those that reflect our imperfect knowledge of the problem: for example, the sLDF could be mixed with a Gaussian distribution to mimic observational errors.

It is important to note that the effect on the pLDF of a spread in the input ingredients cannot be determined a priori, as it depends on how the distribution of the input ingredient affects the sLDF. In particular, it cannot be established whether the inclusion of distributed ingredients will render convergence to gaussianity more or less rapid. It is to be expected that, the further is the sLDF from gaussianity, the slower will the convergence of the pLDF to gaussianity be. For example, if the distribution of the input ingredient is bimodal, bimodality will persist in the pLDF of small  $N_{\rm tot}$, and a larger number of stars will be required for the pLDF to converge to a Gaussian.

7.3 Comparison between Monte Carlo simulations, numerical convolutions and the Edgeworth's approximation


  \begin{figure}
\par\includegraphics[width=7.35cm,clip]{3283f9.eps}\end{figure} Figure 9: Monte Carlo simulations of clusters at solar metallicity and 5.5 Ma in the K band for different values of  $N_{\rm tot}$ (Cerviño & Valls-Gabaud 2003, shaded histograms), compared to analytical pLDFs obtained convolving $N_{\rm tot}$ times a sLDF made up of three Gaussians (solid line). The vertical scale is logarithmic in the left panels and linear in the right panels. The analytical sLDF has the same mean and variance of the Monte Carlo simulations; the position of the mean is shown by a vertical dashed vertical line.
Open with DEXTER

We have shown in Sect. 6.2.1 that obtaining the pLDF through a convolution of the sLDF is not simple; however, in simple scenarios it is possible to solve it. In the following, we will consider a simple case to illustrate this point. Our aim in this experiment is twofold: first, we will derive an explicit expression for the pLDF at different  $N_{\rm tot}$'s to show how its shape is related to the generating sLDF and how it depends on  $N_{\rm tot}$. Second, we want to show that, through our method, we are able to reproduce the main features of Monte Carlo simulations for any value of  $N_{\rm tot}$.

Let us assume a stellar luminosity function made up of three Gaussians, representing the dead stars, the MS, and the PMS respectively. The parameters of the three Gaussians are chosen so that the broad features of a set of Monte Carlo simulations for one star are reproduced (upper panels of Fig. 9; the vertical scale is logarithmic in the left panel and linear in the right panel). In particular, the mean and the variance of the triple-Gaussian LDF are constrained to be the same as those of the Monte Carlo simulations. We also confirmed a posteriori that $\gamma_1$ and $\gamma_2$ are very close, which is expected for similar distributions.

With a numerical routine, we convolved this sLDF with itself $N_{\rm tot}$ times. The resulting pLDFs for selected values of  $N_{\rm tot}$ are shown in Fig. 9 (solid line). The features of these pLDFs can be understood qualitatively as follows: the characteristic function of this sLDF is:

$\displaystyle %
\phi_{\rm L}(p) = A_{\rm ds} {\rm e}^{-\frac{1}{2} \sigma_{\rm ...
...A_{\rm PMS} {\rm e}^{-\frac{1}{2} \sigma_{\rm PMS}^2 p^2 - i \ell_{\rm PMS} p},$     (59)

where $A_{\rm ds}$, $A_{\rm MS}$, and $A_{\rm PMS}$ are the weights of the Gaussians corresponding to dead stars, the MS, and the PMS respectively; $\ell_{\rm ds}$, $\ell_{\rm MS}$, and $\ell_{\rm PMS}$ their locations on the luminosity axis; and $\sigma_{\rm ds}$, $\sigma_{\rm MS}$, and $\sigma_{\rm PMS}$ the respective dispersions. The exponent of the $N_{\rm tot}$-th power of this characteristic function is a sum of real exponents in p2 (i.e. Gaussian distributions) and imaginary exponents in p(i.e. translations of the corresponding distributions). Hence the final function will be a sum of Gaussians located at different positions.

The vertical sequence of panels shows how, increasing $N_{\rm tot}$, the pLDF progressively becomes smoother and more symmetric, approaching a Gaussian shape. In the same figure, Monte Carlo simulations with corresponding $N_{\rm tot}$ values are also shown (Cerviño & Valls-Gabaud 2003), and these coincide remarkably with the analytical pLDF. This is a consequence of having chosen a sLDF similar to the Monte Carlo distribution function for $N_{\rm tot} = 1$ (which, in turn, maps the underlying sLDF), and shows the power of the method: large Monte Carlo simulations become redundant, in the sense of being predictable, if one can characterize the sLDF and succeeds in convolving it.


  \begin{figure}
\par\includegraphics[width=7.4cm,clip]{3283f10.eps}\end{figure} Figure 10: Comparison between the pLDFs obtained through sLDF convolutions (dashed line) and the Edgeworth's approximation (solid line) for clusters with different $N_{\rm tot}$ values.
Open with DEXTER

Figure 10 compares the pLDFs obtained through convolution of the sLDF to their Edgeworth's approximation to the second order (Eq. (51)). Each panel corresponds to a different number of stars; the cluster in the upper panel, with $N_{\rm tot}=1000$, is the same one of the lower panel of Fig. 9. The Edgeworth's approximation improves visibly across the range of  $N_{\rm tot}$ considered: this is expected, because the closer is the pLDF to a Gaussian, the better is it approximated by Edgeworth's expression.


  \begin{figure}
\par\includegraphics[width=7.5cm,clip]{3283f11.eps}\end{figure} Figure 11: $\Gamma _1$ and $\Gamma _2$ values of the pLDFs obtained through Monte Carlos simulations (black diamonds), sLDF convolutions (empty circles), and the Edgeworth's approximation (empty squares). Larger $\Gamma _1$ and $\Gamma _2$ correspond to lower $N_{\rm tot}$ values.
Open with DEXTER

Finally, in Fig. 11 we show a blow-up of Fig. 5 that compares the $\Gamma$s of the pLDF obtained through the three different methods (Monte Carlo simulations, numerical convolution, and the Edgeworth's approximation). Several points deserve to be emphasized here:

7.4 Criteria for assessing the significance of fits

As has been pointed out several times, the main result of synthesis models is the mean value of the stellar luminosity function, which can be scaled to clusters of any size. We have also shown that the relative dispersion of the model results (the ratio $\sigma({\cal L})/M_1' = \sqrt{M_2}/M_1'$) decreases when  $N_{\rm tot}$ increases. However, $\sigma({\cal L})$ increases in absolute terms, since it is proportional to  $\sqrt{N_{\rm tot}}$, a fact that should be taken into account in the comparison of theoretical data to observed clusters. Furthermore, since each monochromatic luminosity has its own $\sigma({\cal L})$, they should be weighted differently in fits. Finally, although some regions of the spectrum may have a quasi-Gaussian distribution, this will not happen in general with all the regions. Hence, not all the frequencies (either in synthetic or in observed spectra) are equivalent, or even suitable, to obtain the properties of the observed cluster.


  \begin{figure}
\par\includegraphics[width=8.2cm,clip]{3283f12.eps}\end{figure} Figure 12: Cumulants of the monochromatic spectral energy distribution for a 1 Ga luminosity function assuming a Salpeter (1955) IMF in the mass range 0.15-120 $M_\odot $. The figure shows the region around H$\delta $, which is marked with a solid line. The lines Ca  II H + H$\epsilon $ are marked with dashed lines, and the line Ca  II K with dotted lines.
Open with DEXTER

As an example, Fig. 12 shows the first four cumulants of a region of the visual electromagnetic spectrum for the pLDF of a 1 Ga cluster with solar metallicity, obtained from the synthesis code sed@[*] using a Salpeter (1955) IMF in the mass range 0.1-120 $M_\odot $. The results are normalized to mass. The isochrones used are from Girardi et al. (2002) covering a mass range from 0.15 to 100 $M_\odot $ and based on the solar models (Z = 0.019) by Girardi et al. (2000) and Bertelli et al. (1994) that include overshooting and a simple synthetic treatment of the thermal pulses AGB phase (Girardi & Bertelli 1998). The atmosphere models are taken from the high resolution library by González Delgado et al. (2005); Martins et al. (2005) based on PHOENIX (Allar et al. 2001; Hauschildt & Baron 1999), the ATLAS 9 odfnew library (Castelli & Kurucz 2003) computed with SPECTRUM (Gray & Corbally 1994), the ATLAS 9 library (Kurucz 1991) computed with SYNTSPEC (Hubeny et al. 1995), and TLUSTY (Lanz & Hubeny 2003) at [Fe/H] = 0.0 dex.

The figure shows the region around H$\delta $. As expected from previous plots at these ages (Fig. 7), the values of $\Gamma _1$ and $\Gamma _2$ are quite low. However, not all the wavelengths have the same statistical significance. In particular, the H$\delta $ line shows the largest values of $\Gamma _1$ and $\Gamma _2$, so in undersampled clusters its profile will be difficult to fit. On the other hand, the profile of Ca  II H + H$\epsilon $ is a quite robust result, with a low relative dispersion and $\Gamma _1$ and $\Gamma _2$ values close to the continuum level. Finally, the Ca  II K line, with $\Gamma _1$ and $\Gamma _2$ values similar to those of the continuum, has a high relative dispersion. Summarizing, fitting the theoretical models of Ca  II lines, including their profiles, to observed data would yield more realistic results than fitting either the intensity, the equivalent width or the profile of H$\delta $. Of course, these conclusions depend on the age and metallicity.

  
8 Future applications of the probabilistic treatment

In this section we will briefly discuss several potential extensions of the formalism that could have a strong impact in the analysis of stellar populations. Details on a few examples will also be given.

  
8.1 Forthcoming extensions of the formalism

Since in this paper we have only considered the case of SSPs, maybe the most important pending issue is the extension to other scenarios of star formation. This topic will be covered in a forthcoming paper.

A current limitation of the formalism is that it only deals with integrated properties that scale linearly with the number of stars in the cluster. In the future, the formalism will be extended to include the case of luminosity ratios. To solve this problem, in addition to the cumulants of the distribution function of the ensemble, it is also necessary to obtain the correlation function of the corresponding quantities.

A further assumption of the formalism in its present stage is that the stellar population has a fixed number of stars  $N_{\rm tot}$. It is our intention to extend the formalism by including the case of a collection of populations with varying number of stars. This extended formalism could be applied to several problems, such as the analysis of stellar populations in pixels, which requires computing the global distribution resulting from the distribution of stars in each pixel and the distribution of numbers of stars across pixels; the distribution of luminosities in globular clusters; the estimation of the difference between luminosity profiles in galaxies inferred by means of a comparison with the mode and the mean respectively; the comparison of theoretical SBF with observed ones; and the comparison among different Monte Carlo simulations performed with a total fixed mass. A few of these prospective applications will be discussed further in the remainder of this section.

Finally, the formalism developed here for the treatment of luminosity functions can also be applied to stellar yields. All the statistical considerations that we have made about luminosities can be easily translated into equivalent issues in the field of chemical evolution, although in that case the star formation history would have a fundamental role, and the differential equations that describe (deterministically) the chemical evolution would become stochastic differential equations, whose mean would coincide with the results obtained deterministically. As in the present case, comparing the mean against observations can produce a bias that depends strongly on the shape of the distribution.

The following sections will give more details on a few examples among these.

8.2 Surface brightness fluctuations

SBF observations from galaxies and globular clusters have been proposed as a test of evolutionary tracks and isochrones (e.g. Cantiello et al. 2003, among others). This test is based on the comparison between the observed variance across pixels in the image of a galaxy and the variance expected on statistical grounds. However, there are several inconsistencies in this method as is applied at present:

This inconsistencies could be overcome by extending the formalism so as to include the LDF of populations with varying number of sources. This subject will be discussed at length in a forthcoming paper.

  
8.3 Putting constraints on the globular clusters' distribution

In a similar way, the luminosity distributions of globular clusters in galactic halos could be compared to the corresponding distributions obtained theoretically, either in terms of moments or in terms of the explicit shape of the distribution. The possibility of obtaining higher-order moments, such as the skewness, helps constraining the distribution, which is necessary to test evolutionary tracks and isochrones. However, there are drawbacks similar to those of the previous case, with the only exception of the assumption on the validity of a SSP, which in a galactic halo is probably verified.

To pursue the goals sketched above, it is imperative to know the initial distribution of cluster masses. In the following, we will show qualitatively that our method can also contribute toward this goal, by disclosing a potential source of bias in the use of synthesis models for the determination of the mass distribution of globular clusters. Note, however, that a quantitative conclusion cannot yet be reached.


  \begin{figure}
\par\includegraphics[width=7.8cm,clip]{3283f13.eps}\end{figure} Figure 13: Luminosity distribution functions in V for clusters with the LMC metallicity, approximated by the Edgeworth's series, for several ages and cluster masses: 10 $^4~M_\odot$ (solid line), 10 $^{4.5}~M_\odot$ (dashed line) and 10 $^5~M_\odot$ (dotted line). The mean values of the pLDFs are marked as vertical lines.
Open with DEXTER

In Fig. 13 we show the pLDFs in $\mathcal {L}_{\rm V}$ for clusters with the LMC metallicity obtained from the isochrones by Girardi et al. (2002) using the Edgeworth's approximation. We have plotted the distributions that correspond to the extremes of the age ranges used by Zhang & Fall (1999). The pLDFs correspond to cluster masses of 10 $^4~M_\odot$ (solid line), 10 $^{4.5}~M_\odot$ (dashed line), and 10 $^5~M_\odot$ (dotted line). The mean values of the corresponding pLDFs are marked as vertical lines.

The figure shows that, given the asymmetry of the stellar luminosity function, most observed clusters will have smaller luminosities than the mean, i.e. smaller luminosities than those predicted by a standard code. The straightforward implication is that the mass of observed clusters inferred by means of a comparison with standard models is, in most cases, underestimated, with a bias larger for smaller clusters (see also Gilfanov et al. 2004) and younger ages.

This effect is highly relevant for young and undersampled clusters. In particular, assuming that the age estimation obtained by Zhang & Fall (1999) is not biased, the figure clearly shows that there can be a systematic underestimation of the mass of younger clusters, an hence, an overestimation of the number of low-mass clusters. In the cluster mass range considered, this effect is more relevant in the 104 to 10 $^{4.5}~M_\odot$ interval. This fact could change the shape of the distribution of the initial cluster masses obtained by Zhang & Fall (1999), particularly in the low mass range, making it shallower or even inverting the slope. However, this result is only a qualitative application of the new formalism, and the example above should not be taken literally. To establish a firm conclusion it is necessary to apply the probabilistic treatment also to the determination of ages.

  
8.4 Tracing the sLDF with resolved stellar populations

The individual stars in a resolved population can be used to trace the sLDF. This can be done by comparing the first four cumulants computed by means of current synthesis models to the corresponding observed quantities. Resolved populations have several advantages:


  \begin{figure}
\par\includegraphics[width=8.5cm,clip]{3283f14.eps}\end{figure} Figure 14: Main parameters of the luminosity function for several photometric bands obtained from the isochrones by Marigo & Girardi (2001) (bold lines) and Bertelli et al. (1994); Girardi et al. (2000) (light lines), assuming a Salpeter (1955) IMF in the mass range 0.15-120 $M_\odot $.
Open with DEXTER

8.5 Testing the isochrones

The method proposed in Sect. 8.4 can also be used to compare the predictions of different sets of isochrones, since it is also sensitive to the number of stars expected in each region of the isochrone (which is almost IMF-independent for PMS stars). A similar method has been proposed by Wilson & Hurley (2003).

To illustrate the point, we show in Fig. 14 the first four cumulants of the sLDF for different bands and ages using the isochrones by Marigo & Girardi (2001), computed following FCT requirements explicitly, and by Girardi et al. (2000) and Bertelli et al. (1994). We have assumed a Salpeter (1955) IMF in the mass range 0.15-120 $M_\odot $ normalized by the total number of stars, and the results have been obtained by a direct integration of the isochrones. Note that, although the mean values and the $\mu_2/\mu_1'^2$ ratios are similar across the isochrone sets, there are large differences in $\Gamma _1$ and $\Gamma _2$ at some ages. This tells us, without even knowing the luminosity distribution function, that there are strong differences between both sets of isochrones, which can be directly related with differences in the treatment of stellar evolution, e.g. differences in the lifetimes of different phases.

8.6 Application to the virtual observatory

The method described in this work can be implemented in the VO as an automatized tool for the analysis of observed data, since synthetic models can be given in terms of probability distributions suitable to be used in data mining algorithms or in Bayesian analysis. To achieve this goal, an appropriate theoretical data model is necessary; the definition of such model is a task that is currently being carried out by the Theory interest group of the International Virtual Observatory Alliance (http://www.ivoa.net). In addition, the extension of this probabilistic formalism to distributions of luminosity ratios, which are used in diagnostic diagrams, would be an asset for the development of more robust VO analysis tools.

  
9 Conclusions

This paper considers a series of problems in population synthesis that arise as a consequence of the distributed nature of stellar populations, and develops a new probabilistic formalism that takes them into account. With this formalism it is possible to reproduce and explain the features of Monte Carlo simulations without the need of performing them. The new formalism has several advantages with respect to Monte Carlo simulations in terms of generality and reliability. Unlike Monte Carlo simulations, it is not affected by sampling errors in the estimation of the moments. Unfortunately, its exact application requires computing repeated convolutions, and we do not know any computational tool that can perform this task efficiently. Although the formalism is complete for luminosities, it must still be extended to the case of ratios. A summary of our conclusions follows:

As our understanding of stellar populations shifts, population synthesis tools evolve. The problem of predicting the integrated properties of stellar populations was initially framed as a deterministic one and solved by standard codes. A growing awareness of the spread in the input parameters has boosted the interest in Monte Carlo simulations, whose phenomenological exploration has brought about important insights into the statistical aspect of the problem. The time is ripe now for a further forward step, one that advances the problem from a statistical to a probabilistic formulation. As this evolution takes place, however, it is important to keep in mind that the new formalism reinterprets previous conceptions, rather than overthrowing them, and that it does not supersede the old tools, but instead aims to specify how and when they can be applied. The probabilistic formalism is best seen as a unifying model that includes the old tools and empowers them, in a direction that is becoming imperative for understanding the new observational data.

Acknowledgements
This work has been the effort of many years of reflection on the sampling problems of the IMF. Therefore, there are lots of people that at some moment or other have contributed to this work with comments and suggestions. First of all, we would like to acknowledge Luis Manuel Sarro Baró, whose incisive questions shed light on the problem of distributions in bins. We also acknowledge useful discussions with Marat Gilfanov. Steve Shore, Juan Betancort, Enrique Pérez, and David Valls-Gabaud made useful suggestions on statistics, while Georges Meynet, Daniel Schaerer, Sandro Bressan, Scilla Degl'Innocenti, Enzo Brocato, and Leo Girardi helped us to understand track interpolation issues. We acknowledge Gabriella Raimondo, Rosa Amelia González, and Gustavo Bruzual for several useful conversations on the subject of SBF and Monte Carlo simulations. We thank Alberto Buzzoni for very instructive discussions of synthesis models. The first draft of the paper was greatly improved thanks to the constructive criticism and encouraging comments of an anonymous referee. Joli Adams, our A&A language editor, displayed infinite patience in helping us find just the right word. N.C.L. acknowledges Carlos, Dario, and Eva for providing experimental facts showing that the application of mean values to individual data is a statistical fallacy.
This work was supported by the Spanish Programa Nacional de Astronomía y Astrofísica through the project AYA2004-02703. MC is supported by a Ramón y Cajal fellowship. VL is supported by a CSIC-I3P fellowship.

  
Appendix A: Notation

This section contains an exhaustive summary of the notation used and its rationale.

A.1 Quantities related to stars

li
is the luminosity of an individual object in a discrete sum (Eq. (1)).

$\ell$
is a continuous variable representing the possible luminosity values of individual objects.

$\varphi_L$
$= \varphi_L(\ell;t)$ is the probability density function (PDF) of the variable $\ell$, i.e. the stellar luminosity distribution function (sLDF) (Eq. (2)). It depends on the assumed age (t, which is explicited here as a parameter), but also on the metallicity, the evolutionary tracks adopted, and the star formation history.

$\phi_L$
$= \phi_L(p;t)$ is the characteristic function of $\varphi_L$, defined as its Fourier transform (Eq. (33)).

$\mu'_n$
$= \mu'_n(t)$ is the nth raw moment of the sLDF (Eqs. (27)-(32)).

$\mu_n$
$= \mu_n(t)$ is the nth central moment of the sLDF (Eqs. (28)-(32)). The second central moment is the variance of the sLDF, and its square root, $\sigma$, is the standard deviation of the probability distribution. Note that in the literature the symbol $\sigma$ is often used to represent the standard deviation of the sample, or mean square error, which is a statistical quantity. In this paper, we do not use $\sigma$ with this statistical meaning.

$\kappa_n$
$ = \kappa_n(t)$ is the nth cumulant of the sLDF (Eq. (35)).

$\gamma_n$
$= \gamma_n(t)$ are the skewness ($\gamma_1$) and the kurtosis ($\gamma_2$) of the sLDF (Eqs. (40)-(41)).

A.2 Quantities related to the integrated properties of a stellar population

All these quantities depend on the total number of stars $N_{\rm tot}$ in the population, as well as on the age of the population:

$L_{{\rm tot}}$
is the integrated luminosity of a given individual population (Eq. (1)).

${\cal L}$
$= {\cal L}(t)$ is a continuous variable representing the possible values of the integrated luminosity of stellar populations with  $N_{\rm tot}$ stars (Eq. (5)).

$\varphi_{{\rm L_{tot}}}$
$=\varphi_{{\rm L_{tot}}}({\cal L};t)$ is the probability density function of ${\cal L}$, i.e. the population luminosity distribution function (pLDF). Under suitable assumptions, it can be obtained by convolving the sLDF  $N_{\rm tot}$ times with itself (Sect. 6.2.1).

$\phi_{\rm L_{tot}}$
$= \phi_{\rm L_{tot}}(P;t)$ is the characteristic function of $\varphi_{\rm L_{tot}}({\cal L})$, defined as its Fourier transform (Eq. (44)).

M'n
= M'n(t) is the nth raw moment of the pLDF.

Mn
= Mn(t) is the nth central moment of the pLDF. If n=2, its square root is the variance of the pLDF, denoted as $\sigma({\cal L})$ in this paper.

Kn
= Kn(t) is the nth cumulant of the pLDF. Under suitable assumptions, simple scale relations hold between Kn and $\kappa_n$ (Eq. (46)).

$\Gamma_n$
$= \Gamma_n(t)$ is the skewness ($\Gamma _1$) and the kurtosis ($\Gamma _2$) of the pLDF. Under suitable assumptions, simple scale relations hold between $\Gamma_n$ and $\gamma_n$ (Eqs. (48)-(49)).

A.3 Quantities related to the IMF

$\varphi_M$
$ = \varphi_M(m)$ is the IMF, defined as the probability density function for a star of having an initial mass m (Eq. (8)).

$\varphi'_M$
$ = \varphi'_M(m)$ is a function proportional to  $\varphi_M(m)$: $\varphi'_M(m) = \varphi_M(m) / \langle m \rangle$, where $\langle m \rangle$ is the mean mass of the IMF (Eq. (11)).

wi
$= w_i(m^{\rm low}_i,m^{\rm up}_i;t)$ is the probability that a given star has a mass belonging to the interval $[m^{\rm low}_i,m^{\rm up}_i]$ (Eq. (20)). Note that the interval is defined arbitrarily depending on the computational needs and the characteristic evolutionary time of the population considered, so in most cases it is varied depending on the age of the stellar population.

ni
$= n_i(m^{\rm low}_i,m^{\rm up}_i;t)$ is a random (distributed) variable representing the number of stars in a given mass interval $[m^{\rm low}_i,m^{\rm up}_i]$ (Eqs. (22), (24)). Its mean value is $\langle n_i \rangle = N_{\rm tot} \times w_i$.

References

 

Copyright ESO 2006