Crucial aspects of the initial mass function

M. Cerviño; C. Román-Zúñiga; V. Luridiana; A. Bayo; N. Sánchez; E. Pérez

doi:10.1051/0004-6361/201219504

Volume 553 (May 2013)

A&A, 553 (2013) A31

Issue		A&A Volume 553, May 2013


Article Number		A31
Number of page(s)		14
Section		Galactic structure, stellar clusters and populations
DOI		https://doi.org/10.1051/0004-6361/201219504
Published online		25 April 2013

I. The statistical correlation between the total mass of an ensemble of stars and its most massive star

M. Cerviño¹,², C. Román-Zúñiga³, V. Luridiana²,⁴, A. Bayo⁵,⁶, N. Sánchez⁷ and E. Pérez¹

¹ Instituto de Astrofísica de Andalucía (IAA-CSIC), Glorieta de la Astronomía s/n, 18008 Granada, Spain
e-mail: mcs@iaa.es
² Instituto de Astrofísica de Canarias, c/ vía Láctea s/n, 38205 La Laguna, Tenerife, Spain
³ Instituto de Astronomía, Universidad Académica en Ensenada, Universidad Nacional Autónoma de México, Ensenada BC, 22860 Mexico, Mexico
⁴ Departamento de Astrofísica, Universidad de La Laguna (ULL), 38205 La Laguna, Tenerife, Spain
⁵ European Southern Observatory, Casilla 19001, Santiago 19, Chile
⁶ Max Planck Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany
⁷ S. D. Astronomía y Geodesia, Fac. CC. Matemáticas, Universidad Complutense de Madrid, 28040 Madrid, Spain

Received: 28 April 2012
Accepted: 21 February 2013

Abstract

Context. Our understanding of stellar systems depends on the adopted interpretation of the initial mass function, IMF φ(m). Unfortunately, there is not a common interpretation of the IMF, which leads to different methodologies and diverging analysis of observational data.

Aims. We study the correlation between the most massive star that a cluster would host, m_max, and its total mass into stars, ℳ, as an example where different views of the IMF lead to different results.

Methods. We assume that the IMF is a probability distribution function and analyze the m_max − ℳ correlation within this context. We also examine the meaning of the equation used to derive a theoretical ℳ − $\hat{m}_\mathrm{max}$ relationship, ${\cal N} \times \int_{\hat{m}_\mathrm{max}}^{m_{\rm up}} \phi(m)\,\mathrm{d}m = 1$, with ${\cal N}$ the total number of stars in the system, according to different interpretations of the IMF.

Results. We find that only a probabilistic interpretation of the IMF, where stellar masses are independent and identically distributed random variables, provides a self-consistent result. Neither ℳ nor the total number of stars in the cluster, ${\cal N}$, can be used as IMF scaling factors. In addition, $\hat{m}_\mathrm{max}$ is a characteristic maximum stellar mass in the cluster, but not the actual maximum stellar mass. A ⟨ℳ⟩ − $\hat{m}_\mathrm{max}$ correlation is a natural result of a probabilistic interpretation of the IMF; however, the distribution of observational data in the ${\cal N}$ (or ℳ) − m_max plane includes a dependence on the distribution of the total number of stars, ${\cal N}$ (and ℳ), in the system, $\Phi_{\cal N}({\cal N})$, which is not usually taken into consideration.

Conclusions. We conclude that a random sampling IMF is not in contradiction to a possible m_max − ℳ physical law. However, such a law cannot be obtained from IMF algebraic manipulation or included analytically in the IMF functional form. The possible physical information that would be obtained from the ${\cal N}$ (or ℳ) − m_max correlation is closely linked with the Φ_ℳ(ℳ) and $\Phi_{\cal N}({\cal N})$ distributions; hence it depends on the star formation process and the assumed definition of stellar cluster.

Key words: stars: statistics / Galaxy: stellar content / methods: data analysis

© ESO, 2013

1. Introduction

In recent literature, the term initial mass function (IMF) is used to indicate three different types of distributions: (1) the distribution by number of the stellar masses observed in a particular star ensemble; (2) a normalized version of (1), i.e., the frequency distribution of the stellar masses observed in a particular star ensemble; and (3) the theoretical probability density function φ(m) of the stellar masses that can be formed in a generic star ensemble. In this work, following Scalo (1986), we adopt the third definition and explore some consequences of mixing these definitions.

In the following, we leave distribution (2) out of the discussion and focus, for simplicity, only on distributions (1) and (3)¹. These two distributions are different but closely related to each other, as statistics and probability are. Probability deals with predicting the likelihood of possible events in a system with known properties; statistics consists in analysing the distribution of real events with the aim of determining some unknown property of the system. Probability addresses the direct problem, while statistics addresses the inverse problem. In our case, distribution (3) describes the underlying probability distribution from which stellar masses can be drawn, while distribution (1) describes an actual stellar sample from which we wish, ideally, to recover the parameters of the underlying probability distribution.

The relation between the shape of (1) and the shape of (3) depends crucially on the size of the sample, that is, the number of stars ${\cal N}$; when ${\cal N}$ is large, the two shapes tend to be similar. This similarity can mislead one into believing that (1) is just a scaled-up version of (3), with ${\cal N}$ being the scale factor. This would be very wrong since, as explained above, the physical meanings of the two distributions are intrinsically different. This paper is dedicated to exploring the implications of this difference.

A major drawback of the distribution-by-number view (number (1) above) is that the very definition of a stellar sample necessarily implies some (hidden or explicit) assumption on the star formation (SF) process that originated the sample. For example, an embedded, open, or globular cluster, an OB association, and so on, are coeval and cospatial samples; field stars, which are used to study galaxy structure, are neither coeval nor cospatial; the stars in a galaxy that were born at a given time, which are a sample suitable for stellar population studies, are coeval but not cospatial. These examples make clear that, when a sample is selected, some predefined spatial and time scales are implicitly assumed, and these scales may influence the distribution by number of the stellar masses. Rephrasing Scalo (1986), when talking about the IMF, we are left in the uncomfortable position of having no means to define an empirical sample that corresponds to a consistent definition of IMF and that can be directly related to the theories of SF without introducing major assumptions.

The probability distribution function (PDF) view (number (3) above) is actually an abstraction used to describe the general universe of initial masses that a star would have. This interpretation implies that we have to use a probability framework in order to describe the problem and make inferences from observed data sets. One implicit requirement of such an approach is that the stellar mass is an independent and identically distributed (iid) variable, and therefore, any realization of the IMF is a random sample². Within this framework, all the empirical samples are included naturally insofar as they are particular realizations of the theoretical distribution. Although it is possible to include conditions representing particular SF scenarios, it is generally assumed that the IMF has no memory of the SF event: that is, the SF details have no major impact on the IMF itself, although they can have an impact on the resulting IMF realization once the corresponding conditions are included in the derivation. It is a surprising fact that there is no clear observational evidence that the IMF varies strongly and systematically as a function of different SF scenarios (Bastian et al. 2010).

Throughout this paper, we consider several pieces of work based on a distribution-by-number interpretation of the IMF. The specific way in which the IMF is represented varies depending on the considered paper. Some authors assume that the IMF is a continuous law that returns, for each mass value, the number of stars of that mass; others consider that it returns the number of stars in each mass bin. Some assume that the stars are distributed in a predefined way and the mass of a star depends on the mass of the other stars; others consider that the stars are distributed independently from each other. In the following, we give examples of this and emphasize the differences between the various distribution-by-number interpretations and the PDF view of the IMF.

Naturally, the equations involving the IMF depend on the interpretation of the IMF. More importantly however, the cluster-related quantities inferred from manipulations of the IMF are interpreted differently according to the initial assumptions. One case in which the different views of the IMF lead to dramatically diverging interpretations is the modeling of the correlation between the total stellar mass in a cluster, ℳ, and the mass, m_max, of its most massive star, which we investigate in this series of papers.

There are many facets to the study of the ℳ − m_max correlation. One is the correlation obtained theoretically from manipulations of the IMF functional form, which is the subject of this paper. Another is the inference of ℳ from partial information of the system. The lack of information makes this inference deeply dependent on the IMF interpretation (this aspect is discussed in Cerviño et al. 2013, hereafter Paper II). A third issue is the comparison between theory and observational data. This point also depends on the interpretation of the IMF (and is studied in Jimenez-Donaire et al., in prep., hereafter Paper III).

The structure of the paper is as follows: in Sect. 2 we present our basic framework for a probabilistic interpretation of the IMF. Section 3 is devoted to analyzing in a probabilistic context the meaning of the basic equation commonly used in the literature relating ℳ and m_max. In Sect. 4 we discuss the different methodologies and assumptions used by other authors to obtain a ℳ − m_max correlation. We include a discussion on iid stellar masses and on the connection of the IMF with the SF. Finally, we briefly discuss the composition of different IMFs to obtain an integrated galaxy IMF (IGIMF). Our conclusions are described in Sect. 5.

2. Formal probabilistic formulation

Let us start by framing the problem in a formal probabilistic framework:

1. The IMF, φ(m) = dN/dm, is a PDF that provides the probability of finding a star in a given mass range through its integral over that range. The mass limits of the PDF, m_low and m_up, are given by stellar theory and must fulfill $\int_{m_{\rm low}}^{m_{\rm up}} \phi(m)\,\mathrm{d}m = 1$; that is, we are certain that any possible star has a mass between m_low and m_up. This is the first fundamental difference with respect to the distribution-by-number interpretation: the IMF cannot be arbitrarily normalized to ℳ or ${\cal N}$, since it does not provide numbers of stars with a given mass but the probability for a star to be born with a given mass, independently of how many stars are in the cluster or of the cluster total mass. In this interpretation of the IMF, there is neither an implicit sample nor predefined space or time scales. The IMF so defined may have values larger than one, provided its integral over any mass range does not exceed one. This is the second fundamental difference with respect to the distribution-by-number interpretation when described in terms of frequencies (case 2 in the Introduction), where no value larger than one is possible by construction. In this paper we use the Kroupa IMF (Kroupa 2001, 2002) as used in Weidner & Kroupa (2006)³ and subsequent works, except for the value of m_up, which we set equal to 120 M_⊙. Although a larger value would probably be more realistic according to recent studies (Crowther et al. 2010, see also the contributions to the Up2010 conference published by Treyer et al. 2011), this choice is motivated by the fact that the m_up value of most public stellar tracks used in most m_max estimations is 120 M_⊙. In Fig. 1 we show the φ(m) used in this paper and the probability for a star of having a mass in the range from m to m + 1 M_⊙.
The probability for a random star of having a mass lower than a given value m_a is given by
$$p(m < m_\mathrm{a}) = \int_{m_{\rm low}}^{m_\mathrm{a}} \phi(m)\,\mathrm{d}m, \tag{1}$$
while the probability for a random star of having a mass equal to or larger than m_a is given by
$$p(m \geq m_\mathrm{a}) = \int_{m_\mathrm{a}}^{m_{\rm up}} \phi(m)\,\mathrm{d}m. \tag{2}$$
In this work, the integrals over the IMF will always be read as equal to or larger than the lower limit and lower than the upper limit. The use of lower than instead of equal to or lower than in the upper limit and the complementary in the lower limit is just a convention. However, equal cannot be used simultaneously in both equations: no star can simultaneously belong to two independent intervals. The convention we use implies that the nominal value m_up cannot be formally reached, although values very close to it are possible.

Fig. 1

IMF used in the present work (solid line), as in the parametrization by Kroupa (2001, 2002) and Weidner & Kroupa (2006). Being a PDF, it can have values larger than one; the probabilities are given by the integral over the PDF. We also plot the probability that a star has a mass in the m,m + 1 M_⊙ range, which is lower than one (dashed line). This probability declines rapidly when m is larger than m_up − 1 M_⊙.

2. Different observational scenarios can be described by adding constraints to the IMF. For instance, we may explicitly include the limit imposed on m_max by the total mass of the sample we are analyzing, that is, m_max = min{m_up, ℳ}. In this case, we must define an a posteriori PDF, related to the IMF, that includes such a condition:
$$\phi(m\,|\,m < m_\mathrm{max}) = \frac{\phi(m)\,\mathrm{H}(m_\mathrm{max}-m)}{p(m < m_\mathrm{max})}, \tag{3}$$
where H(m_max − m) is the Heaviside function⁴, which ensures that no star equal to or larger than m_max can be present in the cluster. We note that φ(m|m < m_max) is also a PDF. The mean mass of such distribution is
$$\left< m\,|\,m < m_\mathrm{max} \right> = \frac{\int_{m_{\rm low}}^{m_{\rm up}} m\,\phi(m)\,\mathrm{H}(m_\mathrm{max}-m)\,\mathrm{d}m}{p(m < m_\mathrm{max})}. \tag{4}$$
More elaborate constrained IMFs can be formulated, always keeping in mind that conditions are imposed ad hoc and produce a PDF whose functional form differs from φ(m).
3. The PDF describing ensembles with a total number of stars ${\cal N}$ (formally conditioned to have ${\cal N}$ stars) can be calculated as successive convolutions of the corresponding PDF for one star. For instance, the PDF for the total mass, $\Phi_{\cal M}({\cal M}|{\cal N})$, is the result of convolving the IMF ${\cal N}$ times with itself (see Cerviño & Luridiana 2006; Selman & Melnick 2008):
$$\Phi_{\cal M}({\cal M}|{\cal N}) = \overbrace{\phi(m)\otimes\phi(m)\otimes\,\cdots\,\otimes\phi(m)}^{\cal N}. \tag{5}$$
A property of self-convolution is that simple relations link the mean value and the high-order moments of φ(m) and $\Phi_{\cal M}({\cal M}|{\cal N})$ (see, e.g., Cerviño & Luridiana 2006). As an example, the mean integrated mass of $\Phi_{\cal M}({\cal M}|{\cal N})$, $\left<{\cal M}|{\cal N}\right>$, is related to the mean stellar mass of the IMF, ⟨m⟩, through the relation
$$\left<{\cal M}|{\cal N}\right> = {\cal N}\times\left<m\right> = {\cal N}\times\int_{m_{\rm low}}^{m_{\rm up}} m\,\phi(m)\,\mathrm{d}m. \tag{6}$$
However, we note that $\Phi_{\cal M}({\cal M}|{\cal N}) \neq {\cal N}\times\phi(m)$ and that the actual total mass cannot be obtained, but only an estimate of it. This is the third fundamental difference with the distribution-by-number interpretation, which assumes that for a given ${\cal N}$ there is one, and only one, ℳ value, given by ${\cal M}({\cal N}) = {\cal N}\times\left<m\right>$.
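The three points of this section can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a single power-law IMF with slope 2.35 and m_low = 0.1 M_⊙ in place of the Kroupa IMF used in the paper (only m_up = 120 M_⊙ is taken from the text), and all function names are hypothetical. It evaluates the normalized PDF, the constrained mean mass of Eq. (4), and the total mass of many ensembles of iid stars, which scatters around the mean value of Eq. (6).

```python
import random
import statistics

# ASSUMPTION: single power law standing in for the Kroupa IMF of the paper.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
E = 1.0 - ALPHA


def phi(m):
    """Normalized IMF: a PDF on [M_LOW, M_UP), zero elsewhere."""
    norm = (M_UP**E - M_LOW**E) / E
    return m ** (-ALPHA) / norm if M_LOW <= m < M_UP else 0.0


def mean_mass(m_max=M_UP):
    """<m | m < m_max> of Eq. (4); reduces to <m> of Eq. (6) for m_max = M_UP."""
    e2 = 2.0 - ALPHA
    numerator = (m_max**e2 - M_LOW**e2) / e2
    denominator = (m_max**E - M_LOW**E) / E
    return numerator / denominator


def draw_mass(rng):
    """One iid stellar mass by inverting p(m < m_a) of Eq. (1)."""
    u = rng.random()
    return (M_LOW**E + u * (M_UP**E - M_LOW**E)) ** (1.0 / E)


def total_masses(n_stars, n_clusters, seed=1):
    """Total mass of n_clusters ensembles, each with exactly n_stars stars."""
    rng = random.Random(seed)
    return [sum(draw_mass(rng) for _ in range(n_stars))
            for _ in range(n_clusters)]


if __name__ == "__main__":
    totals = total_masses(100, 2000)
    print("phi(m_low)          =", phi(M_LOW))       # a PDF value above one
    print("N * <m>             =", 100 * mean_mass())
    print("mean of sampled M   =", statistics.mean(totals))
    print("scatter of sampled M =", statistics.stdev(totals))
```

Under these assumed parameters the PDF exceeds one near m_low, and the sampled totals spread widely around ${\cal N}$ × ⟨m⟩ instead of collapsing onto a single value.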

3. Relating the number of stars with the most massive star in the sample

According to the law of large numbers, in a sample of ${\cal N}$ stars drawn from an underlying PDF, φ(m), the typical number of stars N_a with m ≥ m_a is given by $N_\mathrm{a} = {\cal N} \times p(m \geq m_\mathrm{a})$. Particularizing this equation, we can define a characteristic maximum value of m_max, $\hat{m}_\mathrm{max}$, for which there is typically only one star with mass equal to or larger than $\hat{m}_\mathrm{max}$ through
$$1 = {\cal N} \times p(m \geq \hat{m}_\mathrm{max}) = {\cal N} \times \int_{\hat{m}_\mathrm{max}}^{m_{\rm up}} \phi(m)\,\mathrm{d}m. \tag{7}$$
This is the basic equation used by several authors to determine the actual mass of the most massive star in a system (e.g., Elmegreen 1997, 1999, 2000; Kroupa & Weidner 2003; Weidner & Kroupa 2004, 2006). However, we can also obtain a mean value of m_max (Oey & Clarke 2005) or a median value of m_max (Weidner et al. 2010). So the question is: does the definition of the characteristic value $\hat{m}_\mathrm{max}$ indeed provide the actual m_max extreme value or only an estimate of it? And if it is an estimate, what is its exact meaning? Let us seek the answer in a probabilistic context⁵.
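To make Eq. (7) concrete, the sketch below solves it numerically for the characteristic mass. It is an illustration under stated assumptions and not the procedure of any of the cited papers: the IMF is a single power law with slope 2.35 and m_low = 0.1 M_⊙ (only m_up = 120 M_⊙ comes from the text), and the function names are hypothetical. Bisection is used so that the same routine would work for any IMF whose p(m ≥ m_a) can be evaluated.

```python
# ASSUMPTION: single power law standing in for the Kroupa IMF of the paper.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
E = 1.0 - ALPHA


def p_geq(m_a):
    """p(m >= m_a) of Eq. (2), analytic for a single power law."""
    return (M_UP**E - m_a**E) / (M_UP**E - M_LOW**E)


def m_hat_max(n_stars, tol=1e-10):
    """Solve n_stars * p(m >= m_hat) = 1 (Eq. 7) by bisection in log mass."""
    lo, hi = M_LOW, M_UP
    while hi - lo > tol * lo:
        mid = (lo * hi) ** 0.5
        # n_stars * p_geq decreases with mass: if it is still above one,
        # the root lies at higher mass.
        if n_stars * p_geq(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, m_hat_max(n))
```

The characteristic mass grows with the number of stars and approaches m_up without reaching it, consistent with the integration convention adopted in Sect. 2.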

Fig. 2

Distribution of the maximum stellar mass, $\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$, for different values of ${\cal N}$. The circle on each curve is the position of the characteristic value $\hat{m}_\mathrm{max}$.

We consider a set of ${\cal N}$ stars with unknown stellar masses, m_i, drawn from the IMF. For any given mass m_a, the probability of having at least one star with mass m_i equal to or larger than m_a in the sample, ${\cal P}(\exists i \in [1,{\cal N}]\,|\,m_i \geq m_\mathrm{a})$, is the complement of the probability that all stars have a mass lower than m_a, ${\cal P}(m_i < m_\mathrm{a}, \forall i \in [1,{\cal N}])$. Since the stellar masses are iid drawn from the same distribution φ(m), the probability ${\cal P}(m_i < m_\mathrm{a}, \forall i \in [1,{\cal N}])$ is the result of multiplying p(m < m_a) by itself ${\cal N}$ times⁶:
$$\begin{aligned} {\cal P}(m_i < m_\mathrm{a}, \forall i \in [1,{\cal N}]) &= \left[p(m < m_\mathrm{a})\right]^{\cal N} \\ &= \left[1 - p(m \geq m_\mathrm{a})\right]^{\cal N}. \end{aligned} \tag{8}$$
Thus,
$$\begin{aligned} {\cal P}(\exists i \in [1,{\cal N}]\,|\,m_i \geq m_\mathrm{a}) &= 1 - {\cal P}(m_j < m_\mathrm{a}, \forall j \in [1,{\cal N}]) \\ &= 1 - \left[1 - p(m \geq m_\mathrm{a})\right]^{\cal N}. \end{aligned} \tag{9}$$
This relation is valid for any value of m_a and any distribution function.

If we now set $m_\mathrm{a} = \hat{m}_\mathrm{max}$, we can replace $p(m \geq \hat{m}_\mathrm{max})$ in Eq. (9) by $1/{\cal N}$ by virtue of the $\hat{m}_\mathrm{max}$ definition. The probability that there is at least one star with $m \geq \hat{m}_\mathrm{max}$ in a sample of ${\cal N}$ stars is thus given by
$${\cal P}(\exists i \in [1,{\cal N}]\,|\,m_i \geq \hat{m}_\mathrm{max}) = 1 - \left[1 - \frac{1}{\cal N}\right]^{\cal N}, \tag{10}$$
which has an asymptotic value 1 − 1/e ≈ 0.63 for large ${\cal N}$ values, with 0.63 being a reasonable approximation for, say, ${\cal N} > 100$. Hence, the characteristic mass, $\hat{m}_\mathrm{max}$, obtained by solving Eq. (7) is the value of m that is not reached or exceeded⁷ with a probability 0.37 in a sample of ${\cal N}$ stars. This means that in a large enough set of clusters, all of them with ${\cal N}$ stars, typically in 63% of the clusters the mass of the most massive star will be equal to or larger than $\hat{m}_\mathrm{max}$, while in 37% of the clusters it will be lower than $\hat{m}_\mathrm{max}$. So the $\hat{m}_\mathrm{max}$ value obtained in Eq. (7) does not provide the mass m_max of the most massive star in a cluster of ${\cal N}$ stars, contrary to what is stated in several astrophysical papers⁸.
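The 63%/37% split of Eq. (10) can be checked with a small Monte Carlo experiment. This is a sketch under assumptions, not a reproduction of the paper's calculations: it uses a single power-law IMF with slope 2.35 and m_low = 0.1 M_⊙ (only m_up = 120 M_⊙ is from the text), and the helper names are hypothetical. Since Eq. (10) holds for any distribution, the choice of IMF affects only the value of the characteristic mass, not the fraction.

```python
import random

# ASSUMPTION: single power law standing in for the Kroupa IMF of the paper.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
E = 1.0 - ALPHA


def inverse_cdf(u):
    """Mass m_a such that p(m < m_a) = u, from Eq. (1)."""
    return (M_LOW**E + u * (M_UP**E - M_LOW**E)) ** (1.0 / E)


def m_hat_max(n_stars):
    """Eq. (7): p(m >= m_hat) = 1/N, i.e. p(m < m_hat) = 1 - 1/N."""
    return inverse_cdf(1.0 - 1.0 / n_stars)


def prob_reach_m_hat(n_stars):
    """Eq. (10): probability that at least one star has m >= m_hat."""
    return 1.0 - (1.0 - 1.0 / n_stars) ** n_stars


def fraction_reaching(n_stars, n_clusters, seed=7):
    """Fraction of simulated clusters whose most massive star has m >= m_hat."""
    rng = random.Random(seed)
    m_hat = m_hat_max(n_stars)
    hits = 0
    for _ in range(n_clusters):
        m_max = max(inverse_cdf(rng.random()) for _ in range(n_stars))
        hits += m_max >= m_hat
    return hits / n_clusters


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, prob_reach_m_hat(n))
    print("simulated fraction, N = 100:", fraction_reaching(100, 4000))
```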

Actually, for any possible value $\hat{m}_\mathrm{max}$ lower than m_up that we would use as a proxy of the actual value of m_max, there is a probability larger than 90% that the most massive star in the system is more massive than such $\hat{m}_\mathrm{max}$ value (see Appendix A for details).

3.1. The PDF of m_max for a known ${\cal N}$, $\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$

Fig. 3

Percentile analysis around the median of $\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$ as a function of ${\cal N}$ (shaded areas). The figure includes as a reference the position of the characteristic value, median, mean, and mode of the distribution. Small triangles: compilation by Weidner et al. (2010) of observational values of m_max and inferred values of ${\cal N}$ obtained from observations; squares: observed values of ${\cal N}$ and m_max from Kirk & Myers (2011); stars: observed values of ${\cal N}$ and m_max in the field for the four observed regions from Kirk & Myers (2011).

Fig. 4

Confidence interval analysis of $\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$ as a function of ${\cal N}$ (shaded area). Lines and symbols have the same meaning as in Fig. 3.

Actually, there is no unique value of m_max for a total number of stars ${\cal N}$, but the possible values of m_max are distributed following the probability function
$$\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N}) = {\cal N}\,\phi(m_\mathrm{max})\,p(m < m_\mathrm{max})^{{\cal N}-1} \tag{11}$$
$$\phantom{\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})} = {\cal N}\,\phi(m_\mathrm{max})\left(\int_{m_{\rm low}}^{m_\mathrm{max}} \phi(m)\,\mathrm{d}m\right)^{{\cal N}-1}, \tag{12}$$
as deduced by Gumbel (1958), Sornette (2004), van Albada (1968), Oey & Clarke (2005), Maschberger & Clarke (2008), Pflamm-Altenburg & Kroupa (2008), among others.
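The distribution of m_max above can be evaluated directly once φ(m) and p(m < m_a) are available. The sketch below is illustrative only: it assumes a single power-law IMF with slope 2.35 and m_low = 0.1 M_⊙ (only m_up = 120 M_⊙ is from the text), and the function names are hypothetical. It uses the fact that the cumulative distribution of the maximum of ${\cal N}$ iid draws is p(m < m_max) raised to the power ${\cal N}$, whose derivative gives the PDF of m_max, and it locates quantiles such as the median analytically.

```python
import math

# ASSUMPTION: single power law standing in for the Kroupa IMF of the paper.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
E = 1.0 - ALPHA


def phi(m):
    """Normalized IMF on [M_LOW, M_UP)."""
    norm = (M_UP**E - M_LOW**E) / E
    return m ** (-ALPHA) / norm if M_LOW <= m < M_UP else 0.0


def p_less(m_a):
    """p(m < m_a) of Eq. (1)."""
    return (m_a**E - M_LOW**E) / (M_UP**E - M_LOW**E)


def inverse_cdf(u):
    """Mass m_a such that p(m < m_a) = u."""
    return (M_LOW**E + u * (M_UP**E - M_LOW**E)) ** (1.0 / E)


def pdf_m_max(m_max, n_stars):
    """PDF of the most massive star among n_stars iid draws."""
    return n_stars * phi(m_max) * p_less(m_max) ** (n_stars - 1)


def cdf_m_max(m_max, n_stars):
    """Probability that all n_stars stars are less massive than m_max."""
    return p_less(m_max) ** n_stars


def quantile_m_max(q, n_stars):
    """m_max value below which a fraction q of the clusters fall."""
    return inverse_cdf(q ** (1.0 / n_stars))


def m_hat_max(n_stars):
    """Characteristic mass of Eq. (7)."""
    return inverse_cdf(1.0 - 1.0 / n_stars)


def integral_of_pdf(n_stars, steps=20000):
    """Midpoint-rule integral of pdf_m_max in log mass; should be close to 1."""
    h = math.log(M_UP / M_LOW) / steps
    total = 0.0
    for i in range(steps):
        m = M_LOW * math.exp((i + 0.5) * h)
        total += pdf_m_max(m, n_stars) * m * h
    return total


if __name__ == "__main__":
    n = 100
    print("integral of the PDF:", integral_of_pdf(n))
    print("m_hat, median:", m_hat_max(n), quantile_m_max(0.5, n))
    print("CDF at m_hat:", cdf_m_max(m_hat_max(n), n))
```

The cumulative probability at the characteristic mass equals (1 − 1/${\cal N}$) raised to the power ${\cal N}$, i.e. about 0.37 for large ${\cal N}$, so the characteristic mass always lies below the median.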

In Fig. 2 we show the distribution $\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$ for different values of ${\cal N}$. The circle on each PDF corresponds to the position of the characteristic value $\hat{m}_\mathrm{max}$, which divides the PDF into two areas: the left one containing 37% of the probability and the right one containing 63% of the probability. We note that $\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$ is highly asymmetrical. Given the shape of the distribution, it cannot be described only by its parameters (mean, variance, and so on); we must consider the whole distribution for any comparison with the observational data. This can be done in two ways: by a percentile analysis (analysis around the median) and by a confidence interval analysis around the mode⁹ (the maximum value of the distribution, which is related to the most common value obtained in a set of observations).

Figure 3 shows a percentile analysis of the distribution. The figure also includes the position of the mean, mode, and characteristic values of the distribution for reference. The position of the mean, $\left<m_\mathrm{max}|{\cal N}\right>$, mostly falls between the 63rd and 84th percentiles, i.e., far from the median of the distribution. On the other hand, $\hat{m}_\mathrm{max}$ corresponds, as predicted, to the 37th percentile. Finally, the mode of the distribution lies in the lowest percentile range. The figure also shows the (m_max, ${\cal N}$) values compiled by Weidner et al. (2010), in which m_max is determined from observations and ${\cal N}$ is inferred from star counting in a given mass range¹⁰. It also shows the data from Kirk & Myers (2011), who quote the observed masses of individual stars of 14 young stellar groups in four different regions (m_max, ${\cal N}$, and ℳ were obtained from their tabulated data). We also show the corresponding m_max and ${\cal N}$ values of field stars in each region analyzed by Kirk & Myers (2011), which are in agreement with the general trend of the correlation.

The confidence interval around the mode analysis takes into account the distribution shape and the range of probability of any region in the diagram. This is done by sorting the contributions to the probability in decreasing order and finding the m_max range that contains some specified amount of probability. Different confidence intervals are obtained by adding the sorted probabilities, taking into account their associated m_max values. This methodology is extensively used in the analysis of redshifts in photometric surveys (see Fernández-Soto et al. 2002, for more details). The situation is illustrated in Fig. 4, which includes the 90, 68, and 26% confidence intervals.
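The sorting procedure described above can be sketched as follows. This is a hedged illustration and not the implementation of Fernández-Soto et al. (2002) or of the present paper: it assumes a single power-law IMF with slope 2.35 and m_low = 0.1 M_⊙ (only m_up = 120 M_⊙ is from the text), a uniform grid in m_max, and hypothetical function names. Because the distribution of m_max is single-peaked for this assumed IMF, the selected cells form one contiguous interval, so only its two edges are returned.

```python
# ASSUMPTION: single power law standing in for the Kroupa IMF of the paper.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
E = 1.0 - ALPHA


def phi(m):
    """Normalized IMF on [M_LOW, M_UP)."""
    norm = (M_UP**E - M_LOW**E) / E
    return m ** (-ALPHA) / norm if M_LOW <= m < M_UP else 0.0


def p_less(m_a):
    """p(m < m_a) of Eq. (1)."""
    return (m_a**E - M_LOW**E) / (M_UP**E - M_LOW**E)


def pdf_m_max(m_max, n_stars):
    """PDF of the most massive star among n_stars iid draws."""
    return n_stars * phi(m_max) * p_less(m_max) ** (n_stars - 1)


def mode_interval(n_stars, level, n_cells=20000):
    """Smallest m_max range around the mode holding `level` of the probability."""
    dm = (M_UP - M_LOW) / n_cells
    centers = [M_LOW + (i + 0.5) * dm for i in range(n_cells)]
    weights = [pdf_m_max(m, n_stars) * dm for m in centers]
    total = sum(weights)
    # Sort the cells by decreasing probability, then add them up until the
    # requested amount of probability has been accumulated.
    order = sorted(range(n_cells), key=lambda i: weights[i], reverse=True)
    accumulated, chosen = 0.0, []
    for i in order:
        accumulated += weights[i] / total
        chosen.append(centers[i])
        if accumulated >= level:
            break
    return min(chosen), max(chosen)


if __name__ == "__main__":
    for level in (0.68, 0.90):
        print(level, mode_interval(100, level))
```

Since the cells are always added in the same order, an interval at a higher confidence level contains every interval at a lower level.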

3.2. The PDF of ${\cal N}$ for a known m_max, $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$

Fig. 5

Confidence interval analysis of $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$ as a function of m_max for $\Phi_{\cal N}({\cal N}) = \mathrm{constant}$. Symbols have the same meaning as in Fig. 3.

Fig. 6

Confidence interval analysis of $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$ as a function of m_max for $\Phi_{\cal N}({\cal N}) \propto {\cal N}^{-2}$. Arrows: data points by Weidner et al. (2010) using ${\cal N}_\mathrm{obs}$ without correction of incompleteness due to unobserved stars. Other symbols have the same meaning as in Fig. 3.

In Sect. 3.1 we discussed the estimation of m_max, given the number of stars ${\cal N}$. Alternatively, we can also investigate the opposite case, the estimation of ${\cal N}$ from a known m_max (that is, the determination of the $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$ distribution). To address this problem, we can use Bayes' theorem:
$$\Phi_{\cal N}({\cal N}|m_\mathrm{max}) = \frac{\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})\,\Phi_{\cal N}({\cal N})}{\int \Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})\,\Phi_{\cal N}({\cal N})\,\mathrm{d}{\cal N}}. \tag{13}$$
We know all terms on the right-hand side of this equation except $\Phi_{\cal N}({\cal N})$, which is the probability of having a system with a given total number of stars, i.e., an initial number-of-stars-per-cluster function (an initial cluster number function, ICNF). If $\Phi_{\cal N}({\cal N})$ is a power-law distribution in a similar fashion to the initial cluster mass function (ICMF), $\Phi_{\cal N}({\cal N}) = A\,{\cal N}^{-\beta}$ with A a normalization value, we find
$$\Phi_{\cal N}({\cal N}|m_\mathrm{max}) = A'\,p(m < m_\mathrm{max})^{{\cal N}-1}\,{\cal N}^{1-\beta}, \tag{14}$$
where A′ is a normalization value that includes A.

The mode of $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$, ${\cal N}^\mathrm{mode}$, is obtained by setting its first derivative with respect to ${\cal N}$ to zero, which yields¹¹
$${\cal N}^\mathrm{mode} \approx \frac{\beta-1}{\ln p(m < m_\mathrm{max})}. \tag{15}$$
This equation has an acceptable solution only for β < 1; in particular, for a flat distribution of ${\cal N}$ (i.e., β = 0) the result is approximately 1/p(m ≥ m_max). This justifies the name of $\hat{m}_\mathrm{max}$ as the characteristic value, since it provides ${\cal N}^\mathrm{mode}$ as a function of the most extreme value of the distribution under the hypothesis of a flat $\Phi_{\cal N}({\cal N})$¹². In Fig. 5 we plot the confidence intervals of the $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$ distribution as a function of m_max. We note that the axes of the plot have changed with respect to the figures in the previous section, since m_max is now the variate. We also plot the data points from Weidner et al. (2010) and Kirk & Myers (2011).
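The behavior of Eqs. (14) and (15) can be checked numerically. The sketch below is illustrative and rests on assumptions: a single power-law IMF with slope 2.35 and m_low = 0.1 M_⊙ (only m_up = 120 M_⊙ is from the text), ${\cal N}$ treated as a continuous variable, and hypothetical function names. It evaluates the logarithm of Eq. (14) up to the constant A′, whose derivative with respect to ${\cal N}$ is ln p(m < m_max) + (1 − β)/${\cal N}$; setting this to zero reproduces Eq. (15).

```python
import math

# ASSUMPTION: single power law standing in for the Kroupa IMF of the paper.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
E = 1.0 - ALPHA


def p_less(m_a):
    """p(m < m_a) of Eq. (1)."""
    return (m_a**E - M_LOW**E) / (M_UP**E - M_LOW**E)


def p_geq(m_a):
    """p(m >= m_a) of Eq. (2)."""
    return 1.0 - p_less(m_a)


def log_posterior(n_stars, m_max, beta):
    """ln of Eq. (14) without the constant A': (N - 1) ln p + (1 - beta) ln N."""
    return (n_stars - 1.0) * math.log(p_less(m_max)) + (1.0 - beta) * math.log(n_stars)


def n_mode(m_max, beta):
    """Eq. (15); it gives a positive number of stars only for beta < 1."""
    if beta >= 1.0:
        raise ValueError("no interior mode for beta >= 1: Eq. (14) decreases with N")
    return (beta - 1.0) / math.log(p_less(m_max))


if __name__ == "__main__":
    m_max = 20.0
    print("flat ICNF, N_mode      :", n_mode(m_max, 0.0))
    print("1 / p(m >= m_max)      :", 1.0 / p_geq(m_max))
    # For beta = 2 the posterior only decreases with the number of stars.
    print(log_posterior(10, m_max, 2.0) > log_posterior(100, m_max, 2.0))
```

For a flat ICNF the mode agrees with 1/p(m ≥ m_max) to within the first-order expansion of the logarithm, while for a steep ICNF the most probable ${\cal N}$ moves to the lower limit of the ICNF.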

However, Eq. (15) results in a negative value without astrophysical meaning if the ICNF is similar to the ICMF; in that case $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$ is a decreasing function for all ${\cal N}$, and the most probable ${\cal N}$ corresponds to the maximum of $\Phi_{\cal N}({\cal N})$, i.e., the lower limit of the $\Phi_{\cal N}({\cal N})$ distribution. Hence, $\Phi_{\cal N}({\cal N})$ modifies the confidence interval analysis of $\Phi_{\cal N}({\cal N}|m_\mathrm{max})$, as shown in Fig. 6.

It seems surprising that, depending on the independent variable used (m_max or ${\cal N}$), one has to take into account $\Phi_{\cal N}({\cal N})$. Where is the $\Phi_{\cal N}({\cal N})$ dependence in Figs. 3 and 4? Actually, we must be aware that Figs. 3−6 are not representations of $\Phi_{{\cal N},m_\mathrm{max}}({\cal N}, m_\mathrm{max})$, which would be the one to be compared with observational data. Instead, they are a representation of the probability for fixed values on the x-axis, i.e., the figures can only be interpreted by taking vertical (discrete or infinitesimal) slices. Hence, for comparison with data, the x-axis in Figs. 3 and 4 must be weighted by $\Phi_{\cal N}({\cal N})$, and the x-axis in Figs. 5 and 6 must be weighted by φ(m). Obviously, such a weighting process changes the probability density in the ${\cal N} - m_\mathrm{max}$ plane.
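The weighting described above can be written out explicitly. The following is a sketch based only on the product rule of probability and on the conditional distribution of m_max given ${\cal N}$ from Sect. 3.1; the second line additionally assumes the power-law ICNF, $\Phi_{\cal N}({\cal N}) = A\,{\cal N}^{-\beta}$, introduced in this section.

```latex
\begin{align}
\Phi_{{\cal N},m_\mathrm{max}}({\cal N}, m_\mathrm{max})
  &= \Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})\,\Phi_{\cal N}({\cal N})
   = {\cal N}\,\phi(m_\mathrm{max})\,p(m < m_\mathrm{max})^{{\cal N}-1}\,\Phi_{\cal N}({\cal N}) \\
  &= A\,{\cal N}^{1-\beta}\,\phi(m_\mathrm{max})\,p(m < m_\mathrm{max})^{{\cal N}-1}
  \qquad \text{(power-law ICNF assumed)}
\end{align}
```

A vertical slice at fixed ${\cal N}$ recovers the conditional distribution of m_max, while a slice at fixed m_max, once renormalized, recovers Eq. (14).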

3.3. Which information does the ${\cal N}$ (or ℳ) − m_max plane contain?

All the quantities considered here, m_max, $\hbox{${\cal N}$}$ , and ℳ, have their own distributions, φ(m), $\hbox{$\Phi_{\cal N}({\cal N})$}$ , and Φ_ℳ(ℳ). So, any uncertainty of data points in the $\hbox{${\cal N} ~ (\mathrm{or}~{\cal M})-m_\mathrm{max}$}$ plane would be minimized or amplified by such distributions, and neither $\hbox{$\Phi_{m_\mathrm{max}}( m_\mathrm{max} | {\cal N} )$}$ nor $\hbox{$\Phi_{\cal N}({\cal N} | m_\mathrm{max})$}$ (or their ℳ counterparts) are suitable descriptions. The only suitable distribution of data points is given by $\hbox{$\Phi_{m_\mathrm{max},{\cal N}}( m_\mathrm{max}, {\cal N} )$}$ ¹³ (or their ℳ counterpart, see below). This PDF is shown in Fig. 7 for the case of a $\hbox{$\Phi_{\cal N}({\cal N}) \propto {\cal N}^{-2}$}$ . However, the use of $\hbox{$\Phi_{m_\mathrm{max},{\cal N}}( m_\mathrm{max}, {\cal N} )$}$ imposes some important caveats.

The first of these caveats affects any test on the $\hbox{${\cal N} ~(\mathrm{or}~{\cal M})-m_\mathrm{max}$}$ correlation. Such a test can only be done at the distribution level and not through a data-point-by-data-point analysis. This means that we need a quantitative characterization of the uncertainty associated with each data point and must combine the corresponding uncertainties to obtain a density map in the $\hbox{${\cal N} ~(\mathrm{or}~{\cal M})-m_\mathrm{max}$}$ plane.

The second caveat refers to the plane to be used: $\hbox{${\cal N} -m_\mathrm{max}$}$ or ℳ − m_max? It includes two different aspects. The first is that any ℳ inference implicitly includes an $\hbox{$\cal N$}$ inference, and in most of the cases (all where ⟨m⟩ is used), it is actually an $\hbox{$\cal N$}$ inference itself but expressed as ⟨ℳ⟩ (i.e., the plane to be used is actually $\hbox{${\cal N} -m_\mathrm{max}$}$ ). The second aspect is that the distribution of data points in the $\hbox{${\cal N} -m_\mathrm{max}$}$ plane includes φ(m) and $\hbox{$\Phi_{\cal N}({\cal N})$}$ and the distribution of data points in the ℳ − m_max plane also includes Φ_ℳ(ℳ). This means that some hypothesis about the relation between $\hbox{$\cal N$}$ and ℳ is always required when the ℳ − m_max plane is used.
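
To illustrate how data points would populate the $\hbox{${\cal N} -m_\mathrm{max}$}$ plane under the joint distribution, the following sketch draws ($\hbox{$\cal N$}$, m_max) pairs by first sampling $\hbox{$\cal N$}$ and then sampling the maximum of $\hbox{$\cal N$}$ iid masses, whose cumulative distribution is the IMF cumulative distribution raised to the power $\hbox{$\cal N$}$. Everything specific here is an assumption made for illustration: a single power-law IMF with slope 2.35, the mass limits, the range of $\hbox{$\cal N$}$, the continuous approximation of the discrete $\hbox{$\Phi_{\cal N}({\cal N}) \propto {\cal N}^{-2}$}$ law, and all function names.

```python
import random

ALPHA = 2.35                 # assumed single power-law IMF slope (illustrative)
M_LOW, M_UP = 0.1, 120.0     # assumed stellar mass limits, in solar masses
BETA = 2.0                   # Phi_N(N) ∝ N**(-BETA)
N_MIN, N_MAX = 10, 100_000   # assumed range of cluster sizes


def draw_n(rng: random.Random) -> int:
    """Draw N from a continuous N**(-BETA) law (BETA != 1), rounded down."""
    a, b = N_MIN ** (1.0 - BETA), N_MAX ** (1.0 - BETA)
    n = (a + rng.random() * (b - a)) ** (1.0 / (1.0 - BETA))
    return max(N_MIN, int(n))


def draw_mmax(n: int, rng: random.Random) -> float:
    """Draw the largest of n iid masses; its CDF is F(m)**n."""
    u = rng.random() ** (1.0 / n)   # value of the IMF CDF at m_max
    a, b = M_LOW ** (1.0 - ALPHA), M_UP ** (1.0 - ALPHA)
    return (a - u * (a - b)) ** (1.0 / (1.0 - ALPHA))


def sample_plane(size: int, seed: int = 0) -> list[tuple[int, float]]:
    """Return (N, m_max) pairs distributed as Phi(m_max | N) * Phi_N(N)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(size):
        n = draw_n(rng)
        pairs.append((n, draw_mmax(n, rng)))
    return pairs


pairs = sample_plane(5)
print(pairs)
```

A two-dimensional histogram of many such pairs would approximate the density in the plane for these assumed inputs; changing `BETA` changes that density even though the conditional distribution of m_max at fixed $\hbox{$\cal N$}$ is untouched.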

We conclude this section with a brief discussion about the falsification of the random sampling of the IMF claimed by Weidner et al. (2010) in view of the results presented here, that is, the dependence on $\hbox{$\Phi_{\cal N}({\cal N})$}$ and Φ_ℳ(ℳ) in the distribution of data points in the $\hbox{${\cal N} ~(\mathrm{or}~{\cal M})-m_\mathrm{max}$}$ plane.

First, random sampling is an axiom in statistics and probability. It is not a hypothesis. Statistical tests evaluate the compatibility of a hypothetical distribution with a given sample. There can be two main reasons for the incompatibility of both entities: (a) the assumed distributions are not a correct representation of the sample; (b) the sample is biased or not randomly chosen. In the present case, the hypothesized distributions are the IMF, the ICNF, and the ICMF, where the ICMF and the ICNF are linked non-trivially by Eq. (5). We may assume a universal IMF, but we still need an ICMF (or ICNF) characterization. The very definition of the ICMF (or ICNF) leads to an uncomfortable situation similar to the case of the IMF: we have no means of defining an empirical sample that can be directly related to SF theories without introducing a major assumption, that is, the cluster definition. Can a single star be considered as a valid cluster? How do we define a single cluster formation event in a giant molecular cloud? Is there a difference between the ICMF defined over a random set of clusters and the one defined over a group of clusters that would have a common origin in a large-scale star-forming event?

Hence, the results obtained by Weidner et al. (2010) can be interpreted in different ways:

The clusters in the sample do not follow the assumed IMF.
The clusters in the sample do not follow the assumptions about the ICMF or ICNF.
The sample is biased due to selection effects (including the definition of what a cluster is).
The sample is incomplete, so no conclusions about the preceding items can be obtained.

We will discuss these issues in more detail in Papers II and III.

Fig. 7

3D representation of $\hbox{$\log \Phi_{m_\mathrm{max},{\cal N}}(m_\mathrm{max}, {\cal N})$}$ distribution for a $\hbox{$\Phi_{\cal N}({\cal N}) \propto {\cal N}^{-2}$}$ .

4. Discussion

In the previous sections we have established the formal probabilistic interpretation of the IMF and the propagation of this interpretation in the correlation between m_max and $\hbox{$\cal N$}$. We can now explore the implications of such an interpretation and (a) compare it with the implications of competing interpretations (Sect. 4.1); and (b) discuss the random-sampling assumption of this work and its implications for the relation between the IMF and the SF (Sect. 4.2).

4.1. Literature on the ℳ − m_max and the $\hbox{${\cal N}-\textit{m}_{\mathsf{max}}$}$ correlations

There are copious studies related to the existence and modeling of a ℳ − m_max correlation (for instance, Reddish 1978; Larson 1982; Vanbeveren 1982; García-Vargas & Díaz 1994; García-Vargas et al. 1995; Elmegreen 1997, 1999, 2000; Larson 2003; Kroupa & Weidner 2003; Weidner & Kroupa 2004; Oey & Clarke 2005; Weidner & Kroupa 2006; Parker & Goodwin 2007; Selman & Melnick 2008; Maschberger & Clarke 2008; Weidner et al. 2010; Kroupa et al. 2011). Some of these articles give an explicit formulation of this relation, while others propose that it is a physical relation that links both quantities. Others even argue that the relation is not physical but only an effect of the size of samples. As we will see, the difference among the various ℳ − m_max relationships and their meaning does not depend on the relation itself, but rather on how each author interprets the IMF.

One common assumption is that the $\hbox{${\cal N}-m_\mathrm{max}$}$ and the ℳ − m_max correlations are theoretically equivalent. With this idea in mind, the first correlation is preferred by Selman & Melnick (2008) and Maschberger & Clarke (2008), who argue that $\hbox{$\cal N$}$ is the natural independent variable for testing the random-sampling hypothesis. The second one is preferred by Weidner et al. (2010) because, with the two quantities inferred, the possible error in $\hbox{$\cal N$}$ is larger than the error in ℳ. Only a few authors (Selman & Melnick 2008) explore the question of whether they are indeed formally equivalent or not. As we have seen previously, in a probabilistic framework they are not equivalent (cf. Eq. (5)).

4.1.1. The IMF as an exact analytical law

Fig. 8

ℳ − m_max relationship resulting from the analytical formulation of the IMF of García-Vargas & Díaz (1994); García-Vargas et al. (1995). The figure includes data points from Weidner et al. (2010) and Kirk & Myers (2011), where symbols have the same meaning as in Fig. 3 and the result of two linear fits to the data from Weidner et al. (2010) and Kirk & Myers (2011) using either log ℳ or log m_max as the independent variable.

Let us consider the case of García-Vargas & Díaz (1994) and García-Vargas et al. (1995) as an example of this interpretation. They assume that the IMF is not a probability distribution but an exact analytical law, φ_GV(m) = k(ℳ) × φ(m), where k(ℳ) is a renormalization constant that, because ℳ is the exact value of the amount of gas transformed into stars, verifies

$\begin{equation} {\cal M} = \int_{m_{\rm low}}^{m_{\rm up}} m \, \phi_{\mathrm{GV}}(m) \, \mathrm{d}m = k({\cal M}) \int_{m_{\rm low}}^{m_{\rm up}} m \, \phi(m) \, \mathrm{d}m, \label{eq:MGV} \end{equation}$ (16) where φ(m) is the standard functional form of the IMF. The exact number of stars with mass m_a in the cluster is given by N_a = φ_GV(m_a), which implies that $\hbox{$k({\cal M})={\cal N}$}$. Taking into account that stars are discrete entities, they propose a scenario in which only the stellar masses that verify φ_GV(m) ≥ 1 represent acceptable physical solutions (the so-called richness effect). Given that φ_GV(m) decreases with m, the most massive star in the cluster is the one that verifies $\begin{equation} \phi_{\mathrm{GV}}(m_\mathrm{max}) = {\cal N} \times \phi(m_\mathrm{max}) = 1. \label{eq:NGV} \end{equation}$ (17) For a power-law IMF, φ(m) = A m^{−α}, this leads to a ℳ − m_max relationship of the form $\begin{equation} m_\mathrm{max} \propto {\cal M}^\frac{1}{\alpha}. \label{eq:MmGV} \end{equation}$ (18) According to the scenario proposed, the cluster forms stars in a sorted way, in which the stars with an associated larger value of φ_GV(m) take precedence over stars with associated lower values of φ_GV(m). So, the most massive star (the one with the lowest φ_GV(m_max) value) is conditional on the formation of a large enough number of lower mass stars (the richness effect). Stated otherwise, the mass of this most massive star is determined by the amount of gas that remains after all possible lower mass stars have been formed with relative numbers established by the IMF. We note that the relevant point here is that there must be a certain amount of mass transformed into stars with mass m < m_a in order to have a star with mass m_a.
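
As a worked illustration of Eq. (17), the sketch below solves $\hbox{${\cal N} \times \phi(m_\mathrm{max}) = 1$}$ for a single power law normalized to unit integral over [m_low, m_up]. The slope 2.35, the mass limits, and the function name are assumptions chosen for illustration; they are not the functional form of the IMF used for Fig. 8.

```python
def richness_mmax(n_stars: float, alpha: float = 2.35,
                  m_low: float = 0.1, m_up: float = 120.0) -> float:
    """Solve N * phi(m_max) = 1 (Eq. 17) for phi(m) = A * m**(-alpha).

    A normalizes phi to unit integral over [m_low, m_up]; alpha must differ
    from 1. The result is not clipped and may exceed m_up for very large N.
    """
    norm = (alpha - 1.0) / (m_low ** (1.0 - alpha) - m_up ** (1.0 - alpha))
    return (n_stars * norm) ** (1.0 / alpha)


# A tenfold increase in N multiplies m_max by 10**(1/alpha), as in Eq. (18).
ratio = richness_mmax(1.0e4) / richness_mmax(1.0e3)
print(ratio, 10.0 ** (1.0 / 2.35))
```

Since ℳ is proportional to $\hbox{$\cal N$}$ in this interpretation, the same exponent 1/α carries over to the ℳ − m_max relationship.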

A similar ℳ_cloud − m_max relationship is found by Larson (1982, 2003). However, Larson’s results come from fitting the observational data of cloud masses, ℳ_cloud, with respect to m_max, and they are quoted as a statistical correlation, not a physical law. We note that a correlation between ℳ_cloud and m_max does not imply the same correlation between ℳ and m_max, since an efficiency factor is required (see Shadmehri & Elmegreen 2011, for a more detailed discussion).

In Fig. 8 we show the resulting ℳ − m_max relationship under these assumptions on the IMF and assuming the functional form of the IMF used in this work. The figure includes data points from Weidner et al. (2010) and Kirk & Myers (2011). We have included the result of two linear fits to the data from Weidner et al. (2010) and Kirk & Myers (2011) using either log ℳ or log m_max as the independent variable. The theoretical relation is offset toward larger log ℳ values with respect to the data.

This interpretation of the IMF stems from stellar counting procedures. Since φ_GV(m) is a continuous function, it cannot return a natural number N_a for every mass value m_a; because stars are discrete entities, this approach can only be an approximate description. This alone is sufficient to invalidate Eq. (17) as a way to obtain the actual most massive star, since $\hbox{$\cal N$}$ may (unphysically) turn out to be a non-natural number. As a consequence, this equation can only provide an approximation.

This situation implies that continuous functional forms of the IMF can only be directly related to the number of stars within a given mass interval, and not to the number of stars with a given mass. This possibility is explored in the next interpretation case.

4.1.2. The IMF as a distribution of the number of stars

One alternative view of the IMF is that it can be arbitrarily normalized and provide the exact number of stars in a given mass range. This is the case assumed by Reddish (1978), Vanbeveren (1982), Elmegreen (1997, 1999, 2000), Kroupa & Weidner (2003), Weidner & Kroupa (2004), Elmegreen (2006), Weidner & Kroupa (2006), Weidner et al. (2010) and Kroupa et al. (2011). We refer to these articles as those that use the IMF de facto as a distribution of the number of stars. Their interpretation is that the number of stars between m_a and m_b, with m_a < m_b, is given by $\begin{equation} N (m \in [m_\mathrm{a},m_\mathrm{b}])= \int_{m_\mathrm{a}}^{m_\mathrm{b}} \, \phi_{\mathrm{Elm}}(m) \, \mathrm{d}m, \label{eq:NElm} \end{equation}$ (19) where φ_Elm(m) = k × φ(m) with k a normalization constant. This equation is the general case of Eq. (7), that is, the definition of $\hbox{$\hat{m}_\mathrm{max}$}$, described above. The difference with the previous case is that the total number of stars in the cluster is now given by $\begin{equation} {\cal N} = \int_{m_{\rm low}}^{m_{\rm up}} \phi_{\mathrm{Elm}}(m) \, \mathrm{d}m, \label{eq:NtotElm} \end{equation}$ (20) so $\hbox{$k={\cal N}$}$. The actual total mass is given by integration of m × φ_Elm(m) within the same mass limits. However, how the limits are written and what interpretation is given to them varies according to the author.
Here we use the formalization by Elmegreen (1997, 1999, 2000, 2006): $\begin{equation} {\cal M} = \int_{m_{\rm low}}^{m_{\rm up}} m \, \phi_{\mathrm{Elm}}(m) \, \mathrm{d}m = {\cal N} \times \int_{m_{\rm low}}^{m_{\rm up}} m \, \phi(m) \, \mathrm{d}m, \label{eq:MElm} \end{equation}$ (21) and postpone to the next subsubsection the discussion of the special case of Weidner & Kroupa (2004, 2006), Weidner et al. (2010), and Kroupa et al. (2011). Whatever the normalization is, we need an additional assumption to obtain the actual maximum stellar mass in the cluster from Eq. (19). We have to assume ad hoc that the most massive star m_max is the result of solving Eq. (7) (i.e., that $\hbox{$\hat{m}_\mathrm{max}$}$ is the actual m_max). To do so, external arguments, similar to the richness effect, are required.

For a power-law IMF and m_up = ∞, the m_max − ℳ correlation is $\begin{equation} m_\mathrm{max} \propto {\cal M}^{\frac{1}{\alpha-1}} \propto {\cal N}^{\frac{1}{\alpha-1}}. \end{equation}$ (22) Elmegreen (1997, 1999, 2000) argues that, since the cluster is filled through random sampling, the inferred m_max can only be an estimate of the actual value. Only Vanbeveren (1982) states that it is possible to obtain the actual m_max value.
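
The scaling of Eq. (22) follows from solving $\hbox{${\cal N} \times \int_{\hat{m}_\mathrm{max}}^{m_{\rm up}} \phi(m)\,\mathrm{d}m = 1$}$ in closed form. The sketch below does so for a single power law with α > 1, for either a finite or an infinite m_up; the slope, the mass limits, and the function name are illustrative assumptions.

```python
import math


def characteristic_mmax(n_stars: float, alpha: float = 2.35,
                        m_low: float = 0.1, m_up: float = math.inf) -> float:
    """Solve N * integral_{m_hat}^{m_up} phi(m) dm = 1 for a power-law phi.

    phi(m) ∝ m**(-alpha) is normalized to unit integral over [m_low, m_up],
    so p(m >= x) = (x**e - m_up**e) / (m_low**e - m_up**e) with e = 1 - alpha.
    """
    e = 1.0 - alpha                       # negative for alpha > 1
    up_term = 0.0 if math.isinf(m_up) else m_up ** e
    low_term = m_low ** e
    return (up_term + (low_term - up_term) / n_stars) ** (1.0 / e)


# With m_up = infinity the solution reduces to m_low * N**(1 / (alpha - 1)).
print(characteristic_mmax(1000.0), 0.1 * 1000.0 ** (1.0 / 1.35))
# A finite upper limit lowers the characteristic mass at the same N.
print(characteristic_mmax(1000.0, m_up=120.0))
```

With a finite m_up the pure power-law scaling of Eq. (22) holds only while the result stays well below m_up; as $\hbox{$\cal N$}$ grows the solution saturates toward m_up.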

In Fig. 9 we show the resulting ℳ − m_max correlation under these assumptions using the functional form of the IMF employed here. The curve is completely equivalent to the $\hbox{$\left<{\cal M}\right> - \hat{m}_\mathrm{max}$}$ correlation obtained in the PDF case. The figure includes data points from Weidner et al. (2010) and Kirk & Myers (2011) just for comparison. We also included the result of a linear fit of log ℳ as a function of log m_max obtained from the data.

This interpretation of the IMF relies on stellar counting followed by a binning process. It is by far the most common interpretation and is assumed in a wide range of situations, from IMF determinations to stellar population synthesis. Its main feature is that Eq. (19) provides the actual number of stars and that $\hbox{${\cal M} = {\cal N} \times \left< m \right>$}$ provides the actual total stellar mass in the cluster (this last feature is also shared by the analytical law interpretation). In this case it may seem that the problem with integer numbers of stars mentioned in the previous case is solved, since we can always choose a suitable set of bins such that Eq. (19) produces a natural number for any m_a and m_b values. However, the solution is not so trivial: depending on the bin definition, distributions with different shapes are obtained (D’Agostino & Stephens 1986; Maíz Apellániz & Úbeda 2005), but the shape of the IMF is still defined by $\hbox{${\cal N} \times \phi(m)$}$. Consequently, the bins cannot be defined at will. The only plausible solution is to assume that Eq. (19) (and hence Eq. (21)) is only valid in the limiting case $\hbox{${\cal N} = \infty$}$ (Cerviño et al. 2002; Fouesneau & Lançon 2010; Piskunov et al. 2011), and that, for finite $\hbox{$\cal N$}$ values, they do not provide actual N(m ∈ [m_a, m_b]) or ℳ values but only estimates of such values. Again, we must understand what exactly this estimate represents.

To summarize this section, no continuous functional form of the IMF can provide the actual number of stars, neither for a given mass nor for a given mass interval, but only an estimate of it. The only way to give meaning to this estimate is by adopting a probabilistic framework. This implies using a probabilistic algebra, which explicitly prevents arbitrary normalizations of φ(m).

Fig. 9

ℳ − m_max relationship resulting from the distribution function formulation of the IMF of Elmegreen (1997, 1999, 2000), the formulation of Weidner & Kroupa (2004, 2006), and the optimal sampling formulation of Kroupa et al. (2011). The figure includes data points from Weidner et al. (2010) and Kirk & Myers (2011) and the result of the linear fit of the data to log ℳ as a function of log m_max.

4.1.3. The Weidner & Kroupa case

The studies by Weidner & Kroupa (2004, 2006), Weidner et al. (2010), and Kroupa et al. (2011) are another example of an interpretation of the IMF in terms of a distribution of the number of stars. However, they deserve special attention since they represent a major effort to include conditions in the IMF.

The equations to find a ℳ − m_max relationship proposed by Weidner & Kroupa (2004, 2006), once corrected for an improper accounting of m_max in ℳ (Kroupa et al. 2011), are $\begin{eqnarray} \label{eq:mmaxWK} 1 & = & \int_{m_\mathrm{max}}^{m_{\rm up}} \phi_{\mathrm{WK}}(m) \, \mathrm{d}m, \\ \label{eq:MclWK} {\cal M} - m_\mathrm{max} & =& \int_{m_{\rm low}}^{m_\mathrm{max}} m\, \phi_{\mathrm{WK}}(m) \, \mathrm{d}m. \end{eqnarray}$ (23), (24) As in the previous case, Eq. (23) is equivalent to the definition of $\hbox{$\hat{m}_\mathrm{max}$}$ given in Eq. (7) and φ_WK(m) has the same functional form (scaled by a constant k_WK). A simple inspection shows that $\hbox{$k_\mathrm{WK}={\cal N}$}$. The difference with the previous case is in Eq. (24): the upper limit of the integral is m_max and not m_up. By doing so, Kroupa et al. (2011) aim to constrain the IMF in such a way that Eq. (23) provides the actual m_max value rather than an estimate of it.

They argue that Eq. (23) provides this actual value by focusing on how the IMF is sampled. Their first approach was the sorted sampling scenario (Weidner & Kroupa 2006), according to which the IMF is sort-sampled, where the stars with the lowest mass are those that form first. This scenario is physically motivated, based on the hydrodynamical simulations of cluster formation in competitive accretion without the inclusion of possible (positive or negative) feedback of massive stars (Bonnell et al. 2003, 2004). Weidner & Kroupa (2006) presented Monte Carlo simulations to support this model, where clusters with a given total mass ℳ are drawn from a randomly sampled IMF. The number of stars used in the simulation was estimated from ℳ divided by the mean stellar mass. After that, the sample is sorted and the desired ℳ value approximated by accepting or rejecting the most massive star in the cluster. The most recent work (Kroupa et al. 2011) is based on the concept of the optimal sample: sampling is optimal if Eq. (23) is verified and produces the actual value of m_max. In both cases, it is argued that the IMF is not randomly sampled. Figure 9 shows the original and the corrected ℳ − m_max relationship they obtain.

This interpretation is based on a strict vision of the IMF as a stellar counting process involving an individual star, the one with m = m_max, and a stellar counting plus binning procedure for the remaining $\hbox{${\cal N} -1$}$ stars. This can be seen from the treatment of the integral limits or, equivalently, the histogram bins, throughout the different versions. In the original set of equations proposed by Weidner & Kroupa (2006), m_max was counted twice in two non-overlapping bins. The new version (Kroupa et al. 2011) clearly states the bin where m_max is, but now it opens a problem with the φ(m) definition. We recall that it is mainly a problem of inclusion of conditions, which is not a trivial issue. Let us consider the possible self-consistent cases:

1. We adopt the convention that lower integral limits are inclusive (equal to or larger than) and upper limits are exclusive (lower than), which gives a physical meaning to Eq. (23). However, if we want m_max to appear directly in the computation of ℳ, we must impose it ad hoc, which is done by using ℳ − m_max instead of ℳ. A self-consistent formulation, taking into account the integral limits in Eq. (23), is to write explicitly the mass contribution of the stars in the (m_max, m_up) range $\begin{eqnarray} m_\mathrm{max} & = & \int^{m_{\rm up}}_{m_\mathrm{max}} m\, \phi_{\mathrm{WK}}(m)\, \mathrm{d}m \,\,\,\, \Rightarrow \nonumber\\ \label{eq:WKcor1} \phi_{\mathrm{WK}}(m) &=& \delta(m-m_\mathrm{max}) + ({\cal N} - 1) \times \phi(m| m < m_\mathrm{max}), \end{eqnarray}$ (25) where δ(m − m_max) is the Dirac delta function. However, this implies an ad hoc variation of the φ(m) functional form, which is necessary to impose that m_max is the maximum stellar mass.
2. We adopt the convention that lower integral limits are exclusive (larger than) and upper limits are inclusive (equal to or lower than). Then, we can compute ℳ properly using m_max as the upper integral limit. However, in this case we must replace Eq. (23) with $\begin{eqnarray} \label{eq:WKcor2} 0 & = & \int_{m_\mathrm{max}}^{m_{\rm up}} \phi_\mathrm{WK}(m) \, \mathrm{d}m \nonumber\\ \label{eq:mmaxalter} && \Rightarrow \phi_{\mathrm{WK}}(m) = k_\mathrm{WK} \,\times\,\phi(m|m\leq m_\mathrm{max}), \end{eqnarray}$ (26) which means that there is no star more massive than m_max. This means, however, that we lose the equation giving the m_max value, which must be imposed ad hoc.

Cases (1) and (2) above are the only possible ones, and both constrain ad hoc m_max to be the maximum stellar mass in the cluster. Now, we have shown previously that any description of the IMF as a continuous function implicitly eliminates the dependence on $\hbox{$\cal N$}$ (and hence ℳ) and its interpretation as a distribution by number. The Kroupa et al. (2011) case clearly shows that there is no way to include constraints into a distribution-by-number description of the IMF and, at the same time, enjoy the advantages of a continuous distribution representation. Once a continuous functional form for φ(m) is assumed, only a PDF interpretation is valid, and we implicitly renounce obtaining actual values of stellar masses, actual total masses, or actual values of m_max. In particular, it would not be possible to obtain a hidden physical law implicit in the φ(m) functional form. At most we could obtain statistical correlations like the $\hbox{$\left<{\cal M}\right> - \hat{m}_\mathrm{max}$}$ correlation. If there were such physical laws, their origin would be external to the IMF and could only be inferred from detailed simulations, and not from algebraic manipulation of the IMF. That is the price we must pay for the advantages of a continuous formulation of the IMF.

4.1.4. The probabilistic case

The IMF is treated as a probability distribution in Oey & Clarke (2005), Elmegreen (2006), Parker & Goodwin (2007), Maschberger & Clarke (2008), Selman & Melnick (2008), Hass & Anders (2010), among others. Their basic assumption is similar to the one of this paper, and some partial results of the description shown here have been obtained by other authors (including Weidner et al. 2010). Here, we summarize the results from works on the topic in the global context of the formulation given in the previous section. The common point of these works is that, without additional ad hoc conditions, an ℳ − m_max relationship cannot be defined trivially as a physical law, but only as a statistical correlation. The total mass in the cluster, the total number of stars in the cluster, and the particular number of stars with given stellar masses are not fixed quantities, but distributed ones, and none of them can be obtained univocally from the others. Hence, the use of ℳ − m_max or the use of $\hbox{${\cal N}-m_\mathrm{max}$}$ is not just a question of choice in terms of observational considerations; it is actually the result of statistical correlations of different distributions.

The probabilistic description of the IMF is included, by construction, in works that make use of Monte Carlo simulations (see Weidner & Kroupa 2006; Elmegreen 2006; Parker & Goodwin 2007; Selman & Melnick 2008; Hass & Anders 2010, as examples), where the IMF is sampled star by star up to a given value of ℳ or $\hbox{$\cal N$}$. Such Monte Carlo simulations have been devoted to explaining and comparing different results using different sampling algorithms. Hass & Anders (2010) made an explicit, exhaustive, and detailed study of the issue. As far as we know, only Elmegreen (2006) and Selman & Melnick (2008) have made theoretical studies aimed at describing the ℳ − m_max relationship using conditional probabilities.

Most of the theoretical studies have been carried out in terms of an $\hbox{${\cal N}-m_\mathrm{max}$}$ relationship, using $\hbox{$\cal N$}$ as variate and m_max as variable and making use of $\hbox{$\Phi_{m_\mathrm{max}}({m_\mathrm{max}}|{\cal N})$}$. They often include an expression for the mean value of the distribution (Oey & Clarke 2005), the mode of the distribution (Gumbel 1958; Kendall & Stuart 1977), or the percentile analysis (Weidner et al. 2010). However, there is almost no study of the $\hbox{$m_\mathrm{max}-{\cal N}$}$ relationship or of the $\hbox{$\Phi_{\cal N}({\cal N})$}$ dependence of the $\hbox{${\cal N}-m_\mathrm{max}$}$ correlation (Elmegreen 2006; Selman & Melnick 2008).

So, in the probabilistic case, the $\hbox{${\cal N}-m_\mathrm{max}$}$ , ℳ − m_max, $\hbox{$m_\mathrm{max}-{\cal N}$}$ , and m_max − ℳ correlations are not equivalent to each other. The ℳ − m_max correlation requires a $\hbox{$\Phi_{\cal N} ({\cal N}|\cal M)$}$ distribution which is not required by the $\hbox{${\cal N}-m_\mathrm{max}$}$ correlation. In addition, establishing the $\hbox{$m_\mathrm{max}-{\cal N}$}$ and m_max − ℳ correlations requires some priors about the distribution of $\hbox{$\Phi_{\cal N} ({\cal N})$}$ and Φ_ℳ(ℳ) that are not considered in the previous correlations.

The probabilistic formulation offers the advantages of using continuous distributions and including conditions formally. However, this does not mean that any condition can be represented analytically. We have mentioned above that the Weidner & Kroupa (2004, 2006) formulation is a major effort to include conditions in the IMF. Let us rewrite Eq. (25) in statistical terms and give a meaning to such distribution: $\begin{eqnarray} \phi(m|m_\mathrm{max};{\cal N}) & =& \frac{\delta(m-m_\mathrm{max})}{\cal N} + \frac{{\cal N} - 1}{\cal{N}} \phi(m| m < m_\mathrm{max}). \end{eqnarray}$ (27) The above equation describes the constrained IMF for a fixed m_max value in a set of $\hbox{$\cal N$}$ stars. This constraint does not imply that a star with m_max is present in the cluster, but just that there are no stars more massive than m_max and that the event m = m_max has a probability of $\hbox{$1/{\cal N}$}$. Since all the arguments of the characteristic value hold here, the associated characteristic value is the fixed m_max value, which is also a cut-off value of the distribution. So, 63% of realizations for clusters with $\hbox{$\cal N$}$ stars following such PDF have at least one star with mass m_max (and no stars more massive than m_max).
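
The 63% figure quoted above follows directly from Eq. (27): if each of $\hbox{$\cal N$}$ independent draws yields m = m_max with probability $\hbox{$1/{\cal N}$}$, the probability that at least one draw does so is $\hbox{$1 - (1 - 1/{\cal N})^{\cal N}$}$, which tends to 1 − 1/e for large $\hbox{$\cal N$}$. A minimal numerical check, with a hypothetical function name:

```python
import math


def prob_mmax_present(n_stars: int) -> float:
    """Probability that at least one of n_stars iid draws gives m = m_max,
    when each draw does so with probability 1 / n_stars (delta term of Eq. 27)."""
    return 1.0 - (1.0 - 1.0 / n_stars) ** n_stars


for n in (10, 100, 10_000):
    print(n, prob_mmax_present(n))
print("large-N limit:", 1.0 - math.exp(-1.0))
```

The complement, about 37% of realizations in the large-$\hbox{$\cal N$}$ limit, contains no star with mass m_max at all, which is why the constraint cannot guarantee the presence of such a star.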

Hence, there is no way to include in an analytical form the condition that the most massive star is actually m_max and that such a star is present in any realization. There is also a similar problem with ℳ, although the problem in this case is more severe since it also requires a $\hbox{$\Phi({\cal N})$}$ (discrete) distribution. However, there is an infinite number of combinations of stellar masses that are consistent with any reasonable ℳ − m_max physical law.

The only possible solution at the moment to include a ℳ − m_max physical law and work with it is to perform a large set of Monte Carlo simulations, which should assume a particular $\hbox{$\Phi({\cal N})$}$ distribution, and just consider the subset where the chosen ℳ − m_max physical law is verified. Then, any physical result must be obtained numerically (as opposed to analytically). The advantages of describing φ(m) as a continuous distribution are thus lost¹⁴.

4.2. Sampling, iid variables, and the relation of the IMF with SF

We have seen that the existence of a physical law linking ℳ and m_max cannot be established through a simple manipulation of the IMF functional form. The current debate on whether the IMF is randomly or non-randomly sampled stems mainly from works by Weidner & Kroupa (2006) and Weidner et al. (2010), where $\hbox{$\hat{m}_\mathrm{max}$}$ is interpreted as the exact value of the most massive star in a cluster with a given mass. This debate has been focusing on different sampling proposals. Even if the authors themselves now consider the sorted sampling proposal just as a first approximation (Kroupa et al. 2011), we want to emphasize that the key point of different sampling algorithms is not the sorting process, but the assumed relation between $\hbox{$\cal N$}$ and ℳ (e.g., the sorted sampling proposal uses an $\hbox{$\cal N$}$ value estimated by means of ℳ divided by ⟨m|m < m_max⟩, which imposes a constraint in $\hbox{$\cal N$}$ ). The situation is actually more clearly described in the richness effect proposed by García-Vargas & Díaz (1994); García-Vargas et al. (1995): a star with mass m_a is formed according to the amount of gas that remains in the system once a certain number of stars with m < m_a have been formed. The sampling problem appears when we try to fix ℳ(m < m_a) and $\hbox{${\cal N}(m < m_\mathrm{a})$}$ simultaneously and include it analytically in the φ(m) functional form.

As we have shown, there is no self-consistent way to do it with the current description of φ(m). The inclusion of any ℳ − m_max physical law, no matter what its interpretation is, precludes using an analytical functional form for the IMF. The sampling methods proposed by different authors are actually operational methods, not an implementation of the physical process¹⁵.

However, we want to stress that the question of whether the IMF is randomly sampled or not (i.e., whether stars are iids or not) is completely valid, independent of the particular problem motivating the question. So we will not attempt to discuss this question in terms of any specific results from the literature, but from a more general perspective.

4.2.1. Independent and identically distributed variables and the relation of the IMF with the star formation

The question we aim to answer is: are stellar masses iid variables, or, at least, can they be treated as if they were? A sample is an iid sample if each random variable has the same probability distribution and all of them are mutually independent.

Throughout the paper, we have explicitly avoided any mention of the SF physics. It is now time to take a look at different ways in which the SF and the IMF can be linked and how randomness enters in this game. There are several possible ways. (a) Some physicists prefer to assume a deterministic universe in which one and only one result is obtained for a given set of initial conditions. But there is such a large variety of initial conditions that they can only be described in a probabilistic way. Hence the results of SF events, like the IMF itself, can only be described in a probabilistic way. (b) We can also assume a universe where determinism, although it exists, is somehow hidden by complexity. Thus we assume accordingly that the SF is a complex process in the mathematical sense: nonlinear and with interconnected components, producing such a large variety of results that they can only be treated in a probabilistic way. (c) We admit that there are intrinsically random variables in nature and that the SF is an intrinsically random process (like turbulence), so its results can only be treated in a probabilistic way. We refer to Shadmehri & Elmegreen (2011), Sánchez et al. (2006), Elmegreen (1999, 2011) as examples where some of these different scenarios are considered.

The feature common to these three cases is that the IMF should be used probabilistically (i.e., stellar masses are randomly sampled), which does not imply that the SF is random. There may be no physical ℳ − m_max relationship at all, or there may be a deterministic physical law linking ℳ and m_max. Even in the latter case, the internal distribution of stellar masses that are physically compatible (in the SF sense) with this physical law would depend on a set of unknown (and variable) initial conditions or intrinsically random characteristics. Then the IMF could only be described by means of a probabilistic formulation. A probabilistic interpretation of the IMF does not contradict a deterministic vision of the physics of SF.

On a large scale, the IMF is the result of all possible SF events and SF modes, although it does not necessarily describe any particular one. Following this argument, we are able to describe probabilistically the incidence of having a star with a given mass that was born at a given time, the stellar birth rate ℬ(m,t), as the composition of two independent functions: the star formation history, SFH ψ(t,ℳ) (although $\hbox{$\psi(t,{\cal N})$}$ would be more adequate) and the IMF, φ(m) (Schmidt 1959, 1963; Tinsley 1980; Scalo 1986). The first function includes all the possible SF modes and provides the time-scale and the amount of gas transformed into stars. The second one describes how a given amount of gas would be distributed among different stellar masses. We recall that the first IMF determinations were done with field stars (Salpeter 1955), so they implicitly averaged a large variety of SF modes.

The separation of ℬ(m,t) into two independent functions seems to be a valid approach for the study of galaxies and a variety of systems where different modes of star formation coexist; it has been extensively used in extragalactic astronomy and cosmology. One particular characteristic of this approach is the use of single stellar populations (SSP, Renzini & Buzzoni 1986) which corresponds to $\hbox{$\psi(t,{\cal N}) = {\cal N} \times \delta(t)$}$ . Since any function can be described by a sum of δ(t − τ) functions, it allows the SFH to be recovered from observational data or the evolution of galaxies to be described as a composition of SSPs with different intensity. The star formation rate, SFR, can then be defined as a time average of the SFH (da Silva et al. 2012) or as the result of a flat SFH ( $\hbox{$\psi(t,{\cal N}) = \mathrm{const.}$}$ ). Current SF rate indicators are based on SSP modeling with constant SFH (Kennicutt 1998).
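In symbols, the decomposition and its SSP limit can be sketched as follows. We assume here that the "composition" of the two independent functions is a simple product, which is the usual convention but is not written out explicitly in the text; the subscript SSP is a label introduced only for this sketch.

```latex
\begin{align}
  {\cal B}(m,t) &= \psi(t,{\cal N}) \, \phi(m), \\
  \psi_{\rm SSP}(t,{\cal N}) &= {\cal N} \times \delta(t), \\
  \psi(t,{\cal N}) &= \int \psi(\tau,{\cal N}) \, \delta(t-\tau) \, \mathrm{d}\tau .
\end{align}
```

The last line is the sifting property of the δ function: it expresses an arbitrary SFH as a superposition of SSPs of different ages τ, each weighted by ψ(τ).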

The case would be different if we changed the scale to smaller systems. When we restrict the situation to specific SF modes, particular details emerge and have some imprint on the IMF. The more restrictive the mode, the more details are present. In this case we are moving to particular IMF realizations with given conditions, which may depart from the probabilistic description given by φ(m). At small scales, the validity of the decomposition of ℬ(m,t) into two independent functions is not clear. Nevertheless, the universality of the IMF even at such scales suggests that the decomposition still holds (but see Elmegreen 2011 for an example of possible variations of the IMF, especially in the low-mass tail, depending on the environmental conditions).

The approach we have presented here when talking about ℬ(m,t) is a top-down one: φ(m) is the most generic representation, so that the larger the system, the more valid it is. We note that this vision is mentioned by Vanbeveren (1982), who also claimed the existence of an ℳ − m_max physical law. Because there is a universal IMF at a large scale, he says, the IMF varies at small scale.

In this case, the IMF is expected to have a quasi-universal shape at large scales, with possible variations at small scales. Here, we understand that deviations from a universal shape are allowed as long as they are small compared to the global budget. In addition, the incidence of deviations also depends on the size of the system, that is, the integral of $\hbox{$\psi(t,{\cal N})$}$ over time (see da Silva et al. 2012, for a discussion).

There is also a bottom-up approach when talking about ℬ(m,t), which is the one proposed by the IGIMF theory. In this case, universality in the IMF functional form is assumed. However, there is a ℳ − m_max physical law that relates ℳ with m_max; hence there is IMF variability in the sense of a variable m_max for given ℳ. It is assumed that this physical law operates for all SF modes, or equivalently, that there is one SF mode: star formation in clusters. In this case, the mass distribution of stars depends on where (and when) they were formed, so only stars formed in the same cluster (or clusters with the same ℳ) share the same IMF.

For the study of galaxies or, in general, systems that may contain clusters with different masses, it is necessary to take into account the distribution of the total masses of these clusters: the ICMF. As a result, at a galactic scale there is not one IMF, but an IGIMF that results from the combination of the ICMF and different IMFs. It depends on ℳ and implies a redefinition of the IMF itself (Kroupa & Weidner 2003). In this case it is not clear if ℬ(m,t) can be separated into independent functions and how (Cerviño et al. 2011). This implies major revisions of global galactic and extragalactic studies, including the SSP concept, and there is currently considerable debate on the issue (Corbelli et al. 2009; Fumagalli et al. 2011; Eldridge 2012). Although a full discussion goes beyond the scope of this paper, we want to point out that there would be a $\hbox{$\left<{\cal M}\right> - \hat{m}_\mathrm{max}$}$ physical law, although it must be imposed ad hoc, and that, whatever the case, random sampling and a probabilistic description of the IMF are compatible with it.

5. Conclusions

Having carried out a thorough analysis of different IMF interpretations, with a focus on the question of how information on m_max can be extracted from the IMF itself, we are in a position to formulate the problem in a different way: what information does the IMF contain? Can we extract information on the SF process from an algebraic manipulation of the IMF? The answers to these questions are driven by the interpretation of the IMF adopted by each author and, in particular, their conclusion as to whether, without direct observations, m_max can be exactly determined or just estimated.

Our analysis of the problem has led us to the following main conclusion: Only a probabilistic interpretation of the IMF, where φ(m) is a PDF (ruling out arbitrary normalizations) and stellar masses are randomly sampled iid variables, provides a physically and mathematically self-consistent formulation that explains the $\hbox{$\left<{\cal M} \right> - \hat{m}_\mathrm{max}$}$ statistical correlation obtained from IMF algebraic manipulation. We also give plausible arguments that introduce the IMF as a probabilistic distribution when related with the physics of the star formation process.

Additional conclusions of this work are:

1.
The actual total stellar mass of a cluster, ℳ, cannot be inferred from an IMF, φ(m), with a continuous functional form. A direct IMF integration only provides its mean value, ⟨ℳ⟩, for a given number of stars $\hbox{$\cal N$}$ : $\left< {\cal M}\right> = {\cal N} \times \left< m \right> = {\cal N} \times \int_{m_{\rm low}}^{m_{\rm up}} m \, \phi(m) \, \mathrm{d}m.$ (28) Although some authors do not consider $\hbox{$\cal N$}$ as a relevant physical variable (Kroupa et al. 2011), the fact that stars are discrete entities and $\hbox{$\cal N$}$ is a natural number are relevant physical constraints that must be included in the treatment of the IMF and in the algebra used to obtain physical results from it.
2.
Given the equation defining the most massive star in a system, $\frac{1}{\cal N} = \int_{\hat{m}_\mathrm{max}}^{m_{\rm up}} \phi(m) \, \mathrm{d}m,$ (29) the resulting $\hbox{$\left<{\cal M}\right> - \hat{m}_\mathrm{max}$}$ correlation is practically independent of the specific IMF interpretation adopted. However, how this equation is understood strongly depends on the framework of the interpretation.
3.
In a probabilistic interpretation, Eq. (29) provides a characteristic mass, $\hbox{$\hat{m}_\mathrm{max}$}$ , that is, the value of m that is not reached or exceeded with a probability 0.37 in a sample of $\hbox{${\cal N}$}$ stars, but not the actual mass of the most massive star in the sample.
4.
For any $\hbox{$\hat{m}_\mathrm{max} \gtrsim 10~{M}_\odot$}$ and not close to m_up, there is a probability larger than 90% that the most massive star in the system is larger than such $\hbox{$\hat{m}_\mathrm{max}$}$ value. Therefore, assuming that Eq. (29) provides the actual mass of the most massive star in the cluster, as argued in the framework of different interpretations of the IMF, is an ad hoc assumption and not a physical fact.
5.
$\hbox{$\hat{m}_\mathrm{max}$}$ defines the mode of the distribution $\hbox{$\Phi_{\cal N}({\cal N}| m_\mathrm{max})$}$ of the possible $\hbox{$\cal N$}$ values inferred from the most massive star in the cluster assuming a flat $\hbox{$\Phi_{\cal N}({\cal N})$}$ distribution. A similar dependence in $\hbox{$\Phi_{\cal N}({\cal N})$}$ is present when $\hbox{$\cal N$}$ is inferred from the number of the N_a most massive stars in the cluster (cf. Paper II). However, the observational evidence is that $\hbox{$\Phi_{\cal N}({\cal N})$}$ is a power law (if it is related with the ICMF).
6.
When the total cluster mass is inferred through the equation $\hbox{$\left<{\cal M}\right> = {\cal N} \times \left< m \right>$}$ and $\hbox{$\cal N$}$ is obtained assuming a flat $\hbox{$\Phi_{\cal N}({\cal N})$}$ , the observational data become consistent with a $\hbox{$\hat{m}_\mathrm{max} - \left<{\cal M}\right>$}$ statistical correlation. This is indeed the case when $\hbox{$\Phi_{\cal N}({\cal N})$}$ is not taken into account explicitly in the $\hbox{$\cal N$}$ (and ℳ) estimation (as found in most of the clusters in the Weidner et al. 2010 sample).
7.
The meaningful distribution to be tested against observational data is $\hbox{$\Phi_{m_\mathrm{max},{\cal N}}(m_\mathrm{max}, {\cal N})$}$ and not $\hbox{$\Phi_{\cal N}({\cal N}| m_\mathrm{max})$}$ or $\hbox{$\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$}$ .
8.
Weidner et al. (2010) claim that the results of their analysis falsify the hypothesis of a random sampling of the IMF. Based on the two preceding points, we consider that such a claim should be revised, both because of the ℳ values it relies on and because of the methodological choice of using $\hbox{$\Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N})$}$ .
9.
Different sampling algorithms proposed in the literature are not physical requirements, but convenient mathematical algorithms that try to simplify the implications of a possible ℳ − m_max physical law on studies where the IMF is used (as is the case of stellar populations in galaxies). Unfortunately, such simplification is not possible.
10.
We cannot exclude that a hard physical law linking ℳ to m_max (the actual values) does indeed exist; but, if this is the case, it must arise from considerations of the problem including a full-fledged SF analysis, which cannot be shortcut through algebraic IMF manipulations. Whatever the case is, the existence of such an ℳ − m_max physical law is compatible with random sampling of stellar masses and a probabilistic interpretation of the IMF.
11.
If such a physical law exists, it cannot be incorporated to an analytical IMF functional form, but must rather be approached by computing Monte Carlo simulations and taking into account only the subset of simulations that verify the assumed ℳ − m_max physical law. We note that this approach is fully compatible with the optimal sampling definition provided by Kroupa et al. (2011).
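As an illustration of conclusions 2 to 4, the following Python sketch solves Eq. (29) in closed form for a single power-law IMF and checks by Monte Carlo that roughly 37% of random realizations contain no star reaching the characteristic mass. The single slope `ALPHA`, the limits `M_LOW` and `M_UP`, and all function names are assumptions made for this example only; they do not reproduce the multi-segment IMF used in this work.

```python
import math
import random

# Assumed illustrative parameters (not the IMF adopted in the paper).
ALPHA = 2.35   # single power-law slope, phi(m) ~ m**(-ALPHA)
M_LOW = 0.1    # lower mass limit, solar masses
M_UP = 150.0   # upper mass limit, solar masses


def survival(m):
    """p(m' >= m): integral of the normalized power-law PDF from m to M_UP."""
    e = 1.0 - ALPHA
    return (m ** e - M_UP ** e) / (M_LOW ** e - M_UP ** e)


def characteristic_mmax(n_stars):
    """Mass m_hat that solves 1/N = integral_{m_hat}^{M_UP} phi(m) dm."""
    e = 1.0 - ALPHA
    return (M_UP ** e + (M_LOW ** e - M_UP ** e) / n_stars) ** (1.0 / e)


def draw_mass(rng):
    """Draw one stellar mass by inverse-transform sampling of the PDF."""
    e = 1.0 - ALPHA
    u = rng.random()
    return (M_LOW ** e + u * (M_UP ** e - M_LOW ** e)) ** (1.0 / e)


def fraction_below(n_stars, n_clusters, seed=0):
    """Fraction of simulated clusters whose most massive star is below m_hat."""
    rng = random.Random(seed)
    m_hat = characteristic_mmax(n_stars)
    below = 0
    for _ in range(n_clusters):
        m_max = max(draw_mass(rng) for _ in range(n_stars))
        if m_max < m_hat:
            below += 1
    return below / n_clusters


if __name__ == "__main__":
    n = 200
    print("characteristic m_max:", characteristic_mmax(n))
    print("analytic P(m_max < m_hat):", (1.0 - 1.0 / n) ** n)
    print("simulated fraction:", fraction_below(n, 2000))
    print("exp(-1):", math.exp(-1.0))
```

For iid masses the probability that none of the N stars reaches the characteristic mass is (1 − 1/N)^N, which tends to e⁻¹ ≈ 0.37 for large N; the simulation is only a numerical check of that statement.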

We conclude that a random sampling IMF is not in contradiction to a possible m_max − ℳ physical law. However, such a law cannot be obtained from IMF algebraic manipulation or included analytically in the IMF functional form. The possible physical information that would be obtained from the $\hbox{$\cal N$}$ (or ℳ) − m_max correlation is closely linked with the Φ_ℳ(ℳ) and $\hbox{$\Phi_{\cal N}({\cal N})$}$ distributions; hence it depends on the SF process and the assumed definition of stellar cluster. In a second paper of this series we will explore the application of the probabilistic description of the IMF formulated in this study. Particularly, we will describe how to use it to make inferences about quantities that characterize some stellar systems, and how observational constraints work as a priori conditions, affecting the sampling distributions of ℳ and $\hbox{$\cal N$}$ that we can infer.
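The Monte Carlo approach mentioned in the last conclusion of the list can be sketched as follows: random realizations of the IMF are generated, and only those compatible with an assumed ℳ − m_max law are retained. The function `toy_mmax_limit` is a purely hypothetical placeholder for such a law, chosen only to make the example run; it is not a relation taken from this paper or from the literature. The single power-law IMF and its parameters are likewise assumptions.

```python
import random

# Assumed illustrative parameters (not the IMF adopted in the paper).
ALPHA = 2.35
M_LOW = 0.1
M_UP = 150.0


def draw_mass(rng):
    """Draw one stellar mass by inverse-transform sampling of a power law."""
    e = 1.0 - ALPHA
    u = rng.random()
    return (M_LOW ** e + u * (M_UP ** e - M_LOW ** e)) ** (1.0 / e)


def toy_mmax_limit(total_mass):
    """HYPOTHETICAL law: largest stellar mass allowed for a given total mass.

    Placeholder only; any assumed physical law could be substituted here.
    """
    return min(M_UP, 0.3 * total_mass)


def simulate_clusters(n_stars, n_clusters, seed=0):
    """Randomly sample n_clusters realizations of n_stars iid masses each."""
    rng = random.Random(seed)
    return [[draw_mass(rng) for _ in range(n_stars)]
            for _ in range(n_clusters)]


def select_compatible(clusters):
    """Keep only the realizations that verify the assumed law."""
    return [c for c in clusters if max(c) <= toy_mmax_limit(sum(c))]


if __name__ == "__main__":
    clusters = simulate_clusters(n_stars=100, n_clusters=500)
    kept = select_compatible(clusters)
    print(f"kept {len(kept)} of {len(clusters)} realizations")
```

The retained subset is still built from randomly sampled masses; the law acts only as a selection on complete realizations, not as a change to the functional form of φ(m).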

¹

However, because distribution (2) is a scaled version of distribution (1), the conclusions derived from (1) also apply to (2).

²

Random sample means that every possible sample has a calculable chance of selection. This is a requirement of any statistical and probabilistic study (Kendall & Stuart 1977).

³

We note that Weidner & Kroupa (2004) use α₂ = 2.30 in their parametrization of the IMF and that Weidner & Kroupa (2006) use α₂ = 2.35.

⁴

We use here the Heaviside function as a distribution to define the domain of φ(m), including constraints. In this situation the value of H(0) is not defined, but it is assigned a posteriori to be consistent with the convention used in the integral limits. In the case of Eq. (3), H(0) = 0.

⁵

The discussion in this section is mainly based on Sornette (2004), Kendall & Stuart (1977), and Gumbel (1958), although the same formulae can be found in other works.

⁶

Here we use p to represent probabilities on the IMF (cf., Eqs. (1) and (2)) and $\hbox{${\cal{P}}$}$ to represent probabilities on the sample with $\hbox{${\cal N}$}$ stars.

⁷

We note that, depending on the reference and the convention used in Sect. 2, this value can be defined either as reached or exceeded or just as exceeded.

⁸

The characteristic largest value defined by Eq. (7) is related to the estimation of the number of events we must record to have an event larger than a given value m_a (which is called return period in extreme value theory). If the events are taken in a regular time interval, for instance, it could be the estimation of the number of years between earthquakes larger than a given magnitude, the number of years between economy crashes, and so on.

⁹

The analyses based on the parameters of the distribution, on the percentile, and on confidence intervals around the mode are equivalent only in the Gaussian case, where 1σ is almost equivalent to the percentile range 16 − 84% and the 68% confidence interval.

¹⁰

Except in a few cases, Weidner & Kroupa (2004) and Weidner et al. (2010) obtain $\hbox{${\cal N}$}$ by extrapolating to the full IMF range the number of stars N_a observed above a specified mass or within a specified mass range. Then, ℳ is obtained by means of $\hbox{${\cal M} = {\cal N} \times \left<m\right>$}$ . We obtained the plotted $\hbox{$\cal N$}$ values by division of the ℳ values quoted in their tables by ⟨m⟩.

¹¹

$\hbox{${\cal N}$}$ is not a continuous variable; hence the distribution cannot strictly be differentiated with respect to it, and $\hbox{${\cal N}^\mathrm{mode}$}$ must be an integer number. Thus, the formulae provide only an approximation.

¹²

In Paper II we show that this assumption is implicit when $\hbox{$\cal N$}$ is inferred from the number N_a of massive stars in the (m_max, m_a) range by using the relation $\hbox{${\cal N} = N_\mathrm{a} \times p(m \geq m_\mathrm{a})$}$ . Similarly, the assumption is implicit when ℳ is inferred by multiplying the mean stellar mass by $\hbox{$\cal N$}$ ; it is a general assumption found in the literature and, in particular, is the method used to infer ℳ in the Weidner et al. (2010) compilation.

¹³

That is:

$\begin{eqnarray*} \Phi_{m_\mathrm{max},{\cal N}}(m_\mathrm{max},{\cal N}) &=& \Phi_{m_\mathrm{max}}(m_\mathrm{max}|{\cal N}) \, \Phi_{\cal N}({\cal N}) \\ &=& \Phi_{\cal N}({\cal N} | m_\mathrm{max}) \, \phi(m_\mathrm{max}). \end{eqnarray*}$

¹⁴

We note that any sampling proposal that aims to reproduce a ℳ − m_max physical law with a finite number of stars $\hbox{$\cal N$}$ is also doomed to this situation: it provides a φ(m_i) array, but not a continuous φ(m) distribution.

¹⁵

The optimal sampling algorithm provided by Kroupa et al. (2011) is based on obtaining bins through the "larger than" criterion for lower integral limits and the "equal to or lower than" criterion for upper integral limits. These criteria are complementary to those underlying their equations to obtain the ℳ − m_max relationship. In addition, the IMF is filled from m_max down to lower masses, contrary to the physical arguments given to justify the sorting sampling algorithm. We stress that this is not a problem of the formulation, inasmuch as the physical formulation of the problem is not linked with the operational mathematical method used to solve the physical equations.

¹⁶

We use μ(m) to follow the notation used by Gumbel (1958). It must not be confused with the definition of the mean value that is used in other papers.

Acknowledgments

M.C. acknowledges Fernando Selman and David Valls-Gabaud for useful discussions on this subject. He also acknowledges Roberto Terlevich, Michele Fumagalli, Søren S. Larsen, and Kevin Covey for discussions on the similarities and differences of $\hbox{$\Phi_{\cal N}({\cal N})$}$ and Φ_ℳ(ℳ) and their implications in the modeling of clusters and galaxies, which have been very useful for this paper and for future works. Finally, we acknowledge Nate Bastian, Pavel Kroupa, Michele Fumagalli, and John Eldridge for useful comments on the first version of this paper (now split into Papers I and II) and the suggestions of the referee, Peter Anders, which have greatly improved the clarity of the paper. This work has been supported by the MICINN (Spain) through the grants AYA2007-64712, AYA2010-15081, AYA2011-22614, AYA2010-15196, AYA2011-29754-C03-01, AYA2008-06423-C03-01/ESP, AYA2010-17631, a Calar Alto Observatory postdoctoral fellowship, and by program UNAM-DGAPA-PAPIIT IA101812, and CONACYT 152160 Mexico, and co-funded under the Marie Curie Actions of the European Commission (FP7-COFUND).

References

Bastian, N., Covey, K. R., & Meyer, M. R. 2010, ARA&A, 48, 339
Bonnell, I. A., Bate, M. R., & Vine, S. G. 2003, MNRAS, 343, 413
Bonnell, I. A., Vine, S. G., & Bate, M. R. 2004, MNRAS, 349, 735
Cerviño, M., & Luridiana, V. 2006, A&A, 451, 475
Cerviño, M., Valls-Gabaud, D., Luridiana, V., & Mas-Hesse, J. M. 2002, A&A, 381, 51
Cerviño, M., Pérez, E., Sánchez, N., Román-Zúñiga, C., & Valls-Gabaud, D. 2011, UP2010: Have Observations Revealed a Variable Upper End of the Initial Mass Function? eds. M. Treyer et al. (San Francisco, CA: ASP), ASP Conf. Proc., 440, 133
Cerviño, M., Román-Zúñiga, C., Bayo, A., et al. 2013, A&A, 553, A32
Corbelli, E., Verley, S., Elmegreen, B. G., & Giovanardi, C. 2009, A&A, 495, 479
Crowther, P. A., Schnurr, O., Hirschi, R., et al. 2010, MNRAS, 408, 731
D’Agostino, R. B., & Stephens, M. A. 1986, Goodness-of-Fit Techniques (New York: Marcel Dekker)
Eldridge, J. J. 2012, MNRAS, 422, 794
Elmegreen, B. G. 1997, ApJ, 486, 944
Elmegreen, B. G. 1999, ApJ, 515, 323
Elmegreen, B. G. 2000, ApJ, 539, 342
Elmegreen, B. G. 2006, ApJ, 486, 944
Elmegreen, B. G. 2011, ApJ, 731, 61
Fernández-Soto, A., Lanzetta, K. M., Chen, H.-W., Levine, B., & Yahata, N. 2002, MNRAS, 330, 889
Fouesneau, M., & Lançon, A. 2010, A&A, 521, L22
Fumagalli, M., da Silva, R. L., & Krumholz, M. R. 2011, ApJ, 741, L26
García-Vargas, M. L., & Díaz, A. I. 1994, ApJS, 91, 553
García-Vargas, M. L., Bressan, A., & Díaz, A. I. 1995, A&AS, 112, 13
Gumbel, E. J. 1958, Statistics of Extremes (Columbia University Press)
Haas, M. R., & Anders, P. 2010, A&A, 512, 79
Kendall, M., & Stuart, A. 1977, The advanced theory of statistics (London: Griffin), 4th edn.
Kennicutt, R. C., Jr. 1998, ARA&A, 36, 189
Kirk, H., & Myers, P. C. 2011, ApJ, 727, 64
Kroupa, P. 2001, MNRAS, 322, 231
Kroupa, P. 2002, Science, 295, 82
Kroupa, P., & Weidner, C. 2003, ApJ, 598, 1076
Kroupa, P., Weidner, C., Pflamm-Altenburg, J., et al. 2011 [arXiv:1112.3340]
Larson, R. B. 1982, MNRAS, 200, 159
Larson, R. B. 2003, Galactic Star Formation Across the Stellar Mass Spectrum, eds. J. M. De Buizer, & N. S. van der Bliek (San Francisco: ASP), ASP Conf. Ser., 287, 65
Maíz Apellániz, J., & Úbeda, L. 2005, ApJ, 629, 873
Maschberger, T., & Clarke, C. J. 2008, MNRAS, 391, 711
Oey, M. S., & Clarke, C. J. 2005, ApJ, 620, L43
Parker, R. J., & Goodwin, S. P. 2007, MNRAS, 380, 1271
Pflamm-Altenburg, J., & Kroupa, P. 2008, Nature, 455, 641
Piskunov, A. E., Kharchenko, N. V., Schilbach, E., et al. 2011, A&A, 525, 122
Reddish, V. C. 1978, International Series in Natural Philosophy (Oxford: Pergamon)
Renzini, A., & Buzzoni, A. 1986, Spectral Evolution of Galaxies, Astrophysics and Space Science Library, 122, 195
Salpeter, E. E. 1955, ApJ, 121, 161
Sánchez, N., Alfaro, E. J., & Pérez, E. 2006, ApJ, 641, 347
Scalo, J. M. 1986, Fund. Cosm. Phys., 11, 1
Schmidt, M. 1959, ApJ, 129, 243
Schmidt, M. 1963, ApJ, 137, 758
Selman, F. J., & Melnick, J. 2008, ApJ, 689, 816
Shadmehri, M., & Elmegreen, B. G. 2011, MNRAS, 410, 788
da Silva, R. L., Fumagalli, M., & Krumholz, M. 2012, ApJ, 745, 145
Sornette, D. 2004, Critical phenomena in natural sciences: chaos, fractals, selforganization and disorder: concepts and tools, Springer series in synergetics (Heidelberg: Springer)
Treyer, M., Wyder, T., Neill, J., Seibert, M., & Lee, J. 2011, UP2010: Have Observations Revealed a Variable Upper End of the Initial Mass Function? ASP Conf. Proc., 440
Tinsley, B. 1980, Fund. Cosm. Phys., 5, 287
van Albada, T. S. 1968, Bull. Astron. Inst. Netherlands, 20, 57
Vanbeveren, D. 1982, A&A, 115, 65
Weidner, C., & Kroupa, P. 2004, MNRAS, 348, 187
Weidner, C., & Kroupa, P. 2006, MNRAS, 365, 1333
Weidner, C., Kroupa, P., & Bonnell, I. A. D. 2010, MNRAS, 401, 275

Appendix A: The intensity function

As stated in Sect. 3, φ(m) cannot provide a value of m_max that can be used as the actual maximum stellar mass in a hypothetical cluster. Still, we can calculate the probability for the actual value of m_max to be close to the mean, the median, the characteristic value, or the mode of $\hbox{$\Phi_{m_\mathrm{max}}({m_\mathrm{max}}|{\cal N})$}$ . In general, we can evaluate the probability that a value known to be larger than m_b is smaller than m_b + dm_b. To do that, we need to introduce the intensity function¹⁶, μ(m_b): $\mu(m_\mathrm{b}) \, \mathrm{d}m_\mathrm{b} = \frac{\phi(m_\mathrm{b}) \, \mathrm{d}m_\mathrm{b}}{1 - p(m < m_\mathrm{b})} \geq \phi(m_\mathrm{b}) \, \mathrm{d}m_\mathrm{b}.$ (A.1) The intensity function is not a PDF; it is independent of $\hbox{${\cal N}$}$ , as implicit in the iid variable hypothesis: the probability of obtaining a value equal to or larger than 5 when throwing one die is 2/6, independently of previous throws. This must not be confused with the case we studied in the previous paragraphs, which would be equivalent to the probability of obtaining at least one throw with a result equal to or larger than 5 in $\hbox{${\cal N}$}$ throws.

Fig. A.1

Intensity function μ(m) as a function of m for the IMF. The figure also shows the probability that m will be in the range (m_b, m_b + 1 M_⊙).

In Fig. A.1 we plot the intensity function for different values of m_b for the case of the IMF used in this work. The figure also shows the probability that a star known to have m ≥ m_b will be in the range [m_b, m_b + 1 M_⊙). The figure shows that μ(m_b) has a minimum at a value close to m_up, and it goes to infinity at m_up. The probability of m being in the range [m_b, m_b + 1 M_⊙) decreases with m_b, except for values close to m_up. For example, for m_b ≥ 10 M_⊙ there is a chance lower than 10% that a star known to be in the m_b − m_up range has a mass in the range [m_b, m_b + 1 M_⊙). The situation changes in the extreme case in which m_b is close to m_up: if we know that there is one star with mass m_up or larger, the mass must certainly be m_up (i.e., probability equal to 1), since stars with mass larger than m_up do not exist.

This has an interesting implication for the statement that $\hbox{$\hat{m}_\mathrm{max}$}$ actually provides the mass of the most massive star in the cluster: assuming that there is one star equal to or more massive than $\hbox{$\hat{m}_\mathrm{max}$}$ and that $\hbox{$\hat{m}_\mathrm{max} \ge 10~{M}_\odot$}$ and is not close to m_up, there is a probability larger than 90% that the most massive star is more massive than $\hbox{$\hat{m}_\mathrm{max}$}$ !
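The quantities discussed in this appendix can be evaluated with a few lines of Python. The sketch below assumes a single power-law IMF with slope `ALPHA` between `M_LOW` and `M_UP`; these parameters and all function names are assumptions introduced for illustration, so the numerical values it returns depend on them and need not coincide with those plotted in Fig. A.1.

```python
# Assumed illustrative parameters (not the IMF adopted in the paper).
ALPHA = 2.35   # single power-law slope
M_LOW = 0.1    # lower mass limit, solar masses
M_UP = 150.0   # upper mass limit, solar masses


def pdf(m):
    """Normalized power-law PDF phi(m) on [M_LOW, M_UP]."""
    e = 1.0 - ALPHA
    norm = -e / (M_LOW ** e - M_UP ** e)
    return norm * m ** (-ALPHA)


def cdf(m):
    """p(m' < m): cumulative probability below mass m."""
    e = 1.0 - ALPHA
    return (M_LOW ** e - m ** e) / (M_LOW ** e - M_UP ** e)


def intensity(m_b):
    """Intensity function mu(m_b) = phi(m_b) / (1 - p(m < m_b)), Eq. (A.1)."""
    return pdf(m_b) / (1.0 - cdf(m_b))


def prob_within(m_b, width=1.0):
    """P(m_b <= m < m_b + width | m >= m_b), truncated at M_UP."""
    upper = min(m_b + width, M_UP)
    return (cdf(upper) - cdf(m_b)) / (1.0 - cdf(m_b))


if __name__ == "__main__":
    for m_b in (1.0, 10.0, 50.0, 100.0, 149.5):
        print(m_b, intensity(m_b), prob_within(m_b))
```

Because 1 − p(m < m_b) ≤ 1, `intensity` is never smaller than `pdf`, and it grows without bound as m_b approaches `M_UP`, where `prob_within` reaches 1.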

All Figures

Fig. 1

IMF used in the present work (solid line), as in the parametrization by Kroupa (2001, 2002) and Weidner & Kroupa (2006). Being a PDF, it can have values larger than one; the probabilities are given by the integral over the PDF. We also plot the probability that a star has a mass in the (m, m + 1 M_⊙) range, which is lower than one (dashed line). This probability declines rapidly when m is larger than m_up − 1 M_⊙.

Fig. 2

Distribution of the maximum stellar mass, $\hbox{$\Phi_{m_\mathrm{max}}({m_\mathrm{max}}|{\cal N})$}$ , for different values of $\hbox{${\cal N}$}$ . The circle on each curve is the position of the characteristic value $\hbox{$\hat{m}_\mathrm{max}$}$ .

Fig. 3

Percentile analysis around the median of $\hbox{$\Phi_{m_\mathrm{max}}({m_\mathrm{max}}|{\cal N})$}$ as a function of $\hbox{${\cal N}$}$ (shaded areas). The figure includes as a reference the position of the characteristic value, median, mean, and mode of the distribution. Small triangles: compilation by Weidner et al. (2010) of observational values of m_max and inferred values of $\hbox{$\cal N$}$ obtained from observations; squares: observed values of $\hbox{$\cal N$}$ and m_max from Kirk & Myers (2011); stars: observed values of $\hbox{$\cal N$}$ and m_max in the field for the four observed regions from Kirk & Myers (2011).

Fig. 4

Confidence interval analysis of $\hbox{$\Phi_{m_\mathrm{max}}({m_\mathrm{max}}|{\cal N})$}$ as a function of $\hbox{${\cal N}$}$ (shaded area). Lines and symbols have the same meaning as in Fig. 3.

Fig. 5

Confidence interval analysis of $\hbox{$\Phi_{\cal N}({\cal N}|m_\mathrm{max})$}$ as a function of m_max for a $\hbox{$\Phi_{\cal N}({\cal N}) = \mathrm{constant}$}$ . Symbols have the same meaning as in Fig. 3.

Fig. 6

Confidence interval analysis of $\hbox{$\Phi_{\cal N}({\cal N}|m_\mathrm{max})$}$ as a function of m_max for a $\hbox{$\Phi_{\cal N}({\cal N}) \propto {\cal N}^{-2}$}$ . Arrows: data points by Weidner et al. (2010) using $\hbox{${\cal N}_\mathrm{obs}$}$ without correction of incompleteness due to unobserved stars. Other symbols have the same meaning as in Fig. 3.

Fig. 7

3D representation of the $\hbox{$\log \Phi_{m_\mathrm{max},{\cal N}}(m_\mathrm{max}, {\cal N})$}$ distribution for a $\hbox{$\Phi_{\cal N}({\cal N}) \propto {\cal N}^{-2}$}$ .

Fig. 8

ℳ − m_max relationship resulting from the analytical formulation of the IMF of García-Vargas & Díaz (1994); García-Vargas et al. (1995). The figure includes data points from Weidner et al. (2010) and Kirk & Myers (2011), where symbols have the same meaning as in Fig. 3 and the result of two linear fits to the data from Weidner et al. (2010) and Kirk & Myers (2011) using either log ℳ or log m_max as the independent variable.


Fig. 9

ℳ − m_max relationship resulting from the distribution function formulation of the IMF of Elmegreen (1997, 1999, 2000), the formulation of Weidner & Kroupa (2004, 2006), and the optimal sampling formulation of Kroupa et al. (2011). The figure includes data points from Weidner et al. (2010) and Kirk & Myers (2011) and the result of the linear fit of the data to log ℳ as a function of log m_max.



[1] Bastian, N., Covey, K. R., & Meyer, M. R. 2010, ARA&A, 48, 339

[2] Bonnell, I. A., Bate, M. R., & Vine, S. G. 2003, MNRAS, 343, 413

[3] Bonnell, I. A., Vine, S. G., & Bate, M. R. 2004, MNRAS, 349, 735

[4] Cerviño, M., & Luridiana, V. 2006, A&A, 451, 475

[5] Cerviño, M., Valls-Gabaud, D., Luridiana, V., & Mas-Hesse, J. M. 2002, A&A, 381, 51

[6] Cerviño, M., Pérez, E., Sánchez, N., Román-Zúñiga, C., & Valls-Gabaud, D. 2011, UP2010: Have Observations Revealed a Variable Upper End of the Initial Mass Function? eds. M. Treyer et al. (San Francisco, CA: ASP), ASP Conf. Proc., 440, 133

[7] Cerviño, M., Román-Zúñiga, C., Bayo, A., et al. 2013, A&A, 553, A32

[8] Corbelli, E., Verley, S., Elmegreen, B. G., & Giovanardi, C. 2009, A&A, 495, 479

[9] Crowther, P. A., Schnurr, O., Hirschi, R., et al. 2010, MNRAS, 408, 731

[10] D’Agostino, R. B., & Stephens, M. A. 1986, Goodness-of-Fit Techniques (New York: Marcel Dekker)

[11] Eldridge, J. J. 2012, MNRAS, 422, 794

[12] Elmegreen, B. G. 1997, ApJ, 486, 944

[13] Elmegreen, B. G. 1999, ApJ, 515, 323

[14] Elmegreen, B. G. 2000, ApJ, 539, 342

[15] Elmegreen, B. G. 2006, ApJ, 486, 944

[16] Elmegreen, B. G. 2011, ApJ, 731, 61

[17] Fernández-Soto, A., Lanzetta, K. M., Chen, H.-W., Levine, B., & Yahata, N. 2002, MNRAS, 330, 889

[18] Fouesneau, M., & Lançon, A. 2010, A&A, 521, L22

[19] Fumagalli, M., da Silva, R. L., & Krumholz, M. R. 2011, ApJ, 741, L26

[20] García-Vargas, M. L., & Díaz, A. I. 1994, ApJS, 91, 553

[21] García-Vargas, M. L., Bressan, A., & Díaz, A. I. 1995, A&AS, 112, 13

[22] Gumbel, E. J. 1958, Statistics of Extremes (Columbia University Press)

[23] Haas, M. R., & Anders, P. 2010, A&A, 512, 79

[24] Kendall, M., & Stuart, A. 1977, The advanced theory of statistics (London: Griffin), 4th edn.

[25] Kennicutt, R. C., Jr. 1998, ARA&A, 36, 189

[26] Kirk, H., & Myers, P. C. 2011, ApJ, 727, 64

[27] Kroupa, P. 2001, MNRAS, 322, 231

[28] Kroupa, P. 2002, Science, 295, 82

[29] Kroupa, P., & Weidner, C. 2003, ApJ, 598, 1076

[30] Kroupa, P., Weidner, C., Pflamm-Altenburg, J., et al. 2011 [arXiv:1112.3340]

[31] Larson, R. B. 1982, MNRAS, 200, 159

[32] Larson, R. B. 2003, Galactic Star Formation Across the Stellar Mass Spectrum, eds. J. M. De Buizer, & N. S. van der Bliek (San Francisco: ASP), ASP Conf. Ser., 287, 65

[33] Maíz Apellániz, J., & Úbeda, L. 2005, ApJ, 629, 873

[34] Maschberger, T., & Clarke, C. J. 2008, MNRAS, 391, 711

[35] Oey, M. S., & Clarke, C. J. 2005, ApJ, 620, L43

[36] Parker, R. J., & Goodwin, S. P. 2007, MNRAS, 380, 1271

[37] Pflamm-Altenburg, J., & Kroupa, P. 2008, Nature, 455, 641

[38] Piskunov, A. E., Kharchenko, N. V., Schilbach, E., et al. 2011, A&A, 525, 122

[39] Reddish, V. C. 1978, International Series in Natural Philosophy (Oxford: Pergamon)

[40] Renzini, A., & Buzzoni, A. 1986, Spectral Evolution of Galaxies, Astrophysics and Space Science Library, 122, 195

[41] Salpeter, E. E. 1955, ApJ, 121, 161

[42] Sánchez, N., Alfaro, E. J., & Pérez, E. 2006, ApJ, 641, 347

[43] Scalo, J. M. 1986, Fund. Cosm. Phys., 11, 1

[44] Schmidt, M. 1959, ApJ, 129, 243

[45] Schmidt, M. 1963, ApJ, 137, 758

[46] Selman, F. J., & Melnick, J. 2008, ApJ, 689, 816

[47] Shadmehri, M., & Elmegreen, B. G. 2011, MNRAS, 410, 788

[48] da Silva, R. L., Fumagalli, M., & Krumholz, M. 2012, ApJ, 745, 145

[49] Sornette, D. 2004, Critical phenomena in natural sciences: chaos, fractals, selforganization and disorder: concepts and tools, Springer series in synergetics (Heidelberg: Springer)

[50] Treyer, M., Wyder, T., Neill, J., Seibert, M., & Lee, J. 2011, UP2010: Have Observations Revealed a Variable Upper End of the Initial Mass Function? ASP Conf. Proc., 440

[51] Tinsley, B. 1980, Fund. Cosm. Phys., 5, 287

[52] van Albada, T. S. 1968, Bull. Astron. Inst. Netherlands, 20, 57