A&A
Volume 497, Number 3, April III 2009
Page(s) 991 - 1007
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/200810824
Published online 09 February 2009

A Markov Chain Monte Carlo technique to sample transport and source parameters of Galactic cosmic rays

I. Method and results for the Leaky-Box model

A. Putze1 - L. Derome1 - D. Maurin2 - L. Perotto1,3 - R. Taillet4,5

1 - Laboratoire de Physique Subatomique et de Cosmologie LPSC, 53 avenue des Martyrs, 38026 Grenoble, France
2 - Laboratoire de Physique Nucléaire et des Hautes Énergies LPNHE, Tour 33, Jussieu, 75005 Paris, France
3 - Laboratoire de l'accélérateur linéaire LAL, Université Paris-Sud 11, Bâtiment 200, BP 34, 91898 Orsay Cedex, France
4 - Laboratoire de Physique Théorique LAPTH, Chemin de Bellevue BP 110, 74941 Annecy-le-Vieux, France
5 - Université de Savoie, 73011 Chambéry, France

Received 18 August 2008 / Accepted 7 December 2008

Abstract
Context. The propagation of charged cosmic rays in the Galaxy depends on the transport parameters, whose number can be large depending on the propagation model under scrutiny. A standard approach for determining these parameters is a manual scan, leading to an inefficient and incomplete coverage of the parameter space.
Aims. In analyzing the data from forthcoming experiments, a more sophisticated strategy is required. An automated statistical tool is used, which enables a full coverage of the parameter space and provides a sound determination of the transport and source parameters. The uncertainties in these parameters are also derived.
Methods. We implement a Markov Chain Monte Carlo (MCMC), which is well suited to multi-parameter determination. Its specificities (burn-in length, acceptance, and correlation length) are discussed in the context of cosmic-ray physics. Its capabilities and performances are explored in the phenomenologically well-understood Leaky-Box Model.
Results. From a technical point of view, a trial function based on binary-space partitioning is found to be extremely efficient, allowing a simultaneous determination of up to nine parameters, including transport and source parameters, such as slope and abundances. Our best-fit model includes both a low-energy cut-off and reacceleration, whose values are consistent with those found in diffusion models. A Kolmogorov spectrum for the diffusion slope ( $\delta=1/3$) is excluded. The marginalised probability-density functions for $\delta $ and $\alpha $ (the slope of the source spectra) are $\delta\approx 0.55{-}0.60$ and $\alpha \approx 2.14{-}2.17$, depending on the dataset used and the number of free parameters in the fit. All source-spectrum parameters (slope and abundances) are positively correlated among themselves and with the reacceleration strength, but are negatively correlated with the other propagation parameters.
Conclusions. The MCMC is a practical and powerful tool for cosmic-ray physics analyses. It can be used to test hypotheses concerning source spectra (e.g., whether $\alpha_i\neq \alpha_j$) and/or to determine whether different datasets are compatible. A forthcoming study will extend our analysis to more physical diffusion models.

Key words: diffusion - methods: statistical - ISM: cosmic rays

1 Introduction

A central issue in cosmic-ray (CR) physics is the determination of the transport parameters in the Galaxy. This determination is based on the analysis of secondary-to-primary ratios (e.g., B/C, sub-Fe/Fe), for which the dependence on the source spectra is negligible, so that the ratio remains mainly sensitive to the propagation processes (e.g., Maurin et al. 2001, and references therein). For almost 20 years, the determination of these parameters relied mostly on the most constraining data, namely the HEAO-3 data, taken in 1979, which covered the $\sim $1-35 GeV/n range (Engelmann et al. 1990).

For the first time since HEAO-3, several satellite or balloon-borne experiments (see the ICRC 2007 reporter's talk, Blasi 2008) have acquired higher-quality data in the same energy range or covered a scarcely explored range (in terms of energy, 1 TeV/n-PeV/n, or in terms of nucleus): on the balloon-borne side, the ATIC collaboration has presented the B/C ratio at 0.5-50 GeV/n (Panov et al. 2007) and H to Fe fluxes at 100 GeV-100 TeV (Panov et al. 2006). At higher energy, two long-duration balloon flights will soon provide spectra for Z=1-30 nuclei. The TRACER collaboration has published spectra from oxygen up to iron in the GeV/n-TeV/n range (Ave et al. 2008; Boyle et al. 2007). A second long-duration flight took place in summer 2006, during which the instrument was designed to have a wider dynamic-range capability and to measure the lighter B, C, and N elements. The CREAM experiment (Seo et al. 2004) flew for a cumulative duration of 70 days in December 2004 and December 2005 (Seo et al. 2006; preliminary results in Marrocchesi et al. 2006 and Wakely et al. 2006), and again in December 2007. A fourth flight was scheduled for December 2008[*]. Exciting data will also arrive from the PAMELA satellite (Picozza et al. 2007), which was successfully launched in June 2006 (Casolino et al. 2008).

With this wealth of new data, it is relevant to question the method used to extract the propagation parameters. The values of these parameters are important to many theoretical and astrophysical questions, because they are linked, amongst others, to transport in turbulent magnetic fields, the sources of CRs, and the $\gamma$-ray diffuse emission (see Strong et al. 2007, for a recent review and references). They also prove crucial for indirect dark-matter detection studies (e.g., Donato et al. 2004; Delahaye et al. 2008). Past practice has mostly been based on a manual or semi-automated - hence partial - coverage of the parameter space (e.g., Webber et al. 1992; Strong & Moskalenko 1998; Jones et al. 2001). More complete scans were performed in Maurin et al. (2001,2002) and Lionetto et al. (2005), although in an inefficient manner: the addition of a single new free parameter (as done, for example, in Maurin et al. 2002 compared to Maurin et al. 2001) remains prohibitive in terms of computing time. To remedy these shortcomings, we propose to use the Markov Chain Monte Carlo (MCMC) algorithm, which is widely used in cosmological parameter estimation (e.g., Christensen et al. 2001; Lewis & Bridle 2002; Dunkley et al. 2005). One goal of this paper is to establish whether the MCMC algorithm can provide similar benefits in CR physics.

The analysis is performed in the framework of the Leaky-Box Model (LBM), a simple and widely used propagation model. This model contains most of the CR phenomenology and is well adapted to a first implementation of the MCMC tool. In Sect. 2, we highlight the appropriateness of the MCMC compared to other algorithms used in the field. In Sect. 3, the MCMC algorithm is presented. In Sect. 4, this algorithm is implemented in the LBM. In Sect. 5, we discuss the MCMC advantages and effectiveness in the field of CR physics, and present results for the LBM. We present our conclusions in Sect. 6. Application of the MCMC technique to a more up-to-date modelling, such as diffusion models, is left to a forthcoming paper.

2 Link between the MCMC, the CR data, and the model parameters

Various models describe the propagation of CRs in the interstellar medium (Evoli et al. 2008; Berezhko et al. 2003; Maurin et al. 2001; Strong & Moskalenko 1998; Shibata et al. 2006; Webber et al. 1992; Bloemen et al. 1993). Each model is based on its own specific geometry and has its own set of parameters characterising the properties of the Galaxy. The MCMC approach aims to study quantitatively how existing (or future) CR measurements can constrain these models or, equivalently, how, in such models, the set of parameters (and their uncertainties) can be inferred from the data.

In practice, a given set of parameters in a propagation model implies, e.g., a given B/C ratio. The model parameters are constrained so as to reproduce the measured ratio. The standard practice used to be a visual inspection of the goodness of fit to the data. This has been replaced by the $\chi ^2$ analysis in recent papers: assuming the $\chi ^2$ statistic is applicable to the problem at hand, confidence intervals in these parameters can be extracted (see Appendix A).

The main drawback of this approach is the computing time required to extend the calculation of the $\chi ^2$ surface to a wider parameter space. This is known as the curse of dimensionality, due to the exponential increase in volume associated with adding extra dimensions to the parameter space, while the good regions of this space (for instance where the model fits the data) only fill a tiny volume. This is where the MCMC approach, based on the Bayesian statistics, is superior to a grid approach. As in the grid approach, one end-product of the analysis is the $\chi ^2$ surface, but with a more efficient sampling of the region of interest. Moreover, as opposed to classical statistics, which is based on the construction of estimators of the parameters, Bayesian statistics assumes the unknown parameters to be random variables. As such, their full distribution - the so-called conditional probability-density function (PDF) - given some experimental data (and some prior density for these parameters, see below) can be generated.

To summarise, the MCMC algorithm provides the PDF of the model parameters, based on selected experimental data (e.g., B/C). The mean value and uncertainty in these parameters are by-products of the PDF. The MCMC enables the enlargement of the parameter space at minimal computing-time cost (although the MCMC and Metropolis-Hastings algorithms used here are not the most efficient ones). The technicalities of the MCMC are briefly described below. The reader is referred to Neal (1993) and MacKay (2003) for a more substantial coverage of the subject.

3 Markov Chain Monte Carlo (MCMC)

Considering a model depending on m parameters

\begin{displaymath}\vec{\theta}\equiv \{ \theta^{(1)},~\theta^{(2)}, ~ \ldots, ~\theta^{(m)}\},
\end{displaymath} (1)

we aim to determine the conditional PDF of the parameters given the data, $P (\vec{\theta}\vert{\rm data})$. This so-called posterior probability quantifies the change in the degree of belief one can have in the m parameters of the model in the light of the data. Applied to parameter inference, Bayes' theorem is

\begin{displaymath}P (\vec{\theta}\vert{\rm data}) = \frac{P ({\rm data}\vert\vec{\theta})
\cdot P (\vec{\theta})}{P ({\rm data})},
\end{displaymath} (2)

where $P({\rm data})$ is the probability of the data (it does not depend on the parameters and can hence be treated as a normalisation factor). This theorem links the posterior probability to the likelihood of the data, ${\cal L}(\vec{\theta})\equiv P ({\rm data}\vert\vec{\theta})$, and the so-called prior probability, $P (\vec{\theta})$, which encodes the degree of belief one has before observing the data. To extract information about a single parameter, $\theta^{(\alpha)}$, the posterior density is integrated over all other parameters $\theta^{(k \neq \alpha)}$, a procedure called marginalisation. Finally, by further integrating the individual posterior PDF, we can determine the expectation value, confidence levels, or higher-order moments of the parameter $\theta^{(\alpha)}$. This illustrates the technical difficulty of Bayesian parameter estimation: determining the individual posterior PDF requires a high-dimensional integration of the overall posterior density. An efficient sampling method for the posterior PDF is therefore mandatory. For models with more than a few parameters, regular grid-sampling approaches are not applicable and statistical techniques are required (Cowan 1997).

Among these techniques, MCMC algorithms have been extensively tried and tested for Bayesian parameter inference (Neal 1993; MacKay 2003). MCMC methods explore any target distribution $p (\vec{\theta})$ of a vector of parameters $\vec{\theta}$ by generating a sequence of n points (hereafter a chain)

\begin{displaymath}\{\vec{\theta}_i\}_{i=1, \ldots, n}\equiv \{ \vec{\theta}_1,~\vec{\theta}_2, ~ \ldots, ~\vec{\theta}_n\}.
\end{displaymath} (3)

Each $\vec{\theta}_i$ is a vector of m components (as defined in Eq. (1)). In addition, the chain is Markovian in the sense that the distribution of $\vec{\theta}_{n+1}$ depends only on the value of $\vec{\theta}_n$. MCMC algorithms are designed so that the time spent by the Markov chain in a region of the parameter space is proportional to the target PDF value in that region. Hence, from such a chain, one can obtain an independent sampling of the PDF. The target PDF, as well as all marginalised PDFs, is estimated by counting the number of samples within the relevant region of parameter space.

Below, we provide a brief introduction to an MCMC using the Metropolis-Hastings algorithm (see Neal 1993; MacKay 2003, Chap. 29, for further details and references).

3.1 The algorithm

The prescription that we use to generate the Markov chains from the unknown target distribution is the so-called Metropolis-Hastings algorithm. The Markov chain grows by jumping from the current point in the parameter space, $\vec{\theta}_i$, to the next point, $\vec{\theta}_{i+1}$. As stated above, the PDF of the new point depends only on the current point, i.e. $\mathcal{T}(\vec{\theta}_{i+1}\vert\vec{\theta}_1, \ldots, \vec{\theta}_i) = \mathcal{T}(\vec{\theta}_{i+1}\vert\vec{\theta}_i)$. This quantity defines the transition probability to the state $\vec{\theta}_{i+1}$ from the state $\vec{\theta}_i$. The Metropolis-Hastings algorithm specifies $\mathcal{T}$ so as to ensure that the stationary distribution of the chain asymptotically tends to the target PDF one wishes to sample from.

At each step i (corresponding to a state $\vec{\theta}_i$), a trial state $\vec{\theta}_{{\rm trial}}$ is generated from a proposal density $q(\vec{\theta}_{{\rm trial}}\vert \vec{\theta}_i)$. This proposal density is chosen so that samples can be easily generated (e.g., a Gaussian distribution centred on the current state). The state $\vec{\theta}_{{\rm trial}}$ is accepted or rejected depending on the following criterion. By forming the quantity

\begin{displaymath}a(\vec{\theta}_{\rm trial}\vert\vec{\theta}_i) =
\min \left(1,~\frac{p(\vec{\theta}_{\rm trial})\,q(\vec{\theta}_i\vert \vec{\theta}_{\rm trial})}
{p(\vec{\theta}_i)\,q(\vec{\theta}_{{\rm trial}}\vert \vec{\theta}_i)}\right),
\end{displaymath} (4)

the trial state is accepted as a new state with a probability a (rejected with probability 1-a). The transition probability is then

\begin{displaymath}\mathcal{T}(\vec{\theta}_{i+1}\vert\vec{\theta}_i) =
a(\vec{\theta}_{\rm trial}\vert\vec{\theta}_i)\,q(\vec{\theta}_{{\rm trial}}\vert \vec{\theta}_i).
\end{displaymath} (5)

If accepted, $\vec{\theta}_{i+1} = \vec{\theta}_{{\rm trial}}$, whereas if rejected, the new state is equal to the current state, $\vec{\theta}_{i+1} = \vec{\theta}_i$. This criterion ensures that, once at equilibrium, the chain samples the target distribution $p (\vec{\theta})$. If the proposal density $q(\vec{\theta}_{{\rm trial}}\vert \vec{\theta}_i)$ is chosen to be symmetric, it cancels out in the expression of the acceptance probability, which becomes:

\begin{displaymath}a = \min \left(1,~\frac{p (\vec{\theta}_{{\rm trial}})}
{p (\vec{\theta}_i)}\right)\cdot
\end{displaymath} (6)

We note that the process requires only evaluations of ratios of the target PDF. This is a major virtue of this algorithm, in particular for Bayesian applications, in which the normalisation factor in Eq. (2), $P({\rm data}) = \int P({\rm data}\vert\vec{\theta}) \cdot P(\vec{\theta}) {\rm d} \vec{\theta}$ is often extremely difficult to compute. Hence, the ratio of the target PDF, i.e. the posterior of the parameter for our problem, can be calculated directly from the likelihood of the data and the priors.
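The acceptance rule above can be sketched in a few lines of code. The following is a minimal, illustrative Python implementation (not the code used in this work) that samples a one-dimensional unit Gaussian target with a symmetric Gaussian proposal, so that Eq. (6) applies; all names and numerical values are our own choices.

```python
import numpy as np

def metropolis_hastings(log_p, theta0, n_steps, step_size, rng):
    """Return a chain sampling exp(log_p) with a symmetric Gaussian proposal."""
    chain = np.empty(n_steps)
    theta, logp = theta0, log_p(theta0)
    for i in range(n_steps):
        trial = theta + step_size * rng.standard_normal()  # symmetric q
        logp_trial = log_p(trial)
        # a = min(1, p(trial)/p(theta)), compared in log space (Eq. (6))
        if np.log(rng.random()) < logp_trial - logp:
            theta, logp = trial, logp_trial   # accept the trial state
        chain[i] = theta                      # on rejection, repeat the state
    return chain

rng = np.random.default_rng(42)
# Toy target: a unit Gaussian, log p = -theta^2/2 up to a constant.
chain = metropolis_hastings(lambda t: -0.5 * t**2, 0.0, 20000, 1.0, rng)
# After equilibration, the chain mean and standard deviation approach 0 and 1.
```

Note that only differences of log-target values enter the accept/reject test, mirroring the fact that the normalisation $P({\rm data})$ never needs to be computed.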

3.2 Chain analysis

The chain analysis refers to the study of several properties of the chains. The following quantities are inspected in order to convert the chains into PDFs.

Burn-in length

The burn-in describes the practice of removing a number of iterations at the beginning of the chain to eliminate the transient time needed to reach the equilibrium, or stationary, distribution, i.e., to forget the starting point. The burn-in length b is defined as the number of initial samples $\{\vec{\theta}_i\}_{i=1, \ldots, b}$ of the chain that must be discarded. The stationary distribution is reached when the chain enters the most probable parameter region, where the target function is close to its maximal value. To estimate b, the following criterion is used: we define $p_{1/2}$ to be the median of the target-function values obtained over the entire chain of N samples. The burn-in length b then corresponds to the first sample $\vec{\theta}_b$ for which $p(\vec{\theta}_b) > p_{1/2}$ (see Appendix C for an illustration).
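The median criterion can be sketched as follows; this is an illustrative Python fragment with invented target values, not the paper's code.

```python
import numpy as np

def burn_in_length(target_values):
    """b = index of the first sample whose target value exceeds the median."""
    p_half = np.median(target_values)
    above = np.nonzero(target_values > p_half)[0]
    return int(above[0]) if above.size else len(target_values)

# Toy trace of target values p(theta_i): low while the chain approaches the
# mode, then fluctuating near the maximum.
p_chain = np.array([0.01, 0.02, 0.05, 0.6, 0.8, 0.7, 0.9, 0.75])
b = burn_in_length(p_chain)   # first index with p > median
```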

Correlation length

By construction (see Eq. (5)), each step of the chain depends on the previous one, so that successive steps of the chain are correlated. We can obtain independent samples by thinning the chain, i.e. by selecting only a fraction of the steps, with a period chosen so as to retain uncorrelated samples. This period is estimated by computing the autocorrelation function of each parameter. For a parameter $\theta^{(\alpha)}$ ( $\alpha = 1, \ldots, m$), the autocorrelation function is given by

\begin{displaymath}c_j^{(\alpha)} = \frac{E\left[\theta_i^{(\alpha)} \theta_{i+j}^{(\alpha)}\right] - \left(E\left[\theta_i^{(\alpha)}\right]\right)^2}
{E\left[\left(\theta_i^{(\alpha)}\right)^2\right] - \left(E\left[\theta_i^{(\alpha)}\right]\right)^2},
\end{displaymath} (7)

which we calculate with the Fast Fourier Transform (FFT). The correlation length $l^{(\alpha)}$ for the $\alpha $th parameter is defined as the smallest j for which $c_j^{(\alpha)}<1/2$; for this separation and beyond, the values $\theta^{(\alpha)}_i$ and $\theta^{(\alpha)}_{i+j}$ of the chain are considered to be uncorrelated. The correlation length l of the chain, accounting for all parameters, is defined to be

\begin{displaymath}l \equiv \max_{\alpha = 1, \ldots, m} l^{(\alpha)},
\end{displaymath} (8)

which is used as the period of the thinning (see Appendix C, for an illustration).
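As an illustration of the thinning procedure, the following Python sketch (our own; the normalised autocorrelation is computed by FFT, and an AR(1) process stands in for a correlated parameter trace) estimates the correlation length as the smallest lag with $c_j < 1/2$.

```python
import numpy as np

def autocorr_fft(x):
    """Normalised autocorrelation of a 1-D series, computed with an FFT."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    f = np.fft.rfft(x, 2 * n)                # zero-pad to avoid wrap-around
    acf = np.fft.irfft(f * np.conj(f))[:n]
    return acf / acf[0]                      # c_0 = 1

def correlation_length(x):
    """Smallest lag j with c_j < 1/2 (the thinning period)."""
    c = autocorr_fft(x)
    below = np.nonzero(c < 0.5)[0]
    return int(below[0]) if below.size else len(x)

rng = np.random.default_rng(1)
# AR(1) chain with coefficient 0.9: expected c_j = 0.9**j, which
# crosses 1/2 between lags 6 and 7.
x = np.empty(50000)
x[0] = 0.0
for i in range(1, len(x)):
    x[i] = 0.9 * x[i - 1] + rng.standard_normal()
l = correlation_length(x)
```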

Independent samples and acceptance

The independent samples of the chain are chosen to be $\{\vec{\theta}_i\}_{i=b+l k}$, where k is an integer. The number of independent samples $N_{{\rm ind}}$ is the number of steps remaining after discarding the burn-in steps and thinning the chain,

\begin{displaymath}N_{{\rm ind}}=\frac{N_{{\rm tot}}-b}{l}\cdot
\end{displaymath} (9)

The independent acceptance $f_{{\rm ind}}$ is the ratio of the number of independent samples $N_{{\rm ind}}$ to the total step number $N_{{\rm tot}}$,

\begin{displaymath}f_{{\rm ind}} = \frac{N_{{\rm ind}}}{N_{{\rm tot}}}\cdot
\end{displaymath} (10)
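Eqs. (9) and (10) amount to simple bookkeeping; a hypothetical example with an invented chain length, burn-in, and correlation length:

```python
def independent_samples(n_tot, b, l):
    """Eqs. (9) and (10): number of independent samples and acceptance."""
    n_ind = (n_tot - b) // l     # Eq. (9), kept as an integer count
    f_ind = n_ind / n_tot        # Eq. (10)
    return n_ind, f_ind

# A chain of 10 000 steps with burn-in b = 1000 and correlation length l = 30
# yields 300 independent samples, i.e. an independent acceptance of 3%.
n_ind, f_ind = independent_samples(10_000, 1000, 30)
```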

3.3 Choice of the target and trial functions

3.3.1 Target function

As already said, we wish to sample the target function $p(\vec{\theta})= P (\vec{\theta}\vert{\rm data})$. Using Eq. (2) and the fact that the algorithm is insensitive to the normalisation factor, this amounts to sampling the product $P ({\rm data}\vert\vec{\theta}) \cdot P (\vec{\theta})$. Assuming a flat prior $P (\vec{\theta})={\rm const}.$, the target distribution reduces to

\begin{displaymath}p (\vec{\theta})= P ({\rm data}\vert\vec{\theta})\equiv {\cal L}(\vec{\theta}),
\end{displaymath} (11)

and here, the likelihood function is taken to be

\begin{displaymath}{\cal L} (\vec{\theta})= \exp{\left( -\frac{\chi^2(\vec{\theta})}{2}\right)}\cdot
\end{displaymath} (12)

The $\chi^2(\vec{\theta})$ function for $n_{\rm data}$ data is

\begin{displaymath}\chi^2 (\vec{\theta}) = \sum_{k = 1}^{n_{\rm data}} \frac{(y^{\rm exp}_k -
y^{\rm theo}_k (\vec{\theta}))^2}{\sigma_k^2},
\end{displaymath} (13)

where $y^{\rm exp}_k$ is the measured value, $y^{\rm theo}_k$ is the value predicted for a given model and set of parameters $\vec{\theta}$, and $\sigma_k$ is the known measurement uncertainty. For example, $y^{\rm exp}_k$ and $y^{\rm theo}_k$ may represent the measured and calculated B/C ratios.

The link between the target function, i.e., the posterior PDF of the parameters, and the experimental data is established with the help of Eqs. (11) to (13). This link guarantees the proper sampling of the parameter space using Markov chains, which spend more time in more relevant regions of parameter space, as described above.
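Under a flat prior, the quantity evaluated at each step is thus the log-likelihood $-\chi^2/2$. A minimal Python sketch of Eqs. (11)-(13) follows; the `model` argument is a hypothetical stand-in for the full propagation calculation, and the data values are invented.

```python
import numpy as np

def log_likelihood(theta, y_exp, sigma, model):
    """log L = -chi^2/2 (Eqs. (12)-(13)) for data y_exp with uncertainties sigma."""
    y_theo = model(theta)
    chi2 = np.sum(((y_exp - y_theo) / sigma) ** 2)   # Eq. (13)
    return -0.5 * chi2                               # log of Eq. (12)

# Toy check: a model that matches the data exactly gives chi^2 = 0,
# i.e. log L = 0 and L = 1.
y_exp = np.array([0.30, 0.25, 0.20])     # invented "B/C-like" data points
sigma = np.array([0.02, 0.02, 0.02])
perfect = log_likelihood(None, y_exp, sigma, lambda t: y_exp.copy())
```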

3.3.2 Trial function

Despite the robustness of the Metropolis-Hastings algorithm, the trial functions should be as close as possible to the true distributions in order to optimise the efficiency of the MCMC and minimise the number of chains to be processed. We use a sequence of three trial functions to explore the parameter space. The first step is a coarse determination of the parameter PDFs. This allows us to calculate the covariance matrix, leading to a better coverage of the parameter space, provided that the target PDF is sufficiently close to an N-dimensional Gaussian. The last step takes advantage of a binary-space partitioning (BSP) algorithm.

Gaussian step

For the first iteration, the proposal density $q\left(\vec{\theta}_{\rm trial}, \vec{\theta}_i\right)$, required to obtain the trial value $\vec{\theta}_{{\rm trial}}$ from $\vec{\theta}_i$, is written as

\begin{displaymath}q \left(\vec{\theta}_{\rm trial}, \vec{\theta}_i \right) \propto
\prod_{\alpha=1}^{m}\exp{\left(-\frac{\left(\theta^{(\alpha)}_{\rm trial} - \theta^{(\alpha)}_i\right)^2}{2\sigma_\alpha^2}\right)}\cdot
\end{displaymath} (14)

This corresponds to m independent Gaussian distributions centred on $\vec{\theta}_i$. The distribution is symmetric, so that the acceptance probability a follows Eq. (6). The variance $\sigma_\alpha^2$ for each parameter $\alpha$ is to be specified. Each parameter $\theta_{\rm trial}^{(\alpha)}$ is hence calculated to be

\begin{displaymath}\theta_{\rm trial}^{(\alpha)} = \theta_i^{(\alpha)} + \sigma_\alpha\cdot x,
\end{displaymath}

where x is a random number obeying a Gaussian distribution centred on zero with unit variance.

It is important to choose an optimal width $\sigma_\alpha$to sample properly the posterior (target) distribution. If the width is too large, as soon as the chain reaches a region of high probability, most of the trial parameters fall into a region of low probability and are rejected, leading to a low acceptance and a long correlation length. Conversely, for too small a width, the chain will take a longer time to reach the interesting regions. Eventually, even if the chain reaches these regions of high acceptance, only a partial coverage of the PDF support will be sampled (also leading to a long correlation length).

In practice, we first set $\sigma_\alpha$ ( $\alpha = 1, \ldots, m$) equal to the expected range of the parameter. In a subsequent iteration, $\sigma_\alpha$ is set to $2 \sqrt{2 \ln 2}\approx 2.35$ times $\sigma_\alpha^{\rm calc}$, i.e. to the FWHM of the PDF obtained with the first iteration. The result is insensitive to the exact numerical factor used.
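The Gaussian step and the FWHM re-tuning can be sketched as follows (illustrative Python, assuming unit initial widths and three parameters; not the paper's code):

```python
import numpy as np

def gaussian_step(theta, sigma, rng):
    """theta_trial = theta_i + sigma_alpha * x, x a unit-variance normal deviate."""
    return theta + sigma * rng.standard_normal(theta.shape)

def retune_widths(chain):
    """Set the next widths to the FWHM (~2.35 sigma) of the sampled marginals."""
    return 2.0 * np.sqrt(2.0 * np.log(2.0)) * np.std(chain, axis=0)

rng = np.random.default_rng(0)
theta = np.zeros(3)
draws = np.array([gaussian_step(theta, np.ones(3), rng) for _ in range(5000)])
widths = retune_widths(draws)   # ~2.35 for these unit-variance marginals
```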

Covariance matrix

The proposal density is taken to be an N-dimensional Gaussian of covariance matrix V

\begin{displaymath}q(\vec{\theta}_{\rm trial}, \vec{\theta}_{i}) \propto
\exp{\left(-\frac{1}{2}\,(\vec{\theta}_{\rm trial} - \vec{\theta}_{i})^T V^{-1} (\vec{\theta}_{\rm trial} - \vec{\theta}_{i})\right)}.
\end{displaymath} (15)

The covariance matrix V is symmetric and can be diagonalised (D is the diagonal matrix of eigenvalues and P the change-of-basis matrix),

\begin{displaymath}V = P^T D P,
\end{displaymath}

and where again Eq. (6) holds. The parameters $\vec{\theta}_{{\rm trial}}$ are hence found to be

\begin{displaymath}\vec{\theta}_{\rm trial} = \vec{\theta}_i + P^T \sqrt{D}\; \vec{x},
\end{displaymath}

where $\vec{x}$ is a vector of m random numbers following a Gaussian distribution centred on zero and with unit variance.

The covariance matrix V is estimated, e.g., from a previous iteration using the Gaussian step. The advantage of this trial function with respect to the previous one is that it takes account of the possible correlations between the m parameters of the model.
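A sketch of this correlated proposal in Python (our own illustration, with an invented two-parameter covariance matrix): note that the square root of the eigenvalues is required so that the drawn steps reproduce the covariance V, which the example checks empirically.

```python
import numpy as np

def covariance_step(theta, V, rng):
    """Draw theta_trial = theta_i + P^T sqrt(D) x, with V = P^T D P."""
    eigval, eigvec = np.linalg.eigh(V)   # columns of eigvec span P^T
    step = eigvec @ (np.sqrt(eigval) * rng.standard_normal(len(theta)))
    return theta + step

rng = np.random.default_rng(3)
V = np.array([[1.0, 0.6],
              [0.6, 1.0]])              # toy correlated parameters
draws = np.array([covariance_step(np.zeros(2), V, rng) for _ in range(20000)])
V_emp = np.cov(draws.T)                 # empirical covariance -> approaches V
```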

Binary space partitioning (BSP)

A third method was developed, in which the results of the Gaussian-step or covariance-matrix iterations are used to subdivide the parameter space into boxes, to each of which a probability is assigned.

The partitioning of the parameter space can be organised using a binary-tree data structure known as a binary-space partitioning tree (de Berg et al. 2000). The root node of the tree is the m-dimensional box corresponding to the entire parameter space. The binary-space partitioning is then performed by dividing each box recursively into two child boxes, according to the following requirement: a box is divided only if the number of independent samples it contains exceeds a given threshold (here we used a threshold of between 0.1% and 3% of the total number of independent samples). When a box has to be divided, the division is made along its longest side (the box-side lengths being defined relative to the root-box sides). Each end node (i.e. a node without children) is assigned a probability, defined as the ratio of the number of independent samples in the box to their total number. Empty boxes are assigned a minimum probability, and all the probabilities are renormalised so that the sum over end nodes equals 1.

The proposal density $q(\vec{\theta}_{\rm {trial}})$ is then defined, in each end-node box, as a uniform function equal to the assigned probability. Sampling this proposal density is simple and efficient: an end node is chosen with its assigned probability, and the trial parameters are drawn uniformly within the corresponding box. In contrast to the other two proposal densities, the BSP-based proposal is asymmetric, because it depends only on the trial state, $q(\vec{\theta}_{\rm {trial}})$. Hence, Eq. (4) must be used.
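The box construction and sampling described above can be sketched as follows. This is a simplified Python illustration: the empty-box regularisation is omitted, and the splitting threshold, sample counts, and toy sample distribution are our own choices.

```python
import numpy as np

def build_bsp(samples, lo, hi, root_size, max_count):
    """Recursively split boxes holding more than max_count samples."""
    if len(samples) <= max_count:
        return {"lo": lo, "hi": hi, "n": len(samples)}   # end node
    # Split along the longest side, measured relative to the root box.
    axis = int(np.argmax((hi - lo) / root_size))
    mid = 0.5 * (lo[axis] + hi[axis])
    mask = samples[:, axis] < mid
    upper_of_low, lower_of_high = hi.copy(), lo.copy()
    upper_of_low[axis] = mid
    lower_of_high[axis] = mid
    return {"children": [
        build_bsp(samples[mask], lo, upper_of_low, root_size, max_count),
        build_bsp(samples[~mask], lower_of_high, hi, root_size, max_count),
    ]}

def leaves(node):
    """All end nodes of the tree."""
    if "children" in node:
        return leaves(node["children"][0]) + leaves(node["children"][1])
    return [node]

def sample_bsp(tree, n_total, rng):
    """Pick an end node with its assigned probability, then draw uniformly in it."""
    boxes = leaves(tree)
    probs = np.array([box["n"] for box in boxes], float) / n_total
    box = boxes[rng.choice(len(boxes), p=probs)]
    return rng.uniform(box["lo"], box["hi"])

rng = np.random.default_rng(7)
pts = rng.normal(0.5, 0.1, size=(2000, 2))   # clustered "independent samples"
lo, hi = np.zeros(2), np.ones(2)
tree = build_bsp(pts, lo, hi, hi - lo, max_count=60)   # split above 3% of 2000
trial = sample_bsp(tree, len(pts), rng)
```

By construction, trial points concentrate in the boxes where the previous iteration found many samples, i.e. where the target PDF is large.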

4 Implementation in the propagation model

The MCMC algorithm with the three trial functions described above is implemented in the USINE package[*], which computes the propagation of Galactic CR nuclei and anti-nuclei for several propagation models (LBM, 1D and 2D diffusion models). The reader is referred to Maurin et al. (2001) for a detailed description of the nuclear parameters (fragmentation and absorption cross-sections), energy losses (ionisation and Coulomb), and solar modulation (force-field) used.

We briefly describe how the MCMC algorithm is implemented in the propagation code (Sect. 4.1), using an LBM - the procedure would be similar for any other model. The LBM and its parameters are briefly discussed (Sect. 4.2), as are the input spectrum parameters (Sect. 4.3). Additional information about the data is gathered in Appendix B.

4.1 Flow chart

A flow chart of the Metropolis-Hastings MCMC algorithm used in the context of GCRs is given in Fig. 1.

Figure 1:

Flow chart of the implemented MCMC algorithm: $\vec{\theta}_i$ is a vector (Eq. (1)) of the $\alpha = 1, \ldots, m$ free parameters of the model, evaluated at each step i, and $p (\vec{\theta}_i)$ is the target function given by Eq. (11). See text for details.

To summarise, the initial values of the propagation parameters $\vec{\theta}_{0}$ are chosen randomly within their expected ranges to start each Markov chain. The interstellar (IS) CR fluxes are then calculated for this set of parameters (see, e.g., Fig. 1 in Maurin et al. 2002, for further details of the propagation steps). The IS flux is modulated with the force-field approximation, and the resulting top-of-atmosphere (TOA) spectrum is compared with the data, which allows us to calculate the $\chi ^2$ value (Eq. (13)), hence the likelihood (Eq. (12)). This likelihood (in practice the log-likelihood) is used to compute the acceptance probability (Eq. (4)) of the trial vector of parameters $\vec{\theta}_{{\rm trial}}$ (generated by one of the three trial functions described in Sect. 3.3.2). If the trial vector is accepted, $\vec{\theta}_{1}=\vec{\theta}_{\rm trial}$; otherwise, $\vec{\theta}_{1}=\vec{\theta}_{0}$. This procedure is repeated for the N steps of the chain. Obviously, when $\vec{\theta}_{i+1} = \vec{\theta}_i$, the propagation step does not need to be repeated. Because of the nature of the MCMC algorithm, several chains can be executed in parallel. Once completed, these chains are analysed (see Sect. 3.2) - the first b steps belonging to the burn-in length are discarded and the chains are thinned according to the correlation length l (Eq. (8)) - and combined to recover the desired posterior PDF $P (\vec{\theta}\vert{\rm data})$.
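The post-processing of the parallel chains (burn-in removal, thinning, combination) reduces to slicing and concatenation; a schematic Python example with invented burn-in and correlation lengths:

```python
import numpy as np

def combine_chains(chains, burn_ins, corr_lengths):
    """Discard each chain's burn-in, thin by its correlation length, concatenate."""
    kept = [c[b::l] for c, b, l in zip(chains, burn_ins, corr_lengths)]
    return np.concatenate(kept)

# Two toy "chains" whose samples are just their step indices.
chains = [np.arange(100.0), np.arange(100.0)]
samples = combine_chains(chains, burn_ins=[10, 20], corr_lengths=[5, 10])
# Chain 1 keeps steps 10, 15, ..., 95 (18 samples);
# chain 2 keeps steps 20, 30, ..., 90 (8 samples).
```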

In this procedure, the user must decide i) the data to be used; ii) the observable to retain in calculating the likelihood; and iii) the number of free parameters m (of the vector  $\vec{\theta}$) for which we seek the posterior PDF.

4.2 Leaky-Box Model (LBM)

The LBM assumes that all CR species are confined within the Galaxy, with an escape rate equal to $N/\tau_{\rm esc}$, where the escape time $\tau_{\rm esc}$ is rigidity-dependent and written $\tau_{\rm esc}(R)$. This escape time has two origins. First, CRs can leak out of the confinement volume and leave the Galaxy. Second, they can be destroyed by spallation on interstellar matter nuclei. The latter effect is parameterised by the grammage x (usually expressed in g cm-2), defined as the column density of interstellar matter encountered along the path followed by a CR. The CRs that reach Earth have followed different paths, and can therefore be described by a grammage distribution $N(x) \equiv {\rm d}N/{\rm d}x$. The LBM assumes that

\begin{displaymath}N(x) \propto \exp\left(-\frac{x}{\lambda_{\rm esc}(R)}\right),
\end{displaymath} (16)

where the mean grammage $\lambda_{\rm esc}(R)= \langle x \rangle$ is related to the mean mass $\bar{m}$ and density n of the interstellar matter, the CR velocity v, and the escape time $\tau_{\rm esc}(R)$ by means of $\lambda_{\rm esc}(R) = \bar{m}nv \tau_{\rm esc}(R)$.

The function $\lambda_{\rm esc}(R)$ determines the number of spallations experienced by a primary species, and thus determines the secondary-to-primary ratios, for instance B/C. From an experimental point of view, $\lambda_{\rm esc}(R)$ is a quantity that can be inferred from measurements of nuclear abundance ratios. The grammage $\lambda_{\rm esc}(R)$ is known to provide an effective description of diffusion models (Berezinskii et al. 1990): it can be related to the efficiency of confinement (determined by the diffusion coefficient and by the size and geometry of the diffusion volume), to spallative destruction (which tends to shorten the average lifetime of a CR and thus lower $\lambda_{\rm esc}$), and to a mixture of other processes (such as convection and energy gains and losses).
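As a quick numerical sanity check (our own, assuming the exponential grammage distribution of Eq. (16)), the mean of such a distribution indeed equals $\lambda_{\rm esc}$:

```python
import numpy as np

# For N(x) ~ exp(-x / lambda_esc), the mean grammage <x> equals lambda_esc.
rng = np.random.default_rng(5)
lam = 10.0                                   # illustrative value, in g cm^-2
x = rng.exponential(scale=lam, size=200000)  # sample the grammage distribution
mean_grammage = x.mean()                     # ~ lam
```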

In this paper, we compute the fluxes in the framework of the LBM with minimal reacceleration by the interstellar turbulence, as described in Osborne & Ptuskin (1988) and Seo & Ptuskin (1994). The grammage $\lambda_{\rm esc}(R)$ is parameterised as

\begin{displaymath}
\lambda_{\rm esc}(R) =
\left\{
\begin{array}{ll}
\lambda_0\, \beta R^{-\delta_0} & {\rm if~} R \leq R_0,\\
\lambda_0\, \beta R^{-\delta} & {\rm otherwise;}
\end{array}
\right.
\end{displaymath} (17)

where we allow for a break, i.e. a different slope below and above a critical rigidity R0. The standard form used in the literature is recovered by setting $\delta_0=0$. For the entire set of n nuclei, a series of n equations (see Maurin et al. 2001, for more details) for the differential densities $N^{j=1,\ldots, n}$ are solved at a given kinetic energy per nucleon Ek/n (E is the total energy), i.e.

\begin{displaymath}A^j N^j(E_{k/n}) + \frac{\rm d}{{\rm d}E}\left( B^j N^j - C^j \frac{{\rm d}N^j}{{\rm d}E} \right) = S^j(E_{k/n}).
\end{displaymath} (18)

In this equation, the r.h.s. term is the source term that takes into account the primary contribution (see Sect. 4.3), the spallative secondary contribution from all nuclei k heavier than j, and the $\beta$-decay of radioactive nuclei into j. The first energy-dependent factor Aj is given by

\begin{displaymath}A^j = \frac{1}{\tau_{\rm esc}}
+ \sum_{\rm ISM=H,He} n_{\rm ISM}\, v^j \sigma^{j\rm +ISM}_{\rm inel}
+ \frac{1}{\tau^j_\beta}\cdot
\end{displaymath}

The two other terms correspond to energy losses and first-order reacceleration for Bj and to second-order reacceleration for Cj. Following Osborne & Ptuskin (1988) and Seo & Ptuskin (1994),

\begin{displaymath}B= \Big\langle\frac{{\rm d}E}{{\rm d}t}\Big\rangle_{\rm ion,~coul.}
- (1+\beta^2)\,\beta^2 E K_{\rm pp} \quad {\rm and} \quad
C= \beta^4 E^2 K_{\rm pp} ,
\end{displaymath}

where

\begin{displaymath}K_{\rm pp}=\frac{4}{3} {\cal V}_a^2
\frac{\tau_{\rm esc}}{\delta(4-\delta^2)(4-\delta)}\cdot
\end{displaymath} (19)

The strength of the reacceleration is governed by the pseudo-Alfvénic speed ${\cal V}_a$ of the scatterers, in units of km s-1 kpc-1. It is related to the true speed of a diffusion model with a thin disk h and a diffusive halo L by means of ${\cal V}_a=V_a\times (hL)^{-1/2}$ (Seo & Ptuskin 1994). Assuming typical values of h=0.1 kpc and L=10 kpc, the value of ${\cal V}_a$ can be directly transposed to, and compared with, the true speed Va obtained in diffusion models.

To summarise, our LBM with reacceleration may involve up to five free parameters, i.e. the normalisation $\lambda _0$, the slopes $\delta _0$ and $\delta $ below and above the cut-off rigidity R0, and the pseudo-Alfvén velocity ${\cal V}_a$ related to the reacceleration strength.
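For illustration, the grammage of Eq. (17) and the pseudo-to-true Alfvén speed conversion can be coded in a few lines. This is a sketch under the assumption that the low-rigidity branch is normalised to be continuous at R0 (the helper names are ours, not the paper's):

```python
import math

def lambda_esc(R, beta, lambda0, R0, delta, delta0=0.0):
    """Escape grammage (g/cm^2) with a low-rigidity break, cf. Eq. (17).

    Below R0 the slope is delta0, above it delta; the low-rigidity branch
    is normalised so the two branches match at R0 (an assumption of this
    sketch).  Setting delta0 = 0 recovers the standard flat cut-off form.
    """
    if R0 > 0.0 and R <= R0:
        return lambda0 * beta * R0 ** (-delta) * (R / R0) ** (-delta0)
    return lambda0 * beta * R ** (-delta)

def true_alfven_speed(Va_pseudo, h=0.1, L=10.0):
    """Convert the pseudo speed (km/s/kpc) into a true Alfven speed (km/s)
    via V_a = pseudo_Va * sqrt(h * L), with h and L in kpc."""
    return Va_pseudo * math.sqrt(h * L)
```

With h = 0.1 kpc and L = 10 kpc, sqrt(hL) = 1, so a pseudo speed of 80 km s-1 kpc-1 maps onto a true speed of 80 km s-1, as quoted in Sect. 5.1.1.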

4.3 Source spectra

We assume that the primary source spectrum Qj(E) for each nuclear species j is given by ($\beta=v/c$)

\begin{displaymath}Q_j(E) \equiv {\rm d}Q_j/{\rm d}E = q_j \beta^{\eta_j} R^{- \alpha_j},
\end{displaymath} (20)

where qj is the source abundance, $\alpha_j$ is the slope of the species j, and the term $\beta^{\eta_j}$ manifests our ignorance about the low-energy spectral shape. We further assume that $\alpha_j\equiv \alpha$ for all j, and unless stated otherwise, $\eta_j\equiv \eta=-1$ in order to recover ${\rm d}Q/{\rm d}p\propto p^{-\alpha}$, as obtained from acceleration models (e.g., Jones 1994). The constraints existing on $\eta $ are explored in Sect. 5.3.
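A minimal sketch of the source spectrum of Eq. (20), with the per-nucleon kinematics spelled out (illustrative names and units; the absolute normalisation of q is arbitrary here):

```python
import math

M_P = 0.9383  # proton mass in GeV

def source_spectrum(Ekn, q, alpha, eta=-1.0, A=16, Z=8):
    """dQ/dE = q * beta^eta * R^(-alpha), cf. Eq. (20).

    Ekn is the kinetic energy per nucleon in GeV/n; the rigidity in GV is
    R = (A/Z) * p_n, with p_n the momentum per nucleon.  Defaults are for
    oxygen; the normalisation q is arbitrary in this sketch.
    """
    pn = math.sqrt(Ekn * (Ekn + 2.0 * M_P))  # momentum per nucleon (GeV)
    beta = pn / (Ekn + M_P)                  # velocity in units of c
    R = (A / Z) * pn                         # rigidity (GV)
    return q * beta ** eta * R ** (-alpha)
```

At relativistic energies beta tends to 1 and the spectrum reduces to a pure power law in rigidity of slope alpha, as assumed in the text.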

The pattern of the source abundances observed in the cosmic radiation differs from that of the solar system, owing to a segregation mechanism during the acceleration stage. Two hypotheses are disputed in the literature: one in which the CR composition is controlled by volatility and the mass-to-charge ratio (Meyer et al. 1997; Ellison et al. 1997), and the other based on the first ionisation potential (FIP) of the nuclei (e.g., Cassé & Goret 1973). In this work, for each configuration, the source abundances are initialised to the product of the solar-system abundances (Lodders 2003) and the FIP values taken from Binns et al. (1989). The final fluxes are obtained by an iterative calculation of the propagated fluxes, rescaling the element abundances - keeping the relative isotopic abundances fixed - to match the experimental data at each step until convergence is reached (see Fig. 1 in Maurin et al. 2002, for further details). The result is thus insensitive to the input values (more details about the procedure are given in Appendix B.1).
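The iterative rescaling can be sketched as follows. This is a toy illustration: `propagate` stands in for the full LBM flux calculation, and the multiplicative data/model update and convergence rule are our assumptions, not the paper's exact prescription:

```python
def rescale_abundances(q_init, propagate, data, tol=1e-6, max_iter=100):
    """Iteratively rescale element source abundances until the propagated
    element fluxes match the measured ones (sketch of Sect. 4.3).

    q_init, data : dicts element -> initial abundance / measured flux.
    propagate(q) : dict element -> propagated flux for source abundances q
                   (stand-in for the full LBM calculation).
    """
    q = dict(q_init)
    for _ in range(max_iter):
        flux = propagate(q)
        # multiply each abundance by the data/model flux ratio
        ratios = {el: data[el] / flux[el] for el in q}
        q = {el: q[el] * ratios[el] for el in q}
        if all(abs(r - 1.0) < tol for r in ratios.values()):
            break
    return q
```

Because each step pulls the propagated flux toward the data, the fixed point does not depend on the starting abundances, which is the insensitivity to input values noted above.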

The measurement of all propagated isotopic fluxes would characterise all source-spectrum parameters completely, i.e. the qj and $\alpha_j$ parameters could be left free. However, only element fluxes are available, which motivates the rescaling approach above. In Sect. 5.3, a few calculations are undertaken to determine self-consistently, along with the propagation parameters, i) $\alpha $ and $\eta $; and ii) the source abundances of the primary species C and O and of the mixed element N (the main contributors to the boron flux).

5 Results

We first examine the relative merits of four different parameterisations of the LBM, and determine the statistical significance of adding more parameters. These models correspond to $\{\vec{\theta}^{\alpha}\}_{\alpha=1, \ldots, m\leq5}$ with

  • Model I  $=\{\lambda_0,~ R_0, \delta\}$, i.e. no reacceleration ( ${\cal V}_a = 0$) and no break in the spectral index ( $\delta_0=0$).
  • Model II  $=\{\lambda_0,~ \delta, ~ {\cal V}_a\}$, i.e. no critical rigidity (R0 = 0) and no break in the spectral index ( $\delta_0=0$).
  • Model III  $=\{\lambda_0,~ R_0, ~ \delta, ~ {\cal V}_a\}$, i.e. no break in the spectral index ( $\delta_0=0$).
  • Model IV  $=\{\lambda_0,~ R_0, ~ \delta_0, ~ \delta, ~ {\cal V}_a\}$.
Various subsets of B/C data are used to investigate whether old data are useful or just add confusion to the PDF determination. We note in Sect. 5.2 that no useful constraint can be drawn from $\overline {p}$ data alone.

We also consider additional free parameters (Sect. 5.3) related to the source spectra, for a self-consistent determination of the propagation and source properties. Since we show that a break in the slope (Model IV) is not required by current data, we focus on Model III (for the description of the propagation parameters), defining:

  • Model III+1  $=\{\lambda_0,~ R_0, ~ \delta, ~ {\cal V}_a\}+ \{\alpha \}$, where the source slope $\alpha $ is a free parameter.
  • Model III+2  $=\{\lambda_0,~ R_0, ~ \delta, ~ {\cal V}_a\}+ \{\alpha, \eta \}$, where both the source slope $\alpha $ and the exponent $\eta $ (of $\beta$, see Eq. (20)) are free parameters.
  • Model III+4  $=\{\lambda_0,~ R_0, ~ \delta, ~ {\cal V}_a\}+ \{\alpha,~ q_{\rm C}, ~ q_{\rm N}, ~ q_{\rm O} \}$, where the abundances qi of the most significantly contributing elements are also free parameters.
  • Model III+5  $=\{\lambda_0,~ R_0, ~ \delta, ~ {\cal V}_a\}+ \{\alpha,~ \eta,~ q_{\rm C}, ~ q_{\rm N}, ~ q_{\rm O} \}$.
This allows us to investigate the correlations between parameters further and to look into potential biases in the determination of the propagation parameters.

More details about the practical use of the trial functions can be found in Appendix C. In particular, the sequential use of the three sampling methods (Gaussian step, covariance matrix step, and then binary-space partitioning) is found to be the most efficient: all results presented hereafter are based on this sequence.
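For concreteness, the first stage of that sequence (the independent Gaussian step) amounts to a plain Metropolis sampler. The following is a minimal sketch, with `log_post` standing in for the log of the posterior; the names and the acceptance bookkeeping are illustrative, not the paper's code (the covariance-matrix and binary-space-partitioning proposals are refinements of this same loop):

```python
import math
import random

def metropolis(log_post, theta0, step_sizes, n_steps=20000, seed=42):
    """Plain Metropolis sampler with an independent Gaussian step per
    parameter.  Returns the chain and the raw acceptance fraction."""
    rng = random.Random(seed)
    theta = list(theta0)
    lp = log_post(theta)
    chain, accepted = [], 0
    for _ in range(n_steps):
        # propose a Gaussian jump around the current point
        prop = [t + rng.gauss(0.0, s) for t, s in zip(theta, step_sizes)]
        lp_prop = log_post(prop)
        dlp = lp_prop - lp
        # accept with probability min(1, posterior ratio)
        if dlp >= 0.0 or rng.random() < math.exp(dlp):
            theta, lp = prop, lp_prop
            accepted += 1
        chain.append(list(theta))
    return chain, accepted / n_steps
```

Tuning the step sizes trades acceptance against correlation length, which is exactly the efficiency issue (burn-in, acceptance, correlation length) discussed for the cosmic-ray fits.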

5.1 Fitting the B/C ratio

5.1.1 HEAO-3 data alone

We first constrain the model parameters with HEAO-3 data only (Engelmann et al. 1990). These are the most precise data available to date for the stable-nuclei ratio B/C, covering energies from 0.62 to 35 GeV/n.

The results for Models I, II, and III are presented in Figs. C.2, 2 (top), and 2 (bottom), respectively. The inner and outer contours delimit the regions containing 68% and 95% of the PDF, respectively (see Appendix A.1).
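A common way to obtain such contours from binned MCMC samples is to lower a threshold on the bin heights until the enclosed probability reaches the requested fraction. Below is a minimal sketch of this highest-density rule; it is an assumption of this illustration (Appendix A.1 of the paper defines the exact prescription):

```python
def hpd_levels(weights, fractions=(0.68, 0.95)):
    """Given (unnormalised) histogram bin weights of a PDF, return the
    bin-height thresholds whose super-level sets contain the requested
    fractions of the total probability (highest-density regions)."""
    total = sum(weights)
    srt = sorted(weights, reverse=True)
    levels = []
    for frac in fractions:
        acc, thresh = 0.0, srt[-1]
        for w in srt:  # accumulate bins from the highest down
            acc += w
            if acc >= frac * total:
                thresh = w
                break
        levels.append(thresh)
    return levels
```

The returned thresholds are the contour levels at which the 68% and 95% regions of a marginalised (1D or flattened 2D) posterior histogram are drawn.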

Figure 2:

Posterior distributions for Model II ( top) and Model III ( bottom) using HEAO-3 data only. For more details, refer to caption of Fig. C.2.

The first observation one can make for the LBM without reacceleration (Model I, Fig. C.2) is that the marginal distributions of the three LBM parameters are mostly Gaussian. The tail at small values of R0 arises because this parameter is constrained by low-energy data (<1 GeV/n): there are no HEAO-3 data at low energy, so all R0 values below 3 GV are equiprobable (this remains true for Model III).

As seen in Fig. 2, a more complicated shape for the different parameters is found for Model II (top panel), and even more so for Model III (bottom panel). This induces a longer correlation length (1.5 and 6.9 steps instead of 1 step) and hence reduces the efficiency of the MCMC (75% for Model II and 17% for Model III). Physically, the correlation between the parameters, as seen most clearly in Fig. 2 (bottom), is understood as follows. First, $\lambda _0$, R0, and $\delta $ are positively correlated. This originates in the low-energy relation $\lambda_{\rm esc}\propto \lambda_0 R_0^{-\delta}$, which should remain approximately constant to reproduce the bulk of the data at GeV/n energy. Hence, if R0 or $\delta $ is increased, $\lambda _0$ also increases to balance the product. On the other hand, ${\cal V}_a$ is negatively correlated with $\delta $ (and hence with all the parameters): this is the standard result that to reach smaller $\delta $ (for instance to reach a Kolmogorov spectrum), more reacceleration is required. This can also be seen from Eq. (19), where at constant $\tau_{\rm esc}$, $K_{\rm pp}\propto {\cal V}_a^2/f(\delta)$, where f is a decreasing function of $\delta $: hence, if $\delta $ decreases, $f(\delta)$ increases, and ${\cal V}_a$ then has to increase to retain the balance.

The values for the maximum of the PDF for the propagation parameters along with their 68% confidence intervals (see Appendix A) are listed in Table 1.

Table 1:   Most probable values of the propagation parameters for different models, using B/C data (HEAO-3 only).

The values obtained for our Model I are in fair agreement with those derived by Webber et al. (1998), who found $\{\lambda_0,~ R_0,~ \delta \}=\{38.27,~ 3.6,~ 0.7\}$. The difference in $\lambda _0$ could be related to the fact that Webber et al. (1998) rely on visual inspection to extract the best-fit solution and/or use a different set of data. For example, fitting Model I to a combination of HEAO-3 and low-energy data (ACE+Voyager 1 & 2+IMP7-8, see Sect. 5.1.2) leads to $\{\lambda_0,~ R_0,~ \delta \}=\{52,~ 5.3,~ 0.69\}$, slightly changing the preferred Model I parameters (compared with the first line of Table 1).

The reacceleration mechanism was invoked in the literature to decrease the spectral index $\delta $ toward its preferred value of 1/3, given by a Kolmogorov spectrum of turbulence. In Table 1, the estimated propagation parameter values for Models II and III are indeed slightly smaller than for Model I, but the Kolmogorov spectral index is excluded in all three cases (using HEAO-3 data only). This result agrees with the findings of Maurin et al. (2001), in which a more realistic two-dimensional diffusion model with reacceleration and convection was used. We note that the values of ${\cal V}_a\sim80$ km s-1 kpc-1 should lead to a true speed $V_a={\cal V}_a\times \sqrt{hL}\sim80$ km s-1 in a diffusion model with thin-disk half-height h=0.1 kpc and halo size L=10 kpc: this is consistent with the values found in Maurin et al. (2002).

The final column of Table 1 lists, for each model, the best $\chi ^2$ value per degree of freedom, $\chi_{{\rm min}}^2/$d.o.f., which allows us to compare the relative merits of the models. LB models with reacceleration reproduce the HEAO-3 data more accurately ($\chi ^2$/d.o.f. of 1.43 and 1.30 for Models II and III, respectively, compared with $\chi ^2$/d.o.f. = 4.35 for Model I). The best-fit B/C ratios are shown along with the B/C HEAO-3 data, modulated at $\Phi = 250$ MV, in Fig. 3.

Figure 3:

Best-fit ratio for Models I (blue dotted), II (red dashed), and III (black solid) using the HEAO-3 data only (green symbols). The curves are modulated with $\Phi = 250$ MV. The corresponding best-fit parameters are gathered in Table 3.

Physically, the origin of a cut-off R0 in $\lambda_{\rm esc}$ at low energy can be related to convection in diffusion models (Jones 1979). It is hence a process distinct from reacceleration. The fact that Model III performs better than Model II implies that both processes are significant, as found in Maurin et al. (2001).

In the following, we no longer consider Models I and II, and instead inspect how the Model III parameters depend on the selected dataset.

5.1.2 Additional constraints from low-energy data

The available data sets for the B/C ratio (see, e.g., Fig. 5) fall into two energy domains: the low-energy range extends from $\sim $10-2 GeV/n to $\sim $1 GeV/n, and the high-energy range from $\sim $1 GeV/n to $\sim $102 GeV/n. The spectral index $\delta $ is constrained by high-energy data, e.g., the HEAO-3 data, and adding low-energy data allows us to constrain R0 more reliably. We note that fitting only the low-energy data would constrain only the grammage crossed in a very narrow energy domain.

As a first step, we add only the ACE (CRIS) data (de Nolfo et al. 2006), which cover the energy range from $\sim $ $ 8 \times 10^{-2}$ GeV/n to $\sim $ $2 \times 10^{-1}$ GeV/n; this is later referred to as dataset B (dataset A being HEAO-3 data alone). The resulting posterior distributions are similar for datasets B and A (B is not shown, but A is given in Fig. 2, bottom). Results for datasets A and B are completely consistent (first and second lines of Table 2), but for the latter the propagation parameters are more tightly constrained and the fit is improved ( $\chi^{2}_{\rm min}$/d.o.f. = 1.09). The ACE (CRIS) data are compatible with R0 = 0, but the preferred critical rigidity is 2.47 GV.

All other low-energy data (ISEE-3, Ulysses, IMP7-8, Voyager 1 & 2, ACE) are then included (dataset D). The resulting values of the propagation parameters are left unchanged. However, a major difference lies in the higher $\chi^{2}_{\rm min}$/d.o.f. of 4.15, which reflects an inconsistency between the different low-energy data chosen for the MCMC. If the data point from the Ulysses experiment is excluded, $\chi^{2}_{\rm min}$/d.o.f. decreases to a value of 2.26, and by excluding also the ISEE-3 data points (dataset C) it decreases further to 1.06 (see Table 2).

Table 2:   Same as in Table 1, testing different B/C data sets.

Since the low-energy data sets have different modulation parameters, the difference between the various data subsets becomes clearer after the data have been demodulated. The force-field approximation provides a simple analytical one-to-one correspondence between the modulated top-of-the-atmosphere (TOA) and the demodulated interstellar (IS) fluxes. For an isotope x, the IS and TOA energies per nucleon are related by $E_k^{\rm IS} = E_k^{\rm TOA} + \Phi$ ( $\Phi=Z/A\times \phi$ is the modulation parameter), and the fluxes by ($p_x$ is the momentum per nucleon of x)

\begin{displaymath}\psi_x^{\rm IS} \left(E_k^{\rm IS}\right) = \left( \frac{p_x^{\rm IS}}{p_x^{\rm TOA}} \right)^{2} \psi_x^{\rm TOA} \left(E_k^{\rm IS} - Z/A \times \phi\right).
\end{displaymath} (21)

The B/C ratio results from a combination of various isotopes, and assuming the same Z/A for all isotopes, we find that

\begin{displaymath}\left( \frac{\rm B}{\rm C}\right)^{\rm IS} \left(E_k^{\rm IS}\right) = \left( \frac{\rm B}{\rm C}\right)^{\rm TOA} \left(E_k^{\rm IS} - Z/A \times \phi \right).
\end{displaymath} (22)

The modulated and demodulated low-energy B/C data are shown in Fig. 4 (see caption for details).

Figure 4:

HEAO-3 and ACE (CRIS) modulated (TOA, solid line) and demodulated (IS, dashed line) data points have been connected to guide the eye. Filled symbols (modulated and demodulated) correspond to HEAO-3, ACE (CRIS), IMP7-8, and Voyager 1 and 2. On the TOA curve, the empty red stars and the blue upper triangles correspond to ISEE-3 and Ulysses.

As just underlined, the ISEE-3 and Ulysses data points are clearly inconsistent with the other data. To be consistent, $\Phi=200$ MV for ISEE-3 and $\Phi=200$ MV for Ulysses would be required. Significant uncertainties $\Delta\Phi\sim 25{-}50$ MV are quoted in general, so that it is difficult to conclude whether there are systematics in the measurements or whether the modulation quoted in the papers is inappropriate. Some experiments have also accumulated their signal over several years, periods during which the modulation changes. It is beyond the scope of this paper to discuss this issue further. Below, we discard both the ISEE-3 and Ulysses data, selecting a homogeneous low-energy data set that includes the most recent ACE (CRIS) data.

Table 3:   Best-fit parameters for selected models and B/C sets.

The resulting best-fit models, when low-energy data are taken into account, are displayed in Fig. 5. The B/C best-fit ratio is displayed for Model III and datasets A (thin red lines) and C (thick black lines). Model III-B (not shown) yields results similar to Model III-C. Solid and dashed lines correspond to the two modulations $\Phi = 250$ MV (HEAO-3 and IMP7-8) and $\Phi =225$ MV (ACE and Voyager 1 & 2), respectively. Although the fit to HEAO-3 alone provides a good match at low energy, adding the ACE (CRIS) and Voyager 1 & 2 constraints slightly shifts all of the parameters to lower values.

Figure 5:

Best-fit B/C flux (Model III) for datasets A (thin red curves) and C (thick black curves). Above 300 MeV/n, B/C is modulated to $\Phi = 250$ MV (solid lines) appropriate for HEAO-3 data, whereas below, it is modulated to $\Phi =225$ MV (dashed lines). Model III-B, not shown, overlaps with III-C. The corresponding propagation values are gathered in Table 3.


As a final test, we take into account all available data (dataset E, final line of Table 2). Many data are clearly inconsistent with each other (see Fig. 8), but as in the low-energy case, although the $\chi^{2}_{\rm min}$/d.o.f. is worsened, the preferred values of the propagation parameters do not change drastically (compare with datasets B, C, and D in Table 2). We await forthcoming data from CREAM, TRACER, and PAMELA to confirm and refine the results obtained with the HEAO-3 data.

5.1.3 Model IV: break in the spectral index

We have already mentioned that the rigidity cut-off may be associated with the existence of a galactic wind in diffusion models. By allowing a break to occur in the spectral index of $\lambda_{\rm esc}$ (see Eq. (17)), we search for deviations from a single power law ( $\delta_0=\delta$) or from the cut-off case ( $\delta_0=0$).

Adding the new parameter $\delta _0$ (Model IV) increases the correlation length of the MCMC, since R0 and $\delta _0$ are correlated (see Eq. (17)). The acceptance $f_{{\rm ind}}$ (Eq. (9)) is hence extremely low. For Model IV-C (i.e. using dataset C, see Table 2), we find $f_{{\rm ind}}=2\%$. The PDF for $\delta _0$ is shown in the left panel of Fig. 6. The most probable values and 68% confidence intervals obtained are $\{\lambda_0, R_0, \delta_0, \delta, V_{a} \} = \{ 30^{+2}_{-2}, 2.2^{+0.4}_{-0.6}, -0.6^{+0.2}_{-1.3}, 0.55^{+0.04}_{-0.02}, 76^{+9}_{-11} \}$, which are consistent with the values found for the other models, as given in Tables 1 and 2: adding a low-energy spectral break only allows us to better adjust the low-energy data (figure not shown). The best-fit parameters, for which $\chi^2_{\rm min}=0.86$, are reported in Table 3. The small value of $\chi^{2}_{\rm min}$ (smaller than 1) may indicate an over-adjustment, which would disfavour the model.

It is also interesting to constrain $\delta _0$ to be positive, to check whether $\delta_0=0$ (equivalent to Model III), $\delta_0=\delta$ (equivalent to Model II), or any value in between is preferred. We find the most probable values to be $\{ \lambda_0, R_0, \delta_0, \delta, V_{a} \} = \{ 23^{+1}_{-1}, 1^{+2}_{-1}, 0^{+0.6}, 0.49^{+0.01}_{-0.01}, 102^{+4}_{-5} \}$. The corresponding PDF for $\delta _0$ is shown in the right panel of Fig. 6. The maximum occurs at $\delta_0=0$, which is also found to be the best-fit value; we checked that the best-fit parameters match those given in Table 3 for Model III-C. A secondary peak appears at $\delta_0\approx 0.5$, i.e. $\delta_0\approx\delta$, corresponding to Model II. The associated $\chi^{2}_{\rm min}$ for this configuration is worse than that obtained with $\delta_0=0$, in agreement with the conclusion that Model III provides a closer description of the data than Model II.

Figure 6:

Marginalised PDF for the low-energy spectral index $\delta _0$ in Model IV-C. The parameter $\delta _0$ is either free to span both positive and negative values ( left panel) or constrained to $\delta _0 > 0$ ( right panel).


5.1.4 Summary and confidence levels for the B/C ratio

In the previous paragraphs, we have studied several models and B/C datasets. The two main conclusions that can be drawn are i) the best-fit model is Model III, which includes reacceleration and a cut-off rigidity; and ii) the most likely values of the propagation parameters do not depend strongly on the data set used, although when data are inconsistent with each other the statistical interpretation of the goodness of fit of a model is altered (all best-fit parameters are gathered in Table 3). The values of the derived propagation parameters are close to those found in similar studies, and the correlations between the LB transport parameters are well understood.

Taking advantage of the knowledge of the $\chi ^2$ distribution, we can extract a list of configurations, i.e. a list of parameter sets, based on CLs of the $\chi ^2$ PDF (as explained in Appendix A.2). The $\chi ^2$ distribution is shown for our best model, i.e. Model III, in Fig. 7. The red and black areas correspond to the 68% and 95% confidence intervals, which are used to generate two configuration lists, from which 68% and 95% CLs on, e.g., fluxes can be derived.
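The construction of such configuration lists can be sketched as follows. This is a simplified stand-in: fixed delta-chi2 cuts replace the thresholds that the paper derives from the $\chi ^2$ PDF itself:

```python
def cl_configurations(samples, delta_chi2=(1.0, 4.0)):
    """From a list of (params, chi2) MCMC samples, build configuration
    lists whose chi2 lies within delta_chi2 of the minimum.  Flux
    envelopes (as in Fig. 8) are then the min/max model flux over each
    list at every energy.  The fixed cuts are an assumption of this
    sketch, not the paper's chi2-PDF-based thresholds."""
    chi2_min = min(c for _, c in samples)
    return [[p for p, c in samples if c <= chi2_min + d] for d in delta_chi2]
```

Evaluating the model over the first (tighter) list traces the 68% envelope, and over the second the 95% envelope.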

Figure 7:

$\chi ^2$/d.o.f. normalised distribution for Model III-C. The 68% and 95% CL of the distribution are shown as the red and black areas, respectively.


The B/C best-fit curve (dashed blue), the 68% (red solid), and 95% (black solid) CL envelopes are shown in Fig. 8. For the specific case of the LBM, this demonstrates that current data are already able to constrain the B/C flux strongly (as reflected by the good value $\chi^2_{\rm min}=1.06$), even at high energy. This provides encouraging support for the discriminating power of forthcoming data. However, this conclusion must be confirmed by the analysis of a more refined model (e.g., a diffusion model), for which the situation might not be so simple.

Figure 8:

Confidence regions of the B/C ratio for Model III-C as calculated from all propagation parameters satisfying Eq. (A.2). The blue dashed line is the best-fit solution, the red solid line the 68% CL, and the black solid line the 95% CL. Two modulation parameters are used: $\Phi =225$ MV below 0.4 GeV/n (adapted to ACE+Voyager 1 & 2+IMP7-8 data) and $\Phi = 250$ MV above (adapted to HEAO-3 data).


From the same lists, we can also derive the allowed range for the element source abundances (we did not attempt to fit isotopic abundances here, although this can be achieved, e.g., as in Simpson & Connell 2001, and references therein). The element abundances are gathered in Table 4, for elements from C to Si (heavier elements were not used in this study). They can be compared with those found in Engelmann et al. (1990) (see also those derived from Ulysses data, Duvernois et al. 1996). For some elements (F, Mg) the agreement is striking, and it is otherwise fair. The difference for the main progenitors of boron, i.e. C, N, and O, is somewhat puzzling, and is probably related to a difference in the input source spectral shape. This is discussed further in Sect. 5.3, where we also determine the propagation parameters self-consistently along with the C, N, and O abundances.

Table 4:   Element source abundances for Model III-C and comparison with HEAO-3 results.

  5.2 Constraints from $\bar{{p}}$

In the context of indirect dark-matter searches, the antimatter fluxes ( $\overline {p}$, $\bar{d}$ and e+) are used to look for exotic contributions on top of the standard, secondary ones.

The standard procedure is to fit the propagation parameters to the B/C data and then use these parameters to calculate the secondary and primary (exotic) contributions. The secondary flux calculated for our best-fit Model III-C is shown, along with the data (see Appendix B.2 for more details), in Fig. 9 (black solid line). For this model, we can calculate the $\chi ^2$ value for the $\overline {p}$ data, and we find $\chi^2/$d.o.f. = 1.86. The fit is not perfect, and as found in other studies (e.g., Duperray et al. 2005), the flux is somewhat low at high energy (PAMELA data are awaited to confirm this trend). However, we checked that these high-energy data points are not responsible for the large $\chi ^2$ value. The latter could be attributed to a small exotic contribution, a different propagation history for species with A/Z=1 compared to $A/Z\approx2$, or inaccurate data.

It may therefore appear reasonable to fit the propagation parameters directly to the $\overline {p}$ flux, assuming that it is a purely secondary species. Since the fluxes of its progenitors (p and He) are well measured, this should provide an independent check of the propagation history. We first attempted to apply the MCMC method to the $\overline {p}$ data with Model III, then Model II, and finally Model I. However, even the simplest model exhibits strong degeneracies, and the MCMC chains could not converge. We had to revert to a model with no reacceleration ( ${\cal V}_a = 0$), no critical rigidity (R0 = 0), and no break in the spectral index ( $\delta_0=0$), for which $\lambda_{\rm esc}=\lambda_0 \beta (R/1~{\rm GV})^{-\delta}$ (hereafter Model 0). The $1\sigma$ values found for the two parameters $\{\lambda_0, \delta\}$ are $\lambda_0^{\bar{p},~\rm Model~0}=10.2^{+0.5}_{-0.5}~{\rm g~cm^{-2}}$ and $\delta^{\bar{p},~\rm Model~0}=0.00^{+0.04}$. Hence, only one parameter ($\lambda _0$) is required to reproduce the data, as seen in Fig. 9 (red dashed line, $\chi^2_{\rm min}/$d.o.f. = 1.128). This is understood as follows: due to the combined effect of modulation and the tertiary contribution ( $\overline {p}$ interacting inelastically with the ISM but surviving as a $\overline {p}$ of lower energy), the lowest-energy data points all correspond to $\overline {p}$ produced at energies of a few GeV. Given the large scatter in the data, it is sufficient to produce the correct amount of $\overline {p}$ at this particular energy to account for all of the data.

Figure 9:

Demodulated antiproton data and IS flux for Model 0 (best fit to $\overline {p}$ data, red dashed line) and for Model III-C (from the best-fit parameters to B/C data, black solid line).


Given the importance of antimatter fluxes for indirect dark-matter searches, this novel approach could be helpful in the future. However, it would require more robust statistics for the $\overline {p}$ flux, especially at higher energy, to lift the degeneracy between the parameters.

5.3 Adding free parameters related to the source spectra

In all previous studies (e.g., Jones et al. 2001), the source parameters were investigated after the propagation parameters had been determined from the B/C ratio (or another secondary-to-primary ratio). We propose a more general approach, in which all of the parameters are fitted simultaneously. With the current data, this already provides strong constraints on the CR source slope $\alpha $ and source abundances (CNO). Higher-quality data are awaited to refine this analysis. We also show how this approach can help to uncover inconsistencies in the measured fluxes.

For all models below, taking advantage of the results obtained in Sect. 5.1, we retain Model III-C. The roman numeral refers to the free transport parameters of the model (III =  $\{\lambda_0,~ R_0, ~ \delta, ~ {\cal V}_a\}$), and the capital letter refers to the choice of B/C dataset (C = HEAO-3 + Voyager 1 & 2 + IMP7-8, see Table 2). This is supplemented by source-spectrum parameters and additional data for the element fluxes.

5.3.1 Source shape $\alpha $ and $\eta $ from Eq. (20)

As a free parameter, we first add a universal source slope $\alpha $. We then allow $\eta $, parameterising a universal low-energy shape of all spectra, to be a second free parameter. In addition to B/C constraining the transport parameters, some primary species must be added to constrain $\alpha $ and $\eta $. We restrict ourselves to O, the most abundant boron progenitor, because it was measured both by the HEAO-3 experiment (Engelmann et al. 1990) and by the TRACER experiment (Ave et al. 2008). The modulation levels were $\Phi = 250$ MV for HEAO-3 and $\Phi=500$ MV for TRACER. We estimated the latter from the solar activity at the time of flight (2 weeks in December 2003), as seen from neutron-monitor data.

In total, we test four models (denoted by 1a, 1b, 2a, and 2b for legibility):

  • III-C+1a: $\{\lambda_0,~ R_0,~ \delta,~ {\cal V}_a\} + \{\alpha\}$, with O = HEAO-3;
  • III-C+1b: $\{\lambda_0,~ R_0,~ \delta,~ {\cal V}_a\} + \{\alpha\}$, with O=TRACER;
  • III-C+2a: $\{\lambda_0,~ R_0,~ \delta,~ {\cal V}_a\} + \{\alpha,~ \eta\}$, with O = HEAO-3;
  • III-C+2b: $\{\lambda_0,~ R_0,~ \delta,~ {\cal V}_a\} + \{\alpha,~ \eta\}$, with O = TRACER.
Here, the Arabic numerals refer to the source-spectrum free parameters used in the calculation, and the lower-case letters to the chosen oxygen-flux dataset (a = HEAO-3, b = TRACER). The most probable parameters are gathered in Table 5, where, for comparison, the first line reports the values found for Model III-C (i.e. with $\gamma\equiv\alpha+\delta$ fixed to 2.65).

Table 5:   Most probable values of the propagation and source parameters for different parameterisation of the source spectrum.

We remark that when adding the HEAO-3 oxygen data to the fit (1a), the propagation parameters $\lambda _0$, R0, and $\delta $ overshoot the Model III-C results, while with the TRACER data (1b) they undershoot them. The parameter ${\cal V}_a$ respectively undershoots and overshoots for these two models, since it is anti-correlated with the former parameters. As a consequence, the fit to B/C is worsened, especially at low energy (see Fig. 10).
Figure 10:

B/C ratio from the best-fit models of Table 6. In addition to the propagation parameters, the free parameters of the source spectra are $\alpha $ (thin lines, labelled 1) or $\alpha $ and $\eta $ (thick lines, labelled 2). The two models are tested on two datasets for the primary flux (in addition to the B/C HEAO-3 data): O as measured by HEAO-3 (black lines, dataset a) or by TRACER (blue lines, dataset b).


The top left panel of Fig. 11 shows the slopes $\alpha $ derived for Models 1a (solid black) and 1b (dashed blue). In both cases, $\alpha $ is well constrained, but the two values are inconsistent; this is expected, since the low-energy data themselves are inconsistent: the demodulated (i.e. IS) HEAO-3 and TRACER oxygen data points are shown in the top right panel of Fig. 11.

Figure 11:

Marginalised PDF for models III-C+1 ($\alpha $ free parameter, top left) and III-C+2 ($\alpha $ and $\eta $ free parameters, bottom panels). In all three panels, the solid black lines rely on HEAO-3 oxygen data, whereas the blue dashed lines rely on TRACER oxygen data (thin and thick lines are as in Fig. 10). Top right panel: zoom on the demodulated low-energy HEAO-3 and TRACER oxygen data. (The solid segments on the TRACER data show the energy bin size. Uncertainties on all fluxes are plotted, but are too small to be visible.)

To remedy this situation, we allow $\eta $ to be a free parameter (family of models III+2). The net effect is to absorb any uncertainty originating in either the modulation level or the low-energy shape of the source spectrum. As shown in the bottom panels of Fig. 11, the source slopes derived from the two experiments are now in far closer agreement (bottom left), with $\alpha\simeq 2.15$. The most probable values and the best-fit model values are given in Tables 5 and 6, respectively. The effect of this freedom is evident in the low-energy slope $\eta $ of the source spectrum. As seen in the bottom right panel, the two data sets yield significantly inconsistent ranges. The value $\eta_{\rm TRACER}\simeq -6.7$ probably indicates that the solar modulation we chose was incorrect. The value $\eta_{\rm HEAO-3}\simeq 0.3$ might provide a reasonable guess of the low-energy shape of the source spectrum, but might also be a consequence of systematics in the experiment. The associated oxygen fluxes are shown in Fig. 12 for the best-fit models: as explained, models in which $\eta $ is allowed to vary (thick lines) reproduce the data more accurately than those with $\eta $ fixed to -1 (thin lines).
Figure 12: Same models as in Fig. 10, but for the oxygen flux.

Although it would be premature to draw any firm conclusion about the low-energy shape, we can turn the argument around and use it as a diagnostic of the low-energy data quality. For instance, assuming that the spectral index $\alpha $ of all elements is the same, extracting and comparing $\eta_i$ for each element i may enable us to infer some systematics remaining in the data. It would be worth fitting the H and He species, which are the most reliably measured fluxes to date; this will be considered in a future study using diffusion models.

5.3.2 $\alpha $, $\eta $ and source normalisations $q_i$

The final two models add the C, N, and O element source abundances as free parameters (relative isotopic abundances are fixed to solar system values). The data used in the fit are B/C, C, N, and O, all measured by HEAO-3 (TRACER data for C and N have not yet been published). The models, denoted 4a and 5a for short in the text below, are:

  • III-C+4a: $\{\lambda_0,~ R_0,~ \delta,~ {\cal V}_a\} + \{\alpha,~ q_{\rm C},~ q_{\rm N},~ q_{\rm O}\}$;
  • III-C+5a: $\{\lambda_0,~ R_0,~ \delta,~ {\cal V}_a\} + \{\alpha,~ \eta,~ q_{\rm C},~ q_{\rm N},~ q_{\rm O}\}$.
Figure 13: PDF (diagonal) and 2D correlation (off-diagonal) plots for the nine free parameters of Model III-C+5a, fitted to the B/C and CNO HEAO-3 data.
The PDF and 2D correlation plots between propagation and source parameters are shown in Fig. 13. With nine parameters, the efficiency is very low ( $f_{\rm ind}\la 0.05\%$), even using the BSP trial function. To obtain $\sim $800 independent points, a total of $1.6\times 10^6$ steps were completed. The contours are not as regular as for our 4-parameter model (see Fig. 2), but correlations between the parameters are still clearly evident: we recover the $\lambda_0-R_0-\delta$ correlations (and the anti-correlation with ${\cal V}_a$). In addition, we note that all source-related parameters ($\alpha $, $\eta $ and the element abundances $q_{\rm C,N,O}$) are correlated among themselves and with ${\cal V}_a$ (hence anti-correlated with the remaining transport parameters). This is especially visible for the primary species C and O, and less clear for the mixed species N. This is understood because the slope $\gamma$ of the propagated primary fluxes ( $\gamma=\alpha+\delta$) is mostly fixed by the measurements: if $\delta $ is decreased, $\alpha $ must be increased to match the data. In turn, if the source slope is increased, the exponent $\eta $ must also increase to match the low-energy data. The positive correlation between source abundances comes from the fact that the relative fluxes must be preserved.
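The sign pattern of these correlations can be read off the chain itself by computing Pearson coefficients between sampled parameters. The toy chain below (hypothetical numbers, not the paper's samples) encodes the constraint that $\gamma=\alpha+\delta$ is pinned by the data, and recovers the resulting $\alpha$-$\delta$ anti-correlation:

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy chain: gamma = alpha + delta is pinned near 2.65 by the data, so the
# sampled alpha and delta scatter along a line of slope -1 (made-up widths)
random.seed(1)
delta_samples = [random.gauss(0.60, 0.05) for _ in range(5000)]
alpha_samples = [random.gauss(2.65, 0.005) - d for d in delta_samples]
r_alpha_delta = pearson(alpha_samples, delta_samples)
```

The tighter the data pin $\gamma$, the closer the coefficient gets to -1.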

The most probable values are gathered in Table 5. Compared with the respective Models 1a and 2a, leaving the source abundances $q_{\rm C}$, $q_{\rm N}$ and $q_{\rm O}$ free in 4a and 5a does not significantly change our conclusions. Again, adding $\eta $ as a free parameter (2a and 5a) allows us to absorb the low-energy uncertainties in the data, so that we obtain $\alpha=2.17$ (5a) instead of the biased value of 2.13 (4a). The same conclusions hold for the other propagation parameters. The impact of adding the parameter $\eta $ on the derived source abundances is to increase them. The relative C:N:O abundances ($O\equiv 1$) are respectively 0.78:0.36:1 (4a) and 0.82:0.40:1 (5a), the second model providing values slightly closer to the 0.81:0.49:1 derived from HEAO-3 data.

The difference between source element abundances rescaled to match the data and those included as free parameters in the MCMC is also seen in Table 6, which gathers the best-fit parameters. The next-to-last line gives $q_{\rm C,N,O}$ for all models: all abundances roughly agree, although our approach underlines the importance of properly taking the correlations between the parameters into account when extracting unbiased estimates of the propagation and source parameters.

Table 6:   Best-fit values of propagation and source parameters using B/C and CNO data for models as given in Table 5.

The goodness of fit of the models when applied to the B/C, C, N, and O data is shown in the last column of Table 6, in terms of the $\chi^{2}_{\rm min}$ value. The models in which $q_{\rm C,N,O}$ are free do not provide a closer match to the data, but neither do they fit more poorly than when $q_{\rm C,N,O}$ are fixed. As soon as primary fluxes are included in the fit (compared with Model III-C), the $\chi^{2}_{\rm min}$ is worsened. This is due to the combination of an imperfect fit to the primary fluxes and, as already said, a poorer B/C fit, because the propagation parameters are optimised to match the former at the expense of the latter. The best-fit parameters are given in the same table, and the associated CNO fluxes are plotted in Fig. 14 for illustration.

Figure 14: Carbon, nitrogen and oxygen fluxes from the best-fit models of Table 6.

5.3.3 Perspective on source spectrum parameters

Other primary species could have been included in the $\chi ^2$ calculation to i) further constrain $\alpha $; ii) check the hypothesis $\alpha_i\neq \alpha_j$ for different species; and/or iii) diagnose problems in the data, if we believe the slope should be universal. However, using a few primary species (O or CNO) already affects the goodness of fit of B/C (compare Model III-C to the others in Fig. 10). Since there are many more measured primary fluxes than secondary ones, including too many primaries would weigh too heavily in the $\chi ^2$ compared with the B/C contribution, and this would drive the MCMC into regions of the parameter space that fit these fluxes rather than B/C. Since systematic errors are known to be larger in flux measurements than in ratios, this may lead to biased estimates of the propagation parameters. Allowing $\eta $ to be a free parameter is a first step towards decreasing the impact of this bias (in the low-energy part of the spectrum). These biases are not necessarily an issue, since we may be more interested in estimates of the source parameters than in unbiased values of the propagation parameters.

To illustrate the difficulty posed by data systematics, it suffices to say that for the most abundant species, H and He, all experiments provided mutually incompatible measurements until the AMS and BESS experiments flew ten years ago. We cannot therefore expect HEAO-3 data, acquired in 1979, to be completely free of such drawbacks. Again, we await the publication of several forthcoming new data sets before pursuing our analysis further in this direction.

6 Conclusion

We have implemented a Markov Chain Monte Carlo to extract the posterior distribution functions of the propagation parameters in a LBM. Three trial functions were used: a standard Gaussian step, an N-dimensional Gaussian step based on the covariance matrix, and binary-space partitioning. For each method, a large number of chains were processed in parallel to accelerate the PDF calculations. The three trial functions were used sequentially, each method providing inputs to the next: while the first was good at identifying the general range of the propagation parameters, it was less efficient at providing an accurate description of the PDF. The other two methods provided this accuracy, and the final results were based on the PDFs obtained from the chains processed with binary-space partitioning.

Taking advantage of the sound statistical properties of the MCMC, confidence intervals can be given for the propagation parameters, as well as confidence contours for all fluxes and other quantities derived from these parameters. The MCMC was also used to compare the impact of choosing different datasets and to ascertain the merits of different hypotheses concerning the propagation models. Concerning the first aspect, we have shown that combining different B/C datasets leaves the propagation parameters mostly unchanged, while strongly affecting the assessment of the goodness of a model. We also showed that, at present, the $\overline {p}$ data do not cover a sufficiently large energy range to constrain the propagation parameters, but they could be useful in the future for cross-checking the secondary nature of the flux.

In this first paper, we have focused on the phenomenologically well-understood LBM, to ease and simplify the discussion and the implementation of the MCMC. In agreement with previous studies, we confirm that a model with a rigidity cutoff performs better than one without, and that reacceleration is preferred over no reacceleration. Such a model can be associated with a diffusion model with wind and reacceleration. As found in Maurin et al. (2001), the best-fit models demand both a rigidity cutoff (wind) and reacceleration, but do not allow us to reconcile the diffusion slope with a Kolmogorov spectrum for turbulence. An alternative model with two slopes for the diffusion coefficient was tried, but it is not favoured by the data. In a last stage, we allowed the slope and low-energy shape of the source spectra to be free parameters, as well as the element abundances of C, N, and O. This revealed correlations between the propagation and source parameters, which potentially bias the estimates of these parameters. The best-fit source-spectrum slope was $\alpha\approx 2.17$ using HEAO-3 data, compatible with the value $\alpha\approx 2.14$ for TRACER data. The MCMC approach allowed us to draw confidence intervals for the propagation parameters, the source parameters, and all fluxes.

A wealth of new data on Galactic CR fluxes is expected soon. As illustrated for the LBM, the MCMC is a robust tool for handling complex data and model parameters, where all source and propagation parameters must be fitted simultaneously. The next step is to apply this approach to more realistic diffusion models and larger datasets, covering a wider range of nuclear species.

Acknowledgements
D.M. and R.T. warmly thank Joanna Dunkley for useful discussions at an early stage of this work. We thank Céline Combet for a careful reading of the paper.

Appendix A: Best fit, goodness of a model, most probable values and confidence levels/intervals

The best-fit model parameters are given by the unique set of parameters for which the $\chi ^2$ value of Eq. (13) is minimised; the goodness of fit of a model is given by $\chi^2_{\rm min}/$d.o.f. On the other hand, the most probable value of each parameter $\theta_i$ is defined as the maximum ${\cal P}^{\rm max}_i\equiv {\cal P}(\theta_i^{\rm max})$ of its PDF (after marginalising). The most probable parameters $\vec{\theta}^{\rm max}$ and the best-fit model parameters $\vec{\theta}^{\rm best}$ do not necessarily coincide, especially when correlations exist between parameters. The best-fit model parameters are best suited to providing the most likely CR fluxes, whereas the 1D marginalised PDF directly provides the most likely value of each parameter.
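As a sketch of the two estimators (assuming the chain samples and their $\chi ^2$ values are stored as parallel lists; the skewed toy distribution and the binning are illustrative choices), the best-fit value is the sample minimising $\chi ^2$, while the most probable value is the mode of the marginalised histogram:

```python
import random
from collections import Counter

def best_fit(samples, chi2s):
    """Best-fit value: the sampled parameter with the smallest chi^2."""
    i = min(range(len(chi2s)), key=chi2s.__getitem__)
    return samples[i]

def most_probable(values, bins=40):
    """Most probable value: centre of the highest bin of the 1D
    marginalised histogram (an illustrative binning choice)."""
    lo, hi = min(values), max(values)
    w = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / w), bins - 1) for v in values)
    b, _ = counts.most_common(1)[0]
    return lo + (b + 0.5) * w

# Toy skewed "posterior": the histogram mode (near 1) and the chi^2
# minimum (near 2, for this hypothetical chi^2 surface) clearly differ
random.seed(3)
samples = [random.gammavariate(2.0, 1.0) for _ in range(20000)]
chi2s = [(v - 2.0) ** 2 for v in samples]
```

The mismatch of the two estimators on this skewed toy distribution is exactly the situation described above for correlated parameters.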

A.1 Confidence levels/intervals on parameters

Confidence intervals (CI), associated with a confidence level (CL), are constructed from the PDF. The asymmetric interval $\Delta_x\equiv [\theta_i^{\rm max}-\theta^-_x,~\theta_i^{\rm max}+\theta^+_x]$ such that

\begin{displaymath}{\rm CL}(x)\equiv \int_{\Delta_x}{\cal P}(\theta_i) {\rm d} \theta_i= 1 - \gamma,
\end{displaymath} (A.1)

defines the $1 - \gamma$ confidence level (CL), along with the CI of the parameter $\theta_i$. Here, the CI boundaries (i.e. $\theta^-_x$ and $\theta^+_x$) are found by decreasing the value ${\cal P}(\theta_i)$ from ${\cal P}^{\rm max}_i$ to ${\cal P}_i^x$, such that $1-\gamma=x$. This is easily generalised to 2D confidence levels, used to construct the 2D confidence regions shown in the correlation plots. Below, we use the $x=68\%$ and $x=95\%$ CLs, corresponding to $1\sigma$ and $2\sigma$ uncertainties.
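A minimal histogram-based implementation of this threshold-lowering construction might look as follows (it assumes a unimodal marginalised PDF, so that the retained bins form a single interval):

```python
import random

def hdi(values, level=0.68, bins=100):
    """Asymmetric CI in the spirit of Eq. (A.1): lower the density
    threshold from the peak of the 1D marginalised PDF until the enclosed
    probability reaches `level`. Assumes a unimodal PDF, so the kept bins
    form one contiguous interval."""
    lo, hi = min(values), max(values)
    w = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / w), bins - 1)] += 1
    # Visit bins from highest to lowest density, accumulating probability
    order = sorted(range(bins), key=counts.__getitem__, reverse=True)
    total, kept = 0, []
    for b in order:
        kept.append(b)
        total += counts[b]
        if total >= level * len(values):
            break
    return lo + min(kept) * w, lo + (max(kept) + 1) * w

# Sanity check: for a unit Gaussian the 68% interval is close to [-1, 1]
random.seed(7)
gauss_samples = [random.gauss(0.0, 1.0) for _ in range(50000)]
lo68, hi68 = hdi(gauss_samples, level=0.68)
```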

A.2 Confidence intervals on fluxes

The best-fit model fluxes (e.g., B/C, O, $\bar{{p}}$) are calculated from the best-fit model parameters. Confidence levels in these quantities cannot be obtained from the 1D marginalised CIs of the parameters. They must be constructed from a sampling of the (still) correlated parameters. This is achieved by using all sets of parameters $\{\vec{\theta}\}_{x\% \rm CL}=\{\vec{\theta}_i\}_{i=1\cdots p}$, for which $\chi^2(\vec{\theta}_i)$ falls in the $x\%$ confidence level of the $\chi ^2$ PDF. Once these sets are found, we simply calculate the desired flux for all the sets: the maximum and minimum values are kept for each energy bin, defining confidence envelopes for this flux. Thus, the main task is to construct confidence intervals for the $\chi ^2$ distribution.
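Schematically, the envelope construction reads as below; `flux_model` is a hypothetical stand-in for the propagation code, and the $\Delta\chi^2$ value would be taken from the sampled $\chi ^2$ PDF as in Eq. (A.2):

```python
def flux_envelope(param_sets, chi2s, chi2_min, d_chi2, flux_model, energies):
    """Min/max flux envelope over all sampled parameter sets whose chi^2
    lies within chi2_min + d_chi2 of the minimum."""
    kept = [p for p, c in zip(param_sets, chi2s) if c <= chi2_min + d_chi2]
    lo = [min(flux_model(p, e) for p in kept) for e in energies]
    hi = [max(flux_model(p, e) for p in kept) for e in energies]
    return lo, hi

# Toy check with a hypothetical power-law "flux model" E^-p: the third
# parameter set (chi^2 = 9) falls outside the 68% region and is discarded
lo, hi = flux_envelope([2.0, 2.5, 3.0], [0.0, 1.0, 9.0], 0.0, 4.9,
                       lambda p, e: e ** (-p), [10.0])
```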

For n parameters, in the large-sample limit (where the joint PDF of the parameter estimators and the likelihood function become Gaussian), the CI is given by

\begin{displaymath}[\chi^2_{{\rm min}},~ \chi^2_{{\rm min}}+\Delta \chi^2], \quad
{\rm where} \quad \Delta \chi^2 = Q_{\gamma} (1 - \gamma, n)
\end{displaymath}

and $Q_{\gamma} (1 - \gamma, n)$ is the quantile of order $1 - \gamma$ (confidence level CL) of the $\chi ^2$ distribution with n degrees of freedom (Cowan 1997). However, the MCMC gives us direct access to a sampling of the $\chi ^2$ distribution. Hence, independently of the statistical meaning of a model, the confidence interval is extracted from the cumulative $\chi ^2$ PDF by requiring that

\begin{displaymath}\int_{\chi^2_{\rm min}}^{\chi^2_{\rm min}+\Delta \chi^2}{\cal P}(\chi^2) {\rm d} \chi^2 =1 - \gamma.
\end{displaymath} (A.2)

We nevertheless checked that both approaches provide similar results. For instance, the CIs (for Model III-C) obtained directly from Fig. 7 are ${\rm CI}~(68\%)=[\chi^2_{\rm min}, \chi^2_{\rm min}+4.9]$ and ${\rm CI}~(95\%)=[\chi^2_{\rm min}, \chi^2_{\rm min}+9.2]$, whereas they are ${\rm CI}~(68\%)=[\chi^2_{\rm min}, \chi^2_{\rm min}+4.7]$ and ${\rm CI}~(95\%)=[\chi^2_{\rm min}, \chi^2_{\rm min}+9.5]$ when calculated from the $Q_{\gamma} (1 - \gamma, n)$ quantiles (Cowan 1997).
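Since Model III-C has four free parameters ($\lambda_0$, $R_0$, $\delta $, ${\cal V}_a$) and the $\chi ^2$ CDF has a closed form for an even number of degrees of freedom, the quoted quantiles can be checked in a few self-contained lines:

```python
import math

def chi2_cdf_4dof(x):
    """chi^2 CDF for n = 4 d.o.f.; for even n = 2k the closed form is
    1 - exp(-x/2) * sum_{i<k} (x/2)^i / i!  (here k = 2)."""
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)

def delta_chi2(level, lo=0.0, hi=50.0, tol=1e-9):
    """Quantile of order `level` by bisection: the Delta chi^2 such that
    chi2_cdf_4dof(Delta chi^2) = level."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_cdf_4dof(mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Both quantiles reproduce the $\Delta\chi^2 = 4.7$ and 9.5 quoted above.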

Appendix B: Data

In the paper, we focus on the B/C ratio, which is the most accurate tracer of the propagation parameters (other tracers, such as the sub-Fe/Fe ratio or the quartet $^1$H, $^2$H, $^3$He and $^4$He, are not considered). We also estimate the potential of $\overline {p}$, a secondary species, as an alternative constraint on these parameters. We describe below the typical configurations used to calculate the corresponding spectra, as well as the associated datasets.

B.1 B/C

The default configuration for nuclei is the following: unless stated otherwise, the observed propagated slope $\gamma=\alpha+\delta$ is set to 2.65 (Ave et al. 2008), and the source abundances of the most abundant species (C, N, O, F, Ne, Na, Mg, Al, Si) are rescaled to match HEAO-3 data at 10.6 GeV/n. Boron is assumed to be a pure secondary species. Only elements lighter than Si are propagated, since they are the only relevant ones for determining B/C (Maurin et al. 2001).

For B/C at intermediate GeV energies, we use HEAO-3 data (Engelmann et al. 1990). A collection of low-energy data is formed by the data sets from the IMP7-8 (Garcia-Munoz et al. 1987), ISEE-3 (Krombel & Wiedenbeck 1988), Ulysses (Duvernois et al. 1996), Voyager 1&2 (Lukasiak et al. 1999), and ACE (CRIS) (de Nolfo et al. 2006) spacecraft. At higher energy, we consider balloon flights (Lezniak & Webber 1978; Simon et al. 1980; Orth et al. 1978; Dwyer & Meyer 1987), the ATIC-2 balloon-borne experiment (Panov et al. 2007), and the Spacelab-2 experiment (Mueller et al. 1991). For element fluxes, HEAO-3 and TRACER results (Ave et al. 2008) are used.

B.2 $\bar{{p}}$

For the calculation of the $\overline {p}$ flux, the situation is simpler: the production of this secondary flux can be directly linked to the accurate measurements of propagated p and He fluxes, by the AMS (Alcaraz et al. 2000b; AMS Collaboration et al. 2000; Alcaraz et al. 2000a) and BESS experiments (Shikaze et al. 2007; Sanuki et al. 2000). For more details of the $\overline {p}$ flux calculation (cross sections, source terms, ...), the reader is referred to Donato et al. (2001).

For the $\overline {p}$ data, we consider the AMS 98 (AMS Collaboration et al. 2002) experiment on the shuttle, the balloon-borne experiments IMAX 92 (Mitchell et al. 1996), CAPRICE 94 (Boezio et al. 1997), WIZARD-MASS 91 (Basini 1999), CAPRICE 98 (Boezio et al. 2001), and the series of BESS balloon flights BESS 95+97 (Orito et al. 2000), BESS 98 (Maeno et al. 2001), BESS 99 and 2000 (Asaoka et al. 2002), and BESS 2002 (Haino et al. 2005). We also add the BESS Polar results (BESS Collaboration et al. 2008). For BESS data, we use the solar modulation level as provided in Shikaze et al. (2007), based on proton and helium data.

Appendix C: Illustration of MCMC chains and PDFs found with the three trial functions $q(\theta_{\rm trial}, \theta_{i})$

To compare the three trial functions (Sect. 3.3.2), a simple setup is retained: the B/C ratio observed from HEAO-3 data is used to constrain the model parameters $\{\vec{\theta}_i\}_{i=1,\ldots, 3}= \{\lambda_0,~ R_0,~ \delta\}$, i.e. Model I.

Taking advantage of parallel processing (as underlined in Sect. 4.1), we combine several chains of 10 000 steps for each trial function. We start with the Gaussian trial function, combining $N_{\rm c}=40$ chains, whose output is used to calculate the covariance matrix. Taking advantage of smaller burn-in and correlation lengths, only 20 chains need to be combined for the covariance trial function. The resulting PDFs are in turn used as input to the BSP trial function, which also combines 20 chains. Several chains (shown here for $\delta $), along with their burn-in length and correlation length l, are shown in Fig. C.1.
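The Gaussian-step stage can be sketched with a plain Metropolis sampler together with a crude correlation-length estimate. This is a toy 1D Gaussian target, not the paper's LBM likelihood, and the lag criterion below is a simple stand-in for the paper's Eq. (8):

```python
import math
import random

def metropolis(log_target, theta0, step, n_steps, rng):
    """Metropolis sampler with a Gaussian trial function."""
    chain, theta, lt = [], theta0, log_target(theta0)
    for _ in range(n_steps):
        trial = theta + rng.gauss(0.0, step)
        lt_trial = log_target(trial)
        # Accept with probability min(1, target(trial) / target(theta))
        if rng.random() < math.exp(min(0.0, lt_trial - lt)):
            theta, lt = trial, lt_trial
        chain.append(theta)
    return chain

def correlation_length(chain):
    """Smallest lag at which the chain autocorrelation drops below 1/2."""
    n = len(chain)
    m = sum(chain) / n
    var = sum((x - m) ** 2 for x in chain) / n
    for lag in range(1, n):
        acf = sum((chain[i] - m) * (chain[i + lag] - m)
                  for i in range(n - lag)) / ((n - lag) * var)
        if acf < 0.5:
            return lag
    return n

# Toy 1D target: a unit Gaussian (log-density up to an additive constant)
rng = random.Random(0)
chain = metropolis(lambda t: -0.5 * t * t, 0.0, 2.4, 5000, rng)
```

A refined trial function shortens the correlation length, which is precisely the gain quantified for the covariance-matrix and BSP stages below.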

Figure C.1: Illustration of MCMC chains (here for the parameter $\delta $ and Model I). From top to bottom: chains generated with the Gaussian step, the covariance matrix, and binary-space partitioning. Three chains are shown in each panel; the shaded area corresponds to the burn-in length and the arrow to the correlation length l defined by Eq. (8). Although each process consists of 10 000 steps, the Gaussian-step panel zooms in on the first 5000 steps, while the other two display 500 and 50 steps respectively. This shows that each successive method gains a factor of $\sim $10 in efficiency in extracting the PDF (compare the size of the arrows with the number of steps in each case).
Figure C.2: Posterior PDF of $\{\vec{\theta}^{(\alpha)}\}_{\alpha=1, \ldots, 3}= \{\lambda_0,~ R_0,~ \delta\}$ using the three proposal densities defined in Sect. 3.3.2 (Gaussian, top left; covariance matrix, top right; BSP, bottom right). In these three panels, the diagonal shows the 1D marginalised posterior density function of the indicated parameter. The number of entries in the normalised histograms corresponds to the number of uncorrelated steps spent by the chain at each value of $\lambda _0$, $R_0$ and $\delta $. Off-diagonal plots show the 2D marginalised posterior density functions for the parameters of the corresponding column and row: $\lambda _0-R_0$, $\lambda _0-\delta $ and $R_0-\delta $. The colour code corresponds to regions of increasing probability (from paler to darker shades). The two smoothed contours delimit regions containing 68% and 95% (inner and outer contour respectively) of the PDF. Finally, the bottom-left panel shows an example of a 3D parameter mesh (three slices along the planes $\lambda _0-\delta -R_0$) obtained with the BSP method.

The results of the three sampling methods (Gaussian, covariance matrix, and BSP) for the PDF are shown in Fig. C.2. The inset in each panel provides mean values (over the number of chains processed, $N_{\rm c}$) of the relevant parameters of the chain analysis (Sect. 3.2): a decrease in the burn-in length b (421.5, 20.4 and 2.6) and in the correlation length l (159.7, 6 and 1) is found when moving from the Gaussian to the BSP sampling method. The fraction of independent samples (as defined in Eqs. (9) and (10)) is $f_{\rm ind}=0.7\%$ for the Gaussian step, while nearly every step is accepted and uncorrelated (99.9% in total) for the BSP mode. This confirms that, for a given number of steps, refined trial functions are more efficient at extracting the PDF (note, however, that part of the improvement comes from the fact that each method takes advantage of the previous stage of calculation).

Figure C.2 (bottom left) illustrates the binary-space partitioning discussed in Sect. 3.3.2. It shows the projections of the box sides on the three 2D planes $\lambda _0-R_0$, $\lambda _0-\delta $ and $R_0-\delta $ of the parameter space. The partitioning was produced by the BSP method using the covariance-matrix procedure presented in the same figure (top right); the box density is clearly proportional to the estimated target density.

References

Footnotes

... 2008[*]
http://cosmicray.umd.edu/cream/cream.html
... package[*]
A public version will be released soon (Maurin, in preparation).
... derived[*]
For instance, it can be used to predict the $\overline {p}$ or $\bar{d}$ background flux to look for a dark-matter annihilating contribution, e.g., as in Donato et al. (2004). Note however that the statistical procedure described here is far more robust than the crude approach used by Maurin et al. (2001); Donato et al. (2004).
... data[*]
http://ulysses.sr.unh.edu/NeutronMonitor/Misc/neutron2.html. Indeed, the solar activity between 2002 and 2004 has not varied much, so that we use a value for $\Phi$ derived from the BESS 2002 flight (see Fig. 2 of Shikaze et al. 2007).

All Tables

Table 1:   Most probable values of the propagation parameters for different models, using B/C data (HEAO-3 only).

Table 2:   Same as in Table 1, testing different B/C data sets.

Table 3:   Best-fit parameters for selected models and B/C sets.

Table 4:   Element source abundances for Model III-C and comparison with HEAO-3 results.

Table 5:   Most probable values of the propagation and source parameters for different parameterisation of the source spectrum.

Table 6:   Best-fit values of propagation and source parameters using B/C and CNO data for models as given in Table 5.

All Figures

Figure 1: Flow chart of the implemented MCMC algorithm: $\vec{\theta}_i$ is a vector (Eq. (1)) of the $\alpha = 1, \ldots, m$ free parameters of the model, evaluated at each step i, and $p (\vec{\theta}_i)$ is the target function given by Eq. (11). See text for details.

Figure 2: Posterior distributions for Model II (top) and Model III (bottom) using HEAO-3 data only. For more details, refer to the caption of Fig. C.2.

Figure 3: Best-fit ratio for Models I (blue dotted), II (red dashed), and III (black solid) using the HEAO-3 data only (green symbols). The curves are modulated with $\Phi = 250$ MV. The corresponding best-fit parameters are gathered in Table 3.

Figure 4: HEAO-3 and ACE (CRIS) modulated (TOA, solid line) and demodulated (IS, dashed line) data points have been connected to guide the eye. Filled symbols (modulated and demodulated) correspond to HEAO-3, ACE (CRIS), IMP7-8 and Voyager 1 and 2. On the TOA curve, the empty red stars and the blue upper triangles correspond to ISEE-3 and Ulysses.

Figure 5: Best-fit B/C flux (Model III) for datasets A (thin red curves) and C (thick black curves). Above 300 MeV/n, B/C is modulated to $\Phi = 250$ MV (solid lines), appropriate for HEAO-3 data; below, it is modulated to $\Phi =225$ MV (dashed lines). Model III-B, not shown, overlaps with III-C. The corresponding propagation values are gathered in Table 3.

Figure 6: Marginalised PDF for the low-energy spectral index $\delta _0$ in Model IV-C. The parameter $\delta _0$ is either free to span both positive and negative values (left panel) or constrained to $\delta _0 > 0$ (right panel).

Figure 7: $\chi ^2$/d.o.f. normalised distribution for Model III-C. The 68% and 95% CL of the distribution are shown as the red and black areas respectively.

Figure 8: Confidence regions of the B/C ratio for Model III-C, as calculated from all propagation parameters satisfying Eq. (A.2). The blue dashed line is the best-fit solution, the red solid line the 68% CL, and the black solid line the 95% CL. Two modulation parameters are used: $\Phi =225$ MV below 0.4 GeV/n (adapted to ACE+Voyager 1 & 2+IMP7-8 data) and $\Phi = 250$ MV above (adapted to HEAO-3 data).

Figure 9: Demodulated antiproton data and IS flux for Model 0 (best fit to $\overline {p}$ data, red dashed line) and for Model III-C (from the best-fit parameters on B/C data, black solid line).

Figure 10: B/C ratio from the best-fit models of Table 6. In addition to the propagation parameters, the free parameters of the source spectra are $\alpha $ (thin lines, labelled 1) or $\alpha $ and $\eta $ (thick lines, labelled 2). The two models are tested on two datasets for the primary flux (in addition to the B/C HEAO-3 data): O as measured by HEAO-3 (black lines, dataset a) or by TRACER (blue lines, dataset b).

Figure 11: Marginalised PDF for Models III-C+1 ($\alpha $ a free parameter, top left) and III-C+2 ($\alpha $ and $\eta $ free parameters, bottom panels). In all three panels, the solid black lines rely on HEAO-3 oxygen data, whereas the dashed blue lines rely on TRACER oxygen data (thin and thick lines are as in Fig. 10). Top right panel: zoom on the demodulated low-energy HEAO-3 and TRACER oxygen data (the solid segments on the TRACER data show the energy bin size; uncertainties are plotted for all fluxes but are too small to be visible).

Figure 12: Same models as in Fig. 10, but for the oxygen flux.

Figure 13: PDF (diagonal) and 2D correlation (off-diagonal) plots for the nine free parameters of Model III-C+5a, fitted to the B/C and CNO HEAO-3 data.

Figure 14: Carbon, nitrogen and oxygen fluxes from the best-fit models of Table 6.

Figure C.1: Illustration of MCMC chains (here for the parameter $\delta $ and Model I). From top to bottom: chains generated with the Gaussian step, the covariance matrix, and binary-space partitioning. Three chains are shown in each panel; the shaded area corresponds to the burn-in length and the arrow to the correlation length l defined by Eq. (8). Although each process consists of 10 000 steps, the Gaussian-step panel zooms in on the first 5000 steps, while the other two display 500 and 50 steps respectively. This shows that each successive method gains a factor of $\sim $10 in efficiency in extracting the PDF (compare the size of the arrows with the number of steps in each case).

Figure C.2: Posterior PDF of $\{\vec{\theta}^{(\alpha)}\}_{\alpha=1, \ldots, 3}= \{\lambda_0,~ R_0,~ \delta\}$ using the three proposal densities defined in Sect. 3.3.2 (Gaussian, top left; covariance matrix, top right; BSP, bottom right). In these three panels, the diagonal shows the 1D marginalised posterior density function of the indicated parameter. The number of entries in the normalised histograms corresponds to the number of uncorrelated steps spent by the chain at each value of $\lambda _0$, $R_0$ and $\delta $. Off-diagonal plots show the 2D marginalised posterior density functions for the parameters of the corresponding column and row: $\lambda _0-R_0$, $\lambda _0-\delta $ and $R_0-\delta $. The colour code corresponds to regions of increasing probability (from paler to darker shades). The two smoothed contours delimit regions containing 68% and 95% (inner and outer contour respectively) of the PDF. Finally, the bottom-left panel shows an example of a 3D parameter mesh (three slices along the planes $\lambda _0-\delta -R_0$) obtained with the BSP method.


Copyright ESO 2009
