Issue 
A&A
Volume 527, March 2011



Article Number  A49  
Number of page(s)  7  
Section  Cosmology (including clusters of galaxies)  
DOI  https://doi.org/10.1051/0004-6361/201015281  
Published online  24 January 2011 
Hubble parameter reconstruction from a principal component analysis: minimizing the bias
^{1} IAG, Universidade de São Paulo, Rua do Matão 1226, Cidade Universitária, CEP 05508900, 05362070 São Paulo, SP, Brazil
^{2} Institute for the Physics and Mathematics of the Universe, University of Tokyo, Kashiwa, 2778568 Chiba, Japan
email: emille.ishida@ipmu.jp
Received: 25 June 2010
Accepted: 8 December 2010
Aims. A model-independent reconstruction of the cosmic expansion rate is essential to a robust analysis of cosmological observations. Our goal is to demonstrate that current data are able to provide reasonable constraints on the behavior of the Hubble parameter with redshift, independently of any cosmological model or underlying gravity theory.
Methods. Using type Ia supernova data, we show that it is possible to analytically calculate the Fisher matrix components in a Hubble parameter analysis without assumptions about the energy content of the Universe. We used a principal component analysis to reconstruct the Hubble parameter as a linear combination of the Fisher matrix eigenvectors (principal components). To suppress the bias introduced by the high redshift behavior of the components, we considered the value of the Hubble parameter at high redshift as a free parameter. We first tested our procedure using a mock sample of type Ia supernova observations and then applied it to the real data compiled by the Sloan Digital Sky Survey (SDSS) group.
Results. In the mock sample analysis, we demonstrate that it is possible to drastically suppress the bias introduced by the high redshift behavior of the principal components. Applying our procedure to the real data, we show that it allows us to determine the behavior of the Hubble parameter with reasonable uncertainty, without introducing any ad hoc parameterizations. Beyond that, our reconstruction agrees with completely independent measurements of the Hubble parameter obtained from red-envelope galaxies.
Key words: cosmological parameters / methods: statistical
© ESO, 2011
1. Introduction
At the end of the 20th century, observations of type Ia supernovae (SNIa) revealed that the expansion of the Universe is accelerating (Riess et al. 1998; Perlmutter et al. 1999). Since these publications, several efforts have been made to explain these observations (Cunha et al. 2009; Frieman et al. 2008; Linder 2008; Linder & Huterer 2005; Samsing & Linder 2010; Freaza et al. 2002; Ishida 2005; Ishida et al. 2008, and references therein). In a standard analysis, dark-energy models are characterized by a small set of parameters. These are placed into the cosmic expansion rate by means of the Friedmann equations, in substitution for the conventional cosmological-constant term. This approach assumes a specific dependence of the dark-energy equation of state (w) on redshift and provides some insight into the probable values of the parameters involved. However, the results remain restricted to that particular parametrization. An interesting question is what can be inferred about the cosmic expansion rate from observations without any reference to a specific model for the energy content of the Universe.
To perform an independent analysis, we used principal component analysis (PCA). In simple terms, PCA identifies the directions along which data points cluster in the phase space defined by the parameters of a given model. Consequently, it allows a dimensionality reduction with as little information loss as possible (Tegmark et al. 1997). The importance of a model-independent reconstruction of the cosmic expansion rate has already been investigated in the literature (Huterer & Turner 1999, 2000; Tegmark 2002; Wang & Tegmark 2005; Mignone & Bartelmann 2008). In this context, PCA has been used to reconstruct the dark-energy equation of state (Huterer & Starkman 2003; Crittenden et al. 2009; Simpson & Bridle 2006) and the deceleration parameter (Shapiro & Turner 2006) as a function of redshift. The use of PCA was also proposed by Albrecht et al. (2009) for the interpretation of the results of future experiments. In the face of growing interest in the application of PCA to cosmology, Kitching & Amara (2009) recall that some care must be taken in choosing the basic expansion functions and in the interpretation assigned to the components.
The main goal of this work is to apply PCA to directly reconstruct the redshift dependence of the Hubble parameter without any reference to a specific cosmological model. In this context, the eigenvectors of the Fisher matrix form a new basis in which the Hubble parameter is expanded, and the eigenvalues set the uncertainty associated with each basis vector. For the first time, we show that it is possible to derive analytical expressions for the Fisher matrix if we write the Hubble parameter, H(z), as a sum of step functions. The reader should keep in mind throughout this work that our procedure is mostly driven by the data, although there is a weak dependence of the components on our starting choices of parameter values. In other words, the functional form of each eigenvector is not of primary importance; we are more interested in how they are linearly combined. This approach allows us to avoid many of the interpretation problems pointed out by Kitching & Amara (2009). Our only assumption is that the Universe is spatially homogeneous and isotropic and can be described by the Friedmann-Robertson-Walker (FRW) metric.
The paper is organized as follows. In Sect. 2, we briefly review PCA and demonstrate how it can be applied to a Hubble parameter analysis using type Ia supernova observations. Section 3 shows the results obtained with a simulated supernova data set, following the standard procedure for dealing with the linear combination coefficients. We demonstrate that the quality of the results derived from the simulated data is greatly improved if we consider the Hubble parameter value at the upper redshift bound as a free parameter. We then apply the same procedure to real type Ia supernova data compiled by the Sloan Digital Sky Survey team (Kessler et al. 2009); the results are shown in Sect. 4. Finally, in Sect. 5, we present our conclusions.
2. Principal component analysis
2.1. The Fisher matrix
The procedure used to find the principal components (PCs) begins with the definition of the Fisher information matrix (F). Owing to its relation to the covariance matrix (F = C^{-1}), it can be shown that the PCs and their associated uncertainties are related to the eigenvectors and eigenvalues of the Fisher matrix, respectively.
We consider that our data set is formed by N independent observations, each one characterized by a Gaussian probability density function, f_{i}(x_{i},σ_{i};β). In our notation, x_{i} represents the ith measurement, σ_{i} the uncertainty associated with it, and β is the vector of parameters of our theoretical model. In other words, we investigate a specific quantity, x, which can be written as a function of the parameters β_{i}, x(β). In this context, the likelihood function is given by
$$L = \prod_{i=1}^{N} f_{i}(x_{i},\sigma_{i};\boldsymbol{\beta}),$$
and the Fisher matrix is defined as
$$F_{kl} \equiv \left\langle -\frac{\partial^{2}\ln L}{\partial\beta_{k}\,\partial\beta_{l}} \right\rangle. \tag{1}$$
The brackets in Eq. (1) represent the expectation value.
We can write F = D^{T}ΛD, where the rows of the decorrelation matrix (D) are the eigenvectors (e_{i}) of F, and Λ is a diagonal matrix whose nonzero elements are the eigenvalues (λ_{i}) of F. Choosing D to be orthogonal, with det(D) = 1, the e_{i} form an orthonormal basis of decorrelated vectors (or modes). After finding the eigenvectors and eigenvalues of F, we rewrite x as a linear combination of the e_{i}. Our ability to determine each coefficient of this linear expansion (α_{i}) is given by σ_{αi} = λ_{i}^{-1/2}. Following the standard convention, we enumerate the e_{i} from the largest to the smallest associated eigenvalue.
The main goal of PCA is the dimensionality reduction of our initial parameter space. This is reflected in the number of PCs we use to rewrite x. The most accurately determined modes (smaller σ_{αi}) correspond to directions of high data clustering in the original parameter space. As a consequence, they represent a larger part of the variance present in the original data set. In the same way, the most poorly determined modes correspond to a small portion of the variance in the data, describing features that might not be important in our particular analysis. In this context, we must determine the number of PCs that will be used in the reconstruction. Our decision must be balanced between how much information we are willing to discard and the amount of uncertainty that will not compromise our results. The constraint on x reconstructed with M modes (where M ≤ N_{PC} and N_{PC} is the total number of PCs) is given by a simple error propagation of the uncertainties associated with each PC (Huterer & Starkman 2003),
$$\sigma^{2}\left[x(z)\right] \simeq \sum_{i=1}^{M} \sigma_{\alpha_{i}}^{2}\, e_{i}^{2}(z). \tag{2}$$
From this expression, it is clear that adding one more PC also adds its associated uncertainty. At this point, we note that to calculate F_{kl} we must choose numerical values for each parameter β_{i}. This corresponds to specifying a base model as our starting point. As a consequence, the results provided by PCA are interpreted as deviations from this initial model. The uncertainty derived from fitting the data to this base model should also be added in quadrature to Eq. (2), to compute the total uncertainty in the final reconstruction.
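The decomposition described in this subsection can be sketched numerically. The following is a minimal illustration, not the authors' code: the 3 × 3 Fisher matrix is a made-up toy, and the function names are hypothetical. It diagonalizes F, orders the modes from largest to smallest eigenvalue, sets σ_{αi} = λ_{i}^{-1/2}, and propagates the errors as in Eq. (2).

```python
import numpy as np

def principal_components(fisher):
    """Eigen-decompose a symmetric Fisher matrix.

    Returns the eigenvalues sorted from largest to smallest, the matching
    eigenvectors as rows (the matrix D), and sigma_alpha = lambda**-0.5.
    """
    eigvals, eigvecs = np.linalg.eigh(fisher)  # ascending order, eigenvectors as columns
    order = np.argsort(eigvals)[::-1]          # reorder: best-determined mode first
    eigvals = eigvals[order]
    modes = eigvecs[:, order].T                # rows are the modes e_i
    sigma_alpha = 1.0 / np.sqrt(eigvals)
    return eigvals, modes, sigma_alpha

def reconstruction_variance(modes, sigma_alpha, n_modes):
    """Error propagation of Eq. (2) using the first n_modes PCs."""
    return np.sum((sigma_alpha[:n_modes, None] * modes[:n_modes]) ** 2, axis=0)

# Toy Fisher matrix (hypothetical numbers, symmetric and positive definite).
fisher = np.array([[4.0, 1.0, 0.0],
                   [1.0, 3.0, 1.0],
                   [0.0, 1.0, 2.0]])

eigvals, modes, sigma_alpha = principal_components(fisher)
var_1 = reconstruction_variance(modes, sigma_alpha, 1)  # one PC only
var_3 = reconstruction_variance(modes, sigma_alpha, 3)  # all PCs
```

With all modes included, the propagated variance equals the diagonal of F^{-1}, which is the usual marginalized variance; truncating the sum can only lower it, at the price of discarding information.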
The question of how many PCs should be used in the final reconstruction is far from simple, and there is no standard quantitative procedure to determine it. In many cases, the decision depends on the particular data set analyzed and our expectations of it (for a complete review see Jollife 2002, chap. 6). One practical way of facing the problem is to consider how many components are inconsistent with zero in a particular reconstruction. In most cases, the coefficients α_{i} tend to decrease in modulus for higher i, while the uncertainties associated with them increase. In this context, we can choose the final reconstruction as the one whose coefficients are all inconsistent with zero.
The determination of one final reconstruction is beyond the scope of this work. However, to provide an idea of how much of the initial variance is included in our plots, we shall order them following their cumulative percentage of total variance.
The total variance present in the data is represented well by the sum of all λ_{i}, and a reconstruction with the first M PCs encloses a percentage of this value (t_{M}), given by
$$t_{M} = 100\,\frac{\sum_{i=1}^{M}\lambda_{i}}{\sum_{i=1}^{N_{\rm PC}}\lambda_{i}}. \tag{3}$$
As a consequence, the question of how many PCs to use turns into a matter of what percentage of total variance we are willing to enclose.
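As a small illustration of the cumulative percentage of total variance, the sketch below evaluates t_{M} for every M and finds the smallest number of PCs that reaches a chosen threshold. The eigenvalues and the 90% threshold are hypothetical values chosen only for the example.

```python
import numpy as np

def cumulative_variance_percentage(eigvals):
    """t_M for M = 1..N_PC, with eigenvalues sorted from largest to smallest."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return 100.0 * np.cumsum(eigvals) / np.sum(eigvals)

# Hypothetical eigenvalues, for illustration only.
eigvals = [50.0, 30.0, 15.0, 5.0]
t = cumulative_variance_percentage(eigvals)

# Smallest M whose cumulative percentage reaches a 90% threshold.
n_needed = int(np.searchsorted(t, 90.0) + 1)
```

Here the first three modes are needed, since the first two enclose only 80% of the total.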
2.2. Investigating the Hubble parameter from SNIa observations
From now on, we consider the distance modulus, μ, provided by type Ia supernova observations as our observed quantity (x_{i} = μ_{i}). In a very simple approach, if we consider a flat, homogeneous and isotropic Universe, described by the FRW metric, the distance modulus relates to cosmology according to
$$\mu(z) = 5\log_{10}\left[d_{L}(z)\right] + \mu_{0}, \tag{4}$$
$$d_{L}(z) = (1+z)\int_{0}^{z}\frac{{\rm d}u}{H(u)}, \tag{5}$$
where μ_{0} is called the intercept, d_{L}(z) is the luminosity distance, and H(z) is the Hubble parameter. We use H_{0} = 72 km s^{-1} Mpc^{-1} as the current value of the Hubble parameter (Komatsu et al. 2009).
To make H(z) as general as possible, we write it as a sum of step functions
$$H(z;\boldsymbol{\beta}) = \sum_{i=1}^{N_{\rm bin}} \beta_{i}\,c_{i}(z), \qquad c_{i}(z) = \begin{cases} 1 & \text{if } (i-1)\Delta z \le z < i\Delta z, \\ 0 & \text{otherwise}, \end{cases} \tag{6}$$
where β_{i} are constants, β is the vector formed by all β_{i}, N_{bin} is the number of redshift bins and Δz is the width of each bin. This approach was proposed by Shapiro & Turner (2006) in the context of deceleration parameter analysis. When it is applied to the Hubble parameter instead, the Fisher matrix calculations simplify and the results remain quite general. Given the definition above, β_{i} are now the parameters of our theory. Physically, they represent the value of the Hubble parameter in each redshift bin. Any functional form can be approximated using this prescription, with higher resolution for a larger number of bins.
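In this context, we are able to obtain analytical expressions for the luminosity distance
$$d_{L}(z;\boldsymbol{\beta}) = (1+z)\left[\Delta z\sum_{i=1}^{L(z)}\frac{1}{\beta_{i}} + \frac{z - L(z)\Delta z}{\beta_{L(z)+1}}\right], \tag{7}$$
and its derivatives,
$$\frac{\partial d_{L}(z;\boldsymbol{\beta})}{\partial\beta_{k}} = \begin{cases} -(1+z)\,\dfrac{\Delta z}{\beta_{k}^{2}} & \text{if } k \le L(z), \\ -(1+z)\,\dfrac{z - L(z)\Delta z}{\beta_{k}^{2}} & \text{if } k = L(z)+1, \\ 0 & \text{if } k > L(z)+1, \end{cases} \tag{8}$$
where L(z) corresponds to the integer part of z / Δz. From Eqs. (1), (4), and (5), we can calculate the Fisher matrix components as
$$F_{kl} = \frac{25}{(\ln 10)^{2}}\sum_{i=1}^{N}\frac{1}{\sigma_{i}^{2}\,d_{L}^{2}(z_{i};\boldsymbol{\beta})}\,\frac{\partial d_{L}(z_{i};\boldsymbol{\beta})}{\partial\beta_{k}}\,\frac{\partial d_{L}(z_{i};\boldsymbol{\beta})}{\partial\beta_{l}}. \tag{9}$$
The Hubble parameter may now be reconstructed as the sum of H_{base} and a linear combination of the new uncorrelated variables represented by the eigenvectors of the Fisher matrix. Mathematically,
$$H(z;\boldsymbol{\alpha}) = H_{\rm base}(z) + \sum_{i=1}^{M}\alpha_{i}\,e_{i}(z), \tag{10}$$
where α_{i} are constants and α is the vector formed by all the α_{i}. Using Eq. (10) in Eqs. (4) and (5), we can write the reconstructed distance modulus. The data set is then used to find values for the parameters α_{i} that minimize the expression
$$\chi^{2}(\boldsymbol{\alpha}) = \sum_{i=1}^{N}\frac{\left[\mu_{i} - \mu(z_{i};\boldsymbol{\alpha})\right]^{2}}{\sigma_{i}^{2}}. \tag{11}$$
This minimization procedure will also generate an uncertainty in the value of the parameters α_{i}, which should be taken into account in the final reconstruction error budget.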
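The analytical route to the Fisher matrix can be illustrated with a short sketch. It is written under assumptions of our own, not taken from the paper: H is expressed in units of H_{0} with c/H_{0} = 1, one effective data point sits at the midpoint of each bin, and every bin has the same uncertainty. Function names are hypothetical. The luminosity distance of a step-function H(z) is a finite sum, so its derivatives with respect to each β_{k} follow in closed form and the Fisher matrix for μ is a sum of outer products.

```python
import numpy as np

def luminosity_distance(z, beta, dz):
    """d_L for a step-function H(z), valid for 0 <= z <= len(beta) * dz."""
    n_full = min(int(np.floor(z / dz)), len(beta))   # L(z): number of fully crossed bins
    comoving = dz * np.sum(1.0 / beta[:n_full])
    if n_full < len(beta):
        comoving += (z - n_full * dz) / beta[n_full]  # partially crossed bin
    return (1.0 + z) * comoving

def luminosity_distance_gradient(z, beta, dz):
    """Analytic derivatives of d_L with respect to each beta_k."""
    n_full = min(int(np.floor(z / dz)), len(beta))
    grad = np.zeros(len(beta))
    grad[:n_full] = -(1.0 + z) * dz / beta[:n_full] ** 2
    if n_full < len(beta):
        grad[n_full] = -(1.0 + z) * (z - n_full * dz) / beta[n_full] ** 2
    return grad

def fisher_matrix(z_obs, sigma_mu, beta, dz):
    """Fisher matrix for distance-modulus data with Gaussian, independent errors."""
    fisher = np.zeros((len(beta), len(beta)))
    for z, sigma in zip(z_obs, sigma_mu):
        d_l = luminosity_distance(z, beta, dz)
        # d(mu)/d(beta_k) = (5 / ln 10) * (1 / d_L) * d(d_L)/d(beta_k)
        dmu = 5.0 / np.log(10.0) * luminosity_distance_gradient(z, beta, dz) / d_l
        fisher += np.outer(dmu, dmu) / sigma ** 2
    return fisher

# Illustrative setup: 34 bins of width 0.05 and a constant base model beta_i = 1.
dz, n_bins = 0.05, 34
beta_base = np.ones(n_bins)
z_obs = (np.arange(n_bins) + 0.5) * dz            # assumed: one point per bin midpoint
sigma_mu = np.full(n_bins, 0.15 / np.sqrt(50.0))  # assumed: statistical error only
fisher = fisher_matrix(z_obs, sigma_mu, beta_base, dz)
```

Because no numerical differentiation of step functions is involved, the only numerical step left is the eigen-decomposition of the resulting matrix.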
3. Application
3.1. Mock sample
To test the expressions and procedures presented above, we used a simulated type Ia supernova data set. We considered 34 redshift bins of width Δz = 0.05 (0 ≤ z ≤ 1.7), each one containing 50 supernovae. We tested configurations with a larger number of bins, but the results are consistent for any configuration with more than ~25 redshift bins. The uncertainty in the ith bin was calculated according to the prescription proposed in Kim et al. (2004) (Eq. (12)). We performed 1000 simulations of a flat Universe containing a cosmological constant and dark matter, with matter density parameter Ω_{m} = 0.27 as our fiducial model.
Our main goal in using this simulation is to obtain an idea of how the procedure proposed here behaves in an almost ideal scenario. It represents a simplified version of future data, as for the Joint Dark Energy Mission (JDEM)^{1}, but it is enough to allow us to check the consistency of our procedure.
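A mock sample of this kind can be sketched as follows. This is only a rough illustration: it does not reproduce the Kim et al. (2004) uncertainty prescription, and instead assumes a purely statistical error of 0.15 mag per supernova, bin midpoints as effective redshifts, and distances in units of c/H_{0} (all assumptions of ours, with hypothetical function names).

```python
import numpy as np

def hubble_lcdm(z, omega_m=0.27):
    """H(z)/H0 for a flat Universe with matter and a cosmological constant."""
    return np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def luminosity_distance_lcdm(z, omega_m=0.27, n_steps=2000):
    """d_L in units of c/H0, integrating 1/H with the trapezoidal rule."""
    u = np.linspace(0.0, z, n_steps + 1)
    f = 1.0 / hubble_lcdm(u, omega_m)
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(u))
    return (1.0 + z) * integral

def mock_sample(n_bins=34, dz=0.05, n_per_bin=50, sigma_sn=0.15, seed=0):
    """One binned realization of distance moduli around the fiducial model."""
    rng = np.random.default_rng(seed)
    z_bins = (np.arange(n_bins) + 0.5) * dz                  # assumed: bin midpoints
    mu_true = np.array([5.0 * np.log10(luminosity_distance_lcdm(z)) for z in z_bins])
    sigma_bin = np.full(n_bins, sigma_sn / np.sqrt(n_per_bin))  # assumed: statistical only
    mu_obs = mu_true + rng.normal(0.0, sigma_bin)
    return z_bins, mu_obs, sigma_bin, mu_true

z_bins, mu_obs, sigma_bin, mu_true = mock_sample()
```

Repeating the call with different seeds gives an ensemble of realizations analogous to the 1000 simulations mentioned above.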
Fig. 1 PCs obtained from our mock sample as a function of redshift. All PCs are shown according to the convention that e_{i}(z = 0) > 0. Top: first (red-full), second (blue-dashed), third (green-dot-dashed), and fourth (brown-dotted) PC. Bottom: fifth (purple-full), sixth (cyan-dashed), seventh (magenta-dot-dashed), and eighth (dark blue-dotted) PC. 
Fig. 2 Principal components obtained using a constant base model (full-red line) and using ΛCDM as a base model (dotted-brown line). 
Fig. 3 Reconstruction of the Hubble parameter using Eqs. (10) and (11). Hubble parameter reconstructed with one (top-left) to six PCs (bottom-right), in units of H_{0}. The black (solid) line represents our best-fit reconstruction and the red (dotted) curves show 2σ confidence levels. The blue (dashed) line corresponds to the behavior of the Hubble parameter in our fiducial model. 
Using the equations shown in the previous section, we calculated the Fisher matrix components. We found that the modes are weakly sensitive to the choice of initial base model (values for the parameters β_{i}, hereafter H_{base}). However, if we use a specific cosmological model to attribute values to the parameters β_{i} (ΛCDM, for example), all the results derived from this initial choice can only be analyzed in the light of that model. As our goal is to make a model-independent analysis, the best choice is to calculate the PCs based on a model where there is no evolution with redshift (β_{i} = const. > 0,∀i). The PCs will then denote deviations from a constant behavior, regardless of the value attributed to β_{i}. A constant Hubble parameter is obviously an extremely unrealistic model, but it does allow us to have a better idea of which characteristics of our results are extracted from the data and which are only a consequence of our initial choices.
We do not currently have well constrained information about the evolution of the Hubble parameter with redshift, but we do have independent measurements of its value today, H_{0} (e.g. Komatsu et al. 2009). Hence, we present our results in units of H_{0} and use a base model in which β_{i} = 1,∀i. The resulting eigenvectors with the largest eigenvalues are shown in Fig. 1.
The comparison between the PCs obtained from using H_{base} = ΛCDM and H_{base} = const. is shown in Fig. 2. From this plot, we can see that differences exist, but the overall shape of the PCs is not very sensitive to the choice of H_{base}.
To clearly illustrate the standard-procedure results of PCA reconstruction in the specific case studied here, we show in Fig. 3 reconstructions using one to six PCs with corresponding values of t_{M}. From this plot, it is clear that our attempt to reconstruct H(z) using a few PCs does not provide the expected results. We have two main problems here: the reconstruction merely oscillates around the fiducial model (blue-dashed line) and we can clearly see that there is a bias dominating the high-redshift behavior. We address both problems in the next section.
3.2. Minimizing the bias in the reconstruction
From Fig. 1, we see that the first eight PCs all go to zero at high redshift, which means that at these redshifts our data provide little information^{2}. Consequently, no matter how many PCs we use or which values we attribute to the parameters α_{i}, the reconstructed function will always be biased in the direction of our previously chosen base model at high z. In this work, “high z” corresponds to the upper redshift bound of our data set; in our mock sample, z_{max} = 1.7.
At this point, we must pay attention to the crucial role played by H_{base} in the standard procedure described so far. Although the PCs depend weakly on our choice of H_{base}, the final reconstruction is extremely sensitive to the choice. Figure 4 shows how different H_{base} lead to completely different final reconstructions.
Fig. 4 Reconstructions using 1 (top-left) to 6 (bottom-right) PCs for different H_{base}, obtained from our mock sample. The black (full) line represents the fiducial model ΛCDM, the brown (dotted) line corresponds to the final reconstructions in the case H_{base}(z) = ΛCDM, the red (full) line corresponds to the case H_{base}(z) = 0.5, the blue (dashed) line shows the reconstruction considering H_{base} = 1.0, and the green (dot-dashed) line is the reconstruction for H_{base}(z) = 4.0. 
Searching the literature, we found two different approaches to dealing with this problem. We could ignore the reconstruction in the region of high redshift (Huterer & Starkman 2003; Shapiro & Turner 2006) or add a physically motivated model for H_{base} in Eq. (10), which would provide us with the value we expect to measure at the upper redshift bound (Tang et al. 2008). We consider that the first alternative does not represent a good solution. The problem is not only the bias at high z, but also the irregular behavior present in the reconstruction as a whole. Beyond that, our intention is not only to improve the fit quality, but also to make it independent of our initial choice of H_{base}. In a simulated situation, the second alternative would produce results in good agreement with the fiducial model (corresponding to the dotted-brown reconstruction in Fig. 4). Defining a physically motivated H_{base} would, however, only introduce another bias. As in reality we do not have access to the “true” value of H(z), this would require us to make a hypothesis about the energy content and dark-energy model, which we are trying to avoid.
In this context, we believe that it is reasonable to consider the behavior of H(z) at high z as a free parameter. This means adding another parameter (h_{zmax}) to Eq. (10), which becomes
$$H(z;\boldsymbol{\alpha},h_{z{\rm max}}) = h_{z{\rm max}} + \sum_{i=1}^{M}\alpha_{i}\,e_{i}(z). \tag{13}$$
As a consequence, the new χ^{2} will be given by
$$\chi^{2}(\boldsymbol{\alpha},h_{z{\rm max}}) = \sum_{i=1}^{N}\frac{\left[\mu_{i} - \mu(z_{i};\boldsymbol{\alpha},h_{z{\rm max}})\right]^{2}}{\sigma_{i}^{2}}, \tag{14}$$
and the uncertainty associated with the determination of h_{zmax} (σ_{hzmax}) is added in quadrature to the right hand side of Eq. (2), leading to
$$\sigma^{2}\left[H(z)\right] \simeq \sum_{i=1}^{M}\sigma_{\alpha_{i}}^{2}\,e_{i}^{2}(z) + \sigma_{h_{z{\rm max}}}^{2}. \tag{15}$$
The effect of this choice is shown in Fig. 5. We can see that, no matter which H_{base} we use, if h_{zmax} is considered to be a free parameter, the reconstruction at high z is driven by the data. As a consequence, we obtain good agreement for the reconstruction using a constant as well as a ΛCDM model for H_{base}. Even though the PCs are not identical for non-evolving and ΛCDM models (Fig. 2), this agreement is a direct consequence of the best-fit values of the set of parameters { α_{i},h_{zmax} } always arranging themselves to more accurately describe the information in the data.
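The role of h_{zmax} can be illustrated with a minimal sketch. It assumes, as our reading of the text, that the reconstruction takes the form of a free constant h_{zmax} plus a linear combination of PCs, and it uses two made-up decaying functions in place of real Fisher-matrix eigenvectors. All names, mode shapes, and numbers below are hypothetical. The point is that modes vanishing at high z leave the top-bin value equal to h_{zmax}, so the χ^{2} responds directly to that parameter.

```python
import numpy as np

def hubble_reconstruction(alpha, modes, h_zmax):
    """Value of H in each redshift bin: h_zmax plus a linear combination of modes."""
    alpha = np.asarray(alpha, dtype=float)
    return h_zmax + alpha @ modes[:len(alpha)]

def distance_modulus(z, hubble_bins, dz, mu0=0.0):
    """mu = 5 log10(d_L) + mu0 for a step-function H(z); z must lie in the binned range."""
    n_full = min(int(np.floor(z / dz)), len(hubble_bins))
    integral = dz * np.sum(1.0 / hubble_bins[:n_full])
    if n_full < len(hubble_bins):
        integral += (z - n_full * dz) / hubble_bins[n_full]
    return 5.0 * np.log10((1.0 + z) * integral) + mu0

def chi_square(alpha, h_zmax, modes, z_obs, mu_obs, sigma_mu, dz):
    """Chi-square of the binned data for given coefficients and h_zmax."""
    hubble_bins = hubble_reconstruction(alpha, modes, h_zmax)
    mu_model = np.array([distance_modulus(z, hubble_bins, dz) for z in z_obs])
    return np.sum(((mu_obs - mu_model) / sigma_mu) ** 2)

# Toy setup: 34 bins and two hypothetical modes that decay to ~0 at high z.
dz, n_bins = 0.05, 34
z_mid = (np.arange(n_bins) + 0.5) * dz
modes = np.vstack([np.exp(-z_mid / 0.3), z_mid * np.exp(-z_mid / 0.3)])

alpha_true, h_true = [-1.5, 0.3], 2.5       # made-up "true" parameters
sigma_mu = np.full(n_bins, 0.05)            # made-up uncertainties
hubble_true = hubble_reconstruction(alpha_true, modes, h_true)
mu_obs = np.array([distance_modulus(z, hubble_true, dz) for z in z_mid])  # noiseless data

chi2_true = chi_square(alpha_true, h_true, modes, z_mid, mu_obs, sigma_mu, dz)
chi2_shifted = chi_square(alpha_true, 2.0, modes, z_mid, mu_obs, sigma_mu, dz)
```

In a real analysis the χ^{2} would be minimized over the coefficients and h_{zmax} simultaneously; here it is only evaluated at two points to show that fixing the high-redshift value away from the one preferred by the data degrades the fit.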
Fig. 5 Reconstructions using one (top-left) to six (bottom-right) PCs for different choices of H_{base} and considering h_{zmax} as a free parameter, for our mock sample. The black (full) line represents the fiducial model ΛCDM, the blue (dashed) line corresponds to the final reconstructions in the case H_{base}(z) = H_{ΛCDM} and the red (full) line corresponds to the case H_{base}(z) = const. > 0. 
Fig. 6 Reconstruction of the Hubble parameter using Eqs. (13) and (14). Hubble parameter reconstructed with 1 (top-left) to 5 (bottom-left) and 10 (bottom-right) PCs, in units of H_{0}. The color code is the same used in Fig. 3. 
We present in Fig. 6 the results of the reconstruction using 1 to 5, and 10 PCs, and corresponding values of t_{M}. Comparing these results with those in Fig. 3, we can see a huge improvement in the agreement between the reconstructed function and the fiducial model. The reconstruction with 10 PCs encloses the fiducial model within 2σ confidence levels in the whole redshift range covered by the data. Considering that initially we had 34 parameters β_{i}, this represents a reduction of ≈70% in the parameter space dimensionality.
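As a quick check of the quoted figure, keeping M = 10 PCs out of the N_{bin} = 34 original parameters β_{i} corresponds to

```latex
1 - \frac{M}{N_{\rm bin}} = 1 - \frac{10}{34} \approx 0.71 ,
```

that is, roughly the ≈70% reduction stated above. This count considers only the PC coefficients; treating h_{zmax} as an eleventh fitted parameter (our reading, not a statement from the text) would give 1 − 11/34 ≈ 0.68.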
We have so far demonstrated that PCA is an effective method for determining the Hubble parameter behavior with redshift. It provides a considerable reduction in the initial parameter space dimensionality, without introducing any hypothesis about the energy content, cosmological model, or underlying gravity theory. The reconstruction relies on the assumption of a homogeneous and isotropic Universe, described by a FRW metric. The simulated data set used above is composed of independent data points, each one associated with a Gaussian probability density function, in a flat ΛCDM Universe.
Fig. 7 Principal components obtained from the real data sample. All PCs are plotted following the convention e_{i}(z = 0) > 0. Top: first (red-full), second (blue-dashed), third (green-dot-dashed), and fourth (brown-dotted) PCs. Bottom: fifth (purple-full), sixth (cyan-dashed), seventh (magenta-dot-dashed), and eighth (darker blue-dotted) PCs. 
In what follows, we apply this procedure to a real (and consequently less well behaved) data set. Our goal is to see whether, in a realistic scenario, the effectiveness of the procedure remains.
4. Results from current SNIa data
We applied this procedure to real supernova Ia data compiled by the Sloan Digital Sky Survey (SDSS) SN group, hereafter the real sample. This sample includes measurements from the SDSS (Kessler et al. 2009), the ESSENCE survey (Miknaitis et al. 2007; Wood-Vasey et al. 2007), the Supernova Legacy Survey (SNLS) (Astier et al. 2006), the Hubble Space Telescope (HST) (Garnavich et al. 1998; Knop et al. 2003; Riess et al. 2004, 2007), and a compilation of nearby SN Ia measurements (Jha et al. 2007). The first eight PCs found from the real data set are shown in Fig. 7. We used 28 redshift bins of width Δz = 0.05 and the 287 data points from the aforementioned data set with z ≤ 1.4. The resulting reconstructions with one to six PCs are shown in Fig. 8. The blue-dashed line corresponds to the behavior of the Hubble parameter in a flat Universe containing dark energy with an equation-of-state parameter w_{dark} = −0.76 and matter density parameter Ω_{m} = 0.30. This corresponds to the best-fit, flat cosmology found by Kessler et al. (2009) in the context of the Multicolor Light Curve Shape (MLCS2k2, Jha et al. 2007), hereafter fXCDM. It is shown here exclusively for comparison; this model was not used in our calculations.
Fig. 8 Reconstructions obtained from linear combinations of the PCs shown in Fig. 7. Panels run from 1 (top-left) to 6 (bottom-right) PCs. The blue dashed line represents H(z) in a flat XCDM cosmological model with w = −0.76 and Ω_{m} = 0.30 (best-fit results reported by Kessler et al. 2009). The black curve corresponds to the best-fit reconstruction and the red lines are 2σ confidence levels. 
Fig. 9 Top: reconstruction of the Hubble parameter using four PCs, obtained from the real sample, superimposed on direct Hubble parameter measurements reported by Stern et al. (2010) (green dots). Bottom: reconstruction of H(z) with four PCs and confidence intervals of 2σ obtained from the propagation of statistical uncertainties on Ω_{m} and w reported by Kessler et al. (2009), for MLCS2k2 results. In both panels, the blue dashed line represents H(z) in a flat XCDM cosmological model with w = −0.76 and Ω_{m} = 0.30 (best-fit results reported by Kessler et al. 2009). The black curve corresponds to the best-fit reconstruction using 4 PCs and the red lines are 2σ confidence levels. 
Comparing Figs. 6 and 8, we see that the confidence intervals are much larger in Fig. 8, as expected, because of the observational uncertainties that are present only in the real sample. Beyond that, the contours corresponding to 2σ confidence levels do not evolve in Fig. 8 as they do in Fig. 6. This is a direct consequence of our choice of introducing h_{zmax} as a free parameter. In the simulated case, σ_{hzmax} is much smaller than the uncertainty associated with the parameters α_{i}. As a consequence, the evolution of the confidence levels is dominated by the uncertainties associated with the PCs. In the real case, the opposite situation occurs. For the six cases presented in Fig. 8, σ_{hzmax} ≫ σ_{αi}, making the final uncertainty almost independent of how many PCs are used in the reconstruction. This behavior is a consequence of the low number and quality of data points at high redshift. However, it is also related to a non-null correlation between the uncertainties in determining the parameters α_{i} and h_{zmax}. To explore this method in the best case scenario, we need to ensure not only that we have a large number of high-quality data points at high redshift, but also that the PCs are as well determined as possible.
In comparing Figs. 6 and 8, the reader should be aware that the blue dashed line means different things in each figure. Figure 6 is a simulation, and in that case the blue dashed line corresponds to the fiducial model used to generate our mock sample. Figure 8 was created using real data; in this case the blue dashed line represents the best-fit flat XCDM model (fXCDM), as reported by Kessler et al. (2009).
We can also see, in Fig. 8, that the reconstruction becomes more irregular when more than four PCs are used. If we take the blue dashed line as a good representation of the “real” cosmological model, we could say that four PCs are enough to enclose the desired behavior within 2σ confidence level. For the sake of completeness, we plot reconstructions up to six PCs.
To compare our results with other model-independent determinations of H(z), we plot in the top panel of Fig. 9 the reconstruction with four PCs, superimposed on measurements of H(z) derived from red-envelope galaxy observations by Stern et al. (2010). The error bars associated with these data points are still fairly large, but they already provide important insights into the behavior of the Hubble parameter in the redshift range covered by the real data sample. We can see that the reconstruction encloses the fXCDM model and agrees with the red-envelope galaxy measurements. To compare our results with the predictions of a standard model-dependent procedure, we show in the bottom panel of Fig. 9 the 2σ confidence levels derived from the error propagation of statistical uncertainties in Ω_{m} and w reported by Kessler et al. (2009) assuming a fXCDM model and using MLCS2k2. We again find good agreement between the two results.
In Figs. 8 and 9, we point out that the blue-dashed line does not represent the behavior we are trying to achieve, but is merely a representation of a model we are used to dealing with. The purpose of showing it here is to provide an idea of how far our results are from others presented in the literature, although, in our particular analysis, no assumptions about the energy content of the Universe are necessary.
The determination of what kind of physical and/or systematic effect generates the patterns seen in the two lower panels of Fig. 8 is beyond the scope of this work. In our interpretation, these results confirm that fXCDM provides a good first-order approximation of the real behavior of H(z) within the current observational errors and the assumptions underlying our procedure. However, a more realistic simulation and a detailed study of systematic errors are necessary in order to fully understand second-order effects.
5. Conclusions
We have presented an alternative procedure for extracting cosmological parameters from type Ia supernova data. Our analysis is concentrated on the Hubble parameter, although we emphasize that the same procedure can be applied to other quantities of interest. Our goal has been to be as general as possible, so we have tried to avoid parametric forms or specific cosmological models by using PCA.
Writing H(z) according to Eq. (6) and considering type Ia supernova observations, we have shown that it is possible to obtain analytical expressions for the Fisher matrix. We used a mock sample formed by 34 redshift bins of width Δz = 0.05, with errors calculated following the prescription proposed by Kim et al. (2004). This mock sample represents a simplification of future data sets, such as the JDEM, and is not a realistic representation of current data. Our goal in using it was to check the consistency of our procedure.
Our first attempt at reconstructing the Hubble parameter as a linear combination of the eigenvectors of F was unsuccessful. In trying to fit high-redshift data with PCs that go asymptotically to zero, the most oscillatory modes propagate their behavior to the reconstructed H(z) in the whole redshift range. As a consequence, the final result barely resembles our fiducial model.
To suppress the influence of the high-redshift behavior present in all PCs of interest, we considered the value of the Hubble parameter at high redshift as an extra free parameter in our analysis. This simple modification provided reliable results when used with simulated and real supernova data. Beyond that, our results are corroborated by measurements of red-envelope galaxies from Stern et al. (2010).
As a final remark, we emphasize that PCA provides a viable way of avoiding phenomenological parameterizations. It represents one of the few statistical methods that allow us to obtain the behavior of a chosen quantity directly from the data. It has its own assumptions, such as Gaussianity, independence of data points and in the specific case analyzed here, cosmologies that obey a FRW metric. In the final reconstruction phase, it also exhibits a bias in the upper redshift bound. On the other hand, the procedure proposed here can drastically suppress the influence of this bias. Beyond that, we show that in the context of this work, the Fisher matrix can be analytically obtained. This avoids all uncertainties related to numerical derivations of step functions and might be a good alternative to standard statistical analyses applied to cosmological data.
^{2} This kind of behavior is also present in the PCs of the dark-energy equation of state (Huterer & Starkman 2003) and of the deceleration parameter (Shapiro & Turner 2006).
Acknowledgments
E.E.O.I. thanks the Brazilian agency CAPES (1313100) for financial support. R.S.S. thanks the Brazilian agencies FAPESP (2009/051764) and CNPq (200297/20104) for financial support. We also thank the anonymous referee for fruitful comments and suggestions. This work was supported by World Premier International Research Center Initiative (WPI Initiative), MEXT, Japan.
References
Albrecht, A., Amendola, L., Bernstein, G., et al. 2009 [arXiv:0901.0721]
Astier, P., Guy, J., Regnault, N., et al. 2006, A&A, 447, 31
Crittenden, R. G., Pogosian, L., & Zhao, G. 2009, J. Cosm. Astropart. Phys., 12, 25
Cunha, C., Huterer, D., & Frieman, J. A. 2009, Phys. Rev. D, 80, 063532
Freaza, M. P., de Souza, R. S., & Waga, I. 2002, Phys. Rev. D, 66, 103502
Frieman, J. A., Turner, M. S., & Huterer, D. 2008, ARA&A, 46, 385
Garnavich, P. M., Kirshner, R. P., Challis, P., et al. 1998, ApJ, 493, L53
Huterer, D., & Starkman, G. 2003, Phys. Rev. Lett., 90, 031301
Huterer, D., & Turner, M. S. 1999, Phys. Rev. D, 60, 081301
Huterer, D., & Turner, M. S. 2000, Phys. Rev. D, 62, 063503
Ishida, É. E. O. 2005, Braz. J. Phys., 35, 1172
Ishida, É. E. O., Reis, R. R. R., Toribio, A. V., & Waga, I. 2008, Astropart. Phys., 28, 547
Jha, S., Riess, A. G., & Kirshner, R. P. 2007, ApJ, 659, 122
Jolliffe, I. T. 2002, Principal Component Analysis (Springer-Verlag)
Kessler, R., Becker, A. C., Cinabro, D., et al. 2009, ApJS, 185, 32
Kim, A. G., Linder, E. V., Miquel, R., & Mostek, N. 2004, MNRAS, 347, 909
Kitching, T. D., & Amara, A. 2009, MNRAS, 398, 2134
Knop, R. A., Aldering, G., Amanullah, R., et al. 2003, ApJ, 598, 102
Komatsu, E., Dunkley, J., Nolta, M. R., et al. 2009, ApJS, 180, 330
Linder, E. V. 2008, Rep. Prog. Phys., 71, 056901
Linder, E. V., & Huterer, D. 2005, Phys. Rev. D, 72, 043509
Mignone, C., & Bartelmann, M. 2008, A&A, 481, 295
Miknaitis, G., Pignata, G., Rest, A., et al. 2007, ApJ, 666, 674
Perlmutter, S., Aldering, G., Goldhaber, G., & The Supernova Cosmology Project 1999, ApJ, 517, 565
Riess, A. G., Filippenko, A. V., Challis, P., et al. 1998, AJ, 116, 1009
Riess, A. G., Strolger, L., Tonry, J., et al. 2004, ApJ, 607, 665
Riess, A. G., Strolger, L., Casertano, S., et al. 2007, ApJ, 659, 98
Samsing, J., & Linder, E. V. 2010, Phys. Rev. D, 81, 043533
Shapiro, C., & Turner, M. S. 2006, ApJ, 649, 563
Simpson, F., & Bridle, S. 2006, Phys. Rev. D, 73, 083001
Stern, D., Jimenez, R., Verde, L., Kamionkowski, M., & Stanford, S. A. 2010, J. Cosm. Astropart. Phys., 2, 8
Tang, J., Abdalla, F. B., & Weller, J. 2008 [arXiv:0807.3140]
Tegmark, M. 2002, Phys. Rev. D, 66, 103507
Tegmark, M., Taylor, A. N., & Heavens, A. F. 1997, ApJ, 480, 22
Wang, Y., & Tegmark, M. 2005, Phys. Rev. D, 71, 103513
Wood-Vasey, W. M., Miknaitis, G., Stubbs, C. W., et al. 2007, ApJ, 666, 694
All Figures
Fig. 1 PCs obtained from our mock sample as a function of redshift. All PCs are shown according to the convention that e_{i}(z = 0) > 0. Top: first (red, full), second (blue, dashed), third (green, dot-dashed), and fourth (brown, dotted) PC. Bottom: fifth (purple, full), sixth (cyan, dashed), seventh (magenta, dot-dashed), and eighth (dark blue, dotted) PC.
Fig. 2 Principal components obtained using a constant base model (full red line) and using ΛCDM as a base model (dotted brown line).
Fig. 3 Reconstruction of the Hubble parameter using Eqs. (10) and (11). Hubble parameter reconstructed with one (top-left) to six PCs (bottom-right), in units of H_{0}. The black (solid) line represents our best-fit reconstruction and the red (dotted) curves show 2σ confidence levels. The blue (dashed) line corresponds to the behavior of the Hubble parameter in our fiducial model.
Fig. 4 Reconstructions using 1 (top-left) to 6 (bottom-right) PCs for different H_{base}, obtained from our mock sample. The black (full) line represents the fiducial model ΛCDM, the brown (dotted) line corresponds to the final reconstructions in the case H_{base}(z) = ΛCDM, the red (full) line corresponds to the case H_{base}(z) = 0.5, the blue (dashed) line shows the reconstruction considering H_{base} = 1.0, and the green (dot-dashed) line is the reconstruction for H_{base}(z) = 4.0.
Fig. 5 Reconstructions using one (top-left) to six (bottom-right) PCs for different choices of H_{base} and considering z_{max} as a free parameter, for our mock sample. The black (full) line represents the fiducial model ΛCDM, the blue (dashed) line corresponds to the final reconstructions in the case H_{base}(z) = H_{ΛCDM}, and the red (full) line corresponds to the case H_{base}(z) = constant > 0.
Fig. 6 Reconstruction of the Hubble parameter using Eqs. (13) and (14). Hubble parameter reconstructed with 1 (top-left) to 5 (bottom-left) and 10 (bottom-right) PCs, in units of H_{0}. The color code is the same as in Fig. 3.
Fig. 7 Principal components obtained from the real data sample. All PCs are plotted following the convention e_{i}(z = 0) > 0. Top: first (red, full), second (blue, dashed), third (green, dot-dashed), and fourth (brown, dotted) PCs. Bottom: fifth (purple, full), sixth (cyan, dashed), seventh (magenta, dot-dashed), and eighth (darker blue, dotted) PCs.
Fig. 8 Reconstructions obtained from linear combinations of the PCs shown in Fig. 7. Panels run from 1 (top-left) to 6 (bottom-right) PCs. The blue dashed line represents H(z) in a flat XCDM cosmological model with w = −0.76 and Ω_{m} = 0.30 (best-fit results reported by Kessler et al. 2009). The black curve corresponds to the best-fit reconstruction and the red lines are 2σ confidence levels.
Fig. 9 Top: reconstruction of the Hubble parameter using four PCs, obtained from the real sample, superimposed on direct Hubble parameter measurements reported by Stern et al. (2010) (green dots). Bottom: reconstruction of H(z) with four PCs and 2σ confidence intervals obtained from the propagation of statistical uncertainties on Ω_{m} and w reported by Kessler et al. (2009), for MLCS2k2 results. In both panels, the blue dashed line represents H(z) in a flat XCDM cosmological model with w = −0.76 and Ω_{m} = 0.30 (best-fit results reported by Kessler et al. 2009). The black curve corresponds to the best-fit reconstruction using four PCs and the red lines are 2σ confidence levels.