A&A, Volume 636, April 2020
Article Number A46, 24 pages
Section: Cosmology (including clusters of galaxies)
DOI: https://doi.org/10.1051/0004-6361/201834954
Published online: 17 April 2020

© P.-F. Léget et al. 2020

Licence: Creative Commons. Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Type Ia supernovae (SNe Ia) are excellent cosmological probes: they are very luminous objects that are visible up to a redshift of z ∼ 2 (Guillochon et al. 2017), and their luminosity dispersion is naturally low and can be further reduced by an appropriate standardization process. As precise distance indicators, comparing their luminosity and their redshift allowed Perlmutter & Aldering (1998), Perlmutter et al. (1999), Riess et al. (1998), and Schmidt et al. (1998) to demonstrate that the expansion of the Universe is accelerating, a feature that gave rise to the dark energy paradigm. This result, obtained with a small sample of SNe Ia, has since been repeatedly confirmed with larger samples (Astier et al. 2006; Guy et al. 2010; Suzuki et al. 2012; Rest et al. 2014; Betoule et al. 2014; Scolnic et al. 2018), and in combination with other cosmological probes like the cosmic microwave background (CMB; Planck Collaboration XIII 2016), baryon acoustic oscillations (BAO; Delubac et al. 2015), and cosmic shear (Troxel et al. 2018) led to the so-called concordance model, the flat-Λ cold dark matter model.

Improving our description of SNe Ia as a cosmological probe is needed, not only to discriminate among alternate dark energy models (Copeland et al. 2006), but also in order to address existing tensions, such as the 3.4σ difference between the values of the Hubble constant H0 from the local measurement by Riess et al. (2016) and the cosmological fit from the Planck Collaboration XIII (2016). The current uncertainty budget due to limited SN Ia statistics will be greatly reduced by current surveys like the Zwicky Transient Facility (ZTF; Bellm 2014) and next-generation surveys like the Large Synoptic Survey Telescope (LSST) and the Wide Field Infrared Survey Telescope (WFIRST) (LSST Dark Energy Science Collaboration 2012; Spergel et al. 2015). Since systematic and statistical uncertainties are already of the same order of magnitude (Betoule et al. 2014; Scolnic et al. 2018), a better understanding of systematics will be needed to improve the accuracy of SNe Ia as a cosmological probe.

Part of the systematic uncertainty comes from the standardization process: the observed flux of the SN Ia has to be corrected for variations observed from one object to another. Two main contributions to this variation have been observed. The first one, seen by Rust (1974) and Pskovskii (1977, 1984), is the correlation between the peak luminosity of an SN Ia and its light curve decline time, the so-called brighter-slower effect. Various parametrizations have been proposed for this effect, the most commonly used being Δm15 (Phillips 1993), stretch (Perlmutter et al. 1997), or X1 (Guy et al. 2007). The second contribution, observed first by Hamuy et al. (1995) and Riess et al. (1996), is that the peak luminosity depends on color, the so-called brighter-bluer effect. This effect can be explained by the presence of dust in varying quantities along the line of sight, possibly combined with an intrinsic color-brightness correlation after the stretch effect has been taken into account. Most standardization techniques in photometry thus rely on a stretch and color relation known as the Tripp (1998) relation. The SALT2 model (Guy et al. 2007) is one such standardization method, and has over the years become the reference in cosmological analyses. Despite many attempts, no consensus has yet emerged on how to go beyond a two-component model to describe SNe Ia light curves. However, the observation of a step in the standardized luminosity with respect to the host mass by Kelly et al. (2010) and Sullivan et al. (2010) has led to the inclusion of a corrective term in subsequent cosmological analyses. This corrective term linked to the environment hints that more fundamental properties of SNe Ia physics are not captured by the stretch-color standardization scheme, and that there is room for improvement.

However, the most obvious indication that the parameterization used is insufficient comes from the observation of a residual dispersion of SN Ia luminosities around the Hubble diagram after standardization. In order to obtain statistically coherent results, Perlmutter et al. (1999) and Astier et al. (2006) introduced an intrinsic dispersion in luminosity as an additional uncertainty in the fit. Betoule et al. (2014) estimated this intrinsic dispersion at 0.11 mag using a SALT2 standardization, for an observed total dispersion of 0.16 mag. Since the intrinsic dispersion is due to unmodeled SN Ia variations that may depend on redshift, it may introduce a significant error in the extraction of cosmological parameters (Rigault et al. 2018). The precision of SNe Ia as cosmological probes therefore depends on this intrinsic dispersion in luminosity. However, the intrinsic dispersion strongly depends on the assumptions about measurement uncertainties. A way to characterize the overall accuracy obtained by a given standardization method is to use the weighted root mean square (wRMS) metric on the standardized magnitude (e.g., Blondin et al. 2011). Thus, in order to reduce the effects of unmodeled variations of SNe Ia, an improved standardization procedure must also involve reducing the wRMS.
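For illustration, the wRMS metric amounts to an inverse-variance-weighted RMS of the standardized Hubble residuals, optionally including an intrinsic dispersion term in the weights. A minimal sketch in Python (function name and toy numbers are ours, not from the paper):

```python
import numpy as np

def wrms(residuals, sigma, sigma_int=0.0):
    """Weighted RMS of Hubble residuals (in the spirit of Blondin et al. 2011).

    Each residual is weighted by the inverse of its total variance,
    combining the measurement error with an intrinsic dispersion term.
    """
    residuals = np.asarray(residuals, dtype=float)
    w = 1.0 / (np.asarray(sigma, dtype=float) ** 2 + sigma_int ** 2)
    return np.sqrt(np.sum(w * residuals ** 2) / np.sum(w))

# Toy example: residuals scattered around the Hubble diagram
rng = np.random.default_rng(0)
res = rng.normal(0.0, 0.16, size=200)   # 0.16 mag total dispersion
err = np.full(200, 0.12)                # measurement errors
print(wrms(res, err, sigma_int=0.11))   # close to the toy total dispersion
```

With equal per-object uncertainties the wRMS reduces to the plain RMS; the weighting matters when the measurement errors are heterogeneous.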

Most efforts to improve the standardization of SNe Ia involve the search for a new parameter correlated with the intrinsic luminosity. After the discovery of the mass-step, many efforts focused on describing the environmental dependence of SNe Ia (Kelly et al. 2010; Sullivan et al. 2010; Rigault et al. 2013, 2015, 2018; Roman et al. 2018). A complementary approach consists of looking for this new parameter directly in the analysis of the light curves or observed spectra. For example, Mandel et al. (2017) proposed that the color of SNe Ia is a mixture of intrinsic color and extinction by dust, and proposed a Bayesian model to separate these two components. For their part, Chotard et al. (2011) showed that once the Ca II H&K variability is taken into account, the color law is compatible with a Cardelli et al. (1989) extinction curve. Moreover, Mandel et al. (2014) highlighted the dependence of the extinction curve on the minima of P Cygni profiles. Thus the use of additional variables related to spectral indicators seems to be a promising avenue to describe the variability of SNe Ia.

The use of spectral indicators to standardize SNe Ia has a long history (Nugent et al. 1995; Arsenijevic et al. 2008; Bailey et al. 2009; Wang et al. 2009; Foley & Kasen 2011; Chotard et al. 2011). For example, using only the flux ratio ℛ642/443, Bailey et al. (2009) were able to obtain a significantly better standardization in B-band than the classical Tripp (1998) relation. The value of spectral information is also suggested by the analysis of Fakhouri et al. (2015), which shows that distances with SNe Ia are best measured using spectroscopically twinned SNe Ia. Another recent study, by Nordin et al. (2018), has shown that the use of spectral information from the UV part of the spectra improves distance measurement compared to the Tripp (1998) relation. However, those methods do not currently lead to a spectral energy distribution (SED) model, which is necessary for cosmological analyses on purely photometric data, such as LSST (LSST Dark Energy Science Collaboration 2012). We therefore propose here a full SED model based on spectral indicators, generalizing the procedure originally developed in Chotard et al. (2011). This method uses spectral features, which allows for the addition of more than one intrinsic parameter and offers a possible way to separate intrinsic properties from extrinsic ones. From it, we build a full SED model which may be used for purely photometric surveys.

The aim of this study is to revisit the parametrization of SNe Ia SED in light of the SNfactory spectrophotometric dataset (Aldering et al. 2002), and to seek new sources of variability by statistical analysis. Our model is trained using spectral indicators, derived around maximum light in B-band, as features to describe the model. They provide both a reduced dimensionality description of spectra, and a description which is linked to the physics of the explosion. In addition, we select indicators insensitive to reddening in order to decouple the characterization of the reddening from effects purely linked to the intrinsic part of the explosion. This new SNe Ia SED model is named the SUpernova Generator And Reconstructor (SUGAR) model.

Another approach to develop a new SED model was undertaken in parallel by Saunders et al. (2018), using the same dataset and based on a “SALT2 like” framework. Saunders et al. (2018) generalized a strategy that was originally proposed by Guy et al. (2007) and which was first performed on broad band photometry from the SNfactory dataset in Kim et al. (2013). In brief, Saunders et al. (2018) performed a principal component analysis (PCA)-like study on interpolated spectral time series in order to go beyond the classical Tripp (1998) relation, and were able to significantly improve the SED description with respect to the SALT2 model. The SNEMO model developed in Saunders et al. (2018) differs from the SUGAR model in that SNEMO attempts to find principal components purely based on spectral time series variability in the relative luminosities at each wavelength and as a function of time, whereas SUGAR uses spectral features to try to find a compact description of such spectral time series. However, both models share some technical details in the model training and were developed in common.

This paper is organized as follows: Sect. 2 presents how the spectrophotometric time series of the SNfactory were obtained, and Sect. 3 focuses on the intermediary data used in building the SUGAR model (spectral indicators at maximum light, dimensionality reduction through factor analysis, derivation of extinction parameters, and time interpolation). Section 4 describes the SUGAR model: its formalism, training, and components. Section 5 presents the performance of the SUGAR model in fitting spectral time series; these results are compared to the performance achieved by SALT2 for the same data. Finally, Sect. 6 discusses adding additional components to the SUGAR model and some technical choices that were made in its training. The appendices describe details of the mathematical implementation of the SUGAR model. The SUGAR template is available online1, while the data used to train SUGAR are available online2.

2. Spectrophotometric time series from SNfactory

This analysis is based on 171 SNe Ia obtained by the SNfactory collaboration beginning in 2004 (Aldering et al. 2002) with the SuperNova Integral Field Spectrograph (SNIFS, Lantz et al. 2004) installed on the University of Hawaii 2.2-m telescope (Mauna Kea). SNIFS is a fully integrated instrument optimized for semi-automated observations of point sources on a structured background over an extended optical window at moderate spectral resolution. SNIFS has a fully-filled 6.4″ × 6.4″ spectroscopic field-of-view subdivided into a grid of 15 × 15 contiguous square spatial elements (spaxels). The dual-channel spectrograph simultaneously covers 3200–5200 Å (B-channel) and 5100–10 000 Å (R-channel) with 2.8 and 3.2 Å resolution, respectively. The data reduction of the x, y, λ data cubes is summarized by Aldering et al. (2006) and updated in Sect. 2.1 of Scalzo et al. (2010). The flux calibration is developed in Sect. 2.2 of Pereira et al. (2013) based on the atmospheric extinction derived in Buton et al. (2013). In addition, observations are obtained at the SN Ia location at least one year after the explosion to serve as a final reference enabling subtraction of the underlying host galaxy, as described in Bongard et al. (2011). For every SN Ia followed, the SNfactory creates a spectrophotometric time series, typically composed of ∼14 epochs, with the first spectrum taken on average three days before maximum light in B-band (Bailey et al. 2009; Chotard et al. 2011). The sample of 171 SNe Ia contains the objects with good final references, objects that passed quality cuts suggested by Guy et al. (2010), and is restricted to objects with at least one observation in a time window of ±2.5 days around maximum light in B-band. The size of this window is kept identical with respect to the study of Chotard et al. (2011) and is discussed in Léget (2016).
After flux calibration, host-galaxy subtraction, and correction for Milky Way extinction, the flux of the 171 spectra is integrated in synthetic top-hat filters defined in Pereira et al. (2013), and for reference a SALT2 fit is applied with the model from Betoule et al. (2014) in order to obtain the X1, C, and mB parameters. The spectra of the 171 SNe Ia are transformed to the rest frame with a fiducial cosmology and an arbitrary Hubble constant. This preserves the blinding of future cosmological analyses based on this dataset. For the spectral analysis here, the spectra are rebinned at 1500 km s−1 between 3254 and 8649 Å (197 bins per spectrum) for computational efficiency, while still resolving spectral features, and converted to the absolute AB magnitude system. The training is done on 113 SNe Ia observed before 2010 and the validation is done on 58 SNe Ia observed after 2010.
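Rebinning at a constant velocity width corresponds to bins equally spaced in log-wavelength. A minimal sketch of such a binning (our own helper; the exact bin count depends on the edge convention, which is an assumption here):

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def velocity_bins(lmin=3254.0, lmax=8649.0, dv=1500.0):
    """Bin edges with constant velocity width dv (km/s), i.e. constant
    spacing in log-wavelength, covering [lmin, lmax] in Angstroms."""
    step = np.log(1.0 + dv / C_KMS)   # log-wavelength step per bin
    n = int(np.floor(np.log(lmax / lmin) / step))
    return lmin * np.exp(step * np.arange(n + 1))

edges = velocity_bins()
print(len(edges) - 1)   # number of bins, close to the 197 quoted in the text
```

Each bin then spans the same recession-velocity interval, so spectral features keep a comparable number of bins across the whole optical range.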

3. Derived data

Any empirical SN Ia modeling must solve three problems3: choose which features to model, account for color, and deal with the data sampling. For the feature modeling of SUGAR, we use spectral features at maximum light to describe the intrinsic part of the SED. Section 3.1 describes how the spectral features are selected and measured. Section 3.2 explains how the spectral features are projected onto a new, orthogonal basis used for the SUGAR training. To estimate the average color curve of SNe Ia, we generalize the method of Chotard et al. (2011); this is described in Sect. 3.3. To deal with the data sampling, we project the observed spectra onto a common time grid so that the SED model can be computed on the same grid; this is done in Sect. 3.4 using Gaussian processes. The following sections describe the intermediate steps that determine the full SUGAR SED model in Sect. 4. We use the spectral features and color curve parameters derived at maximum light, combined with the interpolated spectra, to infer the full SED at all epochs.

3.1. Spectral indicators at maximum light

Spectral indicators are metrics of empirical features of the input spectrum such as equivalent widths or line velocities. They offer an efficient characterization of spectral variability by representing the underlying spectral complexity with a few key numbers, and play a role in nonlinear dimensionality reduction of the original data. These two characteristics, interpretability and simplicity, make them ideal for describing the intrinsic part of SNe Ia SED, and consequently derive the extrinsic part in the same spirit as Chotard et al. (2011). This decoupling restricts the set of spectral indicators to pseudo-equivalent widths and the wavelengths of P-Cygni profile minima. As the spectral indicators evolve with phase, we select the spectrum closest to maximum light in B-band, if it is within a time window of ±2.5 days. This window size was chosen to optimize the trade-off between the total number of SNe Ia in the sample and the potential loss in precision due to time evolution.

Near maximum light and within the spectral range covered by the SNfactory, it is possible to systematically obtain the 13 spectral indicators represented in Fig. 1. They are the nine pseudo-equivalent widths of Ca II H&K, Si IIλ4131, Mg II, Fe λ4800, S II W, Si IIλ5972, Si IIλ6355, O Iλ77734, and Ca II IR features, as well as four minima of P Cygni profiles (Si IIλ4131, S IIλ5454, S IIλ5640, and Si IIλ6355). For the pseudo-equivalent widths, we rely only on well-defined troughs that are present in all SNe Ia in the sample, and consider line blends as a whole. Some of the possible minima were also discarded either because the corresponding lines form a complex mixture or because the corresponding trough is too shallow to accurately define a minimum for some SNe Ia. This is why the Si IIλ5972 feature velocity is rejected. The spectral indicators and their uncertainties were automatically derived from the spectra at maximum light following the procedure of Chotard et al. (2011), which is described in detail in Appendix A.
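For illustration, a pseudo-equivalent width can be sketched as the integral of the fractional depth below a straight pseudo-continuum drawn between two anchor wavelengths. In the actual procedure (Appendix A of the paper) the anchors are located automatically; in this minimal sketch (our own function, with hand-picked anchors and a synthetic line) they are passed by hand:

```python
import numpy as np

def pseudo_ew(wave, flux, blue, red):
    """Pseudo-equivalent width (in Angstroms) of a feature between two
    pseudo-continuum anchor wavelengths `blue` and `red`.

    The pseudo-continuum is a straight line joining the flux at the two
    anchors; the pEW integrates the fractional depth below that line.
    """
    sel = (wave >= blue) & (wave <= red)
    w, f = wave[sel], flux[sel]
    # linear pseudo-continuum through the two edge points
    fc = np.interp(w, [w[0], w[-1]], [f[0], f[-1]])
    return np.trapz(1.0 - f / fc, w)

# Synthetic Gaussian absorption on a flat continuum
wave = np.linspace(5900.0, 6500.0, 600)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 6150.0) / 40.0) ** 2)
print(pseudo_ew(wave, flux, 5950.0, 6400.0))  # ~0.5 * 40 * sqrt(2*pi) ≈ 50 A
```

Because the depth is normalized by the local pseudo-continuum, the pEW is, to first order, insensitive to a smooth multiplicative reddening of the spectrum, which is the property exploited in the text.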

thumbnail Fig. 1.

Pseudo-equivalent widths of absorption lines and minima of P-Cygni profiles at maximum light, as used in our analysis. They are shown on the spectrum of SN-training-77 at a phase of 1 day after maximum brightness.

3.2. Factor analysis on spectral features

3.2.1. Factor analysis model

The space defined by the 13 selected spectral indicators has too high a dimensionality to efficiently train a model. Additionally, some of the spectral indicators are correlated, and therefore contain redundant information. Most of the model variation can be captured in a reduced number of dimensions. Principal component analysis (Pearson 1901) is one of the methods to implement such a dimensionality reduction. It consists of diagonalizing the covariance matrix of the sample and projecting the data onto the resulting eigenvector basis. In this new basis, the variables are uncorrelated, and an approximation of the input data is found by neglecting the dimensions corresponding to the smallest eigenvalues. This method has been employed in the case of SNe Ia by Guy et al. (2007), Kim et al. (2013), and Sasdelli et al. (2015). However, in the case considered here, some directions are dominated by noise, so their eigenvectors would align along the direction of measurement errors rather than the intrinsic sample variance. To solve this problem, we employ a variant of PCA, Factor Analysis (Spearman 1904, 1927), which has the advantage of taking into account the variance caused by measurement uncertainties. This technique decomposes the observables into two terms, one representing the explanatory factors and one representing the noise affecting each variable. This is expressed by the following relation (Ghahramani & Hinton 1997):

$$ \begin{aligned} \boldsymbol{x}_i = \boldsymbol{\Lambda } \boldsymbol{q}_i + \boldsymbol{u}_i, \end{aligned} $$(1)

where xi is the spectral feature vector with s components for the ith SN Ia. qi is the explanatory factor vector of dimension m ≤ s, and is linearly related to xi by the matrix Λ of dimensions s × m. ui is the noise, with covariance matrix Ψ of dimensions s × s. In order to fix the normalization of Λ and qi, we assume that the qi are drawn from a centered normal distribution:

$$ \begin{aligned} P\left(\boldsymbol{q}_i\right) \sim \mathcal{N} \left(0 ,\boldsymbol{I} \right). \end{aligned} $$(2)

In the framework of the PCA, and up to a normalization, the matrix Λ and the qi are respectively equivalent to the eigenvector matrix and the projections into the new basis. Factor analysis thus consists of determining the matrices Λ and Ψ that maximize the likelihood, under the assumption that the xi are distributed according to a normal distribution with both intrinsic scatter and measurement noise:

$$ \begin{aligned} P\left(\boldsymbol{x}_i\right) \sim \mathcal{N} \left(0 ,\boldsymbol{\Lambda } \boldsymbol{\Lambda }^T +\boldsymbol{\Psi } \right). \end{aligned} $$(3)

To estimate Λ, Ψ, and the explanatory factors qi, Ghahramani & Hinton (1997) propose a solution based on an expectation-maximization algorithm where qi and Λ are estimated iteratively. Here, unlike in conventional factor analysis, a reliable estimate of the spectral indicator measurement error is provided. For each SN Ia i, Ψi is the known diagonal matrix that contains the squared errors of the spectral indicators, hence, it is not necessary to fit for a global Ψ. We adapted the aforementioned expectation-maximization algorithm to take this change into account. Additional details can be found in Appendix B.
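A compact sketch of such an EM iteration with known per-object diagonal noise follows. This is our own minimal implementation of the technique, not the paper's code: the E-step computes the posterior moments of each qi given the current Λ, and the M-step solves for each row of Λ with per-object 1/Ψ weights.

```python
import numpy as np

def em_factor_analysis(X, Psi, m, n_iter=200, seed=0):
    """EM for factor analysis with known, per-object diagonal noise.

    X   : (n, s) observations (spectral features, centered and scaled)
    Psi : (n, s) known measurement variances (diagonal of each Psi_i)
    m   : number of explanatory factors
    Returns Lambda (s, m) and the posterior factor means (n, m).
    """
    n, s = X.shape
    rng = np.random.default_rng(seed)
    Lam = rng.normal(scale=0.1, size=(s, m))
    for _ in range(n_iter):
        # E-step: posterior moments of q_i given x_i and current Lambda
        M = np.zeros((n, m))      # E[q_i]
        S = np.zeros((n, m, m))   # E[q_i q_i^T]
        for i in range(n):
            LtPinv = Lam.T / Psi[i]                       # Lambda^T Psi_i^{-1}
            G = np.linalg.inv(np.eye(m) + LtPinv @ Lam)   # posterior covariance
            M[i] = G @ (LtPinv @ X[i])
            S[i] = G + np.outer(M[i], M[i])
        # M-step: solve for each row of Lambda with 1/psi weights
        for j in range(s):
            w = 1.0 / Psi[:, j]
            A = np.einsum('i,ikl->kl', w, S)
            b = (w * X[:, j]) @ M
            Lam[j] = np.linalg.solve(A, b)
    return Lam, M
```

The sign and rotation of the recovered factors are only defined up to an orthogonal transformation, so comparisons should be made on ΛΛT rather than on Λ itself.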

Finally, a prescription is needed to normalize each individual variable. The amplitude of the spectral indicator variation, expressed in Å, is not a good indicator of their impact on the spectral shape. As an example, Branch et al. (2006) show that the weak Si IIλ5972 line can play a significant role in subclassing SNe Ia. As a consequence, the input data are normalized to unit variance prior to the factor analysis, so that no spectral indicator is favored a priori. Within this framework, the eigenvalues of ΛΛT represent the variance explained by each factor, and s − Tr(ΛΛT) is the variance coming from the noise, where here s = 13 is the number of spectral indicators and Tr is the trace operator.

3.2.2. Outlier rejection

Outliers affect the sample variance of any population. As we care most about a correct description of the bulk of SNe Ia, it is desirable to identify and remove these outliers. Amongst them, SNe Ia of type SN1991T (Filippenko et al. 1992a) or SN1991bg (Filippenko et al. 1992b) are known to have spectral and photometric behavior different from other SNe Ia. However, basing an outlier rejection on this empirical identification has two issues. The first is that within the spectral indicator space, some of those subtypes may not appear as distinct subclasses, and so do not offer objective grounds for rejection. The second is that the attribution of a given SN Ia to one of these subclasses by SNID (Blondin & Tonry 2007) may give inconsistent results depending on the epoch considered for the identification. In order to apply a self-contained criterion, we thus resorted to a χ2-based definition of outliers: this quantity can be interpreted as the squared distance to the center of the distribution normalized by the natural dispersion. For SN Ia i, it is given by

$$ \begin{aligned} \chi ^2_i= \boldsymbol{x}_i^T\left(\boldsymbol{\Lambda } \boldsymbol{\Lambda }^T + \boldsymbol{\Psi }_i \right)^{-1} \boldsymbol{x}_i. \end{aligned} $$(4)

In the case of a Gaussian distribution, $\chi^2_i$ should follow a χ2-distribution with 13 degrees of freedom. An iterative cut at 3σ on the value of $\chi^2_i$ rejects 8 SNe Ia from the sample. A visual inspection carried out on each of these 8 SNe Ia, as well as a sub-classification made using SNID, show that four of those objects exhibit shallow silicon features as defined by Branch et al. (2006), two of them have high-velocity silicon features as defined by Wang et al. (2009), one is within the broad line subcategory, and the remaining one has a failed estimation of one of the spectral indicators.

We studied the influence of the 3σ cut by carrying out the analysis successively with and without the cut. Even though this had no major effect on the direction of the vectors, the cut is applied to the rest of the analysis to avoid training the model on outliers.
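The 3σ cut can be made concrete by converting the two-sided Gaussian tail probability into a χ2 threshold for 13 degrees of freedom. A minimal sketch (our own function names; the real procedure iterates, refitting the factor model on the survivors until no object exceeds the threshold, which we omit here):

```python
import numpy as np
from scipy import stats

def chi2_threshold(dof=13, nsigma=3.0):
    """chi^2 value whose tail probability equals that of a two-sided
    nsigma cut on a Gaussian, for the given degrees of freedom."""
    p_tail = 2.0 * stats.norm.sf(nsigma)   # ~0.0027 for 3 sigma
    return stats.chi2.ppf(1.0 - p_tail, dof)

def keep_mask(chi2_values, dof=13, nsigma=3.0):
    """One pass of the cut: True for objects below the threshold."""
    return np.asarray(chi2_values) <= chi2_threshold(dof, nsigma)
```

In the iterative version, each rejection changes the fitted ΛΛT and hence every χ2i, so the cut is reapplied until the kept sample is stable.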

3.2.3. Factor analysis results on spectral features

The expectation–maximization factor analysis (EM-FA) algorithm described in the previous sections is applied to the 13 spectral features measured at maximum light. The relative weight of the eigenvalues in the total variance is shown in Fig. 2, and the correlations between the eigenvectors and the spectral indicators are presented in Fig. 3. The significance of each correlation is expressed in units of σ, meaning that the null hypothesis of zero correlation is rejected at that confidence level. Figure 2 shows that two vectors dominate the variability in the space of spectral indicators. The first vector is linked with coherent variations of the pseudo-equivalent widths, except for the Ca II H&K and S II W lines, as seen in Fig. 3. The second vector is anticorrelated with these same two features, and describes a coherent variation of the velocities. Since the first vector is very similar to the pseudo-equivalent width of Si IIλ4131, which was shown by Arsenijevic et al. (2008) to be highly correlated with the stretch parameter, it is natural to assume that this vector represents the main source of intrinsic variability already known for SNe Ia. The second vector is mostly driven by the line velocities, pEW(Ca II H&K), and pEW(S II W), with smaller contributions from the other pseudo-equivalent widths. Interestingly, while investigating the spectral diversity of SNe Ia beyond stretch and color, Chotard et al. (2011) demonstrated the role of pEW(Ca II H&K), while Wang et al. (2009) focused on the role of the velocity of Si IIλ6355. Our second vector unifies these two approaches. The interpretation of the subsequent vectors is less straightforward. As their rank increases, they describe less of the sample variance, and thus exhibit lower correlations with the original data. The third vector is the last of the eigenvectors to show a strong correlation with one of the spectral indicators, pEW(Si IIλ5972).
The correlations with the SALT2 parameters are presented in Fig. 4. As expected from the correlations with the pseudo-equivalent widths, q1 is strongly correlated with X1. Also, q3 exhibits a significant correlation with X1, which may be an indication that the stretch is driven by two different parameters. The remaining factors are only weakly correlated with X1, and the correlations between SALT2 color and any of the factors are close to zero. The SALT2 absolute magnitude cosmology residual, ΔμB, is mainly correlated with q1 and q3, presumably due to the correlation of both with X1.

thumbnail Fig. 2.

Relative importance of the eigenvalues associated with the eigenvectors. They are ordered in decreasing order of variance, and the total amounts to 100% once the contribution of the noise (indicated by a red line) is taken into account. We can see that the first three vectors dominate the variability of this space.

thumbnail Fig. 3.

Pearson correlation coefficients between the spectral indicators and the first five factors. The eccentricities of the ellipses indicate the magnitude of the correlation, and the color its significance in units of σ. Vectors one and two correspond, respectively, to a global correlation of the pseudo-equivalent widths and of the line velocities. Only the first three vectors display correlations with a significance higher than 5σ.

Noise, in this situation, represents 18% of the observed variability, so only the first two vectors clearly outweigh the noise, and it is legitimate to ask whether the remaining vectors reflect anything more than statistical fluctuations. For the rest of the analysis, we keep the first three factors for the final training, and discuss this choice in Sect. 6.2 by looking at the impact on the final model.

3.3. Extinction curve estimation

In this section, we estimate the average extinction curve by generalizing the procedure described in Chotard et al. (2011). To do so, a model of the SED at maximum light is derived. From this model we deduce the average extinction curve; the estimation procedure and the results are described in the following subsections.

3.3.1. Empirical description

The SED model at maximum light presented here is a generalization of the model presented in Chotard et al. (2011), in which the authors removed intrinsic variability tied to spectral features and measured the remaining spectral variation, which was found to be consistent with an extinction curve like that for the Galaxy. The major difference is that the intrinsic description is here motivated by the factors derived in Sect. 3.2 rather than being described only by pEW(Si IIλ4131) and pEW(Ca II H&K) as was done in Chotard et al. (2011). To model the SED we define for the SN Ia i the vector xi ≡ {hi, 1, hi, 2, hi, 3}, which is the true value of the measured factor qi = {qi, 1, qi, 2, qi, 3}. We propose that these are related to the intrinsic absolute magnitude by:

$$ \begin{aligned} M_i(\lambda )=M_0(\lambda )+ \sum _{j=1}^{3} h_{i,j}\, \alpha _j(\lambda ) + A_{\lambda _0,i}\, \gamma (\lambda ), \end{aligned} $$(5)

where M0(λ) is the average spectrum in absolute magnitude, and αj(λ) is the spectrum related to the factor hi, j, where j is the factor index running from one to three (the choice of the number of components is discussed in Sect. 6.2). Aλ0, i is a free parameter that can be interpreted as the absorption due to extinction at a reference wavelength λ0 (set here at the median wavelength, 5305 Å, without loss of generality). γ(λ) is an arbitrary function that represents the effect of the extinction; we set no prior on its shape. Once the function γ(λ) is fixed, it is possible to fit to it the total-to-selective extinction ratio RV of the Cardelli et al. (1989) law and the V-band absorption AV; this is described in the next sub-section. The model parameters here are M0(λ), the αj(λ), γ(λ), the hi, j, and the Aλ0, i. It is understood that we are modeling the SED at the specific wavelengths where we have measurements. For notational efficiency we rewrite Eq. (5) as:

$$ \begin{aligned} \mathbf{M} _i = \mathbf{A} \boldsymbol{h}_{i},\ \mathrm{where} \end{aligned} $$(6)

$$ \begin{aligned} \mathbf{A} =\left(\mathbf{M} _{0},\boldsymbol{\gamma }, \boldsymbol{\alpha }_{1},\boldsymbol{\alpha }_{2},\boldsymbol{\alpha }_{3} \right), \end{aligned} $$(7)

$$ \begin{aligned} \boldsymbol{h}_i^T=\left(1,A_{\lambda _0,i},h_{i,1},h_{i,2}, h_{i,3}\right). \end{aligned} $$(8)

In Chotard et al. (2011), the parameters αj(λ) were computed in sequential order, and the reddening curve γ(λ) as well as the extinction Aλ0, i were determined in a second step. We improve on this procedure by applying a global fit for all parameters at once. This is described below, and allows us to derive a global average RV for all SNe Ia, and an AV for each SN Ia.
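Equations (5)-(8) describe a model that is linear in the extended parameter vector hi. A small sketch (our own function names, with toy array shapes) showing the equivalence between the component form of Eq. (5) and the matrix form of Eqs. (6)-(8):

```python
import numpy as np

def sugar_mag_at_max(M0, alphas, gamma, h, A_lam0):
    """Absolute magnitude model of Eq. (5) at maximum light.

    M0     : (n_wave,) mean spectrum in AB magnitudes
    alphas : (3, n_wave) intrinsic components alpha_j(lambda)
    gamma  : (n_wave,) extinction-like curve
    h      : (3,) intrinsic factors of the supernova
    A_lam0 : extinction parameter at the reference wavelength
    """
    return M0 + h @ alphas + A_lam0 * gamma

def sugar_matrix_form(M0, alphas, gamma, h, A_lam0):
    """Equivalent matrix form M_i = A h_i of Eqs. (6)-(8)."""
    A = np.column_stack([M0, gamma, *alphas])    # (n_wave, 5)
    h_full = np.concatenate([[1.0, A_lam0], h])  # (5,)
    return A @ h_full
```

Writing the model as M_i = A h_i is what allows the alternating fit of the next subsection: with h_i fixed the problem is linear in A, and with A fixed it is linear in h_i.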

3.3.2. Fitting the model at maximum light and RV

In this section we explain the framework for fitting the free parameters of Eq. (5) and how we then determine the AV and RV parameters. Fitting the parameters of the absolute SED of Eq. (5) is done using an orthogonal distance regression. Estimating these parameters within the framework of an orthogonal distance regression amounts to minimizing the following χ2:

$$ \begin{aligned} \chi ^2&= \sum _i \left( \mathbf{M} _i^\mathrm{obs} - \mathbf{A} \boldsymbol{h}_{i} \right) ^T \mathbf{W} _{\mathbf{M} _i} \left( \mathbf{M} _i^\mathrm{obs} - \mathbf{A} \boldsymbol{h}_{i} \right) \nonumber \\&\quad + \left( \boldsymbol{q}_i - \boldsymbol{x}_{i} \right)^T \mathbf{W} _{\boldsymbol{q}_i} \left( \boldsymbol{q}_i - \boldsymbol{x}_{i} \right), \end{aligned} $$(9)

where $\mathbf{M}_i^{\rm obs}$ is the observed spectrum of the SN Ia i in absolute AB magnitude and WMi is the weight matrix of $\mathbf{M}_i^{\rm obs}$, defined as:

$$ \begin{aligned} \mathbf W _\mathbf{M _i}= \left[ \begin{pmatrix} \ddots&\,&0 \\&\sigma _{\lambda i}^2&\\ 0&\,&\ddots \end{pmatrix} \ + \ \left(\sigma _{\rm cal}^2+\sigma _{z}^2\right) \begin{pmatrix} 1&\cdots&1 \\ \vdots&\ddots&\vdots \\ 1&\cdots&1 \end{pmatrix} \ + \ \mathbf D \right]^{-1}, \end{aligned} $$(10)

where σλi is the uncertainty of the SN Ia i at the wavelength λ derived from the spectral error, σcal is the per-spectrum calibration uncertainty, taken to be 0.03 mag, σz is the combination of the redshift error and the uncertainty of 300 km s−1 due to peculiar velocities, and D is the dispersion matrix. The dispersion matrix is fit to deal with the remaining variability that is not described by the three factors and the extinction. It includes an estimate of the remaining chromatic and achromatic (gray) dispersions. The estimation of D is described in Appendix D. Wqi is the weight matrix of qi, which comes from projecting the spectral feature uncertainties into the factor sub-space, and is defined as:

$$ \begin{aligned} \mathbf W _{\boldsymbol{q}_i} = \boldsymbol{\Lambda }^T \boldsymbol{\Psi }_i^{-1} \boldsymbol{\Lambda }, \end{aligned} $$(11)

where Λ and Ψi are the same as in Sect. 3.2. The unknown parameters of the model are the matrix A and the vector hi. These are found by minimizing Eq. (9) using an expectation-minimization algorithm described in Appendix C.

Once the parameters that minimize Eq. (9) have been found, we can compare γ(λ) with an extinction curve to check their compatibility. It is straightforward to estimate the global mean RV directly from γ(λ) by minimizing

$$ \begin{aligned} \chi _{R_V}^2 = \left[\boldsymbol{\gamma }-\left(\boldsymbol{a}+\frac{1}{R_V}\boldsymbol{b}\right) \right]^T \ \left[ \boldsymbol{\gamma }-\left(\boldsymbol{a}+\frac{1}{R_V}\boldsymbol{b}\right)\right]\ , \end{aligned} $$(12)

where the a and b vectors are the extinction curve coefficients defined in Cardelli et al. (1989). Minimizing $\chi_{R_V}^2$ as a function of RV then gives:

$$ \begin{aligned} R_{V}=\frac{{\boldsymbol{b}}^{T}{\boldsymbol{b}}}{{\boldsymbol{\gamma }}^{T}\boldsymbol{b}-\boldsymbol{a}^{T}\boldsymbol{b}} \cdot \end{aligned} $$(13)
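The closed-form solution of Eq. (13) is easy to verify numerically: if γ is built exactly as a + b/RV for some known RV, the formula must recover it. A minimal sketch (with random stand-ins for the Cardelli et al. 1989 a and b vectors, not the real coefficients):

```python
import numpy as np

def fit_rv(gamma, a, b):
    """Closed-form R_V of Eq. (13), the minimizer of Eq. (12)."""
    return (b @ b) / (gamma @ b - a @ b)

# synthetic check: a gamma built with R_V = 2.6 must give back 2.6
rng = np.random.default_rng(1)
a = rng.normal(size=50)                # stand-in for the Cardelli 'a' vector
b = rng.normal(size=50)                # stand-in for the Cardelli 'b' vector
gamma = a + b / 2.6
rv = fit_rv(gamma, a, b)
```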

We note that the value we quote for RV within the context of our model depends on the assumptions adopted in the evaluation of the dispersion matrix (discussed in detail in Sect. 6.1) and therefore does not necessarily correspond to the true mean properties of dust. This uncertainty associated with separating SN Ia color behavior from dust behavior is common to all SN Ia fitting methods. Once the value of the global mean RV is fixed, we determine the absorption in the V-band, AV for each SN Ia. This amounts to minimizing

$$ \begin{aligned} \chi _{A_V}^2&= \sum _i \left(\boldsymbol{\delta }\tilde{\mathbf{M }}_i -A_{V,i} \ \left(\boldsymbol{a}+\frac{1}{R_V}\boldsymbol{b}\right) \right)^T \mathbf W _\mathbf{M _i} \nonumber \\&\quad \quad \times \left(\boldsymbol{\delta }\tilde{\mathbf{M }}_i -A_{V,i} \ \left(\boldsymbol{a}+\frac{1}{R_V}\boldsymbol{b}\right)\right), \end{aligned} $$(14)

where AV, i is the absorption in the V-band for the SN Ia i and $\boldsymbol{\delta}\tilde{\mathbf{M}}_i$ are the residuals corrected for the intrinsic variability:

$$ \begin{aligned} \boldsymbol{\delta }\tilde{\mathbf{M }}_i = \mathbf M _{i}^\mathrm{obs} -\mathbf M _{0}- \ \sum _j q_{i,j} \boldsymbol{\alpha }_{j}. \end{aligned} $$(15)

Setting the derivative of $\chi_{A_V}^2$ with respect to AV, i to zero, we find:

$$ \begin{aligned} A_{V,i}=\frac{\left(\boldsymbol{a}+\frac{1}{R_V}\boldsymbol{b}\right)^T \mathbf W _\mathbf{M _i}\boldsymbol{\delta }\tilde{\mathbf{M }}_i}{\left(\boldsymbol{a}+\frac{1}{R_V}\boldsymbol{b}\right)^T\mathbf W _\mathbf{M _i}\left(\boldsymbol{a}+\frac{1}{R_V}\boldsymbol{b}\right)}\cdot \end{aligned} $$(16)
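Equation (16) is a weighted projection of the intrinsic-corrected residuals onto the fixed extinction curve. A minimal numerical check (toy data; identity weights and synthetic a, b vectors rather than the real Cardelli et al. 1989 coefficients):

```python
import numpy as np

def fit_av(delta_M, W, a, b, rv):
    """Closed-form A_V of Eq. (16) for one SN Ia."""
    c = a + b / rv                      # fixed extinction curve shape
    return (c @ W @ delta_M) / (c @ W @ c)

rng = np.random.default_rng(2)
a, b = rng.normal(size=40), rng.normal(size=40)
residual = 0.35 * (a + b / 2.6)         # residuals that are pure extinction
av = fit_av(residual, np.eye(40), a, b, 2.6)
```

Residuals that are exactly 0.35 times the extinction curve give back AV = 0.35, confirming the projection formula.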

3.3.3. Results

The model above is trained on the 105 SNe Ia that remain in the training sample. The vectors αj for j = 1, j = 2, and j = 3, as well as their effect on the average spectrum M0, are presented in Fig. 5. To quantify the overall impact of each vector in units of magnitude, we compute the average rms of hjαj, the deviation from the average spectrum M0.

The vector α1, presented in Fig. 5, has an average impact of 0.17 mag when multiplied by the standard deviation of q1. This is the expected amplitude for a stretch effect. The structure of this vector is associated with variations of the line depths, consistent with the correlation of the factor q1 with both the pseudo-equivalent widths and the stretch (cf. Figs. 3 and 4).

Fig. 4.

Pearson correlation coefficients between the SALT2 parameters and the first five vectors. The eccentricities of the ellipses indicate the magnitude of the correlation, and the color the importance of the correlation in units of σ.

The vector α2 shown in Fig. 5, which is correlated with velocities, has a much weaker impact of 0.05 mag on the average spectrum M0. This is consistent with the fact that this effect has not yet been detected in purely photometric data. Moreover, the associated variability is concentrated in localized structures such as the regions of the Ca II H&K, Si IIλ6355, and Ca II IR lines. Closer scrutiny shows that the variability especially affects the bluer edge of the features, leading to an overall effect on the line velocities. This is as expected from the correlations of q2 with the minima of the P-Cygni profiles shown in Fig. 3.

Fig. 5.

Impact of each component on the average spectrum. Top panel, top: average spectrum M0 and the predicted effect of a variation of q1 by ±1σ. Bottom: corresponding α1 vector. Middle panel, top: average spectrum M0 and the predicted effect of a variation of q2 by ±1σ. Bottom: corresponding α2 vector. Bottom panel, top: average spectrum M0 and the predicted effect of a variation of q3 by ±1σ. Bottom: corresponding α3 vector.

The vector α3, shown in Fig. 5, has an impact of 0.05 mag on the average spectrum M0. Like the first component q1, q3 is correlated with the stretch (cf. Fig. 4), but the corresponding vector α3 is less structured than α1 in the optical band. However, the expected correlations with the spectral features are consistent with the behavior of q3 (cf. Fig. 3). Moreover, the most prominent features affect the extreme UV part of the spectrum as well as the Ca II H&K and Ca II IR regions. For the latter, brighter SNe Ia exhibit a stronger trough in the higher velocity part of the blends, which could link this vector to the presence of high-velocity calcium structure in the ejecta.

In summary, the analysis of the SED components at maximum light confirms that, as expected, the stretch has the dominant effect on magnitudes, but one also has to take into account other sources of variability, which are difficult to detect in photometric bands since they are linked to localized features such as line velocities.

The γ(λ) curve is shown in Fig. 6 and is consistent with a dust extinction curve. This shows that three eigenvectors provide a sufficient description of the intrinsic variability for the purpose of deriving a color law. The fit of γ(λ) to a Cardelli et al. (1989) law gives a value of RV = 2.6. The Cardelli et al. (1989) law fit coincides remarkably well with γ(λ), except in the UV, where it differs slightly. The RV value is mainly driven by the reddest SNe Ia. Indeed, as can be seen in the lower panel of Fig. 6, three SNe Ia dominate the final value of RV, and removing any of them from the sample would significantly alter the result, hence the rather large uncertainty.

Fig. 6.

Extinction curve fit result. Top panel: empirical average extinction curve, γ(λ), compared to the best fit by a Cardelli et al. (1989) law, which gives RV = 2.6. The shaded blue area represents a ±0.5 range on RV, in order to illustrate the typical range of values measured for SNe Ia in the literature. The black dot on this curve indicates the wavelength used in the lower panel. Lower panel: residuals after correction for the intrinsic behavior at 4012 Å as a function of Aλ0. Each dot corresponds to a single SN Ia. The γ4012 Å slope between M and Aλ0 is indicated by the red continuous line. The dashed line represents the Cardelli et al. (1989) law for RV = 2.6, and the shaded blue area again represents a ±0.5 range on RV.

We chose to fit an average extinction curve for the whole sample, but there are observational indications that extinction curves exhibit some diversity (Amanullah et al. 2015). Moreover, any such extinction variation, or intrinsic color variation (e.g. Foley & Kasen 2011; Polin et al. 2019), that the model at maximum is unable to capture goes into the dispersion matrix. However, in order to keep the model simple, we keep an average value of RV and leave the analysis of RV variations for future work.

Figure 7 presents the set of spectra before any correction and the residuals resulting from the difference between the observed spectrum and Eq. (5). In the top panel, the effect of extinction is clearly visible for spectra before any correction. After correction, the residuals between the models and data are mainly shifted in magnitude with respect to one another and no effect on color appears to remain. This is the signature expected for a mostly gray residual offset.

Fig. 7.

Spectra before and after applying the model at maximum light. Top panel: 105 spectra in absolute AB magnitude before correction (upper set of spectra) and after correction for intrinsic properties and color (lower set of spectra). An arbitrary offset has been applied to each set. The color code indicates the value of Aλ0. Lower panel: wRMS of the residuals for the uncorrected (red) and corrected (blue) spectra.

The amplitude of the correction made by the three factors q1, q2, q3, and the extinction curve, given by the wRMS as a function of wavelength, is presented in the lower panel of Fig. 7. Before any correction, the dispersion in the B-band is around 0.4 mag, as expected. Spectra before any correction exhibit both localized structure coming from intrinsic variability and a slowly increasing dispersion toward the bluer part of the spectrum, which is the signature of extinction. Once spectra are corrected for the three factors q1, q2, q3, and the extinction curve, the wRMS of the residuals drops to a floor of 0.1 mag. However, localized structures remain, concentrated in the UV, Ca II H&K, Si IIλ6355, O Iλ7773, and Ca II IR regions. The amplitude of the baseline is compatible with the value of the gray intrinsic dispersion added in cosmological fits for nearby supernovae (Betoule et al. 2014). The fact that part of the residual is gray is also visible in the dispersion matrix, as can be seen in Fig. 8: the high correlation across all wavelengths is consistent with a gray fluctuation. However, some structures in the dispersion matrix are not due to this gray effect but to variability in certain spectral zones that is not captured by SUGAR, which explains some of the features observed in the dispersion matrix and in Fig. 7.

Fig. 8.

Square root of the diagonal of the dispersion matrix D (top) and the correlation matrix corresponding to the dispersion matrix D (bottom). The shading represents the degree of correlation, as given by the Pearson correlation coefficient, ρ.

3.4. Time interpolation using Gaussian processes

We now want to extend our model from maximum light to the full spectral time series. Spectra from the SNfactory are taken at different epochs in the development of each SN Ia, and handling these observations taken at different times is an important step before constructing the full SED model. In order to work with a fixed time grid, it is necessary to use a method of interpolation. One solution is to use Gaussian processes, as was done for SNe Ia in works like Kim et al. (2013), Fakhouri et al. (2015), and Saunders et al. (2018). A complete review of Gaussian processes can be found in Rasmussen & Williams (2006), on which the implementation presented here is based. However, we have developed a specific implementation in order to take into account wavelength dependencies and to accelerate computation of the interpolation. These are described below.

A Gaussian process is a generalization of the Gaussian probability distribution, and provides a non-parametric way to interpolate data. The main assumption of Gaussian processes is that the observed data are the realization of a Gaussian random field characterized by an average function and a correlation function. Therefore, the distribution of the SN Ia magnitude m at a given phase t and a given wavelength λ follows this relation:

$$ \begin{aligned} m\left(t,\lambda \right) \sim \mathcal{N} \left(m_0\left(t,\lambda \right),\mathbf K _{\lambda }\right), \end{aligned} $$(17)

where m0(t,λ) is the average function of the SNe Ia SED, and Kλ corresponds to the correlation matrix (commonly called the kernel) that describes time correlations between different epochs. In the absence of an explosion model that would yield the analytical form of Kλ, we choose a squared exponential kernel, to which the measurement and calibration uncertainties are added:

$$ \begin{aligned} K_{\lambda } \left(t_{i}, t_{j}\right)=\phi _{\lambda }^{2} \ \exp \left[-\frac{1}{2} \left(\frac{t_{i}-t_{j}}{l_{\lambda }}\right)^{2}\right] + \sigma _{\rm gray}^2 \ \delta _{ij}^{\prime} , \end{aligned} $$(18)
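A direct transcription of Eq. (18) into code might look as follows (a sketch; the gray term is added only on the diagonal of the training-data block, and the values of ϕ and l are placeholders):

```python
import numpy as np

def sq_exp_kernel(t1, t2, phi, ell, sigma_gray=0.03):
    """Squared-exponential kernel of Eq. (18); the gray term enters only
    when evaluating the kernel on the training phases themselves."""
    K = phi**2 * np.exp(-0.5 * ((t1[:, None] - t2[None, :]) / ell) ** 2)
    if t1 is t2:                        # K(t, t): add the gray diagonal term
        K = K + sigma_gray**2 * np.eye(len(t1))
    return K

t = np.linspace(-12.0, 48.0, 14)        # placeholder observed phases
K = sq_exp_kernel(t, t, phi=0.4, ell=7.0)
```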

where $\phi_{\lambda}^2$ represents the spectral variance around the average function at the wavelength λ, lλ corresponds to the temporal correlation length at the wavelength λ, ti and tj correspond to the phases of observations i and j, and σgray is the gray error, taken to be 0.03 mag. All wavelengths are treated independently of each other; that is, the interpolation is performed on yλn, the nth SN Ia light curve at wavelength λ, from which the average function has been subtracted. For a given set of global hyperparameters (ϕλ, lλ), the interpolation $\boldsymbol{y}_{\lambda n}^{\prime}$ and the covariance matrix of the interpolation uncertainties, $\text{cov}\left(\boldsymbol{y}_{\lambda n}^{\prime}\right)$, are given by:

$$ \begin{aligned}&\boldsymbol{y}_{\lambda n}^{\prime} = \mathbf K _{\lambda n}(\mathbf t^{\prime} ,\mathbf t )^T\left(\mathbf K _{\lambda n}(\mathbf t ,\mathbf t )+ \boldsymbol{\sigma }_{\lambda }^2 {\boldsymbol{I}}\right)^{-1} \boldsymbol{y}_{\lambda n} \end{aligned} $$(19)

$$ \begin{aligned}&\text{ cov}\left(\boldsymbol{y}_{\lambda n}^{\prime} \right) = \mathbf K _{\lambda n}(\mathbf t^{\prime} ,\mathbf t^{\prime} ) - \mathbf K _{\lambda n}(\mathbf t^{\prime} ,\mathbf t )^T\left(\mathbf K _{\lambda n}(\mathbf t ,\mathbf t )+ \boldsymbol{\sigma }_{\lambda }^2 {\boldsymbol{I}}\right)^{-1}\mathbf K _{\lambda n}(\mathbf t^{\prime} ,\mathbf t ), \end{aligned} $$(20)
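Equations (19) and (20) are the standard Gaussian process posterior mean and covariance. A minimal sketch for a single wavelength bin, assuming the average function has already been subtracted (toy phases and values; not the SNfactory implementation):

```python
import numpy as np

def gp_interpolate(t_obs, y_obs, sigma2, t_new, phi, ell):
    """Posterior mean (Eq. 19) and covariance (Eq. 20) for one wavelength,
    with the average function already subtracted from y_obs."""
    def k(ta, tb):
        return phi**2 * np.exp(-0.5 * ((ta[:, None] - tb[None, :]) / ell) ** 2)
    K = k(t_obs, t_obs) + np.diag(sigma2)   # K(t, t) + sigma_lambda^2 I
    Ks = k(t_new, t_obs)                    # K(t', t)
    y_new = Ks @ np.linalg.solve(K, y_obs)
    cov = k(t_new, t_new) - Ks @ np.linalg.solve(K, Ks.T)
    return y_new, cov

t_obs = np.array([-8.0, -3.0, 0.0, 5.0, 12.0])   # toy phases (days)
y_obs = np.sin(t_obs / 10.0)                     # toy mean-subtracted magnitudes
y_new, cov = gp_interpolate(t_obs, y_obs, 1e-4 * np.ones(5),
                            np.array([0.0, 30.0]), phi=0.4, ell=7.0)
```

At an observed phase the prediction tracks the data; far from the data (here ∼3 correlation lengths away) the prediction relaxes toward the subtracted mean and the posterior variance grows toward ϕ², as described in the text.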

where the vector $\boldsymbol{\sigma}_{\lambda}^2$ contains the errors on the magnitude at the wavelength λ for all observed phases, t′ is the new time grid, and t the observed phases of the SN Ia. Gaussian processes assume that the observed data are distributed around an average function. This function is a necessary input, because in a region with no data points the Gaussian process interpolation converges to it (after ∼3 temporal correlation lengths). The calculation of this average function is performed as follows. The training spectra are binned according to phase from maximum light in B-band, ranging from −12 to +48 days. Each phase bin spans two days and contains at least three spectra. The weighted average spectrum is computed for each phase bin and taken as a first approximation of the average function. However, due to some lower signal-to-noise spectra, this version of the average function is not smooth. A Savitzky–Golay filter is therefore used to smooth the average function, which is then used for the Gaussian process interpolation.

What remains is to determine the pair of hyperparameters θλ = (ϕλ, lλ) that represent, respectively, the amplitude of the fluctuation around the average function and the temporal correlation length. To estimate the hyperparameters for each wavelength, we maximize the product of all individual likelihoods:

$$ \begin{aligned} \mathcal{L} _{\lambda }&= \prod _{n} \frac{1}{(2\pi) ^{\frac{N}{2}}} \ \frac{1}{|\mathbf K _{\lambda n}(\mathbf t ,\mathbf t )+ {\boldsymbol{\sigma }}_{\lambda }^2 {\boldsymbol{I}}| ^{\frac{1}{2}}} \nonumber \\&\quad \times \exp \left(-\frac{1}{2} \boldsymbol{y}_{\lambda n}^T\left(\mathbf K _{\lambda n}(\mathbf t ,\mathbf t )+ \boldsymbol{\sigma }_{\lambda }^2 {\boldsymbol{I}}\right)^{-1} \boldsymbol{y}_{\lambda n} \right). \end{aligned} $$(21)
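In practice one minimizes the negative logarithm of Eq. (21), summed over all SNe Ia at a fixed wavelength. A sketch of that objective (toy data; a real fit would pass this function to a numerical optimizer such as scipy.optimize.minimize):

```python
import numpy as np

def neg_log_likelihood(theta, t_list, y_list, sigma2_list):
    """Negative log of Eq. (21): GP marginal likelihood at one wavelength,
    summed over SNe Ia that share the hyperparameters (phi, l)."""
    phi, ell = theta
    nll = 0.0
    for t, y, s2 in zip(t_list, y_list, sigma2_list):
        K = phi**2 * np.exp(-0.5 * ((t[:, None] - t[None, :]) / ell) ** 2)
        K = K + np.diag(s2)
        sign, logdet = np.linalg.slogdet(K)
        nll += 0.5 * (logdet + y @ np.linalg.solve(K, y)
                      + len(t) * np.log(2.0 * np.pi))
    return nll

rng = np.random.default_rng(3)
t = np.sort(rng.uniform(-12.0, 48.0, size=10))   # toy phases for one SN Ia
y = 0.4 * rng.normal(size=10)                    # toy mean-subtracted data
nll = neg_log_likelihood((0.4, 7.0), [t], [y], [1e-3 * np.ones(10)])
```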

This differs from previous work for two main reasons: the wavelength dependence of the hyperparameters, and their estimation using all the SNe Ia. These two new features are justified mainly by SN Ia physics and computation time. Since SNe Ia are standardizable candles, we treat them as realizations of the same Gaussian random field, that is, we assume that they share the same set of hyperparameters. In this case, we can use all available SNe Ia to estimate these hyperparameters. The wavelength dependence of the hyperparameters is due to, for example, dust extinction, intrinsic variability, and the second maximum in the infrared. It follows that the standard deviation around the average function and the temporal correlation length should vary across wavelength. Moreover, the wavelength dependence allows us to be very efficient in terms of computation time. Indeed, it sped up matrix inversion from the classical $\mathcal{O}(N_{\lambda}^3 \times N_{t}^3)$ to $\mathcal{O}(N_{\lambda} \times N_{t}^3)$, where Nλ is the number of wavelength bins (197 here), and Nt the number of observed phases (∼14 on average). On a MacBook Pro with a 2.9 GHz Intel Core i5 processor and 8 GB of RAM, finding all hyperparameters (2 × 197), computing the interpolation, and computing the pull distribution of the residuals took under 6 min.

Results of the hyperparameter adjustment as a function of wavelength are shown in Fig. 9. First, it can be seen that the structure of the hyperparameters as a function of wavelength is not random. The value of the parameter ϕλ varies from 0.3 mag to more than 0.6 mag; this number is difficult to interpret directly because it is the amplitude around an average value at all phases of the data. Nevertheless, the increase in the ultraviolet and blue wavelengths can be explained by the dispersion caused by dust extinction in the host galaxies and a larger intrinsic dispersion in this wavelength range (Maguire et al. 2012; Nordin et al. 2018). There are also peaks in this intrinsic dispersion that correspond to the regions of the Ca II H&K, Si IIλ6355, and Ca II IR lines. The correlation length, lλ, has structure in the silicon and calcium regions and remains globally stable, with no features between 6000 Å and 8000 Å (average value of seven days). Once the hyperparameters have been estimated, the interpolations for each SN Ia are performed onto a new phase grid, covering a range between −12 and +48 days around maximum brightness in B-band, divided into three-day bins. Finally, to check whether both the predictions (interpolation) and the associated errors are in agreement with what is expected from a Gaussian process, we compute the pull distribution of the residuals. Because we assume that our data are realizations of a Gaussian random field, the pull distribution should follow a centered unit normal distribution. For each wavelength we fit a Gaussian to the pull distribution and calculate its standard deviation. The results are shown in the bottom of Fig. 9. The pull standard deviation is on average lower than 1, with an average value of 0.8. These interpolation results are used to establish the SED model that is described in the following sections.

Fig. 9.

Results of the Gaussian process interpolation. From top to bottom: amplitude of the kernel, correlation length, and standard deviation of the pull distribution as a function of wavelength. The gray areas represent the absorption lines that were used in our original factor analysis.

4. The SUGAR model

4.1. The model

The SUGAR model assumes that the SED variation at any epoch may be described by spectral features measured at maximum. The SUGAR SED model is based on a linear combination of the factors derived in Sect. 3.2.3 and an extinction curve taken as a Cardelli et al. (1989) law as derived in Sect. 3.3. To model the SUGAR SED, we propose a model similar to the one we constructed for maximum light. This SUGAR model is based on the three factors for each SN Ia i, combined into the vector xi ≡ {hi, 1, hi, 2, hi, 3}, which in the model represents the true values of the measured factors qi = {qi, 1, qi, 2, qi, 3}; they are related to the intrinsic magnitude by:

$$ \begin{aligned} M_i(t,\lambda )=M_0(t,\lambda )+ \ \sum _{j=1}^{j=3} h_{i,j} \alpha _j(t,\lambda ) + A_{V,i} \ f\left(\lambda ,R_V\right) +\Delta M_{\mathrm{gray} \ i}, \end{aligned} $$(22)

where M0(t, λ) is the magnitude of the average spectral time series and αj(t, λ) is the intrinsic variation related to the factor hi, j; as discussed in Sect. 3.2.3, the number of factors is set to three. As at maximum, hi, j represents the true value of the measured factor qi, j derived in Sect. 3.2. AV, i is the extinction in the V-band and f(λ,RV) is the Cardelli et al. (1989) law, where RV is the total-to-selective extinction ratio derived in Sect. 3.3. Finally, the term ΔMgray i is a gray, time-independent term comparable to the parameter X0 of SALT2, which allows us to work independently of the distance. Indeed, the effect of the distance on magnitude is equivalent to an additive constant independent of wavelength. Moreover, the parameter ΔMgray i also contains the average of the spectral time series residuals with respect to the SUGAR model. The addition of ΔMgray i differs from what was done when determining the extinction curve in Sect. 3.3, and this choice is discussed in Sect. 6.1. We model the SED at the same wavelengths and epochs where the Gaussian process interpolation is done. For notational efficiency we rewrite Eq. (22) as:

$$ \begin{aligned}&\mathbf M _i = \mathbf A \boldsymbol{h}_{i},\, \mathrm{where} \end{aligned} $$(23)

$$ \begin{aligned}&\mathbf A =\left(\mathbf M _{0}, \mathbf f \left(R_V\right), 1,\boldsymbol{\alpha }_{1},\boldsymbol{\alpha }_{2},\boldsymbol{\alpha }_{3} \right), \end{aligned} $$(24)

$$ \begin{aligned}&\boldsymbol{h}_i^T=\left(1,A_{V,i}, \Delta M_{\mathrm{gray} \ i},h_{i,1},h_{i,2}, h_{i,3}\right) . \end{aligned} $$(25)
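Given trained components, evaluating Eq. (22) for one supernova is a linear combination. A sketch with hypothetical grid sizes (21 phases × 197 wavelength bins) and placeholder component values:

```python
import numpy as np

def sugar_magnitude(M0, alphas, f_ext, q, av, dm_gray):
    """Evaluate Eq. (22): model magnitude surface M_i(t, lambda)."""
    return M0 + np.tensordot(q, alphas, axes=1) + av * f_ext + dm_gray

# hypothetical grid: 21 phases x 197 wavelength bins, constant components
M0 = np.zeros((21, 197))
alphas = np.array([0.10, 0.05, 0.05])[:, None, None] * np.ones((3, 21, 197))
f_ext = np.ones(197)                    # extinction law on the wavelength grid
M = sugar_magnitude(M0, alphas, f_ext,
                    q=np.array([1.0, -0.5, 0.2]), av=0.3, dm_gray=0.05)
```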

The unknown parameters of the SUGAR model are therefore the average spectrum M0(t, λ), the intrinsic coefficients αj(t, λ), and the gray offset ΔMgray i. Each SN Ia is parametrized during the training by the qi, j determined in Sect. 3.2.3 and the AV, i derived in Sect. 3.3. The average spectral time series M0(t, λ), the intrinsic coefficients αj(t, λ), and the gray offset ΔMgray i are determined in the same way as for the model described at maximum light in Sect. 3.3. The main differences with Sect. 3.3 are that the model now depends explicitly on the phases of observation, the extinction curve is fixed at the maximum light value (same RV), and the model is trained on the Gaussian process interpolation of the spectra derived in Sect. 3.4 instead of the measured spectra. The procedure is described below.

4.2. Fitting the SUGAR model

We use an orthogonal distance regression method to estimate the parameters of the SUGAR model. This is a version of the method described in Sect. 3.3, modified to include the Gaussian process interpolation, the known extinction parameters, and the gray offset. To estimate the SUGAR parameters, we minimize the following χ2:

$$ \begin{aligned} \chi ^2&= \sum _i \left( \mathbf M _i^\mathrm{gp} - \mathbf A \boldsymbol{h}_{i} \right) ^T \mathbf W _\mathbf{M _i} \left( \mathbf M _i^\mathrm{gp} - \mathbf A \boldsymbol{h}_{i} \right) \nonumber \\&\quad +\left( \boldsymbol{q}_i - \boldsymbol{x}_{i} \right)^T \mathbf W _{\boldsymbol{q}_i} \left( \boldsymbol{q}_i - \boldsymbol{x}_{i} \right), \end{aligned} $$(26)

where the vector M i gp $ \mathbf{M}_i^{\rm gp} $ are the Gaussian process interpolations of the observed spectra (Sect. 3.4) which is in the following vector format:

$$ \begin{aligned} \mathbf M _{i}^\mathrm{gp} =\left(\boldsymbol{y}_{3340\,\AA \ i}^{\prime} \cdots \ \boldsymbol{y}_{\lambda \ i}^{\prime}\cdots \ \boldsymbol{y}_{8580\,\AA \ i}^{\prime} \right), \end{aligned} $$(27)

where $\boldsymbol{y}_{\lambda \ i}^{\prime}$ is the light curve at wavelength λ interpolated by Gaussian processes onto the new phase grid as described in Sect. 3.4. As in Sect. 3.3, the uncertainties affecting the xi are propagated through the weight matrix Wqi. The hi play a role analogous to the qi of Sect. 3.2 and are estimated like the other free parameters of the model. Finally, the matrix WMi is the weight matrix of the SN Ia i and has the following block structure:

$$ \begin{aligned} \mathbf W _\mathbf{M _i}= \begin{pmatrix} \text{ cov}\left(\boldsymbol{y}_{3340\,\AA \ i}^{\prime} \right)&\,&\,&0\\&\ddots&\,&\\&\,&\text{ cov}\left(\boldsymbol{y}_{\lambda \ i}^{\prime} \right)&\,&\\&\,&\ddots&\\ 0&\,&\,&\text{ cov}\left(\boldsymbol{y}_{8580\,\AA \ i}^{\prime} \right) \end{pmatrix}^{-1}\ , \end{aligned} $$(28)

where $\text{cov}\left(\boldsymbol{y}_{\lambda \ i}^{\prime}\right)$ is the covariance of the interpolation uncertainties from the Gaussian process interpolations determined in Sect. 3.4. The interpolation is done only in terms of the phase relative to B-band maximum, which makes the weight matrix block diagonal for each SN Ia. In this analysis, we do not add any covariance in terms of wavelength coming from the calibration uncertainties, for reasons of computation time and because these terms are expected to be small. These block diagonal matrices allowed us to accelerate the minimization algorithm: estimating the 16 653 free parameters of the SUGAR model took just under 1 h on a MacBook Pro with a 2.9 GHz Intel Core i5 processor and 8 GB of RAM, thanks to sparse linear algebra. We note that while the covariance terms were neglected, the uncertainty at each wavelength was taken into account during the training of the Gaussian processes (these are the blocks $\text{cov}\left(\boldsymbol{y}_{\lambda \ i}^{\prime}\right)$). The unknown parameters of the model (the matrix A and the vectors hi) are computed by minimizing Eq. (26) using an expectation and minimization algorithm. This algorithm, described in Appendix C, avoids degeneracies between parameters.

4.3. The SUGAR components

The SUGAR model is described by a set of spectral time series components (M0, α1, α2, α3) at three-day intervals. The SED of a given SN Ia is then obtained by a linear combination of those components employing the (q1, q2, q3) factors describing the supernova. For illustrative purposes, we present the spectral time series in six-day intervals as well as their integration in a synthetic top-hat filter system comprised of five bands with the following wavelength ranges: $\hat{U}$ [3300.0−3978.0] Å; $\hat{B}$ [3978.0−4795.3] Å; $\hat{V}$ [4795.3−5780.6] Å; $\hat{R}$ [5780.6−6968.3] Å; $\hat{I}$ [6968.3−8400.0] Å. The diacritic hat serves as a reminder that these are not standard Johnson-Cousins filters. The effect of the vectors αj on the average spectrum M0, on the average light curves (obtained by integration of M0 in the $\hat{U}$, $\hat{B}$, $\hat{V}$, $\hat{R}$, $\hat{I}$ bands), and on the average colors $\hat{U}-\hat{B}$, $\hat{B}-\hat{V}$, $\hat{V}-\hat{R}$, $\hat{R}-\hat{I}$ is presented in Figs. 10 and 11 for α1, Figs. 12 and 13 for α2, and Figs. 14 and 15 for α3. The upper and lower contours correspond to a variation of ±1σ of the associated qj parameter. The αj(t = 0) components obtained when training the color law with only data at maximum or with the full time series are almost identical: all conclusions from Sect. 3.3.3 remain valid, and we focus here on the temporal behavior.

Fig. 10.

Average spectral time series and the effect of a variation of q1 by ±1σ. Each phase is separated by a constant magnitude offset.

Fig. 11.

Average light curves and color evolution with phase in synthetic bands, and the predicted effect of a variation of q1 by ±1σ.

Fig. 12.

Average spectral time series and the effect of a variation of q2 by ±1σ. Each phase is separated by a constant magnitude offset.

Fig. 13.

Average light curves and color evolution with phase in synthetic bands, and the predicted effect of a variation of q2 by ±1σ.

Fig. 14.

Average spectral time series and the effect of a variation of q3 by ±1σ. Each phase is separated by a constant magnitude offset.

Fig. 15.

Average light curves and color evolution with phase in synthetic bands, and the predicted effect of a variation of q3 by ±1σ.

Of the three factors, q1 has the strongest impact: this can be seen from the time series presented in Fig. 10 and on the broad band light curves presented in Fig. 11. This is associated with a stretch effect, visible in the $\hat{U}$, $\hat{B}$, and $\hat{V}$ bands as an enlargement of the variation band as we move away from maximum. Between −12 days and +30 days, the q1 factor has a strong influence on spectroscopic details. The brightest SNe Ia exhibit shallower troughs in their spectral features: this is especially visible for Si IIλ4131, Si IIλ5972, and Ca II IR, but also valid for most of the lines. This effect is more pronounced for SNe Ia at earlier phases, and fades over time. After +30 days, the factor q1 is less sensitive to localized features, and shows a relative enhancement of the optical $\hat{B}$ to $\hat{R}$ bands with respect to $\hat{U}$ and $\hat{I}$. Interestingly, between +12 days and +24 days, the brightest SNe Ia are bluer than at maximum or at later phases. This is true in absolute value, but also relative to the average SN Ia. This shows that the SNe Ia color space is driven by a variable intrinsic component in addition to reddening by dust. This intrinsic color variation is linked with the position of the second peak in $\hat{I}$, which appears later for brighter SNe Ia.

As at maximum, the effect of the q2 factor over the whole time series (Fig. 12) is mainly localized around specific spectral features, with little impact on broad band light curves (Fig. 13), with the exception of the $\hat{I}$-band. This factor associates higher line velocities around maximum with deeper absorption troughs, strongly visible before maximum and up to +18 days in the Ca II H&K region, and at all phases in the Ca II IR region. The net effect described by q2 is that higher velocities are associated with slightly dimmer SNe Ia. Velocity effects remain observable at later phases, and at all phases after −6 days high-velocity SNe Ia show localized differences in the 4700−5100 Å spectral region. In addition, they are bluer in $\hat{V}$, $\hat{R}$, and $\hat{I}$ after maximum, with a maximal effect on the $\hat{I}$-band at around +12 days. A slight stretch effect is visible in $\hat{U}$ and $\hat{B}$, and also results in a later second maximum in $\hat{I}$ for the SNe Ia that are brightest in $\hat{B}$, as was mentioned for q1. The shape of the light-curve variations in $\hat{I}$ is very different for each factor, indicating that this band can help reconstruct the q2 factor when using photometric data only. Furthermore, while the q2 factor has a small impact on the variations of individual light curves, it has a sizable influence on colors, especially $\hat{R}-\hat{I}$, $\hat{U}-\hat{B}$, and, after +25 days, $\hat{B}-\hat{V}$. In addition to the $\hat{I}$-band, the different color variation pattern in $\hat{B}-\hat{V}$ at late phases offers a way of disentangling the q1 and q2 factors when dealing with photometric data.

As seen in Fig. 14, the influence of q3 is minimal around maximum light, consistent with the results of Sect. 3.3. It grows at other phases and appears as a stretch effect on the light curves presented in Fig. 15. Unlike q1, the stretch described by q3 shows little correlation with the magnitude at maximum. While q3 has a larger impact on the light curves than q2, the reverse is true for individual colors. At late phases, q3 exhibits a brighter-redder correlation that might be employed to distinguish this vector from q1 when using photometric data only. The influence of q3 on spectral structures is mainly visible in the calcium regions, although variations can also be observed elsewhere. Around Ca II IR, between −6 days and maximum, brighter SNe Ia exhibit a deeper absorption trough at high calcium velocity. This high-velocity calcium feature is visible for all q3 values at −12 days, but fades faster for dimmer SNe Ia. Meanwhile, the lower-velocity counterpart is only visible in q3 for the dimmest supernovae at −12 days and appears later for brighter SNe Ia.

5. SUGAR performance

5.1. Fitting spectra with SUGAR

The measurement of spectral indicators is very sensitive to the signal-to-noise ratio of the spectrum, and at high redshift, measuring the indicators in the observer's reference frame is not possible. However, once the model has been trained, the parameters q1, q2, q3, AV, and ΔMgray of a given SN Ia can be estimated directly from its spectral time series by minimizing the following $\chi^2_i$:

$$\chi_i^2 = \left(\mathbf{M}'_i - \mathbf{A}'\,\boldsymbol{x}_i\right)^T \mathbf{W}_{\mathbf{M}'_i} \left(\mathbf{M}'_i - \mathbf{A}'\,\boldsymbol{x}_i\right),\tag{29}$$

where $\mathbf{M}'_i$ is the vector containing the entire spectral time series of SN Ia $i$, ordered in the same way as for the SUGAR training:

$$\mathbf{M}'_i = \left(\boldsymbol{y}_{3340\,\mathrm{\AA}\,i}\ \cdots\ \boldsymbol{y}_{\lambda\,i}\ \cdots\ \boldsymbol{y}_{8580\,\mathrm{\AA}\,i}\right),\tag{30}$$

where $\boldsymbol{y}_{\lambda\,i}$ corresponds to the light curve at wavelength $\lambda$ for the observed phases. The model is projected onto the observation phases using cubic spline interpolation. The matrix $\mathbf{A}'$ is ordered accordingly:

$$\mathbf{A}' = \left(\mathbf{M}'_0,\ 1,\ f(\lambda, R_V),\ \boldsymbol{\alpha}'_1,\ \boldsymbol{\alpha}'_2,\ \boldsymbol{\alpha}'_3\right).\tag{31}$$

The vector xi contains the parameters that describe the SN Ia i according to the SUGAR model:

$$\boldsymbol{x}_i^T = \left(1,\ \Delta M_{\mathrm{gray}\,i},\ A_{V\,i},\ q_{i,1},\ q_{i,2},\ q_{i,3}\right).\tag{32}$$

Finally, $\mathbf{W}_{\mathbf{M}'_i}$ is the weight matrix derived from the errors on $\mathbf{M}'_i$. Estimating $\boldsymbol{x}_i$ amounts to minimizing $\chi^2_i$ with respect to $\boldsymbol{x}_i$, giving the solution:

$$\boldsymbol{x}_i = \left(\mathbf{A}'^{T}\, \mathbf{W}_{\mathbf{M}'_i}\, \mathbf{A}'\right)^{-1} \left(\mathbf{A}'^{T}\, \mathbf{W}_{\mathbf{M}'_i}\, \mathbf{M}'_i\right).\tag{33}$$

Therefore the covariance on the xi is given by:

$$\mathrm{cov}\left(\boldsymbol{x}_i\right) = \left(\mathbf{A}'^{T}\, \mathbf{W}_{\mathbf{M}'_i}\, \mathbf{A}'\right)^{-1}.\tag{34}$$
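Equations (33) and (34) amount to a standard weighted least-squares solve. As an illustration, a minimal sketch in Python (hypothetical function name; the actual SUGAR implementation differs and also handles the phase interpolation of the model):

```python
import numpy as np

def fit_sugar_parameters(M, A, W):
    """Weighted least-squares estimate of the SUGAR parameters of one
    SN Ia, following Eqs. (33)-(34):
        x   = (A^T W A)^-1 (A^T W M)
        cov = (A^T W A)^-1

    M : (n,)    spectral time series (magnitudes)
    A : (n, 6)  design matrix (mean model, 1, color law, alpha'_1..3)
    W : (n, n)  weight matrix from the errors on M
    """
    AtW = A.T @ W
    cov = np.linalg.inv(AtW @ A)   # cov(x_i), Eq. (34)
    x = cov @ (AtW @ M)            # x_i, Eq. (33)
    return x, cov
```

With noise-free simulated data the input parameters are recovered exactly, which provides a quick sanity check of an implementation.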

The term $\boldsymbol{x}_i$ is computed for the 105 SNe Ia that were used to train SUGAR. The values obtained are compared in Fig. 16 with the factors and AV measured at maximum light. The two estimates are highly correlated, but the correlation decreases with increasing factor index: the factor uncertainties grow with index because the noise is greater for higher-order components, and this increase in noise reduces the correlation. The extinction estimated at maximum is likewise quite compatible with its estimate from the full SUGAR fit.

Fig. 16.

Factors estimated directly from the spectral indicators as a function of the factors estimated from the SUGAR model alone, and the AV estimated at maximum as a function of that estimated with SUGAR. The red lines represent the relation x = y.

The term $\boldsymbol{x}_i$ is also computed for the 58 SNe Ia kept for validation. The distribution of the $\boldsymbol{x}_i$ parameters from the validation sample is compared to that from the training sample in Fig. 17. The two samples are compatible, as is confirmed in the following sections.

Fig. 17.

SUGAR parameters fit directly from spectral data for the 105 SNe Ia used to train SUGAR (blue) and for the 58 SNe Ia from the validation sample (red).

5.2. Comparison with SALT2

5.2.1. Difference between SUGAR and SALT2 models

Throughout this article, we use the SALT2 model as a benchmark to evaluate the properties of the SUGAR model, because it has become a reference for cosmological analyses. SALT2 describes the SED of SNe Ia in the following functional form (Guy et al. 2007):

$$F(t,\lambda) = X_0\left[F_0(t,\lambda) + X_1\,F_1(t,\lambda)\right]\exp\left[C\,C_L(\lambda)\right],\tag{35}$$

where the observed flux F depends on the individual SN Ia parameters (X0, X1, C) and on the model average flux F0 (defined up to a multiplicative factor), the model flux variation F1, and an empirical color law CL. In the regime where X1F1 is small with respect to F0, this model, once expressed in magnitudes, can be identified with the SUGAR model: (F0, F1, CL) and (X1, C) take the roles of (M0, α1, f(λ, RV)) and (q1, AV), respectively, while ΔMgray is the equivalent of a linear combination of (log X0, X1, C).
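Taking $-2.5\log_{10}$ of Eq. (35) and expanding to first order in $X_1 F_1/F_0$ makes this identification explicit (a sketch of the correspondence stated above, with the same notation):

```latex
\begin{aligned}
-2.5\log_{10} F(t,\lambda)
  \simeq{}& \underbrace{-2.5\log_{10} F_0(t,\lambda)}_{\sim M_0}
  \;\underbrace{-\,\frac{2.5}{\ln 10}\,
      \frac{F_1(t,\lambda)}{F_0(t,\lambda)}}_{\sim \boldsymbol{\alpha}_1}\,X_1 \\
  & \underbrace{-\,\frac{2.5}{\ln 10}\,C_L(\lambda)}_{\sim f(\lambda,R_V)}\,C
  \;\;\underbrace{-\,2.5\log_{10} X_0}_{\text{enters }\Delta M_{\mathrm{gray}}},
\end{aligned}
```

with $X_1$ and $C$ playing the roles of $q_1$ and $A_V$.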

Although the major differences between the models come from the inclusion of two additional intrinsic components in SUGAR, its color law, and the data on which they were trained, there are many other differing characteristics, which are summarized in Table 1. The inclusion of two additional intrinsic components is enabled by the level of detail available in our spectrophotometric time series. We therefore expect the SUGAR model to provide a more faithful representation of the spectral details than SALT2. Though analogous in spirit, X1 and q1 are derived within quite different paradigms: F1 comes from a PCA performed in flux space, while the αi are trained in magnitude space and driven to reproduce the prominent spectral indicators. A direct generalization of the SALT2 approach with more components is presented in Saunders et al. (2018). While the empirical SALT2 CL function models the average color variation, our color curve was shown to be accurately described by a Cardelli et al. (1989) extinction law. The SALT2 color parameter C thus mixes information about reddening by interstellar media (described by AV) with intrinsic SNe Ia properties that were not directly matched with one of the qi parameters. These differences in the treatment of colors manifest as long-range spectral variations (because the extinction curve is smooth). In regions where the phase coverage is sufficient, the different phase interpolation methods used are not expected to contribute significantly to differences between the models. Otherwise, and for early phases in particular, we expect the differences to be driven by the training samples rather than by the interpolation technique.

Table 1.

Main differences between the SALT2.4 model and the SUGAR model.

The wavelength and phase coverage of SUGAR is restricted compared to SALT2. This is directly related to the training data: SUGAR is trained on low-redshift spectrophotometric time series, while SALT2 is trained on photometric data spanning a wide redshift range, employing a handful of spectra to help refine the SED details. As SALT2 uses low- and high-redshift data, it offers better coverage of the UV domain. It also has larger phase coverage thanks to the rolling cadence strategy, while SNfactory SNe Ia have to be specifically targeted. While SALT2 could easily be retrained with SNfactory data added to its sample, the reverse is not true: major adaptation would be needed for SUGAR to incorporate photometric data in its training. However, this does not prevent fitting photometric light curves with SUGAR in the same way as with SALT2, provided they fall within the phase and spectral coverage of SUGAR. In contrast, SALT2 was not designed to fit spectrophotometric data.

5.2.2. Spectral residuals

The SUGAR model is designed to improve the spectral description of SNe Ia by adding components beyond stretch and color. While an ideal model would perfectly match the data, the level of mismatch can be quantified by the residuals, that is, the difference between the left- and right-hand sides of Eq. (22). The improvement with respect to SALT2 is then studied by comparing the data to the spectral reconstruction predicted by both models. One example of these comparisons is presented in Fig. 18, which shows the spectra and residuals for SN-training-4. For this supernova, SUGAR gives a better description than SALT2. The improvement is especially clear in the UV after +17 days and in the infrared at all phases, particularly in the O Iλ7773 and Ca II IR regions. This is not surprising insofar as SALT2 is essentially trained to reproduce the bluer part of the spectrum. The SUGAR description is also better in the regions around lines, even if it is not totally satisfactory in some areas: Ca II H&K, Si IIλ6355, and the bluer part of the spectrum at late phases. In the B-band, the reconstruction of the spectral details is quite similar between SUGAR and SALT2, although SUGAR seems to follow the spectral details more faithfully, as can be seen in the Mg II area.

Fig. 18.

SUGAR and SALT2 fits of SN-training-4. Left: observed spectral series of SN-training-4 (black), compared to the SALT2 (red) and SUGAR (blue) models. Right: residuals of the SALT2 (red) and SUGAR (blue) models with respect to the observations. The black line represents the zero value of the residuals.

SN-training-4 is only one example among the 105 SNe Ia on which SUGAR was trained. To obtain a statistical description of the precision of the models, we derive the dispersion of the residuals over all SNe Ia as a function of wavelength and phase. In Fig. 19, we present the wRMS of this dispersion: for each wavelength, we compute the wRMS across all phases, weighted by the spectral variance. To show how the accuracy of the model evolves when going from a classic Tripp (1998) relation (when only q1 and AV are included) to the full model, the wRMS for SUGAR is computed three times with the progressive inclusion of q1, q2, and q3. We compute the residual dispersion for both the training and the validation samples. What was observed for SN-training-4 is confirmed in this statistical analysis across all SNe Ia: at any wavelength, SUGAR gives a better description than SALT2. The strongest improvement is in the IR, but it is also significant in the UV. All the spectral details are significantly improved, such as Ca II H&K, Si IIλ6355, O Iλ7773, and Ca II IR. The best performance of the SUGAR model is obtained in the region around 6500 Å, where the dispersion gets as low as 0.05 mag. It should be kept in mind that the spectral dispersion includes noise variance, which may be high in some spectra, and is therefore not representative of the dispersion expected in photometry after integration over broad band filters. The phase evolution of the wRMS is presented in Fig. 20: the dispersion in each phase bin is calculated across all wavelengths. This includes wavelengths for which SALT2 is not adapted, which explains the relatively high values of wRMS for this model compared to SUGAR.
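The per-wavelength and per-phase dispersions of Figs. 19 and 20 are weighted RMS statistics. A minimal sketch, assuming inverse-variance weights (the exact weighting convention is an assumption on our part):

```python
import numpy as np

def wrms(residuals, variances, axis=None):
    """Weighted RMS of model residuals with inverse-variance weights:
    wRMS = sqrt( sum_i r_i^2 / sigma_i^2  /  sum_i 1 / sigma_i^2 )."""
    r = np.asarray(residuals, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    return np.sqrt(np.sum(w * r**2, axis=axis) / np.sum(w, axis=axis))
```

For a residual array of shape (n_phase, n_wavelength), `wrms(res, var, axis=0)` gives a per-wavelength curve as in Fig. 19 and `axis=1` a per-phase curve as in Fig. 20.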

Fig. 19.

wRMS of the residuals as a function of wavelength for the full SUGAR model (blue line), for the SUGAR model corrected only by q1 and AV (blue dash-dotted line), for the SUGAR model corrected only by q1, q2, and AV (blue dashed line), and for the SALT2 model (red line). Top: training data. Bottom: validation data.

Fig. 20.

wRMS of the residuals as a function of phase for the full SUGAR model (blue line), for the SUGAR model corrected only by q1 and AV (blue dash-dotted line), for the SUGAR model corrected only by q1, q2, and AV (blue dashed line), and for the SALT2 model (red line). Top: training data. Bottom: validation data.

The successive inclusion of q1, q2, and q3 can help us understand the origin of SUGAR's improvement over SALT2. As the training samples of the two models differ, one may wonder what SALT2 would have given if trained on SNfactory data. This situation can be studied by including only q1: SUGAR|q1 is slightly worse than SALT2 between the Si IIλ4131 and Si IIλ6355 features, and between Si IIλ6355 and O Iλ7773. These regions cover the wavelength range where SALT2 was trained and optimized, while SUGAR|q1 is only a by-product of the three-component model and is not by itself optimal as a single-component model. Earlier phases are more difficult to reproduce for both models: this could be an effect of higher variability beyond stretch at those phases. At later phases, SUGAR|q1 does not exhibit the same increase in wRMS as SALT2; this is also an effect of the lack of coverage at IR wavelengths in SALT2. One could therefore expect SALT2 to behave at least as well as SUGAR|q1 in the UV and IR when trained on SNfactory data. As expected, the inclusion of additional components improves the accuracy of SUGAR: q2 improves the description of the spectral features with which it is correlated (Ca II H&K, Si IIλ6355, and Ca II IR), while the full SUGAR precision is only reached with the inclusion of q3.

The improvement brought by SUGAR is thus enabled by its spectral coverage and its spectral resolution, which allow the inclusion of the new components: q2, which improves the color description, and q3, which helps reduce the dispersion away from maximum light. Moreover, the improvements added by q2 and q3 are also seen in the validation sample, which confirms that these components are not due to overtraining.

5.2.3. Relation between SALT2 and SUGAR parameters

In Sect. 3.2 we presented the correlations between the SALT2 parameters and the first five factors found using factor analysis. After training the full SUGAR model and reconstructing the parameters of individual supernovae as described in Sect. 5.1, we can study the correlations between the SALT2 parameters (X0, X1, C) and the SUGAR parameters (ΔMgray, q1, q2, q3, AV). A direct comparison of X0 and ΔMgray is irrelevant due to the different prescriptions for lifting the degeneracies of the two models. We recognize ΔMgray as the Hubble diagram residuals, a quantity that we can also compute from the SALT2 parameters (and denote Δμcorr). The results of this comparison between SALT2 and SUGAR parameters are presented in Fig. 21. The lessons from the correlations of the SALT2 parameters with the original factors still hold: X1 is strongly correlated with q1 and q3, while q2 has no significant correlation with any of the SALT2 parameters, except with Δμcorr. The small correlation of q2 and q3 with Δμcorr shows that including these factors can help improve the standardization. The SALT2 color C is strongly correlated with AV, as would be expected since the purpose of both parameters is to take into account the reddening of SNe Ia. Contrary to the results from Sect. 3.2 with the factors alone, C now exhibits a 4.5σ correlation with q3. This correlation might come from the slight fluctuation in color with respect to phase visible in Fig. 15 and from the inclusion of the validation sample. The slight correlation of C and ΔMgray comes from the correlation of ΔMgray and AV: these two quantities were determined separately during the training, without fixing the potential degeneracy between them, which explains why they are correlated. Finally, Δμcorr has a strong correlation with ΔMgray. This indicates that the additional factors included in SUGAR are unable to capture the major source of magnitude variability of SNe Ia, even if they provide a much improved description of the spectral details.
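The significance quoted for such correlations (e.g., the 4.5σ between C and q3) can be estimated from the Pearson coefficient. A sketch using the Fisher z-transform, which is an assumption on our part (the paper does not state its exact estimator):

```python
import numpy as np

def correlation_significance(x, y):
    """Pearson correlation between two parameter sets and an estimate
    of its significance in sigma via the Fisher z-transform:
    z = arctanh(rho) * sqrt(N - 3)  (assumed estimator)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rho = np.corrcoef(x, y)[0, 1]
    sigma = np.arctanh(rho) * np.sqrt(len(x) - 3)
    return rho, sigma
```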

Fig. 21.

SALT2 parameters estimated from photometry as a function of the SUGAR parameters estimated from spectroscopy (for the training and validation samples). The color code indicates the correlation between the SALT2 and SUGAR parameters.

6. Discussion

6.1. Gray dispersion and dispersion matrix fitting

For the model at maximum light, we did not fit a gray term but did include a dispersion matrix while estimating the extinction curve. However, for the full spectral time series, we fit a gray offset and used the extinction curve determined at maximum light. One might wonder why we did not perform a simultaneous fit of the full spectral time series, the gray offset, and the extinction curve. Our decision was motivated by the following considerations.

Performing a simultaneous fit requires a good estimate of the dispersion matrix, which is essential for the estimation of the extinction curve, as discussed in Scolnic et al. (2014). Such a matrix would also have the advantage of taking into account the temporal correlations of the residuals, in addition to their color correlations, which would improve on the model developed at maximum light.

However, from a numerical point of view, this would break the sparse algebra approximations that significantly speed up the training. Indeed, the sparse algebra (due to Eq. (28)) reduces the cost of minimizing Eq. (26) from $\mathcal{O}\left((N_\lambda N_t N_{\mathrm{param}})^3\right)$ to $\mathcal{O}\left(N_\lambda (N_t N_{\mathrm{param}})^3\right)$, where $N_\lambda$ is the number of wavelength bins, $N_t$ the number of time bins, and $N_{\mathrm{param}}$ the number of parameters to fit for a given wavelength and epoch. In the SUGAR training, $N_\lambda \sim 190$, $N_t \sim 30$, and $N_{\mathrm{param}} = 5$ (average spectrum + three factors + extinction curve); forgoing sparse algebra would therefore slow the training algorithm down by a factor of $\mathcal{O}(N_\lambda^2)$.
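The gain can be illustrated with a toy version of this block structure (illustrative sizes and random systems, not the real SUGAR fit): since no term couples different wavelength bins, the normal equations split into $N_\lambda$ independent systems of size $N_t N_{\mathrm{param}}$.

```python
import numpy as np

# Toy illustration of the block-diagonal solve discussed in Sect. 6.1.
n_lambda, n_t, n_param = 190, 30, 5
block = n_t * n_param  # unknowns per wavelength bin (here 150)

rng = np.random.default_rng(1)
# One well-conditioned normal-equation system per wavelength bin
systems = [(rng.normal(size=(block, block)) + block * np.eye(block),
            rng.normal(size=block))
           for _ in range(n_lambda)]

# Sparse (block-diagonal) path: n_lambda solves, each O(block^3)
solutions = [np.linalg.solve(a, b) for a, b in systems]

# The dense alternative would be a single system of
# n_lambda * block = 28500 unknowns, i.e. O((n_lambda * block)^3) work:
# a factor O(n_lambda^2) (~3.6e4 here) more than the block-diagonal path.
```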

Moreover, the estimation of the dispersion matrix would suffer from numerical stability problems due to its size: in the case of a simultaneous fit, it would be ∼20 000 × 20 000. It would be necessary to ensure that the dispersion matrix is positive definite, which would involve a singular value decomposition that would further slow down the training algorithm. In addition, the gray offset is degenerate with the dispersion matrix, because the dispersion matrix can fully capture a gray dispersion; this is why we deliberately did not include it explicitly in the fit of the extinction curve within SUGAR in Sect. 3.3. There is, however, a degeneracy between $A_{\lambda_0}$ and the gray offset, and part of the latter is captured by this parameter. The SUGAR model presented here is already a significant improvement over the SALT2 model and provides insights into understanding SN Ia variability.

6.2. Test of adding an additional component

In Sect. 3.2, we discussed the number of factors needed to describe the final SUGAR SED and concluded that it cannot be determined from the factor description of spectral features at maximum light alone. Indeed, even if the two are strongly related, the main goal is to know the number of components needed to describe the full SED, not the number needed to describe the spectral feature space at maximum light. One way to check whether the choice of three factors is optimal is to retrain the SUGAR model with more than three components and observe how the spectral residuals evolve.

In the following, we ran the training of SUGAR twice, each time adding an additional component; that is, we reproduced Sect. 4.2 with the q1 − q4 and q1 − q5 components. In both cases the value of RV = 2.6 found with three factors is held fixed in order to focus only on the intrinsic parameters. In Fig. 22 we compare the spectral residuals of the SUGAR model to those of SUGAR trained with the additional factors.

Fig. 22.

wRMS of the residuals as a function of wavelength for the full SUGAR model (blue line), for the SUGAR model with the addition of the factor q4 (blue dash-dotted line), and for the SUGAR model with the addition of the factors q4 and q5 (blue dashed line).

As expected, the addition of the two new components from the factor analysis does not improve the description of SNe Ia as significantly as the addition of q2 or q3. The factor q4 slightly improves the description within the Ca II H&K area (0.05 mag) and the Si IIλ6355 area (0.03 mag), but does not improve the SED outside these areas. The factor q5 does not significantly improve the description of the SED. This confirms that our choice of three factors provides a good description of the SED.

7. Conclusion

In this paper we have presented a new spectro-temporal empirical model of SNe Ia, named SUGAR. This model significantly improves the spectral description of SNe Ia compared to the current state of the art by going beyond the classical stretch and color parametrization.

In Sect. 2, we presented the SNfactory spectrophotometric dataset that was used to train the model. In Sect. 3, we presented the intermediate data that were used to train the full SUGAR model. In a first step, we selected a set of 13 spectral indicators near maximum light in B-band, composed of pseudo-equivalent widths and minima of P-Cygni profiles. Those spectral indicators were chosen to describe the intrinsic part of the SUGAR model because they are easy to define and are independent of host-galaxy dust extinction. We then defined a new basis in which the 13 spectral indicators are uncorrelated. For this we developed a factor analysis algorithm that is more robust in the presence of errors than PCA algorithms. Three factors prove sufficient to describe the spectral indicator space at maximum light. The first factor describes the coherent variation of the pseudo-equivalent widths, mainly of the silicon and calcium lines. Like those lines, this factor is strongly correlated with stretch. The second factor is mainly correlated with the velocities and shows a very weak link with the pseudo-equivalent widths, except for Ca II H&K and S II W. The third factor shows a slight correlation with the stretch parameter. Once the factors were defined, we established a model of the SED at maximum light based on these first three factors, using the same underlying method as in Chotard et al. (2011). This allows us to separate the intrinsic behavior from the color variation due to dust. Finally, we find that the color curve obtained is compatible with a Cardelli et al. (1989) extinction curve with an RV of 2.6. We then developed an interpolation method using Gaussian processes in order to train the SUGAR model on a uniform and fixed time grid. The correlation length obtained varies between 5 and 12 days depending on the spectral region, which justifies the use of a 3-day time step for the grid.
The uncertainty from the Gaussian processes is generally underestimated by a factor of 1.3 and more investigation is needed to understand why this occurs.

The training process for the SUGAR model is described in Sect. 4, and an interpretation of each new component is provided. Both factors q1 and q3 resemble a stretch effect, but q3 has less impact around maximum light and in color space than q1. The effects of q2 are strongest in the areas of spectral features and, in broad band photometry, are mainly evident in the infrared.

After calculating the model, we showed that, instead of going through the calculation of spectral indicators, we can work directly with the spectral time series to recover the three factors and the extinction parameter. By studying the model residuals as a function of wavelength, we showed that SUGAR improves the spectral description by 0.1–0.4 mag with respect to SALT2. This holds for both the training and validation datasets, which confirms that there was no overtraining resulting from the addition of q2 and q3. It also shows that three parameters defined at a given phase (i.e., maximum light) have predictive power at other phases. The performance of the SUGAR model makes it an excellent candidate for use with surveys such as ZTF, LSST, or WFIRST, and offers an alternative way of going beyond stretch and color to measure distances with SNe Ia.


3. In general, empirical SN Ia modeling must also handle spectroscopic and photometric data that are not observed on the same day or by the same instruments. However, this is not a problem with the SNfactory dataset.

4. We refer to this as O Iλ7773 but note it overlaps with Mg IIλ7812.

Acknowledgments

We thank the technical staff of the University of Hawaii 2.2-m telescope, and Dan Birchall for observing assistance. We recognize the significant cultural role of Mauna Kea within the indigenous Hawaiian community, and we appreciate the opportunity to conduct observations from this revered site. This work was supported in part by the Director, Office of Science, Office of High Energy Physics of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Support in France was provided by CNRS/IN2P3, CNRS/INSU, and PNC; LPNHE acknowledges support from LABEX ILP, supported by French state funds managed by the ANR within the Investissements d’Avenir programme under reference ANR-11-IDEX-0004-02. NC is grateful to the LABEX Lyon Institute of Origins (ANR-10-LABX-0066) of the Université de Lyon for its financial support within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) of the French government operated by the National Research Agency (ANR). Support in Germany was provided by DFG through TRR33 “The Dark Universe” and by DLR through grants FKZ 50OR1503 and FKZ 50OR1602. In China support was provided by Tsinghua University 985 grant and NSFC grant No. 11173017. Some results were obtained using resources and support from the National Energy Research Scientific Computing Center, supported by the Director, Office of Science, Office of Advanced Scientific Computing Research of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. We thank the Gordon and Betty Moore Foundation for their continuing support. Additional support was provided by NASA under the Astrophysics Data Analysis Program grant 15-ADAP15-0256 (PI: Aldering). We also thank the High Performance Research and Education Network (HPWREN), supported by National Science Foundation Grant Nos. 0087344 and 0426879.
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 759194 – USNAC). PFL acknowledges support from the National Science Foundation grant PHY-1404070. The work of MVP (participation in SUGAR implementation in sncosmo) was supported by Russian Science Foundation grant 18-72-00159. We thank Claire-Alice Hébert for reviewing and giving helpful advice on this paper.

References

  1. Aldering, G., Adam, G., Antilogus, P., et al. 2002, SPIE Conf. Ser., 4836, 61
  2. Aldering, G., Antilogus, P., Bailey, S., et al. 2006, ApJ, 650, 510
  3. Amanullah, R., Johansson, J., Goobar, A., et al. 2015, MNRAS, 453, 3300
  4. Arsenijevic, V., Fabbro, S., Mourão, A. M., & Rica da Silva, A. J. 2008, A&A, 492, 535
  5. Astier, P., Guy, J., Regnault, N., et al. 2006, A&A, 447, 31
  6. Bailey, S., Aldering, G., Antilogus, P., et al. 2009, A&A, 500, L17
  7. Bellm, E. 2014, in The Third Hot-wiring the Transient Universe Workshop, eds. P. R. Wozniak, M. J. Graham, A. A. Mahabal, & R. Seaman, 27
  8. Betoule, M., Kessler, R., Guy, J., et al. 2014, A&A, 568, A32
  9. Blondin, S., & Tonry, J. L. 2007, ApJ, 666, 1024
  10. Blondin, S., Mandel, K. S., & Kirshner, R. P. 2011, A&A, 526, A81
  11. Bongard, S., Soulez, F., Thiébaut, É., & Pecontal, É. 2011, MNRAS, 418, 258
  12. Branch, D., Dang, L. C., Hall, N., et al. 2006, PASP, 118, 560
  13. Buton, C., Copin, Y., Aldering, G., et al. 2013, A&A, 549, A8
  14. Cardelli, J. A., Clayton, G. C., & Mathis, J. S. 1989, ApJ, 345, 245
  15. Chotard, N. 2011, PhD Thesis, Université Claude Bernard - Lyon I
  16. Chotard, N., Gangler, E., Aldering, G., et al. 2011, A&A, 529, L4
  17. Copeland, E. J., Sami, M., & Tsujikawa, S. 2006, Int. J. Mod. Phys. D, 15, 1753
  18. Delubac, T., Bautista, J. E., Busca, N. G., et al. 2015, A&A, 574, A59
  19. Dempster, A. P., Laird, N. M., & Rubin, D. B. 1977, J. R. Stat. Soc. Ser. B (Methodological), 39, 1
  20. Fakhouri, H. K., Boone, K., Aldering, G., et al. 2015, ApJ, 815, 58
  21. Filippenko, A. V., Richmond, M. W., Matheson, T., et al. 1992a, ApJ, 384, L15
  22. Filippenko, A. V., Richmond, M. W., Branch, D., et al. 1992b, AJ, 104, 1543
  23. Folatelli, G. 2004, New Astron. Rev., 48, 623
  24. Foley, R. J., & Kasen, D. 2011, ApJ, 729, 55
  25. Garavini, G., Folatelli, G., Nobili, S., et al. 2007, A&A, 470, 411
  26. Ghahramani, Z., & Hinton, G. E. 1997, The EM Algorithm for Mixtures of Factor Analyzers, Tech. rep.
  27. Guillochon, J., Parrent, J., Kelley, L. Z., & Margutti, R. 2017, ApJ, 835, 64
  28. Guy, J., Astier, P., Baumont, S., et al. 2007, A&A, 466, 11
  29. Guy, J., Sullivan, M., Conley, A., et al. 2010, A&A, 523, A7
  30. Hamuy, M., Phillips, M. M., Maza, J., et al. 1995, AJ, 109, 1
  31. Kelly, P. L., Hicken, M., Burke, D. L., Mandel, K. S., & Kirshner, R. P. 2010, ApJ, 715, 743
  32. Kim, A. G., Thomas, R. C., Aldering, G., et al. 2013, ApJ, 766, 84
  33. Lantz, B., Aldering, G., Antilogus, P., et al. 2004, SPIE Conf. Ser., 5249, 146
  34. Léget, P. F. 2016, PhD Thesis, Université Blaise Pascal
  35. LSST Dark Energy Science Collaboration 2012, ArXiv e-prints [arXiv:1211.0310]
  36. Maguire, K., Sullivan, M., Ellis, R. S., et al. 2012, MNRAS, 426, 2359
  37. Mandel, K. S., Foley, R. J., & Kirshner, R. P. 2014, ApJ, 797, 75
  38. Mandel, K. S., Scolnic, D. M., Shariff, H., Foley, R. J., & Kirshner, R. P. 2017, ApJ, 842, 93
  39. Nordin, J., Östman, L., Goobar, A., et al. 2011, A&A, 526, A119
  40. Nordin, J., Aldering, G., Antilogus, P., et al. 2018, A&A, 614, A71 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  41. Nugent, P., Phillips, M., Baron, E., Branch, D., & Hauschildt, P. 1995, ApJ, 455, L147 [NASA ADS] [CrossRef] [Google Scholar]
  42. Pearson, K. 1901, Phil. Mag., 2, 559 [Google Scholar]
  43. Pereira, R., Thomas, R. C., Aldering, G., et al. 2013, A&A, 554, A27 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  44. Perlmutter, S., Gabi, S., Goldhaber, G., et al. 1997, ApJ, 483, 565 [NASA ADS] [CrossRef] [Google Scholar]
  45. Perlmutter, S., Aldering, G., della Valle, M., et al. 1998, Nature, 391, 51 [CrossRef] [Google Scholar]
  46. Perlmutter, S., Aldering, G., Goldhaber, G., et al. 1999, ApJ, 517, 565 [NASA ADS] [CrossRef] [Google Scholar]
  47. Phillips, M. M. 1993, ApJ, 413, L105 [NASA ADS] [CrossRef] [Google Scholar]
  48. Planck Collaboration XIII. 2016, A&A, 594, A13 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  49. Polin, A., Nugent, P., & Kasen, D. 2019, ApJ, 873, 1 [Google Scholar]
  50. Pskovskii, I. P. 1977, Sov. Astron., 21, 675 [NASA ADS] [Google Scholar]
  51. Pskovskii, Y. P. 1984, Sov. Astron., 28, 658 [NASA ADS] [Google Scholar]
  52. Rasmussen, C. E., & Williams, C. K. 2006, Gaussian Processes for Machine Learning (Cambridge: The MIT Press), 38, 715 [Google Scholar]
  53. Rest, A., Scolnic, D., Foley, R. J., et al. 2014, ApJ, 795, 44 [NASA ADS] [CrossRef] [Google Scholar]
  54. Riess, A., Press, W., & Kirshner, R. 1996, ApJ, 473, 88
  55. Riess, A. G., Filippenko, A. V., Challis, P., et al. 1998, AJ, 116, 1009
  56. Riess, A. G., Macri, L. M., Hoffmann, S. L., et al. 2016, ApJ, 826, 56
  57. Rigault, M., Copin, Y., Aldering, G., et al. 2013, A&A, 560, A66
  58. Rigault, M., Aldering, G., Kowalski, M., et al. 2015, ApJ, 802, 20
  59. Rigault, M., Brinnel, V., Aldering, G., et al. 2018, A&A, submitted
  60. Roman, M., Hardin, D., Betoule, M., et al. 2018, A&A, 615, A68
  61. Rust, B. W. 1974, PhD Thesis, Oak Ridge National Lab., TN
  62. Sasdelli, M., Hillebrandt, W., Aldering, G., et al. 2015, MNRAS, 447, 1247
  63. Saunders, C., Aldering, G., Antilogus, P., et al. 2018, ApJ, 869, 167
  64. Savitzky, A., & Golay, M. J. E. 1964, Anal. Chem., 36, 1627
  65. Scalzo, R. A., Aldering, G., Antilogus, P., et al. 2010, ApJ, 713, 1073
  66. Schmidt, B. P., Suntzeff, N. B., Phillips, M. M., et al. 1998, ApJ, 507, 46
  67. Scolnic, D. M., Riess, A. G., Foley, R. J., et al. 2014, ApJ, 780, 37
  68. Scolnic, D. M., Jones, D. O., Rest, A., et al. 2018, ApJ, 859, 101
  69. Silverman, J. M., Kong, J. J., & Filippenko, A. V. 2012, MNRAS, 425, 1819
  70. Spearman, C. 1904, Am. J. Psychol., 15, 201
  71. Spearman, C. 1927, The Abilities of Man (New York: Macmillan)
  72. Spergel, D., Gehrels, N., Baltay, C., et al. 2015, ArXiv e-prints [arXiv:1503.03757]
  73. Sullivan, M., Conley, A., Howell, D. A., et al. 2010, MNRAS, 406, 782
  74. Suzuki, N., Rubin, D., Lidman, C., et al. 2012, ApJ, 746, 85
  75. Tripp, R. 1998, A&A, 331, 815
  76. Troxel, M. A., MacCrann, N., Zuntz, J., et al. 2018, Phys. Rev. D, 98, 043528
  77. Wang, X., Filippenko, A. V., Ganeshalingam, M., et al. 2009, ApJ, 699, L139

Appendix A: Spectral-indicator measurements

As the spectral indicators are a key ingredient of our statistical analysis, we need a robust and automatic algorithm to derive them from the rest-frame spectra and to estimate the associated statistical and systematic uncertainties. The main issue to be addressed is the automatic detection of feature boundaries – usually local extrema – as they shift both with phase and from one supernova to another. Sometimes the extremum itself cannot be uniquely defined, for example when several local extrema blend at the same position, or when there is no significant extremum at all, as at the blue edge of the O I λ7773 feature. The photon noise also limits the accuracy with which the wavelength of a maximum can be determined, and may induce a systematic shift due to the nonlinear process involved in finding an extremum. In this section, we give a brief overview of the general and automatic method developed and implemented by Chotard (2011) to measure these spectral indicators and evaluate their corresponding uncertainties. This method shares some similarities with previous analyses (Folatelli 2004; Garavini et al. 2007; Nordin et al. 2011; Blondin et al. 2011; Silverman et al. 2012): it is based on extrema searches in a fixed spectral domain after smoothing the data. Improving on previous studies, our smoothing procedure is based on an optimal determination of the regularization parameters and, more notably, on a thorough determination of the uncertainties using a Monte Carlo procedure that takes the variance of our signal as input.

A.1. Method

Most of the spectral indicators presented here are defined by the position in wavelength and flux of at least one local minimum or maximum, that is, peaks or troughs of the SN Ia spectral features. In order to compute a precise estimate of these local extrema, a Savitzky–Golay (Savitzky & Golay 1964, SG) smoothing is applied to the original spectra before the rebinning at 1500 km s−1 described in Sect. 2. Since the optimal SG window depends both on the underlying spectral shape and on the S/N, which varies across wavelength, an independent smoothing is applied to each of the nine spectral zones of interest defined in Table A.1.
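As an illustration of this step, a second-order SG smoothing locating a P-Cygni trough can be sketched as follows; the synthetic spectrum, noise level, and window size are invented for the example and are not the values used in the analysis:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)

# Synthetic "spectral feature": a Gaussian absorption trough plus photon
# noise; all numbers here are illustrative.
wave = np.linspace(5800.0, 6600.0, 400)          # rest-frame wavelength (A)
truth = 1.0 - 0.6 * np.exp(-0.5 * ((wave - 6150.0) / 60.0) ** 2)
noisy = truth + rng.normal(0.0, 0.02, wave.size)

# Second-order Savitzky-Golay smoothing (k = 2 as in the text); the window
# size plays the role of the parameter p optimized in Sect. A.2.
smooth = savgol_filter(noisy, window_length=31, polyorder=2)

# The minimum of the smoothed feature gives the trough position used, for
# instance, to compute the feature velocities.
trough = wave[np.argmin(smooth)]
```

The smoothed curve, not the raw one, is searched for extrema, which makes the located trough far less sensitive to individual noisy pixels.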

Table A.1.

Independently smoothed spectral regions, with minimal and maximal boundaries λmin and λmax (in Å).

Each extremum is then selected on the smoothed spectrum as the local extremum inside a given wavelength range, which can be found in Tables A.2 and A.3 for the equivalent widths and the feature velocities, respectively. These wavelength ranges have been trained on a set of ∼50 SNe Ia within a phase range of ±5 days around B-band maximum light, so that they match the observed diversity of our spectra. The positions in wavelength and flux of these extrema are then used in the spectral-indicator measurements, either directly, for example for the velocities, or indirectly, for instance for the equivalent widths through the definition of their pseudo-continuum.

Table A.2.

Definition of the wavelength regions (in Å) where the extrema are expected to be found.

Table A.3.

Wavelength regions (in Å) used to compute the feature velocities, together with their rest-frame wavelengths, λ0.

A.2. Optimal smoothing

The purpose of smoothing, also known as regularization, is to transform the original noisy data so as to get closer, on average, to the unknown original spectrum, based on some regularity hypothesis. A parameter describing the smoothness of the final function has to be introduced and estimated, either from physical considerations or from the data themselves. Here we follow the latter approach and describe an optimal way of setting the smoothing parameter within a given class of transformations.

A.2.1. Formalism

Smoothing a noisy spectrum Y consists of determining the smoothed spectrum Y′ = f(Y) as a function of the initial flux. Y′ is an estimator of the noise-free and unknown original flux convolved with the instrumental resolution, $\hat{Y}$. It is related to the observed flux by:

$$ \begin{aligned} Y=\hat{Y}+N\ , \end{aligned} $$(A.1)

where N is a realization of the spectral noise vector. The goodness of the smoothing being represented by the error function,

$$ \begin{aligned} {\Vert Y^{\prime} - \hat{Y}\Vert }\,\hat{=}\,(Y^{\prime} -\hat{Y})^T W (Y^{\prime} -\hat{Y})\ , \end{aligned} $$(A.2)

where W is the inverse of the noise covariance matrix, this requires the minimization of a quantity which depends both on the noise properties and on the signal shape. As the true spectrum $\hat{Y}$ is unknown, we need to build an estimator of $\Vert Y^{\prime} - \hat{Y}\Vert$ which is independent of $\hat{Y}$. For a linear regularization, one can write:

$$ \begin{aligned} Y^{\prime} =f(Y)=B_{p}Y\ , \end{aligned} $$(A.3)

where Bp is a smoothing matrix which only depends on the chosen smoothing technique and a smoothing parameter p. Introducing Eqs. (A.1) and (A.3) into (A.2), one can show that:

$$ \begin{aligned} \Vert Y^{\prime} -\hat{Y}\Vert&= (Y^{\prime} -Y+N)^T W (Y^{\prime} -Y+N)\end{aligned} $$(A.4)

$$ \begin{aligned}&= \Vert Y^{\prime} -Y\Vert + 2 N^T W (B_{p} - I )(\hat{Y} + N) + N^T W N. \end{aligned} $$(A.5)

An estimator of the error function, ϵ, that has to be minimized can be constructed by noticing that

$$ \begin{aligned} E \Big [ \Vert Y^{\prime} -\hat{Y}\Vert \Big ] = E \Big [ \Vert Y^{\prime} - Y\Vert \Big ] + 2\, {\mathrm{Tr}}\, (B_{p}) - n, \end{aligned} $$(A.6)

where E is the mathematical expectation value, Tr is the trace operator, and n is the dimension of the vector Y. As we have only one realization of Y, this translates to:

$$ \begin{aligned} \epsilon = \Vert Y^{\prime} -Y\Vert + 2\, \mathrm{Tr} \,(B_{p}) - n, \end{aligned} $$(A.7)

which is the quantity that is minimized with respect to p.

Given a class of linear smoothing methods Bp, we have thus defined a procedure to find the value of p for which the smoothed spectrum best reproduces the original unknown one, given an observed spectrum. This method is not exempt from possible overtraining; however, this is mitigated by restricting the optimization to a single parameter, as was verified with Monte Carlo simulations.

A.2.2. B matrix estimation

Among all possibilities, we chose Savitzky–Golay regularization as it reduces the high-frequency noise while keeping the original shape of the feature. This method relies on fitting a polynomial of order k around each point i within a fixed window of size p (with p > k + 1) centered on the current point, and is designed to preserve the locus of maxima for even values of k. Each observed data point is replaced by its fit value, and the window then moves to the next data point until the spectrum is completely smoothed. As we have fixed the degree of the polynomial to k = 2, the only parameter that has to be optimized is the window size p. This transformation is linear, and the optimal window size p and corresponding smoothing matrix Bp are found by minimizing Eq. (A.7). The value of p is estimated individually for each spectral region of a given spectrum, and the spectral-indicator measurements are performed on the corresponding smoothed spectrum, as indicated above.
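Putting Eqs. (A.3) and (A.7) together, the optimal-window search can be sketched as follows; the white-noise model (W = I/σ²), the synthetic spectrum, and the candidate window grid are assumptions of the example:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)

# Synthetic spectrum with homoscedastic noise; all numbers are illustrative.
wave = np.linspace(5800.0, 6600.0, 400)
truth = 1.0 - 0.6 * np.exp(-0.5 * ((wave - 6150.0) / 60.0) ** 2)
sigma = 0.02
noisy = truth + rng.normal(0.0, sigma, wave.size)
n = wave.size

def smoothing_matrix(n, window, order=2):
    # savgol_filter is linear in the data, so the matrix B_p of Eq. (A.3)
    # can be recovered column by column from unit vectors.
    return np.column_stack([savgol_filter(e, window, order) for e in np.eye(n)])

def risk(window):
    # Eq. (A.7) with W = I / sigma^2: an estimator of the weighted distance
    # between the smoothed spectrum and the unknown true one.
    B = smoothing_matrix(n, window)
    r = B @ noisy - noisy
    return (r @ r) / sigma**2 + 2.0 * np.trace(B) - n

# Optimal window: minimize the risk estimator over a grid of odd windows.
windows = [5, 11, 21, 31, 41, 61, 81]
p_opt = windows[int(np.argmin([risk(w) for w in windows]))]
```

The term 2 Tr(Bp) penalizes small windows (which follow the noise), while the residual term ||Y′ − Y|| penalizes large windows (which bias the feature shape), so the minimum trades the two off automatically.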

A.3. Uncertainties

The uncertainty on a given spectral-indicator measurement arises from the statistical noise of the data and from the induced uncertainties on the parameters of the measurement method. In our approach, these two uncertainties are measured independently, using the smoothed spectrum as a reference: the SG window size is typically large enough that the residual noise can be neglected in the simulations.

A random noise matching the statistical properties of the observed data is then added to generate a mock spectrum, and the spectral indicators are measured on this new spectrum as if it were the observed one. After one thousand such realizations, we derive the statistical fluctuations of the obtained values, which we quote as the statistical uncertainty. This is computed as the standard deviation with respect to the value measured on the real spectrum, thus taking into account a potential bias. This bias typically amounts to 10–20% of the total error budget.

In order to save computation time, the initial SG window size p is determined once and for all on the original spectrum. It is then kept constant when measuring spectral indicators on all the simulated spectra. However, the noise affecting the initial spectrum induces an uncertainty in the smoothing parameter. This has been studied on a reduced set of simulations for which the optimal p was derived for each realization of the noise, and the corresponding uncertainty was found to range from 15% to 20% depending on the spectral zone. We then propagate this uncertainty by computing the spectral-indicator values on the original spectrum for several values of p within this 15−20% range. The standard deviation of the resulting spectral-indicator value distribution gives an estimate of the systematic error introduced by the arbitrariness of the smoothing method, which is found to be ∼20% of the total error on average. These two uncertainties are added in quadrature to derive the final uncertainty for a given spectral indicator. This estimate takes into account the statistical noise of the spectrum, as well as the induced scatter in wavelength and flux of the extrema. Our measurement errors thus include all nonlinear effects due to limited signal to noise and can be trusted for subsequent statistical analysis. More details can be found in Chotard (2011) and Nordin et al. (2011).
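The Monte Carlo scheme described above can be sketched as follows; the toy spectrum, noise level, and window size are invented for the example, with one thousand realizations as in the text:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(1)

# Toy observed spectrum with known white noise (illustrative values).
wave = np.linspace(5800.0, 6600.0, 400)
truth = 1.0 - 0.6 * np.exp(-0.5 * ((wave - 6150.0) / 60.0) ** 2)
sigma = 0.02
obs = truth + rng.normal(0.0, sigma, wave.size)

window = 31                          # SG window p, kept fixed for all mocks
ref = savgol_filter(obs, window, 2)  # smoothed reference spectrum

def trough_wavelength(flux):
    # The "spectral indicator" of this toy example.
    return wave[np.argmin(savgol_filter(flux, window, 2))]

measured = trough_wavelength(obs)

# Mock spectra: reference spectrum + noise with the statistics of the data.
mocks = np.array([trough_wavelength(ref + rng.normal(0.0, sigma, wave.size))
                  for _ in range(1000)])

# Statistical uncertainty: dispersion of the mocks around the measured
# value, so that a potential bias is folded into the quoted error.
stat_err = np.sqrt(np.mean((mocks - measured) ** 2))
```

Because the dispersion is taken around the value measured on the real spectrum rather than around the mock mean, any nonlinear bias of the extremum finder is automatically included in the quoted error, as described above.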

A.4. Performances

A.4.1. Failure rate

When one of the extrema defining a spectral indicator lies on a flat and/or noisy section of the spectrum, its measurement may fail. In that case, the measurement is automatically rejected, and a visual scan using control plots is performed to confirm the actual lack of an extremum. The rejection of bad measurements is performed using a 3σ clipping in the measured-uncertainty space: if the uncertainty of a given measurement is larger than m + 3 × std, where m and std are the average and standard deviation of the uncertainty distribution of this spectral indicator over the whole sample, the corresponding measurement is rejected. Considering all the spectral indicators presented in this paper, measured on the 113 input spectra within a phase range of ±2.5 days around maximum light, the global failure rate of the measurement procedure is less than 10−3 for the automatic selections mentioned above. When a failure does occur, the spectral indicator is set to the sample average value and assigned an infinite error.

A.4.2. Quoted uncertainties

A simple test has been performed to confirm the robustness of the method. In our selected sample, 18 SNe Ia have two or more spectra taken in the same night in a phase interval of ±5 days around maximum light. We then computed the distribution of the pull,

$$ \begin{aligned} \frac{\delta I}{\sigma _{\delta _{I}}} = \frac{I_{1} - I_{2}}{\sqrt{\sigma _{I_{1}}^{2}+\sigma _{I_{2}}^{2}}}, \end{aligned} $$(A.8)

for all the spectral indicators I (feature velocities and absorption ratios) measured on each spectrum of a same night (for a same supernova). This distribution is centered on −0.01 with a dispersion of 1.06, which indicates that our estimation of the uncertainty is valid up to a possible underestimation of the error by 6%. This number is small enough that we can trust our uncertainty estimation for the main analysis.
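The consistency test of Eq. (A.8) is easy to reproduce on simulated duplicates; the sample size matches the 18 SNe of the text, but the indicator values and uncertainties below are invented:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical duplicate measurements: n SNe, each with two same-night
# measurements of the same indicator I and known Gaussian uncertainties.
n = 18
truth = rng.normal(100.0, 10.0, n)
sig1, sig2 = 3.0, 4.0
I1 = truth + rng.normal(0.0, sig1, n)
I2 = truth + rng.normal(0.0, sig2, n)

# Pull of Eq. (A.8): centered on zero with unit dispersion when the quoted
# uncertainties are correct.
pull = (I1 - I2) / np.sqrt(sig1**2 + sig2**2)
pull_mean, pull_disp = pull.mean(), pull.std(ddof=1)
```

A dispersion significantly above one would signal underestimated uncertainties; the 1.06 found in the text corresponds to at most a 6% underestimation.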

Appendix B: Expectation-maximization factor analysis

Dimensionality reduction in the presence of noisy data is an often-overlooked problem. Standard methods like PCA tend to fail at capturing the intrinsic variability of the data, and the principal components align with the direction of the noise when the latter becomes important. Factor analysis, on the other hand, is a statistical method designed to model the covariance structure of high-dimensional data using a small number of latent variables. It estimates both the natural variability of the sample and the noise arising from the measurements, under the assumption that the noise statistics are the same for all data records. Our case is slightly different: on one hand, the noise statistics are different for each measurement, but on the other hand, their variance is already known. We thus adapted the expectation-maximization algorithm presented in Ghahramani & Hinton (1997) to accommodate the specifics of our problem. The resulting method is also known as probabilistic PCA. The formalism is the following: xi is a vector of dimension l representing the measurement i. It is linked to the factor qi, a vector of dimension k ≤ l, and to the noise ηi by:

$$ \begin{aligned} \boldsymbol{x}_i = \boldsymbol{x}^0 + \boldsymbol{\Lambda } \boldsymbol{q}_i + \boldsymbol{\eta }_i, \end{aligned} $$(B.1)

where x0 is a central value which can be neglected in the following without loss of generality (x0 = 0), qi is assumed to follow a normal distribution of unit variance, and ηi follows a multivariate normal distribution of variance Ψi. In our case, Ψi is diagonal, a property that can be used to speed up computations. Λ is the matrix containing the k explicative vectors that we need to determine. While Λ is not uniquely defined, ΛΛ^T is, and it represents the intrinsic covariance of the data, that is, the one we would observe in the absence of noise. The eigenvectors of ΛΛ^T thus correspond to the first k eigenvectors that PCA would have found in the absence of noise.

To find Λ, instead of directly maximizing the likelihood of observing xi, the expectation-maximization algorithm introduces the latent variable qi and then maximizes the expected likelihood over qi. The joint probability of xi and qi is the following multivariate normal distribution:

P ( [ x i q i ] ) = N ( [ 0 0 ] , [ Λ Λ T + Ψ i Λ Λ T I ] ) . $$ \begin{aligned} P\left(\left[ \begin{array}{c} \boldsymbol{x}_i \\ \boldsymbol{q}_i \end{array}\right]\right) = \mathcal{N} \left(\left[ \begin{array}{c} 0 \\ 0 \end{array}\right],\left[ \begin{array}{cc} \boldsymbol{\Lambda } \boldsymbol{\Lambda ^T} + \boldsymbol{\Psi _{i}}&\boldsymbol{\Lambda } \\ \boldsymbol{\Lambda }^T&\boldsymbol{I} \end{array} \right]\right)\ . \end{aligned} $$(B.2)

The block-diagonal elements of the covariance matrix represent, respectively, the covariances of xi and qi, and the non block-diagonal elements represent the covariance arising from the relation (B.1). Expectation-maximization is an iterative procedure which ensures that the likelihood increases at each iteration and it has been shown that the convergence is faster than using a gradient method (Dempster et al. 1977). Each iteration proceeds in two steps. The first step, called the E-step, consists of calculating the expectation of the conditional first and second moments of qi for a given Λ:

E-step:

$$ \begin{aligned}&\boldsymbol{q}_i \,\hat{=}\, E\left[\boldsymbol{q} \,|\,\boldsymbol{x}_i\right] = \boldsymbol{\Lambda }^T(\boldsymbol{\Psi }_{i} + \boldsymbol{\Lambda }\boldsymbol{\Lambda }^T)^{-1} \boldsymbol{x}_i \end{aligned} $$(B.3)

$$ \begin{aligned}&E\left[\boldsymbol{q} \ \boldsymbol{q}^T\,|\,\boldsymbol{x}_i\right] = \boldsymbol{I} - \boldsymbol{\Lambda }^T(\boldsymbol{\Psi }_i + \boldsymbol{\Lambda }\boldsymbol{\Lambda }^T)^{-1} \boldsymbol{\Lambda } + \boldsymbol{q}_i \ \boldsymbol{q}_i^T. \end{aligned} $$(B.4)

Once these two quantities are computed, the second step, or M-step, consists of estimating the Λ matrix by maximizing the likelihood expectation, which provides the condition:

$$ \begin{aligned} \sum \limits _{i=0}^{N} \boldsymbol{\Psi }_i^{-1} \boldsymbol{\Lambda }\; E\left[\boldsymbol{q} \ \boldsymbol{q}^T \,|\, \boldsymbol{x}_i\right] =\sum \limits _{i=0}^{N} \boldsymbol{\Psi }_i^{-1} \boldsymbol{x}_i \boldsymbol{q}_i^T. \end{aligned} $$(B.5)

In the case where Ψi is diagonal, one can independently calculate each row of Λ, denoted Λ^j, according to the relation

M-step:

$$ \begin{aligned} \Lambda ^{j}= \left[\sum \limits _{i=0}^{N} \frac{x_i^j}{\psi _i^{jj}} \ \boldsymbol{q}_i^T\right] \left[\sum \limits _{i=0}^{N} \frac{1}{\psi _i^{jj}} \ E[\boldsymbol{q} \ \boldsymbol{q}^T\,|\,\boldsymbol{x}_i] \right]^{-1}. \end{aligned} $$(B.6)

After the last iteration, the internal degeneracies of the description are lifted by transforming Λ into a matrix Λ′ with orthogonal columns which satisfies the condition:

$$ \begin{aligned} {\boldsymbol{\Lambda ^{\prime }}} {\boldsymbol{\Lambda ^{\prime }}}^{T} = {\boldsymbol{\Lambda }} {\boldsymbol{\Lambda }}^{T}. \end{aligned} $$(B.7)

The columns of Λ′ are aligned with the eigenvectors of ΛΛ^T, and the squares of their norms are the respective eigenvalues.
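The E- and M-steps above can be condensed into a short implementation; this is a didactic sketch of Eqs. (B.3)–(B.7) with known per-measurement noise, not the code used in the paper, and the toy Λ and noise levels in the demo are invented:

```python
import numpy as np

def em_factor_analysis(x, psi, k, n_iter=100):
    """EM for the factor model x_i = Lambda q_i + eta_i with known, per-datum
    diagonal noise variances Psi_i (Eqs. B.3-B.7).
    x: (N, l) centered data; psi: (N, l) noise variances."""
    N, l = x.shape
    # Start from the noise-corrected sample covariance (probabilistic-PCA-like).
    S = x.T @ x / N - np.diag(psi.mean(axis=0))
    w0, v0 = np.linalg.eigh(S)
    Lam = v0[:, -k:] * np.sqrt(np.clip(w0[-k:], 1e-6, None))
    for _ in range(n_iter):
        q = np.empty((N, k))
        Eqq = np.empty((N, k, k))
        for i in range(N):                    # E-step, Eqs. (B.3)-(B.4)
            beta = Lam.T @ np.linalg.inv(np.diag(psi[i]) + Lam @ Lam.T)
            q[i] = beta @ x[i]
            Eqq[i] = np.eye(k) - beta @ Lam + np.outer(q[i], q[i])
        for j in range(l):                    # M-step, Eq. (B.6), row by row
            A = np.einsum('i,imn->mn', 1.0 / psi[:, j], Eqq)
            b = (x[:, j] / psi[:, j]) @ q
            Lam[j] = np.linalg.solve(A, b)    # A is symmetric
    # Eq. (B.7): lift the rotation degeneracy by aligning the columns with
    # the eigenvectors of Lambda Lambda^T.
    w, v = np.linalg.eigh(Lam @ Lam.T)
    return v[:, -1:-k-1:-1] * np.sqrt(np.clip(w[-1:-k-1:-1], 0.0, None))

# Demo on synthetic data with a known Lambda (all numbers illustrative).
rng = np.random.default_rng(1)
Lam_true = np.array([[3.0, 0.0], [2.0, 1.0], [1.0, -1.0],
                     [0.0, 2.0], [-1.0, 1.0], [-2.0, 0.0]])
N = 250
q_true = rng.normal(size=(N, 2))
psi = rng.uniform(0.1, 0.5, (N, 6))            # heteroscedastic, known noise
x = q_true @ Lam_true.T + rng.normal(size=(N, 6)) * np.sqrt(psi)
Lam_est = em_factor_analysis(x, psi, k=2)
```

Because the per-datum Ψi enter the M-step as weights, noisier measurements simply count less, which is exactly what distinguishes this scheme from a plain PCA on the raw data.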

Appendix C: Orthogonal distance regression

C.1. Expectation and maximization steps

Similarly to the factor analysis, hi and A are estimated iteratively. The first step consists of minimizing the χ2 with respect to hi, which amounts to solving the equation:

$$ \begin{aligned} \frac{{\partial \chi ^{2}}}{{\partial \boldsymbol{h}_{i}}}=0, \end{aligned} $$(C.1)

and consequently gives:

E-step:

$$ \begin{aligned} \boldsymbol{h}_i=\left( \mathbf{A} ^T \mathbf{W} _{\mathbf{M} _i}\mathbf{A} + \mathbf{W} _{\boldsymbol{x}_i}\right)^{-1} \left( \mathbf{A} ^T \mathbf{W} _{\mathbf{M} _i}\mathbf{M} _i + \mathbf{W} _{\boldsymbol{x}_i}\boldsymbol{x}_i\right). \end{aligned} $$(C.2)

Once the E-step is performed we can estimate the matrix A by

$$ \begin{aligned} \frac{\partial \chi ^2}{\partial \mathbf{A} }=0, \end{aligned} $$(C.3)

which gives:

M-step:

$$ \begin{aligned} \tilde{\mathbf{A }}=\left[\sum \limits _{i=0}^{N} {\boldsymbol{h}}_i {\boldsymbol{h}}_i^T {\otimes } \mathbf{W} _{\mathbf{M} _i} \right]^{-1}\left[\sum \limits _{i=0}^{N} \mathbf{W} _{\mathbf{M} _i} \mathbf{M} _i {\boldsymbol{h}}_i^T\right], \end{aligned} $$(C.4)

where $\tilde{\mathbf{A}}$ is the vector that contains all columns of A put end to end and ⊗ is the Kronecker product. The estimation of the matrix A and of the orthogonal projections hi is done by iterating the E-step and the M-step until the χ2 converges.
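The alternation of Eqs. (C.2) and (C.4) can be sketched as follows, assuming diagonal weight matrices and toy data; variable names mirror the equations, and none of the numbers come from the actual analysis:

```python
import numpy as np

def odr_em(M, W_M, x, W_x, n_iter=50):
    """Alternate the E-step (C.2) and M-step (C.4) of the orthogonal distance
    regression M_i ~ A h_i, where x_i are noisy measurements of the latent
    h_i.  All weights are assumed diagonal and passed as vectors."""
    N, l = M.shape
    k = x.shape[1]
    h = x.copy()                      # start from the measured factors
    for _ in range(n_iter):
        # M-step, Eq. (C.4): normal equations for the stacked columns of A.
        lhs = np.zeros((l * k, l * k))
        rhs = np.zeros(l * k)
        for i in range(N):
            Wm = np.diag(W_M[i])
            lhs += np.kron(np.outer(h[i], h[i]), Wm)
            rhs += (Wm @ np.outer(M[i], h[i])).ravel(order='F')
        A = np.linalg.solve(lhs, rhs).reshape((l, k), order='F')
        # E-step, Eq. (C.2): orthogonal projection of each object on the model.
        for i in range(N):
            lhs_h = A.T @ (W_M[i][:, None] * A) + np.diag(W_x[i])
            rhs_h = A.T @ (W_M[i] * M[i]) + W_x[i] * x[i]
            h[i] = np.linalg.solve(lhs_h, rhs_h)
    return A, h

# Toy data: magnitudes M generated from a known A and latent h (illustrative).
rng = np.random.default_rng(2)
N, l, k = 200, 5, 2
A_true = rng.normal(size=(l, k))
h_true = rng.normal(size=(N, k))
sM, sx = 0.1, 0.05
M = h_true @ A_true.T + rng.normal(0.0, sM, (N, l))
x = h_true + rng.normal(0.0, sx, (N, k))
A_est, h_est = odr_em(M, np.full((N, l), 1 / sM**2),
                      x, np.full((N, k), 1 / sx**2))
```

Unlike the factor analysis of Appendix B, the measured x_i pin the scale and orientation of the h_i, so no rotation degeneracy is left on A in this toy setting.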

C.2. Fixing the degeneracies

The free parameters of Eqs. (5) and (22) that we find through orthogonal distance regression contain degenerate degrees of freedom. These must be fixed in order to present a unique and interpretable solution.

C.2.1. Degeneracies of the extinction fit

The term describing the extinction, Aλ0, iγ(λ), is not directly constrained by external observations. As a consequence, Eq. (5) is invariant under several transformations. The first one is

$$ \begin{aligned}&A_{\lambda _0,i}^\mathrm{new}=A_{\lambda _0,i}+\sum _j h_i^j \ c^j ,\end{aligned} $$(C.5)

$$ \begin{aligned}&\alpha _{\lambda }^{j \ \mathrm{new}}=\alpha _{\lambda }^{j} - \gamma _{\lambda } \ c^j. \end{aligned} $$(C.6)

We must therefore fix the cj. A natural choice is to impose that the Aλ0, i are decorrelated from the $h_i^j$. Indeed, any correlation would imply that the extinction term contains information linked to the intrinsic properties described by the $h_i^j$. Thus, after each E-step, we imposed a correlation of zero between the Aλ0, i and the $h_i^j$, which amounts to applying the transformations (C.5) and (C.6) taking for the cj:

$$ \begin{aligned} \mathbf{c} = \left[\text{cov}\left(\tilde{\boldsymbol{h}}\right)\right]^{-1} \text{cov}\left(\tilde{\boldsymbol{h}},\mathbf{A} _{\lambda _0}\right), \end{aligned} $$(C.7)

where $\tilde{\boldsymbol{h}}_i$ is the vector that contains the $h_i^j$, $\text{cov}(\tilde{\boldsymbol{h}})$ is the covariance matrix of the $\tilde{\boldsymbol{h}}_i$, and $\text{cov}(\tilde{\boldsymbol{h}},\mathbf{A}_{\lambda_0})$ is the vector that contains the covariances between the $h_i^j$ and the $A_{\lambda_0,i}$. The second degeneracy to be fixed is the scale of γλ: only the product $A_{\lambda_0,i}\gamma_{\lambda}$ plays a role, and we therefore impose after each iteration

$$ \begin{aligned} \gamma _{\lambda _0}=1 \end{aligned} $$(C.8)

by rescaling Aλ0, i and γλ accordingly.

Finally, we need a prescription for the mean value of Aλ0, i: we choose it to be centered at zero. This corresponds to the following transformation

$$ \begin{aligned}&A_{\lambda _0,i}^\mathrm{new}=A_{\lambda _0,i} - \frac{1}{N}\sum _i A_{\lambda _0,i}\ ,\end{aligned} $$(C.9)

$$ \begin{aligned}&M_{\lambda ,0}^\mathrm{new}=M_{\lambda ,0} + \frac{\gamma _{\lambda }}{N}\sum _i A_{\lambda _0,i}, \end{aligned} $$(C.10)

which leaves the χ2 invariant and amounts to placing the average spectrum at the average extinction.
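The decorrelation prescription of Eqs. (C.5)–(C.7) reduces, on a toy data set, to a linear regression of the extinctions on the factors; the sign of the correction below is chosen so that the transformed extinctions are exactly decorrelated, and all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy intrinsic factors h_i (N, k) and extinction values A_{lambda0,i} that
# start out correlated with them; numbers are illustrative only.
N, k = 500, 3
h = rng.normal(size=(N, k))
A_ext = 0.8 * h[:, 0] - 0.3 * h[:, 2] + rng.normal(0.0, 0.5, N)

# Eq. (C.7): c = cov(h)^{-1} cov(h, A_lambda0).
cov_h = np.cov(h, rowvar=False)
cov_hA = np.array([np.cov(h[:, j], A_ext)[0, 1] for j in range(k)])
c = np.linalg.solve(cov_h, cov_hA)

# Remove the part of A_ext explained by the factors, so that the new
# extinctions carry no intrinsic information (cf. Eqs. C.5-C.6, where the
# removed part is absorbed into the alpha vectors).
A_new = A_ext - h @ c
```

Since the correction is a sample-covariance regression, the resulting correlations vanish exactly (up to floating-point precision), not just in expectation.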

C.2.2. Degeneracies of the global fit

The estimation of the grey parameter, ΔMgrey i, is affected by several degeneracies, which can be fixed after the E or the M iterations. We choose to fix them after each E-step. The first one arises from the invariance of Eq. (22) under the transformation:

$$ \begin{aligned}&\Delta M_{\mathrm{grey} \ i}^\mathrm{new}=\Delta M_{\mathrm{grey} \ i}+\sum _j h_i^j \ c^j ,\end{aligned} $$(C.11)

$$ \begin{aligned}&\alpha _{t,\lambda }^{j \ \mathrm{new}}=\alpha _{t,\lambda }^{j} - \ c^j . \end{aligned} $$(C.12)

We must therefore fix the cj. Similarly to the prescription made for the extinction fit, a natural choice is to impose that the ΔMgrey i and the $h_i^j$ are uncorrelated. Indeed, the interpretation is made easier with αj containing all the information driven by $h_i^j$, including its global impact on magnitudes. We thus impose after each E-step that the correlation between the ΔMgrey i and the $h_i^j$ is zero. This is obtained by applying the transformations (C.11) and (C.12) taking for the cj:

$$ \begin{aligned} \mathbf{c} = \left[\text{cov}\left(\tilde{\boldsymbol{h}}\right)\right]^{-1} \text{cov}\left(\tilde{\boldsymbol{h}},\boldsymbol{\Delta }\mathbf{M} _{\rm grey}\right), \end{aligned} $$(C.13)

where $\tilde{\boldsymbol{h}}_i$ is the vector that contains the $h_i^j$, $\text{cov}(\tilde{\boldsymbol{h}})$ is the observed covariance matrix computed on the set of $\tilde{\boldsymbol{h}}_i$ vectors, and $\text{cov}(\tilde{\boldsymbol{h}}, \boldsymbol{\Delta}\mathbf{M}_{\mathrm{grey}})$ is the vector that contains the covariances between the $h_i^j$ and the ΔMgrey i. Finally, by convention, the ΔMgrey i are centered on zero at each step. This is obtained by the following transformation, which leaves the χ2 invariant:

$$ \begin{aligned}&\Delta M_{\mathrm{grey} \ i}^\mathrm{new}=\Delta M_{\mathrm{grey} \ i} - \frac{1}{N}\sum _i \Delta M_{\mathrm{grey} \ i}\ ,\end{aligned} $$(C.14)

$$ \begin{aligned}&M_{t,\lambda ,0}^\mathrm{new}=M_{t,\lambda ,0} + \frac{1}{N}\sum _i \Delta M_{\mathrm{grey} \ i}. \end{aligned} $$(C.15)

Appendix D: Matrix dispersion estimation

Once the minimum of the χ2 defined in Eq. (9) is reached, the dispersion matrix is estimated. For this purpose, we use the same method as in Chotard et al. (2011), which is described in detail in Chotard (2011). The approach is to compute D from the observed dispersion of the residuals, subtracting the average dispersion due to the uncertainties:

$$ \begin{aligned} \mathbf{D} = \frac{1}{N} \sum _i \left(\boldsymbol{\delta } \mathbf{M} _{i} \boldsymbol{\delta } \mathbf{M} _{i}^T - \mathbf{C} _{i}\right), \end{aligned} $$(D.1)

where δMi are the residuals of the model once the minimum of the χ2 is reached, defined for a given wavelength as:

$$ \begin{aligned} \delta M_{\lambda ,i} = M_{\lambda ,i}-M_{\lambda ,0}- \ \sum _j q_i^j \alpha _{\lambda }^{j} - A_{\lambda _0,i} \ \gamma _{\lambda }, \end{aligned} $$(D.2)

and Ci is the covariance matrix that accounts for the total propagation of the residual errors, defined as:

$$ \begin{aligned} \mathbf{C} _{i} = \begin{pmatrix} \ddots&\,&0 \\&\sigma _{\lambda i}^2&\\ 0&\,&\ddots \end{pmatrix} \ + \ \left(\sigma _{\mathrm{cal}}^2+\sigma _{z}^2\right) \begin{pmatrix} 1&\cdots&1 \\ \vdots&\ddots&\vdots \\ 1&\cdots&1 \end{pmatrix} \ + \ \boldsymbol{\alpha } \ \text{cov}\left(\boldsymbol{q}_{i}\right) \boldsymbol{\alpha }^T, \end{aligned} $$(D.3)

where α is the matrix that contains the intrinsic vectors and is defined as:

$$ \begin{aligned} \boldsymbol{\alpha } = \left(\boldsymbol{\alpha }^1,\boldsymbol{\alpha }^2,\boldsymbol{\alpha }^3,\ldots \right). \end{aligned} $$(D.4)

In order to ensure that the matrix D is positive definite, its negative eigenvalues are set to zero. Once the matrix D has been computed, we insert it into Eq. (9) in order to recompute the spectral energy distribution (SED), and we iterate the computation of D and of the SED until the maximum of the restricted likelihood is reached (Guy et al. 2010).
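Equation (D.1), together with the eigenvalue clipping, can be sketched on simulated residuals; the dispersion matrix, number of SNe, and noise levels below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy residuals for N SNe in l wavelength bins: a known "intrinsic"
# dispersion matrix D_true plus per-SN measurement noise C_i (here
# diagonal); all numbers are illustrative.
N, l = 300, 8
D_true = 0.05**2 * np.ones((l, l)) + np.diag(np.full(l, 0.08**2))
L_chol = np.linalg.cholesky(D_true)
sig_meas = rng.uniform(0.02, 0.05, (N, l))
resid = rng.normal(size=(N, l)) @ L_chol.T + rng.normal(size=(N, l)) * sig_meas

# Eq. (D.1): average the outer products of the residuals and subtract the
# known measurement covariances.
D = np.mean([np.outer(r, r) - np.diag(s**2)
             for r, s in zip(resid, sig_meas)], axis=0)

# Keep D positive semi-definite by clipping negative eigenvalues to zero.
w, v = np.linalg.eigh(D)
D = (v * np.clip(w, 0.0, None)) @ v.T
```

Subtracting the C_i makes the estimator unbiased for the intrinsic covariance, but finite statistics can push individual eigenvalues below zero, hence the clipping step before D is reinserted into the fit.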

All Tables

Table 1.

Main differences between the SALT2.4 model and the SUGAR model.

Table A.1.

Independently smoothed spectral regions, with minimal and maximal boundaries λmin and λmax (in Å).

Table A.2.

Definition of the wavelength regions (in Å) where the extrema are expected to be found.

Table A.3.

Wavelength regions (in Å) used to compute the feature velocities, together with their rest-frame wavelengths, λ0.

All Figures

Fig. 1.

Pseudo-equivalent widths of absorption lines and minima of P-Cygni profiles at maximum light that are used in our analysis. They are shown on the spectrum of SN-training-77 at a phase of 1 day after maximum brightness.

Fig. 2.

Relative importance of the eigenvalues associated with the eigenvectors. They are ordered in decreasing order of variance, and the total amounts to 100% once the contribution of the noise (indicated by a red line) is taken into account. We can see that the first three vectors dominate the variability of this space.

Fig. 3.

Pearson correlation coefficients between the spectral indicators and the first five factors. The eccentricities of the ellipses indicate the magnitude of the correlation, and the color the importance of the correlation in units of σ. The vectors one and two correspond respectively to a global correlation of the pseudo-equivalent widths and of the line velocities. Only the first three vectors display correlations with a significance higher than 5σ.

Fig. 4.

Pearson correlation coefficients between the SALT2 parameters and the first five vectors. The eccentricities of the ellipses indicate the magnitude of the correlation, and the color the importance of the correlation in units of σ.

Fig. 5.

Impact of each component on the average spectrum. Top panel, top: average spectrum M0 and the predicted effect of a variation of q1 by ±1σ; bottom: corresponding α1 vector. Middle panel, top: average spectrum M0 and the predicted effect of a variation of q2 by ±1σ; bottom: corresponding α2 vector. Bottom panel, top: average spectrum M0 and the predicted effect of a variation of q3 by ±1σ; bottom: corresponding α3 vector.

Fig. 6.

Extinction curve fit result. Top panel: empirical average extinction curve, γ(λ), compared to the best fit by a Cardelli et al. (1989) law, which gives RV = 2.6. The shaded blue area represents a ±0.5 range on RV, illustrating the typical range of values measured for SNe Ia in the literature. The black dot on this curve indicates the wavelength used in the lower panel. Lower panel: residuals after correction for the intrinsic behavior at 4012 Å as a function of Aλ0. Each dot corresponds to a single SN Ia. The γ4012 Å slope between M and Aλ0 is indicated by the continuous red line, and the dashed line represents the Cardelli et al. (1989) law for RV = 2.6. The shaded blue area again represents a ±0.5 range on RV.

Fig. 7.

Spectra before and after applying the model at maximum light. Top panel: 105 spectra in absolute AB magnitude before correction (upper set of spectra) and after correction for intrinsic properties and color (lower set of spectra). An arbitrary offset has been applied to each set. The color code indicates the value of Aλ0. Lower panel: wRMS of the residuals for the uncorrected (red) and corrected (blue) spectra.

Fig. 8.

Square root of the diagonal of the dispersion matrix D (top) and the correlation matrix corresponding to the dispersion matrix D (bottom). The shading represents the degree of correlation, as given by the Pearson correlation coefficient, ρ.

Fig. 9.

Results of the Gaussian process interpolation. From top to bottom: amplitude of the kernel, correlation length, and standard deviation of the pull distribution as a function of wavelength. The gray areas mark the absorption lines that were used in our original factor analysis.

Fig. 10.

Average spectral time series and the effect of a variation of q1 by ±1σ. Each phase is separated by a constant magnitude offset.

Fig. 11.

Average light curves and color evolution with phase in synthetic bands, and the predicted effect of a variation of q1 by ±1σ.

Fig. 12.

Average spectral time series and the effect of a variation of q2 by ±1σ. Each phase is separated by a constant magnitude offset.

Fig. 13.

Average light curves and color evolution with phase in synthetic bands, and the predicted effect of a variation of q2 by ±1σ.

Fig. 14.

Average spectral time series and the effect of a variation of q3 by ±1σ. Each phase is separated by a constant magnitude offset.

Fig. 15.

Average light curves and color evolution with phase in synthetic bands, and the predicted effect of a variation of q3 by ±1σ.

Fig. 16.

Factors estimated directly from the spectral indicators as a function of the factors estimated only from the SUGAR model, and the AV estimated at maximum as a function of that estimated with SUGAR. The red lines represent the relation y = x.

Fig. 17.

SUGAR parameters fit directly from spectral data for the 105 SNe Ia used to train SUGAR (blue) and for the 58 SNe Ia from the validation sample (red).

Fig. 18.

SUGAR and SALT2 fit of SN-training-4. Left: observed spectral series of SN-training-4 (in black), compared to SALT2 (in red) and SUGAR (blue). Right: residuals of SALT2 (red) and SUGAR (blue) models to the observations. The black line represents the zero value of the residuals.

Fig. 19.

wRMS of the residuals as a function of wavelength for the full SUGAR model (blue line), for the SUGAR model corrected only by q1 and AV (blue dash-dotted line), for the SUGAR model corrected only by q1, q2, and AV (blue dashed line), and for the SALT2 model (red line). Top: training data. Bottom: validation data.

Fig. 20.

wRMS of the residuals as a function of phase for the full SUGAR model (blue line), for the SUGAR model corrected only by q1 and AV (blue dash-dotted line), for the SUGAR model corrected only by q1, q2, and AV (blue dashed line), and for the SALT2 model (red line). Top: training data. Bottom: validation data.

Fig. 21.

SALT2 parameters estimated in photometry as a function of the SUGAR parameters estimated in spectroscopy (on the training and validation samples). The color code indicates the correlation between the SALT2 and SUGAR parameters.

Fig. 22.

wRMS of the residuals as a function of wavelength for the full SUGAR model (blue line), for the SUGAR model with the addition of the factor q4 (blue dash-dotted line), and for the SUGAR model with the addition of the factors q4 and q5 (blue dashed line).
