Open Access
Issue
A&A
Volume 632, December 2019
Article Number A84
Number of page(s) 30
Section Atomic, molecular, and nuclear data
DOI https://doi.org/10.1051/0004-6361/201935085
Published online 05 December 2019

© S. Foschino et al. 2019

Licence Creative CommonsOpen Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

The aromatic infrared bands (AIBs) have held the attention of a large community of astronomers since their discovery in the 1970s (Gillett et al. 1973). At first named unidentified infrared bands (UIBs), they obtained the current denomination of AIBs after Leger & Puget (1984) and Allamandola et al. (1985) proposed that they can be emitted by large carbonaceous molecules, namely polycyclic aromatic hydrocarbons (PAHs). Their limited size (20 to more than 100 carbon atoms) implies that they are easily heated to high temperatures by the absorption of individual UV photons in photo-dissociation regions (PDRs) where the AIBs dominate the mid-IR spectrum. Once the molecule absorbs a UV photon, it can dissociate, be ionized or relax through infrared emission (e.g., Montillaud et al. 2013). The strongest AIBs, at 3.30, 6.20, 7.70, 8.60, 11.30, and 12.70 μm correspond to the cooling of the molecule through vibrational stretching and bending modes of its C–H and C–C bonds. To study this emission the community has counted on several space observatories: the Infrared Astronomical Satellite (IRAS, Neugebauer et al. 1984), the Infrared Space Observatory (ISO, Kessler et al. 1996), the Spitzer Space Telescope (Spitzer, Werner et al. 2004), and AKARI (Murakami et al. 2007). The upcoming James Webb Space Telescope (JWST, Gardner et al. 2006) has inspired a lot of hope as its capabilities will surpass all former space observatories.

One of the main findings of observational studies is the discovery that the AIB spectrum shows significant variability, depending on the astrophysical sources (Peeters et al. 2002). Variations within individual objects have also been observed (e.g., Joblin et al. 1996a; Sloan et al. 1997; Giard et al. 1992; Cesarsky et al. 2000; Rapacioli et al. 2005). These spectral variations can consist of changes in the band positions and widths or in band relative intensities. To analyze and interpret these variations, several approaches have been developed over the years. The first is a decomposition method, which dates back to the 1990s and consists in fitting multiple Lorentzian profiles to the AIBs (Boulanger et al. 1998). This was extended in the Spitzer era by Smith et al. (2007) who proposed a fitting tool, PAHfit, that automatically decomposes the AIB spectrum into several individual bands (Gaussian or Drude profiles) in the Spitzer range (5−35 μm). These fitting methods have been used widely to extract band intensities and relate them to star-formation rates in galaxies (Galliano et al. 2018), for example. They have also been used to study the variations of the AIB spectrum, for instance by Galliano et al. (2008) who were able to show that the 6.2 μm over 11.3 μm band ratio, expected to trace ionization degree of PAHs, correlates with the ionization balance parameter G 0 T gas n e $ \frac{G_0\sqrt{T_{\mathrm{gas}}}}{n_{\mathrm{e}}} $. Peeters et al. (2017) used Gaussian profiles to fit individual bands from spectral maps of NGC 2023 PDRs (north and south) and discussed in detail the possible assignments of these bands.

Another well-known approach for AIB analysis is the classification of Peeters et al. (2002). Using continuum-subtracted ISO-SWS spectra in the 6−9 μm range, the authors defined the A, B, and C classes based on the peak positions of the 6.2, 7.7, and 8.6 μm bands. Some objects show spectra that belong only to one class (e.g., HII regions are always class A), while other objects can show mid-IR spectra belonging to different classes (e.g., post-AGB stars can be found in Class A, B, and C). Hony et al. (2001) proposed a classification for the 10−15 μm range. This empirical method has the advantage of being simple, but is difficult to apply automatically to large samples of observations. In addition, the strategies of Peeters et al. (2002) and Hony et al. (2001) rely on subregions of the spectrum, and do not consider the full AIB spectrum.

A more detailed analysis of the AIB spectrum consists in using IR absorption spectra of PAHs that have been either measured or computed, as available in the NASA Ames PAH IR Spectroscopic Database (PAHdb, Bauschlicher et al. 2010, 2018; Boersma et al. 2014) or the theoretical spectral database of PAHs (Malloci et al. 2007). Using this approach provides further insights into the chemical diversity of the interstellar PAH population (e.g., Hudgins et al. 2005; Boersma et al. 2009; Simon & Joblin 2010; Bauschlicher et al. 2009; Candian & Sarre 2015). The limitation of this approach concerns the challenge of modeling, in a precise way, the emission spectra of PAHs in astrophysical environments (Mulas et al. 2006a; Boersma et al. 2013), especially when band profiles need to be simulated (Pech et al. 2002; Mulas et al. 2006b).

The last type of analysis approach involves blind signal separation (BSS) methods. Rapacioli et al. (2005) analyzed ISOCAM observations of NGC 7023 and ρ-Ophiucus by using singular value decomposition (SVD) coupled with a Monte Carlo search algorithm and a positivity criterion. Three elementary spectra were extracted and attributed to cationic PAHs, neutral PAHs, and carbonaceous very small grains (PAH+, PAH0, and VSGs, respectively). The PAH+ spectrum exhibits mainly the 6.2, 7.7, 8.6, 11.3, and 12.7 μm bands. It is dominated by the 7.7 μm bands (hereafter the 7.7 μm complex). The PAH0 spectrum is similar, but with a 11.3 μm much stronger than that of the PAH+ spectrum. The VSG population shows a spectrum with a strong contribution of the continuum at wavelengths longer than 10 μm. Its 7.7 μm complex is redshifted to 7.8 μm with respect to the first two spectra. The 8.6 μm band is blended with the 7.8 μm band.

Subsequently, Berné et al. (2007) used non-negative matrix factorization (NMF) on the Spitzer-IRS spectro-imagery observation of NGC 7023, ρ-Ophiucus, and Ced 201. They extracted three representative spectra consistent with those obtained by Rapacioli et al. (2005), but over a wider wavelength range and with higher spectral resolution. Rosenberg et al. (2011) conducted a coupled NMF–database fitting analysis of NGC 7023 NW Spitzer IRS-SH spectra, from 10 to 15 μm where they also identified these three populations, and assigned the 11.0, 11.2, and 11.3 μm bands to PAH+, PAH0, and VSGs, respectively. Joblin et al. (2008) averaged the representative spectra extracted by Rapacioli et al. (2005) and Berné et al. (2007) to build template spectra for each of the three mean populations, i.e., PAH0, PAH+, and VSGs. Then they used them to fit the spectra of the HII regions and planetary nebulae (PNe). They found that the three template spectra were not sufficient to fit PNe spectra in which the 7.7 μm complex is redshifted to ∼7.8 μm. The authors then invoked a fourth template spectrum to better fit this spectrum type. Guided by the PAH+ spectrum and calculated IR absorption spectra available in the databases, they constructed the PAHx spectrum and attributed it to large ionized PAHs. Later, these four templates were incorporated into the PAH Toulouse Astronomical Templates (PAHTAT) tool1 (Pilleri et al. 2012), which can be used to fit Spitzer-IRS spectra with these four template spectra. So far, much of the success of the BSS methods in providing a scenario of evolution of the AIB populations with UV processing resides in the study of NGC 7023, a nebula that is sufficiently extended to observe chemical variations even at the spatial scale of a few arcseconds sampled by the previous IR missions.

The JWST will provide hyperspectral images with unprecedented details combining middle spectral (R ∼ 3000) and high spatial (up to 0.1″) resolutions over the full 0.6 to 28 μm wavelength range. The high sensitivity of the instruments will increase the number of observable astrophysical objects (including fainter regions of PDRs toward molecular clouds, the diffuse interstellar medium, and distant galaxies). These improvements will increase the quantity of the collected hyperspectral image information, with respect to previous missions, by several orders of magnitude. Dedicated methods for analyzing AIB emission in these large data sets are thus required.

The present study aims to provide a fast and robust method based on BSS for the analysis of these data sets. In Sect. 2 we propose such a method. In Sect. 3 we present our test of this method on ISO-SWS data and the obtained set of representative spectra. In Sect. 4 we discuss how this method can help interpret the observations in terms of chemical populations. We conclude in Sect. 5.

2. Description of the approach

2.1. Challenges of large JWST data sets

The next generation of space infrared spectroscopic observations will be provided by the JWST, whose launch is planned in 2021. It will deliver mid-IR spectral cubes of unprecedented quality over the 0.6 to 28 μm range once the data from the near infrared spectrometer (NIRSpec) andmid-IR imager (MIRI) instruments are combined.

A summary of its performance is available in Table A.1. Overall, the MIRI and NIRSpec integral field unit (IFU) mode characteristics combine and surpass the advantages of former space observatories as ISO, Spitzer, and AKARI. The achieved medium spectral resolution will be well adapted to the study of the AIBs. MIRI cubes will have different sizes depending on the channel, with typical values between 17 × 19 × 3000 and 26 × 29 × 3000, whereas NIRSpec IFU cubes will typically be 30 × 30 × 12 000. The memory needed for one data cube will reach several hundred megabytes. Efficient algorithms will thus be required to analyze these large data sets.

In this paper we propose an approach based on linear methods allowing a reduction in the dimensionality of data to that of a few representative spectra using a BSS approach. The first step of the method (which is also an improvement over earlier BSS applications to mid-IR spectra) is to extract the AIB emission from the observations (i.e., without continuum emission, and gas lines). This is done with the linear fitting method described in Sect. 2.2. The BSS method used to perform dimensionality reduction on the AIB extracted spectra is described in Sect. 2.3. The initialization process of NMF algorithms is described in Sect. 2.4. The method is summarized in Sect. 2.5.

2.2. AIB extraction

In this section we present a method for extracting the AIB emission from observations using a linear model to maintain a low computational time. We note that the model presented here is not physically motivated, and only extracts the AIB emission and reduces the impact of noise. Thus, the degeneracy (i.e., that many solutions can provide a good fit) induced by the use of a large catalog is of limited concern, as the objective is mainly to have a good fit to extract a clean AIB spectrum and not to use fit parameters for further analysis. The proposed algorithm uses predominantly relatively localized spectral structures like Gaussians to model bands, lines, and plateaus, in particular in the 6−9 μm region (Peeters et al. 2002), whereas much broader components like blackbodies are used for large-scale spectral structures like the continuum.

2.2.1. Model

Figure 1 shows typical ISO-SWS spectra. They contain broad AIBs, gas lines, continuum, and instrumental noise. Our aim was to extract the AIB contribution. We can model the observation vector sobs by smod as

s mod = s AIB + s cont + s gas + s inst , $$ \begin{aligned} s_{\rm mod}= s_{\rm AIB}+ s_{\rm cont} + s_{\rm gas}+s_{\rm inst}, \end{aligned} $$(1)

thumbnail Fig. 1.

Examples of ISO-SWS spectra used in this study. See Table 1 for the complete list of objects.

where sAIB, scont, sgas, and sinst correspond respectively to the model of AIBs, continuum, gas lines, and instrumental noise. Each component (but not the instrumental noise) is composed of a list of spectra. Section 2.2.2 describes how the lists are constructed.

2.2.2. Model components

The objective of the present model is only to represent the data mathematically in a way that allows us to extract easily an estimation of an AIB spectrum from an observation.

The first component of Eq. (1), sAIB, corresponds to the AIB emission expressed as

s AIB = g × A , $$ \begin{aligned} s_{\rm AIB}= {g}\times A, \end{aligned} $$(2)

where g is a vector of coefficients and A a matrix whose lines are Gaussians.

To build matrix A, we first define the intervals of wavelength over which AIBs can exist. For each interval we define a number of Gaussians with a given width σ (Table 3). We note that for a given interval σ can take several discrete values. Each Gaussian position is spread out evenly over the interval of wavelengths.

The continuum emission model, scont, is expressed as

s cont = k × B , $$ \begin{aligned} s_{\rm cont}= k\times B, \end{aligned} $$(3)

where k is a vector of coefficients and B a matrix whose lines are defined as follows. Each continuum Bλ, i is defined by

B λ , i ( λ , T ) = 1 λ B λ ( λ , T ) × exp C ext N H . $$ \begin{aligned} B_{\lambda ,i}(\lambda ,T)= \frac{1}{\lambda }B_{\lambda }(\lambda ,T)\times \exp ^{-C_{\rm ext}N_{\rm H}}. \end{aligned} $$(4)

We use discrete values of T in an interval of temperature between Tmin and Tmax with a temperature step of ΔT. Likewise, several discrete values of NH can be taken into account. The value of Cext is taken from Weingartner & Draine (2001) for the Milky Way extinction curve.

The third component of the model is the gas emission, sgas, described as

s gas = p × L , $$ \begin{aligned} s_{\rm gas}= p\times L, \end{aligned} $$(5)

where p is a vector of coefficients and L a matrix whose lines are Gaussians. The same principle as for the AIB emission is applied. However, the lines are much narrower. Each set of Gaussians is distributed over an interval Δλ around the central wavelength μ of the line. It must be noted that all matrices A, B, and L are pre-computed. Thus, in the fitting optimization only the coefficients in vectors g, k, and p have to be adjusted. This implies that the model is fully linear since Eqs. (1)–(5) only use additive and multiplicative operators.

2.2.3. Optimization

To estimate smod we minimized the L2 norm between sobs and smod:

s ̂ mod = min s mod | | s obs s mod | | 2 . $$ \begin{aligned} \hat{s}_{\rm mod}= \min \limits _{s_{\rm mod}}{|| s_{\rm obs}-s_{\rm mod}||^2.} \end{aligned} $$(6)

To perform this minimization, we used the non- negative least square (nnls) function of the scipy package implemented in Python.

This is a critical aspect of the method. The fully linear model described in Sect. 2.2.2 allows us to use the nnls function, which is much faster than the Levenberg-Marquardt method used in PAHfit. As an illustration, we compared the computational cost between a Python version of PAHfit and our method (also in Python) when applied to a single Spitzer-IRS spectrum of the NGC 7023 NW PDR, from 5 to 14 μm. The fitting part in PAHfit (algorithm setup and the fit itself) runs for 194 s, while our code lasts 0.47 s. Compared to PAHfit, the main advantage of our method is that it provides a good AIB extraction 400 times faster. This suggests that PAHfit in its current form would be hardly applicable on a 0.6−28 μm IFU cube from JWST (containing 900 spectra of 23 k spectral points), while the linear method would still run over a reasonable timescale.

This method can be applied to each of the observed spectra, yielding for each source an estimate of smod (i.e., s ̂ mod $ \hat{s}_{\mathrm{mod}} $). For each model, this allows us to isolate an estimate of sAIB (i.e., s ̂ AIB $ \hat{s}_{\mathrm{AIB}} $) to then construct Xaib, the matrix containing the modeled contribution of AIB for each object. In other words, a line of Xaib is the s ̂ AIB $ \hat{s}_{\mathrm{AIB}} $ vector obtained for a given object.

2.3. NMF blind signal separation method

2.3.1. Choice of BSS method

The BSS methods are based on the assumption that the signal measured (mid-IR spectra in our case) is a mixture of elementary sources (spectra). The goal of these methods is to extract each of these sources from the data. They have been used in several fields, such as acoustics (Abrard & Deville 2005) and IR astronomical spectroscopy (Berné et al. 2007; Rosenberg et al. 2011). A common assumption on the data mixture is the linear instantaneous model, which implies that all the n spectra of the data set are a linear combination of r elementary sources. This is expressed as

X = M × S , $$ \begin{aligned} X = M\times S, \end{aligned} $$(7)

where X is the n × m data matrix whose lines are observed spectra, M the n × r weight matrix, and S the r × m source matrix. There are four main classes of methods used to perform BSS: independent component analysis (ICA), sparse component analysis (SCA), non-negative matrix factorization (NMF), and Bayesian methods. Here we used NMF since it requires that the elementary sources are non-negative and the data satisfy the linear instantaneous mixture with non-negative coefficients. These conditions are verified in principle in the present case except for linearity when extinction becomes important (see Berné et al. 2007). NMF provides an estimation of X by the product of the non-negatives matrices of weights W and of sources H as

X W × H , $$ \begin{aligned} X \approx W\times H, \end{aligned} $$(8)

where W is an estimate of M and H of S. The lines of H contain an approximation of the elementary spectra which we call “representative spectra”. An algorithm for obtaining W and H was introduced by Lee & Seung (1999, 2001) (see details of the algorithm in Berné et al. 2007). Lin (2007) proposed an algorithm for performing NMF that converges faster using the projected gradient method instead of multiplicative updates. We used the latter method, which minimize the Euclidean distance between matrix Xaib and the matrix product W × H, implemented in MATLAB.

2.3.2. Choice of r

The choice of r (i.e., the number of representative spectra) is a crucial and non-trivial problem in BSS. As it is unknown, a criterion is needed to fix it. Several methods were proposed in the literature. Rapacioli et al. (2005) and Berné et al. (2012) used a method based on the standard deviation σr of the reconstruction residual matrix with a varying number r. The number r is increased until σr reaches the noise level of the observed data. To apply this method, a correct measure of the data sample noise level is needed (e.g., an empty pixel).

Other methods are based on the dimension of the vector space spanned by the data set. The method of singular value decomposition (SVD, Luo & Zhang 2000) applied to a set of spectra can help us find the dimension of the vector space in which all spectra can be expressed. Boulais et al. (2015) used this method on astrophysical data. A steep transition in the eigenvalues of the covariance matrix of X indicate the dimension of its vector space.

Finally, Berné et al. (2007) proposed a user-based method to identify r, which consists of increasing r until artifacts or physically non-relevant spectra are extracted.

2.4. Initialization of NMF with MASS

2.4.1. Variability of the NMF solutions

Both the Lee & Seung (2001) and Lin (2007) algorithms use iterative updates of matrices W and H. The most common method used to initialize W and H is to set their values randomly. Earlier studies (Berné et al. 2007; Rosenberg et al. 2011) show that solutions of NMF applied to mid-IR astronomical with random initialization show variability in the results. To overcome this issue, Rosenberg et al. (2011) used a Monte Carlo (MC) analysis to extract an average solution and error bars. This method is costly in terms of time, however, as it multiplies the total time cost by the number of MC-NMF runs. This is an issue because the convergence time increases with the number of points, and JWST data matrices will contain a hundred times more points.

The robustness of NMF representative spectra extraction relies on how these mathematical variabilities can be reduced. In the following we present a method that allows us to obtain a unique solution for a given data set. Using a different data set, i.e., a different X matrix, implies a change in the solution, i.e., a change in W and H matrices.

2.4.2. Initialization of H by MASS

To circumvent the issues associated with random initialization, Boulais (2017) developed an innovative way to initialize W or H using a geometrical method, the Maximum Angle Signal Separation (MASS) algorithm. The MASS algorithm extracts from the data the observed spectra that are the farthest from an angular point of view in the space vector generated by the data set. Practically, MASS computes the scalar product between each pair of spectra of the data set and, in a first step, isolates the pair that minimize it and creates a subspace vector with them. Then each remaining spectrum is projected on this subspace and the one that maximizes the angle between itself and its orthogonal projection on the first subspace is isolated. This spectrum is added to the first two spectra and the operation starts again until the algorithm isolates r spectra. The output is an estimation of the matrix H called Smass based on the observed spectra.

Based on tests performed on synthetic spectra, Boulais (2017) showed that initializing H by Smass significantly reduces the variability of the solution of NMF, although W is still initialized randomly. It was also shown using these synthetic data that this form of initialization provides a solution that is more reliable compared to what can be obtained using a MC approach. For example, when H is initialized by Smass, the normalized root mean square error measured is 7% against 18% with MC-NMF alone.

Overall, since the dispersion of the solution is significantly reduced, only one run of MASS-NMF allows us to obtain a more accurate result than MC-NMF and a significant gain of time. A detailed comparison of the MC method results and initialization with MASS will be provided in a subsequent paper (Boulais et al. 2019), following Boulais (2017). We therefore included this initialization scheme in the algorithm presented in this study.

2.5. Summary of the MASS-NMF method

In summary, the proposed method for extracting representative spectra from the JWST data is the following:

– Spectral resolution and sampling are homogenized over the whole data sample;

– From the results of the fit, a data matrix, Xaib, whose lines are the AIB spectra extracted from observations is created (Sect. 2.2.3);

– The number of representative spectra r is defined by the user (Sect. 2.3.2);

– The MASS algorithm is applied to Xaib, yielding Smass, a first gross estimate of the results (Sect. 2.4);

– NMF, with H initialized with the values of Smass, is applied to Xaib until convergence is reached (Sect. 3).

3. Test on real data

3.1. Data set and test parameters

In this section, we describe the ISO-SWS data on which the present method is tested in Sect. 3.1.1. The model parameters used are described in Sect. 3.1.2. The choice of the number on representative spectra is described in Sect. 3.1.3.

3.1.1. Data

The ISO-SWS observations cover the 2.4−45 μm range, which overlaps that of the JWST-NIRSpec–MIRI combined data (0.6−28 μm, Gardner et al. 2006). We limit our study to the 2.6 − 15 μm range over which the major PAH bands are located. Since there is no spatial information, we gathered the spectra of different astrophysical objects from the Sloan et al. (2003) catalog of highly processed data product spectra. More specifically, we selected spectra showing clear AIB emission and no noticeable extinction (see Sect. 2.3). The compiled list of objects and their types is given in Table 1. In total, n = 31 spectra were used coming from reflection nebulae (RN), PNe, HII, and Compact HII (CHII) regions, as well as one post-AGB star and one starburst galaxy (i.e., Red Rectangle and M 82, respectively). It appears that some spectra have a negative part due to calibration issues. For example, the IRAS 19442+2427 spectrum has negative values over the wavelength region from 2.6 to ∼6.1 μm. This is circumvented by adding a constant to the spectrum corresponding to the absolute value of the minimum intensity of this spectral region. It has no impact on the analysis since the aim of the present paper is to study the AIB emission that will be extracted using a non-negative least-squares algorithm (see Sect. 2).

Table 1.

Sources used in our sample.

The selected observations came from different proposals and involve various observing modes. An ISO-SWS spectrum is made of the combination of several discrete segments. To homogenize the spectra, we selected the spectral resolution of the mode with the lowest spectral resolution present in our data sample, which corresponds to the scan speed 2 of mode 01. Then we applied Gaussian convolution using a kernel with width corresponding to the lowest spectral resolution of the same segment in the observing mode 01 in scan speed 2. Another necessity is to have dimensional consistency (i.e., the same spectral sampling for all spectra). A wavelength vector with step size of Δ λ 10 $ \frac{\Delta\lambda}{10} $ (where Δ λ = λ ref R $ \Delta\lambda=\frac{\lambda_{\mathrm{ref}}}{R} $) was generated, and the data was interpolated onto this new grid.

This yields n = 31 spectra of m = 6799 points with a spectral resolution constant over each segment, as given in Table 2. Figure 1 shows an example of spectra from each type of object.

Table 2.

Wavelength range and lowest resolution in segment are reported for each ISO SWS segment.

3.1.2. AIB extraction model parameters

The parameters of the AIB extraction model discussed in Sect. 2.2 are presented as follows. Table 3 shows the intervals of wavelengths on which we place the catalog of Gaussians modeling the AIBs. No intervals are considered between 3.55 μm and 5.195 μm because of the lack of known AIBs in this region. Table 4 shows the characteristics of the modified blackbody catalog. Table 5 shows the characteristics of the Gaussian catalog corresponding to the gas lines (i.e., the identification of the transitions and the Gaussian parameters). Figure 2 shows two examples of extraction of the AIB emission. All our fits are found to be of good quality.

Table 3.

Gaussian parameters used to fit the AIB spectra.

Table 4.

Temperature parameters of the continuum.

Table 5.

Gaussian parameters used to fit the gas lines.

thumbnail Fig. 2.

Examples of AIB extraction from mid-IR spectra (green line). The observed spectra are shown in blue, the fit in red, and the continuum in gold. The residual of the fit is shown in orange in the box below each panel. The rest of the AIB extractions are available in Appendix B.

3.1.3. Choice of r

Several ways to determine r are discussed in Sect. 2.3.2. The method used by Rapacioli et al. (2005) and Berné et al. (2012) is not usable here due the impossibility of having a correct measure of the noise. The observations were not done with the same instrumental modes, which means the noise values of the spectra are not uniform.

Figure 3 shows the eigenvalues of the covariance matrix obtained by applying SVD to the data sample used in the present study. No sharp step is observed, which precludes the choice of r by this method. The Berné et al. (2007) method is also not successful here, as these artifacts never appear when increasing r. Since all of the methods known to us for identifying r automatically failed, we relied on previous studies. Rapacioli et al. (2005) found r = 4, with three physical spectra and one spectrum dominated by artifacts. Berné et al. (2007) and Rosenberg et al. (2011) both found r = 3. However, the following studies show that an additional spectrum was needed (PAHx, see Sect. 1) in particular when PNe are included (which is the case here). Thus, r = 4 is also likely and is used for the following test.

thumbnail Fig. 3.

Eigenvalues of the covariance matrix of the AIB spectra (Xaib). The dashed line corresponds to r = 4, which is the dimension we selected. See Sect. 2.3.2 for details.

3.2. Results and performance

The output of the present method is described in Sect. 3.2.1. The reconstruction of each spectrum of the data set using the set of repesentative spectra is described in Sect. 3.2.2. In Sect. 3.2.3, we discuss the influence of the flux of each spectrum on the output of the method. Finally, we briefly discuss the time cost of this method in the perspective of its application on JWST data sets.

3.2.1. Representative spectra

Figure 4 displays the four representative spectra obtained by applying the method described in Sect. 3, with r = 4, on the ISO-SWS data set. The dashed lines in this figure show the template spectra from the PAHTAT fitting tool (Pilleri et al. 2012, see Sect. 1). The first conclusion that can be drawn from this figure is that the extracted spectra using MASS-NMF applied to the ISO data are qualitatively in good agreement with those of Pilleri et al. (2012), and in particular we recognize the four chemical populations presented in this earlier study and ascribed to cationic PAHs (PAH+), neutral PAHs (PAH0), evaporating very small grains (eVSGs), and PAHx (i.e., large ionized PAHs). These assignments are discussed in further detail in Sect. 4.1; however, for the sake of simplicity we use this chemical denomination from now on.

thumbnail Fig. 4.

Comparison between PAHTAT templates (dashed lines for PAH+, PAH0, eVSG, and PAHx; Pilleri et al. 2012) and extracted representative spectra in this work. All spectra are homogeneous to W m−2/μm and normalized to their maximum values. The vertical dashed lines correspond to the positions of the major bands of spectrum 2 (3.29, 6.20, 7.60, 8.60, 11.22, and 12.70 μm).

3.2.2. Spectral reconstruction of observations with representative spectra

One way to assess the quality of the decomposition is to reconstruct the observed spectra using the extracted representative spectra described in the previous section. To perform this reconstruction (or fit), we used the algorithm described in Sect. 2.2 (see Eq. (6)). We used the same catalogs of blackbodies and Gaussian functions to model the continuum and gas line emissions; however, the AIBs are now reconstructed using the four representative spectra. Figure 5 shows two examples of spectral reconstruction. The fits for all the objects reported in Table 1 are given in Appendix D, and the proportions of each spectrum for each reconstruction is given in Table 6.

thumbnail Fig. 5.

Two examples of spectral reconstruction of IRAS 17047−5650 (upper panel) and Orion Bar H2S1 (bottom panel). The thin black line shows the continuum, blue and red show respectively the data and the model. The four representative spectra for the eVSG, PAH+, PAH0 and PAHx populations are shown in red, blue, green, and purple for respectively.

Table 6.

Relative contribution of each representative spectrum in the data sample.

The fits are generally found to be of good quality. To estimate this more quantitatively, we compared the variance of the residual to the noise of the observations. A good reconstruction implies that the two quantities are of the same order. However we do not have a reliable measure of the data noise since each spectrum was obtained at a different time and not all of them were recorded with the same setting. A good measure of the noise in a hyperspectral image would be from a pixel with no AIB signal, which we do not have here. In the absence of an empirical noise estimate, we use the uncertainties provided by the pipeline, as described in Sloan et al. (2003). These are instrumental uncertainties related to the measurement method, not a measure of the total noise, and therefore likely an underestimate of the total noise. The ratio of the variance of the residuals of the reconstruction to the mean of the instrumental uncertainties is compared in Fig. 6 for each source. The values are between 0.34 and 2.5, which shows that the variance of the residuals is on the order of the instrumental uncertainty, implying a good quality of the reconstruction.

thumbnail Fig. 6.

Ratio of the variances of the residuals to the uncertainties.

Joblin et al. (2008) used the four PAHTAT template spectra reported in Fig. 4 and two broad features at 8.2 and 12.3 μm to fit the 5.5–14 μm spectra of a set of PNe and CHII regions after smoothing the observations to the spectral resolution of the templates (λλ = 45). All SWS spectra used in this earlier study are also included in our sample. We can thus compare the results obtained in each study from the spectral reconstruction in terms of the contributions derived for each population (see Table A.2 of Joblin et al. 2008 and Table 6) in Fig. C.1. Although differences in detail can be seen (e.g., a larger contribution of eVSGs in the new study compared to the previous one), an overall agreement is found. This demonstrates that the inclusion of higher resolution data in the decomposition method has preserved the general spectral information in terms of the defined PAH0, PAH+, eVSG, and PAHx contributions. However, the representative spectra obtained here contain more spectral information and therefore possibly chemical information, as is discussed in Sect. 4.2.

3.2.3. Importance of flux in the extraction of the PAHx spectrum

The last line of Table 6 shows the sum of the flux emitted by each population computed from the reconstruction of the observed spectra presented in Sect. 3.2.2. The prominent emission of the PAHx population compared to others is noticeable. This is due to the presence in the sample of the Red Rectangle (RR), whose emission accounts for 55% of the total PAHx flux. This source is 12 times stronger in flux than the average of the rest of the sample, and 3 times stronger than the second brightest spectrum (IRAS 07027−7934). This is important to keep in mind since the version of NMF used in this paper aims to reduce the Euclidean distance between the data matrix X and the estimated W and H matrices (see Sect. 2.3). This implies that the algorithm will provide one vector in the solution that is close (in terms of Euclidean distance) to this observed spectrum. This is what happens here in the extraction, where the PAHx spectrum is very similar to the RR spectrum. If a normalization of the observed spectra to an integrated flux density of one is performed (and the RR contribution is thus artificially reduced) before running NMF, then the population identified as PAHx in Fig. 4 (spectrum 4) is not extracted by the algorithm. This underlines the fact that the presence of the RR in our sample is crucial to the extraction of the PAHx spectrum.

3.2.4. Computation time

On the ISO-SWS data set of dimensions (31 × 6799), the method runs in 10 min on a laptop computer. We generated, based on the ISO data and using interpolations, a data set with the dimensions corresponding to an observation of one field of view of NIRSpec and MIRI combined (i.e., 30 × 30 × 23 000) and applied the proposed method to it. We find that the computation time on a server is on the order of one hour.

4. Towards a chemical interpretation of extracted spectra

In this section we briefly discuss the characteristics of each representative spectrum. We compare them to spectra obtained in earlier studies to propose a possible chemical interpretation of these spectra. This is done to illustrate the methodology, thanks to the global similarities we observed in the representative spectra in the studies and to the second-order variations due to the specificity of the data set used. These assignments should not be considered definitive, in particular given the limited size of the sample and the absence of spatial information. Hopefully, the use of JWST data will allow us to achieve a more secure chemical assignment thanks to larger samples and spatially resolved observations. We also discuss how the classes of Peeters et al. (2002) can be connected to these extracted spectra. In Sect. 4.2, we present some considerations on chemical diversity and on the possibilities that a MASS-NMF analysis can open when dealing with the data sets that will be provided by JWST.

4.1. Chemical assignment of extracted spectra and their connection to standard classes

Improvements in the extracted spectra can be clearly seen in Fig. 4 compared to the spectra presented in Pilleri et al. (2012). These concern both the wavelength coverage (i.e., inclusion of the 3 μm region) and the spectroscopic details that are present in the new spectra. Table B.1 lists the positions of the main detected bands, and of some more minor bands where we are confident that they are associated with real bands and not artifacts of the method. Some bands are well known, such as the satellite bands in the 3.4−3.5 μm range. Other bands are identified because they are observed in several representative spectra, at close positions. The positions of the major bands observed in spectrum 2 (3.29, 6.20, 7.61, 8.60, 11.22, and 12.73 μm) are used as arbitrary reference positions and are indicated by vertical dashed lines in Fig. 4. Overall, these results demonstrate that an application of the MASS-NMF method to the JWST data sets is promising, and likely to provide a scheme of analysis providing unprecedented detail. We describe here the spectral characteristics of the representative spectra and compare them to earlier studies to interpret their chemical origin.

Spectrum 1: PAH cations. Spectrum 1 (Fig. 4) is dominated by two bands centered at 6.20 μm and 7.61 μm. The other major bands peak at 8.53, 11.24, and 12.75 μm. Spectrum 1 shares some characteristics with spectrum 2, but has a higher value of the band intensity ratio of the 7.7 μm complex (C–C stretch) to the 11.20 μm band (C–H out-of-plane bend). Another major difference concerns the 3 μm region where spectrum 1 has no significant bands, contrary to spectrum 2. All these characteristics point to PAH cations (PAH+) being the carriers of spectrum 1, in line with experimental and theoretical studies (e.g., DeFrees et al. 1993; Szczepanski & Vala 1993; Hudgins & Allamandola 1999) and the study by Boersma et al. (2013) using the NASA Ames PAH database. We note that the 6.20 μm band is blueshifted compared to the PAHTAT template, 6.20 μm relative to 6.27 μm. However, the position of this peak varies in the various BSS extractions of the PAH+ spectrum; for instance, it is reported at 6.28 μm in Rapacioli et al. (2005) and 6.24 μm in the extraction performed by Berné et al. (2007). However, the shift of 0.08 μm in previous templates relative to spectrum 1 is less than the spectral resolution of the ISOCAM data (∼0.14 μm) that was used in these studies. Our results can also be compared to the study of Rosenberg et al. (2011) on NGC 7023. The authors applied NMF to Spitzer spectra of NGC 7023 obtained over the 10−15 μm range at R ∼ 600. They found a PAH cation spectrum dominated by a band at 11.00 μm, albeit with relatively large uncertainties (see lower left panel of their Fig. 3). A band at 11.0 μm is indeed present in spectrum 1, although the main one is located at 11.24 μm. This indicates that the representative spectra that can be extracted depend on the considered spectral range and objects (see discussion on chemical diversity in Sect. 4.2).

Spectrum 2: PAH neutrals. This spectrum presents clear peaks at 3.29, 6.20, 7.61, 8.59, 11.22, and 12.73 μm. In addition to the 3.29 μm band, there is a weaker satellite band at 3.40 μm over a plateau, which extends from 3.4 to 3.6 μm and comprises very weak bands at 3.43, 3.47, and 3.51 μm. The 11.22 band is much stronger than in spectrum 1, being close to the 7.7 μm complex emission, which points to neutral species (PAH0) being the dominant carriers (Compiègne et al. 2007; Joblin et al. 1996b). This is also consistent with the previous decomposition studies (Rapacioli et al. 2005; Berné et al. 2007; Pilleri et al. 2012). The position of the 11.2 μm C–H band observed here (11.22 μm) is in good agreement with that of the neutral PAH spectrum (11.20 μm) derived by Rosenberg et al. (2011).

Spectrum 3: eVSGs. This spectrum exhibits bands that are redshifted compared to the bands of the PAH+ and PAH0 spectra and are found to be superimposed on a plateau (Fig. 4). Although there are notable differences, in particular in the 7.7 μm complex, we associate this spectrum with the population of eVSGs described in Pilleri et al. (2012). This population spectrum has a dominant contribution in the PNe spectra. We note in addition that the position of the 11.20 μm band is shifted to ∼11.35 μm, which is similar to the value of the position observed for this band in the VSG spectrum extracted by Rosenberg et al. (2011), i.e., ∼11.40 μm (see their Fig. A1).

Spectrum 4: PAHx? Spectrum 4 shares some similarities with the PAH+ spectrum (spectrum 1), but with a significant redshift of the main bands corresponding to the C–C stretch and C–H in-plane bend motions. The shift is the strongest for the band at 7.85 μm, compared with a position at 7.61 μm in the PAH+ and PAH0 spectra. This is the characteristic put forward by Joblin et al. (2008) for the so-called PAHx, which would be large ionized PAHs (cations or anions of more than 100 carbon atoms). We note that the latter authors had constructed the PAHx spectrum, whereas here this spectrum results from the extraction procedure. We found the presence of a strong 3.3 μm band associated with the PAHx component, also slightly redshifted compared to the band of PAH0. However, we should consider that the extraction of the PAHx spectrum is only possible due to the presence of the intense RR spectrum in the data set (see Sect. 3.2.3). Its extraction is promising but still to be confirmed on a sample containing more RR-like objects.

Finally, although the spectrum of every object is a mix of the four representative spectra, we note that the class A objects of Peeters et al. (2002) are dominated by the PAH0 and PAH+ populations, whereas the class B objects are dominated by eVSG and PAHx populations.

4.2. Chemical diversity

The improved spectral resolution and larger spectral range of our representative spectra can provide new information on the emitting populations. Table B.1 reports the positions of the main detected bands and of some minor bands where we are confident that they are associated with real bands and not artifacts of the method. In the following we more specifically focus on the differences that can be noted for a given type of spectral feature in the representative spectra.

We first consider the 3−4 μm range, which is studied here for the first time using BSS methods. Figure 4 shows that all populations except the carriers of spectrum 1 have a contribution in this spectral range. For PAHs, it is known that ionization induces a collapse of the intensity of the C–H stretch (Pauzat et al. 1995). The effect decreases with increasing size. For large species containing 100 carbons or more, the 3.30 μm band has a comparable intensity for neutrals and cations (Bauschlicher et al. 2009). On the other hand, the 3.30 μm band is in general more intense for anions than for neutrals, as shown by density functional theory calculations, which are gathered in the Theoretical PAH database2 and the NASA Ames PAH IR Spectroscopic Database3. These results strengthen the assignment we made for spectra 1, 2, and 4. Spectrum 1 is dominated by PAH cations, whereas spectrum 2 is dominated by neutrals. The presence of a strong 3.30 μm band in spectrum 4 supports the idea that their carriers, some large ionized PAHs, are anions rather than cations, whereas both species were proposed in Joblin et al. (2008). Even so, the new data raise a number of questions. The absence of the 3.30 μm band in spectrum 1 would go against the fact that the carriers are large PAH cations, unless they are not sufficiently excited to emit at 3.30 μm. The fact that spectrum 3 attributed to eVSGs exhibits a 3.30 μm band is also surprising considering that eVSGs are expected to be larger species and to then emit at lower temperatures than PAHs (Rapacioli et al. 2005). It should be kept in mind, however, that the retrieved representative spectra reflect the dominant contributions in the sample of spectra and do not have the ability to separate chemical populations that are present at an intensity that is too low compared to other populations. Spectrum 3 may therefore contain a mix of eVSGs and smaller PAH-like species that could be produced by their evaporation (Pilleri et al. 2015).

In addition to the 3.30 μm band, spectrum 2 also exhibits the 3.40 μm band, as well as minor bands at 3.43, 3.47, and 3.51 μm. All these bands have been seen in several sources (e.g., Pilleri et al. 2015; Sloan et al. 1997; Ohsawa et al. 2016). They have been attributed to aliphatic bonds in methyl side groups attached to PAHs (Joblin et al. 1996a) or in superhydrogenated PAHs (Bernstein et al. 1996). These species are expected to have another characteristic band at ∼6.9 μm (Steglich et al. 2013), which is indeed present in the PAH0 spectrum. There is also a significant band at 6.9 μm in the eVSG spectrum, which might be related to the aliphatic content of this population (Pilleri et al. 2015). For spectra 1, 3, and 4, no band at 3.40 μm is detected, but a broad plateau centered around 3.45 μm is present. Pilleri et al. (2015) detected this plateau in NGC 7023 and concluded from a spatial analysis that it is related to the emission from hot bands and combination bands involving aromatic bonds.

We see in Sect. 3.2.1 that the 6.20 μm band peak position varies significantly, from 6.20 to 6.27 μm, between spectra 1, 2, and 3, 4. In addition, spectrum 4 has a satellite band at 6.00 μm. This feature has been observed in a number of objects and was previously attributed to a C=O stretching mode in oxygenated PAHs (Peeters et al. 2002), C–C mode in complexes of PAHs with Si atoms (Joalland et al. 2009) or C=C mode from an olefinic side group (Hsia et al. 2016). The range of the 7.70 μm complex indicates some diversity between the populations, although the PAH0 and PAH+ spectra appear relatively close with common bands at 7.61, 7.52/7.53, and 7.76/7.77.

There are significant differences in the 11−15 μm range within the PAH spectra. The band at 11.20 μm is the dominant one followed by the 12.70 μm band. However the 11.20 μm band is much more intense in spectrum 2 (PAH0) than in spectrum 1 (PAH+), where it is accompanied by other features at ∼12 μm. This would indicate more chemical diversity in the PAH+ population.

To conclude, we can see that the representative spectra carry some information on the chemical diversity involved in each population. Even so, our sample is limited and rather heterogeneous. We therefore have to be cautious with any detailed band assignment. More robust information will be obtained while analyzing data obtained at high spatial resolution on specific regions in which the chemical evolution of the populations is driven by the local physical conditions such as the UV radiation field. Our four template spectra are consistent with previous studies showing the importance of UV processing. Among the PAH populations, the PAH0 population is the least processed, which is illustrated by the presence of more fragile aliphatic side groups. The PAH+ population is expected to be more processed and ionized. It is possible that the trends observed in the 3.30 and 11−13 μm ranges (see above) reveal some first steps of dehydrogenation, although not much data are available to support this scenario (Pauzat et al. 1995). The PAHx population indicates the presence of large PAH anions. The spectrum of eVSGs is more difficult to comment and this also reflects our poor knowledge on the nature of this population.

5. Conclusions

In this paper, we present a new linear (thus fast) and robust method for the analysis of the AIB spectra in preparation of the large data sets that will be provided by the JWST. After an automatic extraction of the AIBs from each observed spectrum (see Sect. 2.2), a hybrid blind signal separation method (see Sect. 2.3), i.e., NMF initialized using the MASS algorithm (see Sect. 2.4), is applied on the extracted AIBs, and provides a set of representative spectra (here, r = 4). The method reduces the dimensionality of the study of a data set to that of a reduced number of representative spectra, physically and chemically interpretable in terms of populations of PAHs and VSGs. The speed of the method allows us to analyze large sets of data in a limited time compared to non-linear methods or hybrid Monte Carlo–NMF methods. As a test, we used ISO-SWS archival data (see Sect. 3.1.1) which most closely approached the spectral range and resolution of JWST data. Four representative spectra shown in Fig. 4 were extracted, and they demonstrate that the method is applicable. Based on comparison with previous studies, we propose the following interpretation:

  • Three representative spectra were assigned to PAH0, PAH+, and eVSG spectra, in agreement with previous studies.

  • We attributed the fourth spectrum to the PAHx population, initially invoked and constructed by Joblin et al. (2008) and extracted here automatically for the first time. However, this spectrum should be confirmed on a larger sample because of its specific behavior.

  • We found that the spectral classes of Peeters et al. (2002) can be explained by a dominant PAH0 and PAH+ emission for class A, and by eVSG and PAHx emission for class B.

  • The 3 μm range was included for the first time in such a study, and reveals the presence of aliphatics connected to neutral PAHs, and that the anionic character of the PAHx population is supported by this new analysis.

  • Second-order spectral signatures (e.g., small bands) have been identified, showing that more chemical diversity can be investigated thanks to an improved spectral resolution. This list should however be taken with caution, considering the combined small size and heterogeneity of the sample of astronomical spectra used in this study.

The method presented in this study is readily applicable to JWST hyperspectral images that will be provided by MIRI and NIRSpec IFUs, and we expect that it will allow us to refine the scenarios of PAH formation, evolution, and destruction in galaxies.


1

The IDL code PAHTAT is available for download at http://userpages.irap.omp.eu/~cjoblin/PAHTAT

Acknowledgments

We thank the referee for his/her comments which helped to significantly improve the clarity of the manuscript. We want to thank also Paolo Pilleri for his help during the beginning of this work, and for his scientific comments and his technical support. This work was supported by the French Programme Physique et Chimie du Milieu Interstellaire (PCMI) funded by the Conseil National de la Recherche Scientifique (CNRS) and Centre National d’études Spatiales (CNES). The research leading to these results has also received funding from the European Research Council under the European Union Seventh Framework Programme (FP/2007−2013) ERC-2013-SyG, Grant Agreement No. 610256 NANOCOSMOS.

References

  1. Abrard, F., & Deville, Y. 2005, Signal Process., 85, 1389 [CrossRef] [Google Scholar]
  2. Allamandola, L. J., Tielens, A. G. G. M., & Barker, J. R. 1985, ApJ, 290, L25 [NASA ADS] [CrossRef] [Google Scholar]
  3. Bauschlicher, Jr., C. W., Peeters, E., & Allamandola, L. J. 2009, ApJ, 697, 311 [NASA ADS] [CrossRef] [Google Scholar]
  4. Bauschlicher, C. W., Boersma, C., Ricca, A., et al. 2010, ApJS, 189, 341 [NASA ADS] [CrossRef] [Google Scholar]
  5. Bauschlicher, C. W., Ricca, A., Boersma, C., & Allamandola, L. J. 2018, ApJS, 234, 32 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
  6. Berné, O., Joblin, C., Deville, Y., et al. 2012, Proceedings of the Annual Meeting of the French Society of Astronomy and Astrophysics (SF2A), 6 [Google Scholar]
  7. Bernstein, M. P., Sandford, S. A., & Allamandola, L. J. 1996, ApJ, 472, L127 [CrossRef] [Google Scholar]
  8. Berné, O., Joblin, C., Deville, Y., et al. 2007, A&A, 469, 575 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  9. Boersma, C., Mattioda, A. L., Bauschlicher, Jr., C. W., et al. 2009, ApJ, 690, 1208 [NASA ADS] [CrossRef] [Google Scholar]
  10. Boersma, C., Bregman, J. D., & Allamandola, L. J. 2013, ApJ, 769, 117 [NASA ADS] [CrossRef] [Google Scholar]
  11. Boersma, C., Bauschlicher, C. W., Ricca, A., et al. 2014, ApJS, 211, 8 [NASA ADS] [CrossRef] [Google Scholar]
  12. Boulais, A. 2017, PhD, Université de Toulouse, Université Toulouse III – Paul Sabatier [Google Scholar]
  13. Boulais, A., Deville, Y., & Berné, O. 2015, 2015 IEEE International Workshop of Electronics, Control, Measurement, Signals and their Application to Mechatronics (ECMSM), 1 [Google Scholar]
  14. Boulais, A., Deville, Y., & Berné, O. 2019, A&A, submitted [Google Scholar]
  15. Boulanger, F., Boisssel, P., Cesarsky, D., & Ryter, C. 1998, A&A, 339, 194 [NASA ADS] [Google Scholar]
  16. Candian, A., & Sarre, P. J. 2015, MNRAS, 448, 2960 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
  17. Cesarsky, D., Lequeux, J., Ryter, C., & Gérin, M. 2000, A&A, 354, L87 [Google Scholar]
  18. Compiègne, M., Abergel, A., Verstraete, L., et al. 2007, A&A, 471, 205 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  19. DeFrees, D. J., Miller, M. D., Talbi, D., Pauzat, F., & Ellinger, Y. 1993, ApJ, 408, 530 [CrossRef] [PubMed] [Google Scholar]
  20. Galliano, F., Madden, S. C., Tielens, A. G. G. M., Peeters, E., & Jones, A. P. 2008, ApJ, 679, 310 [NASA ADS] [CrossRef] [Google Scholar]
  21. Galliano, F., Galametz, M., & Jones, A. P. 2018, ARA&A, 56, 673 [NASA ADS] [CrossRef] [Google Scholar]
  22. Gardner, J. P., Mather, J. C., Clampin, M., et al. 2006, Space Sci. Revi., 123, 485 [NASA ADS] [CrossRef] [Google Scholar]
  23. Giard, M., Bernard, J. P., & Dennefeld, M. 1992, A&A, 264, 610 [NASA ADS] [Google Scholar]
  24. Gillett, F. C., Forrest, W. J., & Merrill, K. M. 1973, ApJ, 183, 87 [NASA ADS] [CrossRef] [Google Scholar]
  25. Hony, S., Van Kerckhoven, C., Peeters, E., et al. 2001, A&A, 370, 1030 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  26. Hsia, C.-H., Sadjadi, S., Zhang, Y., & Kwok, S. 2016, ApJ, 832, 213 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
  27. Hudgins, D. M., & Allamandola, L. J. 1999, ApJ, 516, L41 [CrossRef] [Google Scholar]
  28. Hudgins, D. M., Bauschlicher, Jr., C. W., & Allamandola, L. J. 2005, ApJ, 632, 316 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  29. Joalland, B., Simon, A., Marsden, C. J., & Joblin, C. 2009, A&A, 494, 969 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  30. Joblin, C., Tielens, A. G. G. M., Allamandola, L. J., & Geballe, T. R. 1996a, ApJ, 458, 610 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
  31. Joblin, C., Tielens, A. G. G. M., Geballe, T. R., & Wooden, D. H. 1996b, ApJ, 460, L119 [NASA ADS] [CrossRef] [Google Scholar]
  32. Joblin, C., Szczerba, R., Berné, O., & Szyszka, C. 2008, A&A, 490, 189 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  33. Kessler, M. F., Steinz, J. A., Anderegg, M. E., et al. 1996, A&A, 315, L27 [NASA ADS] [Google Scholar]
  34. Lee, D. D., & Seung, H. S. 1999, Nature, 401, 788 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
  35. Lee, D. D., & Seung, H. S. 2001, in Advances in Neural Information Processing Systems, eds. T. K. Leen, T. G. Dietterich, & V. Tresp (MIT Press), 13, 556 [Google Scholar]
  36. Leger, A., & Puget, J. L. 1984, A&A, 137, L5 [Google Scholar]
  37. Lin, C.-J. 2007, Neural Comput., 19, 2756 [CrossRef] [Google Scholar]
  38. Luo, J., & Zhang, Z. 2000, WCC 2000 – ICSP 2000. 2000 5th International Conference on Signal Processing Proceedings. 16th World Computer Congress 2000, 1, 223 [CrossRef] [Google Scholar]
  39. Malloci, G., Joblin, C., & Mulas, G. 2007, Chem. Phys., 332, 353 [NASA ADS] [CrossRef] [Google Scholar]
  40. Montillaud, J., Joblin, C., & Toublanc, D. 2013, A&A, 552, A15 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  41. Mulas, G., Malloci, G., Joblin, C., & Toublanc, D. 2006a, A&A, 460, 93 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  42. Mulas, G., Malloci, G., Joblin, C., & Toublanc, D. 2006b, A&A, 456, 161 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  43. Murakami, H., Baba, H., Barthel, P., et al. 2007, PASJ, 59, S369 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
  44. Neugebauer, G., Habing, H. J., van Duinen, R., et al. 1984, ApJ, 278, L1 [Google Scholar]
  45. Ohsawa, R., Onaka, T., Sakon, I., Matsuura, M., & Kaneda, H. 2016, ApJ, 151, 93 [CrossRef] [Google Scholar]
  46. Pauzat, F., Talbi, D., & Ellinger, Y. 1995, A&A, 293, 263 [NASA ADS] [Google Scholar]
  47. Pech, C., Joblin, C., & Boissel, P. 2002, A&A, 388, 639 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  48. Peeters, E., Hony, S., Van Kerckhoven, C., et al. 2002, A&A, 390, 1089 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  49. Peeters, E., Bauschlicher, C. W., Allamandola, L. J., et al. 2017, ApJ, 836, 198 [NASA ADS] [CrossRef] [Google Scholar]
  50. Pilleri, P., Montillaud, J., Berné, O., & Joblin, C. 2012, A&A, 542, A69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  51. Pilleri, P., Joblin, C., Boulanger, F., & Onaka, T. 2015, A&A, 577, A16 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  52. Rapacioli, M., Joblin, C., & Boissel, P. 2005, A&A, 429, 193 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  53. Rosenberg, M. J. F., Berné, O., Boersma, C., Allamandola, L. J., & Tielens, A. G. G. M. 2011, A&A, 532, A128 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  54. Simon, A., & Joblin, C. 2010, ApJ, 712, 69 [CrossRef] [Google Scholar]
  55. Sloan, G. C., Bregman, J. D., Geballe, T. R., Allamandola, L. J., & Woodward, E. 1997, ApJ, 474, 735 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
  56. Sloan, G. C., Kraemer, K. E., Price, S. D., & Shipman, R. F. 2003, ApJS, 147, 379 [NASA ADS] [CrossRef] [Google Scholar]
  57. Smith, J. D. T., Draine, B. T., Dale, D. A., et al. 2007, ApJ, 656, 770 [NASA ADS] [CrossRef] [Google Scholar]
  58. Steglich, M., Jäger, C., Huisken, F., et al. 2013, ApJS, 208, 26 [NASA ADS] [CrossRef] [Google Scholar]
  59. Szczepanski, J., & Vala, M. 1993, ApJ, 414, 646 [NASA ADS] [CrossRef] [Google Scholar]
  60. Verstraete, L., Puget, J. L., Falgarone, E., et al. 1996, A&A, 315, L337 [NASA ADS] [Google Scholar]
  61. Weingartner, J. C., & Draine, B. T. 2001, ApJ, 548, 296 [NASA ADS] [CrossRef] [Google Scholar]
  62. Werner, M. W., Roellig, T. L., Low, F. J., et al. 2004, ApJS, 154, 1 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Summary of MIRI and NIRSpec characteristics

Table A.1 summarizes the principal characteristics of the instruments NIRSpec and MIRI on board the JWST.

Table A.1.

Performance of the JWST.

Appendix B: Summary of the principal bands in population spectra

Table B.1 gives the position and the assignment of the bands extracted from the different representative spectra. They are classified as main peaks, minor bands, and shoulders, which are bands blended with a nearby main band.

Table B.1.

Bands observed in the template spectra with their plausible assignments.

Appendix C: Comparison with Joblin et al. (2008)

We present in this section the figure comparing the results from Joblin et al. (2008) and those from this work for common objects in both studies; see the text in Sect. 3.2.2 for details.

thumbnail Fig. C.1.

Comparison of the relative populations in the astrophysical objects in common between the study by Joblin et al. (2008) (in green) and this paper (in orange). The middle box shows the symbol attribution for each object. The contributions from Joblin et al. (2008) have been normalized to the sum of the PAHs and eVSG contributions.

Appendix D: Extraction of AIB emission from all observed spectra

All figures in this section show the extraction of AIB emission from all the observed spectra using the method described in Sect. 2.2.

thumbnail Fig. D.1.

Fit (red) to the data (blue) to extract the AIB contribution s ̂ AIB $ \hat{s}_{\mathrm{AIB}} $ (green) using the method described in Sect. 2.2. The black line shows the estimated continuum s ̂ cont $ \hat{s}_{\mathrm{cont}} $ and orange the estimated contribution from noise s ̂ inst $ \hat{s}_{\mathrm{inst}} $. The name of the corresponding astrophysical object is given in the blue box inside the plot.

thumbnail Fig. D.1.

continued.

thumbnail Fig. D.1.

continued.

thumbnail Fig. D.1.

continued.

thumbnail Fig. D.1.

continued.

thumbnail Fig. D.1.

continued.

thumbnail Fig. D.1.

continued.

thumbnail Fig. D.1.

continued.

Appendix E: Reconstruction of all the observed spectra

Figures in this section show the reconstruction of each observed spectrum using the four representative spectra (see Eq. (1)). The same color-coding is used for all of them. The order of the spectra is the same as in Appendix D.

thumbnail Fig. E.1.

Reconstruction (in red, thin line) of the observed data (in blue), using the reconstruction method described in Sect. 3.2.2 based on the four representative spectra (Fig. 4). The relative contribution of each of these spectra is indicated. The black line corresponds to the continuum and orange to the residual of the reconstruction. The name of the corresponding astrophysical object is given in the blue box inside the plot.

thumbnail Fig. E.1.

continued.

thumbnail Fig. E.1.

continued.

thumbnail Fig. E.1.

continued.

thumbnail Fig. E.1.

continued.

thumbnail Fig. E.1.

continued.

thumbnail Fig. E.1.

continued.

thumbnail Fig. E.1.

continued.

All Tables

Table 1.

Sources used in our sample.

Table 2.

Wavelength range and lowest resolution in segment are reported for each ISO SWS segment.

Table 3.

Gaussian parameters used to fit the AIB spectra.

Table 4.

Temperature parameters of the continuum.

Table 5.

Gaussian parameters used to fit the gas lines.

Table 6.

Relative contribution of each representative spectrum in the data sample.

Table A.1.

Performance of the JWST.

Table B.1.

Bands observed in the template spectra with their plausible assignments.

All Figures

thumbnail Fig. 1.

Examples of ISO-SWS spectra used in this study. See Table 1 for the complete list of objects.

In the text
thumbnail Fig. 2.

Examples of AIB extraction from mid-IR spectra (green line). The observed spectra are shown in blue, the fit in red, and the continuum in gold. The residual of the fit is shown in orange in the box below each panel. The rest of the AIB extractions are available in Appendix B.

In the text
thumbnail Fig. 3.

Eigenvalues of the covariance matrix of the AIB spectra (Xaib). The dashed line corresponds to r = 4, which is the dimension we selected. See Sect. 2.3.2 for details.

In the text
thumbnail Fig. 4.

Comparison between PAHTAT templates (dashed lines for PAH+, PAH0, eVSG, and PAHx; Pilleri et al. 2012) and extracted representative spectra in this work. All spectra are homogeneous to W m−2/μm and normalized to their maximum values. The vertical dashed lines correspond to the positions of the major bands of spectrum 2 (3.29, 6.20, 7.60, 8.60, 11.22, and 12.70 μm).

In the text
thumbnail Fig. 5.

Two examples of spectral reconstruction of IRAS 17047−5650 (upper panel) and Orion Bar H2S1 (bottom panel). The thin black line shows the continuum, blue and red show respectively the data and the model. The four representative spectra for the eVSG, PAH+, PAH0 and PAHx populations are shown in red, blue, green, and purple for respectively.

In the text
thumbnail Fig. 6.

Ratio of the variances of the residuals to the uncertainties.

In the text
thumbnail Fig. C.1.

Comparison of the relative populations in the astrophysical objects in common between the study by Joblin et al. (2008) (in green) and this paper (in orange). The middle box shows the symbol attribution for each object. The contributions from Joblin et al. (2008) have been normalized to the sum of the PAHs and eVSG contributions.

In the text
thumbnail Fig. D.1.

Fit (red) to the data (blue) to extract the AIB contribution s ̂ AIB $ \hat{s}_{\mathrm{AIB}} $ (green) using the method described in Sect. 2.2. The black line shows the estimated continuum s ̂ cont $ \hat{s}_{\mathrm{cont}} $ and orange the estimated contribution from noise s ̂ inst $ \hat{s}_{\mathrm{inst}} $. The name of the corresponding astrophysical object is given in the blue box inside the plot.

In the text
thumbnail Fig. E.1.

Reconstruction (in red, thin line) of the observed data (in blue), using the reconstruction method described in Sect. 3.2.2 based on the four representative spectra (Fig. 4). The relative contribution of each of these spectra is indicated. The black line corresponds to the continuum and orange to the residual of the reconstruction. The name of the corresponding astrophysical object is given in the blue box inside the plot.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.