Volume 566, June 2014
Article Number L4
Number of page(s) 7
Section Letters
Published online 19 June 2014

Online material

Appendix A: Database biases

The content of the computed part of the NASA Ames PAH IR Spectroscopic Database has some intrinsic biases (Bauschlicher et al. 2010; Boersma et al. 2014). These biases originate historically from limited computational power, which means that smaller PAH species (NC < 20) are more easily calculated, and from a focus on astrophysically relevant species, i.e., pure, cata-condensed, neutral, and singly positively ionized PAH species (Tielens 2008). The bias towards small PAHs in the database is partly offset by the physics of the PAH emission process and the wavelength range considered here (5–15 μm). Small PAHs get significantly hotter than their larger counterparts upon absorbing the same photon energy, which pushes more of their emission blueward of this wavelength range. Similarly, large PAHs (NC > 80) stay significantly cooler, so more of their emission is pushed redward of this range. From a stability standpoint, the family of cata-condensed, compact PAHs is very stable and hence more likely to survive the rigors of interstellar space (Allamandola et al. 1985). These species are well represented in the database. The database does undersample dehydrogenated PAHs, both in the levels of dehydrogenation and in the possible permutations. However, it has been shown that the removal of only one or two hydrogens from cata-condensed PAHs does not alter the spectrum much and that fully dehydrogenated species in space are probably rare (Bauschlicher & Ricca 2013). Variations in peripheral hydrogen adjacencies are reflected by variations in the 10–15 μm region of the PAH spectrum (Hony et al. 2001). As PAHs become increasingly large while remaining compact, they obtain more straight edges. This is reflected in their spectra by a strong 11.2 μm feature. Adding more irregular PAHs to the database can alleviate some of its current bias towards compact, straight-edged PAHs.
Of all studied heteroatom substitutions, nitrogen is the most viable candidate 1) because its inclusion does not affect PAH stability; 2) because nitrogen is abundant in the circumstellar shells around carbon-rich AGB stars (Allamandola et al. 1985, 1989; Frenklach & Feigelson 1989); 3) because of the place where PAHs are thought to be formed (Boersma et al. 2006, and references therein); and 4) because of the known presence of nitrogen-containing PAHs in meteorites (Hayatsu et al. 1977). Other substitutions either have little effect (e.g., silicon, magnesium) or significantly disrupt the aromatic network and therefore reduce the stability of the PAH (e.g., oxygen; Hudgins et al. 2005). Singly charged PAH anions are well represented in the database for the larger PAH species. Considering detailed charge balance, doubly charged PAH cations only become important in the more extreme astrophysical environments, and higher ionization states can safely be ignored. These and other considerations regarding database biases and their astrophysical relevance have also been discussed in Boersma et al. (2013). The mixed spectra we construct from the database are affected by these biases. The region of the spectrum most affected is the 10–15 μm range, because of the underrepresentation of irregular PAHs. This could also explain the disparity between the observations and the database mixtures in this region, e.g., the weak 12.7 μm feature.

Fig. A.1

Structure and DFT-computed 5–15 μm vibrational infrared spectra of a selection of PAHs. PAHs are a class of carbonaceous molecules that form a skeleton in which carbon atoms are arranged in a honeycomb structure, with hydrogen atoms sitting on the periphery. Additional atoms, such as nitrogen, can also be present in the skeleton. For each species, the chemical formula, simple name (if it exists), and the corresponding mid-IR spectrum calculated by DFT at 0 K are given. All data are taken from the NASA Ames PAH IR Spectroscopic Database (Bauschlicher et al. 2010; Boersma et al. 2014).

Appendix B: Correlation between the 1000 mixtures

Figure B.1 shows the probability density function (PDF) of the correlation matrix of the 1000 mixtures (Fig. 2, left panel). The peak, average, and median correlation coefficients are shown in blue, red, and green, respectively. The 1-sigma variation around the mean is shown in yellow. The peak of the PDF (the most likely correlation) is found at 0.96 and the standard deviation is 0.023, i.e., 85% of the correlations fall between 0.94 and 0.98. Only 4% of the correlations are below 0.9. The distribution is sharp and narrow, showing without a doubt that random PAH mixtures are indeed very alike.
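
The statistics quoted above can be illustrated with a minimal Python sketch. It assumes NumPy and uses a synthetic stand-in for the database spectra, so the numbers it produces are not those of the Letter. The function names `mixture_correlations` and `summarize`, the histogram binning, and the synthetic spectra are all hypothetical choices made for illustration.

```python
import numpy as np

def mixture_correlations(spectra, n_mixtures=1000, seed=0):
    """Correlation coefficients between all pairs of random mixtures.

    spectra: (r, n) array with one spectrum per row.
    """
    rng = np.random.default_rng(seed)
    # Random abundance between 0 and 1 for every species in every mixture.
    abundances = rng.uniform(0.0, 1.0, size=(n_mixtures, spectra.shape[0]))
    mixtures = abundances @ spectra             # (n_mixtures, n)
    corr = np.corrcoef(mixtures)                # (n_mixtures, n_mixtures)
    upper = np.triu_indices(n_mixtures, k=1)    # each pair once, no diagonal
    return corr[upper]

def summarize(coeffs, bins=50):
    """Peak, mean, median, and standard deviation of the coefficients."""
    density, edges = np.histogram(coeffs, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return {
        "peak": centers[np.argmax(density)],
        "mean": coeffs.mean(),
        "median": np.median(coeffs),
        "std": coeffs.std(),
    }

# ASSUMPTION: 548 synthetic non-negative "spectra" replace the real database.
rng = np.random.default_rng(1)
fake_spectra = rng.random((548, 200)) ** 4
coeffs = mixture_correlations(fake_spectra, n_mixtures=200)
stats = summarize(coeffs)
```

With real database spectra in place of `fake_spectra`, a histogram of `coeffs` would correspond to the PDF described here.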

Fig. B.1

Probability density function of the correlation coefficients between the average 5–15 μm spectra from 1000 mixtures of 548 species with random abundances between 0 and 1. The peak, average, and median correlation coefficients are shown in blue, red, and green, respectively. The 1-sigma variation around the mean is shown in yellow.

Appendix C: Statistical analysis

First we concentrate on the database and apply the following procedure:

  1. Randomly select r spectra from the database.

  2. Create m random linear combinations (mixtures) of the r spectra by assigning each spectrum a random abundance between 0 and 1 such that

     $$ X = A S \tag{C.1} $$

     where X is an m × n matrix holding m mixed spectra over n wavelength bins; A is an m × r abundance matrix containing random numbers between 0 and 1; and S is an r × n matrix holding the original set of database spectra. For this analysis we set m to 100, creating 100 random mixtures each time.

  3. Repeat steps (1)-(2) p times, randomly selecting a new set of r PAH spectra each time (here we vary r between 10 and 100). Thus, m × p mixed spectra are created in total, which we denote as Xi. We use p = 100 for this analysis. This means that there are m random mixtures of r spectra, and we reselect and remix those r spectra p times.

  4. For every wavelength bin, λ, we find the maximum, minimum, and mean values over the spectra Xi. From these values we create three spectra, Smax(λ), Smin(λ), and S(λ). The spectra Smax(λ) and Smin(λ) represent the boundaries within which any spectrum (Xi) falls. The mean spectrum is the kernel spectrum for that particular set of m × p mixed spectra.

  5. We calculate the Euclidean distances of the minimum and maximum spectra from the mean spectrum (Eqs. C.2 and C.3), and from these we define Nr (Eq. C.4), which is a measure of the maximum variation in the mixtures for each set of r spectra.
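
The five steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the database is replaced by synthetic spectra, each mixture is normalized to unit total intensity so that different values of r are comparable (an assumption not stated in the text), and the two Euclidean distances are combined by taking the larger one, since the exact form of Eq. (C.4) is not reproduced here. The names `kernel_envelope` and `variation_norm` are hypothetical.

```python
import numpy as np

def kernel_envelope(database, r, m=100, p=100, seed=0):
    """Steps 1-4: min, max, and mean spectra of m*p random mixtures."""
    rng = np.random.default_rng(seed)
    n_species = database.shape[0]
    blocks = []
    for _ in range(p):
        # Step 1: pick r distinct spectra from the database.
        picked = rng.choice(n_species, size=r, replace=False)
        S = database[picked]                       # (r, n)
        # Step 2: random abundances between 0 and 1, then X = A S.
        A = rng.uniform(0.0, 1.0, size=(m, r))     # (m, r)
        blocks.append(A @ S)                       # (m, n)
    X = np.vstack(blocks)                          # step 3: (m*p, n)
    # ASSUMPTION: normalize each mixture to unit total intensity.
    X = X / X.sum(axis=1, keepdims=True)
    # Step 4: envelope and mean at every wavelength bin.
    return X.min(axis=0), X.max(axis=0), X.mean(axis=0)

def variation_norm(s_min, s_max, s_mean):
    """Step 5: Euclidean distances of the envelope from the mean."""
    d_max = np.linalg.norm(s_max - s_mean)
    d_min = np.linalg.norm(s_min - s_mean)
    # ASSUMPTION: the larger distance stands in for Nr (Eq. C.4).
    return max(d_max, d_min)

# ASSUMPTION: 548 synthetic non-negative spectra over 120 wavelength bins.
rng = np.random.default_rng(42)
fake_database = rng.random((548, 120)) ** 4
norms = {r: variation_norm(*kernel_envelope(fake_database, r))
         for r in (10, 50, 100)}
```

Under these assumptions, `norms` holds one value per r, which is the kind of curve the procedure tracks as the number of species in the mixture increases.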

Now we consider the observations and calculate Nobs using the following steps:

  1. Subtract a linear baseline (corresponding to the emission from another dust component) for each observed spectrum.

  2. Out of the ten observed rest-frame spectra, define a minimum, maximum, and mean at each wavelength bin.

  3. Similarly to step 4 in the database analysis, for every wavelength bin, λ, we find the maximum, minimum, and mean values over the observed spectra Xi. From these values we create three spectra, Sobs,max(λ), Sobs,min(λ), and Sobs(λ). The spectra Sobs,max(λ) and Sobs,min(λ) represent the boundaries within which any of the observed spectra (Xi) fall. The mean spectrum Sobs(λ) is the average of the particular set of rest-frame observations presented in Fig. 1. Similarly to the definition of Nr for the database spectra (Eq. C.4), we define Nobs.
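
The treatment of the observations can be sketched in the same way. The example below is a minimal illustration with synthetic data: how the linear baseline is fitted is not specified in the text, so the sketch assumes a straight line through the first and last points of each spectrum. The function names, the Gaussian band positions, and the continuum parameters are hypothetical.

```python
import numpy as np

def subtract_linear_baseline(wavelength, flux):
    """Remove a straight line through the first and last points.

    ASSUMPTION: anchoring the line at the endpoints is an illustrative choice.
    """
    slope = (flux[-1] - flux[0]) / (wavelength[-1] - wavelength[0])
    baseline = flux[0] + slope * (wavelength - wavelength[0])
    return flux - baseline

def observed_envelope(wavelength, spectra):
    """Min, max, and mean of the baseline-subtracted spectra per bin."""
    cleaned = np.array([subtract_linear_baseline(wavelength, f)
                        for f in spectra])
    return cleaned.min(axis=0), cleaned.max(axis=0), cleaned.mean(axis=0)

# ASSUMPTION: ten synthetic spectra with Gaussian bands on a linear continuum.
wavelength = np.linspace(5.0, 15.0, 201)
rng = np.random.default_rng(7)

def gaussian(center, width):
    return np.exp(-0.5 * ((wavelength - center) / width) ** 2)

observed = []
for _ in range(10):
    features = sum(rng.uniform(0.5, 1.5) * gaussian(c, 0.15)
                   for c in (6.2, 7.7, 11.2))
    continuum = rng.uniform(0.0, 0.5) + rng.uniform(0.0, 0.1) * wavelength
    observed.append(features + continuum)

obs_min, obs_max, obs_mean = observed_envelope(wavelength, np.array(observed))
```

The three returned arrays play the roles of Sobs,min(λ), Sobs,max(λ), and Sobs(λ).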

Fig. C.1

Top panel: range of variations in the kernel spectra as a function of the number of PAH species considered in the mixture: 10 species (light grey) through 90 species (dark grey). In red is the range for 100 species and in black the range for 548 species. Bottom panel: evolution of the norm Nr, which captures the variations in the kernel spectra (blue line), and the norm Nobs, which captures the variations in the observations (see text for details).

Appendix C.1: Comparison between database and observations statistics

In Fig. C.1, we present the results of the statistical analysis graphically. The top panel of Fig. C.1 shows the shaded regions between Smin and Smax, which mark the boundaries of the m × p mixtures of r spectra. The lightest grey shaded region represents r = 10, and increasingly darker greys represent r = 20–90. The red region is where r = 100 and the black region is where r = 548, i.e., the whole database. It is clear from the figure that increasing the number of species in the sample decreases the resulting variations between the kernel spectra. This can be investigated more quantitatively by following the evolution of the norm Nr as a function of the number of species present in the mixture, r. This is done in the bottom panel of Fig. C.1, where the decrease of Nr can be seen clearly. One way to compare the variations of the observed AIB spectrum with those present in the kernel spectra is to compare Nr with Nobs, which has a constant value reported in Fig. C.1. When Nr < Nobs, the spectral variations of the database mixtures (in terms of the Euclidean norm as defined in Appendix C) are within the spectral variations observed in the PAH emission. This happens when r > 30 (Fig. C.1).

Appendix D: Blind signal separation

Blind signal separation is commonly used to restore a set of unknown source signals from a set of observed signals which are mixtures, or combinations, of these original source signals, with unknown mixture parameters (Hyvärinen et al. 2001). Several methods and algorithms exist in the literature. The astronomical PAH cation and neutral spectra presented in this Letter were obtained with Lee and Seung’s non-negative matrix factorization (NMF; Lee & Seung 2001). NMF was applied to data of the reflection nebula NGC 7023 obtained with the Infrared Spectrograph on board the Spitzer Space Telescope. Details on the procedure can be found in Berné et al. (2010).
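
As an illustration of the technique, the sketch below implements the standard multiplicative update rules of Lee & Seung for the Euclidean cost, using NumPy on synthetic data. It is a generic, minimal version and is not claimed to reproduce the procedure applied to the NGC 7023 data, for which Berné et al. (2010) should be consulted. The function name `nmf`, the iteration count, and the two synthetic Gaussian sources are hypothetical.

```python
import numpy as np

def nmf(V, n_sources, n_iter=500, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (observations x bins) as V ~ W H.

    W holds the mixing weights and H the source spectra; both stay
    non-negative because the updates only multiply by non-negative ratios.
    """
    rng = np.random.default_rng(seed)
    n_obs, n_bins = V.shape
    W = rng.random((n_obs, n_sources)) + eps
    H = rng.random((n_sources, n_bins)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the squared Euclidean distance.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# ASSUMPTION: two synthetic Gaussian "source spectra" mixed with random
# non-negative weights stand in for real observations.
wavelength = np.linspace(5.0, 15.0, 100)
source_a = np.exp(-0.5 * ((wavelength - 7.7) / 0.4) ** 2)
source_b = np.exp(-0.5 * ((wavelength - 11.2) / 0.3) ** 2)
true_sources = np.vstack([source_a, source_b])

rng = np.random.default_rng(3)
weights = rng.uniform(0.1, 1.0, size=(30, 2))
V = weights @ true_sources                      # 30 observed mixtures

W, H = nmf(V, n_sources=2)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The rows of `H` are the recovered source spectra, defined only up to scaling and ordering, which is a general property of this kind of factorization.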

© ESO, 2014
