Mapping the structural diversity of C60 carbon clusters and their infrared spectra

The current debate about the nature of the carbonaceous material carrying the infrared (IR) emission spectra of planetary and proto-planetary nebulae, including the broad plateaus, calls for further studies on the interplay between structure and spectroscopy of carbon-based compounds of astrophysical interest. The recent observation of C60 buckminsterfullerene in space suggests that carbon clusters of similar size may also be relevant. In the present work, broad statistical samples of C60 isomers were computationally determined without any bias using a reactive force field, their IR spectra being subsequently obtained following local optimization with the density-functional-based tight-binding theory. Structural analysis reveals four main structural families identified as cages, planar polycyclic aromatics, pretzels, and branched. Comparison with available astronomical spectra indicates that only the cage family could contribute to the plateau observed in the 6-9 micron region. The present framework shows great promise to explore and relate structural and spectroscopic features in more diverse and possibly hydrogenated carbonaceous compounds, in relation with astronomical observations.


Introduction
Buckminsterfullerene (C 60 ) has recently been identified in space (Cami et al. 2010;Sellgren et al. 2010). Although recent observational studies suggest that fullerenes could be produced by photochemical dehydrogenation and isomerization of amorphous carbon grains (García-Hernández et al. 2010, 2011b, the composition of these grains is still debated. Infrared (IR) spectroscopic observations of fullerene-rich planetary nebulae (PNe) found a broad plateau with substructure at 6-9 µm range (García-Hernández et al. 2010, 2011bBernard-Salas et al. 2012). The origin of this plateau remains elusive and various hypotheses have been formulated about its possible carriers, including hydrogenated amorphous carbon (HAC) compounds (Bernard-Salas et al. 2012), very small grains, or polycyclic aromatic hydrocarbon (PAH) clusters (Tielens 2008;Buss et al. 1993;Rapacioli et al. 2005). Similarly, the 8 and 12 µm plateau features of some proto-planetary nebulae (PPNe) were assigned to the vibrations of alkanes or alkyle side groups pertaining to large carbonaceous particles (Kwok et al. 2001;Hrivnak et al. 2000), whose details also remain unclear.
These observations and their various interpretations have motivated further studies to unravel the molecular nature of the carriers of these spectral features. The present Send offprint requests to: e-mail: aude.simon@irsamc.ups-tlse.fr letter describes a purely computational framework designed to systematically and extensively explore the structural diversity of C 60 isomers in a statistically unbiased way and to relate the various isomers obtained to their IR spectral features. Sorting these isomers using appropriate order parameters, our mapping provides unprecedented correlations between structure and spectroscopy in astrochemically relevant compounds. This study also brings indirect insight into the possible precursors of stable fullerenes under astrophysical environments, whose formation in circumstellar environments of carbon-rich evolved stars is somewhat puzzling (Bernard-Salas et al. 2012). Two main routes towards such highly organized molecules have been proposed in the literature, consisting of either the successive growth from smaller carbon chain building blocks, known as the closed network growth mechanism (Dunk et al. 2012), or alternatively the decay and rearrangement of larger carbon grains or dehydrogenated PAHs (Berné et al. 2015) under the action of ultraviolet photons (Zhen et al. 2014) or by collisions with energetic particles (Omont 2016). These socalled bottom-up and top-down mechanisms are expected to prevail in the hot, dense envelopes of evolved stars and in low-density environments such as interstellar clouds, respectively (Berné & Tielens 2012).
The letter is organized as follows. In Section 2, we briefly describe the computational approach. In Section 3 we analyze the structural diversity of C 60 , the classification into four main families of isomers, and the relation with IR fea- tures. In Section 4 we discuss the possible contribution of these families to some features of astronomical spectra for selected PNe and PPNe. Finally, we make some concluding remarks in Section 5.

Computational details
Following the simulation protocol recently described in Bonnin et al. (2019), a systematic and unbiased search for isomers of C 60 was computationally undertaken using the efficient but realistic reactive empirical bond order (REBO) force field (Brenner et al. 2002). Replica-exchange molecular dynamics (REMD) simulations were performed in the 2500-6500 K temperature range using 16 trajectories, the temperatures being allocated in a 12-member geometric series with four extra temperatures around the C 60 melting temperature (Kim & Tománek 1994). All replicas were initiated from the buckminsterfullerene global minimum. The use of high temperatures and the REMD framework ensures that the sample is not biased towards this starting point. The trajectories were integrated over 100 ns using a time step of 0.1 fs, and configurations were periodically saved and locally reoptimized. Spherical boundary conditions were used and the density was varied over four different simulations with ρ = 0.025, 0.15, 0.4, and 1.7 g.cm −3 . Dissociated structures corresponding to a threshold distance greater than 1.85 Å were excluded. This unbiased sampling stage produced 656 438 nonequivalent isomers.
While REBO provides a reasonable description of intramolecular forces in carbon nanostructures, it lacks electrostatics and cannot provide vibrational spectra. Instead we used the self-consistent charge (SCC) densityfunctional-based tight-binding (DFTB) method employing the mio set of parameters (Elstner et al. 1998) with additional dispersion corrections ). Although significantly more expensive than REBO, the SCC-DFTB approach remains computationally efficient owing to approximations and pre-computed pair integrals. It was previously employed successfully to determine structural (Zheng et al. 2005;Yen & Lai 2015;Witek et al. 2006;Malolepsza et al. 2007), spectroscopic (Joalland et al. 2010;Simon et al. 2017aSimon et al. , 2012Simon & Spiegelman 2013b,a), and fragmentation (Simon et al. 2017b(Simon et al. , 2018Rapacioli et al. 2018) properties of PAH molecules and oligomers of astrophysical interest under various charge and energy states, at and away from equilibrium. The formation of PAHs or carbon clusters (Saha et al. 2009(Saha et al. , 2010 and their hydrogen uptake capability (Dominguez-Gutierrez et al. 2018) were also scrutinized.
Reoptimization of the REBO isomers produced 309 168 independent structures at the SCC-DFTB level of theory, for which the IR absorption spectra were computed in the double harmonic approximation. The raw vibrational frequencies obtained with SCC-DFTB were scaled to match the best reference data (see Appendix A). Owing to this dependence of the scaling factors on the frequency domain, a discontinuity arises near 6.0 µm, preventing an accurate analysis of the [5.8-6.2 µm] domain.
All REBO simulations were performed using the LAMMPS software package (Plimpton 1995); the SCC-DFTB calculations were carried out with the deMonNano software package (Heine et al. 2009).

Structural partitioning
The structural properties of our sample of isomers were revealed by employing a reduced number of parameters that characterize features at different scales. In addition to the SCC-DFTB energy, we considered the shape of the individual isomers through the radius of gyration r g and the Hill-Wheeler asphericity parameter β (Hill & Wheeler 1953). The radius of gyration r g measures the spatial extension of the isomer, while β < 1 measures deviations to the perfect sphere for which β = 0. As is shown below, the Hill-Wheeler parameters, previously used to characterize the shape of metallic clusters (Calvo et al. 2000), appear to be quite sensitive for carbon compounds as well.
Local ordering in the carbon clusters was examined from their hybridization level. From the SCC-DFTB electronic density matrix, Mayer's bond orders (Mayer 1983) were computed and bonds placed accordingly when the bond order was higher than 0.8. A carbon atom with a coordination number of n + 1 is attributed to a sp n hybridization level. For a given isomer, the fraction of sp 2 atoms averaged over all atoms was assigned as the main local order parameter to measure the extent of sp 2 hybridization.
A last discriminatory parameter, which can be seen as a semi-local or semi-global index, is the number of 6-member rings within each isomer. This quantity is related to the extent of connectivity within the structure, and can also differentiate isomers with large aromatic domains from those exhibiting topological defects. Figure 1 shows how the various order parameters are distributed among the 3 × 10 5 isomers. Among the five parameters considered, the fraction of sp 2 atoms and the asphericity β are found to best classify the various isomers into structural families. Four such families are identified, all fitting into rectangular boxes in Fig. 1(a). Each family has generally clear signatures as well-defined peaks in the one-dimensional distributions of Figs. 1(b-d) as a function of energy, number of 6-member cycles, and gyration radius, respectively.
The first family, which is thermodynamically the most stable (lowest energies), is defined by isomers with the highest fraction of sp 2 atoms (above 75%). They generally show a high spherical character (low β). These isomers also have the highest number of 6-member rings but the lowest gyration radius (3.7 Å), indicating a rather compact shape. These 8.2×10 4 isomers, known as cages, include buckminsterfullerene as their most remarkable member, as well as defective fullerenes such as structure (2) in Fig. 1 (a).
In order of increasing energy, the second family consists of intermediate fractions of sp 2 atoms between 45% and 75%, and an asphericity parameter β higher than 0.3. These isomers exhibit clear peaks in the energy and gyration radius distributions, and once resolved they also show intermediate values in the number of 6-member rings. This family, classified as flakes or PPAs, contains 7.7×10 4 members.
One minor peak in the gyration radius distribution near 4 Å points at a specific family with rather compact structures, correlated with low asphericity but a more arbitrary sp 2 fraction (still lower than 75%). This third family is defined in the two-dimensional distribution from β < 0.3, and includes rather spherical but fairly open structures with many carbon chains. Following Kim & Tománek (1994) we classify the 3.8×10 4 members of this family as pretzels. Finally, all the remaining isomers lie in the range of β > 0.3 and with a sp 2 fraction lower than 45%. They are associated with the highest energies, the lowest number of 6-member rings, but the highest gyration radius, whose distribution is particularly broad. They usually contain multiple linear chains connected at a limited subset of atoms. This family is broadly referred to as made of branched structures, which are found in numbers of 1.1×10 5 .
Selected isomers are depicted in Fig. 1 to further illustrate the terminology employed above. For each family, the statistically averaged values of all five energetic or structural parameters was determined, and the Euclidean distance of each member of the family to this five-dimensional point calculated. The member having the lowest distance was identified as a most representative isomer of the family, without any bias from human judgment. The structures numbered (1), (4), (5), and (7) in Fig. 1 are the statistically most representative isomers. For each family another isomer was chosen to further illustrate the structural diversity within the families. These isomers are numbered (2), (3), (6), and (8).
The structural diversity found by exploring the potential energy surface of C 60 suggests an equally diverse set of rearrangement pathways towards fullerenes. In addition to amorphous (Sinitsa et al. 2017) and PPA (Berné et al. 2015) intermediates, linear carbon chains, as already identified in circumstellar environments (Bernath et al. 1989;Hinkle et al. 1988), could also play a role by forming pretzel or branched structures also identified here at higher ener-gies. In view of their possible spectroscopic detection, it is important to now determine their infrared signature.

IR spectra in comparison with astronomical data
In the absence of clear information about the underlying thermodynamics under PNe and PPNe conditions, uniform populations were chosen to reconstruct the typical IR spectra of the four structural families from the individual contributions of each member. Such uniform distributions are an oversimplification of the energetics, although Fig. 1 indicates that cages, PPAs, and branched structures poorly overlap with each other in this respect. By summing over thousands of individual topologically distinct conformers, the spectra are necessarily broad and cannot account for sharp bands, which originate from very abundant specific structures and that must be assigned using more traditional approaches. For comparison with astronomical data, the IR absorption intensities were further convoluted by a Planck blackbody radiation law at 400 K, as proposed earlier by Draine and Li (Draine & Li 2001). The four spectra thus determined for the four families are shown in the bottom part of Fig. 2.
All families exhibit features below 6 µm, although significantly attenuated by the convolution with the Planck law for the cages population. These lines involve sp 1 carbons found in surface (cages) or edge (PPA) defects, and long loops and linear chains (pretzels and branched). The terminating chains, in particular, display a strong absorption near 4.7 µm. While such isomers exist among the cage A&A proofs: manuscript no. 34943corr nebulae (black curves) and IR spectra calculated for the cages, PPAs, pretzels, and branched structures families (red, green, purple, and blue curves, respectively). The four vertical lines mark the IR active lines of buckminsterfullerene (Krätschmer et al. 1990) [see, e.g., Fig. 1(1)] and PPA [see, e.g., Fig. 1(7)] populations, they are far more numerous in more open structures such as the pretzel and the branched families, for which this peak is even more intense and doubled with another feature at 5.1 µm. Above 6 µm the spectral features are due to the CC stretching of sp 2 carbons involved in multiple 5-, 6-, or 7-member rings. The corresponding active spectral region is broad and overlaps with the range where CCC in-plane bending modes are found near 10.5-12.5 µm. These features are most prominent for cages, for which the averaged spectrum exhibits a broad band between 6.2 and 11.6 µm with a sharp blue tail, a smooth red tail, and a sharper maximum near 6.6-7.9 µm. Smaller peaks near 8.9 and 10.3 µm are caused by CC stretchings, while the band near 12.5 µm is assigned to in-plane CCC bending modes leading to some rotation of the rings. Bands located above 14 µm corre-spond to softer deformations of the carbon skeleton with a more collective character. The accuracy of these lines is disputable at the present level of theory (see Appendix A.2, Table A.1), and the narrow intense band appearing near 15.7 µm should actually be shifted by +1.2 µm.
The IR spectrum obtained for the PPA family shows a very broad band extending from 6.4 to 13.3 µm, with maxima at 7.4, 9.5, and 10.8 µm. These features are the signature of CC stretching modes that also increasingly combine with CCC bending modes as the wavelength increases. Two bands near 16.4 and 21.9 µm are assigned to in-plane and out-of-plane deformation modes, respectively.
Compared with cages and PPA isomers, the spectrum of the pretzel isomers above 6 µm displays fewer bands except broad maxima near 7.3 and 15.3 µm that we assign to CC stretchings and in-plane bendings, respectively. The much greater structural diversity among this population explains why these modes are associated with broader bands.
Finally, the spectrum of the branched family exhibits an active region in the 6.4-9.6 µm range, but hardly any feature above this range. The maxima at 7.3 and 8.6 µm are assigned to CC stretches coupled to in-plane CCC bending modes involving sp 2 carbons that pertain both to small rings and long loops or linear chains.
These spectra can now be compared to astronomical observations recorded in C-rich and H-rich PNes of the Small Magellanic Cloud (SMC), where fullerenes and PAHs have both been detected (García-Hernández et al. 2010) and where plateaus in the spectra have remained unassigned. The emission features in SMP SMC 16 found in the 6-9 µm and 15-20 µm ranges possibly match those found for the cage population, given that the band calculated at 15.7 µm should be shifted by about +1.2 µm. However, the contribution of cages to the 10-13 µm plateau is questionable, as is the presence of PPA structures, whose spectral feature near 6-10 µm is much broader than the observed data, with no remarkable feature in the 10-13 µm range either. Similarly, pretzel isomers cannot explain the astronomical data for this PNe. In contrast, we find some similarities in the spectrum of the branched family near 6-9 µm, and especially its peak at 8.6 µm. However, the branched family spectrum shows no feature above 10-13 µm, in contradiction with observations.
Comparing now the calculated spectra with the astronomical observations of the Tc 1 PNe where C 60 and C 70 were detected (Cami et al. 2010), the most stable cage population may also contribute to the same regions as for SMP-SMC 16, but cannot account for the very intense 10-13 µm plateau. The contribution of the other families must probably be excluded as they are too dissimilar with astronomical data for this PNe.
Comparison can also be made for the spectrum of SMP SMC 24, which was suggested as being due to a complex mixture of aliphatic and aromatic compounds such as HACs, PAH clusters, fullerenes, and small dehydrogenated carbon clusters (García-Hernández et al. 2011a). As was the case for SMP-SMC 16, only the cage family is compatible as a potential contributor to the features near 6-9 µm and 15-20 µm. The spectra obtained for the cationic case of C + 60 show essentially no change relative to the neutral system (see Appendix B, Fig. B.1). None of the studied families can account for the 10-13 µm feature. However, additional calculations on smaller carbon clusters C 24 and C 42 show that their corresponding cage families possess more intense features in the 10-15 µm domain than C 60 , relative to the 6-9 µm region (see Appendix C, Fig. C.1 top left panel). Two bands are observed in the spectrum of C 24 in the 10-13 µm range (11.1 µm and 11.8 µm) plus a weaker feature near 12.7-13.3 µm. The spectrum of C 42 presents a weak band in the 10.7-12.2 µm range plus a wide intense band centered at 13.7 µm. These smaller cages could then also be considered as potential contributors to the 10-15 µm astronomical features. The C 24 size is all the more interesting as it possesses features in the 15-20 µm range (15.8, 16.8, 17.6, and 18.6 µm at our level of calculation) that are observed in SMC 16 and 24. The present theoretical study providing the IR signature of cage populations for 24, 42, and 60 carbon atoms thus complements recent density-functionaltheory (DFT) investigations suggesting the possible individual contribution of closed cage fullerenes smaller than C 60 to the IR spectra of fullerene-PNe (Adjizian et al. 2016;Candian et al. 2019).
Finally, emission spectra of proto-planetary nebulae that represent the short evolutionary phase between AGB stars and PNe are also worth considering in our comparison, and here we take the example of the PPN IRAS 22272+5435 whose spectrum can be found in Fig. 3 of Kwok et al. (2001). Interestingly, no intense band is observed at 4.7 µm, which discards compounds with long carbon chains such as pretzels or members of the branched family. The features near 8 and 12 µm that were assigned to the vibrations of alkanes or alkyle side groups pertaining to large carbonaceous particles (Kwok et al. 2001) have no direct counterpart on our spectra. However, the shape of the 5-8 µm feature reported by Kwok et al. (2001) is consistent with the spectrum obtained here for the cages population. Moreover, other features such as the intense band near 17 µm are not observed in PPNe. As was the case for PNe, it is plausible that for such PPNe the hydrogenation state plays an important role in these IR features.

Conclusions
Fullerenes identified in space necessarily originate from complex chemical rearrangements that proceed either by growing smaller scale species, or by reducing larger compounds until these aromatic cages with enhanced stability are formed. To unravel the possible intermediates in these rearrangements through their vibrational IR response, and without assuming any specific formation mechanism, we have attempted to systematically sample the possible isomers that a connected set of 60 carbon atoms can exhibit. The conformational landscape could be efficiently explored by combining REMD simulations employing the REBO potential, followed by systematic reoptimization and IR characterization using the more accurate SCC-DFTB method.
Different energetic, structural and electronic parameters allowed the 3 × 10 5 structures obtained to be partitioned into four main families called cages, planar polycyclic aromatics, pretzels, and branched. The IR spectra of these families exhibit significant differences from each other, which in turn could be exploited to get insight into their potential presence in planetary and proto-planetary nebulae. Comparison between calculated and astronomical spectra generally indicates that neutral or cationic cage isomers could account for the 6-9 µm plateau, possibly suggestive of defective fullerenes in these environments. Smaller cage carbon clusters could also contribute. In contrast, more disordered structures containing long carbon loops or a greater number of linear chains are less likely, all the more as the intense IR feature found for these pretzel and branched families near 4.7 µm has not been observed in PNe and PPNe.
The main limitation of our model is that only a single cluster size and stoichiometry were considered, without thermodynamical consideration within the isomer populations. Such important extensions are relatively straightforward to carry out using the present methodology, provided that the potential energy surfaces and IR spectra can be determined at a similar, reasonable computational cost. However, the large amount of associated data might require more robust analysis tools. In future developments, the contributions of hydrogen or other heteroatoms, different sizes and charge states, as well as anharmonicities, will be explored in order to help identify the structural patterns responsible for the as-yet-unexplained astronomical spectral features of PNe, notably in the 9-13 µm region. This work and its future developments will be of great importance to interpreting the IR spectra of carbon-rich PPNe and PNe soon to be observed by the James Webb Space Telescope.
The vibrational modes of carbon clusters are poorly known, except for a few exceptions such as buckminsterfullerene.
Here we use density-functional theory with the B3LYP functional and the pc-1 basis set (Jensen 2001) as the reference method. Given that this approach is not exact, all frequencies obtained with it are scaled by the appropriate factor of 0.973 (Laury et al. 2012).
A sample of 50 isomers of C 24 were chosen randomly among those reported in the work by Bonnin et al. (2019). Each isomer was locally reoptimized at the DFT/B3LYP/pc-1 level of theory, the normal modes being determined by standard diagonalization of the massweighted Hessian matrix. DFT calculations were performed with the Gaussian09 suite of programs (Frisch et al. 2009). This was not achieved for C 60 isomers due to the too high computational cost at the DFT level. However, the peculiar but well known buckminsterfullerene isomer of C 60 was added to the sample. Correlations between the frequencies obtained with the two methods (DFTB and DFT) can be seen for a subset of particularly active modes in Fig  This correlation plot displays two main linear regions, depending on whether the active mode lies below 1750 cm −1 or above this approximate threshold. The DFT frequencies are recovered from SCC-DFTB frequencies if linear relationships of the type are used in the corresponding ranges.
From the knowledge of the DFTB values, the IR band positions used for comparison with astronomical spectra were determined using these linear relations, and further scaled by 0.973. For the sake of clarity, the uncertainty on the determined correction factors also shown in Fig. A.1 was not applied to the spectra of Fig. 2.

Appendix A.2: Assessment of linear corrections on buckminsterfullerene and fully dehydrogenated coronene
The IR active modes of icosahedral C 60 (buckminsterfullerene) and D 6h fully dehydrogenated coronene C 24 were determined with the SCC-DFTB and DFT/B3LYP/pc-1 methods. The raw positions directly obtained with these methods are listed in Tables A.1

Appendix B: Structural isomers of C 60 + : IR spectra
The geometries of all structural isomers of C 60 obtained at the SCC-DFTB level were optimized in a positive charge state (doublet spin-state). Only slight structural changes were observed, the positive charge being diluted over the 60 carbon atoms. Their IR spectra were computed in the double harmonic approximation similarly to their neutral counterparts and the resulting average spectra for the four families as defined in the main text are shown in Fig A similar study to that proposed in the present letter for C 60 has been performed for smaller clusters, i.e., C 42 and C 24 with initial REBO structures taken from Bonnin et al. (2019). The criteria used to determine the families are similar to those described for C 60 . The IR spectra of the four families for the three cluster sizes are given in C.1.