Revisiting fundamental properties of TiO 2 nanoclusters as condensation seeds in astrophysical environments

Context. The formation of inorganic cloud particles takes place in several atmospheric environments, including those of warm, hot, rocky, and gaseous exoplanets, brown dwarfs, and asymptotic giant branch stars. The cloud particle formation needs to be triggered by the in situ formation of condensation seeds since it cannot be reasonably assumed that such condensation seeds preexist in these chemically complex gas-phase environments. Aims. We aim to develop a method for calculating the thermochemical properties of clusters as key inputs for modelling the formation of condensation nuclei in gases of changing chemical composition. TiO 2 is used as benchmark species for cluster sizes N = 1–15. Methods. We created a total of 90000 candidate (TiO 2 ) N geometries for cluster sizes N = 3–15. We employed a hierarchical optimisation approach, consisting of a force-field description, density-functional based tight-binding, and all-electron density-functional theory (DFT) to obtain accurate zero-point energies and thermochemical properties for the clusters. Results. In 129 combinations of functionals and basis sets, we find that B3LYP/cc-pVTZ, including Grimme’s empirical dispersion, performs most accurately with respect to experimentally derived thermochemical properties of the TiO 2 molecule. We present a hitherto unreported global minimum candidate for size N = 13. The DFT-derived thermochemical cluster data are used to evaluate the nucleation rates for a given temperature–pressure profile of a model hot-Jupiter atmosphere. We find that with the updated and refined cluster data, nucleation becomes unfeasible at slightly lower temperatures, raising the lower boundary for seed formation in the atmosphere. Conclusions. The approach presented in this paper allows finding stable isomers for small (TiO 2 ) N clusters. The choice of the functional and basis set for the all-electron DFT calculations has a measurable impact on the resulting surface tension and nucleation rate, and the updated thermochemical data are recommended for future considerations.

Cloud particles form when a supersaturated gas condenses on the surface of ultra-small particles. These nanosized particles are called cloud condensation nuclei (CCN) (Hudson 1993). The formation of CCN is a crucial step within cloud formation, but its formation rate needs to be determined from first principles, as attempts to derive it from models using the nucleation rate as a free parameter are unable to constrain it accurately (Ormel & Min 2019). Several efforts were undertaken to model the forma-tion of CCN from a quantum-mechanical bottom-up approach including nucleating species such as titanates (Jeong et al. 2000;Plane et al. 2013;Patzer et al. 2014), SiO , Fe-and Kr-bearing molecules (Chang et al. , 2013 and alumina , and Al 2 O 3 (Lam et al. 2015;Gobrecht et al. 2021a). Recently, Köhn et al. (2021) introduced a 3D Monte Carlo approach for the nucleation of TiO 2 that agrees reasonably well with results from kinetic nucleation approaches. Comparison studies like this require knowledge of thermochemical cluster data of the most favourable isomers.
On cool rocky planets, CCN can be formed from external sources such as sulfites from volcanic activity (Andres & Kasgnoc 1998), sea salt from ocean spray, condensing meteoritic dust, and dust particles from sand storms. As these sources do not exist for gaseous planets and are not guaranteed to exist for hot rocky planets, CCN need to be produced by chemical reactions out of the gas phase within the atmosphere to allow cloud formation. The existence of clouds and hazes has been predicted by models and has been observed in hot Jupiters, for example HD189733b (Pont et al. 2013; Barstow et al. 2014), HAT-P-7b (Helling et al. Article number, page 1 of 20 Article published by EDP Sciences, to be cited as https://doi.org/10.1051/0004-6361/202243306 A&A proofs: manuscript no. output 2019), or WASP-43b ; warm Saturns (e.g. Nikolov et al. (2021); super-Earths (Kreidberg et al. 2013); and brown dwarfs (Apai et al. 2013). The process that is considered to produce CCN in the atmospheres of gaseous planets is the formation of small clusters through nucleation. One of the species that has been considered for the formation of CCN in gas giant atmospheres is TiO 2 , in addition to less refractory species, for example SiO or KCl (Helling 2018). Different descriptions have been used to describe the formation of condensation seeds in exoplanet, brown dwarf, and asymptotic giant branch (AGB) star research: classical nucleation theory (Gail et al. 1986), modified classical nucleation theory (Gail & Sedlmayr 2013), kinetic nucleation networks (Patzer et al. 1998), and chemical-kinetic nucleation descriptions (Gobrecht et al. 2016;Boulangier et al. 2019). All rely at some point in the modelling process on thermochemical data of the species and their small nanosized clusters. Because experimental data often exist for condensed phases and simple gas-phase molecules only, quantum-chemical calculations provide the possibility of addressing the cluster size space in between. Each cluster size appears with different geometrical structures (i.e. its isomers), and it is not a priori known which of the isomers is the most favourable at each reaction step to eventually form a CCN. Experimental input would be required here. In lieu of this, we assume that the thermodynamically most favourable isomer takes this role of a key reactant. We therefore search for the isomer at the global minimum in potential energy for each cluster size because ideally, this will be the configuration that any cluster of that size will relax towards. Global minimum candidate isomers were found in previous studies on TiO 2 clusters (Lamiel-Garcia et al. 2017;Berardo et al. 2014;Jeong et al. 2000) and in the analysis of their thermochemical properties (Lee et al. 2015). In this work, a hierarchical approach is applied that uses three different levels of complexity: force fields, density-functional based tight-binding (DFTB), and density-functional theory (DFT) calculations. This approach was developed to globally search for potential geometries of small (TiO 2 ) N clusters (N = 3 − 15), and their thermochemical properties are analysed using quantum-chemical DFT calculations. In Section 2 we describe the methods we used to create possible (TiO 2 ) N cluster structures and the approximations we used to describe their interatomic interactions when searching for a geometry that is energetically favourable, that is, is located at a potential minimum. This approach is tested against previous results for small TiO 2 clusters N=3-6. For the DFT calculations, 129 combinations of functionals and basis sets are benchmarked against experimental data, and we find that the B3LYP functional with the cc-pVTZ basis-set, including Grimme's empirical dispersion, most accurately predicts the potential energies and thermochemical properties of the TiO 2 molecule. In Section 3 we evaluate the results for the different steps in our multi-level approach. This section also evaluates the quality of the approach by comparing the cluster isomers we found with known isomers from the literature. Section 4 analyses the impact of the updated cluster potential energies and the related thermochemical data of nucleation rates for a model hot-Jupiter atmosphere. Finally, we discuss our results and possible further work in Section 5.

Methods
The goal of this paper is to evaluate small clusters of titanium dioxide (TiO 2 ) N , N = 1 − 15, with regard to their geometry, binding energy, and thermochemical properties and the impact of these parameters on cloud nucleation in exoplanet atmospheres. A search is conducted for the most favourable isomer of each size N that represents the global minimum of the the potential energy surface (PES), which characterises the energy of the system. Additionally, other favourable isomers with potential energies close to the minimum are explored. In order to achieve this, the geometries of the clusters are varied and brought towards a geometry that is located at a potential minimum. A hierarchical method is employed that uses three different levels of complexity to describe the PES: force fields, DFTB, and DFT calculations (see Figure 1). Because the DFTB approach takes electronelectron interactions of the valence electrons into account compared to the purely ionic force-field description, it approximates the binding energies for the isomers better. This results in a more accurate energetic ordering of the candidate clusters for each size. For every cluster size N, candidate cluster isomers are created using different methods, and they are optimised towards a local minimum in potential energy using a basin-hopping algorithm. The energy evaluation in this optimisation procedure is based on a description of the inter-atomic interactions by the Coulomb-Buckingham force-field approximation. The resulting cluster geometries are then used as inputs for further geometry optimisation using the DFTB approach because it describes the PES and interatomic interactions more accurately than the force fields. The DFTB approach is used as a second step only as its higher accuracy comes at higher computational cost, making it more economical to only use pre-optimised clusters from the force-field description as inputs. Finally, the candidates at the lowest minima in the potential (which in turn have the highest binding energies) from this second step are used as inputs for all-electron DFT optimisations. With this approach, we aim to achieve highly accurate geometries and energies for the thermodynamically most stable isomers of each cluster size N.

DFT minimisation
Global minimum isomers + thermochemical data Fig. 1: Hierarchical approach to determine global minimum candidate structures for clusters. 1.) The geometries and binding energies of small clusters and isomers are used to calibrate the accuracy of the force field and DFTB methods. 2.) A force-field description of interatomic interactions is used to locally optimise the geometries of a large number of generated clusters for each size N towards a potential energy minimum. 3.) The geometries of these locally and semi-classically optimised cluster candidates are then further refined in a third step, using DFTB methods. This step optimises the geometries of the cluster candidates for the lowest possible potential energy. This provides an energetic ordering of the candidate geometries. 4.) The energetically most favourable candidate geometries are then used as inputs for DFT calculations, resulting in the final and most accurate geometries, binding energies, and vibrational and rotational frequencies for each cluster within this approach.

Construction of seed structures for cluster geometries
We applied the hierarchical approach to (TiO 2 ) N cluster formation. This started with the creation of a broad range of different seed structures or cluster geometries. These seed structures are unoptimised cluster geometries for (TiO 2 ) N , N = 3,...,15, which were then optimised towards potential energy minima. In the first optimisation step, these un-optimised cluster geometries were used as inputs with the force-field approach (Sect. 2.4.2 and the first box in Fig. 1). In order to minimise the chances of missing a particular stable cluster, a large number of seed structures that cover a wide range of structurally diverse geometries were generated. These candidate geometries ranged from closely packed and compact structures to larger and extended structures with void parts and included both symmetrical and asymmetrical structures for all sizes. Ideally, they are also easily optimisable, which means that the average distance between neighbouring atoms should not be much larger than their typical bond length. Four different approaches were used to create these seed clusters of size N: -Random: For a fully randomised seed structure creation, starting with a single TiO 2 monomer unit, more units are iteratively attached. They are placed them at the end of a vector with a random orientation and a length of 1.6Å, which corresponds to the typical Ti-O bond distance for nanosized TiO 2 clusters (Fig. 2a). -Known+1: Known stable isomer structures from the literature (Lamiel-Garcia et al. 2017;Berardo et al. 2014) of size N −1 are taken, and one monomer is randomly attached analogous to method 1 to obtain seed clusters for size N (Fig. 2b).
Both of these methods have a tendency to produce rather compact and highly asymmetric seed structures, especially for larger cluster sizes N. Therefore, two more approaches were introduced to create spatially extended clusters and symmetric clusters, respectively.
-Mirror: For even cluster sizes N, a cluster of size N 2 is taken and mirrored about a random axis. For uneven cluster sizes N, a cluster of size N−1 2 is taken and mirrored. Afterwards, a single monomer is added along the mirror axis to bring the total number of monomer units to N. All seed geometries created using this method fall within the C 2 point group (Fig.  2c).
-Equidistant: Seed structures of size N are created by evenly distributing N monomers equidistant across a sphere of random radius, so that the distances between their centres of mass are between 1.6 − 3.2Å. The resulting geometries resemble hollow spheres (Fig. 2d).
An example of a candidate geometry of size N = 7 produced by each of these approaches is depicted in Figure 2. As the parameter space for possible cluster geometries grows with the cluster size N, a distinction between small (N = 3 − 7) and large (N = 8 − 15) clusters was made in order to save computational cost. For small clusters, 1000 clusters were created with method 1, 400 with method 2, 400 with method 3, and 200 with method 4, giving a total of 2000 cluster seed geometries for each size. For large clusters, the number of guessed geometries was multiplied by 5 to account for the larger parameter space. Ten thousand cluster seed geometries were created for each size, 5000 with method 1, 2000 with method 2, 2000 with method 3, and 1000 with method 4 (Table 5). Hence, 90000 geometrical seed structures were tested in total. All seed geometries were then optimised towards potential minima using the force-field description of the PES that is presented at the end of Sect.2.4.2. Fig. 2: Examples for initial un-optimised cluster geometries of size N = 7, generated (a) randomly, (b) from a known cluster with an additional monomer (marked with yellow circles), (c) from a mirrored cluster of size N = 3 with an additional monomer in the centre, and (d) from N = 7 monomers evenly distributed on a sphere.

Density functional theory
To accurately describe a cluster, a large number of interactions, including quantum-mechanical ones, need to be taken into account. We employed the DFT (see Appendix A.1) as a method for solving the Schrödinger equation approximately and for determining zero-point energies, vibrational frequencies, and rotational constants of our clusters. Density functional theory is parametrised through the choice of a functional and a basis set, which has a large impact on the resulting quantities. It is therefore essential to select an appropriate functional and basis set for our purpose.

Finding the ideal functional and basis set
For the calculations at the density functional level of theory (DFT), the Gaussian16 (Frisch et al. 2013) program was used. It is desirable to use a model parametrisation that agrees with measured data. In order to determine the DFT parametrisation in the form of a combination of functional and basis set that most closely resembles measured data, the results were compared to experimentally derived data for the zero-point energy (Malcolm W. Chase 1998), vibrational frequencies (Malcolm W. Chase 1998), and rotational frequencies (Brünken et al. 2008) for the TiO 2 monomer molecule.
Approach for TiO 2 : The tested functionals include two GGA DFT functionals, 12 hybrid functionals that included Hartree-Fock exchange, and the three complete basis set (CBS) methods listed in Table 1. The functionals were selected to cover a broad range of theoretical approaches as well as with regard to their availability within Gaussian16. Although there are no in-teractions within our clusters that stem from non-bonded interactions, we found that empirical damping coefficients as described by Grimme et al. (2011) improve the overall accuracy of our calibration. The EmpiricalDispersion=GD3BJ keyword was therefore used if available. We note that the use of an empirical dispersion does not originate from a physical or chemical consideration. Employing a similar approach in the selection of basis sets, a total of 129 candidate combinations of functionals and basis sets was produced (Table 1). For each of the combinations, Gaussian16 was used to perform an optimisation of the TiO 2 monomer and to calculate a zero-point energy for the titanium and oxygen atoms. The resulting energies for the individual functionals and basis sets were compared with the experimental data found in the JANAF-NIST tables. All 129 functional or basis set combinations were evaluated according to their binding energy for the TiO 2 molecule calculated at a temperature of T = 0K. The binding energy of the monomer (TiO 2 ) was calculated according to and can now be compared to experimental values (Table 2, 4th column). E ZP denotes the zero-point energy, that is, the energy of the system at rest in its ground electronic state. Further values that are known from experiments are the vibrational frequencies ν 1 , ν 2 , and ν 3 (Malcolm W. Chase 1998) and the rotational constants A rot , B rot , and C rot (Brünken et al. 2008) of the TiO 2 monomer. These quantities are also a result of the DFT calculations, and the derived thermochemical properties depend on them (Eq. C.6). They were therefore used to further constrain which of the candidate combinations of functional and basis set was best suited for modelling TiO 2 clusters. For the vibrational frequencies and the rotational constants, the quality of an individual candidate combination was assessed through a deviation parameter. Because the smallest deviation of the DFT results from the experimental data is desired, these deviation parameters were computed by taking the root sum squared of all relative deviations of the DFT results from the experimental data for the vibrational frequencies and the rotational constants (Table 2, Cols. 5 and 6). For the vibrational frequencies, this parameter (vibrational frequency deviation, VFD) was therefore calculated through and equivalently, for the rotational constants (rotational constant deviation, RCD),  (Table 2). Lastly, the runtime for each monomer calculation on a single node (32 cores) was determined because computational speed and efficiency is desirable.
The thermochemical quantity of interest is the Gibbs free energy of formation ∆ f G • (N), which is needed to compute nucleation rates (see Eq. 9 and 17). The molecular system properties that go into computing the Gibbs free energy are the zero-point energy, the vibrational frequencies, the rotational constants and spin multiplicities (Appendix C). The latter do not impact our results as we assumed closed-shell singlet clusters throughout our calculations.
The candidate combination of functional and basis set that has the smallest deviations from the experimentally known values for the zero-point energy, vibrational levels, and rotational constants was considered optimal. The B3LYP functional in combination with the cc-pVTZ basis set and empirical GD3BJ dispersion  deviates least from the experimental values for the zero-point energy and the vibrational frequencies, and it has the second lowest value for the rotational constant deviation. Therefore, B3LYP/cc-pVTZ represents the most suitable choice for the calibration of approaches with a lower level of complexity (Sect. 2.4.1) and for the final optimisation of our candidate clusters (Sect. 2.5). Additionally, it is desirable to reduce computational cost. In order to achieve this, the fastest configuration that falls within the 4 kJ/mol threshold, the B3LYP functional in combination with the def2svp basis set, was chosen to pre-optimise candidate geometries. This allows the final optimisation with the cc-pVTZ basis set to be completed in fewer steps, leading to overall lower computational cost. The CCSD(T) method is known as the gold standard in computational quantum chemistry and has been the method of choice in quantum chemistry (Čížek 1969;Purvis & Bartlett 1998;Ramabhadran & Raghavachari 2013;Nagy & Kállay 2019). However, these calculations require many computational resources and are prohibitive for larger molecular systems such as nanosized clusters. Therefore, we performed CCSD(T)/6-311+G(2d,2p) single-point calculations for the GM candidates with the smallest cluster sizes of N=1-4, and compared the 0K binding energies with our results from hybrid DFT (B3LYP/cc-pVTZ with empirical dispersion). The CCSD(T) calculation results deviate from experimental results for the binding energy of the monomer by ≈ 90 kJ mol , which is not adequate for our calibration purposes. When the binding energy ratios are compared to the monomer, E bind,N /E bind,1 , the values for our DFT calculations do not differ by more than 2% from the CCSD(T) values. We therefore conclude that calibration of our method with the experimentally determined properties of the monomer is sufficient (Appendix B).

Force fields
In order to save computational time, it is beneficial to be as efficient as possible in the optimisation process towards local and global potential minima for candidate isomers. For example, a titanium-dioxide cluster of size N=12 ((TiO 2 ) 12 ) has 36 atoms and accordingly, 3x36 = 108 degrees of freedom. The number of possible geometries is simply too large to be modelled at a feasible computational cost for a large number of clusters. Therefore, a modelling approach was employed that described the interactions between individual atoms through an interatomic Buckingham pair potential including the Coulomb potential (Appendix A.2). Several parametrisations for Ti-O systems are provided in previous studies, for instance Matsui & Akaogi (1991) or Lamiel-Garcia et al. (2017). However, we find that neither of them are well suited for our purposes as they do not accurately depict the B3LYP/cc-pVTZ energetic ordering within isomers of the same size N. To find a force-field prescription that reflects the experimental data available for TiO 2 to the extent possible, a search was conducted for a set of Buckingham parameters that reproduce the binding energies and average bond distance of the smallest clusters calculated at the start of Sect. 2.4.1.
Approach to (TiO 2 ) N : The program we used to calculate the energy as well as the bond distance is the general utility lattice program GULP (Gale & Rohl 2003). The optimise keyword was used in a constant pressure environment (conp) to optimise the DFT optimised clusters and determine their binding energies for each combination of parameters. The parameters that need to be determined in Eq. A.1 are the charges of Ti and O, and the Buckingham pair parameters A, B, and C for each of the relevant interactions, Ti-Ti, O-O, and Ti-O. Because overall, charge neutrality needs to be conserved, the charge of Ti was directly coupled to the charge of O by a factor of -2, that is, there were two negatively charged oxygen anions for every positively charged titanium cation. For the TiO 2 molecule, we found low formal charges. A Mulliken charge analysis of the smallest cluster sizes (N=1-4) revealed average Ti charges lower than +1e. The parametrisation therefore has ten free parameters, three from each of the interactions: Ti-Ti, O-O, and Ti-O, plus the charge. To determine the ideal parameter set, all interaction parameters as well as the charge were varied freely, and the scipy.optimise differential evolution algorithm (Storn & Price 1997) was used to determine the set of parameters that deviated the least from our calculated binding energies and average bond distances and that simultaneously reproduced the energetic ordering of isomers of the same size (Appendix D.1). With this physically consistent set of parameters, the potential energy of any Ti-O system, that is, any TiO 2 cluster geometry, could now be predicted accurately and required little computational power. This approach of searching for candidate structures with the application of a Buckingham pair potential has been used before, for example in Gobrecht et al. (2018) for aluminium oxide, in Cuko et al. (2017) for hydroxylated silica clusters, and in Lamiel-Garcia et al. (2017) for titanium dioxide. The seed structures were optimised using our reparametrised force field as the first step in our hierarchical approach. To even further increase the structural complexity of our searches, a basin-hopping algorithm as described by Wales & Doye (1997) was employed. It was used in its implementation in the ase Python package (Hjorth Larsen et al. 2017) to optimise the geometries of all the seed structures. The potential energy calculation at each step of the optimisation was made through GULP with our parametrisation of the force field. After the force-field optimisation, the candidate geometries are analysed further in Sect. 2.4.3.

Density-functional based tight-binding method
From the geometry optimisation with the force-field approach (Sect. 2.4.2), 2000 candidate geometries at local potential minima for each small cluster size, N = 3 − 7, and 10000 candidate geometries for each large cluster size, N = 8 − 15, were obtained. However, the Buckingham-Coulomb pair potential does not describe the interaction between electrons and their related orbitals. More precisely, only interactions between Ti cations, O anions, and themselves are taken into account. The interaction of (binding) electrons and the electron correlation is neglected. It still serves the purpose of optimising the candidate geometries on a approximate and simplified PES. The potential energies of the (TiO 2 ) N cluster candidates are not accurate because among other reasons, the force-field approach considers singlepoint charges instead of a charge distribution of each ion in the cluster. It is therefore a reasonable assumption that the binding energies of these clusters and their energetic ordering can be improved by an intermediate optimisation step with a more accurate description accounting for electronic orbitals. This was done to obtain a more accurate energetic ordering of the candidate clusters. The best set of candidate clusters was chosen to be optimised with computationally expensive all-electron DFT calculations (see Section 2.5). For this intermediate step, the DFTB  To enable direct comparison to the results of DFTB+ optimised isomers, the potential energy for each of the candidate geometries was obtained by performing a single-point energy calculation with DFTB+. Next, an identical calculation for atomic Ti and atomic O was performed in order to obtain their respective potential energies. The binding energy of a (TiO 2 ) N cluster can then be calculated by subtracting the individual contributions, analogously to Eq.1, This calculation was performed for each of the candidate geometries obtained from Sect. 2.4.2. Each of the candidate geometries was then optimised using the ConjugateGradient method, making use of its implementation within DFTB+, and was again ordered according to their binding energies calculated with Eq. 4. Because the randomisation of seed structures is not ideal, there were duplicates among the optimised geometries.
In order to filter them out, any two candidates with a bindingenergy difference of ∆E Bind ≤ 0.01eV were analysed with regard to their similarity. To do this, the mean of all inter-atomic distances within the cluster was calculated, giving a size parameter R. If the average inter-atomic distance of the two clusters was closer than ∆R ≤ 0.01Å, the two clusters were found to be duplicates, and one was removed from the process. After the duplicate candidate geometries were removed, the lowest energy clusters for each size were used in our final step, optimisation with all-electron DFT. This is described in Sect. 2.5.

Search for global minima with DFT
The energetically most favourable isomers found with DFTB (see Sect. 2.4.3) were further optimised using Gaussian16. In order to reduce the total computational time, a first optimisation was performed with the B3LYP/def2svp functional and basis set combination with empirical dispersion as described by Grimme et al. (2011). After this step, the final optimisation was made with the B3LYP/cc-pVTZ functional and basis set combination, also with empirical dispersion (Sec. 2.3). For all these isomers, a frequency analysis was carried out using the same functional and basis set, in order to ensure that the isomer was a true minimum and to exclude transition states. In addition, we calculated the final energies as well as rotational constants needed to determine the thermochemical properties we are interested in. The thermochemical properties entropy S • (N) J mol·K , change of enthalpy ddH kJ mol and Gibbs free energy of formation ∆ f G • (N) kJ mol were calculated from the output of the DFT optimisations using the RRHO approximation as implemented within the thermo.pl (Irikura 2002) code.

Force-field optimised clusters
The new parametrisation (Table D.1) for the Buckingham-Coulomb force field was used to optimise the geometric seed structures (Sect. 2.1). The algorithm used for this search is the basin-hopping algorithm described by Wales & Doye (1997). The energy evaluation within this algorithm was made using GULP with the new set of parameters. At each step of the algorithm, a local energy minimisation using the FIRE algorithm (Bitzek et al. 2006) was performed. This algorithm was designed specifically for optimisation of atomistic systems towards their closest local minimum. A numerical temperature of T = 100 within the basin-hopping algorithm was chosen to allow for the exploration of nearby local minima in order to determine the deepest potential well in the vicinity. This temperature is not a physical temperature, but influences the rejection criterium within the algorithm. All seed structures created in Section 2.1 for each cluster size N = 3,...,15 were optimised with this algorithm.

DFTB energy calculations and optimisation
A zero-point energy calculation was performed for all cluster geometries calculated in Sect. 3.1, and their binding energies were calculated according to Eq. 4 (Fig. 3). Then each of these cluster geometries was optimised, minimising their potential energy, with the DFTB description of interactions. The ConjugateGradient optimisation algorithm (Hestenes & Stiefel 1952) was used, and the binding energies of the optimised clusters were calculated again (Fig. 4). A comparison of Figures  3 and 4 demonstrates the approximative character of the forcefield approach, as a seemingly broad variety of local minima for small clusters disappears with the introduction of higher complexity in the form of DFTB. The energy levels become discrete, and many different cluster geometries from the force-field approach produce the same geometry after further optimisation. To find potentially new global minimum (GM) structures, all known literature GM clusters were optimised and their binding energies calculated with the DFTB approach for comparison. For each size, any unique cluster that had a higher binding energy than the known literature GM was categorised as a candidate for a new GM. Additionally, the ten isomers with binding energies closest to that of the literature GM were considered. These isomers (Table 4) were passed on to be optimised with all-electron DFT calculations.

Comparison of candidate creation approaches
In this section, the performance of the different methods for the creation of candidate geometries (see Sec. 2.1) is discussed. Additionally, the necessity for cluster geometries from the literature is assessed to establish whether the method is capable of operating without prior knowledge of particularly favourable cluster geometries. The parameter space that needs to be covered to find all possible geometric configurations increases dramatically with cluster size N. One metric to consider is the number of duplicate isomers. Table 5 lists the number of created candidates and resulting unique clusters for all cluster sizes N we considered. For the small clusters, many duplicates are found because the parameter space is small and many created geometries share a nearby potential minimum. As the size of the clusters grows, the number of identical geometries falls, creating more unique clusters per candidate. A plateau of unique clusters is reached from size 12, at which about 68% of the created clusters are unique. As larger clusters have a larger parameter space for possible geometric configurations, the trend of an increasing number of unique clusters is expected to continue. The resulting plateau therefore signals a limit of the current methods to fully explore the parameter space of clusters larger than N = 12. Two out of the four approaches used in this work rely on known cluster geometries that are reported in the literature in order to produce cluster candidates (Known+1 and Mirror, methods 2 and 3 in Sec. 2.1, referred to as dependent methods), while the other two (Random and Equidistant, methods 1 and 4 in Sec. 2.1, referred to as independent methods) need only information about the monomer. Figure 5 and Table 6 show the 50 best candidates, that is, those with the highest binding energy, and their creation methods. For small clusters N = 3 − 6, the independent methods produce the majority of the 50 clusters with the highest binding energy. These independent methods also find the GM candidates reported in the literature for N = 3 − 6, which are needed for calibration. For larger clusters, the dependent methods perform better. The random generation method only produces one of the 50 energetically most favourable clusters for Article number, page 7 of 20  of clusters after optimisation with the DFTB approach. Individual clusters are colour-coded according to the method of creating the candidate cluster (Sec. 2.1). A random spread along the x-axis has been added to enable comparison between clusters at similar energies.
N > 10. This is because the PES is very large and complex at these sizes and cannot be sufficiently explored by a random walk. Therefore, the methods that contain prior information about stable configurations, such as the dependent methods, show an enhanced performance. The equidistant method is similar because it does not rely on previously reported favourable isomers either. For N > 10, the dependent methods therefore make up far more of the energetically most favourable clusters. More elaborate methods for creating first-guess cluster geometries are available through Gaussian process regression, for example (e.g., Meyer & Hauser (2020)). For the present work, the choice was made to apply simple but fast methods, and comparisons to more complex approaches are desirable in future works.

All-electron DFT calculations
All candidate geometries from Table 4, as well as all literature GM geometries from Berardo et al. (2014) and Lamiel-Garcia et al. (2017), were pre-optimised using Gaussian16 with the B3LYP functional, def2svp basis set, and empirical dispersion. Afterwards, they were optimised and a frequency analysis was performed with the best-performing combination found in Sec. 2.3, B3LYP/cc-pVTZ with empirical dispersion. For all isomers, the binding energies were calculated according to Eq. 4, and their respective geometries can be found in electronic form at the CDS via anonymous ftp to 2 or via 3 . For cluster sizes N = 3 − 10, the GM candidates predicted in the literature were found among the candidate geometries after DFTB optimisation. For cluster size N = 11, the predicted GM was found among the candi-  clusters after optimisation with the DFTB approach. Individual clusters are colour-coded according to the method of creating the candidate cluster (Sec. 2.1). A random spread along the xaxis has been added to enable comparison between clusters at similar energies. date geometries after DFT optimisation. For cluster sizes N = 12, 14, and 15, the predicted GM were not found among the candidate geometries, and no more favourable isomer, that is, with an even lower potential energy, was found either. The reason most likely is that the seed-creation approaches employed here are not well suited to cover the large parameter space for these large clusters within 10000 candidates, as was mentioned in Sec. 3.3. For N = 13, we present a new global minimum candidate structure (Fig. 6), which was created through the mirror creation process (method 3 in Section 2.1). Its potential energy is 0.5 kJ mol per monomer unit lower than the previous lowest-lying isomer reported by Lamiel-Garcia et al. (2017). This difference is lower than the typically assumed accuracy for DFT calculations of 4 kJ mol per monomer unit. This supports the assumption that energetically similar isomers need to be studied for each cluster size because either could play a role in the nucleation process.
In thermochemical processes such as nucleation, the most relevant isomer for each cluster size is assumed to be the energetically most favourable one. Any less favourable, meta-stable isomer that forms relaxes into this global minimum, given sufficient time for the relaxation. To compare the thermochemical properties, the thermo.pl program was used to calculate the entropy S [J mol −1 K −1 ], change of enthalpy d∆H [kJ mol −1 ], and Gibbs free energy of formation ∆ f G • [kJ mol −1 ] for three different sets of clusters: This was done in order to compare the impact of the choice of functional and basis set on the zero-point energies and the thermochemical properties to the impact of the completeness of the GM search for the cluster geometries. The complete thermochemical tables for set 2 are only available in electronic form at the CDS via anonymous ftp to or via 4 .

Astrophysically relevant (TiO 2 ) N properties
In the previous sections, we have derived the fundamental quantities including the zero-point energy of TiO 2 clusters. The thermochemical properties of interest for astrophysical studies are The following sections present TiO 2 in the context of the gas phase, TiO 2 clusters as a promising precursors of TiO2 dust formation, and the derivation of their thermochemical properties, especially the size-and temperature-dependent Gibbs free energy of formation ∆ f G • (N, T ). These properties are relevant for cloud formation modelling in exoplanets and brown dwarfs and also for dust formation modelling in AGB stars and supernova ejecta.

Thermodynamical relevance of TiO 2 in hot Jupiters
In order to assess the relevance of TiO 2 for cloud formation in the atmospheres of hot Jupiters or for dust formation in an AGB envelope, it is necessary to know the thermodynamic quantity ranges in which TiO 2 is the most abundant titanium-bearing molecule. To determine this, we applied the gas-phase equilibrium chemistry code GGChem from Woitke et al. (2018). TiO 2 is the most abundant titanium-bearing molecule at a pressure of 10 −3 bar in the low-temperature regime, that is, for T< 1200 K (Figure 7). Less complex Ti-bearing species dominate the chemical Ti-content with increasing temperatures until Ti + is the dominating Ti species at T > 3500K. Figure 8 presents the GGChem model as a 2D (p gas , T gas ) plane. Four 1D atmospheric p gas , T gas profiles are superimposed. The profiles correspond to a model 3D atmosphere of a hot Jupiter with T eff = 1600K and log(g) = 3, orbiting a G star with T * = 5650K (Baeyens et al. 2021). This model was extrapolated to pressures of 10 −14 bar. The 3D atmosphere model is plotted at four different locations on the planet: the substellar and anti-stellar point, and the morning and evening terminator for this hot Jupiter.
Only at the substellar point for the model atmosphere (T eff = 1600K) is TiO 2 not the most abundant Ti-bearing species at any pressure level, except in a thin layer of the atmosphere at 10  such as TiO and later atomic Ti dominant. For the evening terminator, TiO 2 becomes less abundant than TiO at a pressure level of ∼ 10 −2 bar. Moreover, we note that the (TiO 2 ) N cluster ionisation energies of 9.3-10.5 eV are too high to affect related abundances and the TiO N nucleation in exoplanet atmospheres (Gobrecht et al. 2021b). This strengthens the argument that TiO 2 is the most relevant Ti-bearing molecule in the upper atmosphere, which is were cloud formation is expected to take place.

Surface tension of TiO 2
In classical and modified classical nucleation theory, the impact of the Gibbs free energy of formation ∆ f G • on the nucleation process is modelled through the surface tension σ ∞ of the bulk solid (Gail & Sedlmayr 2013). The surface tension is different for small clusters, due to significantly different geometries and properties than the bulk. This difference is neglected in classical approaches. It is possible, however, to determine a surface tension that is valid for small clusters if individual cluster data are available. Here the approach by Jeong et al. (2000) as applied by Lee et al. (2015) was followed, in which the dependence of the Gibbs free energy of formation on the cluster size is linked to the surface tension σ ∞ through with cluster size N, the Gibbs free energy of formation of a cluster of size N, ∆ f G • (N), the Gibbs free energy of the monomer ∆ f G • (1), the Gibbs free energy of the bulk phase ∆ f G • 1 (s), a fitting factor N f , and Here a 0 is the theoretical monomer radius, which is derived from the bulk density of rutile and the molar mass through The fitting factor N f is set to N f = 0, analogously to Lee et al. (2015). The Gibbs free energies for the monomer and for the bulk are known from experiments (Malcolm W. Chase 1998), as are the values necessary to derive a 0 . The Gibbs free energy of formation of the clusters was calcuated from the thermochemical values derived from our all-electron DFT calculations and consistently compared to the thermochemical clusterdata from the study by Lee et al. (2015). This leaves σ ∞ as the only free parameter, which was fit using Eq. 5. The left side of Figure 9 compares the data from the two functional and basis set combinations used in this work to the results from Lee et al. (2015). It becomes apparent that the choice of the functional and basis set influences the resulting thermochemical properties and therefore the surface tension given by the fit. We obtain a different result for the surface tension than Lee et al. (2015) using their data. The result for the surface tension at T = 1000K using all available GM data and the best-performing functional and basis set combination is σ ∞ = 518 erg cm −2 (blue line in Fig. 9). In the right panel of Figure 9, the comparison is made between the fits for σ ∞ using all available GM data (blue) and only the best candidates that were produced from candidates in this work (red). Both the data and the results for σ ∞ vary only slightly. Visible differences occur for N = 14 and N = 15, where this work did not produce candidates that were identical or energetically close to the literature GM. The best-fit value for σ ∞ without using GM candidates found in the literature is σ ∞ = 525 erg cm −2 , which differs from the best-fit value by only 7 erg cm −2 . This indicates that the choice of functional and basis set has a stronger impact on the resulting surface tension than finding the lowest-energy isomer for all sizes, given the extent of our searches.
Because the potential minimum geometries that were missed all have lower enthalpies and therefore presumably also lower Gibbs free energies, they can only lower the resulting surface tension. This approach therefore gives an upper limit for the surface tension σ ∞ .
The best-fit value for the surface tension is found to be dependent on the temperature. Fig. 10 shows the best fit for a linear dependence of σ ∞ on T for T = 500 − 2000 K, Studies have been conducted on the surface tension of bulk rutile at room temperature (Kubo et al. 2007), finding a value of σ ∞ (298.15K) = 1001 erg cm −2 for the (011) lattice. The best fit for room temperature in this study gives a value of σ ∞ (298.15K) = 575.72 erg cm −2 for small (TiO 2 ) N molecular clusters. The factor of almost two in surface tension for small clusters versus the bulk phase of TiO 2 makes it clear that the prior is recommended in all considerations regarding nucleation processes.

Nucleation rates of TiO 2
To quantify the effect of the updated thermochemical cluster data on quantities relevant for cloud formation in exoplanet atmospheres, the nucleation rate for TiO 2 was calculated along the temperature pressure profile of the morning terminator of a hot Jupiter with an effective temperature of T eff = 1600K and a surface gravity of log g = 3 (blue lines in Fig. 8). The gas-phase composition was calculated with the equilibrium chemistry code GGChem.

Modified classical nucleation theory
In this work, the nucleation rates were computed analogously to Lee et al. (2015) to facilitate comparison. The stationary, homogeneous, homomolecular nucleation rate in classical nucleation theory is calculated by Here, S (T ) and f • (1) are the supersaturation ratio and monomer number density of TiO 2 , respectively. τ gr is the growth timescale, defined as with A(N) = 4πa 2 0 N 2/3 the effective cross-section of a spherical (TiO 2 ) N cluster, n f the monomer number density (n f = f • (1)), the sticking factor α, which is assumed to be α = 1, and the relative velocity v rel , which is given for monomers with mass m x by the thermal velocity through Z(N) is the Zeldovich factor, which accounts for the contribution to nucleation from Brownian motion,  Fig. 9: Gibbs free energy of formation per cluster size N as a function of cluster size N at a temperature of T = 1000K. For each approach, a fit for σ ∞ was calculated using eq. 5. Left: Comparison of the resulting surface tensions for different sources of cluster data. The sources are Lee et al. (2015) for the orange line, DFT calculations with the fast basis set def2svp for the green line, and DFT calculations with the accurate basis set cc-pVTZ for the blue line. Right: Comparison of the impact of isomer completeness on resulting surface tension. Both lines use thermochemical data derived from the accurate cc-pVTZ basis set DFT calculations. For the blue line, the energetically most favoured isomer was chosen for all sizes, regardless of whether it was found by the approach in this paper. For the red line, only outputs from the approach described here were used. As the resulting difference is small, it becomes apparent that the choice of the basis set has a greater effect than the completeness of the cluster geometries. and in the final term, the Gibbs free energy ∆G(N) is approximated using modified classical nucleation theory, or MCNT, giving with θ ∞ from Eq. 6. Equation 9 is evaluated at the critical cluster size N * , which is given by The critical cluster size is mainly influenced by the supersaturation of molecular TiO 2 at various temperature and pressure points. When TiO 2 is not supersaturated (ln S (T ) < 0), N * is negative (Eq 14), and consequently, Eq. 13 has no real solution, leading to the absence of nucleation for the modified classical case at these temperature pressure points. Using Eq. 9, we calculated the classical nucleation rate for the given (p gas ,T gas ) profile using three different values for σ ∞ : the temperature-dependent σ ∞ from this work, the temperature-dependent σ ∞ from Lee et al. (2015), and a constant σ ∞ = 797 erg cm −2 , derived from the re-fit to Lee's cluster data in this work. Results are shown in Figure 11. Nucleation rates derived from updated cluster data are overall more efficient and extend to higher temperatures. The position of the peak nucleation rate within the atmosphere is very similar for the three different surface tensions. However, nucleation stays efficient up until slightly lower pressures for updated cluster data.

Non-classical nucleation theory
If detailed cluster data for all small sizes N are available, the nucleation rate can be computed using individual cluster growth rates. The non-classical nucleation rate is with τ gr from Eq. 10 and f • (N) the number density of a cluster of size N. This can be computed from the partial pressure of the latter through Applying the law of mass action to a cluster of size N gives its partial pressure as with p • (1) the partial pressure of the monomer and the reference pressure p • − = 1bar. The results for the non-classical nucleation rates are shown in Figure 12.

Results for TiO 2 nucleation rates
For the modified classical nucleation approach, three values for the surface tension were compared. The shape of the nucleation rate in Fig. 11 is influenced by several factors. The reason that no nucleation occurs below a temperature of T ≈ 680K lies in the p gas , T gas profile that is used, which does not extend to lower temperatures (Fig. 8). At the upper temperature end, nucleation is limited because it depends on the supersaturation of the TiO 2 monomer. Because supersaturation (S > 1) is required for nucleation and TiO 2 is no longer super-saturated at these high temperatures, no nucleation occurs. For lower temperatures, the surface tension from this work results in a higher nucleation rate than the recomputed surface tension with cluster data from for Lee et al. (2015) by about one order of magnitude. The lower surface tension also causes nucleation to become inefficient at higher temperatures. Overall, the value for the surface tension from this work agrees well with the value derived in Lee et al. (2015). However, using their cluster data, we find a significantly higher surface tension (see Fig 11).
Nucleation rates for non-classical nucleation (Fig. 12) are lower than for MCNT for T = 680 − 1600K by about two orders of magnitude. This is the result of taking individual cluster data for all sizes into account instead of combining all the information into one quantity, the surface tension. Therefore, monomer growth processes from size N to size N + 1 are modelled more accurately. If one of the growth processes is less efficient than others, it will create a bottleneck for the overall nucleation rate. In this case, N + 1 is considered as the critical cluster size N * . At higher temperatures (T > 1600), cluster data from this work give lower nucleation rates than the cluster data from Lee et al. (2015) and MCNT. There is no strict upper temperature limit for non-classical nucleation, as clusters can grow as long as it is energetically favourable for them to do so and as long as nucleating material, that is, TiO2 monomers, are available (Eq. 17). Because our updated cluster-data do predict that nucleation of TiO 2 becomes inefficient at lower temperatures, the nucleation process starts at lower pressure levels, that is, higher up in the model atmosphere.

Conclusion
We have presented a method for determining and optimising the geometries, zero-point energies, and thermochemical properties for clusters. Emphasis was placed on exploring the parameter space of possible geometric configurations for these clusters and on deriving their potential energies and thermochemical properties accurately. This approach was tested for small (N = 3 − 15) (TiO 2 ) N clusters. To ensure thermochemical accuracy, 129 combinations of DFT functionals and basis sets were tested with regard to their accuracy against known experimental data for the TiO 2 monomer. The B3LYP functional with the cc-pVTZ basis set and GD3BJ empirical dispersion were found to closely approach the experimental data and were used for all final optimisation steps and frequency analysis. A new force-field parametrisation of the Buckingham-Coulomb pair potential was presented that more accurately reflects the cluster geometry (i.e. bond lengths) and energetic ordering for small TiO 2 clusters than previous parametrisations. For the DFTB description of interactions, the matsci set of Slater-Koster integrals was found to best reflect the energetic ordering for small clusters and their isomers given by the all-electron DFT calculations. The hierarchical optimisation approach works as intended and produces a large number of energetically low-lying isomers for all sizes. For the smallest clusters N < 7, all global minimum candidates reported in the literature were found with methods that do not rely on using known cluster geometries. Because these are the same cluster sizes as were used to calibrate the less complex descriptions of inter-atomic potentials, this step of the approach is therefore independent of prior knowledge of cluster geometries. The Random approach of seed candidate creation is well suited to search small parameter spaces, such as N < 7. However, for any cluster sizes N > 10, the parameter space is too large to be searched with a fully randomised approach. For these larger cluster sizes, the Mirror and Known+1 approaches produce the largest number of energetically low-lying isomers. However, they require prior knowledge of favourable clusters of size N − 1 for Known+1 and of size N/2 for Mirror. This presents a constraint on the efficient search for clusters of these sizes, as without prior knowledge, the global minima in potential energy will have to be found iteratively, always growing from N to N + 1. The current implementation of the hierarchical approach was able to find all known global minima for N = 3 − 11, as well as a new global minimum candidate for N = 13 that lies 6 kJ mol −1 below the energy of the global minimum candidate structure reported by Lamiel-Garcia et al. (2017). For N = 12, the global minimum known from the literature could not be reproduced, the closest isomer produced lies 2.45 kJ mol per monomer unit higher. For N = 14, this energetic distance between the lowest found isomer and the known global minimum is 5.71 kJ mol per monomer unit, and for N = 15, it is 6.8 kJ mol per monomer unit. This shows that the search method is still incomplete for larger clusters. Because only clusters up to size N = 6 had isomers that were used for calibration, the Mirror approach only had a single literature isomer to generate the clusters for N = 14 and N = 15, which drastically reduced the possible configurations explored by this approach.
To estimate the impact of the new thermochemical data on nucleation processes, the surface tension for small molecular clusters, as it is calculated in modified classical nucleation theory, was investigated. The findings of Lee et al. (2015) cannot be reproduced, raising the best-fit value for σ ∞ for the test case at T= 1000K to σ ∞ = 797 erg cm −2 . The fast basis set def2svp gives σ ∞ = 325 erg cm −2 , while the thermochemically more accurate basis set cc-pVTZ gives σ ∞ = 518 erg cm −2 . The spread between these values is a factor of two, and as the surface tension appears in the exponent of the modified nucleation rate (Eq. 9), this spread is amplified. Because the B3LYP/cc-pVTZ functional and basis set was specifically chosen for its accuracy at modelling the thermochemical properties of small TiO 2 molecular clusters, the updated value for the surface tension represents the currently best approximation. We find that for modified classical nucleation theory, the impact on the nucleation rates by the choice of the functional and basis set used to calculate the thermochemical properties is more important than finding a true global minimum as long as the structure that is used is one of the low-energy isomers. This is not true for non-classical nucleation description, which depends on accurate data for all sizes. The non-classical nucleation process described by the updated cluster data becomes inefficient at lower temperatures, which raises the atmospheric lower border for seed formation through that process. A limitation of this description is that it only allows for homogeneous and homomolecular nucleation, ignoring pathways through cluster-cluster collisions and through species other than TiO 2 .
We have shown that the updated cluster data affect the nucleation rates both in their classical and non-classical description. Providing cluster data for larger clusters N > 15 will allow a more detailed comparison with independent methods, for example with molecular dynamics methods as presented in Köhn et al. (2021). Additionally, the spectroscopic properties of small (TiO 2 ) N clusters, in particular, the frequency-dependent opacities, are interesting because they can allow for dust coagulation processes to be constrained through observations (Köhler et al. 2012). The method will be improved by additional candidate creation methods and applied to other potentially CCN-forming species in the atmospheres of hot Jupiters. prised in the cluster S • atoms , For a cluster of size N = 5, the last term is the sum of the entropies of five Ti atoms and ten oxygen atoms. Calculating the enthalpy of formation of the cluster requires information about the enthalpies of the elements that make up the cluster. The entropies and enthalpies for Ti and O used for these calculations are taken from the JANAF-NIST tables (Malcolm W. Chase 1998).
The enthalpy of formation of the cluster at temperature T is then calculated through with the reference temperature T • − = 0K. The enthalpy of formation at the reference temperature ∆ f H • (T • − ) is the binding energy of the cluster, caluclated for a cluster of size N through The entropy of the system S • cluster is calculated from the internal energy of the system, With the internal energy U(T ) − U(0), the partition function q(V, T ), and the number of atoms in the cluster N atoms . The internal energy is derived from the partition function q(V, T ) through When the rigid-rotor harmonic-oscillator (RRHO) approximation for a polyatomic, non-linear molecule is used, its partition function is T is the temperature of the system, β = 1 k B T , with the Boltzmann constant k B . h is the Planck constant, and c is the speed of light. σ is the symmetry number, correcting for the repeated count of indistinguishable configurations. The unknown inputs in Eq. C.6 are the vibrational levels ν i , the rotational constants A rot , B rot , and C rot , and V, the volume of the particle, which are all outputs of the DFT simulations. For further reading about the calculation of these thermochemical properties, we refer to Jeong (2000). The partition function used here omits the electronic partition function, q el , with the energy levels i, their energies ϵ i and their respective degeneracies g i . This is because Gaussian16 assumes that the first electronic excitation energy is much greater than the thermal energy (ϵ 1 ≫ kT ) and that it is therefore inaccessible at any temperature, resulting in a value of q el = 1 for all clusters and temperatures. In addition, all clusters are assumed to be in a singlet state with g i =1 ∀ i.  N = 3, .., 6) are then compared to the energy of the global minimum isomer of that respective size to obtain the relative energy, equivalent to the approach for the force fields in Sect. D.1. These are again compared to the relative energies that result from the DFT calculations ( Fig.  D.2). Analogously to the left panel in Fig. D.1, the relative energy deviations from the global minimum are plotted for the DFT calculations. An ideal description matches the ordering in both directions and falls onto a slope of 1 (black line). If any clusters fall above the horizontal 0 eV line, the DFTB parametrisation shows a different isomer as the global minimum for this cluster size, which is not desirable. When the plots in Figure D.2 are compared, it becomes apparent that for trans3d ( Fig. D.2b) and tiorg ( Fig. D.2c), there are several isomers for all cluster sizes N = 3 − 6 and N = 3, 5 that fall above the horizontal 0 eV line and therefore represent unlikely global minima candidates according to their DFTB description. For matsci (Fig. D.2a), this is not the case, and the lowest-energy isomer corresponds to the global minimum candidate derived from all-electron DFT calculations for each respective cluster size. Visual inspection also shows that the deviation of individual isomers from the linear function with slope 1 is the smallest for matsci, which means that it reproduces the energetic ordering of the DFT calculations best. In comparison, both trans3d and tiorg have large individual outliers for almost all cluster sizes. In order to quantify the quality of each of the Slater-Koster integral sets, the deviation of their relative energies as compared to the relative all-electron DFT energies is calculated. The sum of these residuals is equivalent to Q E in Sec. D.1 and is given in the plots. The visual inspection is confirmed because the matsci set far outperforms the other two. Therefore, the matsci set of Slater-Koster integrals is used for all DFTB calculations in this work.  Tiorg Slater-Koster integrals. The positions of all isomers on the x-axis are consistent in all three panels. The energy level of the global minimum according to DFT is given by the horizontal blue lines. The relative energetic ordering from the DFT calculations is given by their order from right to left, and the DFTB ordering is given from top to bottom.