3D non-LTE iron abundances in FG-type dwarfs

Spectroscopic measurements of iron abundances are prone to systematic modelling errors. We present 3D non-LTE calculations across 32 STAGGER-grid models with effective temperatures from 5000 K to 6500 K, surface gravities of 4.0 dex and 4.5 dex, and metallicities from $-$3 dex to 0 dex, and study the effects on 171 Fe I and 12 Fe II optical lines. In warm metal-poor stars, the 3D non-LTE abundances are up to 0.5 dex larger than 1D LTE abundances inferred from Fe I lines of intermediate excitation potential. In contrast, the 3D non-LTE abundances can be 0.2 dex smaller in cool metal-poor stars when using Fe I lines of low excitation potential. The corresponding abundance differences between 3D non-LTE and 1D non-LTE are generally less severe but can still reach $\pm$0.2 dex. For Fe II lines the 3D abundances range from up to 0.15 dex larger, to 0.10 dex smaller, than 1D abundances, with negligible departures from 3D LTE except for the warmest stars at the lowest metallicities. The results were used to correct 1D LTE abundances of the Sun and Procyon (HD 61421), and of the metal-poor stars HD 84937 and HD 140283, using an interpolation routine based on neural networks. The 3D non-LTE models achieve an improved ionisation balance in all four stars. In the two metal-poor stars, they remove excitation imbalances that amount to 250 K to 300 K errors in effective temperature. For Procyon, the 3D non-LTE models suggest [Fe/H] = 0.11 $\pm$ 0.03, which is significantly larger than literature values based on simpler models. We make the 3D non-LTE interpolation routine for FG-type dwarfs publicly available, in addition to 1D non-LTE departure coefficients for standard MARCS models of FGKM-type dwarfs and giants. These tools, together with an extended 3D LTE grid for Fe II from 2019, can help improve the accuracy of stellar parameter and iron abundance determinations for late-type stars.


Introduction
Iron is one of the most discussed elements in astrophysics. It is the seventh most abundant element in the Sun and an important source of opacity throughout the solar interior (Bailey et al. 2015; Mondet et al. 2015). Its high cosmic abundance and open 3d shell conspire to make iron dominate the spectra of FG-type dwarfs such as the Sun, and so it is the usual proxy of the stellar metal mass fraction (Z). It is also a precise and illuminating tracer of the evolution of galaxies, being formed copiously by both core-collapse supernovae on short timescales, and Type Ia supernovae at later cosmic epochs (Maoz & Graur 2017). In spectroscopic surveys, including extremely large ones such as APOGEE (Ahumada et al. 2020) and GALAH (Buder et al. 2021), iron is often used to infer other fundamental stellar parameters such as the effective temperature (T eff ) and surface gravity (log g), via excitation and ionisation equilibria of Fe i and Fe ii lines (Frebel et al. 2013; Ruchti et al. 2013; Tsantaki et al. 2013; Ezzeddine et al. 2017; Li & Ezzeddine 2022).
However, measurements of iron abundances, expressed as A (Fe) $= \log_{10}(N_{\rm Fe}/N_{\rm H}) + 12$ or as [Fe/H] $=$ A (Fe) star − A (Fe) Sun , can suffer from systematic modelling errors. Most spectroscopic analyses carried out today do the following: a) employ one-dimensional (1D) hydrostatic model stellar atmospheres that are necessarily in radiative equilibrium at the upper boundary; and b) invoke the assumption of local thermodynamic equilibrium (LTE). Both assumptions can alter Fe i and Fe ii line strengths (Shchukina & Trujillo Bueno 2001; Holzreuter & Solanki 2012). The quantitative 1D LTE systematic errors typically depend on the excitation potential (E low ), and are different for Fe i and Fe ii. Thus they may introduce errors in spectroscopic determinations of T eff and log g as well.
The 1D LTE systematic errors are particularly prominent for warm, metal-poor stars, where A (Fe) determinations are usually underestimated (Amarsi et al. 2016; Nordlander et al. 2017). Here, the reduced metal opacity results in a larger UV flux escaping the deeper photosphere, which drives a severe over-ionisation of Fe i and a slight over-excitation of Fe ii (Shchukina et al. 2005). This non-LTE effect is enhanced by the steeper temperature gradients within metal-poor 3D model stellar atmospheres, as efficient adiabatic cooling makes them much cooler in the upper layers than what is expected from 1D models that are close to radiative equilibrium (Collet et al. 2006, 2011). In the last decade, 3D model stellar atmospheres combined with 3D non-LTE post-processing radiative transfer calculations have increasingly been used for spectroscopic abundance determinations of different elements. For iron this approach has not been utilised much, other than in studies of the Sun (Lind et al. 2017; Asplund et al. 2021) and of a small number of metal-poor stars (Amarsi et al. 2016; Nordlander et al. 2017).
Here we describe recent 3D non-LTE calculations for iron for FG-type dwarfs (Sect. 2). To our knowledge, this is the first grid of its kind. We briefly explore the systematic effects on 1D LTE and 1D non-LTE measurements of A (Fe) across parameter space (Sect. 3), focussing on the 171 Fe i and 12 Fe ii optical lines used in the analysis of the Gaia benchmark stars (GBS; Jofré et al. 2014). We present a routine based on neural networks that interpolates the predicted abundance differences (Sect. 4), and we describe the impact on four standard stars: the Sun, Procyon (HD 61421), HD 84937, and HD 140283 (Sect. 5). We end with a summary of our overall conclusions, and provide grids of 1D non-LTE departure coefficients for FGKM-type dwarfs and giants, as well as the publicly available interpolation routine for 1D LTE versus 3D non-LTE abundance differences (Sect. 6).

Model stellar atmospheres
Post-processing radiative transfer calculations were performed across a subset of the STAGGER grid of 3D hydrodynamic model stellar atmospheres (Magic et al. 2013). The selected models have four values of T eff between 5000 K and 6500 K, two values of log g of 4.0 dex and 4.5 dex (in logarithmic units of cm s$^{-2}$), and four values of [Fe/H] between −3.0 dex and 0.0 dex. The models were computed with the solar chemical composition of Asplund et al. (2009). Prior to carrying out the calculations in this work, the models were trimmed and re-gridded both to reduce the horizontal resolution (to speed up the calculations) and to increase the resolution in the steep continuum-forming regions (to improve accuracy), in the manner described in Sect. 2.1.1 of Amarsi et al. (2018). The calculations were carried out on five snapshots of each model.
Post-processing radiative transfer calculations were also performed on two families of 1D hydrostatic model stellar atmospheres: ATMO models (Magic et al. 2013, Appendix A) and an extended grid of MARCS models (Gustafsson et al. 2008). The ATMO models adopt the same equation of state, raw opacity data, opacity bins, and formal solver as their companion 3D hydrodynamic STAGGER models. These calculations permit a differential determination of 3D versus 1D effects. The calculations on MARCS models were only used for the analysis of Procyon (HD 61421; Sect. 5.2), and also to generate a grid of 1D non-LTE departure coefficients to be made publicly available (Sect. 6). The grid extends over dwarfs and giants of spectral types FGKM; further details are given in Amarsi et al. (2020b).

Model atom
The model atom is based on the model described by Lind et al. (2017), with a number of updates and modifications outlined in Asplund et al. (2021). These updates are summarised here for completeness.
The energies of LS J levels and the bound-bound radiative transition probabilities (log g f ) were taken from the compilation of Lind et al. (2017). Natural line broadening parameters were calculated here consistently with these radiative data. Pressure broadening parameters due to elastic hydrogen collisions were computed by interpolating the tables of Anstee & O'Mara (1995), Barklem & O'Mara (1997), and Barklem et al. (1998). Bound-free radiative transition probabilities were taken from Bautista et al. (2017), based on the R-matrix method. We note that the B-spline R-matrix calculations of Bautista et al. (2017) are likely to be superior and should be adopted in future work. Inelastic collision cross sections remain a significant source of uncertainty (Mashonkina et al. 2019). These were taken from a number of different studies, as summarised in Table 1. The complete data set is too large to permit 3D non-LTE calculations on a reasonable timescale: it consists of 2980 LS J levels of Fe i (up to 0.003 eV below the ionisation limit), 2880 LS J levels of Fe ii (up to 0.007 eV below the ionisation limit), plus the ground state of Fe iii. To make the calculations tractable, fine structure LS J levels were collapsed into LS terms. Moreover, highly excited states are rather efficiently coupled via inelastic hydrogen collisions in our model (according to the description of Kaulakys 1985, 1986, 1991). Therefore, levels within the same spin state and with energies greater than 6 eV above the ground state in Fe i and 8.7 eV above the ground state in Fe ii were collapsed into super-levels with widths of up to ±0.25 eV for Fe i and ±2.5 eV for Fe ii, following the same approach tested for silicon, carbon and oxygen (Amarsi et al. 2019b), nitrogen (Amarsi et al. 2020a), and magnesium and calcium. This resulted in 15 super-levels for Fe i and seven super-levels for Fe ii.
The highest Fe i super-level is 0.1 eV below the ionisation limit, which previous studies suggest is sufficient for providing a realistic collisional coupling to Fe ii (Mashonkina et al. 2011). Lines involving collapsed levels were also collapsed into super-lines. The reduced atom consists of 177 LS levels and super-levels, of which 100 are made up of Fe i and 76 are made up of Fe ii.
References for Table 1: (a) Wang et al. (2018); (b) Bautista et al. (2015); (c) Zhang & Pradhan (1995); (d) Allen (1973); (e) Barklem (2018); (f) Kaulakys (1991); (g) Yakovleva et al. (2019); (h) Lambert (1993).

Non-LTE radiative transfer in 3D and 1D
The post-processing radiative transfer calculations were performed with Balder (Amarsi et al. 2018). This code has its roots in Multi3D (Leenaarts & Carlsson 2009) with updates in particular to the opacity package and formal solver (Amarsi et al. 2016, 2019a). Continuum Rayleigh scattering from hydrogen as well as Thomson scattering from electrons were included as described in Amarsi et al. (2020b), but, in contrast to that work, background lines were treated in pure absorption. The calculations employed the 8-point Lobatto quadrature over µ = cos θ and the 4-point trapezoidal integration over φ across the unit sphere, as described in Sect. 2.2 of Amarsi et al. (2018) (see also Amarsi & Asplund 2017, Sect. 2.1). The emergent synthetic spectra (Sect. 2.4) were based on a formal solution using the 7-point Lobatto quadrature over µ = cos θ and 8-point trapezoidal integration over φ across the unit hemisphere. The non-LTE solution was deemed to have converged when the emergent intensities changed by less than a factor of $10^{-3}$ between successive iterations. The calculations on the 1D model stellar atmospheres (ATMO and MARCS models) were performed in almost the same way as the calculations on the 3D models. In particular they employed the same code, Balder, and the same model atom.
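The angular quadrature described above can be illustrated with a short sketch. The following Python code is not taken from Balder; it is a minimal, assumed implementation that builds Gauss-Lobatto nodes and weights in µ by Newton iteration on the derivative of a Legendre polynomial, and combines them with equally weighted (periodic trapezoidal) points in φ to evaluate a mean intensity over the unit sphere. The function names and the test intensity field are hypothetical.

```python
import math

def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) for n >= 1 via the three-term recurrence."""
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p, p_prev

def lobatto_nodes_weights(npts):
    """Gauss-Lobatto nodes and weights on [-1, 1] for npts >= 2 points."""
    n = npts - 1
    nodes = [-1.0]
    for j in range(1, n):
        x = -math.cos(math.pi * j / n)  # Chebyshev-Lobatto initial guess
        for _ in range(100):            # Newton iteration on P_n'(x) = 0
            p, p_prev = legendre_pair(n, x)
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            d2p = (2.0 * x * dp - n * (n + 1) * p) / (1.0 - x * x)
            step = dp / d2p
            x -= step
            if abs(step) < 1e-15:
                break
        nodes.append(x)
    nodes.append(1.0)
    weights = [2.0 / (n * (n + 1) * legendre_pair(n, x)[0] ** 2) for x in nodes]
    return nodes, weights

def mean_intensity(intensity, n_mu=8, n_phi=4):
    """J = (1/4pi) * integral of I(mu, phi) over the unit sphere."""
    mus, w_mu = lobatto_nodes_weights(n_mu)
    dphi = 2.0 * math.pi / n_phi        # periodic trapezoidal rule in phi
    total = 0.0
    for mu, w in zip(mus, w_mu):
        for k in range(n_phi):
            total += w * dphi * intensity(mu, k * dphi)
    return total / (4.0 * math.pi)

# Hypothetical smooth intensity field; its exact mean over the sphere is 4/3.
def toy_intensity(mu, phi):
    return 1.0 + mu * mu + math.sqrt(max(0.0, 1.0 - mu * mu)) * math.cos(phi)

nodes, weights = lobatto_nodes_weights(8)
j_mean = mean_intensity(toy_intensity)
```

An n-point Lobatto rule integrates polynomials in µ up to degree 2n − 3 exactly, which is one reason such rules are attractive when the end points µ = ±1 must be included.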
The key difference is that the 1D approach included a depth-independent, isotropic microturbulence (ξ mic;1D ). This parameter broadens spectral lines; the physical interpretation is that it reflects temperature and velocity gradients on scales much smaller than one optical depth (Gray 2008, Chapter 17). These effects are naturally accounted for in the 3D models, and thus no analogous broadening was adopted for those calculations. Therefore, the 1D versus 3D abundance differences (Sect. 2.4) are functions of the ξ mic;1D parameter adopted in the 1D models. The depth-independent ξ mic;1D has been treated as an additional dimension in the theoretical grids presented in Sect. 3 and in Sect. 4; in addition, it is fixed to a representative literature value in Sect. 5. To reduce the dimensionality of the parameter space, one could attempt to calibrate ξ mic;1D on the 3D models (Steffen et al. 2013; Vasilyev et al. 2018). However, a depth-independent, isotropic ξ mic;1D is an imperfect descriptor of the dynamical properties of stellar atmospheres. This is in part because temperature and velocity gradients decrease with increasing geometrical height in the line-forming regions, which implies that a smaller ξ mic;1D is appropriate for lines that form higher up in the atmosphere (Steffen et al. 2013, Fig. 2). At the same time, light from the stellar limb originates from stellar granules that are seen edge-on, characterised by large temperature and velocity gradients, which implies a larger ξ mic;1D when sampling more light from the stellar limb (Takeda 2022). Thus, the calibrated ξ mic;1D is a function of the formation depths and centre-to-limb variations of the particular set of lines used for the analysis. Consequently, a proper calibration would require increasing the dimensionality of the parameter space, rather than decreasing it. As such, for simplicity, a 3D-calibrated ξ mic;1D has not been adopted here.
In any case, in the present study, the 1D results mainly serve as an intermediate step to obtain 3D non-LTE abundances through the application of abundance differences.
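To give a sense of scale for the microturbulent broadening discussed above, the sketch below evaluates the standard textbook Doppler width, in which the thermal velocity and ξ mic;1D add in quadrature. This expression is not quoted in the text and is not necessarily how Balder implements the broadening; the function name and the example temperature and wavelength are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053906660e-27  # atomic mass unit, kg
C = 2.99792458e8         # speed of light, m/s
M_FE = 55.845 * AMU      # mean atomic mass of iron, kg

def doppler_width_nm(wavelength_nm, temperature_k, xi_kms):
    """Doppler width (lambda/c) * sqrt(2kT/m + xi^2) for an iron line."""
    v_thermal = math.sqrt(2.0 * K_B * temperature_k / M_FE)  # m/s
    v_total = math.hypot(v_thermal, xi_kms * 1.0e3)          # quadrature sum
    return wavelength_nm * v_total / C

# Illustrative values only: a 500 nm line at 5772 K.
width_no_micro = doppler_width_nm(500.0, 5772.0, 0.0)
width_with_micro = doppler_width_nm(500.0, 5772.0, 1.0)
```

For iron at photospheric temperatures the thermal velocity is of order 1 km/s, so a microturbulence of 1 km/s changes the Doppler width appreciably.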
In the post-processing calculations for the 3D models, A (Fe) was kept consistent with the value used to construct the model stellar atmospheres: in other words, [Fe/M] = 0.0. This was also the case for the calculations on 1D MARCS models. For the 1D ATMO models, iron was treated as a trace element such that, for a given [M/H], calculations were performed for nine values of A (Fe), instead of just one value as for the 3D models. These values correspond to [Fe/M] between −1.0 dex and +1.0 dex in uniform steps of 0.25 dex.

Synthetic spectra and abundance differences
The emergent synthetic spectra from the model stellar atmospheres were calculated after the non-LTE populations had converged. The non-LTE populations of the reduced LS model atom were first redistributed onto the full LS J data set (Sect. 2.2). This was done by multiplying the LS J LTE populations by the LS departure coefficients (e.g. Amarsi et al. 2016). This is valid in the limit of large collisional rates among fine-structure levels as well as within super-levels. Following this, a formal solution in the outgoing direction was performed to obtain the emergent intensity in different directions (Sect. 2.3), from which the emergent disc-averaged intensity (the emergent stellar flux) was determined.
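The redistribution step described above amounts to a simple multiplication. The sketch below illustrates it for one depth point, assuming each fine-structure level carries a label for its parent LS term (or super-level); the level names, populations, and departure coefficient are invented for illustration and are not values from the model atom.

```python
def redistribute(lte_pops_lsj, parent_term, departure_ls):
    """Non-LTE LSJ populations: parent-term departure coefficient times LTE population."""
    return {
        level: departure_ls[parent_term[level]] * n_lte
        for level, n_lte in lte_pops_lsj.items()
    }

# Hypothetical example: two fine-structure levels sharing one LS term.
lte_pops_lsj = {"a5D_4": 1.0e10, "a5D_3": 7.0e9}  # LTE populations (arbitrary units)
parent_term = {"a5D_4": "a5D", "a5D_3": "a5D"}    # LSJ level -> LS term
departure_ls = {"a5D": 0.8}                       # b = n_NLTE / n_LTE of the term

nlte_pops_lsj = redistribute(lte_pops_lsj, parent_term, departure_ls)
```

Because all levels within a term share one departure coefficient, their relative populations remain at the LTE (Boltzmann) ratio, which is the assumption stated in the text.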
Emergent spectra were only calculated for a small number of Fe i and Fe ii lines. This study employs the 'golden' line list presented in Jofré et al. (2014), adopting the energies, log g f , and pressure broadening parameters given in their Tables 4 and 5.
The log g f and E low for this set of lines are illustrated in Fig. 1. Since this line list was constructed to be useful for stellar spectroscopy, lines of larger E low , which absorb less owing to the Boltzmann factor, tend to have larger log g f , giving rise to a linear trend in Fig. 1. All of the lines shown are in the optical regime: λ Air = 478.783 nm to 681.026 nm.
Equivalent widths were determined via direct integration. A complication arose because the line profiles were calculated simultaneously, rather than in a line-by-line fashion. This means that different lines may overlap with each other. Moreover, the extent of the overlap is a function of stellar parameters, as illustrated in Fig. 2. When the overlapping lines are of different E low or of different species, this can add some noise to the analysis and complicate the interpretation. Thus, the discussion of the abundance differences across parameter space (Sect. 3) is restricted to lines that do not overlap with each other. Specifically, the flux depression at the two integration limits (which are generally set at the wavelengths in between the neighbouring line cores) is required to be less than 1% of the continuum.
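As a sketch of the direct integration described above, the following code integrates the flux depression of a normalised profile with the trapezoidal rule and converts the result to the reduced equivalent width log W/λ. The Gaussian test profile, its depth and width, and the function names are assumptions made for illustration; the actual profiles come from the radiative transfer calculations.

```python
import math

def equivalent_width(wavelengths, norm_flux):
    """Trapezoidal integral of (1 - F/F_continuum) over wavelength."""
    w = 0.0
    for i in range(len(wavelengths) - 1):
        d0 = 1.0 - norm_flux[i]
        d1 = 1.0 - norm_flux[i + 1]
        w += 0.5 * (d0 + d1) * (wavelengths[i + 1] - wavelengths[i])
    return w

def reduced_equivalent_width(w, wavelength):
    """log10(W / lambda), with W and lambda in the same units."""
    return math.log10(w / wavelength)

# Hypothetical Gaussian line: centre 500 nm, central depth 0.3, sigma 0.005 nm.
lam0, depth, sigma = 500.0, 0.3, 0.005
wl = [lam0 - 0.05 + i * 5.0e-5 for i in range(2001)]
flux = [1.0 - depth * math.exp(-0.5 * ((x - lam0) / sigma) ** 2) for x in wl]

# Blend check from the text: depression at both integration limits below 1%.
is_unblended = (1.0 - flux[0] < 0.01) and (1.0 - flux[-1] < 0.01)

w_nm = equivalent_width(wl, flux)
log_w_lambda = reduced_equivalent_width(w_nm, lam0)
```

For a Gaussian profile the result can be checked against the analytic value, depth × sigma × sqrt(2π); for this example log W/λ falls inside the weak-line range −6.9 < log W/λ < −4.9 used in Sect. 3.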
Given 1D LTE, 1D non-LTE, and 3D non-LTE equivalent widths as a function of A (Fe), 1D LTE versus 3D non-LTE abundance differences (∆A 1L-3N ) and 1D non-LTE versus 3D non-LTE abundance differences (∆A 1N-3N ) were then evaluated. These quantities are formally abundance errors: a negative value indicates a positive abundance correction, or in other words that the 3D non-LTE model predicts a larger A (Fe) than the 1D LTE or 1D non-LTE model.
[...] leaving the linear part of the curve of growth and becoming partially saturated as log W/λ increases. When this happens first in 1D LTE, a large increase in A (Fe) 1L is needed to match A (Fe) 3N . This results in large, positive values of ∆A 1L-3N . Vice versa, when partial saturation first occurs in the 3D non-LTE model, one finds large, negative values of ∆A 1L-3N . At even larger log W/λ ≳ −4.8, both 1D LTE and 3D non-LTE lines are partially saturated, and the behaviour of the abundance differences again reflects that in the weak-line regime. Further discussion of this effect may be found in Sect. [...]
The main effect of ξ mic;1D in Fig. 3 can be understood in terms of the behaviour of ∆A 1L-3N with log W/λ. Adopting a smaller ξ mic;1D would mean that the 1D LTE lines become partially saturated and leave the linear part of the curve of growth before the 3D non-LTE lines (first panel of Fig. 3). Conversely, adopting a larger ξ mic;1D desaturates the 1D LTE lines; they leave the linear part of the curve of growth and become partially saturated after the 3D non-LTE lines (lower two panels of Fig. 3). The effect of ξ mic;1D is smaller for lines with log W/λ ≲ −4.8 and log W/λ ≳ −4.8. In the former case (the weak-line regime), the line core is not yet saturated and so does not suffer so strongly from ξ mic;1D ; while in the latter case (strong lines), the equivalent widths are dominated by the area spanned by extended pressure broadened wings.
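The definition of the abundance differences can be made concrete with a small sketch. Assuming tabulated curves of growth (log W versus A (Fe)) for the 1D LTE and 3D non-LTE models, the code below takes the 3D non-LTE equivalent width at a given A (Fe) 3N , finds the 1D LTE abundance that reproduces it, and returns ∆A 1L-3N = A (Fe) 1L − A (Fe) 3N . The linear curves of growth, the 0.1 dex offset, and the function names are invented for illustration.

```python
def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("value outside the tabulated range")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])

def abundance_difference(a_3n, abund_grid, logw_1l, logw_3n):
    """Delta A (1L-3N): 1D LTE abundance matching the 3D non-LTE line, minus a_3n."""
    logw_target = interp(a_3n, abund_grid, logw_3n)  # 3D non-LTE line strength
    a_1l = interp(logw_target, logw_1l, abund_grid)  # invert 1D LTE curve of growth
    return a_1l - a_3n

# Hypothetical weak-line curves of growth: the 3D non-LTE line is 0.1 dex weaker.
abund_grid = [6.5 + 0.25 * i for i in range(9)]
logw_1l = [a - 12.0 for a in abund_grid]
logw_3n = [a - 12.1 for a in abund_grid]

delta_1l_3n = abundance_difference(7.5, abund_grid, logw_1l, logw_3n)
```

In this toy case the result is negative, which under the sign convention of the text means that the 3D non-LTE model implies a larger A (Fe) than the 1D LTE model.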

Partially saturated lines
The effects described above are 3D effects. As such, ∆A 1N-3N displays a qualitatively similar behaviour to ∆A 1L-3N . The behaviour is also qualitatively similar for other values of log g and [M/H] not shown in the figures.
For clarity and brevity, the remaining parts of Sect. 3 are based on lines with −6.9 < log W/λ < −4.9, and the choice of ξ mic;1D = 1 km s$^{-1}$. For this choice of ξ mic;1D , Fig. 3 indicates that the 1D LTE and 3D non-LTE model lines leave the linear part of the curve of growth at roughly the same point. By adopting this ξ mic;1D , and only considering weak lines, the discussion and interpretation of the abundance differences is greatly simplified.

Weak Fe i lines
At [M/H] = 0, for ξ mic;1D = 1 km s$^{-1}$, the upper left panels of Fig. 5 show that for weak Fe i lines, ∆A 1L-3N ranges from −0.15 dex to +0.10 dex. At [M/H] = −3, the right panels show that ∆A 1L-3N reaches −0.5 dex at high T eff and low log g, and +0.2 dex at low T eff , at least for the selection of Fe i lines considered.
These abundance differences are due to the combination of 3D and non-LTE effects that are rather complicated to disentangle (Sects. 3.2 and 3.3 of Amarsi et al. 2016). The 3D effects can be due to differences in the mean model stellar atmosphere stratifications, the presence of inhomogeneities, and horizontal radiative transfer. These various phenomena are complicated, with the 3D models being either shallower or steeper on average than the 1D models depending on the stellar parameters, and with the sign of ∆A 1N-3N and ∆A 1L-3N varying across the granulation pattern and line formation height (Holzreuter & Solanki 2013). They also depend on the particular choice of ξ mic;1D , with larger values shifting ∆A 1L-3N and ∆A 1N-3N downwards. The non-LTE effects on Fe i, on the other hand, can be due to the over-ionisation of the minority species that would act to increase the inferred A (Fe), or photon losses in the line cores that act to reduce it, broadly speaking (Kostik et al. 1996). This competition can be seen in the comparison of the 3D LTE and 3D non-LTE profiles in the upper right panel of Fig. 2, where the 3D non-LTE profile is narrower due to over-ionisation, but has a similar core flux depression due to photon losses.
The 3D and non-LTE effects may couple to each other in complex ways. Nevertheless, the overall trend is that ∆A 1L-3N and ∆A 1N-3N are usually the most negative for the high T eff , low log g, and, at least for higher T eff , low [M/H] models (as expected from previous studies; Sect. 1). These trends are more clearly visible in Fig. 6 and Fig. 7, which show the median results for Fe i lines of low and high excitation potential. The left panels of Fig. 7 show that there are peaks in ∆A 1L-3N and ∆A 1N-3N at [M/H] = −2 and T eff = 5000 K, for Fe i lines of low excitation potential. This reflects the competition between (reverse) granulation effects (driving weaker lines in 3D corresponding to more positive ∆A 1L-3N and ∆A 1N-3N ) and the effect of the lower temperatures in the upper layers of the 3D models (driving stronger lines in 3D corresponding to more negative ∆A 1L-3N and ∆A 1N-3N ), with the latter effect becoming much more pronounced at [M/H] = −3.
Over-ionisation is certainly larger for such stars, because their escaping UV flux is larger, and also because more compact stars have more efficient collisional rates that help counteract these effects. At smaller [M/H], photon losses are also typically weaker due to the smaller strengths of the Fe i lines. The 3D effects might in general also be expected to be more severe for such stars, with larger granulation contrasts and thus larger fluctuations in the atmosphere (Magic et al. 2013). In general, as noted above, this does not immediately imply more negative values of ∆A 1N-3N and ∆A 1L-3N ; although, at low [M/H] the steeper mean stratifications of the 3D models act to enhance the non-LTE over-ionisation (Nordlander et al. 2017). These trends with E low imply that 3D non-LTE effects are implicit in 1D LTE spectroscopic determinations of T eff that are based on the excitation balance of Fe i lines. It is beyond the scope of the present study to attempt to derive grids of T eff corrections here. Such an effort would be complicated by the nonlinear behaviour of ∆A 1L-3N with E low and the strong dependence on log W/λ (Sect. 3.1), such that the extent of the T eff errors would depend on the particular set of Fe i lines under consideration.
[...] log g; it gives rise to flat trends in the right panels of Fig. 6. The non-LTE effects in Fe ii lines are not significant at [M/H] = 0, and so ∆A 1N-3N ≈ ∆A 1L-3N . These abundance differences are thus primarily driven by 3D LTE effects, which are discussed in Amarsi et al. (2019b).

Weak Fe ii lines
There are, however, non-LTE effects for the warmest stars at the lowest [M/H]. This can be seen via the subtle differences between the upper and lower right panels of Fig. 5, which indicate small 1D non-LTE effects. We note that Fe ii lines can suffer from over-excitation and become weaker relative to 1D LTE (translating to larger inferred A (Fe) in non-LTE). The non-LTE effects on Fe ii lines are even larger in the metal-poor 3D models due to their steeper gradients (Shchukina et al. 2005). The 3D LTE versus 3D non-LTE abundance differences can be estimated by using the logarithmic ratios of equivalent widths. They are the most negative for the high T eff , low log g, and low [M/H] models (for reasons similar to those given in Sect. 3.2). At T eff ≈ 6500 K, log g = 4.0, and [M/H] = −3, these differences are −0.09 ± 0.05 dex, averaging over all the Fe ii lines. The differences are already much less severe for the model with similar T eff and log g but with [M/H] = −2 dex, where they amount to −0.02 ± 0.01 dex. Fig. 5 illustrates that for the cooler models, ∆A 1L-3N typically has the same sign for Fe ii lines as for Fe i lines of 2 ≲ E low /eV ≲ 5; usually they are both negative. The consequence of this is that 1D LTE analyses of lines of either of these species may underestimate A (Fe). On the other hand, the fact that the abundance differences for Fe i and Fe ii usually have the same sign helps to counteract the systematic errors on 1D LTE spectroscopic determinations of log g to some extent, at least for T eff ≲ 6000 K.
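The estimate based on logarithmic ratios of equivalent widths can be sketched as follows. On the linear part of the curve of growth W scales with the number of absorbers, so the 3D LTE versus 3D non-LTE abundance difference is approximately log10(W 3N / W 3L ) at fixed A (Fe). The sign convention below is an assumption chosen to match the abundance-error convention of Sect. 2.4 (negative when the non-LTE line is weaker), and the numbers are illustrative only.

```python
import math

def weak_line_abundance_difference(w_lte, w_nlte):
    """Approximate A(LTE) - A(non-LTE) for a weak line from equivalent widths
    computed at the same abundance (linear curve of growth assumed)."""
    return math.log10(w_nlte / w_lte)

# Hypothetical Fe II line that is weaker in 3D non-LTE than in 3D LTE.
w_3l = 10.0                  # 3D LTE equivalent width (arbitrary units)
w_3n = 10.0 * 10.0 ** -0.09  # 3D non-LTE equivalent width
delta_3l_3n = weak_line_abundance_difference(w_3l, w_3n)
```

This shortcut breaks down once lines become partially saturated, where a full curve-of-growth inversion is needed instead.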

Interpolation in stellar and line parameters
The derived values of ∆A 1L-3N illustrated above (Sect. 3) were used to analyse standard stars, the results of which are presented in the following section (Sect. 5). To do so, it was necessary to interpolate these abundance differences as a function of stellar parameters (T eff , log g, A (Fe) 3N , and ξ mic;1D ) and line parameters (E low , E up , and log g f ).
Several different approaches based on simple linear and spline interpolation and more sophisticated machine-learning algorithms were explored. Ultimately, following Wang et al. (2021), the interpolation model was based on multi-layer perceptrons (MLPs). An MLP is a feed-forward, fully connected neural network that connects the input and output layers through n l hidden layers. Each hidden layer contains n neurons, and each neuron is connected to all the neurons in the previous layer with weights fitted through back-propagation training. The models were constructed using the MLPRegressor class from the scikit-learn Python library (Pedregosa et al. 2012).
Three different MLP models were built, separating the data into three groups: Fe i lines with E low < 2 eV; Fe i lines with E low > 2 eV; and Fe ii lines. This was found to give more reliable results than building a single MLP model for these three groups. The tolerance was set to $10^{-5}$, and the activation function used was the Rectified Linear Unit function (ReLU). Via a 5-fold cross-validation scheme, the network size was set to n l = 3 and n = 200, and the L2 regularisation term (that prevents overfitting) was set to α = $5 \times 10^{-4}$, $7 \times 10^{-4}$, and $10^{-2}$ for the three different MLP models, respectively. Via a k-fold cross-validation scheme using the MLP models, a select number of severe outliers, reflecting the difficulties in determining equivalent widths (Sect. 2.4), were identified; the MLP models were then rebuilt, with these data removed.
To evaluate the performance of the MLP models, the datasets were randomly divided into a training set (80% of the data) and a test set (20% of the data). The three MLP models were trained on their respective training sets, and then applied to their respective test sets. The overall performance of the interpolation routine is illustrated in Fig. 8, where the histogram shows that the residuals of the three MLP models have standard deviations of 0.024, 0.017, and 0.016 dex, respectively. They reflect the expected 1 σ interpolation error when looking at one Fe i or Fe ii line in one star. These interpolation errors translate to scatter in plots of line-by-line A (Fe) and average out when using sufficiently large numbers of Fe i and Fe ii lines. Moreover the scatter plot in Fig. 8 shows that the largest interpolation errors tend to occur where ∆A 1L-3N is larger.
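The configuration and evaluation described above might look roughly like the following sketch for one of the three models. Only the hyperparameters quoted in the text (three hidden layers of 200 neurons, ReLU activation, tolerance and α) come from the source; the synthetic data, the rescaling of inputs to the unit interval, the iteration limit, and the random seeds are assumptions, and the target function is an arbitrary smooth stand-in rather than real abundance differences.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the grid: seven inputs (Teff, log g, A(Fe), xi_mic,
# E_low, E_up, log gf) rescaled to [0, 1]. The target is NOT real data.
X = rng.uniform(0.0, 1.0, size=(400, 7))
y = 0.3 * X[:, 0] - 0.2 * X[:, 2] * X[:, 4] + 0.05 * np.sin(3.0 * X[:, 6])

# 80% training set, 20% test set, as in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = MLPRegressor(
    hidden_layer_sizes=(200, 200, 200),  # n_l = 3 hidden layers, n = 200 neurons
    activation="relu",
    alpha=5e-4,       # L2 term quoted for the first of the three models
    tol=1e-5,
    max_iter=200,     # assumption: not stated in the text
    random_state=0,
)
model.fit(X_train, y_train)

# Residuals on the held-out test set; their standard deviation plays the
# role of the 1 sigma interpolation error.
residuals = model.predict(X_test) - y_test
sigma_residual = float(np.std(residuals))
```

In practice one such model would be trained per line group (low-excitation Fe i, high-excitation Fe i, and Fe ii), each with its own α.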
The input parameters of the interpolation routine are the stellar parameters and line parameters listed above: T eff , log g, A (Fe) 3N , and ξ mic;1D as well as E low , E up , and log g f . The interpolation routine also returns results for stars outside of the grid of stellar parameters (T eff between 5000 K and 6500 K, log g between 4.0 dex and 4.5 dex, and A (Fe) 3N between 4.5 dex and 7.5 dex) and line parameters (see Fig. 1 and Sect. 2.4). In such cases the returned results are intrinsically more uncertain, and should be treated with caution. The possible impact of interpolation and extrapolation in line parameters is discussed more in Sect. 5.4 in the context of the application to standard stars. The input parameter A (Fe) 3N is the 3D non-LTE iron abundance. Since this is usually unknown, in principle this routine needs to be applied iteratively. In practice, when the abundance differences are small or do not strongly vary with [M/H], it is safe to make the approximation A (Fe) 3N ≈ A (Fe) 1L , thus avoiding any iterations and allowing for a quick estimate of the abundance correction (which is given by the negative of ∆A 1L-3N ). When the abundance differences are larger, there can be a notable difference in the final result. For example, in the analysis of the star HD 84937 (Sect. 5), the results for individual Fe i lines changed by up to 0.05 dex when iterations were neglected, and the mean difference changed by 0.016 dex. These differences are, however, within the overall uncertainty in A (Fe) for this star. Nevertheless, in the present work, the routine was always applied iteratively to avoid these small systematic errors.
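The iterative application described above is a fixed-point problem: A (Fe) 3N = A (Fe) 1L − ∆A 1L-3N (A (Fe) 3N ). A minimal sketch is given below, using the 1D LTE abundance as the first guess. The function `toy_delta` is a hypothetical stand-in for the interpolation routine (with all other stellar and line parameters held fixed), and the tolerance is an assumption.

```python
def solve_a3n(a_1l, delta_fn, tol=1e-4, max_iter=50):
    """Iterate A_3N = A_1L - delta_fn(A_3N), starting from A_3N = A_1L."""
    a_3n = a_1l
    for _ in range(max_iter):
        a_new = a_1l - delta_fn(a_3n)
        if abs(a_new - a_3n) < tol:
            return a_new
        a_3n = a_new
    raise RuntimeError("abundance iteration did not converge")

def toy_delta(a_3n):
    """Hypothetical abundance difference that varies slowly with A(Fe)."""
    return -0.2 + 0.1 * (a_3n - 5.0)

a_1l = 5.26
a_3n_iterated = solve_a3n(a_1l, toy_delta)
a_3n_single = a_1l - toy_delta(a_1l)  # no iteration: assume A_3N ~ A_1L
```

Comparing `a_3n_iterated` with `a_3n_single` shows the size of the error made by skipping the iteration; it grows with the slope of the abundance difference with respect to A (Fe).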
To compare between 1D non-LTE and 3D non-LTE A (Fe), an analogous interpolation routine was constructed to correct for 1D non-LTE effects. To facilitate a fair comparison, this was based on the calculations from the 32 ATMO models, rather than the finer grid of MARCS models (Sect. 2.1). This 1D non-LTE interpolation routine uses the same input parameters as the 3D non-LTE one discussed above. When applying this interpolation routine, it was assumed that the 1D LTE and 1D non-LTE values of ξ mic;1D were identical. In practice, calibrations of ξ mic;1D based on flattening trends with a (reduced) equivalent width can be sensitive to non-LTE effects up to the order of 10% (Amarsi et al. 2016).

Stellar parameters
This section demonstrates the possible impact of ∆A 1L-3N in practice. For this purpose, the Sun and the standard stars Procyon (HD 61421), HD 84937, and HD 140283 were selected for analysis. These stars are interesting in this context because they are located close to different edges of the grid of ∆A 1L-3N in T eff and [M/H]. Moreover, they are all in the original GBS catalogue, implying that they have relatively well-constrained stellar parameters, and that they are of astrophysical importance for calibrating and validating spectroscopic studies (Pancino et al. 2017; Buder et al. 2021). Lastly, the Sun and the two metal-poor stars (Amarsi et al. 2016) have previously been studied using the same or similar model atoms, and thus serve as a rough consistency check of the grid-based approach adopted here. The adopted stellar parameters of these stars are given in Table 2. Where possible, the parameters were updated to account for the most recent and reliable measurements. These stars were analysed in slightly different ways, as discussed below (Sect. 5.2).

Analysis
The Sun was analysed in the simplest way. Line-by-line 1D LTE A (Fe) values measured in the solar flux atlas of Kurucz et al. (1984) were taken from Allende Prieto et al. (2002). The analysis was restricted to weak lines with log W/λ < −4.9 because partially saturated lines are more sensitive to uncertainties in the equivalent widths and also correspond to larger values of ∆A 1L-3N (Sect. 3.1). The A (Fe) values for the Fe ii lines were corrected for the more precise values of log g f provided in Meléndez & Barbuy (2009). The 1D LTE values were then corrected to obtain 3D non-LTE values, using the interpolation routine described in Sect. 4. The interpolation routine uses the 3D non-LTE A (Fe) as an input parameter, and returns the 1D LTE versus 3D non-LTE abundance difference. Since the 3D non-LTE A (Fe) is initially unknown, the interpolation routine was applied iteratively, using the 1D LTE value as the first guess.
Table 4 (partial; only the rows for the two metal-poor stars are recoverable). A (Fe) from Fe i and Fe ii lines in 1D LTE, 1D non-LTE, and 3D non-LTE.

Name         1D LTE                      1D non-LTE                  3D non-LTE
             Fe i          Fe ii         Fe i          Fe ii         Fe i          Fe ii
HD 84937     5.26 ± 0.07   5.40 ± 0.03   5.43 ± 0.07   5.38 ± 0.02   5.61 ± 0.07   5.50 ± 0.02
HD 140283    4.93 ± 0.05   5.05 ± 0.03   5.08 ± 0.05   5.03 ± 0.03   5.23 ± 0.05   5.18 ± 0.02

Table 5. Ionisation imbalance A (Fe) Fe i − A (Fe) Fe ii in 1D LTE, 1D non-LTE, and 3D non-LTE. Errors reflect statistical and stellar parameter uncertainties (not model uncertainties).

Name         1D LTE          1D non-LTE      3D non-LTE
Sun           0.06 ± 0.02     0.04 ± 0.02    −0.01 ± 0.02
Procyon      −0.07 ± 0.06    −0.04 ± 0.07     0.04 ± 0.07
HD 84937     −0.14 ± 0.07     0.05 ± 0.07     0.11 ± 0.08
HD 140283    −0.12 ± 0.06     0.05 ± 0.06     0.05 ± 0.06

For Procyon (HD 61421), Allende Prieto et al. (2002) also provide 1D LTE A (Fe). However, their T eff and log g are smaller than those recommended by Chiavassa et al. (2012) and Heiter et al. (2015). Moreover, their Fe ii line list mostly consists of partially saturated lines; combined with their choice of ξ mic;1D = 2.2 km s −1 , the corresponding ∆A 1L-3N becomes large (Sect. 3.1) and more sensitive to the way in which the 1D LTE abundances were derived. Instead, this star was analysed using the lines in Tables 4 and 5 of Jofré et al. (2014) (the same line list that is the basis of this study, and is illustrated in Fig. 1). The authors provide up to five different measurements of the equivalent width for each line (labelled epinarbo, ucm, porto, bologna, and ulb). For a given line, 1D LTE values of A (Fe) were determined based on each of these measurements separately via spline interpolation of the theoretical grid of equivalent widths based on MARCS model stellar atmospheres (Sect. 2). The results from the (at most) five different equivalent widths were then individually compared to the overall mean 1D LTE value from all of the Fe i lines. Those that were more than 2 σ from the mean were removed, where σ here is the standard deviation of A (Fe) from the different Fe i lines. This sigma clipping was evaluated using the 1D LTE results; the exact same clipping was then later adopted for the 3D non-LTE analysis. The surviving data were then averaged together to get a final 1D LTE A (Fe) for that particular star and from that particular Fe i line. This proceeded iteratively. An identical approach was adopted for Fe ii lines, except comparing them with the overall mean 1D LTE A (Fe) from Fe ii lines. Lastly, the 1D LTE values of A (Fe) were corrected to obtain 3D non-LTE values, using the interpolation routine described in Sect. 4 in the same way as was done for the Sun, albeit with 0th-order extrapolation in T eff and A (Fe) 3N (adopting the values at T eff = 6500 K and A (Fe) 3N = 7.5). As for the Sun, this analysis was restricted to weak lines with log W/λ < −4.9.
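The 2 σ clipping of the individual equivalent-width measurements might be sketched as follows. This is one reading of the procedure, under two assumptions: σ is taken as the sample standard deviation of the per-line mean abundances, and the loop repeats until no further measurement is rejected. The line labels and abundances are hypothetical.

```python
from statistics import mean, stdev


def clip_measurements(abund_by_line, n_sigma=2.0, max_iter=20):
    """Reject measurements more than n_sigma from the overall mean,
    then average the survivors line by line."""
    kept = {line: list(vals) for line, vals in abund_by_line.items()}
    for _ in range(max_iter):
        line_means = [mean(vals) for vals in kept.values() if vals]
        centre = mean(line_means)   # overall mean over lines
        sigma = stdev(line_means)   # line-to-line scatter
        clipped = {
            line: [a for a in vals if abs(a - centre) <= n_sigma * sigma]
            for line, vals in kept.items()
        }
        if clipped == kept:         # nothing more to reject
            break
        kept = clipped
    return {line: mean(vals) for line, vals in kept.items() if vals}


# Hypothetical abundances; line "e" contains one discrepant measurement.
measurements = {
    "a": [7.40, 7.42],
    "b": [7.50, 7.52],
    "c": [7.45, 7.47],
    "d": [7.55, 7.53],
    "e": [7.44, 7.46, 8.60],
}

if __name__ == "__main__":
    for line, value in clip_measurements(measurements).items():
        print(line, round(value, 3))
```

In this toy case only the 8.60 measurement is rejected, and the second pass leaves the remaining data unchanged.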
Fig. 9. Line-by-line A (Fe) versus log W/λ. 1D LTE and 3D non-LTE Fe i and Fe ii lines are indicated separately, with least-squares fits overdrawn.

The analyses of the metal-poor stars HD 84937 and HD 140283 that were presented in Jofré et al. (2014) were based on a few lines, especially for Fe ii. Here, these stars are instead analysed using the lines, equivalent widths, and theoretical 1D LTE grids presented in Amarsi et al. (2016). The stellar parameters were updated to those given in Table 2. In all other ways, the analysis proceeded as described for Procyon. In particular, the interpolation routine provided in this work was iteratively used to correct the 1D LTE to 3D non-LTE values as before, with 0th-order extrapolation in log g for HD 140283 (adopting the value at log g = 4.0).
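In practice, 0th-order extrapolation amounts to clamping a parameter to the nearest grid edge before interpolating. A minimal sketch is given below. The grid limits in T eff and log g are those quoted in the abstract; the dictionary layout and function name are assumptions made for illustration, and the input log g is an illustrative value 0.35 dex below the lower grid edge.

```python
# Grid limits in effective temperature (K) and surface gravity (dex),
# as quoted in the abstract of this paper.
GRID_LIMITS = {"teff": (5000.0, 6500.0), "logg": (4.0, 4.5)}


def clamp_to_grid(params, limits=GRID_LIMITS):
    """Replace out-of-range parameters by the nearest grid-edge value."""
    clamped = {}
    for name, value in params.items():
        if name in limits:
            lower, upper = limits[name]
            value = min(max(value, lower), upper)
        clamped[name] = value
    return clamped


# Illustrative input: log g lies 0.35 dex below the lower grid edge.
star = {"teff": 5792.0, "logg": 3.65}

if __name__ == "__main__":
    print(clamp_to_grid(star))  # {'teff': 5792.0, 'logg': 4.0}
```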
Statistical uncertainties were calculated for each star as the standard error in the mean for Fe i and Fe ii separately. These primarily reflect uncertainties in the equivalent width measurements and adopted values of log gf. For Procyon, HD 84937, and HD 140283, systematic uncertainties were estimated by repeating the above steps (though for the purposes of this study with a fixed selection of lines and equivalent widths in the case of Procyon), while perturbing T eff , log g, and ξ mic;1D individually by the uncertainties listed in Table 2. Systematic uncertainties were neglected for the Sun. The final uncertainties were obtained by adding these different sources of uncertainty together in quadrature. These uncertainties are presented in Table 3. As seen therein, the sensitivity to ξ mic;1D is almost zero for the 3D non-LTE runs for the Sun and HD 84937 because these models do not adopt any extra broadening parameters; for Procyon and HD 140283, the uncertainty is slightly non-zero because of the 0th-order extrapolation in stellar parameters.
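The two ingredients of this error budget, the standard error in the mean and the quadrature sum, can be written out as a short sketch. The numerical inputs are hypothetical and serve only to show the arithmetic.

```python
import math
from statistics import stdev


def standard_error(values):
    """Standard error in the mean: sample standard deviation / sqrt(N)."""
    return stdev(values) / math.sqrt(len(values))


def combine_in_quadrature(*errors):
    """Add independent sources of uncertainty in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


if __name__ == "__main__":
    # Hypothetical line-by-line abundances for one species.
    stat = standard_error([7.4, 7.5, 7.6, 7.5])
    # Hypothetical shifts from perturbing T_eff, log g, and microturbulence.
    total = combine_in_quadrature(stat, 0.05, 0.01, 0.02)
    print(round(stat, 3), round(total, 3))
```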
The 1D non-LTE analysis proceeded in almost the same way as the 3D non-LTE one. The only difference was that the 1D non-LTE interpolation routine was used instead of the 3D non-LTE one, as discussed in Sect. 4.

Results
The results for A (Fe) are illustrated in Fig. 9 and Fig. 10. Estimates of [Fe/H] from the 1D LTE, 1D non-LTE, and 3D non-LTE models are given in Table 2. These are based on the weighted means of [Fe/H] Fe i and [Fe/H] Fe ii , with the uncertainties summarised in Table 3, and adopting the 1D LTE, 1D non-LTE, and 3D non-LTE solar values (respectively) separately for the A (Fe) Fe i and A (Fe) Fe ii derived here and given in Table 4. In practice, the Fe ii lines are given a much higher weight than the Fe i lines in these weighted means because of the high sensitivity of the Fe i lines to T eff (Table 3).
We make some general remarks here, before discussing each star individually in Sections 5.3.1-5.3.4. There are significant 3D non-LTE effects on Fe i or Fe ii, or both, for all four stars. After averaging over lines, the 3D non-LTE effects act to increase A (Fe) and correspond to negative values of ∆A 1L-3N overall, as is generally expected of Fe i lines with E low > 2 eV and of Fe ii lines (Sect. 3.3). For Fe i, the largest difference is for HD 84937 (0.35 dex), while for Fe ii the largest difference is for HD 140283 (0.15 dex).
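The way the Fe ii lines come to dominate the weighted mean of [Fe/H] can be illustrated with inverse-variance weights. The weighting scheme is an assumption, since the text does not specify it. The inputs are the 3D non-LTE values for HD 84937 quoted in Sect. 5.3.3.

```python
def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its formal uncertainty."""
    weights = [1.0 / (e * e) for e in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, total ** -0.5


if __name__ == "__main__":
    # [Fe/H] from Fe i and Fe ii for HD 84937 in 3D non-LTE (Sect. 5.3.3).
    feh, err = weighted_mean([-1.86, -1.97], [0.07, 0.03])
    print(round(feh, 3), round(err, 3))
```

With these uncertainties the Fe ii value carries roughly 85% of the total weight, so the mean lies much closer to −1.97 than to −1.86.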
Comparing the 1D LTE, 1D non-LTE, and 3D non-LTE results helps disentangle the 3D effects and the non-LTE effects, to some extent. For Fe ii, the line-averaged 1D non-LTE results agree with the 1D LTE ones to 0.02 dex (Table 4); these lines are primarily susceptible to 3D effects (Sect. 3.3). There is more variation for Fe i. For the Sun and Procyon (HD 61421), the 1D LTE and 1D non-LTE results are within 0.03 dex. This is due to the competition between over-ionisation and photon losses in the 3D models at high [M/H] (Sect. 3.2). For the metal-poor stars HD 84937 and HD 140283, the 1D non-LTE results are in between the 1D LTE and 3D non-LTE ones, as the steeper gradients in the 3D models enhance the non-LTE effects (Amarsi et al. 2016).
Significant ionisation imbalances relative to the stipulated uncertainties may reflect modelling errors since such errors were deliberately not taken into account in Sect. 5.2. As noted in Sect. 3.3, the 3D non-LTE effects on Fe i lines of high E low and on Fe ii lines usually have the same sign, which helps to counteract ionisation imbalances in 1D LTE. Despite this, in the current 1D LTE analysis, ionisation balance is not met for any of the stars as can be seen in Table 5 as well as Fig. 9 and Fig. 10. For Procyon, the imbalance in 1D LTE is −0.07 dex, or −1.2 σ, whereas for the other three stars it amounts to −2 σ. For HD 84937 and HD 140283, these 1D LTE imbalances correspond to underestimating log g by around 0.3 dex. In contrast, the 3D non-LTE models achieve ionisation balance for the Sun, Procyon, and HD 140283 within the uncertainties. Although HD 84937 shows an imbalance in 3D non-LTE that is 0.11 dex, or 1.4 σ, this is 0.03 dex less severe than that in 1D LTE.
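The significance levels quoted here follow from dividing each imbalance by its uncertainty. A short sketch using the 1D LTE and 3D non-LTE entries of Table 5 is given below; the dictionary layout and function name are illustrative choices.

```python
# Ionisation imbalance A(Fe)_FeI - A(Fe)_FeII and its uncertainty (dex),
# copied from Table 5.
IMBALANCE_1D_LTE = {
    "Sun": (0.06, 0.02),
    "Procyon": (-0.07, 0.06),
    "HD 84937": (-0.14, 0.07),
    "HD 140283": (-0.12, 0.06),
}
IMBALANCE_3D_NLTE = {
    "Sun": (-0.01, 0.02),
    "Procyon": (0.04, 0.07),
    "HD 84937": (0.11, 0.08),
    "HD 140283": (0.05, 0.06),
}


def significance(value, error):
    """Imbalance expressed in units of its own uncertainty."""
    return value / error


if __name__ == "__main__":
    for name in IMBALANCE_1D_LTE:
        lte = significance(*IMBALANCE_1D_LTE[name])
        nlte = significance(*IMBALANCE_3D_NLTE[name])
        print(f"{name:10s} 1D LTE: {lte:+.1f} sigma   3D non-LTE: {nlte:+.1f} sigma")
```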
Fig. 11. Same as Fig. 9 and Fig. 10, but for the metal-poor stars and restricted to the Fe i lines that are included within the ∆A 1L-3N grid (Fig. 1), and using the extended 3D LTE grid from Amarsi et al. (2019b) for Fe ii lines.

Excitation imbalances in Fe i lines (trends in A (Fe) with E low ) may similarly reflect modelling errors. For the Sun and Procyon, the 1D LTE excitation imbalances amount to 0.09 dex as E low increases from 0 eV to 5 eV. In 3D non-LTE, this is reduced for the Sun (0.07 dex), but slightly increased for Procyon (0.10 dex). For the two metal-poor stars, the 1D LTE A (Fe) changes by 0.19 dex for HD 84937, and 0.28 dex for HD 140283, as E low increases from 0 eV to 5 eV; flattening this slope would require one to reduce T eff by 250 K and 300 K, respectively. In 3D non-LTE, these abundance slopes are markedly reduced. They amount to just 0.04 dex across 5 eV for HD 84937, corresponding to around 50 K which is well within the formal uncertainty of 97 K (Table 2), while HD 140283 shows no significant trend at all.
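An excitation imbalance of this kind can be measured as the slope of a least-squares line through A (Fe) versus E low , multiplied by the 5 eV span. The sketch below uses hypothetical line data. The conversion to an equivalent T eff error assumes a linear scaling anchored on the 0.19 dex and 250 K pair quoted for HD 84937, which is a simplification made here for illustration.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def excitation_imbalance(e_low, abund, e_span=5.0):
    """Change in A(Fe) (dex) across e_span (eV) along the fitted trend."""
    slope, _ = linear_fit(e_low, abund)
    return slope * e_span


def teff_equivalent(imbalance, ref_imbalance=0.19, ref_delta_teff=250.0):
    """Scale an imbalance to a T_eff error, assuming linearity (assumption)."""
    return ref_delta_teff * imbalance / ref_imbalance


if __name__ == "__main__":
    # Hypothetical lines with a trend of 0.038 dex per eV.
    e_low = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    abund = [5.20 + 0.038 * e for e in e_low]
    print(round(excitation_imbalance(e_low, abund), 3))
    print(round(teff_equivalent(0.04), 1))
```

Under this linear scaling, the residual 0.04 dex trend maps to roughly 53 K, in line with the value of around 50 K quoted above.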
Thus in terms of ionisation and excitation balance, the 3D non-LTE models tend to outperform the 1D LTE ones. The residual imbalances most likely reflect remaining uncertainties in the 3D non-LTE models, possibly in the non-LTE model atom itself. At low [M/H], different prescriptions of the inelastic hydrogen collisions can change the inferred iron abundances by 0.03 dex for dwarfs in 1D non-LTE (Mashonkina et al. 2019). These differences would be exacerbated in the 3D non-LTE models of warmer metal-poor stars (such as HD 84937), due to their steeper temperature gradients. In any case, the residual imbalances are typically smaller than the difference between A (Fe) inferred in 1D LTE or 1D non-LTE compared to in 3D non-LTE. The 3D non-LTE models may thus still lead to more accurate determinations of A (Fe) overall.

Sun
For the Sun, Asplund et al. (2021) determined A (Fe) = 7.46 ± 0.04, with the stipulated uncertainties being dominated by systematics in the models. This is in excellent agreement with the 3D non-LTE value determined here, A (Fe) = 7.47 ± 0.01, where systematic modelling uncertainties have been neglected. This consistency is reassuring for the grid-based approach adopted here, given that both studies use the same model atom. Nevertheless, it is preferable to use the value of Asplund et al. (2021) as their analysis is superior to that presented here in a number of ways. In particular, their analysis was based on disc-centre intensity observations rather than the disc-integrated solar flux used here; a more up-to-date line list for Fe i; and a tailored 3D STAGGER model, thus avoiding interpolation in stellar parameters (with the interpolation routine adding some scatter to our results here; Sect. 4).

Procyon
Procyon (HD 61421) is found to be 0.11 dex above solar. To our knowledge, this estimate is significantly larger than previous estimates based on 1D LTE and non-LTE (Mashonkina et al. 2011), 3D LTE and non-LTE, and 3D LTE (Allende Prieto et al. 2002). Different authors adopt different stellar parameters, ξ mic;1D , line lists, and analysis approaches, so it is difficult to determine exactly where this discrepancy arises from. There is a slightly positive trend with log W/λ for Fe i (Fig. 9), which possibly reflects uncertainties in the upper layers of the 3D model stellar atmospheres (Allende Prieto et al. 2002). Nevertheless, the weakest Fe i lines agree well with the Fe ii lines that form deeper in the atmosphere, for which a positive trend with log W/λ is not seen, and which are given the larger weight in the determination of [Fe/H].
The 3D non-LTE estimate of [Fe/H] would now imply that Procyon is significantly α-poor, [α/Fe] ≈ −0.1, at least when combined with 1D non-LTE estimates of magnesium and calcium (Mashonkina et al. 2017). This result may be in conflict with previous 1D LTE studies of these elements in nearby FG-type dwarfs, which indicate a plateau in α, or possibly even a slight upwards trend as [Fe/H] increases above solar metallicities (Bensby et al. 2003, Fig. 12 and Fig. 13). However, this result would be consistent with the negative 3D non-LTE trend found for [O/Fe] (Fig. 12 of Amarsi et al. 2019b). It would be interesting to revisit the abundances of these and other key elements in this star using tailored 3D non-LTE models.
As noted above, Procyon is around 50 K outside of the ∆A 1L-3N grid in T eff and 0.15 dex outside of the ∆A 1L-3N grid in A (Fe) 3N (Sect. 5.2), and 0th-order extrapolation was applied for these parameters. Inspection of the trends for Fe i indicates that linear extrapolation in T eff would lead to slightly larger abundance differences for Fe i, while linear extrapolation in A (Fe) would lead to slightly smaller abundance differences for Fe i and Fe ii. These changes would be of the order ±0.02 dex at most for Fe i, and somewhat less severe for Fe ii. These are not enough to explain the much larger value of [Fe/H] found here.

HD 84937
Amarsi et al. (2016) previously determined [Fe/H] from Fe i and Fe ii lines for HD 84937 in 3D non-LTE, using an older model atom as well as a tailored 3D model stellar atmosphere. The authors also adopted a slightly smaller value of log g (4.06 dex) than that used here (4.13 dex; Giribaldi et al. 2021). The corresponding values found in this work are in good agreement within the uncertainties: −1.86 ± 0.07 and −1.97 ± 0.03, respectively. Amarsi et al. (2016) have also discussed how 1D LTE severely underestimates A (Fe) in HD 84937 relative to 3D non-LTE. This is verified here, with 1D LTE giving 0.35 dex and 0.10 dex smaller values of A (Fe) compared to 3D non-LTE from Fe i and Fe ii, respectively. For Fe i, the 1D non-LTE models give results that are improved compared to 1D LTE, with A (Fe) now only being 0.18 dex smaller than the 3D non-LTE result. Thus for Fe i, 1D non-LTE models should be used when 3D non-LTE models are unavailable.
As mentioned above (Sect. 5.3), the ionisation balance is not quite satisfactory at +0.11 ± 0.08 (A (Fe) from Fe i minus that from Fe ii), although this is less severe than in 1D LTE at −0.14 ± 0.07. The sign of the imbalance flips, and thus it is possible that at low [M/H] the current models predict departures from LTE in Fe i that are slightly too large, perhaps due to inelastic hydrogen collisions that are too inefficient. This would be exacerbated by the steeper gradients in the metal-poor 3D model stellar atmospheres. If this is indeed the case, some of these model shortcomings would be masked by the shallower gradients in the metal-poor 1D models.
For this star, the 1D non-LTE models give a good ionisation balance of 0.05 ± 0.07 dex. However, it should be noted that the residual imbalance in 3D non-LTE of 0.11 dex is much smaller than the difference in A (Fe) between 1D non-LTE and 3D non-LTE, which is 0.18 dex for Fe i. Thus the 3D non-LTE model may still give a more reliable result overall, despite its remaining shortcomings.

HD 140283
Amarsi et al. (2016) determined [Fe/H] = −2.34 ± 0.07 from Fe i and [Fe/H] = −2.28 ± 0.04 from Fe ii for HD 140283 in 3D non-LTE. As for HD 84937 (Sect. 5.3.3), that study was based on an older model atom and a tailored 3D model stellar atmosphere. However, the authors adopted a much smaller T eff (5591 K) than that used here (5792 K; see [...]) [...] (Table 3), while much of the difference in Fe i can be attributed to the 200 K larger value of T eff .
Despite the different T eff , the large error in 1D LTE found by Amarsi et al. (2016) is confirmed here, with 1D LTE giving 0.30 dex and 0.13 dex smaller values of A (Fe) than 3D non-LTE from Fe i and Fe ii, respectively. As for HD 84937 (Sect. 5.3.3), the 1D non-LTE models give results for Fe i that are improved over 1D LTE, with A (Fe) now only being 0.15 dex smaller than the 3D non-LTE result. Thus for Fe i, 1D non-LTE models should be used when 3D non-LTE models are unavailable.
While ionisation and excitation balance are achieved in 3D non-LTE (Fig. 9 and Fig. 10), it should again be noted that HD 140283 is 0.35 dex outside of the ∆A 1L-3N grid in log g (Sect. 5.2), and so 0th-order extrapolation was applied for this parameter. Inspection of the trends for Fe i indicates that linear extrapolation in log g would lead to slightly larger abundances from Fe i and slightly smaller abundances from Fe ii, overall worsening the ionisation balance by around 0.03 dex. Therefore, if linear extrapolation is more valid than 0th-order extrapolation, these results could indicate that the departures from LTE in Fe i are slightly over-estimated by the current models, at least at low [M/H]. This would be in line with the results for HD 84937 (Sect. 5.3.3). As stated there, if this is indeed the case, some of these model shortcomings would be masked by the shallower gradients in the 1D models.

Interpolation and extrapolation in line parameters
The ∆A 1L-3N grid is unfortunately somewhat limited in the number of lines, especially for Fe ii (Fig. 1). This is particularly problematic at low [M/H], where many of the lines become too weak to be observed (lower right panel of Fig. 4). Thus, the results shown in Fig. 9 and Fig. 10 for the metal-poor stars clearly rely on extrapolation on an irregular grid of line parameters.
To test the possible impact of interpolation and extrapolation errors in line parameters, the analysis of the two metal-poor stars was repeated with two modifications. For Fe i lines, the analysis was restricted to only those lines in the grid (Sect. 2.4). For Fe ii lines, the extended 3D LTE grid of Amarsi et al. (2019b) was used instead; this contains all of the Fe ii lines studied here.