Jacobian-free Newton-Krylov method for multilevel nonlocal thermal equilibrium radiative transfer problems

D. Arramy; J. de la Cruz Rodríguez; J. Leenaarts

doi:10.1051/0004-6361/202449963

Home

All issues

Volume 690 (October 2024)

A&A, 690 (2024) A12

Full HTML

Open Access

Issue		A&A Volume 690, October 2024


Article Number		A12
Number of page(s)		17
Section		Numerical methods and codes
DOI		https://doi.org/10.1051/0004-6361/202449963
Published online		25 September 2024

A&A, 690, A12 (2024)

Jacobian-free Newton-Krylov method for multilevel nonlocal thermal equilibrium radiative transfer problems

D. Arramy^,★, J. de la Cruz Rodríguez and J. Leenaarts

Institute for Solar Physics, Dept. of Astronomy, Stockholm University, AlbaNova University Centre, 106 91 Stockholm, Sweden

Received: 13 March 2024
Accepted: 5 August 2024

Abstract

Context. The calculation of the emerging radiation from a model atmosphere requires knowledge of the emissivity and absorption coefficients, which are proportional to the atomic level population densities of the levels involved in each transition. Due to the intricate interdependence of the radiation field and the physical state of the atoms, iterative methods are required in order to calculate the atomic level population densities. A variety of different methods have been proposed to solve this problem, which is known as the nonlocal thermodynamical equilibrium (NLTE) problem.

Aims. Our goal is to develop an efficient and rapidly converging method to solve the NLTE problem under the assumption of statistical equilibrium. In particular, we explore whether the Jacobian-Free Newton-Krylov (JFNK) method can be used. This method does not require an explicit construction of the Jacobian matrix because it estimates the new correction with the Krylov-subspace method.

Methods. We implemented an NLTE radiative transfer code with overlapping bound-bound and bound-free transitions. This solved the statistical equilibrium equations using a JFNK method, assuming a depth-stratified plane-parallel atmosphere. As a reference, we also implemented the Rybicki & Hummer (1992) method based on linearization and operator splitting.

Results. Our tests with the Fontenla, Avrett and Loeser C model atmosphere (FAL-C) and two different six-level Ca II and H I atoms show that the JFNK method can converge faster than our reference case by up to a factor 2. This number is evaluated in terms of the total number of evaluations of the formal solution of the radiative transfer equation for all frequencies and directions. This method can also reach a lower residual error compared to the reference case.

Conclusions. The JFNK method we developed poses a new alternative to solving the NLTE problem. Because it is not based on operator splitting with a local approximate operator, it can improve the convergence of the NLTE problem in highly scattering cases. One major advantage of this method is that it is expected to allow for a direct implementation of more complex problems, such as overlapping transitions from different active atoms, charge conservation, or a more efficient treatment of partial redistribution, without having to explicitly linearize the equations.

Key words: line: profiles / radiative transfer / methods: numerical / Sun: atmosphere

^★

Corresponding author; dimitri.arramy@astro.su.se

© The Authors 2024

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1 Introduction

The statistical equilibrium equations describe the radiative and collisional transitions between the different levels of a model atom (see, e.g., Hubeny & Mihalas 2014). When the collisional terms dominate the rate equations, the assumption of local ther-modynamical equilibrium (LTE) is usually adequate, and the atomic level population densities (hereafter population densities) can be obtained analytically using the Saha-Boltzmann equations. When the radiative terms become relevant, however, the radiation field greatly influences the population densities (NLTE). Because of this cross-dependence of the population densities with the radiation field, the NLTE problem must be solved iteratively in order to make them consistent with each other. Moreover, the nonlocality of the radiation field increases the complexity of the problem because all grid cells must be solved simultaneously.

Early attempts solved the rate equations using Lambda iteration, which is based on the fixed-point iteration method (see, e.g., Hubeny & Mihalas 2014). However, this scheme has very poor convergence properties and cannot be used in practice. Auer & Mihalas (1969) proposed a complete linearization method (of the second-order transfer equation) to solve the structure and radiation emerging from static stellar atmospheres. In their implementation, physical accuracy was neglected to make the problem computationally tractable, but it inspired future developments in the field (see below). A different approach, introduced by Rybicki (1972), used the core saturation approximation to eliminate passive photon scatterings in the line core, while only keeping the much more efficient scatterings in the line wings. The latter improved the conditioning of the rate equations, allowing traditional Lambda iteration to converge in a reasonable (but large) number of iterations.

The most successful methods for solving the statistical equilibrium equations are based on the operator splitting technique (Cannon 1973) combined with a linearization of the problem (e.g., Auer & Mihalas 1969; Scharmer & Carlsson 1985; Rybicki & Hummer 1992). Scharmer & Carlsson (1985) linearized the first-order radiative transfer equation and the rate equations with respect to the population densities until they were able to derive a linear system to estimate a correction. (Rybicki & Hummer 1992, RH92 hereafter) followed a slightly different approach and replaced some of the quantities that depend on the population densities with the value from the previous iteration. The fundamental difference between these two methods is that the complete linearization method of Scharmer & Carlsson (1985) is a minimization method of the error in the rate equations, whereas the RH92 method is closer to the fixed-point iteration method, but uses the operator-splitting technique to drive the solution. Furthermore, the complete linearization method of Scharmer & Carlsson (1985) operates on the source function, whereas the RH92 method operates on the emissivity, allowing for a simpler treatment of overlapping (active) transitions and partial redistribution effects (Uitenbroek 2001; Leenaarts et al. 2012; Sukhorukov & Leenaarts 2017).

The performance of these methods is largely determined by the choice of the approximate operator. The simplest block-diagonal (local) operator (Olson et al. 1986) decouples the explicit dependence of the rate equations with respect to space, and it requires very little storage and operations, making it a great choice for multidimensional problems (e.g., Leenaarts & Carlsson 2009; Amarsi et al. 2018). However, information about the nonlocal contribution to the intensity is neglected, as is its origin. A better prediction of the mean intensity can be attained by using the single-point quadrature global operator (Scharmer 1981; Scharmer & Nordlund 1982), which was greatly inspired by the Eddington-Barbier approximation (Milne 1921; Eddington 1926; Barbier 1943). Although it only requires one coefficient per ray and direction (two, when linear interpolation is used), the equations are again spatially coupled, and they must be solved together. Schemes using the global operator generally converge in fewer iterations than those using the local operator. Several codes with implementations of the complete linearization (Carlsson 1986; Hubeny & Lites 1995) and RH92 (Uitenbroek 2001; Leenaarts & Carlsson 2009; Pereira & Uitenbroek 2015; Socas-Navarro et al. 2015; Amarsi et al. 2018; Milić & van Noort 2018; Osborne & Milić 2021) methods have been extensively used by the solar and stellar communities.

A common way of solving nonlinear systems of equations is the Newton-Raphson method (Raphson 1690; Newton 1736). The main limitation to applying it to the statistical equilibrium equations is the expensive calculation of the Jacobian matrix that is required in each iteration. In this paper, we propose to use a modification of the Newton-Raphson method known as the Jacobian-free Newton-Krylov method (Knoll & Keyes 2004), to solve the radiative transfer problem. In this method, the Jaco-bian matrices are neither inverted nor built or stored. Instead, an iterative inversion solver based on Krylov subspaces (Krylov 1931) is used to estimate the Newton-Raphson correction to the unknowns. This method has already proved to be efficient in several fields, such as hydrodynamics or neutron scattering problems. Compared to the method of Scharmer & Carlsson (1985), our method does not require any explicit linearization of the rate and radiative transfer equations, and it does not use the operator splitting.

In Sect. 2, we introduce the numerical problem under consideration and the proposed numerical method for its resolution. In Sect. 3, we discuss our results, and in Sect. 4 we summarize our conclusions and discuss potentially interesting developments for future studies.

2 Problem and methods

2.1 Mathematical description of the problem

2.1.1 Theory

We adopt the notation used in Uitenbroek (2001) to express the statistical equilibrium equations. We furthermore assume a plane-parallel geometry hereafter. In all equations, unless mentioned otherwise, lower indices refer to atomic levels and upper indices refer to depth points within the atmosphere. The RH92 notation elegantly unifies the expressions for bound-bound and bound-free transitions, allowing for a very clean implementation of the rate equations. For a bound-bound transition between a lower level i and upper level j, we can define $V_{i j} = \frac{h ν}{4 π} B_{i j} ϕ_{i j} (ν, μ),$ ${V_{ij}} = {{hv} \over {4\pi }}{B_{ij}}{\phi _{ij}}(v,\mu ),$ (1) $V_{j i} = \frac{h ν}{4 π} B_{j i} ψ_{i j} (ν, μ),$ ${V_{ji}} = {{hv} \over {4\pi }}{B_{ji}}{\psi _{ij}}(v,\mu ),$ (2) $U_{j i} = \frac{h ν}{4 π} A_{j i} ψ_{i j} (ν, μ),$ ${U_{ji}} = {{hv} \over {4\pi }}{A_{ji}}{\psi _{ij}}(v,\mu ),$ (3)

where A_ji, B_ji, and B_ij are the Einstein coefficients, and ϕ_ij and ψ_ij are the line absorption and emission profiles. ν is the frequency, and µ is the line-of-sight angle cosine. Similarly, for a bound-free transition, we can define $V_{i j} = α_{i j} (ν),$ ${V_{ij}} = {\alpha _{ij}}(v)$ (4) $V_{j i} = n_{e} Φ_{i j} (T) \exp {- \frac{h ν}{k_{B} T}} α_{i j} (ν),$ ${V_{ji}} = {n_e}{{\rm{\Phi }}_{ij}}(T)\exp \left\{ { - {{hv} \over {{k_B}T}}} \right\}{\alpha _{ij}}(v),$ (5) $U_{j i} = V_{j i} (\frac{2 h ν^{3}}{c^{2}}),$ ${U_{ji}} = {V_{ji}}\left( {{{2h{v^3}} \over {{c^2}}}} \right),$ (6)

where α_ij is the photoionization cross-section, n_e is the electron density, and Φ_ij(T) is the Saha-Boltzmann function evaluated at temperature T, $Φ_{i j} (T) = \frac{g_{i}}{2 g_{j}} {(\frac{h^{2}}{2 π m_{e} k_{B} T})}^{3 / 2} \exp {\frac{E_{j} - E_{i}}{k_{B} T}},$ ${{\rm{\Phi }}_{ij}}(T) = {{{g_i}} \over {2{g_j}}}{\left( {{{{h^2}} \over {2\pi {m_e}{k_B}T}}} \right)^{3/2}}\exp \left\{ {{{{E_j} - {E_i}} \over {{k_B}T}}} \right\},$ (7)

where ɡ denotes the level statistical weight and E the level energy, and m_e is the mass of the electron. In practice, these expressions can be further simplified using the Einstein relations between coefficients, so that we obtain $V_{j i} = g_{i j} V_{i j},$ ${V_{ji}} = {g_{ij}}{V_{ij}},$ (8) $U_{j i} = g_{i j} (\frac{2 h ν^{3}}{c^{2}}) V_{i j} .$ ${U_{ji}} = {g_{ij}}\left( {{{2h{v^3}} \over {{c^2}}}} \right){V_{ij}}.$ (9)

For bound-bound transitions, assuming complete-redistribution of scattered photons, ɡ_ij = ɡ_i/ɡ_j. For bound-free transitions, $g_{i j} = n_{e} Φ_{i j} (T) \exp {- \frac{h ν}{k_{B} T}} = \frac{n_{i}^{*}}{n_{j}^{*}} \exp {- \frac{h ν}{k_{B} T}},$ ${g_{ij}} = {n_e}{{\rm{\Phi }}_{ij}}(T)\exp \left\{ { - {{hv} \over {{k_B}T}}} \right\} = {{n_i^ * } \over {n_j^ * }}\exp \left\{ { - {{hv} \over {{k_B}T}}} \right\},$ (10)

where the asterisk-superscript denotes the LTE atomic level population.

We can now write the rate equations as a function of V_ij, regardless of whether we consider bound-bound or bound-free transitions. We recall the rate equation for the atomic level i at depth index k, $\underset{p}{\sum^{}} {n_{p}^{k} (C_{p i}^{k} + R_{p i}^{k})} = \underset{p}{\sum^{}} {n_{i}^{k} (C_{i p}^{k} + R_{i p}^{k})},$ $\mathop {{{\mathop \sum \nolimits^ }^}}\limits_p \left\{ {n_p^k\left( {C_{pi}^k + R_{pi}^k} \right)} \right\} = \mathop {{{\mathop \sum \nolimits^ }^}}\limits_p \left\{ {n_i^k\left( {C_{ip}^k + R_{ip}^k} \right)} \right\},$ (11)

where $n_{i}^{k}$ $n_i^k$ is the population density of the atomic level i at depth index k. $C_{i j}^{k} and R_{i j}^{k}$ $C_{ij}^k{\rm{ and }}R_{ij}^k$ are the collisional and radiative rate coeffi-cients of the transition i → j at depth index k with $C_{i i}^{k} = R_{i i}^{k} = 0$ $C_{ii}^k{\rm{ = }}R_{ii}^k = 0$ . The radiative rate coefficients can be expressed as a double integral over the angle and frequency of the intensity (Uitenbroek 2001), $R_{i j}^{k} = \frac{1}{2} \int_{- 1}^{1} d μ \int_{0}^{\infty} \frac{d ν}{h ν} V_{i j}^{k} I_{μ ν}^{k} (n) i < j$ $R_{ij}^k = {1 \over 2}\mathop \smallint \limits_{ - 1}^1 d\mu \mathop \smallint \limits_0^\infty {{dv} \over {hv}}V_{ij}^kI_{\mu v}^k({\bf{n}})\quad i < j$ (12) $R_{j i}^{k} = \frac{1}{2} \int_{- 1}^{1} d μ \int_{0}^{\infty} \frac{d ν}{h ν} [(\frac{2 h ν^{3}}{c^{2}}) + I_{μ ν}^{k} (n)] g_{i j}^{k} V_{i j}^{k} i < j,$ $R_{ji}^k = {1 \over 2}\mathop \smallint \limits_{ - 1}^1 d\mu \mathop \smallint \limits_0^\infty {{dv} \over {hv}}\left[ {\left( {{{2h{v^3}} \over {{c^2}}}} \right) + I_{\mu v}^k(n)} \right]g_{ij}^kV_{ij}^k\quad i < j,$ (13)

for a plane-parallel atmosphere. The expressions for $V_{i j}^{k}$ $V_{ij}^k$ and $ɡ_{i j}^{k}$ $g_{ij}^k$ are given in Eqs. (1)–(10) for bound-bound and bound-free transitions. The last component $I_{μ ν}^{k}$ $I_{\mu v}^k$ is the intensity in the direction µ at a frequency ν and at the depth index k. The vector n contains the population densities with the chosen structure ${(n_{1}^{1}, \dots, n_{N_{ℓ}}^{1}, \dots, n_{1}^{N_{z}}, \dots, n_{N_{ℓ}}^{N_{z}})}^{T}$ ${\left( {n_1^1, \ldots ,n_{{N_\ell }}^1, \ldots ,n_1^{{N_z}}, \ldots ,n_{{N_\ell }}^{{N_z}}} \right)^{\rm{T}}}$ , where N_z and N_ℓ are the number of depth points and active atomic levels, respectively. While all the other quantities do not depend on the population densities, the intensity involves them all in a nonlinear and nonlocal fashion through the radiative transfer equation (RTE), $I_{μ ν} (τ_{μ ν}) = {\int^{}}_{τ_{μ ν}}^{\infty} S_{μ ν} (t) e^{- (t - τ_{μ ν})} d t μ > 0$ ${I_{\mu v}}\left( {{\tau _{\mu \nu }}} \right) = {\mathop \smallint \nolimits^ ^}_{{\tau _{\mu v}}}^\infty {S_{\mu v}}(t){e^{ - \left( {t - {\tau _{\mu v}}} \right)}}dt\mu > 0$ (14) $I_{μ ν} (τ_{μ ν}) = {\int^{}}^{}_{τ_{μ ν}}^{0} S_{μ ν} (t) e^{- (τ_{μ ν} - t)} d t μ < 0,$ ${I_{\mu v}}\left( {{\tau _{\mu \nu }}} \right) = {\mathop \smallint \nolimits^ ^}_{{\tau _{\mu v}}}^0{S_{\mu v}}(t){e^{ - \left( {{\tau _{\mu \nu }} - t} \right)}}dt\,\,\mu < 0,$ (15)

where S_µv = η_µv/χ_µv is the source function, χ_µv and η_µv are the total opacity and emissivity, respectively, which can be calculated through $χ_{μ ν} = χ_{c} + χ_{sca} + \underset{i}{\sum^{}} \underset{j > i}{\sum^{}} V_{i j} (n_{i} - g_{i j} n_{j})$ ${\chi _{\mu \nu }} = {\chi _c} + {\chi _{{\rm{sca}}}} + \mathop {{{\mathop \sum \nolimits^ }^}}\limits_i \mathop {{{\mathop \sum \nolimits^ }^}}\limits_{j > i} {V_{ij}}\left( {{n_i} - {g_{ij}}{n_j}} \right)$ (16) $η_{μ ν} = η_{c} + χ_{sca} J_{ν} + \underset{j}{\sum^{}} \underset{i < j}{\sum^{}} (\frac{2 h ν^{3}}{c^{2}}) g_{i j} V_{i j} n_{j},$ ${\eta _{\mu v}} = {\eta _c} + {\chi _{{\rm{sca}}}}{J_v} + \mathop {{{\mathop \sum \nolimits^ }^}}\limits_j \mathop {{{\mathop \sum \nolimits^ }^}}\limits_{i < j} \left( {{{2h{v^3}} \over {{c^2}}}} \right){g_{ij}}{V_{ij}}{n_j},$ (17)

where the subscript c refers to the background continuum contribution, and sca indicates the background-scattering contribution, which are assumed to be independent with respect to the active population densities. The mean intensity J_ν can be computed using $J_{ν} = \frac{1}{2} {\int^{}}_{- 1}^{1} I_{μ ν} (n) d μ .$ ${J_v} = {1 \over 2}{\mathop \smallint \nolimits^ ^}_{ - 1}^1{I_{\mu v}}(n)d\mu .$ (18)

The presence of J_ν in the scattering term of Eq. (17) might cause the calculations to become more complex because it depends on the intensity, which in turn depends on opacities and emissivi-ties. Because these scattering terms do not originate from active transitions of the atom, however, we use a previous estimation of the mean intensity $J_{ν}^{†}$ $J_v^\dag$ instead of J_ν in Eq. (17). The optical thickness τ_µv is obtained by integrating the opacity over depth, $τ_{μ ν} (z) = \frac{1}{| μ |} {\int^{}}_{0}^{z} χ_{μ ν} (z^{'}) d z^{'} > 0.$ ${\tau _{\mu v}}(z) = {1 \over {\left| \mu \right|}}{\mathop \smallint \nolimits^ ^}_0^z{\chi _{\mu v}}\left( {z'} \right)dz' > 0.$ (19)

Equation (11) describes a system of N_ℓ × N_z equations and variables $n_{i}^{k}$ $n_i^k$ in which N_z equations are redundant. Therefore, we replace one rate equation by a particle conservation equation per depth-point, $\underset{p}{\sum^{}} n_{p}^{k} = n_{tot}^{k},$ $\mathop {{{\mathop \sum \nolimits^ }^}}\limits_p n_p^k = n_{{\rm{tot}}}^k,$ (20)

where $n_{tot}^{k}$ $n_{{\rm{tot}}}^k$ is the total atom density at the depth index k and is kept constant. The replacement was made on the most populated atomic level at each depth point for numerical stability purposes.

Finally, the system of equations was reformulated as $F_{i}^{k} (n) \overset{def}{=} \underset{p}{\sum^{}} {n_{i}^{k} (C_{i p}^{k} + R_{i p}^{k}) - n_{p}^{k} (C_{p i}^{k} + R_{p i}^{k})}$ $F_i^k(n)\mathop = \limits^{{\rm{def}}} \mathop {{{\mathop \sum \nolimits^ }^}}\limits_p \left\{ {n_i^k\left( {C_{ip}^k + R_{ip}^k} \right) - n_p^k\left( {C_{pi}^k + R_{pi}^k} \right)} \right\}$ (21)

for radiative rate equations, and $F_{i}^{k} (n) \overset{def}{=} n_{tot}^{k} - \underset{p}{\sum^{}} n_{p}^{k}$ $F_i^k(n)\mathop = \limits^{{\rm{def}}} n_{{\rm{tot}}}^k - \mathop {{{\mathop \sum \nolimits^ }^}}\limits_p n_p^k$ (22)

for mass conservation equations. Altogether, Eqs. (21)–(22) form a residual vector F(n) with the same dimension as n. Solving the system of equations for the vector of population densities n is therefore equivalent to finding the root of the residual vector F. The calculation of F for a given atmosphere and population densities is detailed in algorithm 1. The residual vector is the central part of the solving process and constitutes the main computational cost of it. Thus, we evaluated the performance of a solver by the number of computation calculations of F (hereafter calls) needed to solve the problem to a given precision.

Algorithm 1Calculation of the residual vector F

2.1.2 Discretization of the RTE

When the radiative rates are computed, the angular and frequency integrals are in practice discretized according to quadrature schemes and yield quadrature coefficients (ω_µ, ω_ν) for each set (µ, ν). Equations (14) and (15) are discretized along the depth axis, and the involved integrals can be calculated assuming a depth-dependent profile for S_µv. This profile is usually taken as simple piece-wise polynomial functions. We considered piece-wise linear functions (Olson & Kunasz 1987), which yielded $I_{μ ν}^{k} = I_{μ ν}^{k + 1} e^{- Δ τ_{μ ν}^{k}} + a^{k} S_{μ ν}^{k} + b^{k} S_{μ ν}^{k + 1} μ > 0$ $I_{\mu \nu }^k = \quad I_{\mu \nu }^{k + 1}{e^{ - {\rm{\Delta }}\tau _{\mu \nu }^k}} + {a^k}S_{\mu \nu }^k + {b^k}S_{\mu \nu }^{k + 1}\quad \mu > 0$ (23) $I_{μ ν}^{k} = I_{μ ν}^{k - 1} e^{- Δ τ_{μ ν}^{k - 1}} + a^{k - 1} S_{μ ν}^{k} + b^{k - 1} S_{μ ν}^{k - 1} μ < 0,$ $I_{\mu \nu }^k = I_{\mu \nu }^{k - 1}{e^{ - {\rm{\Delta }}\tau _{\mu \nu }^{k - 1}}} + {a^{k - 1}}S_{\mu \nu }^k + {b^{k - 1}}S_{\mu \nu }^{k - 1}\quad \mu < 0,$ (24)

where the coefficients a^k and b^k are given in Appendix A.1, and $Δ τ_{μ ν}^{k} \approx \frac{1}{2 | μ |} | z^{k} - z^{k + 1} | (χ_{μ ν}^{k} + χ_{μ ν}^{k + 1})$ ${\rm{\Delta }}\tau _{\mu v}^k \approx {1 \over {2\left| \mu \right|}}\left| {{z^k} - {z^{k + 1}}} \right|\left( {\chi _{\mu \nu }^k + \chi _{\mu \nu }^{k + 1}} \right)$ (25)

is the optical thickness of the slab.

At the top of the atmosphere, we assumed that there is no incoming radiation. The deepest point in the atmosphere was assumed to be thermalized so that the intensity at this location is the Planck function B_ν at the local temperature T, $I_{μ ν}^{N_{z}} = B_{ν} (T^{N_{z}}) μ > 0$ $I_{\mu \nu }^{{N_z}} = {B_\nu }\left( {{T^{{N_z}}}} \right)\quad \mu > 0$ (26) $I_{μ ν}^{1} = 0 μ < 0.$ $I_{\mu \nu }^1 = 0\quad \mu < 0.$ (27)

The angular integrals were evaluated using a Gauss-Legendre quadrature defined in ]0,1[. At each depth index k, the incoming and outgoing rays were considered in the calculation of the mean intensity J. The number of quadrature points is a parameter set by the user. We normally ran with five quadrature points with two rays per angle.

2.2 Newton-Raphson method

2.2.1 Basics

The system of nonlinear equations F(n) = 0 for the vector n may be solved through several numerical iterative methods. The Newton-Raphson method is one of the simplest and most powerful ones (e.g., Press et al. 2002). When n^(p) is the estimate of the solution on the p^th iteration, the next iterate n^(p+1) is sought such that F(n^(p+1)) = 0. When we further define δn^(p) = n^(p+1) − n^(p) as the p^th incremental, the Newton-Raphson method relies on a linearization of F(n^(p+1)), $0 = F (n^{(p + 1)}) = F (n^{(p)} + δ n^{(p)}) \approx F (n^{(p)}) + J_{F} (n^{(p)}) δ n^{(p)},$ ${\bf{0}} = F\left( {{n^{(p + 1)}}} \right) = F\left( {{n^{(p)}} + \delta {n^{(p)}}} \right) \approx F\left( {{n^{(p)}}} \right) + {{\bf{J}}_F}\left( {{n^{(p)}}} \right)\delta {n^{(p)}},$ (28)

where we have introduced the Jacobian matrix J_F associated with the residual vector F and evaluated at n^(p). A possible representation of J_F is given in Fig. 1. Solving the latter linear system for δn^(p) yields $δ n^{(p)} = - J_{F}^{- 1} (n^{(p)}) F (n^{(p)}),$ $\delta {n^{(p)}} = - {\bf{J}}_F^{ - 1}\left( {{n^{(p)}}} \right)F\left( {{n^{(p)}}} \right),$ (29)

from which we may compute the next iterate n^(p+1). This new estimate is a priori not a solution to F(n) = 0, although the iterative process ensures a quadratic convergence to a solution in the best cases (e.g., Dennis & Schnabel 1996). The initial guess n⁽⁰⁾ might be given, for instance, by the LTE or the radiation-free predictions of the population densities. The linearization introduced by the Newton-Raphson method only consists of a mean to solve the raw statistical equations, whereas RH92 solves a linearized approximated version of the problem.

2.2.2 Limitations of the Newton-Raphson method

We note that δn^(p) might also lead to a poorer estimation of the solution. This behavior can occur when the correction δn^(p) lies beyond the domain of linearity of the residual vector around the evaluation vector n^(p). A simple way to overcome this behavior is to limit the incremental vector with a dampening factor α and try α = 1,0.5,0.25,… until ||F(n^(p) + αδn^(p))|| < ||F(n^(p))||. This procedure is the simplest of the so-called line-search methods, although more elaborated ones exist (see Dennis & Schnabel 1996, for instance).

A second problem deals with the possibility to produce solution estimates with negative entries. While mathematically correct, a solution estimate with negative population densities is physically incorrect, and the solver may even overflow when solving the RTE. A possible solution to prevent negative entries consists of limiting the correction at each depth independently.

A third and inherent problem of the Newton-Raphson method deals with the quality of the initial guess. The initial guess is usually an important factor in the method convergence rate or even the failure of the method. The method does not converge or ventures outside the domain of definition of F when a poor starting point is given. The method may also be trapped in a local minimum of the residual vector, which can be difficult to spot. Several tools such as continuation methods can be used to build a robust solver based on the Newton-Raphson method. More details are given in Knoll & Keyes (2004).

The Newton-Raphson requires the knowledge of the inverse of a Jacobian matrix $J_{F}^{- 1}$ ${\bf{J}}_F^{ - 1}$ at each iterative step. Several issues can potentially arise when these quantities are computed, and we list them below.

Implementation: most problems do not have an analytical expression for J_F, and an approximation needs to be given (e.g., by finite differences). The convergence is therefore likely to be less than quadratic. In the worst case, the method may fail when the approximation is too coarse.
Storage: for large problems, storing J_F can be problematic.
Time consumption: inversion of J_F quickly becomes time-consuming as the size of the problem increases, considering traditional inversion routines such as Gauss-Jordan elimination. The computation of the full Jacobian might also be expensive.

The radiative transfer problem we considered disqualifies a classical Newton-Raphson method mainly because of the computational cost derived from building of J_F, even when using analytical expressions. The Newton-Raphson method therefore requires the information of the Jacobian matrices without building them or only building them partially, in order to keep an efficient solver. The next sections detail a way to bypass this thorny problem.

Fig. 1

Jacobian matrix structure for a N_ℓ-level atom problem. J_F is a (N_ℓN_z) × (N_ℓN_z) matrix that contains the derivatives of the residual vector components with respect to the population densities. This matrix can be considered as a N_z × N_z block matrix, where each block J^kℓ is a N_ℓ × N_ℓ matrix. J^kℓ stores the derivatives of F at depth index k with respect to the population densities at depth index ℓ.

2.3 Iterative inversion: Krylov methods

Equation (29) is equivalent to a linear system of the form Ax = b, where A = J_F(n^(p)), x = δn^(p) and b = −F(n^(p)). This linear system can be solved for x without inverting A using iterative approaches such as Krylov methods. In short, Krylov methods are used to solve large linear systems through projections onto a Krylov subspace K_s, $K_{s} = span (r_{0}, A r_{0}, A^{2} r_{0}, \dots, A^{s - 1} r_{0}),$ ${K_s} = {\rm{span}}\left( {{r_0},{\bf{A}}{r_0},{{\bf{A}}^2}{r_0}, \ldots ,{{\bf{A}}^{s - 1}}{r_0}} \right),$

where r₀ = b − Ax₀ is the initial residual vector built from the initial guess x₀. Since x is meant to represent a Newton-Raphson correction, a typical initial guess would be zero and thus r₀ = b (we considered that no correction is required initially). Another initial condition might be given by the solution to Px₀ = b, where P is a preconditioner (Sect. 2.4.3). Then, the solution is estimated as $x - x_{0} = \sum_{i = 1}^{s} κ_{i} q_{i},$ $x - {x_0} = \mathop {\mathop \sum \nolimits^ }\limits_s^{i = 1} {\kappa _i}{q_i},$ (30)

where the set of vectors (q₁,…, q_s) is a basis of K_s and (ĸ₁,…, ĸ_s) are the corresponding coordinates. The purpose of a Krylov method is therefore to construct a basis of K_s, and then to determine the corresponding scalars ĸ_i through projection methods. This construction is made iteratively in s iterations (one iteration per basis vector). Each iteration adds a new component to the solution estimate. The solution is sought such that ‖r‖₂ = ‖b − Ax‖₂ < δ‖r₀‖₂, where δ is a relative tolerance set by the user. We note that the given tolerance might be achieved in fewer than s iterations.

The size of the Krylov subspace is related to the precision of the solution that can be achieved, the latter also depending on the method employed. If s is too small, the desired tolerance level might not be achieved. On the other hand, when s is chosen to be equal to the size of A, a Krylov method will eventually converge to the exact solution in theory. In practice, round-off and truncation errors will limit the maximum precision that can be expected. From the definition of K_s, we note that only matrix-vector products are required in these methods, which is the keystone of Sect. 2.4.

A plethora of Krylov methods has been developed over the past decades, among which two popular techniques and their respective variants are broadly used in various physics problems. We briefly describe them below.

The generalized minimal residual method (GMRES) is usually based on the Arnoldi process or Householder transforms to produce orthonormalized bases. It was first developed in Saad & Schultz (1986) as an improvement and an extension to nonsymmetric matrices of the MINRES method developed by Paige & Saunders (1975). It is a very versatile linear system solver.
The bi-conjugate gradient stabilized (BiCGSTAB) is based on the Lanczos bi-orthogonalization procedure, which generates nonorthogonal bases. It was first developed in van der Vorst (1992) as a variant of the biconjugate gradient method (BiCG).

Both methods can be used with any invertible matrix. GMRES requires one matrix-vector product per iteration, whereas two are needed with BiCGSTAB, but the latter requires less memory than GMRES. This is especially true whenever the number of iterations required for convergence is large because BiCGSTAB uses a constant amount of memory per iteration and GMRES does not. The most crucial feature of GMRES is a strict decrease of the residual norm ‖r‖₂ throughout the iterative process, whereas the convergence behavior of BiCGSTAB is less regular (van der Vorst 1992).

2.4 Jacobian-free Newton-Krylov methods

2.4.1 Setup

We still did not address the problem of the Jacobian matrix estimation and the potential high computational cost it represents. Fortunately, Krylov methods applied to the Newton-Raphson iteration (Eq. (29)) only require the action of the Jacobian matrix J_F onto a generic vector v (see Sect. 2.3). The operation J_F(n^(p))v can be approximated using finite differences (e.g., Knoll & Keyes 2004), $J_{F} (n^{(p)}) ν = \frac{F (n^{(p)} + ϵ ν) - F (n^{(p)})}{ϵ} + 𝒪 (ϵ) forward$ ${{\bf{J}}_F}\left( {{n^{(p)}}} \right)v = {{F\left( {{n^{(p)}} + v} \right) - F\left( {{n^{(p)}}} \right)} \over } + O()\quad {\rm{forward}}$ (31) $J_{F} (n^{(p)}) ν = \frac{F (n^{(p)}) - F (n^{(p)} - ϵ ν)}{ϵ} + 𝒪 (ϵ) backward$ ${{\bf{J}}_F}\left( {{n^{(p)}}} \right)v = {{F\left( {{n^{(p)}}} \right) - F\left( {{n^{(p)}} - v} \right)} \over } + O()\quad {\rm{backward}}$ (32) $J_{F} (n^{(p)}) ν = \frac{F (n^{(p)} + ϵ ν) - F (n^{(p)} - ϵ ν)}{2 ϵ} + 𝒪 (ϵ^{2}) central,$ ${{\bf{J}}_F}\left( {{n^{(p)}}} \right)v = {{F\left( {{n^{(p)}} + v} \right) - F({n^{(p)}} - v)} \over {2}} + O\left( {{^2}} \right)\quad {\rm{central,}}$ (33)

where є is the difference step. These schemes do not use the Jacobian matrix, but rather extra calls of F, which is a huge computational and storage gain especially for large problems, but at the cost of precision. This is the keystone of Jacobian-free Newton-Krylov solvers (hereafter JFNK). These methods were first presented by Knoll & Keyes (2004). Since the residual vector F(n^(p)) is already computed and passed to the Krylov solver, first-order schemes (forward and backward) only require one fresh call of F. In comparison, the second-order scheme (central) requires two fresh calls of F, which is a major drawback. In our problem, every evaluation of F translates into solving the RTE for all rays at all frequencies for a given n. We also note that the finite-differences calculations in Eqs. (31)–(33) do not estimate the individual elements of the Jacobian matrix, but are instead used to directly estimate the matrix-vector product.

Since the Newton-Raphson method is iterative, the matrix-vector product estimates are only needed to be accurate enough to guarantee convergence. This is the main reason why the vast majority of JFNK solvers use first-order schemes rather than higher-order ones (Knoll & Keyes 2004). In practice, round-off and truncation errors may occur, and an optimal choice of є is hard to find. The first source of error is caused by the finite arithmetic precision of computers, and the second source of error is due to the limited accuracy of the scheme. Several empirical choices for є are further detailed in Knoll & Keyes (2004) to minimize both sources of error.

2.4.2 Augmentation of numerical precision with complex numbers

For the considered problem, the components of n^(p) cover a wide range of values. For instance, considering a two-level hydrogen atom in a FAL-C atmosphere (Fontenla et al. 1993), the LTE population densities cover the range 10⁴ up to 10¹⁷ cm⁻³. This leads to high values of є considering the empirical choices. A rescaling of the densities is possible to keep reasonable є values. However, the large span of densities remains problematic within the residual vector F itself and leads to round-off errors. Thus, the usual numerical derivative schemes are not suited for the given problem. Fortunately, there is an alternative scheme that uses complex numbers to increase the precision and dynamic range of the calculations, which is far less affected by round-off errors or substractive cancellations (e.g., Kan et al. 2022; Martins & Ning 2021), $J_{F} (n^{(p)}) ν = \frac{J 𝔐 [F (n^{(p)} + i ϵ ν)]}{ϵ} + 𝒪 (ϵ^{2})$ ${{\bf{J}}_F}\left( {{n^{(p)}}} \right)v = {{\left[ {F\left( {{n^{(p)}} + iv} \right)} \right]} \over } + O\left( {{^2}} \right)$ (34)

where i is the imaginary unit. This special scheme only requires one fresh call of F and is second-order accurate as the central difference scheme. It requires setting up the main routines for complex arithmetic operations, however. We used the linear piece-wise source function scheme described in Sect. 2.1 to integrate the RTE along rays, for which the modifications are straightforward. The remaining truncation error can be greatly attenuated by selecting a tiny є value. Martins & Ning (2021) recommend a typical value є ~ 10⁻²⁰⁰ for double-precision functions. The imaginary part is only used for the Krylov solver, otherwise only the (unperturbed) real part is considered. Since the imaginary part is typically very small compared to the real part, equations that involve the perturbations should be linearized (e.g., by computing the source function). This prevents introducing undesired arithmetic errors. Figure 2 points out a typical accuracy issue in the computation of the Jacobian matrices when using traditional schemes. The complex scheme alone can provide accurate estimates of Jacobian matrices for our given NLTE problem. For this sole reason, we used the complex scheme in our JFNK solvers.

Fig. 2

Jacobian matrix estimation error for the three-level Ca II setup, evaluated at the LTE populations and for the different schemes. The first-order scheme is the forward one. The difference steps є₁ and є₂ are common choices (Knoll & Keyes 2004) and read $ϵ_{1} = \sum_{i, k = 1}^{N_{ℓ}, N_{z}} b | {(n_{i}^{k})}^{(p)} | / (N_{ℓ} N_{z} ‖ ν ‖_{2}) + b$ ${_1} = \mathop \sum \limits_{i,k = 1}^{{N_\ell },{N_z}} b\left| {{{\left( {n_i^k} \right)}^{(p)}}} \right|/\left( {{N_\ell }{N_z}v{_2}} \right) + b$ , where b = 10⁻⁶, and $ϵ_{2} = ϵ_{mach}^{1 / r} \sqrt{1 + {‖ n^{(p)} ‖}_{2}} / ‖ ν ‖_{2}$ ${_2} = _{{\rm{mach}}}^{1/r}\sqrt {1 + {n^{(p)}}{_2}} /v{_2}$ , where є_mach is the machine epsilon for double-precision numbers, r = 2 for forward and backward schemes, and r = 3 for the central scheme. The logarithm of the absolute error (color bar) is clipped below −16.

2.4.3 Preconditioning

Although we have introduced an accurate method to evaluate the matrix-vector products J_F(n^(p))v, the Jacobian matrix may potentially be ill-conditioned. Consequently, the Krylov solver may require many iterations to converge to the desired tolerance because of the high condition number of J_F(n^(p)). Therefore, we propose preconditioning the Krylov solver in order to increase its efficiency. The preconditioning process consists of using a preconditioning matrix P (the preconditioner) such that J_F(n^(p))P⁻¹ (right preconditioning) or P⁻¹ J_F(n^(p)) (left preconditioning) has a lower condition number than J_F(n^(p)). The system to be solved by the Krylov solver is no longer given by Eq. (29) but rather by $(J_{F} (n^{(p)}) P^{- 1}) (P δ n^{(p)}) = - F (n^{(p)})$ $\left( {{{\bf{J}}_F}\left( {{n^{(p)}}} \right){{\bf{P}}^{ - 1}}} \right)\left( {{\bf{P}}\delta {n^{(p)}}} \right) = - F\left( {{n^{(p)}}} \right)$ (35)

for right preconditioning and by $(P^{- 1} J_{F} (n^{(p)})) δ n^{(p)} = - P^{- 1} F (n^{(p)})$ $\left( {{{\bf{P}}^{ - 1}}{{\bf{J}}_F}\left( {{n^{(p)}}} \right)} \right)\delta {n^{(p)}} = - {{\bf{P}}^{ - 1}}F\left( {{n^{(p)}}} \right)$ (36)

for left preconditioning. The preconditioned system is expected to be solved in fewer iterations than the original system. Equation (35) is solved for y = Pδn^(p) first then for δn^(p) If the right preconditioning is used, the matrix-vector product scheme (Eq. (34)) can be adapted for Eq. (35) and reads $J_{F} (n^{(p)}) (P^{- 1} ν) = \frac{ℑ m [F (n^{(p)} + i ϵ (P^{- 1} ν))]}{ϵ} + 𝒪 (ϵ^{2}) .$ ${{\bf{J}}_F}\left( {{n^{(p)}}} \right)\left( {{{\bf{P}}^{ - 1}}v} \right) = {{\Im \left[ {F\left( {{n^{(p)}} + i\left( {{{\bf{P}}^{ - 1}}v} \right)} \right)} \right]} \over } + O\left( {{^2}} \right).$ (37)

If the left preconditioning is chosen, Eq. (34) is used directly, and then the result is left-multiplied by P⁻¹. The second member passed to the Krylov solver in this case is −P⁻¹F(n^(p)). In our JFNK solvers, we used the left preconditioning because the right preconditioning involves an additional inversion step. Because the preconditioner is meant to improve the overall performance of the inversion process, P should be easy to calculate and invert. A commonly used algebraic preconditioner is given by the diagonal or block diagonal of the matrix that is to be inverted (easy inversion and matrix-vector multiplication). This simple matrix is also known as a Jacobi (or block-Jacobi) preconditioner. This choice is particularly interesting in our case because the block diagonal of J_F(n^(p)) is as costly to compute as a call of F (see Sects. 2.1 and 2.5.2). We note, however, that other physics-based preconditioners could be very relevant for our case, as presented in Janett et al. (2024) for solving linear problems. The calculation of such a preconditioner can be performed when calculating F(n^(p)), thus reusing most of the variables that were already computed to obtain the residual vector. Algorithm 2 illustrates the final JFNK solving routine.

2.5 Analytical Jacobian matrix

2.5.1 Derivation

In this part, we derive the expressions of the Jacobian matrix elements as a function of the population densities. In principle, these equations could be used to compute the fully analytical Jacobian matrix. While the calculation of the full Jacobian is expensive, it is at least important to detail these derivations for preconditioning purposes (Sect. 2.4). A more general derivation is provided in Milić & van Noort (2017) for derivatives according to any atmospheric parameter. The Jacobian element $J_{i j}^{k ℓ}$ $J_{ij}^{k\ell }$ can be calculated as follows: $J_{i j}^{k ℓ} = (\frac{\partial F_{i}^{k}}{\partial n_{j}^{ℓ}}) = \frac{\partial}{\partial n_{j}^{ℓ}} (\sum_{p} {n_{i}^{k} (C_{i p}^{k} + R_{i p}^{k}) - n_{p}^{k} (C_{p i}^{k} + R_{p i}^{k})})$ $J_{ij}^{k\ell } = \left( {{{\partial F_i^k} \over {\partial n_j^\ell }}} \right) = {\partial \over {\partial n_j^\ell }}\left( {\mathop {\mathop \sum \nolimits^ }\limits_p \left\{ {n_i^k\left( {C_{ip}^k + R_{ip}^k} \right) - n_p^k\left( {C_{pi}^k + R_{pi}^k} \right)} \right\}} \right)$ (38)

unless $F_{i}^{k}$ $F_i^k$ is a particle conservation equation, in which case the Jacobian element is simply given by $J_{i j}^{k ℓ} = \frac{\partial}{\partial n_{j}^{ℓ}} (n_{tot}^{k} - \sum_{p} n_{p}^{k}) = - δ_{k ℓ},$ $J_{ij}^{k\ell } = {\partial \over {\partial n_j^\ell }}\left( {n_{{\rm{tot}}}^k - \mathop {\mathop \sum \nolimits^ }\limits_p n_p^k} \right) = - {\delta _{k\ell }},$ (39)

where we used the fact that the population densities $n_{i}^{k}$ $n_i^k$ are considered to be independent variables and that $n_{tot}^{k}$ $n_{{\rm{tot}}}^k$ is kept constant, and therefore, $\frac{\partial n_{i}^{k}}{\partial n_{j}^{ℓ}} = δ_{k ℓ} δ_{i j} .$ ${{\partial n_i^k} \over {\partial n_j^\ell }} = {\delta _{k\ell }}{\delta _{ij}}.$ (40)

Algorithm 2JFNK solver

In Eq. (38), the collisional coefficients do not depend on the population densities. Applying the chain rule to Eq. (38) yields $J_{i j}^{k ℓ} = (\frac{\partial F_{i}^{k}}{\partial n_{j}^{ℓ}}) = δ_{k ℓ} Γ_{i j}^{k} + A_{i j}^{k ℓ}$ $J_{ij}^{k\ell } = \left( {{{\partial F_i^k} \over {\partial n_j^\ell }}} \right) = {\delta _{k\ell }}{\rm{\Gamma }}_{ij}^k + A_{ij}^{k\ell }$ (41)

where $Γ_{i j}^{k} = δ_{i j} \sum_{p} (C_{i p}^{k} + R_{i p}^{k}) - (C_{j i}^{k} + R_{j i}^{k})$ ${\rm{\Gamma }}_{ij}^k = {\delta _{ij}}\mathop {\mathop \sum \nolimits^ }\limits_p \left( {C_{ip}^k + R_{ip}^k} \right) - \left( {C_{ji}^k + R_{ji}^k} \right)$ (42) $A_{i j}^{k ℓ} = \frac{1}{2} {\int^{}}_{- 1}^{1} d μ {\int^{}}_{0}^{\infty} \frac{d ν}{h ν} α_{i}^{k} (\frac{\partial I_{μ ν}^{k}}{\partial n_{j}^{ℓ}})$ $A_{ij}^{k\ell } = {1 \over 2}{\mathop \smallint \nolimits^ ^}_{ - 1}^1d\mu {\mathop \smallint \nolimits^ ^}_0^\infty {{dv} \over {hv}}\alpha _i^k\left( {{{\partial I_{\mu v}^k} \over {\partial n_j^\ell }}} \right)$ (43) $α_{i}^{k} = \underset{p > i}{\sum^{}} {V_{i p}^{k} (n_{i}^{k} - g_{i p}^{k} n_{p}^{k})} - \underset{p < i}{\sum^{}} {V_{p i}^{k} (n_{p}^{k} - g_{p i}^{k} n_{i}^{k})}$ $\alpha _i^k = \mathop {{{\mathop \sum \nolimits^ }^}}\limits_{p > i} \left\{ {V_{ip}^k\left( {n_i^k - g_{ip}^kn_p^k} \right)} \right\} - \mathop {{{\mathop \sum \nolimits^ }^}}\limits_{p < i} \left\{ {V_{pi}^k\left( {n_p^k - g_{pi}^kn_i^k} \right)} \right\}$ (44)

by noting that the derivative of the radiative rates only involves the derivative of the intensity $I_{μ ν}^{k}$ $I_{\mu \nu }^k$ with respect to $n_{j}^{ℓ}$ $n_j^\ell$ . The extinction profile within each coefficient $V_{i p}^{k}$ $V_{ip}^k$ is considered to be independent with respect to the population densities. Then, we can expand the derivative and further write $(\frac{\partial I_{μ ν}^{k}}{\partial n_{j}^{ℓ}}) = \underset{p}{\sum^{}} {(\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{p}}) (\frac{\partial χ_{μ ν}^{p}}{\partial n_{j}^{ℓ}}) + (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{p}}) (\frac{\partial η_{μ ν}^{p}}{\partial n_{j}^{ℓ}})}$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial n_j^\ell }}} \right) = \mathop {{{\mathop \sum \nolimits^ }^}}\limits_p \left\{ {\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^p}}} \right)\left( {{{\partial \chi _{\mu \nu }^p} \over {\partial n_j^\ell }}} \right) + \left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^p}}} \right)\left( {{{\partial \eta _{\mu \nu }^p} \over {\partial n_j^\ell }}} \right)} \right\}$ (45)

because the intensity, through the RTE, is only a function of the optical depth and the source function. We further develop Eq. (45) and write $(\frac{\partial χ_{μ ν}^{p}}{\partial n_{j}^{ℓ}}) = δ_{p ℓ} β_{j}^{ℓ}$ $\left( {{{\partial \chi _{\mu v}^p} \over {\partial n_j^\ell }}} \right) = {\delta _{p\ell }}\beta _j^\ell$ (46) $(\frac{\partial η_{μ ν}^{p}}{\partial n_{j}^{ℓ}}) = δ_{p ℓ} γ_{j}^{ℓ} + χ_{sca}^{p} (\frac{\partial J_{ν}^{p}}{\partial n_{j}^{ℓ}})$ $\left( {{{\partial \eta _{\mu \nu }^p} \over {\partial n_j^\ell }}} \right) = {\delta _{p\ell }}\gamma _j^\ell + \chi _{{\rm{sca}}}^p\left( {{{\partial J_v^p} \over {\partial n_j^\ell }}} \right)$ (47)

where $β_{j}^{ℓ} = \underset{s > j}{\sum^{}} V_{j s}^{ℓ} (1 - g_{j s}^{ℓ})$ $\beta _j^\ell = \mathop {{{\mathop \sum \nolimits^ }^}}\limits_{s > j} V_{js}^\ell \left( {1 - g_{js}^\ell } \right)$ (48) $γ_{j}^{ℓ} = \underset{s < j}{\sum^{}} (\frac{2 h ν^{3}}{c^{2}}) V_{s j}^{ℓ} g_{s j}^{ℓ},$ $\gamma _j^\ell = \mathop {{{\mathop \sum \nolimits^ }^}}\limits_{s < j} \left( {{{2h{v^3}} \over {{c^2}}}} \right)V_{sj}^\ell g_{sj}^\ell ,$ (49)

so that Eq. (45) therefore reduces to $(\frac{\partial I_{μ ν}^{k}}{\partial n_{j}^{ℓ}}) = β_{j}^{ℓ} (\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{ℓ}}) + γ_{j}^{ℓ} (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{ℓ}}) + \underset{p}{\sum^{}} χ_{sca}^{p} (\frac{\partial J_{ν}^{p}}{\partial n_{j}^{ℓ}}) (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{p}}) .$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial n_j^\ell }}} \right) = \beta _j^\ell \left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^\ell }}} \right) + \gamma _j^\ell \left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^\ell }}} \right) + \mathop {{{\mathop \sum \nolimits^ }^}}\limits_p \chi _{{\rm{sca}}}^p\left( {{{\partial J_v^p} \over {\partial n_j^\ell }}} \right)\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^p}}} \right).$ (50)

Equation(50) consists of a linear contribution of the intensity and a summation of nonlinear cross terms due to the background scattering. Both derivatives involving $I_{μ ν}^{k}$ $I_{\mu \nu }^k$ depend on the scheme chosen to solve the RTE. Since we used the linear piece-wise source function scheme detailed in Sect. 2.1, the corresponding expressions for the derivatives are $(\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{ℓ}}) = (\frac{\partial I_{μ ν}^{k + 1}}{\partial χ_{μ ν}^{ℓ}}) e^{- Δ τ_{μ ν}^{k}} + a_{χ}^{k} δ_{k ℓ} + b_{χ}^{k} δ_{k + 1 ℓ},$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^\ell }}} \right) = \left( {{{\partial I_{\mu \nu }^{k + 1}} \over {\partial \chi _{\mu v}^\ell }}} \right){e^{ - {\rm{\Delta }}\tau _{\mu v}^k}} + a_\chi ^k{\delta _{k\ell }} + b_\chi ^k{\delta _{k + 1\ell }},$ (51) $(\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{ℓ}}) = (\frac{\partial I_{μ ν}^{k + 1}}{\partial η_{μ ν}^{ℓ}}) e^{- Δ τ_{μ ν}^{k}} + a_{η}^{k} δ_{k ℓ} + b_{η}^{k} δ_{k + 1 ℓ},$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^\ell }}} \right) = \left( {{{\partial I_{\mu v}^{k + 1}} \over {\partial \eta _{\mu \nu }^\ell }}} \right){e^{ - {\rm{\Delta }}\tau _{\mu v}^k}} + a_\eta ^k{\delta _{k\ell }} + b_\eta ^k{\delta _{k + 1\ell }},$ (52)

for outgoing rays (µ > 0), and $(\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{ℓ}}) = (\frac{\partial I_{μ ν}^{k - 1}}{\partial χ_{μ ν}^{ℓ}}) e^{- Δ τ_{μ ν}^{k - 1}} + c_{χ}^{k} δ_{k ℓ} + d_{χ}^{k} δ_{k - 1 ℓ},$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^\ell }}} \right) = \left( {{{\partial I_{\mu \nu }^{k - 1}} \over {\partial \chi _{\mu \nu }^\ell }}} \right){e^{ - {\rm{\Delta }}\tau _{\mu \nu }^{k - 1}}} + c_\chi ^k{\delta _{k\ell }} + d_\chi ^k{\delta _{k - 1\ell }},$ (53) $(\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{ℓ}}) = (\frac{\partial I_{μ ν}^{k - 1}}{\partial η_{μ ν}^{ℓ}}) e^{- Δ τ_{μ ν}^{k - 1}} + c_{η}^{k} δ_{k ℓ} + d_{η}^{k} δ_{k - 1 ℓ},$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^\ell }}} \right) = \left( {{{\partial I_{\mu \nu }^{k - 1}} \over {\partial \eta _{\mu \nu }^\ell }}} \right){e^{ - {\rm{\Delta }}\tau _{\mu \nu }^{k - 1}}} + c_\eta ^k{\delta _{k\ell }} + d_\eta ^k{\delta _{k - 1\ell }},$ (54)

for ingoing rays (µ < 0), where the expressions of the different involved coefficients are given in Appendix A.2. These coefficients can be constructed from the variables used when solving the RTE to save computational time. The boundary conditions for this scheme (Sect. 2.1) imply that $(\frac{\partial I_{μ ν}^{N_{z}}}{\partial χ_{μ ν}^{ℓ}}) = 0, (\frac{\partial I_{μ ν}^{N_{z}}}{\partial η_{μ ν}^{ℓ}}) = 0 μ > 0$ $\left( {{{\partial I_{\mu \nu }^{{N_z}}} \over {\partial \chi _{\mu \nu }^\ell }}} \right) = 0,\quad \left( {{{\partial I_{\mu \nu }^{{N_z}}} \over {\partial \eta _{\mu \nu }^\ell }}} \right) = 0\quad \mu > 0$ (55) $(\frac{\partial I_{μ ν}^{1}}{\partial χ_{μ ν}^{ℓ}}) = 0, (\frac{\partial I_{μ ν}^{1}}{\partial η_{μ ν}^{ℓ}}) = 0 μ < 0$ $\left( {{{\partial I_{\mu \nu }^1} \over {\partial \chi _{\mu \nu }^\ell }}} \right) = 0,\quad \left( {{{\partial I_{\mu \nu }^1} \over {\partial \eta _{\mu \nu }^\ell }}} \right) = 0\quad \mu < 0$ (56)

for each value of ℓ. It can be shown that Eqs. (51)–(54) define two upper (µ > 0) and two lower (µ < 0) triangular matrices, $(\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{ℓ}}) = {\begin{array}{l} 0, & ℓ < k \\ a_{χ}^{k}, & ℓ = k \\ b_{χ}^{ℓ - 1} + a_{χ}^{ℓ} e^{- Δ τ_{μ ν}^{ℓ - 1}}) \prod_{i = k}^{ℓ - 2} e^{- Δ τ_{μ ν}^{i}} & ℓ > k \end{array}$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^\ell }}} \right) = \{ \matrix{ {0,} \hfill & {\ell < k} \hfill \cr {a_\chi ^k,} \hfill & {\ell = k} \hfill \cr {b_\chi ^{\ell - 1} + a_\chi ^\ell {e^{ - \Delta \tau _{\mu \nu }^{\ell - 1}}})\mathop \prod \limits_{i = k}^{\ell - 2} {e^{ - \Delta \tau _{\mu \nu }^i}}} \hfill & {\ell > k} \hfill \cr }$ (57)

for µ > 0, and $(\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{ℓ}}) = {\begin{array}{l} 0, & ℓ > k \\ c_{χ}^{k}, & ℓ = k \\ (d_{χ}^{ℓ + 1} + c_{χ}^{ℓ} e^{- Δ τ_{μ ν}^{ℓ}}) \prod_{i = ℓ + 1}^{k - 1} e^{- Δ τ_{μ ν}^{i}} & ℓ < k \end{array}$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^\ell }}} \right) = \{ \matrix{ {0,} \hfill & {\ell > k} \hfill \cr {c_\chi ^k,} \hfill & {\ell = k} \hfill \cr {\left( {d_\chi ^{\ell + 1} + c_\chi ^\ell {e^{ - \Delta \tau _{\mu \nu }^\ell }}} \right)\mathop \prod \limits_{i = \ell + 1}^{k - 1} {e^{ - \Delta \tau _{\mu \nu }^i}}} \hfill & {\ell < k} \hfill \cr }$ (58)

for µ < 0. These expressions are valid for internal depth points. The matrices related to the derivatives with respect to the emis-sivities are obtained by using the associated coefficients. These matrices can be understood as Jacobian matrices of the intensity with respect to opacities and emissivities.

The last part to detail deals with the derivative of the mean intensity term. Through its definition, we note that this quantity involves derivatives of the intensity with respect to the population densities. In brief, the full expansion of Eq. (50) consists of an intricate self-consistent but linear system of the derivatives of the intensity with respect to the population densities. It is possible, however, to find a solution to this system that can be written as $(\frac{\partial I_{μ ν}^{k}}{\partial n_{j}^{ℓ}}) = β_{j}^{ℓ} (\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{ℓ}}) + γ_{j}^{ℓ} (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{ℓ}}) + \sum_{p} χ_{sca}^{p} r_{p} (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{p}}),$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial n_j^\ell }}} \right) = \beta _j^\ell \left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^\ell }}} \right) + \gamma _j^\ell \left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^\ell }}} \right) + \mathop \sum \limits_p \chi _{{\rm{sca}}}^p{r_p}\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^p}}} \right),$ (59)

where the coefficients r_p and the solution derivation can be found in Appendix B. In practice, the background-scattering contribution to the derivatives is usually very weak and can therefore be neglected.

2.5.2 Preconditioning of JFNK with the analytical Jacobian matrix

Preconditioning the JFNK solver is troublesome when the analytical derivation of the Jacobian matrix is used. It is expensive to compute Eq. (50) when the interest is in the Jacobi pre-conditioner alone, because all off-diagonal terms need to be calculated. This issue arises from the background-scattering contribution, and we therefore have to deal with the following in order to obtain the preconditioner:

The summation in Eq. (50) involves all cross terms (k ≠ p terms).
The presence of the mean intensity in the scattering term also involves all cross terms (Sect. 2.5).

Therefore, only an approximate Jacobi preconditioner can be used at the cost of precision and potentially convergence rate of the Krylov solver. We give three simple solutions to overcome both problems and still provide a preconditioner that dramatically improves the convergence properties of the Krylov solver:

The very first solution consists of disregarding the background-scattering contribution. In this case, Eq. (50) becomes for k = ℓ $(\frac{\partial I_{μ ν}^{k}}{\partial n_{j}^{k}}) = β_{j}^{k} (\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{k}}) + γ_{j}^{k} (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{k}}) .$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial n_j^k}}} \right) = \beta _j^k\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^k}}} \right) + \gamma _j^k\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^k}}} \right).$ (60)
The second solution consists of discarding all of the cross terms to only keep the local one. The preconditioner further reduces to a local operator when we use $(\frac{\partial I_{μ ν}^{k}}{\partial n_{j}^{k}}) = β_{j}^{k} (\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{k}}) + [γ_{j}^{k} + χ_{sca}^{k} {(\frac{\partial J_{ν}^{k}}{\partial n_{j}^{k}})}^{†}] (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{k}}),$ $\left( {{{\partial I_{\mu \nu }^k} \over {\partial n_j^k}}} \right) = \beta _j^k\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^k}}} \right) + \left[ {\gamma _j^k + \chi _{{\rm{sca}}}^k{{\left( {{{\partial J_v^k} \over {\partial n_j^k}}} \right)}^\dag }} \right]\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^k}}} \right),$ (61)

where a previous estimate of the derivative of the mean intensity is used instead of the current one. This estimate is easily computed for the next iteration by an integration of Eq. (61) over all angles. This quantity can be initialized to zero (zero radiation initial guess) or the mean intensity given by the LTE solution can be calculated.
The last solution, which is the least constraining but requires additional operations, uses the solution given by Eq. (59) while only considering the local terms. The corresponding operator then has $(\frac{\partial I_{μ ν}^{k}}{\partial n_{j}^{k}}) = β_{j}^{k} (\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{k}}) + [γ_{j}^{k} + χ_{sca}^{k} r_{k}] (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{k}}),$ $\left( {{{\partial I_{\mu v}^k} \over {\partial n_j^k}}} \right) = \beta _j^k\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^k}}} \right) + \left[ {\gamma _j^k + \chi _{{\rm{sca}}}^k{r_k}} \right]\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^k}}} \right),$ (62)

where the coefficient r_k is $r_{k} = \frac{\sum_{}_{μ} ω_{μ} [β_{j}^{k} (\frac{\partial I_{μ ν}^{k}}{\partial χ_{μ ν}^{k}}) + γ_{j}^{k} (\frac{\partial I_{μ ν}^{k}}{\partial η_{μ ν}^{k}})]}{1 - \sum_{}_{μ} ω_{μ} χ_{sca}^{k} (\frac{\partial \partial_{μ ν}^{k}}{\partial η_{μ ν}^{k}})} .$ ${r_k} = {{{{\mathop \sum \limits_{} }_\mu }{\omega _\mu }\left[ {\beta _j^k\left( {{{\partial I_{\mu \nu }^k} \over {\partial \chi _{\mu \nu }^k}}} \right) + \gamma _j^k\left( {{{\partial I_{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^k}}} \right)} \right]} \over {1 - {{\mathop \sum \limits_{} }_\mu }{\omega _\mu }\chi _{{\rm{sca}}}^k\left( {{{\partial \partial _{\mu \nu }^k} \over {\partial \eta _{\mu \nu }^k}}} \right)}}.$ (63)

A preconditioner that follows one of the three solutions only requires the calculation of ℓ = k terms in Eq. (50), which are only built from the coefficients $a_{χ}^{k}, a_{η}^{k}, c_{χ}^{k}$ $a_\chi ^k,a_\eta ^k,c_\chi ^k$ , and $c_{η}^{k}$ $c_\eta ^k$ given in Appendix A.2. Furthermore, it can be calculated in a comparable number of operations as the residual vector F. In our JFNK solvers, we use the first solution to calculate the preconditioner since the scattering terms are negligible in Eq. (59) compared to the other contributions, at least with the setups we considered (Sect. 3.1).

3 Results and discussion

We have ported a simplified version of the excellent RH code (Uitenbroek 2001) to Python. The latter solves the statistical equilibrium equations using the RH92 method, but with the possibility of including partial redistribution effects of scattered photons (PRD). This Python version does not include PRD effects, and it is significantly slower than the C-version of RH. It is implemented using modern programming constructions, however, such as classes and inheritance, which has been extremely useful in the implementation of our proof-of-concept JFNK solver because we were able to reuse most of the atom structures, opacity calculation routines, and formal solvers of the transport equation. We used the RH92 method as a reference in order to evaluate the performance of our JFNK method. We analyzed the convergence properties of different schemes and therefore did not include Ng-acceleration in our calculations (Ng 1974), which is implemented in the RH code.

3.1 Setup

All the results presented in this paper were computed using a FAL-C model atmosphere of the solar photosphere, chromosphere, and transition region (Fontenla et al. 1993), which consists of 82 depths points covering the interval τ₅₀₀ = [10⁻¹⁰,23], where τ₅₀₀ is the optical depth at λ = 500 nm. Figure 3 depicts the stratifications of the gas temperature, electron density, and total hydrogen density. The atmosphere does not have a native line-of-sight velocity profile.

Three different atomic setups were used, consisting of HI and Can. The different transitions are listed in Table 1. Each setup also included collisional and bound-free transitions (Shull & van Steenberg 1982; Burgess & Chidichimo 1983; Arnaud & Rothenflug 1985). The absorption and emission profiles of each line at each location were modeled with the Voigt function, which depends on the Doppler width and the damping parameter. The latter includes radiative, Stark, linear Stark (in calculations with H atoms), and van der Waals broadening. The angular integrals were discretized using a five-point Gauss-Legendre quadrature.

We performed our calculations using the Newton-Raphson, two JFNK (using GMRES and BiCGSTAB , respectively) and the RH92 methods. Both JFNK solvers systematically used the analytical Jacobi left preconditioner (Sects. 2.4.3 and 2.5.2). All solvers required an exit condition that defined a good convergence. In our case, we kept track of the residual norm ||F||_∞ and of the population relative change norm $‖ δ n / \bar{n} ‖_{\infty} = \frac{1}{2} ‖ (n_{new} - n_{prev}) / (n_{new} + n_{prev}) ‖_{\infty}$ $\delta n/\overline n {_\infty } = {1 \over 2}({n_{{\rm{new}}}} - {n_{{\rm{prev}}}})/\left( {{n_{{\rm{new}}}} + {n_{{\rm{prev}}}}} \right){_\infty }$ . Unless mentioned otherwise, we assumed that a method has converged when $‖ δ n / \bar{n} ‖_{\infty} < 10^{- 4}$ $\delta n/\overline n {_\infty } < {10^{ - 4}}$ . This condition was sufficient most of the time, although we also imposed a minimal drop of ||F||_∞ by two magnitudes to avoid premature exits.

Table 1

Summary of the different atom setups.

Fig. 3

FAL-C atmosphere. The temperature and total hydrogen and electron densities are shown. The height dimension origin (z = 0) corresponds tO τ₅₀₀ = 1.

3.2 Krylov solver tolerance impact

The Krylov solver that is internally used in the JFNK method can generally have a number of parameters that must be chosen for a given run (e.g., the size of the subspace or the convergence criteria). In the case of simple solvers such as GMRES or BiCGSTAB, this set of parameters reduces to the accuracy to which the solution is desired. This single parameter steers the behavior of the JFNK method and its convergence properties. Thus, we investigated the impact of the Krylov solver accuracy on the stability and the efficiency of the JFNK method.

Our first test was to assess the ability of a JFNK solver to match the Newton-Raphson solution. We would expect minimal differences between the two solvers as the accuracy of the Krylov solver increases. Figure 4 shows the convergence properties of our solvers in the case of the six-level HI atom setup. Several accuracy levels are displayed to highlight the evolution of the discrepancies between the different methods. The Newton-Raphson solver outperforms JFNK solvers in every situation. By truncating the precision of the correction provided by the Krylov solver, each JFNK iteration becomes less accurate, usually leading to additional iterations (compared to the standard Newton-Raphson case) in order to achieve a given convergence level in the population densities. The differences between the two methods decrease when the precision of the Krylov solver is substantially increased. Both JFNK solvers display similar results and converge to the same solution. They show the same behavior as the Newton-Raphson solver when r_tol ~ 10⁻⁴, which validates the implementation of our solvers.

A second chart is presented in Fig. 5 and directly compares our iterative solvers with the RH92 solver. In the JFNK method, we would expect a range of tolerances for which the method is optimal (with respect to the residual function calls):

Smaller tolerances result in more precise estimates of the inverse of the Jacobian. The JFNK method therefore requires fewer Newton-Raphson iterations, but the overall number of calls to F is nonetheless higher because the additional accuracy is not effective enough in the convergence of the JFNK method.
Higher tolerances result in coarse estimates of the inverse of the Jacobian. Even though the number of calls to F is reduced, Newton-Raphson iterations usually yield poorer corrections. Therefore, the JFNK method requires additional Newton-Raphson iterations to converge. Overall, the solver requires more calls to F.
The optimal range thus consists of a trade-off between the accuracy of the inverse of the Jacobian and the calls to F needed to obtain them.

We note that overly coarse estimates of the inverse of the Jacobian can yield an unstable behavior throughout Newton-Raphson iterations. We found a few situations in which this feature can be helpful to escape potential local minima of the residual vector, and the method can even converge in fewer calls to F. These inaccurate corrections will probably not lead to the convergence of the JFNK method, however. For the sake of stability, we recommend to use a Krylov relative tolerance smaller than ~10⁻².

Figure 5 (top) highlights for the three-level Ca II setup an optimal range spanning from r_tol ~ 3 × 10⁻⁴ to ~3 × 10⁻². In this range, the JFNK (GMRES) solver outperforms the RH92 solver, whereas the JFNK (BiCGSTAB) solver outperforms the latter in a much narrower range. It is also possible to witness the irregular convergence behavior of the JFNK (BiCGSTAB) solver because the corresponding curve is less smooth than that of the JFNK (GMRES) solver. In the case of a six-level setup (Fig. 5 bottom), there is no such optimal range. Instead, the number of residual vector calls decreases almost monotonically with the Krylov solver tolerance. Both JFNK solvers outperform the RH92 solver for Krylov relative tolerances higher than ~3 × 10⁻³ (GMRES) and ~1 × 10⁻² (BiCGSTAB). The JFNK (GMRES) solver has always proven to outperform the JFNK (BiCGSTAB) solver for most Krylov tolerances and various setups, and it outperforms the RH92 solver in a wider range of Krylov tolerances than the BiCGSTAB counterpart. The residual vector calls from the JFNK / Newton-Raphson solvers and the iterations of the RH92 solver are different, however. Therefore, Fig. 5 only makes sense when the two are comparable in operations or execution time, which is the case for our setups (see Sect. 3.3). Thus, the Jacobi preconditioner allows the JFNK method to be more efficient than the RH92 method when it is used optimally. The preconditioner is less efficient for a setup consisting of many frequencies, however, such as the six-level Ca II or HI ones. Moreover, the lack of an optimal range of Krylov tolerances indicates that the Jacobi preconditioner is not enough for setups like this. This statement also includes every setup with strong scattering.

Fig. 4

Comparison of the Newton-Raphson method with JFNK routines for the six-level HI setup (zero radiation initial guess). As the Krylov solver relative tolerance r_tol becomes small, the discrepancy between the different methods reduces to truncation and round-off errors.

Fig. 5

Residual vector calls required for convergence as a function of the Krylov solver relative tolerance. Top: Three-level Ca II setup. Bottom: Six-level Ca II setup. Both setups use the LTE initial condition.

3.3 Performance of the solver

Table 2 shows the average execution time per call to F (or equiv-alently, for the RH92 solver) for different setups. The pure call to F is always slightly faster to execute by the JFNK solvers than the RH92 solver equivalent because the RH92 solver method itself requires the computation of cross-coupling terms and the remaining rate matrix elements in order to update the population densities. On the other hand, computing the residual vector and updating the preconditioner (JFNK) requires approximately twice the time of a pure F call (by a JFNK solver), which was expected. The preconditioner update call, while more time-consuming than the RH92 solver equivalent, is only performed once per Newton-Raphson iteration. The main contribution to the execution time is due to the Krylov solver calls, and therefore, to pure residual vector estimates. As a result, it can be shown that the JFNK solver calls, as implemented in our proof-of-concept code, do require slightly less time on average than RH92 calls, even for extremely suboptimal Krylov tolerances.

In the following part, we compare the quality of the solutions provided by JFNK solvers with the reference RH92 case. The convergence condition is usually given by a sufficiently small change in the population densities. For this purpose, we used the JFNK residual norm ||F||_∞ as the metric because it is derived from the raw equations we attempted to solve, although the RH92 solver is not designed to minimize the raw residual norm. Moreover, the residual norm might stall when using RH92 because of the deepest layers of the atmosphere. The medium indeed becomes strongly collisional, and therefore, the radiative contribution and the changes that may occur during the solving process do not have a significant impact. Nevertheless, we disregarded these layers in the estimation of the residual norm because they are close to LTE. In the following, the residual norm is evaluated considering only the first 50 points (z > 700 km) of our atmosphere, where the chromosphere is located. A very large error in the RH92 curve is obtained when this is neglected, and the error is mostly driven by deeper layers where LTE should hold. This behavior was absent in the JFNK calculations.

Figures 6 and 7 show this clamped residual norm as well as the population change norm for the Ca II and the H I setups, respectively. It is clear that the RH92 solver displays a lower convergence rate for the largest part of the solving process. JFNK solvers, on the other hand, are outperformed at the beginning before the convergence rate surpasses that of RH92 (Ca II) or equals the latter (H I). As an outcome, JFNK solvers can perform better than the RH92 solver, especially when the initial guess is close enough to the solution. Moreover, the two figures show that the solution provided by the JFNK solvers is one hundred times more precise than that of RH92. We recall that the success condition is dictated by the population change norm and not by the residual norm.

The size of the population change norm is a poor criterion for convergence. Figure 8 shows that the maximum error in the rate equations for the JFNK solver is lower than in the RH92 case for the same correction size. The JFNK solver achieves a lower absolute error in the residual norm than RH92 for any given convergence condition set on the size of the population change norm. This is expected because the size of the correction per iteration is affected by the efficiency of the solver: For example, in the extreme case of a traditional lambda iteration, this leads to very small corrections with a very large error in the rate equations (i.e., residual norm), whereas in an accelerated lambda iteration, the convergence is comparatively more efficient.

Finally, we provide in Figs. 9-10 the spectra of the six-level Ca II and six-level H I setups, respectively. In both cases, the emerging spectra predicted by the RH92 and the JFNK solvers are essentially identical. We note that the additional accuracy of the solution provided by a JFNK solver does not yield noticeable changes in the emerging intensity, and we can therefore decide to terminate the solving process earlier and still output a similar result.

Table 2

Average execution time per call for the JFNK and the RH92 solvers.

Fig. 6

Residual and population change norms during the solving process of the six-level Ca II setup. The initial population densities are the LTE ones. Both JFNK solvers used a Krylov relative tolerance of 10⁻². Each cross marker corresponds to a Newton-Raphson iteration.

3.4 Calculations with velocity gradients

In this section, we show that the JFNK solver can properly handle nonstatic atmospheres with velocity gradients as a function of depth. We modified the FAL-C model atmosphere by introducing a sharp velocity gradient around z = 1000 km, corresponding to the lower chromosphere. Velocity gradients can be problematic when the velocity jump between consecutive grid cells is larger than approximately one-third of the Doppler width (Ibgui et al. 2013). Under these circumstances, the discretization of the RTE can lead to artifacts in the intensity.

In order to avoid numerical artifacts in the calculation of the intensity, we performed a depth optimization by placing more points where the gradients in temperature, density, optical depth or line-of-sight velocity are large. All quantities were interpolated to the new depth grid by linear interpolation. The total number of depth points was kept equal to that in the original model. This method is essentially an extension of the depth-optimization included in the Multi code (Carlsson 1986), which now also accounts for the presence of velocity gradients. The upper left panel in Fig. 11 illustrates the artificial velocity gradient represented in the optimized grid.

When the velocity gradient is properly sampled with sufficient depth points, there is no fundamental reason why any of the algorithms would perform very differently than in the static case. Our convergence plots in Fig. 11 show a very similar behavior than those in Fig. 6 for the Ca II atom. After a few iterations, the residual norm ||F||_∞ is lower than in the RH92 curve, whereas the population change norm ||δn/n||_∞ is larger. After approximately 80 iterations, the RH92 method has achieved a convergence (in the residual norm) that is similar to that of the GMRES after 50 iterations.

The emerging intensity spectrum now shows strong asymmetries around the core of all chromospheric lines, which become progressively more blueshifted by the presence of the positive velocity gradient at the base of the chromosphere. The Ca II H&K lines (3968 Å and 3934 Å) show the well-known enhancement of one of the k2 peaks (in the case of the red wing) because the blueshifted line profile in the core frequencies leaves an opacity gap in the red wing of the line where photons can escape more efficiently compared to the static case (Scharmer 1984).

Fig. 7

Residual and population change norms during the solving process of the six-level HI setup. The initial population densities are the zero radiation initial guess ones. Both JFNK solvers used a Krylov relative tolerance of 10⁻². Each cross marker corresponds to a Newton-Raphson iteration.

Fig. 8

Residual norm of the rate equations (six-level HI setup) as a function of the population change norm for the JFNK (blue) and RH92 (red) schemes. The Krylov relative tolerance was set to 10⁻². The population densities were initialized using the zero radiation approximation. Each solver was run in order to achieve several Newton relative tolerances in the population change norm (e.g., 10⁻¹, 10⁻²). We then recorded the final residual and population change norms.

3.5 Prospects

The proposed JFNK method can be upgraded for a better efficiency, and there are several ways of doing so. The external Newton-Raphson update does not leave much space for improvement, but it might be interesting to implement a continuation or a line-search method to potentially reduce the number of Newton-Raphson iterations. A nice survey of continuation methods is given in Allgower & Georg (1993). A potentially simple but robust modification would be to implement a hybrid solver mixing the RH92 and the JFNK solvers. Starting with a few RH92 iterations before switching to JFNK iterations would allow this hybrid solver to avoid the usual Newton method deficiencies, as well as providing a better initial guess for the Newton-based solver. This behavior was encountered when attempting to solve the problem for instance with the six-level HI setup starting with the LTE population densities. The performance of our JFNK solvers otherwise reduces to how efficient the Krylov solver can be, and this is therefore dictated by the quality of the preconditioner and the solver itself.

The Jacobi preconditioner we used has proven to be relatively inefficient in several of our setups, and therefore, it could be improved. in our implementation, the local preconditioner is a block-diagonal matrix. When it multiplies the Jacobian, it destroys the nonlocal derivatives in the left-hand side of Eq. (36), and therefore, it has a similar effect as the adoption of a local approximate operator in the RH92 method. However, we recall that the preconditioner should be kept to be easily invertible and calculable, thus leaving a narrow space for improvement. To do this, we provide two possible routes to calculate a more suitable preconditioner. The first option deals with the single-point quadrature approximation of the RTE from Scharmer & Carlsson (1985). An approximation of this kind could greatly simplify the calculations of the nonlocal part of the Jacobian matrices and therefore might provide a more accurate preconditioner than the Jacobi one. The second option is more related to the JFNK formalism and is presented in Chen & Shen (2006) and deals with an adaptive preconditioning technique. in brief, we can take advantage of the matrix vector products calculated by a Krylov solver to iteratively update the preconditioner. it also allows us to computate a nonlocal contribution to the preconditioner.

Another upgrade that can be implemented deals with the initial guess provided to the JFNK solver. in this paper, we used two possible initial guesses, which are the LTE and the zero radiation ones. However, there might be other possibilities more suited for a Newton-based method applied to the radiative transfer problem such as the JFNK one. For instance, the population densities can be initialized with those derived from the escape probability theory (e.g., Hubeny & Mihalas 2014; Judge 2017).

We only considered 1D plane-parallel NLTE problems here. The extension to 3D geometry could be possible with some considerations. First, 3D radiative transfer codes are usually domain-decomposed for parallelization purposes (Leenaarts & Carlsson 2009), where each processor or machine only has access to the properties of the atmosphere, opacities, emissiv-ities and population densities within one subdomain. In order to implement the inner Krylov solver, we would need to collect all population densities from all subdomains and keep the vector basis of the Krylov subspace in one manager task. The key part is the evaluation of Eq. (34), which applies a perturbation to the population densities over the entire domain. The manager would need to propagate the relevant perturbed population densities to each subdomain, but the calculation of J can be made in the same domain-decomposed way. The cost is one additional communication from the manager to the worker tasks per Krylov iteration. At current memory standards in high-performance computing centers, this approach should be reasonable.

Fig. 9

Ca II (six-level) spectrum. In black, we plot the output from the RH92 solver. In blue, we plot the output from the JFNK solver (GMRES) with a Krylov relative tolerance of 10⁻².

Fig. 10

Six-level HI spectrum. In black, we plot the output from the RH92 solver. In blue, we plot the output from the JFNK solver (GMRES) with a Krylov relative tolerance of 10⁻².

4 Conclusion

We presented a Jacobian-free Newton-Krylov method for solving the multilevel NLTE radiative transfer problem assuming statistical equilibrium. Our implementation showed a similar convergence as the Newton-Raphson method, without ever building the full Jacobian matrix explicitly. As a benchmark, we solved the NLTE problem assuming plane-parallel geometry and the FAL-C model for a three-level Ca II as well as six- level Ca II and H I atoms, which have been commonly used in solar physics applications. We showed that our solver can converge faster than other methods based on linearization, such as RH92. The improvement in the convergence rate depends on the atom, but it usually reaches a factor 1.5–2 in the best cases. The latter is evaluated in terms of the number of formal solutions needed to converge the problem. However, we note that the JFNK formal solutions are faster because no cross-term summations are required compared to RH92. The downside of our method is that it relies on an appropriate election of the convergence tolerance for the Krylov inner solver. Our sensitivity study seems to indicate that an optimal performance can be attained when the tolerance is set in the range of 10⁻³–10⁻².

In order to increment the accuracy of the Newton-Raphson correction per iteration, we augmented the precision of the formal solver using complex numbers. This change was required given the enormous dynamic range of the atomic level population densities from the photosphere to the transition region.

Compared to other studies that used Krylov-subspace methods to iteratively solve the linear two-level atom problem (e.g., Hubeny & Burrows 2007; Anusha et al. 2009; Benedusi et al. 2021, 2022), our method handles multilevel nonlinear problems. Because the Jacobian matrix does not need to be explicitly computed in each iteration, this method becomes particularly interesting for more complex problems, which we briefly discuss hereafter as future prospects.

The more obvious application relates to problems where partial redistribution effects of scattered photons are important. While several efficient solutions are available for the two-level atom problem (e.g., Scharmer 1983; Paletou & Auer 1995), similar methods for multilevel problems have important limitations. For example, Hubeny & Lites (1995) presented a PRD method based on the complete linearization approach of Scharmer & Carlsson (1985), which does not consider overlapping active transitions. Uitenbroek (2001) overcame that limitation by using the RH92 formalism and performing two iterative cycles, separating the correction to the atomic level population densities and the correction to the emissivity profile. The method converges, but it requires several evaluations of the redistribution integral per iteration. The method presented in this paper shows great potential to accelerate the convergence of PRD problems because it does not require any explicit linearization of the problem.

Another extension could be the inclusion of charge conservation when H atoms are solved. The idea would be to add another conservation equation and update the electron density in each iteration because the ionization of H is usually dominant in the chromosphere. Previous studies have included these corrections, but needed to perform Newton-Raphson iterations due to the nonlinear dependences of the Saha equation and the rate equations on the electron density (e.g., Leenaarts et al. 2007; Bjørgen 2019). Since we did not perform any explicit linearization of the rate equations or the transfer equation and we already used Newton-Raphson iterations, the inclusion of charge conservation could be very efficient and relatively straightforward.

Fig. 11

Velocity gradient convergence test for the different solvers. Top left panel: line-of-sight velocity profile used for the test. Top right panel: Associated convergence plot for the six-level Ca II setup with a Krylov relative tolerance of 10⁻² and initial LTE population densities. All the solvers required more iterations to converge than in the case shown in Fig. 6. Bottom panel: converged spectrum including the velocity gradient for RH92 (black) and JFNK (GMRES) (blue). A JFNK (GMRES) velocity-free reference spectrum (dashed black) is also shown. A blueshift clearly occurs near the line center due to the positive velocity gradient at the base of the chromosphere, resulting in very asymmetric output lines.

Acknowledgements

We are very thankful to the referee for his/her constructive suggestions and careful evaluation of our manuscript. The Institute for Solar Physics is supported by a grant for research infrastructures of national importance from the Swedish Research Council (registration number 2021-00169). JL acknowledges financial support from the Swedish Research council (VR, project number 2022-03535). No animals were harmed in the making of this manuscript. This project has been funded by the European Union through the European Research Council (ERC) under the Horizon 2020 research and innovation program (SUNMAG, grant agreement 759548) and the Horizon Europe program (MAGHEAT, grant agreement 101088184). Part of our computations were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725, at the PDC Center for High Performance Computing, KTH Royal Institute of Technology (project numbers NAISS 2023/1-15 and NAISS 2024/1-14).

Appendix A Discretization coefficients

A.1 Piecewise linear RTE

$a^{k} = 1 - \frac{1 - e^{- Δ τ_{μ ν}^{k}}}{Δ τ_{μ ν}^{k}} b^{k} = \frac{1 - e^{- Δ τ_{μ ν}^{k}}}{Δ τ_{μ ν}^{k}} - e^{- Δ τ_{μ ν}^{k}}$ ${a^k} = 1 - {{1 - {e^{ - \Delta \tau _{\mu \nu }^k}}} \over {\Delta \tau _{\mu v}^k}}\quad {b^k} = {{1 - {e^{ - \Delta \tau _{\mu v}^k}}} \over {\Delta \tau _{\mu \nu }^k}} - {e^{ - \Delta \tau _{\mu v}^k}}$ (A.1)

A.2 Derivatives of the intensity

For outgoing rays (µ > 0), $a_{χ}^{k} = - \frac{S_{μ ν}^{k}}{χ_{μ ν}^{k}} a^{k} + \frac{| z_{k + 1} - z_{k} |}{2 μ} [e^{- Δ τ_{μ ν}^{k}} (S_{μ ν}^{k + 1} - I_{μ ν}^{k + 1}) - \frac{S_{μ ν}^{k + 1} - S_{μ ν}^{k}}{Δ τ_{μ ν}^{k}} b^{k}]$ $a_\chi ^k = - {{S_{\mu v}^k} \over {\chi _{\mu v}^k}}{a^k} + {{\left| {{z_{k + 1}} - {z_k}} \right|} \over {2\mu }}\left[ {{e^{ - \Delta \tau _{\mu \nu }^k}}\left( {S_{\mu \nu }^{k + 1} - I_{\mu \nu }^{k + 1}} \right) - {{S_{\mu \nu }^{k + 1} - S_{\mu v}^k} \over {\Delta \tau _{\mu \nu }^k}}{b^k}} \right]$ (A.2) $b_{χ}^{k} = - \frac{S_{μ ν}^{k + 1}}{χ_{μ ν}^{k + 1}} b^{k} + \frac{| z_{k + 1} - z_{k} |}{2 μ} [e^{- Δ τ_{μ ν}^{k}} (S_{μ ν}^{k + 1} - I_{μ ν}^{k + 1}) - \frac{S_{μ ν}^{k + 1} - S_{μ ν}^{k}}{Δ τ_{μ ν}^{k}} b^{k}]$ $b_\chi ^k = - {{S_{\mu \nu }^{k + 1}} \over {\chi _{\mu \nu }^{k + 1}}}{b^k} + {{\left| {{z_{k + 1}} - {z_k}} \right|} \over {2\mu }}\left[ {{e^{ - \Delta \tau _{\mu v}^k}}\left( {S_{\mu \nu }^{k + 1} - I_{\mu v}^{k + 1}} \right) - {{S_{\mu \nu }^{k + 1} - S_{\mu v}^k} \over {\Delta \tau _{\mu v}^k}}{b^k}} \right]$ (A.3) $a_{η}^{k} = \frac{S_{μ ν}^{k}}{η_{μ ν}^{k}} a^{k}$ $a_\eta ^k = {{S_{\mu v}^k} \over {\eta _{\mu v}^k}}{a^k}$ (A.4) $b_{η}^{k} = \frac{S_{μ ν}^{k + 1}}{η_{μ ν}^{k + 1}} b^{k}$ $b_\eta ^k = {{S_{\mu v}^{k + 1}} \over {\eta _{\mu v}^{k + 1}}}{b^k}$ (A.5)

when k < Nz − 1; otherwise, one can set the coefficients to zero. For ingoing rays (µ < 0), $c_{χ}^{k} = - \frac{S_{μ ν}^{k}}{χ_{μ ν}^{k}} a^{k - 1} + \frac{| z_{k - 1} - z_{k} |}{2 μ} [e^{- Δ τ_{μ ν}^{k - 1}} (S_{μ ν}^{k - 1} - I_{μ ν}^{k - 1}) - \frac{S_{μ ν}^{k - 1} - S_{μ ν}^{k}}{Δ τ_{μ ν}^{k - 1}} b^{k - 1}]$ $c_\chi ^k = - {{S_{\mu v}^k} \over {\chi _{\mu \nu }^k}}{a^{k - 1}} + {{\left| {{z_{k - 1}} - {z_k}} \right|} \over {2\mu }}\left[ {{e^{ - \Delta {\tau _{\mu \nu }} - 1}}\left( {S_{\mu \nu }^{k - 1} - I_{\mu v}^{k - 1}} \right) - {{S_{\mu \nu }^{k - 1} - S_{\mu v}^k} \over {\Delta \tau _{\mu \nu }^{k - 1}}}{b^{k - 1}}} \right]$ (A.6) $d_{χ}^{k} = - \frac{S_{μ ν}^{k - 1}}{χ_{μ ν}^{k - 1}} b^{k - 1} + \frac{| z_{k - 1} - z_{k} |}{2 μ} [e^{- Δ τ_{μ ν}^{k - 1}} (S_{μ ν}^{k - 1} - I_{μ ν}^{k - 1}) - \frac{S_{μ ν}^{k - 1} - S_{μ ν}^{k}}{Δ τ_{μ ν}^{k - 1}} b^{k - 1}]$ $d_\chi ^k = - {{S_{\mu \nu }^{k - 1}} \over {\chi _{\mu \nu }^{k - 1}}}{b^{k - 1}} + {{\left| {{z_{k - 1}} - {z_k}} \right|} \over {2\mu }}\left[ {{e^{ - \Delta \tau _{\mu \nu }^{k - 1}}}\left( {S_{\mu \nu }^{k - 1} - I_{\mu \nu }^{k - 1}} \right) - {{S_{\mu \nu }^{k - 1} - S_{\mu v}^k} \over {\Delta \tau _{\mu \nu }^{k - 1}}}{b^{k - 1}}} \right]$ (A.7) $c_{η}^{k} = \frac{S_{μ ν}^{k}}{η_{μ ν}^{k}} a^{k - 1}$ $c_\eta ^k = {{S_{\mu v}^k} \over {\eta _{\mu v}^k}}{a^{k - 1}}$ (A.8) $d_{η}^{k} = \frac{S_{μ ν}^{k - 1}}{η_{μ ν}^{k - 1}} b^{k - 1}$ $d_\eta ^k = {{S_{\mu v}^{k - 1}} \over {\eta _{\mu v}^{k - 1}}}{b^{k - 1}}$ (A.9)

when k > 1; otherwise, one can set the coefficients to zero.

Appendix B Full Jacobian with background scattering terms

Let us recall Eq. 50 with different indices $(\frac{\partial I_{j ν}^{i}}{\partial n_{r}^{ℓ}}) = β_{r}^{ℓ} (\frac{\partial I_{j ν}^{i}}{\partial χ_{j ν}^{ℓ}}) + γ_{r}^{ℓ} (\frac{\partial I_{j ν}^{i}}{\partial η_{j ν}^{ℓ}}) + \sum_{p} χ_{sca}^{p} (\frac{\partial J_{ν}^{p}}{\partial n_{r}^{ℓ}}) (\frac{\partial I_{j ν}^{i}}{\partial η_{j ν}^{p}}) .$ $\left( {{{\partial I_{jv}^i} \over {\partial n_r^\ell }}} \right) = \beta _r^\ell \left( {{{\partial I_{jv}^i} \over {\partial \chi _{jv}^\ell }}} \right) + \gamma _r^\ell \left( {{{\partial I_{jv}^i} \over {\partial \eta _{jv}^\ell }}} \right) + \mathop \sum \limits_p \chi _{{\rm{sca}}}^p\left( {{{\partial J_v^p} \over {\partial n_r^\ell }}} \right)\left( {{{\partial I_{jv}^i} \over {\partial \eta _{jv}^p}}} \right).$ (B.1)

If we develop the mean intensity term using the angular quadrature scheme with weights ω_µ, one can write $(\frac{\partial I_{j ν}^{i}}{\partial n_{r}^{ℓ}}) = β_{r}^{ℓ} (\frac{\partial I_{j ν}^{i}}{\partial χ_{j ν}^{ℓ}}) + γ_{r}^{ℓ} (\frac{\partial I_{j ν}^{i}}{\partial η_{j ν}^{ℓ}}) + \sum_{p} χ_{sca}^{p} (\frac{\partial I_{j ν}^{i}}{\partial η_{j ν}^{p}}) \sum_{q} ω_{q} (\frac{\partial I_{q ν}^{p}}{\partial n_{r}^{ℓ}}) .$ $\left( {{{\partial I_{jv}^i} \over {\partial n_r^\ell }}} \right) = \beta _r^\ell \left( {{{\partial I_{jv}^i} \over {\partial \chi _{jv}^\ell }}} \right) + \gamma _r^\ell \left( {{{\partial I_{jv}^i} \over {\partial \eta _{jv}^\ell }}} \right) + \sum\limits_p {\chi _{{\rm{sca}}}^p} \left( {{{\partial I_{jv}^i} \over {\partial \eta _{jv}^p}}} \right)\sum\limits_q {{\omega _q}} \left( {{{\partial I_{qv}^p} \over {\partial n_r^\ell }}} \right).$ (B.2)

Going any further requires simplifying the notations to keep as much clarity as possible. Let us define the following quantities $M_{i j} = (\frac{\partial I_{j ν}^{i}}{\partial n_{r}^{ℓ}})$ ${M_{ij}} = \left( {{{\partial I_{jv}^i} \over {\partial n_r^\ell }}} \right)$ (B.3) $B_{i j} = β_{r}^{ℓ} (\frac{\partial I_{j ν}^{i}}{\partial χ_{j ν}^{ℓ}}) + γ_{r}^{ℓ} (\frac{\partial I_{j ν}^{i}}{\partial η_{j ν}^{ℓ}})$ ${B_{ij}} = \beta _r^\ell \left( {{{\partial I_{jv}^i} \over {\partial \chi _{jv}^\ell }}} \right) + \gamma _r^\ell \left( {{{\partial I_{jv}^i} \over {\partial \eta _{jv}^\ell }}} \right)$ (B.4) $Q_{i j}^{k} = χ_{sca}^{i} (\frac{\partial I_{j ν}^{k}}{\partial η_{j ν}^{i}}) .$ $Q_{ij}^k = \chi _{{\rm{sca}}}^i\left( {{{\partial I_{jv}^k} \over {\partial \eta _{jv}^i}}} \right)$ (B.5)

Here we have omitted the indices r, ℓ, and ν. From this point on, we no longer mention nor write these indices; however, it should be noted that the final solution should be computed for them as well. Equation B.2 can now be expressed as $M_{i j} - \sum_{p} Q_{p j}^{i} \sum_{q} ω_{q} M_{p q} = B_{i j} .$ ${M_{ij}} - \sum\limits_p {Q_{pj}^i} \sum\limits_q {{\omega _q}} {M_{pq}} = {B_{ij}}$ (B.6)

In this equation, the unknowns one is seeking are the coefficients Mij. Equation B.6 can also be expressed as $x_{i} - \sum_{p} 〈 x_{p}, Ω 〉 {(Q^{i})}^{⊤} e_{p} = b_{i} .$ ${x_i} - \sum\limits_p {\left\langle {{x_p},{\bf{\Omega }}} \right\rangle } {\left( {{{\bf{Q}}^i}} \right)^ \top }{e_p} = {b_i}$ (B.7)

where xi = (M_i1,…, M_iNµ)^τ, b_i = (B_i1,…, B_iNµ)^τ, Ω = (ω₁,…, ω_Nµ)^τ, and Qⁱ is the matrix associated with the coefficients Qⁱ_kl. The vector ep is the p^th canonical basis vector. At this point, the unknowns are gathered into the vectors x_i . It should be mentioned that if the background scattering is ignored, x_i = b_i and the solution is therefore simple. Otherwise, let us dot product Eq. B.7 with Ω to obtain $r_{i} - \sum_{p} A_{i p} r_{p} = b_{Ω, i}$ ${r_i} - \sum\limits_p {{A_{ip}}} {r_p} = {b_{\Omega ,i}}$ (B.8)

where r_i = (x_i, Ω), b_Ω,_i = (b_i, Ω),and A_ij = [QⁱΩ]_j. The equation can also be written using matrix notations to obtain $(I - A) r = b_{Ω},$ $({\bf{I}} - {\bf{A}})r = {b_\Omega },$ (B.9)

where I is the identity matrix. This is a simple linear system of unknown r for which the solution is $r = {(I - A)}^{- 1} b_{Ω}$ $r = {({\bf{I}} - {\bf{A}})^{ - 1}}{b_\Omega }$ (B.10)

but can be simplified into $r \approx (I + A) b_{Ω}$ $r \approx ({\bf{I}} + {\bf{A}}){b_\Omega }$ (B.11)

if the background scattering is weak. Introducing this result into Eq. B.7 gives the final solution $x_{i} = b_{i} + \sum_{p} r_{p} {(Q^{i})}^{⊤} e_{p},$ ${x_i} = {b_i} + \sum\limits_p {{r_p}} {\left( {{{\bf{Q}}^i}} \right)^ \top }{e_p},$ (B.12)

and if the background scattering is weak, $x_{i} \approx b_{i} + \sum_{p} b_{Ω, p} {(Q^{i})}^{⊤} e_{p} .$ ${x_i} \approx {b_i} + \sum\limits_p {{b_{\Omega ,p}}} {\left( {{{\bf{Q}}^i}} \right)^ \top }{e_p}.$ (B.13)

References

Allgower, E. L., & Georg, K. 1993, Acta Numerica, 2, 1 [NASA ADS] [CrossRef] [Google Scholar]
Amarsi, A. M., Nordlander, T., Barklem, P. S., et al. 2018, A&A, 615, A139 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Anusha, L. S., Nagendra, K. N., Paletou, F., & Léger, L. 2009, ApJ, 704, 661 [NASA ADS] [CrossRef] [Google Scholar]
Arnaud, M., & Rothenflug, R. 1985, A&AS, 60, 425 [NASA ADS] [Google Scholar]
Auer, L. H., & Mihalas, D. 1969, ApJ, 158, 641 [NASA ADS] [CrossRef] [Google Scholar]
Barbier, D. 1943, Ann. Astrophys., 6, 113 [NASA ADS] [Google Scholar]
Benedusi, P., Janett, G., Belluzzi, L., & Krause, R. 2021, A&A, 655, A88 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Benedusi, P., Janett, G., Riva, G., Belluzzi, L., & Krause, R. 2022, A&A, 664, A197 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Bjørgen, J. P. 2019, PhD thesis, Stockholm University [Google Scholar]
Burgess, A., & Chidichimo, M. C. 1983, MNRAS, 203, 1269 [NASA ADS] [CrossRef] [Google Scholar]
Cannon, C. J. 1973, ApJ, 185, 621 [Google Scholar]
Carlsson, M. 1986, Uppsala Astronomical Observatory Reports, 33 [Google Scholar]
Chen, Y., & Shen, C. 2006, IEEE Trans. Power Syst., 21, 1096 [NASA ADS] [CrossRef] [Google Scholar]
Dennis, J. E., & Schnabel, R. B. 1996, Numerical Methods for Unconstrained Optimization and Nonlinear Equations (Society for Industrial and Applied Mathematics) [CrossRef] [Google Scholar]
Eddington, A. S. 1926, The Internal Constitution of the Stars [Google Scholar]
Fontenla, J. M., Avrett, E. H., & Loeser, R. 1993, ApJ, 406, 319 [Google Scholar]
Hubeny, I., & Burrows, A. 2007, ApJ, 659, 1458 [CrossRef] [Google Scholar]
Hubeny, I., & Lites, B. W. 1995, ApJ, 455, 376 [NASA ADS] [CrossRef] [Google Scholar]
Hubeny, I., & Mihalas, D. 2014, Theory of Stellar Atmospheres: An Introduction to Astrophysical Non-equilibrium Quantitative Spectroscopic Analysis, Princeton Series in Astrophysics (Princeton University Press) [Google Scholar]
Ibgui, L., Hubeny, I., Lanz, T., & Stehlé, C. 2013, A&A, 549, A126 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Janett, G., Benedusi, P., & Riva, F. 2024, A&A, 682, A68 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Judge, P. G. 2017, ApJ, 851, 5 [NASA ADS] [CrossRef] [Google Scholar]
Kan, Z., Song, N., Peng, H., & Chen, B. 2022, J. Computat. Appl. Math., 399, 113732 [CrossRef] [Google Scholar]
Knoll, D., & Keyes, D. 2004, J. Computat. Phys., 193, 357 [CrossRef] [Google Scholar]
Krylov, A. N. 1931, Izv. Akad. Nauk SSSR Ser. Fiz.-Mat, 4, 491 [Google Scholar]
Leenaarts, J., & Carlsson, M. 2009, in Astronomical Society of the Pacific Conference Series, 415, The Second Hinode Science Meeting: Beyond Discovery-Toward Understanding, eds. B. Lites, M. Cheung, T. Magara, J. Mariska, & K. Reeves, 87 [NASA ADS] [Google Scholar]
Leenaarts, J., Carlsson, M., Hansteen, V., & Rutten, R. J. 2007, A&A, 473, 625 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Leenaarts, J., Pereira, T., & Uitenbroek, H. 2012, A&A, 543, A109 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Martins, J. R. R. A., & Ning, A. 2021, Engineering Design Optimization (Cambridge University Press) [CrossRef] [Google Scholar]
Milić, I., & van Noort, M. 2017, A&A, 601, A100 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Milić, I., & van Noort, M. 2018, A&A, 617, A24 [Google Scholar]
Milne, E. A. 1921, MNRAS, 81, 361 [Google Scholar]
Newton, I. 1736, The Method of Fluxions and Infinite Series: With Its Application to the Geometry of Curve-lines. By … Sir Isaac Newton, … Translated from the Author’s Latin Original Not Yet Made Publick. To which is Sub-join’d, a Perpetual Comment Upon the Whole Work, … By John Colson, … (Henry Woodfall; and sold by John Nourse) [Google Scholar]
Ng, K. C. 1974, J. Chem. Phys., 61, 2680 [NASA ADS] [CrossRef] [Google Scholar]
Olson, G. L., & Kunasz, P. B. 1987, J. Quant. Spec. Radiat. Transf., 38, 325 [NASA ADS] [CrossRef] [Google Scholar]
Olson, G. L., Auer, L. H., & Buchler, J. R. 1986, J. Quant. Spec. Radiat. Transf., 35, 431 [NASA ADS] [CrossRef] [Google Scholar]
Osborne, C. M. J., & Milic, I. 2021, ApJ, 917, 14 [NASA ADS] [CrossRef] [Google Scholar]
Paige, C. C., & Saunders, M. A. 1975, SIAM J. Numer. Anal., 12, 617 [NASA ADS] [CrossRef] [Google Scholar]
Paletou, F., & Auer, L. H. 1995, A&A, 297, 771 [NASA ADS] [Google Scholar]
Pereira, T. M. D., & Uitenbroek, H. 2015, A&A, 574, A3 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 2002, Numerical recipes in C++ : the art of scientific computing [Google Scholar]
Raphson, J. 1690, Analysis Aequationum Universalis Seu Ad Aequationes Alge-braicas Resolvendas Methodus Generalis, & Expedita, Ex Nova Infinitarum Serierum Methodo, Deducta Ac Demonstrata [Google Scholar]
Rybicki, G. B. 1972, in Line Formation in the Presence of Magnetic Fields, 145 [Google Scholar]
Rybicki, G. B., & Hummer, D. G. 1992, A&A, 262, 209 [NASA ADS] [Google Scholar]
Saad, Y., & Schultz, M. H. 1986, SIAM J. Sci. Statist. Comput., 7, 856 [CrossRef] [MathSciNet] [Google Scholar]
Scharmer, G. B. 1981, ApJ, 249, 720 [Google Scholar]
Scharmer, G. B. 1983, A&A, 117, 83 [NASA ADS] [Google Scholar]
Scharmer, G. B. 1984, in Methods in Radiative Transfer, 173 [Google Scholar]
Scharmer, G., & Carlsson, M. 1985, J. Computat. Phys., 59, 56 [NASA ADS] [CrossRef] [Google Scholar]
Scharmer, G. B., & Nordlund, Å. 1982, Stockholms Observatoriums Reports, 19 [Google Scholar]
Shull, J. M., & van Steenberg, M. 1982, ApJS, 48, 95 [NASA ADS] [CrossRef] [Google Scholar]
Socas-Navarro, H., de la Cruz Rodriguez, J., Asensio Ramos, A., Trujillo Bueno, J., & Ruiz Cobo, B. 2015, A&A, 577, A7 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Sukhorukov, A. V., & Leenaarts, J. 2017, A&A, 597, A46 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Uitenbroek, H. 2001, ApJ, 557, 389 [Google Scholar]
van der Vorst, H. A. 1992, SIAM J. Sci. Statist. Comput., 13, 631 [CrossRef] [Google Scholar]

All Tables

Table 1

Summary of the different atom setups.

In the text

Table 2

Average execution time per call for the JFNK and the RH92 solvers.

In the text

All Figures

Fig. 1

Jacobian matrix structure for a N_ℓ-level atom problem. J_F is a (N_ℓN_z) × (N_ℓN_z) matrix that contains the derivatives of the residual vector components with respect to the population densities. This matrix can be considered as a N_z × N_z block matrix, where each block J^kℓ is a N_ℓ × N_ℓ matrix. J^kℓ stores the derivatives of F at depth index k with respect to the population densities at depth index ℓ.

In the text

Fig. 2

Jacobian matrix estimation error for the three-level Ca II setup, evaluated at the LTE populations and for the different schemes. The first-order scheme is the forward one. The difference steps є₁ and є₂ are common choices (Knoll & Keyes 2004) and read $ϵ_{1} = \sum_{i, k = 1}^{N_{ℓ}, N_{z}} b | {(n_{i}^{k})}^{(p)} | / (N_{ℓ} N_{z} ‖ ν ‖_{2}) + b$ ${_1} = \mathop \sum \limits_{i,k = 1}^{{N_\ell },{N_z}} b\left| {{{\left( {n_i^k} \right)}^{(p)}}} \right|/\left( {{N_\ell }{N_z}v{_2}} \right) + b$ , where b = 10⁻⁶, and $ϵ_{2} = ϵ_{mach}^{1 / r} \sqrt{1 + {‖ n^{(p)} ‖}_{2}} / ‖ ν ‖_{2}$ ${_2} = _{{\rm{mach}}}^{1/r}\sqrt {1 + {n^{(p)}}{_2}} /v{_2}$ , where є_mach is the machine epsilon for double-precision numbers, r = 2 for forward and backward schemes, and r = 3 for the central scheme. The logarithm of the absolute error (color bar) is clipped below −16.

In the text

	Fig. 3 FAL-C atmosphere. The temperature and total hydrogen and electron densities are shown. The height dimension origin (z = 0) corresponds tO τ₅₀₀ = 1.
In the text

	Fig. 4 Comparison of the Newton-Raphson method with JFNK routines for the six-level HI setup (zero radiation initial guess). As the Krylov solver relative tolerance r_tol becomes small, the discrepancy between the different methods reduces to truncation and round-off errors.
In the text

	Fig. 5 Residual vector calls required for convergence as a function of the Krylov solver relative tolerance. Top: Three-level Ca II setup. Bottom: Six-level Ca II setup. Both setups use the LTE initial condition.
In the text

	Fig. 6 Residual and population change norms during the solving process of the six-level Ca II setup. The initial population densities are the LTE ones. Both JFNK solvers used a Krylov relative tolerance of 10⁻². Each cross marker corresponds to a Newton-Raphson iteration.
In the text

	Fig. 7 Residual and population change norms during the solving process of the six-level HI setup. The initial population densities are the zero radiation initial guess ones. Both JFNK solvers used a Krylov relative tolerance of 10⁻². Each cross marker corresponds to a Newton-Raphson iteration.
In the text

Fig. 8

Residual norm of the rate equations (six-level HI setup) as a function of the population change norm for the JFNK (blue) and RH92 (red) schemes. The Krylov relative tolerance was set to 10⁻². The population densities were initialized using the zero radiation approximation. Each solver was run in order to achieve several Newton relative tolerances in the population change norm (e.g., 10⁻¹, 10⁻²). We then recorded the final residual and population change norms.

In the text

	Fig. 9 Ca II (six-level) spectrum. In black, we plot the output from the RH92 solver. In blue, we plot the output from the JFNK solver (GMRES) with a Krylov relative tolerance of 10⁻².
In the text

	Fig. 10 Six-level HI spectrum. In black, we plot the output from the RH92 solver. In blue, we plot the output from the JFNK solver (GMRES) with a Krylov relative tolerance of 10⁻².
In the text

Fig. 11

Velocity gradient convergence test for the different solvers. Top left panel: line-of-sight velocity profile used for the test. Top right panel: Associated convergence plot for the six-level Ca II setup with a Krylov relative tolerance of 10⁻² and initial LTE population densities. All the solvers required more iterations to converge than in the case shown in Fig. 6. Bottom panel: converged spectrum including the velocity gradient for RH92 (black) and JFNK (GMRES) (blue). A JFNK (GMRES) velocity-free reference spectrum (dashed black) is also shown. A blueshift clearly occurs near the line center due to the positive velocity gradient at the base of the chromosphere, resulting in very asymmetric output lines.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Allgower, E. L., & Georg, K. 1993, Acta Numerica, 2, 1 [NASA ADS] [CrossRef] [Google Scholar]

[2] Amarsi, A. M., Nordlander, T., Barklem, P. S., et al. 2018, A&A, 615, A139 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[3] Anusha, L. S., Nagendra, K. N., Paletou, F., & Léger, L. 2009, ApJ, 704, 661 [NASA ADS] [CrossRef] [Google Scholar]

[4] Arnaud, M., & Rothenflug, R. 1985, A&AS, 60, 425 [NASA ADS] [Google Scholar]

[5] Auer, L. H., & Mihalas, D. 1969, ApJ, 158, 641 [NASA ADS] [CrossRef] [Google Scholar]

[6] Barbier, D. 1943, Ann. Astrophys., 6, 113 [NASA ADS] [Google Scholar]

[7] Benedusi, P., Janett, G., Belluzzi, L., & Krause, R. 2021, A&A, 655, A88 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[8] Benedusi, P., Janett, G., Riva, G., Belluzzi, L., & Krause, R. 2022, A&A, 664, A197 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[9] Bjørgen, J. P. 2019, PhD thesis, Stockholm University [Google Scholar]

[10] Burgess, A., & Chidichimo, M. C. 1983, MNRAS, 203, 1269 [NASA ADS] [CrossRef] [Google Scholar]

[11] Cannon, C. J. 1973, ApJ, 185, 621 [Google Scholar]

[12] Carlsson, M. 1986, Uppsala Astronomical Observatory Reports, 33 [Google Scholar]

[13] Chen, Y., & Shen, C. 2006, IEEE Trans. Power Syst., 21, 1096 [NASA ADS] [CrossRef] [Google Scholar]

[14] Dennis, J. E., & Schnabel, R. B. 1996, Numerical Methods for Unconstrained Optimization and Nonlinear Equations (Society for Industrial and Applied Mathematics) [CrossRef] [Google Scholar]

[15] Eddington, A. S. 1926, The Internal Constitution of the Stars [Google Scholar]

[16] Fontenla, J. M., Avrett, E. H., & Loeser, R. 1993, ApJ, 406, 319 [Google Scholar]

[17] Hubeny, I., & Burrows, A. 2007, ApJ, 659, 1458 [CrossRef] [Google Scholar]

[18] Hubeny, I., & Lites, B. W. 1995, ApJ, 455, 376 [NASA ADS] [CrossRef] [Google Scholar]

[19] Hubeny, I., & Mihalas, D. 2014, Theory of Stellar Atmospheres: An Introduction to Astrophysical Non-equilibrium Quantitative Spectroscopic Analysis, Princeton Series in Astrophysics (Princeton University Press) [Google Scholar]

[20] Ibgui, L., Hubeny, I., Lanz, T., & Stehlé, C. 2013, A&A, 549, A126 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[21] Janett, G., Benedusi, P., & Riva, F. 2024, A&A, 682, A68 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[22] Judge, P. G. 2017, ApJ, 851, 5 [NASA ADS] [CrossRef] [Google Scholar]

[23] Kan, Z., Song, N., Peng, H., & Chen, B. 2022, J. Computat. Appl. Math., 399, 113732 [CrossRef] [Google Scholar]

[24] Knoll, D., & Keyes, D. 2004, J. Computat. Phys., 193, 357 [CrossRef] [Google Scholar]

[25] Krylov, A. N. 1931, Izv. Akad. Nauk SSSR Ser. Fiz.-Mat, 4, 491 [Google Scholar]

[26] Leenaarts, J., & Carlsson, M. 2009, in Astronomical Society of the Pacific Conference Series, 415, The Second Hinode Science Meeting: Beyond Discovery-Toward Understanding, eds. B. Lites, M. Cheung, T. Magara, J. Mariska, & K. Reeves, 87 [NASA ADS] [Google Scholar]

[27] Leenaarts, J., Carlsson, M., Hansteen, V., & Rutten, R. J. 2007, A&A, 473, 625 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[28] Leenaarts, J., Pereira, T., & Uitenbroek, H. 2012, A&A, 543, A109 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[29] Martins, J. R. R. A., & Ning, A. 2021, Engineering Design Optimization (Cambridge University Press) [CrossRef] [Google Scholar]

[30] Milić, I., & van Noort, M. 2017, A&A, 601, A100 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[31] Milić, I., & van Noort, M. 2018, A&A, 617, A24 [Google Scholar]

[32] Milne, E. A. 1921, MNRAS, 81, 361 [Google Scholar]

[33] Newton, I. 1736, The Method of Fluxions and Infinite Series: With Its Application to the Geometry of Curve-lines. By … Sir Isaac Newton, … Translated from the Author’s Latin Original Not Yet Made Publick. To which is Sub-join’d, a Perpetual Comment Upon the Whole Work, … By John Colson, … (Henry Woodfall; and sold by John Nourse) [Google Scholar]

[34] Ng, K. C. 1974, J. Chem. Phys., 61, 2680 [NASA ADS] [CrossRef] [Google Scholar]

[35] Olson, G. L., & Kunasz, P. B. 1987, J. Quant. Spec. Radiat. Transf., 38, 325 [NASA ADS] [CrossRef] [Google Scholar]

[36] Olson, G. L., Auer, L. H., & Buchler, J. R. 1986, J. Quant. Spec. Radiat. Transf., 35, 431 [NASA ADS] [CrossRef] [Google Scholar]

[37] Osborne, C. M. J., & Milic, I. 2021, ApJ, 917, 14 [NASA ADS] [CrossRef] [Google Scholar]

[38] Paige, C. C., & Saunders, M. A. 1975, SIAM J. Numer. Anal., 12, 617 [NASA ADS] [CrossRef] [Google Scholar]

[39] Paletou, F., & Auer, L. H. 1995, A&A, 297, 771 [NASA ADS] [Google Scholar]

[40] Pereira, T. M. D., & Uitenbroek, H. 2015, A&A, 574, A3 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[41] Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 2002, Numerical recipes in C++ : the art of scientific computing [Google Scholar]

[42] Raphson, J. 1690, Analysis Aequationum Universalis Seu Ad Aequationes Alge-braicas Resolvendas Methodus Generalis, & Expedita, Ex Nova Infinitarum Serierum Methodo, Deducta Ac Demonstrata [Google Scholar]

[43] Rybicki, G. B. 1972, in Line Formation in the Presence of Magnetic Fields, 145 [Google Scholar]

[44] Rybicki, G. B., & Hummer, D. G. 1992, A&A, 262, 209 [NASA ADS] [Google Scholar]

[45] Saad, Y., & Schultz, M. H. 1986, SIAM J. Sci. Statist. Comput., 7, 856 [CrossRef] [MathSciNet] [Google Scholar]

[46] Scharmer, G. B. 1981, ApJ, 249, 720 [Google Scholar]

[47] Scharmer, G. B. 1983, A&A, 117, 83 [NASA ADS] [Google Scholar]

[48] Scharmer, G. B. 1984, in Methods in Radiative Transfer, 173 [Google Scholar]

[49] Scharmer, G., & Carlsson, M. 1985, J. Computat. Phys., 59, 56 [NASA ADS] [CrossRef] [Google Scholar]

[50] Scharmer, G. B., & Nordlund, Å. 1982, Stockholms Observatoriums Reports, 19 [Google Scholar]

[51] Shull, J. M., & van Steenberg, M. 1982, ApJS, 48, 95 [NASA ADS] [CrossRef] [Google Scholar]

[52] Socas-Navarro, H., de la Cruz Rodriguez, J., Asensio Ramos, A., Trujillo Bueno, J., & Ruiz Cobo, B. 2015, A&A, 577, A7 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[53] Sukhorukov, A. V., & Leenaarts, J. 2017, A&A, 597, A46 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[54] Uitenbroek, H. 2001, ApJ, 557, 389 [Google Scholar]

[55] van der Vorst, H. A. 1992, SIAM J. Sci. Statist. Comput., 13, 631 [CrossRef] [Google Scholar]