Effects of a fully 3D atmospheric structure on exoplanet transmission spectra: retrieval biases due to day-night temperature gradients

Transmission spectroscopy provides us with information on the atmospheric properties at the limb, which is often intuitively assumed to be a narrow annulus around the planet. Consequently, studies have focused on the effect of atmospheric horizontal heterogeneities along the limb. Here we demonstrate that the region probed in transmission -- the limb -- extends significantly toward the day and night sides of the planet. We show that the strong day-night thermal and compositional gradients expected on synchronous exoplanets create sufficient heterogeneities across the limb to result in important systematic effects on the spectrum and bias its interpretation. To quantify these effects, we developed a 3D radiative transfer model able to generate transmission spectra of atmospheres based on 3D atmospheric structures. We first produce synthetic JWST observations from a simulation of GJ 1214b and demonstrate the necessity of a real 3D approach to model data for such precise observatories. Second, we investigate how day-night temperature gradients cause a systematic bias in retrieval analysis performed with 1D forward models. For that purpose we synthesize a large set of forward spectra for prototypical HD 209458 b and GJ 1214 b type planets varying the temperatures of the day and night sides as well as the width of the transition region. We then perform typical retrieval analyses and compare the retrieved parameters to the ground truth of the input model. This study reveals systematic biases on the retrieved temperature (found to be higher than the terminator temperature) and abundances. This is due to the fact that the hotter dayside is more extended vertically and screens the nightside. These biases are difficult to detect as the 1D profiles used in the retrieval are found to provide an excellent match to the observed spectra. This needs to be kept in mind when interpreting real data.


Biases in the analysis of transmission spectra of tridimensional planets
With the first spectroscopic observations of exoplanets, we are now able to study planetary atmospheres beyond our solar system. In recent years, spectroscopic observations have seen tremendous developments, and the coming years are even more promising, particularly because of the launch of the James Webb Space Telescope (JWST; Beichman et al. 2014) and the dedicated ARIEL mission (Tinetti et al. 2017).
One of the most common method to interpret atmospheric spectra is based on inverse atmospheric retrieval modelling (Madhusudhan 2018). However, because of the complex thermal structure of the atmosphere and the numerous gases to retrieve (with their possibly complex spatial distribution) the number of parameters to handle can render the inversion computational cost prohibitive-that is if there are enough data to have a well-constrained problem to start with. As a result, retrieval algorithm are required to make drastic assumptions on the forward model to render the problem tractable.
The problem is that when these assumptions do not hold to a sufficient degree in the observed atmosphere, that creates a systematic bias that can lead the retrieval algorithm far from meaningful solutions. Identifying and alleviating these biases is thus a crucial goal to prepare for the next generation of precision observatories, and there have been several attempts in this direction. For example, the often made assumption of uniform mixing ratio in the atmosphere led Evans et al. (2017) to retrieve a 100-1000× solar VO/H 2 O ratio in the atmosphere of of WASP-121 b. But Parmentier et al. (2018) showed that accounting for the chemical dissociation of some species at the hottest altitudes allowed them to understand the data with solar abundances. Rocchetto et al. (2016) also thoroughly quantified the impact of assuming a vertically isothermal atmosphere. Yet, all current retrieval algorithms are still fundamentally limited by the assumption of a spherically symmetric atmosphere: they use one-dimensional forward models to constraint spectra and atmospheric parameters of three-dimensional objects for which we expect heterogeneities. Such an approach is bound to create counter-intuitive biases that we need to quantify. With that in mind, Feng et al. (2016)  Schematic of the geometry of a light ray crossing the atmosphere. The inner circle is the arbitrary reference surface of the planet of radius R p . The lighter grey region is the atmosphere. The distance from the center of the planet is r = R p +z. A light ray is defined by its distance of closest approach to the planet's center, ρ, and the corresponding tangential altitude z t = ρ − R p . The direction of the light ray defines the x direction. As we further discuss the extent of the limb along the ray, we introduce x limb so that the absorption outside the [−x limb , x limb ] segment is negligible in determining the transit radius, and the corresponding limb opening angle ψ. signal for methane in emission. In the same vein, Blecic et al. (2017) used the dayside emission spectrum computed with a post-processed 3D Global Climate Model (GCM) to identify the region effectively probed by the retrieval of secondary eclipse data.
On the transmission spectroscopy side, the study of the horizontal atmospheric heterogeneities have focused on the effect of clouds, with Line & Parmentier (2016) who have showed that the presence of clouds on parts of the limb only could mimic a high mean molecular weight atmosphere. Yet, this study produced their forward spectra by simply averaging two 1D models so that only a limited kind of heterogeneities could be investigated. To go further Charnay et al. (2015), Way et al. (2017), Parmentier et al. (2018), and Lines et al. (2018), among others, have produced transit spectra from 3D atmospheric simulations. However, because of the difficult geometry, they still relied on a 1D radiative transfer transmission code that is either fed an average limb profile from a 3D simulation or that performs spectra of all the columns at the terminator of the model -that they assume to be equivalent to the limb plane -before averaging. Even if the second approach -that we will hereafter call limb-averaged or (1+1)D method -does capture the spatial variations of the atmosphere along the terminator, it completely neglects horizontal variations across it. Indeed, as the ray goes from the day side to the night side before coming to the observer, it crosses one of the most steeply changing region of the atmosphere: the transition from day to night side. The effect of such a thermocompositional transition within the limb on retrieved parameters is unknown at present. Indeed, although various authors have also developed a fully consistent transmission model able to predict such effects, these authors have not tried to retrieve physical parameters from their forward spectra. Fortney et al. (2010), Burrows et al. (2010), and Dobbs-Dixon et al. (2012), for example, have focused on the potential differences between the east and west limbs, while Miller-Ricci Kempton & Rauscher (2012) and Showman et al. (2013) have mainly looked at the effect of the doppler shifting by winds on high resolution spectra.

How wide can the limb be: a simple estimate
The day-night transition would not be a problem if the region probed in transmission were infinitely thin. We would just see a slice of the atmosphere. But in fact, and quite counter-intuitively, the width of this region -that will be our definition of the limb 1is much larger on some planets that is generally expected. Therefore, the transit spectrum encodes a much wider diversity of temperatures and compositions.
Although the effect of this larger extent of the limb will be demonstrated a posteriori by the results of our 3D transit model, let us here try to give simple arguments to estimate how different planets can be affected. In other words, how wide can we expect the limb to be on any given planet.
Of course, the problem in providing such a simple estimate is that the region that will contribute to the transit spectrum does not only depend on the global parameters of the planet, but also on the precise chemical-physical conditions in the atmosphere and how they vary spatially, as will be demonstrated later on. There is thus a certain degree of arbitrariness if one wants to come up with a simple general estimate. For this reason, we will first use a simple geometrical argument. The advantage of this is that it will allow us to identify the key dimensionless parameter controlling the limb width. In Appendix A we derive a model of a more specific case of chemical inhomogeneity and show that the two approaches indeed yield similar results.
Let us consider a light ray passing through the limb as shown in Fig. 1. Estimating the width of the limb comes down to the computation of the maximum distance from the limb plane, x limb , at which the atmosphere still affects measurably the optical depth along all the observed rays, and especially the deepest one. The choice we have to make here is the highest pressure probed in transit (P bot , that we will assume to define the planetary radius R p ) and the lowest pressure at which the atmosphere is still able to significantly affect the transmission of a given ray (P top ). Then the width of the limb is given by ψ ≡ 2 arccos R p R p + z(P top ) . (1) Since, in a isothermal atmosphere with a atmospheric scale height at the surface equal to H and a varying gravity, the hypsometric relation writes we get The first important result is that we see that the important dimensionless quantity in our problem is the ratio of the scale height to the planetary radius. The higher this parameter, the larger the curvature effects in the atmosphere. This parameter will often appear later on. We also directly see that the larger the pressure range probed, the wider the limb. Many models predict that the lowest levels probed in transit are around 100 mb in the visible/near infrared. On the other hand, Kreidberg et al. (2014) have shown that in order to explain the flat spectrum of GJ 1214 b, an opaque aerosol deck is needed as high as 10 −3 -10 −2 mb, showing that absorbers at such altitudes can indeed still affect the transit spectrum.

Radius
Simple estimate of the opening angle of the region around the terminator that affects the transmission spectrum (i.e. the limb). The contours give the opening angle in degrees as a function of the planetary radius and the atmospheric scale height of the atmosphere at 10 mb (the variation of gravity with height is accounted for). The atmosphere is assumed to become transparent above the 1 Pa pressure level. Black dots are known planets for which the radius and surface gravity are measured. Some key objects are identified. The equivalent temperature is derived from the scale height assuming an atmosphere of hydrogen and helium (solar abundances) and a surface gravity of 10 m/s 2 . For hot, low gravity planets like GJ 1214 b, the limb can cover almost half the surface of the planet! To get a quantitative idea, Fig. 2 shows the result of Eq. (3) as a function of the planetary radius and scale height of the atmosphere. To have a relatively conservative estimate, we restrained the pressure range to (P bot , P top )=(10 mb, 10 −2 mb). One can see that even for the Earth, the limb region already spans more than ten degrees. For a warm sub-Neptune planet like GJ 1214 b, this can be as large as 45 • -50 • . In numerical terms, this means that within a typical resolution 3D atmosphere model of this planet, say 128 grid points in longitude for sake of concreteness, a single ray would interact with about 24 consecutive horizontal cells before leaving the planet. From these numbers, it becomes evident that the 1+1D approach, by picking only one out of those 24 cells as representative of the terminator, is a crude approximation to a real transmission spectrum.

Goals of the study
Our goal is thus to identify the various biases of retrieval methods created by thermal and compositional -including cloudsinhomogeneities in the atmosphere in transmission. To that pur-pose, we need a transmission spectrum generator able to match the complexity of a real three-dimensional planet. We thus developed a tool able to compute transmission spectra using a parametrized 3D atmospheric structure or the outputs of a 3D atmospheric simulation by a global climate model-namely the LMD Generic model Leconte et al. 2013;Charnay et al. 2015). This tool, Pytmosph3R, and its architecture are described in Sect. 2. Then we show an extensive validation in Sect. 3. Next, Sect. 4 presents a first application of this tool to a simulation of the atmosphere of GJ 1214 b where we demonstrate the necessity of a real 3D approach to model data for such precise observatories. Finally, we investigate how day-night temperature gradients expected for exoplanets cause a systematic bias in retrieval analysis of real data performed with 1D forward models (Sect. 5).

Presentation of Pytmosph3R
Pytmosph3R is designed to compute transmission spectra based on 3D atmospheric simulations performed with the LMDZ generic global climate model. It produces transmittance maps of the atmospheric limb at all wavelength that can then be spatially integrated to yield the transmission spectrum. The code is entirely written in python.
In this section we present the various modules of the code: • The geometrical framework used to map the atmospheric structure from the spherical coordinates used by the GCM onto cylindrical coordinates that are more suitable to follow photons crossing the atmosphere, • The two algorithms that can be used for the calculation of the slant optical path -a discretised and an integral method, • The various sources of opacity included in our radiative transfer model, • The spatial integration to produce spectra.

The spherical grid used in atmospheric simulations
Typical 3D atmospheric simulations -LMDZ included -provide state variables such as temperature and mixing ratios of various absorbers/scatterers on a longitude/latitude/pressure grid. Although pressure is convenient variable to compute atmospheric motions, transit spectroscopy is fundamentally about knowing the physical area of the opaque region of the atmosphere. We thus first have to interpolate the outputs of the climate model on a spherical (λ, ϕ, z) grid, where λ is the longitude, ϕ is the latitude, and z the altitude. When needed, we will also refer to r ≡ R p + z the distance to the planet center, or α the colatitude.
The longitude/latitude grid is evenly spaced and follows the native grid of the atmospheric model. The altitude grid is also evenly spaced. However, as will be discussed in detail in Sect. 3, the resolution of this grid can, and usually should, be higher than the native resolution of the input simulation as it will set the precision of the output spectrum. We find that a good compromise between computation time and accuracy is reached for a vertical resolution of about a tenth of the scale height. Let n z , n λ , and n ϕ be the number of grid cells in each of the three dimensions.
The top should also be chosen high enough for the atmosphere to be transparent there. This will be quantified hereafter. If this altitude is above the top of the input simulation, the code extrapolates the atmosphere above this top assuming hydrostatic equilibrium and a fixed temperature (or a profile that the user needs to define).
Then, we integrate the hydrostatic equilibrium equation within each column of the model to compute the values of all the necessary variables from the climate model (e.g. temperature and mixing ratios) on this new altitude grid. During this integration, the variation of gravity with altitude is taken into account (see Eq. (7)), an effect that proved to be crucial to reach the precision needed.
2.1.2. The cylindrical grid used for the radiative transfer As long as the light from the star propagates in straight lines, cylindrical coordinates provide the most natural set to treat transit geometry. Indeed, due to the great distance between the observer and the planet, light rays follow parallel paths so that the radiative transfer can be solved within a hollow cylinder that is tangent to the planetary surface at the bottom and stops at the top of the modelled atmosphere.
We thus define a cylindrical grid whose coordinates will be denoted (ρ, θ, x) where the x axis is the line between the center of the planet and the observer (the line of sight), with x increasing toward the observer (see Fig. 3). The x=0 plane corresponds to the plane perpendicular to the line of sight passing through the center of the planet (hereafter called the plane of the sky). ρ is the distance from the x axis in the plane of the sky, and θ is the azimuth, defined here as the angle on the limb plane measured from a reference direction (see below).
Hereafter we will refer to a "(ρ, θ)-ray" as the ray of light that crosses the limb plane at those coordinates. The transmittance map of the atmosphere is simply constructed by computing the chord optical depth of the ray along the x direction for every (ρ, θ). From this, and assuming a luminosity/spectral distribution (limb-darkening, spots) over the stellar disk as well as a transit trajectory, we can produce spectral transit lightcurves.
The resolution of the cylindrical grid is based on that of the spherical one, which has a finer altitude resolution compared to the GCM: ∆ρ=∆z and ∆θ=∆ϕ. There is no benefit in increasing the resolution in θ, GCM cells being considered uniform horizontally. To test the impact of shooting our rays through the middle of our layers or at their interfaces, the ρ grid can be shifted relative to the r grid using the ω parameter (0 ≤ ω < 1) so that ρ = r + ω ∆z. To speed up computation, (ρ, θ)-rays will eventually be divided irregularly along the x direction to follow closely where they go from one spherical cell to another, as explained in 2.2 (see also Fig. 3).

Orientation of the planet and correspondence between coordinate systems
The cylindrical coordinate system needs to be properly oriented with respect to the spherical grid. For this, we require only to know the longitude and latitude of the observer in the spherical coordinates, (λ obs , ϕ obs ), at the time of observation (or alternatively the colatitude α obs ) 2 . The unit vector pointing toward the observer,û obs , then defines the direction of the x axis of our cylindrical coordinates. The last thing that we need to define is the arbitrary reference direction for the azimuth. For this we choose the projection of the rotation axis of the planet onto the plane of the sky.
With these two definitions, there is a unique relationship between the spherical and cylindrical coordinates for a given point. The translation from one system to the other with an arbitrary orientation however requires to solves a set of non-linear equations that is detailed in Appendix B.

Dividing (ρ, θ)-rays into subpaths
Along each (ρ, θ)-ray (identified by the indices i ρ and i θ ) we locate all the intersections with relevant interfaces of the sphericalgrid and divide the ray into segments of irregular lengths, each of them belonging to a different cell. All quantities used to calculate the optical path -pressure and density excepted -are considered constant within each segment/cell. The pressure/density can be either kept constant within a segment/cell or assumed to follow hydrostatic equilibrium.
In practice we first divide each (ρ, θ)-ray with a constant ∆x step, calculate the spherical coordinates of the resulting discrete points and give them the three indices (i r ,i λ ,i ϕ ) of the spherical cell they belong to (see Appendix B). The step ∆x is chosen small enough (< ∆z) so that two successive points can only belong to either the same cell or to two adjacent cells (i.e., cells separated by a facet, an edge, or a corner). One, two, or all three indices (i r ,i λ ,i ϕ ) can be incremented between two successive points.
When a change of index occurs between two points, the code determines analytically the position of the intersection(s) between the (ρ, θ)-ray and the surface(s) separating cells. This comes down to solving for the unknown position x int along the ray knowing ρ, θ, and the equation of the surface(s) crossed. The equations to be solved for the three type of intersections (depending on the varying index) are detailed in Appendix C From these positions, the length of the subpaths belonging to individual cells can be measured. When more than one index is incremented between two points (near an edge or a corner, which implies that a third cell has been crossed) and once the intersections have been located, their x-position are sorted in increasing order so that subpaths can be measured and attributed to specific cells.

Optical depth
At this point, all the (ρ, θ)-rays are now subdivided into N x (ρ, θ) segments of length {∆x i (ρ, θ)} i=1,N x . The number and length of these segments of course changes for each (ρ, θ)-rays depending on the number of intersections found in the previous step. Each of these segments has been assigned to a given cell of the spherical grid so that we know all the quantities describing the physical state of the atmosphere in the i-th segment: temperature (T i ), volume mixing ratio of the j-th of the N spe species (χ j,i ), and mass mixing ratio of the k-th of the N con species of condensed particles (q k,i ).
The goal is now to compute the optical depth (hence the transmittance) of the atmosphere for each (ρ, θ) which is given by where ∆τ i is the optical depth of a given segment. Pytmosph3R can calculate optical depth in two ways. Pressure (and density) can be either considered constant within a cell, hereafter called the discretised method, or it can be integrated along the optical path assuming a hydrostatic vertical  . The left panel shows the grid as seen from the observer. The planet moves to the right, which defines the θ = 0 axis. The pink dots are examples of (ρ, θ)-rays (To follow the python convention used by the code, the indices given start at 0). The native GCM grid at the terminator of an example simulation is shown in black to illustrate its non-uniform vertical spacing. The right panel shows this from the side where we can follow light rays (dashed lines) as they go horizontally from left to right. The red boxes illustrate how the x-discretisation is computed for each rays to follow closely where it goes from on cell of the spherical grid to another. structure within the cell, the so-called semi-integral method. In both cases, we assume that cross-sections are constant along the given sub-path to spare considerable CPU time. This approximation yields negligible errors as long as the simulation is sufficiently resolved in the vertical.

Discretized calculation of the optical depth
With the discretized method, pressure and density are considered constant so that the optical depth of a segment simply reduces to where σ mol , σ sca , and σ con are the cross-sections for the molecular, Rayleigh scattering, and continuum absorptions. k mie is the absorption coefficient associated to the Mie scattering by aerosols. The parametrization used for these absorptions is discussed hereafter.

Integral method
With the integral method, we now assume that the pressure follows the hydrostatic law within a segment, thus varying with altitude. The optical depth of a segment for any contribution is then given by where x i is the positions of the beginning of the segment. Let us call z i the corresponding altitude. The simplification comes from the fact that the altitude profile of the pressure within an isothermal atmosphere is analytical even if the gravity varies with height. It is given by where M is the molar mass of the gas, R the universal gas constant, and g(z) the gravity at a given altitude. This entails where the last integration is carried out numerically between the lowest and the highest point of the segment. The increased accuracy of this method comes from two reasons: i) the variation of gravity with height is built in, and ii) more importantly the exponential variation of pressure is fully accounted for. This explains why a much lower number of vertical layers are needed with this method to reach numerical convergence.

Molecular lines
Pytmosph3R deals with molecular absorptions in two possible ways: tables of monochromatic cross-sections or correlated-k coefficients (also called k-distribution method; Fu & Liou 1992).
Using k-distributions considerably reduces the computing time but new k-tables must be pre-computed each time one wants to change the atmospheric composition or the resolution of the output spectrum.
Because our code is designed to work with the TauREx retrieval code (Waldmann et al. 2015), we use the same set of highresolution cross-sections produced by the ExoMol project. This data set is precomputed on a T − log P grid going from 200 K to 2800 K every 100 K and from 10 −3 to 10 bar every 0.3 dex. Pythmosph3R users can either choose linear or optimal interpolation in temperature. For the optimal interpolation we follow Hill et al. (2013) who prescribes where i is the molecular/atomic species index, λ the wavelength, and T the temperature of the cell. The (a, b) scaling factors are given by where T u and T l are upper and lower temperatures respectively. k-distributions can only be interpolated linearly in temperature. Along the pressure coordinate, the interpolation is log-linear. The effect of using different interpolations schemes has been tested and we find that it does not introduce significant differences.

Continuum absorptions
Our principal source of continuum opacity is due to collision induced absorptions. We account for this process for all the species for which such information is available in the HITRAN database and following the prescriptions of Richard et al. (2011). Furthermore, for specific species such as water vapor, we can add a continuum that is accounting for the truncation of the far wings of the lines and the neglect of many weak lines in some of our sets of cross-sections or correlated-k tables. In such case, the water continuum is added using the CKD model (Clough et al. 1989). We however note that care must be taken to ensure that the molecular opacities used must be computed consistently to do not count some effects twice.

Rayleigh Scattering
Multiple scattering is neglected. The contribution of Rayleigh scattering is thus treated as a simple extinction. The crosssection of any single gas molecule is given by the common formula where λ is the wavelength (here in m), N std is the number density of a gas under standard conditions, n λ is wavelength-dependent, real refractive index of the gas and F k (λ) is the King correction factor which accounts for the depolarization. The accuracy of this essential part of the radiative transfer mainly depends of the calculation of refraction indices. We used the most recent data available in the literature. For sake of completeness, we have reviewed the parametrization that we use for H 2 , He, H 2 O, N 2 , CO, CO 2 , CH 4 , O 2 and Ar in Appendix D and in Table D.1.

Mie scattering for aerosols
Transmission spectra of transiting exoplanets are affected by clouds and hazes. The LMDZ GCM can include cloud physics and provide the properties and 3D distribution of liquid/solid condensates and aerosols. Assuming spherical particles with a size similar to or larger than the considered wavelengths, we use Mie scattering formalism compute their extinction factor Q ext and resulting opacities, following the same method as in radiative transfer modules of the GCM (Madeleine et al. 2011). We linearly interpolate the value Q ext on effective radius and wavelength using pre-calculated lookup tables. The absorption coefficient is estimated as where Q ext is the extinction coefficient for the wavelength λ and a given effective radius r eff , q k and ρ con are the mass mixing ratio and the density of the species considered, and ρ gas the total gas density.

Generation of transmittance maps and spectra
To generate the global absorption spectrum of the planet, the wavelength-dependent (ρ, θ) map of optical depth is first converted into a transmittance map In the most general case, the in-transit flux should be computed by convolving this transmittance map with a given surface brightness distribution for the star. However, in the most simple case of a homogeneous stellar disk the effective area of the planet reads where R p is the radius of planet (at the bottom of our model atmosphere), S ρ = 2π(ρ + ∆ρ/2)∆ρ/N θ , N θ the number of θ points, and ∆ρ the layer thickness. Eventually, the relative dimming of the stellar flux due to the planet is given by where R is the stellar radius. For each monochromatic transit depth, we can determine an effective radius of the planet defined as

Validation approach
In the following subsections we validate the different modules of the code: calculation of the vertical structure, 3D geometry of the radiative transfer, integration scheme for the optical depth, interpolation scheme for the opacities, etc. In the absence of another validated code able to produce synthetic transmission spectra from 3D simulations, we decided to compare our code with a hierarchy of 1D models with increasing complexity-from a purely analytical model (Sect. 3.2) to the more realistic forward spectrum generator used in the TauREx retrieval tool (Waldmann et al. 2015;Sect. 3.3).
In each case, the comparison with a given 1D model proceeds as follows: On one hand we generate a spectrum from the 1D model with a given vertical temperature profile (possibly isothermal). On the other hand, we use the same vertical profile to generate a full spherically symmetric-3D structure on our spherical grid. This structure goes through our whole code to generate a spectrum that should, ideally, be the same as the 1D one. Finally, we increase the resolution of our 1D/3D grids until convergence is reached.
It appeared relatively early in our tests that, to numerical precision, our transmittance map were completely insensitive to the azimuth angle, θ, in any spherically symmetric configuration, as it should be. This not completely trivial result shows that our careful way of computing the path length of the ray in each cell does get rid of the singularities -in particular at the poles -of our initial longitude/latitude grid. This also allowed us to test the convergence of the algorithm of vertical integration at very high resolutions. Indeed, when a very large number of layers is used -above ∼1000, which is already much larger than what needs to be used in practice to be converged -the computing time for the full 3D code becomes prohibitive. For this reason, in some cases below, the results shown are derived from a sector of the limb only (i.e. a given θ). This approach is made possible by the fact that the numerical differences between the various sectors is negligible.
As expected from our analytical arguments and further demonstrated in the next subsections, we find that the most, and in fact almost only, important parameter determining the resolution needed to model a given atmosphere is the ratio of the planetary radius to the scale height (R p /H): the lower this parameter, the larger the vertical resolution needed. It stems from two reasons. A lower ratio means that i) gravity will vary more significantly in each vertical layer, and ii) curvature effects will be more pronounced and rays will pass through more atmospheric columns along their path (see Fig. 2). As a result the validation cases we present hereafter focus on covering a wide variety of R p /H. However, because the scale height is not set, but determined from the composition and temperature of the atmosphere and from the surface gravity, fixing R p /H entails varying one of these parameters to compensate. For each of the validation setups described in the following sections, the compensation parameter is thus always stated. This can sometimes result in a physically inconsistent set of planetary parameters that we deem acceptable in the context of our validation. The values used for the other parameters are given in Table 1, unless otherwise stated.
As will become clear in the following, our 3D model can reproduce our validation cases to numerical precision. We thus used these tests to further derive some guidelines on the number of layers and the model roof pressure to use in various cases to reach a satisfactory accuracy. To quantify what we call a "satisfactory accuracy", we use the difference on the predicted transit depth between two models and ask that it be significantly smaller than the photon noise that can be expected for the given target in one transit. If one wants a more stringent condition, one can use the expected systematic noise floor for a given instrument (expected to be on the order of a few ppm with JWST, for example).

Validation with a monochromatic, analytical model at constant gravity
To focus on the radiative transfer basics we first compare the optical depths computed with our model with those given by an analytical solution. For an isothermal, horizontally uniform atmosphere with constant gravity and a scale height H R p , Guillot (2010) provides an analytical formula for the transmission optical depth as a function of the distance to the center of the planet: where τ tr is the chord optical depth, τ ⊥ the vertical optical depth. As described in Appendix A, we use this formula to calculate the planet transit. We then compare the results with those of Pytmosph3R for a wide range of vertical resolution: 50 to more than 10 000 levels. For 1000 levels and more, and as long as the thin atmosphere assumption is respected, the effective absorption radii from both models agree within ∼ 1 centimetre (i.e. ∼ 10 −9 relative accuracy). As one of our main goals is to be able to perform a retrieval on a spectrum derived from a 3D model as if it were from a real planet, we decided to use the forward model of an existing, and validated, atmospheric retrieval code for our comparison. The added benefit of this validation approach is that we make sure that in the relevant case of spherically symmetric atmosphere, both our 1D and 3D forward model can produce spectra without any numerical biases-meaning here that the differences between the spectra can always be reduced to be much smaller than the noise that will be prescribed in the retrieval step by choosing a sufficient vertical resolution.
As a result, we know that when we retrieve the properties of an atmosphere generated with our 3D model, the various potential biases in the retrieved parameters are entirely due to the heterogeneities of the atmosphere and not the differences in the numerics.
In our case, we are using TauREx, whose forward model is described in more details in Waldmann et al. (2015). For the reasons mentioned above, we decided to use the same philosophy used in TauREx in the implementation of many physical processes, in particular concerning the molecular opacities and the Rayleigh scattering. Two notable exceptions are i) the vertical grid and ii) the optical path calculations that will be discussed more specifically in the next section. Let us just mention here that our requirement that our 3D model be as compatible with TauREx as possible is the reason why we keep two algorithms for the calculations of the optical depth in the code: the integral scheme which is the optimal scheme in terms of convergence and should be used in general, and the discretised one that always gives a result closer to TauREx when the same number of layers are used in both models.

Validation for a isothermal, gray atmosphere with varying gravity
We now assume an isothermal atmosphere with an altitudedependent gravity. We consider a uniform composition of H 2 /He/H 2 O. In this first step, to validate only the geometrical part of the code, we assume that H 2 and He are transparent, and that water has a gray opacity that is independent of temperature and pressure. In this set of simulations, the scale height is set to the prescribed value by adjusting the temperature of the atmosphere, all other parameters being kept fixed to the value given in Table 1.
For each R p /H ratios, we first compute the transit radius of a high resolution reference model using Pytmosph3R with 50 000 layers. Then we compute the difference between this reference and the transit radius given by our three models at various vertical resolutions. This is summarized in Fig. 4 where the results for Pytmosph3R/discretised method are shown by dashed lines, for Pytmosph3R/integral method by solid lines, and TauREx by dots. The various colors are for different R p /H. Note that although we quote some planet names for each ratio to give an idea of what type of planet it describes, calculations where all performed for the same planetary radius (1 R J ).
The conclusions that can be drawn from this test are the following: • All three models converge toward the same result as vertical resolution is increased so that the differences can be reduced to an arbitrarily low value, • As advertised, atmospheres with a low R p /H require less vertical layers to reach a given accuracy, • For the same number of layers, the accuracy of our integral version of the code is orders of magnitude better than the discretized one. This is thus the preferred mode for most applications, • As expected, the discretised mode, although less accurate, is always closer to TauREx and should probably be used if a retrieval analysis is to be performed. This indeed introduces as little biases as possible during the retrieval step.
In terms of the number of layers needed to reach convergence -the precision needed depending of course on the observations -the integral mode of Pytmoshp3R requires as little as 20-30 layers in almost all practical cases of interest. This yields about 2-3 points per pressure decade or about 1 per scale height. For the discretised version as well a for TauREx and indeed most forward model using the same philosophy, 100 layers are generally enough, but this should be taken with caution. Especially hot or low gravity objects require a finer resolution and some published models have probably reached convergence only marginally.
Despite our efforts, it can be seen that for a given number of layers, the discrete version of Pytmosph3R is not exactly equiv- Fig. 5. The figure shows the decimal logarithm of the reduced chi-square for the effective radius between our 3D model and the full forward model from TauREx over a range of temperature between 200 and 2600 K and of R p /H between 40 and 260. As a comparison, all planets described in Fig. 4 have R p /H > 100. The black line shows where the reduced χ 2 equals unity, i.e. where model differences become insignificant compared to the noise. We assumed an error calculated with a sun-like star at 100 lyr, 1 hour of integration and JWST instruments. The same work has been done for four vertical resolutions : 50, 100, 200 and 500 layers. alent to TauREx. We find that this small discrepancy is fundamentally due the fact that our vertical grid uses altitude rather than pressure levels as usually done in 1D models. When gravity varies with altitude, an iso-altitude grid is not equivalent to an iso-log pressure one, hence the small difference.

Comparison with the full TauREx forward model
Here we compare spectra from Pytmosph3R and TauREx for isothermal atmospheres with the same composition as before. There are two differences however. First, we now compute a full spectrum with all our opacity sources varying with wave-length, temperature, and pressure. Second, we now fix the atmosphere temperature beforehand and adapt the surface gravity of the planet to get the desired R p /H ratio. This stems from the fact that will now impact the molecular features directly and not only through the scale height. This allows us to test our interpolations in the opacity databases as well.
As the model produces a full spectrum and not only a monochromatic transit radius, we quantify the differences between the two codes using a reduced chi-square calculated as follows: where δ 3D,λ and δ 1D,λ are the transit depths at the wavelength λ given by Pythmosph3R and TauREx, N λ is the number of spectral bins, and σ hν,λ is the stellar photon noise computed for a JWST observations. The latter is given by where .
The parameters are τ λ is the system throughput, R and T s the radius and temperature of the host star, A the collecting area of the telescope (here 25 m 2 ), ∆t the integration time, and d the distance of the target. In our example, we considered a Sun-like star, a 1-hour integration, and a system at 100 ly. Increasing the noise tends towards making Pytmosph3R and TauREx spectra indistinguishable.
We computed the logarithm of this reduced chi-squared as a function of the atmospheric temperature and R p /H for four different vertical resolutions. Results are shown in Fig. 5.
As already discussed, we can see that the agreement between the two codes at a given vertical resolution generally increases with R p /H. At a given R p /H ratio and in this specific example, the agreement also slightly depends on temperature. This is due to the overall increase of H 2 O opacity with temperature that pushes the opaque region upward and increases the limb opening angle (seeFig. 2). With high enough vertical resolution the two models can however agree well within the noise budget. As the lowest encountered R p /H is about 100 (for a hot-Neptune like GJ 1214 b), we confirm that 50 layers should be sufficient in most cases. These maps can however be used as a guide to chose the resolution needed if a more stringent case is found.

Effect of model top altitude
When computing transmission spectra, if the model does not reach a height in the atmosphere that is transparent enough at all wavelengths, the transit radius of the planet may be underestimated in opaques parts of the spectrum. The choice of the model top pressure thus generally results from a trade-off between computation time and convergence.
This point is even more crucial here because several other technical reasons can limit the maximum altitude of a 3D climate model to a few tenths of pascals or more (short radiative timescales, absence of non-LTE radiative treatment or conductive heat transfer, etc.).
To illustrate these limitations, Fig. 6 shows two spectra computed by Pytmosph3R for a simulation of GJ 1214 b (see Sect. 4). The black spectrum uses directly the outputs of the global climate model and stop around a pressure level of 0.5 Pa. The blue spectrum is obtained by extending the model upward to 10 −4 Pa assuming that the atmosphere remains isothermal above the GCM model top. For this particular case, this is necessary and sufficient to reach a degree of convergence commensurate with the future precision of the data (∼ 1 − 10ppm). Extending the model to even lower pressures does not significantly change the resulting spectrum.

The case of GJ 1214b as an illustration of Pythmosph3R capabilities
In this section, we apply Pythmosph3R to GCM simulations of the atmosphere of GJ 1214 b to illustrate the possibilities of the code. This also allows us to show examples of horizontal inhomogeneities and some of their effects on transmission spectra. Despite the flat spectra currently obtained with WFC3/HST, possibly due to high-altitude clouds/aerosols (Kreidberg et al. 2014), GJ 1214 b is one of the rare known targets that is not a gas giant but still offers a favourable configuration for transmission spectroscopy : the vicinity to the Sun (13 pc), a low-density implying the presence of an atmosphere, a short period (1.58 d), a red dwarf host and an expected high scale height to radius ratio (see Fig. 2). Modelling the formation, distribution and spectral signature of aerosols motivated the use of a 3D atmospheric model of the atmosphere (Charnay et al. 2015) that we used to compute transmission maps and spectra with Pythmosph3R.

Input 3D atmospheric model
We use the 100 × solar metallicity simulation of Charnay et al.   Charnay et al. (2015). The inner white disk represents the inner part of the planet with a radius assumed to be equal to the one of GJ 1214 b (17.600 km). Top: Temperature at the terminator. The planet is seen from the observer during transit, the poles being at the top and bottom. Bottom: The star is on the left and the observer on the right on the z = 0 axis. The poles are on the x = 0 axis. From center outward, the 5 solid lines are respectively the 10 6 , 10 3 , 1, 10 −2 , and 10 −4 Pa pressure levels. The outer circle is there as an eye guide to highlight the non-sphericity of the planet. Temperature is well homogenized below ∼10 3 Pa level. Maps are to scale and show that the dayside is noticeably more extended vertically than the nightside.
An important assumption in this simulation is the local chemical equilibrium : in each cell, the composition is imposed by the local temperature, pressure and elemental composition. As the simulation assumes a circular orbit with null obliquity and a synchronized rotation, the state of the atmosphere after convergence is fairly stable, exhibiting only stochastic variations. We use an arbitrary timestep to produce synthetic spectra. The thermal structure is shown in Fig. 7 and the absorbers/clouds distribution in Fig. 8. We redirect the reader to Charnay et al. (2015) for more details on the model.  Charnay et al. (2015). For KCl and ZnS, only the molecules in the condensed phase are considered. The dashed (solid) black curves are the contour of unit optical depth with (without) cloud opacities at a wavelength relevant to the molecule considered (1.17 µm for CH 4 ; 4.30 µm for CO 2 ; 1.08 µm for KCl et ZnS). The size of the inner white disk delimiting the base of the atmosphere is reduced for clarity. The whole simulated atmosphere is shown. The two white circles delimit the region enlarged in Fig. 10.

Transmittance maps
In order to understand the global transmission spectrum Pyth-mosph3R offers the possibility to draw transmittance map in any spectral bin of the spectrum. Viewing the azimuthal and vertical inhomogeneities provides some important piece of information to interpret spectral features. Fig. 10 shows transmittance maps for GJ 1214 b simulated atmosphere, with and without the effect of clouds. Temperature is directly or indirectly at the origin of the inhomogeneities. Colder regions at the western terminator are less extended vertically and locally induce a smaller absorption radius. As the simulation imposes a local chemical equilibrium, colder regions are also poorer in CO 2 and richer in CH 4 , affecting the transmittance distribution in the displayed CO 2 (14.9 and 4.3 µm) and CH 4 (1.17 µm) bands. In the 14.9 µm CO 2 bands another thermal effect comes from the strong temperature dependency of the absorption cross-section, which strongly enhances the transmittance inhomogeneities compared with the 4.3 µm band, much less sensitive to the temperature. The effect of clouds is the strongest at 0.95 µm although they mask some chemical inhomogeneities at high pressure.

Comparison with averaging methods based on 1D models
In the absence of a code like Pythmosph3R, producing a transmission spectrum from a 3D simulation using a radiative transfer model based on a horizontally-homogeneous atmospheric profile implies an average of some kind.

Mean profile
One method consists in averaging first the atmospheric quantities and then computing a transmission spectrum. One single atmospheric profile (temperature, pressure, chemical abundances and cloud properties as a function of altitude) is obtained by averaging the 2N − 2 profiles found on the terminator, weighted by the fraction of azimuth covered by each cell (N being the number of latitude points on the simulation grid). The transmission spectrum is then computed assuming that this atmospheric profile covers the whole planet. Fig. 11 shows the comparison between spectra resulting from this mean profile method and those generated by Pytmosph3R.
The difference is in large part due to the lower temperature patch at the morning terminator near the equator (west of the substellar point) which is due to the equatorial jet bringing cold air from the nightside there (see Fig. 7). This difference is considerably larger than the expected photon noise with JWST over most of the spectrum making this method clearly inadequate. This, in a sense, already highlights why 1D retrieval methods may be biased when interpreting the spectra of real atmospheres.  Fig. 8. The region shown between the inner white disk and the outer black dotted circle is an enlargement of the region delimited by white circles in Fig. 8. The white dotted circle corresponds to the effective radius: the radius of the opaque disk resulting in the same overall absorption as the simulation over a homogeneous stellar disk. The maps in the upper row are obtained with Mie scattering turned off in order to show the effect of clouds. Clouds dominate near 0.95 µm, methane is the main gaseous absorber at 1.17 µm, and carbon dioxide at 4.30 µm.

Limb integration
A better technique that is usually applied, for instance by Charnay et al. (2015), Way et al. (2017), Parmentier et al. (2018), or Lines et al. (2018, involves computing in a first step 2N − 2 transmission spectra assuming a horizontally-homogeneous atmosphere, one for each atmospheric column found on the terminator. Then the total transmission spectrum is calculated as an average of these intermediate spectra, weighted by the fraction of azimuth covered by each column. Fig. 12 compares spectra obtained with this limb integration technique and shows the comparison between spectra resulting from this approach (red line) and from Pytmosph3R (red line).
A rapid comparison of Figs. 11 and 12 clearly shows that limb integration performs much better than the mean profile approach. However, there still remains significant discrepancies throughout the spectrum and especially in some molecular bands. By definition, these differences only come from the atmospheric inhomogeneities along the path of the ray. They are due to the effect of the • Day to night temperature gradient: as the dayside is hotter than the night side, the vertical extent of the atmosphere changes along the ray. As shown in Sect. 5, this causes a net increase in absorption visible in the water bands (see lower panels of Fig. 12). Although this effect is on the order of the photon noise for a single transit for the (relatively) cold atmosphere of G J1214 b, it can strongly affect the retrieval of the properties of hotter planets as demonstrated in the next section.
• Day to night compositional gradient: for the absorption bands due to absorbers with an heterogeneous distribution (like CO 2 is our case, which absorbs prominently at 4.5 and 15 µm) a change of composition along the line of sight creates a signal that is much greater than the expected noise and that cannot be modelled by the limb integration method. See below for the possible causes of such a compositional gradient. Although it is the most prominent effect in our GJ 1214 b model, the parameter space to cover to fully quantify it is large and will have to wait a future study.
• Day to night asymmetry of the cloud distribution: the temperature change at the terminator, in turn, allows us to expect changes in the properties of the clouds there (Lee et al. 2016). This will also be looked at in a future study. Fig. 11. Transit spectra computed from a 3D simulation of GJ 1214 b atmosphere by Charnay et al. (2015) with Pythmosph3R (black) and a mean profile approximation (see text). Spectra are shown with (right) and without (left) the radiative effects from the KCl and ZnS clouds. In both cases the atmosphere is extrapolated vertically beyond the top of the simulation. Cloud particle radius is fixed to 0.5 µm. The plots on the lower part show the difference between the two methods. The purple line indicates the photon noise with a JWST aperture, an exposure time of twice the transit duration and a (low) resolution of R = 100.

Comments on the possible causes of compositional heterogeneities
In the model of GJ 1214 b we use, the concentrations of CO, CO 2 and CH 4 are computed assuming chemical equilibrium. For such atmospheres and cooler ones, the 3D variations of these species are thus strongly overestimated. While the hottest regions of the atmosphere (above ∼ 1000 K) are expected to reach equilibrium faster than typical dynamic timescales, it is not the case in the coldest layers (below ∼ 700 K) probed by transmission, where endothermic reaction become extremely slow. As a consequence, thermochemistry is expected to produce a more homogeneous composition controlled by the hottest/deepest regions. However, more irradiated planets must have overall hotter atmospheres spanning a larger range of temperatures (due to shorter radiative timescales) with very different equilibrium compositions and kinetics fast enough to main local equilibrium (Agúndez et al. 2014). These atmospheres are expected to exhibit the strongest horizontal variations of temperature and composition. Parmentier et al. (2018) recently showed that water itself may be dissociated on the dayside of some hot-Jupiters and recombine close to their terminator. In addition, UV-driven photochemistry may create additional heterogeneities by allowing some reactions to take place on the dayside of the planet only.
We can thus expect hot planets to exhibit strong day to night compositional gradients that may become a dominant issue in retrieving atmospheric properties through transmission spectroscopy. Quantifying the stellar irradiation at which these effects become significant will need further modelling.

Effect of Day/night side temperature differences on retrieval
As visible in Fig. 2, the region probed in primary transit is much larger than is usually acknowledged, especially on hot and/or low gravity objects. So during transit, we are not only probing a thin plane -the so-called terminator -but an area that can extend significantly on both the day and night sides. Because these two parts of the planets are expected to exhibit quite different temperatures (as visible in Fig. 7), it seems important to quantify the extent of the imprint of this temperature inhomogeneity on the transit spectrum of the planet, and how it will affect any attempt to retrieve the temperature at the terminator.
To answer these questions, we conduct a simple experiment. For two prototypical planets (respectively based on GJ 1214 b and HD 209458 b), we build idealized 3D atmospheric structures that are symmetric about the star-planet axis but that continuously go from a high temperature T day on the dayside to a lower one, T night , on the nightside. The atmosphere is assumed to have a uniform composition to enable us to concentrate on thermal effects. We then simulate the transit spectrum and try to invert it. Finally, the retrieved temperature is compared to the input one.

Parametrization of the atmospheric structure
The structure that we chose for the atmosphere is inspired by the 3D simulations of Charnay et al. (2015) for GJ 1214 b whose temperature distribution is shown in Fig. 7. Its most salient feature is the continuous transition in temperature between the day and night side. For sake of simplicity, we assume that this transition occurs linearly over a region that is parametrized by its opening angle -hereafter called the transition angle (β).
Moreover, as is well known and further exemplified by Fig. 7, the atmosphere of gaseous exoplanets is usually well mixed at depth. The temperature is thus assumed to be uniform below a pressure level P iso (that will taken to be 10 mb as in the simulation, although some models do predict inhomogeneities to persist at deeper levels).
In summary, for a given location identified by its pressure, P, and the (possibly negative) local solar elevation angle, α , the temperature is given by Examples of such idealized atmospheric structures are shown in Fig. 13 for three different day-night transition angles. We chose to show the structures that are most representative of the real GJ 1214 b case to allow for a direct comparison with Fig. 7. For brevity we will also refer hereafter to the "uniform case": this stands for a case where the whole upper atmosphere has a uniform temperature equal to the one at the terminator that serves as a comparison. Therefore, in this case, Once our four parameters -T day , T night , β, and P iso -have been chosen, the 3D atmospheric structure is integrated from the surface of the planet that is assumed to be the 10 bar level. Hereafter, we call the radius of this isobar R p , and assume that it contains most of the mass of the object so that the gravity above this level only depends on the altitude.
The reason we are not performing such test on idealized structures and not more realistic ones fro a GCM is that we want to isolate the effect of the day-night heterogeneities. In a GCM simulation, there would also be vertical and equator-pole temperature variations that would preclude the identification of a single effect.

Effect of the day/night temperature difference on the transmission spectrum
One might naively except that because of the symmetry of our temperature distribution, the contributions of the hot and cold sides should cancel out. This is however not the case, as can be seen in Fig. 14. Indeed, by comparing the transmittance map for a given planet in a uniform case, or with a day-night temperature gradient, we directly see that the opaque region extends higher up in the latter case. The greater scale height on the dayside is not compensated by the lower one on the night side.
This can be understood easily using a slightly modified version of the analytical model of Guillot (2010) or Vahidinia et al. (2014) where we separate the atmosphere into two hemispheres that differ only through their temperatures -and thus atmospheric scale heights (H day and H night ) -above the pressure level P iso which is located at an altitude z iso above the reference radius of the planet R p . Below z iso , the scale height is the same everywhere (H iso ). In essence, this corresponds to the case described above in the limit where β → 0.
We follow the notations in Fig. 1  These color maps show the temperature distribution of our model atmospheres in a plane perpendicular to the terminator for three different transition angles between a 650 K dayside and a 300 K nightside (from left to right, the day-night transition angle is β=15 • , 30 • , and 60 • ). The inner white circle represents the inner part of the planet with a radius assumed to be equal to the one of GJ 1214 b (17.600 km). The star is on the left and the observer on the right on the y = 0 line. From center outward, the 5 solid lines are respectively the 10 6 , 10 3 , 1, 10 −2 , and 10 −4 Pa pressure levels. Below the 10 3 Pa level, the atmosphere is assumed to efficiently redistribute heat and is horizontally isothermal. These maps are to scale and show that the dayside is noticeably more extended than the nightside. The parameters are T day = 1800 K, T night = 1000 K, and P iso = 10 mb, that are representative of the real planet (Parmentier et al. 2013). A low transmittanceblue shades -is representative of opaque regions deep down and a transmittance near unity -yellow -is representative of the transparent upper atmosphere. The left panel shows the case with a uniform upper temperature of 1400 K, and middle one, the case with a β = 15 • transition region between the day and night side. The higher opacity of the middle case is further highlighted by the right panel that shows the map of the transmittance difference between the left and the middle case. The altitude of the top of the isothermal region is ∼ 3 800 km. entail that the number density at a given altitude is where i is either day or night depending on the hemisphere.
In the limit where all the altitudes in the atmosphere are small compared to R p , the slant optical depth is the sum of the day and night side contribution, yielding  Fig. 15. Spatially integrated spectra of the relative transit depth (in ppm) expected for GJ 1214 b (left) and HD 209458 b (right) as a function of the day-night transition angle (β). The color goes from red to green when β goes from 0 • to 50 • every 15 • . The blue curve is the reference uniform case. Spectra with larger β are undistinguishable from the uniform case. The different labeled groups of curves are for our various sets of day-night temperatures. For each (T day − T night ) group, the temperature at the terminator of all the models -including the uniform one -is (T day + T night )/2. It can be seen that the transit depth increases monotonically when β decreases.
if z t > z iso , and if z t < z iso . Note that in the above equations, σ mol is the mean cross section of the gas normalized to the total density. Now, to see the increase in optical depth, let us divide the result above by the optical depth in the uniform case (τ uni ) given by Eq. (A.3). This yields where ∆H ≡ (H day − H night )/2 so that (H day , H night ) = H iso (1 ± ∆H), and ∆z = (z t − z iso )/H iso . The expansion in ∆H gives First, we see that because of the symmetry of the setup, the first order term disappears. Second, it readily results that the optical depth in the heterogeneous case is larger than in the uniform case for all the rays with a tangent altitude that is (1 + √ 2)/2 ≈ 1.2 scale heights greater than the altitude of the isothermal region. This qualitatively explains why there is little difference in the transmittance below the altitude of the isothermal region (∼ 3 800 km) in Fig. 14.
Finally, when these transmittance maps are integrated vertically, we get the transmission spectra shown in Fig. 15. As expected, the effective altitude at which the atmosphere becomes opaque systematically increases when the horizontal thermal gradient is increased near the terminator.
An important point is that the spectrum for the non-uniform case -although different from the spectrum that is be obtained for a uniform atmosphere with the temperature of the terminator -can be similar to the spectrum that would be obtained by a uniform, but globally hotter atmosphere. It is thus not surprising that a retrieval algorithm would be biased and retrieve a hotter atmosphere.

Bias in retrieved temperatures
The last step of our analysis is to actually run a 1D retrieval procedure on the transmittance spectra obtained with our parametrized 3D atmospheric structures to assess the biases entailed by such an approach. To do so, we use the JWST simulation tool PandExo (Batalha et al. 2017) to simulate the expected uncertainties over a wavelength range of 1.0 -10 µm for the duration of a single transit. For this, we combined the simulated spectra of the NIRISS/SOSS, NIRSpec/G395M and MIRI/LRS instruments. Observed uncertainties above 10 µm were too large to have a significant impact on retrieval results and were discarded. Finally, we binned the simulated observations to a resolution of 100 constant in wavelength. Given that we are investigating biases due to the model rather than observational noise, we here only consider the uncertainties calculated by PandExo but do not add additional noise to the mean of our simulated observations. Significant biases in the posterior distributions of retrieved parameters can occur when considering single random noise instances of the data. To correctly alleviate such noise-induced biases, one would need to combine posterior distributions of multiple noise-instance retrievals. Fortunately, Feng et al. (2018, sec. 5.2) have shown the combination of multiple noise-instance retrievals to converge to the noise-free retrieval posterior distributions, as expected from the central limit theorem. We have furthermore neglected the inclusion of non-Gaussian noise due to instrument or stellar systematics, as these are either not currently known and/or data set dependent. We therefore note that the retrieved parameter uncertainties presented here are theoretical lower limits. For each model of our grid in temperature and day-night transition angle, we ran the spectrum through the TauREx retrieval software (Waldmann et al. 2015). Here we considered water as the only trace gas with absorption cross sections computed using the Barber et al. (2006) line list. We include Rayleigh scattering and collision-induced absorption of H 2 -H 2 and H 2 -He (Borysow et al. 2001;Borysow 2002;Rothman et al. 2013) and assumed the atmosphere to be cloud-free. The vertical temperature-pressure profile was modelled to be isothermal. A typical posterior distribution for the retrieved parameters resulting from this procedure is shown in Fig. 16, along with the input simulated spectrum and the fitted one.
The retrieved temperatures and water abundances as a function of the transition angle (β) for our GJ 1214 b case are shown in Fig. 17. Fig. 18 shows the temperature result for HD 209458 b. We tested different day and night side temperatures to see how planets with various irradiations would behave. To be able to compare the retrieved temperature (T ret ) in these different cases, we use the relative retrieved temperature which should thus be equal to 0.5 if we were to retrieve the terminator temperature. Although there are some quantitative differences among these cases, some robust trends emerge: • The retrieved temperature is systematically biased toward a higher temperature than that of the terminator (Θ ≥ 0.5). • There are two regimes separated by a critical day-night transition angle that depends on the characteristics of the planet and is roughly consistent with our estimate of the opening angle of the part of the planet that is probed in transit, i.e. the limb (ψ, denoted by a vertical dashed line; See Fig. 2).
• For β < ψ, the retrieved temperature decreases roughly linearly with the transition angle. As the transition angle between the day and night side goes to zero (very sharp transition expected for the hottest planets), the retrieved temperature approaches the dayside temperature. • For β > ψ, the temperature structure within the limb, hence the retrieved temperature, does not vary much. Whether the actual retrieved temperature is equal to the temperature at the terminator depends on the case (see below). • Despite the uniform composition in our models, the retrieved abundance is always significantly biased -in the sense that the real abundance is outside the formal error bars of the retrieval -although the magnitude and direction depends on the specific case. It is sensible to assume that other more complex biases will arise if chemical gradients are present as well.
If in the HD 209458 b case (see Fig. 18), the retrieved temperature converges toward the temperature at the terminator when the latter becomes more uniform (β → 180 • ), it is not necessarily the case for GJ 1214 b. We find that this absence of convergence at large angles always occurs when the hot, deep atmosphere below the P iso level is probed by the transit spectrum: the retrieval is biased by the vertical temperature gradient. This does not happen for our Hot Jupiter case because of the larger radius that push the transit photosphere at lower pressures. Although an important bias in itself, it has already been studied by Rocchetto et al. (2016), and will not be further discussed here.

Could we see that something is wrong?
Could an observer, having performed the retrieval, detect that the retrieved quantities are biased by the day to night temperature gradient? This is indeed a crucial point.  Fig. 17. Retrieved temperature (top) and water abundance (bottom) as a function of the transition angle between the day and nightside of a planet with the radius and surface gravity of GJ 1214 b. The colours correspond to the temperature range (Blue: 300-650 K; Black: 500-1000 K; Red: 1000-1500 K). To put all cases on a single diagram we show the relative retrieved temperature, Θ ≡ (T ret −T night )/(T day −T night ), so that in all cases Θ = 0.5 at the terminator. The retrieved temperature is systematically biased toward the dayside temperature (Θ ≥0.5), especially when the transition is sharp. For comparison the estimate of the width of the limb given by Eq. (3) is shown by the vertical dotted line.

Input Abundance
Unfortunately, this seems precluded, even with the exquisite precision of JWST. As can be seen in Fig. 16, the best-fit 1D, isothermal spectrum does not miss any feature of the input 3D spectrum. In fact, Fig. 19 shows that the reduced χ 2 of the optimal retrieved models are always near or below unity 3 . Based on this metric, the 1D isothermal atmosphere model thus provides an acceptable fit to the data, at least in the low resolution mode that we have explored here. In fact, it even seems that the fit is better when the bias is the strongest (low day-night transition angle). This counter intuitive result comes from the fact that when the temperature transition is sharp, we probe almost exclusively the dayside. The atmosphere thus appears more homogeneous.
A procedure that would use a radiative transfer code similar to Pythmosph3R to retrieve a 3D structure/composition (which would imply formidable computing resources) would admittedly reveal the issue as its posterior distributions would expose the full extent of the degeneracies and result in larger, more reliable error bars on the retrieved quantities. Nevertheless, such a sophisticated tool may not be able to achieve a better retrieval. Dimensionless retrieved temperature as a function of the transition angle between the day and nightside of a planet with the radius and surface gravity of HD 209458 b. The colors correspond to the temperature range (Blue: 500-1000 K; Black: 1000-1500 K; Red: 1000-1800 K). See Fig. 17 Fig. 19. Reduced χ 2 for the fit of the optimal retrieved models from Figs. 17 and 18. The color coding is the same as for those figures. Solid curves are for GJ 1214 b and dashed are for HD 209458 b. As the reduced χ 2 is always close to or smaller than 1, the fit would always be considered satisfactory.
Indeed, even if a better χ 2 may be found with a forward model using a 3D thermal and compositional structure, the 1D model already provides an excellent match and the improvement, if any, would be achieved at the expense of adding so many parameters that parsimony criteria may favour the most simple model. However, a much higher resolution may change this state of affairs. Especially, if we start to be sensitive to the line shape of individual lines. It is also possible that an east-west asymmetry of the terminator could add some signal that would be distinguishable from any 1D profile. This will have to be assessed in a future study.

Which atmospheres are affected?
Since the first detections of the thermal emission of a planet, the existence of a strong day-night temperature gradient on hot extrasolar planets has been well established (Cowan & Agol 2011). A clear trend has even emerged that the hotter the planet, the greater the thermal contrast (Komacek et al. 2017;Keating & Cowan 2018) -a thermal contrast that can reach more than a thousand degrees. The bias on the retrieved limb temperatures on real planets is thus potentially huge. The yet rather obser-vationally unconstrained parameter is the width of the day-night transition region (here β) and how it compares to the width of the limb that is effectively probed in transmission (ψ). Since phase curve observations have not quite yet the resolution necessary to precisely measure the width of the transition, we turn to published models.
For Neptune-like planets such as GJ 1214 b, our predictions based on a GCM tell us that the thermal-only effect of the daynight temperature gradient is on the order of 50-100 ppm in the water bands, and clearly detectable by JWST (see Fig. 12). It is thus sensible to assume that any hotter Neptune-like planet should be increasingly affected because of their higher H/R p ratio and their stronger thermal day-night contrast (see Fig. 15). On the contrary, how cold -the stratosphere of colder planets being more uniform -will a planet need to be before such effects are undetectable remains to be elucidated.
However, keep in mind that we discussed here only the direct thermal effect, leaving out the possible chemical heterogeneities. The strong signature of day-night gradients in the CO 2 and CH 4 bands the spectrum in Fig. 12, which can reach 600-1000 ppm, are believed to be of even greater importance and could possibly strongly hamper the ability of conventional retrieval algorithm to retrieve meaningful molecular abundances in the case of heterogeneous atmospheres.
Such considerations are even more important for hot-Jupiters. Despite their higher gravity, they can be much hotter, which entails that the day night contrasts are expected to be much stronger-both thermally and compositionally because chemical timescales are expected to be short compared to the advection timescales. A good example are the so-called ultra hot-Jupiters where it is predicted that some very abundant molecules on the nightside, such as water, could be almost completely absent on the dayside due to thermal dissociation (Parmentier et al. 2018).

Conclusion
Overall, our most important conclusion is that the region of the atmosphere probed in by transit spectroscopy, i.e. the limb, is not confined to a narrow annulus around the planet as often implied, but can indeed extend relatively far throughout its two hemispheres. This is especially true for hot and/or low gravity objects, the most significant metric being the ratio of the atmospheric scale height to the radius of the planet.
As a result, in addition to the variations of atmospheric properties of the atmosphere along the terminator, the transit spectrum is also affected by their variations across the limb, i.e. along the path of the light rays.
To investigate all these effects, we have developed Pytmo-sph3R, a transit spectrum generator that can work with a 3D atmospheric structure, whether it is the output of a global circulation model or a more idealized one. Using this tool along with a 3D atmospheric model of GJ 1214 b, we have recovered previous results that the temperature and compositional variations along the terminator significantly affect the transit spectrum and will have to be accounted for in future studies. These effects can in principle be partially accounted for by using a (1+1)D, or limb integrated, approach where one 1D spectrum is generated for each part of the limb before it is weighted and added to the others to generate the global one.
However, our fully 3D framework has shown that at the precision that will be reached by future observatories, the limb integrated approach is insufficient. Indeed, we have shown that for temperature gradients realistically expected for observable exoplanets, the transit spectrum is significantly affected by the structure of the atmosphere across the limb, i.e. the thermal and compositional gradients between the day and night side of the planet. We further demonstrated that this effect systematically biases 1D retrieval methods toward the temperature of the day side. The extent of this bias, however, depends on the strength of the temperature contrast as well as its sharpness around the terminator, the latter being the most difficult to predict.
In other words, one should be aware of the fact that the temperature (or its profile) retrieved from transmission spectra may not apply to the terminator itself, and that temperatures at the terminator are in fact significantly smaller. This will of course be a routine problem for future high precision observatories, but we have demonstrated that the effect on the spectrum -which can reach hundreds, if not thousands of ppms in some cases (see Fig. 15) -is well above the precision of 50-100 ppm that has been achieved with HST and Spitzer (Cowan et al. 2015). So current observations of some Hot Jupiter are possibly already affected. To what extent? this remains to be elucidated.
We first start be computing the optical depth along a ray crossing the atmosphere of a planet where temperature, gravity, and composition are constant. Here, we will follow the notations of Vahidinia et al. (2014), but the reader is referred to Guillot (2010) for a more in depth discussion. Notations are summarized in Fig. 1.
The constant atmospheric scale height, H, entails that the gas number density is given by n = n 0 e −z/H , (A.1) where n 0 is the density at the reference radius (R p ). For a ray with a given tangent altitude z t , the altitude in the atmosphere at a distance x from the limb plane is given by to first order in z t /R p . The optical depth along a ray from the star to a given position x due to a given species is thus τ tr (z t ) =  where χ is the volume mixing ration of the considered species, and σ mol its cross section at the wavelength considered. Noting that the vertical optical depth is given by τ ⊥ (z t ) = H σ mol χ n 0 e −z t /H , (A.4) we retrieve the result from Guillot (2010) that Following Sect. 2.5, to first order, the transit depth is given by where ρ = R p + z t . Along with Eq. (A.3), Eq. (A.6) is used to validate our model in Sect. 3.2.

Appendix A.2: Heterogeneous composition
In this subsection, we will slightly modify the model above to answer the following question: how far from the limb plane can an increase in the abundance of a given species still affect the transit in the relevant bands. Two answer that, we will assume that the mixing ratio of the considered species is χ day along the line ray for x < x limb (where x limb is negative if the transition is on the day side) and χ night beyond that. This is supposed to mimic a situation where an absorber, like TiO, becomes less abundant at the terminator and on the night side because it condenses at cooler temperatures. Because of the symmetry of the problem, this also treats the situation where an absorber becomes more abundant on the night side. Following Fig. 1, x limb is also parametrized by the limb angle tan(ψ/2) = x limb /(R p + z t ).
If the composition were uniform, the optical depth at a given tangent altitude would be τ tr = 2πR p H σ mol χ night n 0 e −z t /H . (A.7) Using τ tr ∼ 1 as a criterion for the effective altitude of absorption of the atmosphere, we get an effective altitude for the uniform case that is z uni = H ln 2πR p H σ mol χ night n 0 . (A.8) Now, in the heterogeneous case, using the penultimate line of Eq. (A.3) yields which is equal to the uniform case in the x limb → −∞ limit, as expected. The effective altitude is then At what angular distance from the limb plane can we still see such an increase of a given species in the transit spectrum? To answer this question, one needs to quantify at which angle ψ does the resulting change in effective transit altitude of the atmosphere due to the heterogeneity become measurable. This writes δ het − δ uni > σ obs ⇔ z het (ψ) − z uni > R 2 2R p σ obs , (A.11) where σ obs is the relative precision level of the observations. Using the parameters for HD 209458 b and assuming a noise floor of 10 ppm -which is probably conservative for JWST -we see that an increase of the TiO abundance on the dayside of only a factor 100 is visible as far as 15 • from the limb plane. This increase is also conservative as the abundance of TiO at temperature below 1600 K is less that 10 −10 (Lodders 2002). This results in a limb width ∼ 30 • which is consistent with our other estimate (see Fig. 2).