The Acceleration of Superrotation in Simulated Hot Jupiter Atmospheres

Context. Atmospheric superrotating flows at the equator are an almost ubiquitous result of simulations of hot Jupiters, and a theory explaining how this zonally coherent flow reaches an equilibrium has been developed in the literature. However, this understanding relies on the existence of either an initial superrotating or a sheared flow, coupled with a slow evolution such that a linear steady state can be reached. Aims. A consistent physical understanding of superrotation is needed for arbitrary drag and radiative timescales, and the relevance of considering linear steady states needs to be assessed. Methods. We obtain an analytical expression for the structure, frequency and decay rate of propagating waves in hot Jupiter atmospheres around a state at rest in the 2D shallow water beta plane limit. We solve this expression numerically and confirm the robustness of our results with a 3D linear wave algorithm. We then compare with 3D simulations of hot Jupiter atmospheres and study the non linear momentum fluxes. Results. We show that under strong day night heating the dynamics does not transit through a linear steady state when starting from an initial atmosphere in solid body rotation. We further show that non linear effects favour the initial spin up of superrotation and that the acceleration due to the vertical component of the eddy momentum flux is critical to the initial development of superrotation. Conclusions. Overall, we describe the initial phases of the acceleration of superrotation, including consideration of differing radiative and drag timescales, and conclude that eddy-momentum driven superrotating equatorial jets are robust, physical phenomena in simulations of hot Jupiter atmospheres.


Introduction
Understanding the atmospheric dynamics of hot Jupiters, Jovian planets in short period orbits, has been a major challenge since their discovery (Mayor & Queloz 1995). Due to their proximity to their host star hot Jupiters are expected to be tidally-locked (see Baraffe et al. 2010, for review), resulting in a permanent day and night side driving atmospheric circulations with no equivalent in our solar system, which in turn likely mix material between the two hemispheres (Drummond et al. 2018b,c). Cooper & Showman (2005) performed the first study of the atmosphere of HD 209458b (e.g., Charbonneau et al. 2002;Sing et al. 2008;Snellen et al. 2008) using a General Circulation Model (GCM), and such GCMs have subsequently been used extensively to characterise hot Jupiters (e.g., Showman et al. 2008;Heng et al. 2011;Rauscher & Menou 2012;Dobbs-Dixon & Agol 2013;Helling et al. 2016). The physical complexity, or completeness, of these GCMs varies greatly, for example treatments of the dynamics and radiative transfer range from those adopting the primitive equations of dynamics, corresponding author: florian_debras@hotmail.com and simple Newtonian cooling, to those solving the full Navier-Stokes equations and more accurate radiative transfer (see, notably, Amundsen et al. 2014;Amundsen et al. 2016). Recent advances have also been made in the treatment of chemistry, regarding both the gas phase (see Drummond et al. 2016;Tsai et al. 2018;Drummond et al. 2018a,b,c) and the condensates, or clouds Lines et al. 2018b,a;Roman & Rauscher 2019).
A common feature has emerged from almost all GCM studies of hot Jupiters: the atmosphere exhibits equatorial superrotation, a prograde atmospheric wind velocity greater than that arising from the rotation of the planet alone, over a range of pressures. Observations have detected an eastward shift of the peak infrared flux from the substellar point in the atmosphere of hot Jupiters (Knutson et al. 2007;Zellem et al. 2014), consistent with that found in simulations caused by the advection of heat by the superrotating jet. Mayne et al. (2017) attempted to suppress the formation of the equatorial jet in simulated hot Jupiter atmospheres by forcing the deep atmospheric flow, or altering the model parameters. They found superrotation to be a very robust feature in numerical simulations. However, a recent measurement has inferred an opposite, westward shift for COROT-2b (Dang et al. 2018), and Armstrong et al. (2016) previously obtained variability in the position of the hot spot with time suggesting additional Article number, page 1 of 28 arXiv:1911.03182v1 [astro-ph.EP] 8 Nov 2019 A&A proofs: manuscript no. main complexity (a potential link to magnetic fields has recently been investigated by Hindle et al. 2019). Superrotation therefore has to be explained with sound physical arguments. Showman & Polvani (2011) were the first to study the formation of a superrotating jet in simulated hot Jupiter atmospheres using a simplified two-layer model. Exploring the linear steady state of the atmosphere Showman & Polvani (2011) highlighted the formation of a Matsuno-Gill (hereafter MG, see Matsuno 1966;Gill 1980) pattern, where the atmospheric perturbations are 'tilted' in the latitude-longitude plane driving momentum transport to the equator and accelerating the jet. Showman & Polvani (2011) posit that the non-linear equilibrium is reached when the transport of meridional and vertical eddy momentum into the region, acting to accelerate and decelerate the jet, respectively, are balanced by the atmospheric drag. Tsai et al. (2014) extended the study to a full 3D dynamical model including consideration of the resonance of the atmospheric wave response, as well as the 'tilt' of the vertical component which acts to drive the vertical eddy-momentum transport, under the assumption of equal drag and radiative timescales. This was followed by Hammond & Pierrehumbert (2018) who explored superrotation in 2D with the addition of a shearing flow. Perez-Becker & Showman (2013) considered the propagation of waves and the resulting balance for the equilibrated jet and propose diagnostics for predicting the day to night temperature contrast, controlled by the efficiency of the zonal advection. This analysis was later improved upon by Komacek & Showman (2016), across an extensive range of dissipation timescales.
There is however an inherent discrepancy between the works of Komacek & Showman (2016) and Showman & Polvani (2011): when the atmospheric drag timescale is large, superior to a few 10 5 s, the linear steady states obtained in Komacek & Showman (2016) tend to decelerate the equator although the associated non linear steady states exhibit equatorial superrotation. This raises the question: is superrotation properly explained through the transition from a linear steady state ?
Specifically, the study of Tsai et al. (2014) is only valid in the moderate to strong dissipation limit, and that of Hammond & Pierrehumbert (2018) requires an initial sheared superrotation. However, Komacek & Showman (2016) showed that superrotation develops only if the dissipation is sufficiently low (see their Figure 4). Current theories are therefore applicable only once an initial flow has been set up, and its evolution is slow compared to the wave propagation time.
In this study, we address the issue of what is driving the initial spin up of superrotation in simulated hot Jupiter atmospheres. In order to do this we develop a description of the time dependent waves supported by our simulations of hot Jupiters atmospheres with arbitrary drag and radiative timescales, and determine which are responsible for driving the evolution of the jet. Firstly, in Section 2 we state our main assumptions and develop the mathematical framework we adopt throughout this work, before finding the form of the time dependent linear solution to the beta-plane equations. Additionally, in this section we summarise the main results of Showman & Polvani (2011), Tsai et al. (2014) and Komacek & Showman (2016) upon which we base our study. Obtaining the form of the solution to the time dependent case is not sufficient as the controlling parameters remain unconstrained and are not easily accessible analytically. Therefore, in Section 3 we numerically explore the sensitivity of the steady linear solution to the shape of the forcing, or heating, showing that the linear steady state requires a day-night heating contrast but is insensitive to the exact shape of the forcing itself. This confirms that the limitations of the current the-ory do not come from the simplified form of the forcing, but that time-dependent linear considerations must be included. We therefore study numerically the propagating waves, except for the special case of Kelvin waves which have an analytical expression, to build a more complete picture of the physical process of acceleration of superrotation. In Section 4 we determine the characteristic decay timescale for different waves and arbitrary dissipative timescales, which are used to explain the structure of both the linear steady states presented in Komacek & Showman (2016) (their Figure 5) and the time-dependent linear evolution of simulated hot Jupiters. In Section 5, we then combine the understanding developed throughout this study to detail the transition to superrotation in 3D GCM simulations through eddy-mean flow interaction under different conditions, revealing the importance of time-dependent linear considerations as well as vertical momentum transport across different drag and forcing regimes. Finally, we summarise our conclusions in Section 6. Overall, our study shows that the paradigm of equatorial superrotation in hot Jupiters is robust: superrotation is accelerated by an eddy-mean flow interaction (i.e. atmospheric waves interacting with the background flow), and is strongly influenced by the wave dissipation timescales and vertical momentum convergence.

Theoretical framework
For this study we adopt the 2D shallow water equations under the equatorial β-plane approximation. As in Showman & Polvani (2011), the shallow water approximation consists of considering that the planet can be decomposed into an upper, constant density but dynamically active layer with a free surface at the bottom exchanging heat and momentum with a lower, quiescent layer of much higher density. The equatorial β-plane simplifies the spherical planet as a local Cartesian plane at the equator, with the Coriolis parameter depending linearly on the Cartesian meridional coordinate, y, with a factor of proportionality β = 2Ω/R, where Ω is the rotation rate of the planet and R its radius. The further away from the equator, the less valid this approximation is but it allows for analytic solutions, particularly suited for the study of equatorial superrotation. Wu et al. (2000) showed that the 3D structure of solutions to the β-plane system can be decomposed onto an infinite sum of solutions of 2D β-planes with different characteristic heights. The importance of this decomposition regarding hot Jupiter atmospheric dynamics has been emphasised by Tsai et al. (2014), where a vertical shift of the wave response is presented when the mean background velocity is changed (their Figure 10). We begin by summarising the main results of Matsuno (1966), Gill (1980) and Showman & Polvani (2011), all of which solve the 2D β-plane equations.
Following Showman & Polvani (2011), the nondimensional, linearised equations of motion for a forced 2D, equatorial β-plane can be written as where x and y are the horizontal coordinates, t is time, u is the zonal velocity (x direction), v the meridional (y direction), h the height (H) of the shallow water fluid minus the initial, horizontally constant height (H 0 ), i.e. h = H − H 0 , τ drag the drag timescale, τ rad the radiative timescale and Q the heating function. The characteristic length, speed and time, corresponding to the Rossby deformation radius, the gravity wave speed and the time for a gravity wave to cross a deformation radius in the shallow water system are: respectively, where g is the gravitational acceleration assumed constant, and β = 2Ω cos φ/R or the derivative of the Rossby parameter with φ the latitude of the β-plane. In the rest of this paper we only consider φ = 0. Eqs.
(1) to (3) form a linear differential equation of the form ∂X/∂t = LX + Q where X = (u, v, h) is a vector of solutions, L a linear operator and Q = (0, 0, Q) is the vector form of the forcing. Hence the solution is the sum of a homogeneous and a particular solution. The spatial part of the 3D solution can be expressed as an infinite sum of modes indexed by m with equivalent depth H m instead of H 0 . The orthogonal base functions are sinusoidal in z, the vertical coordinate, and of the form e i(mz) with m ∈ N and the heating must be decomposed onto these functions (see Tsai et al. (2014) section 2 for the rescaling of z and Wu et al. (2000) section 2 for a discussion on boundary conditions).
When neither drag nor heating are considered, Matsuno (1966) expressed the analytic solutions to the homogeneous equations in the form {u, v, h} = {ũ,ṽ,h} exp(iωt + ikx), where ω is the complex frequency, k the longitudinal wavenumber and a tilde denotes a function of y only. Dropping the tilde for simplicity, the homogeneous equations can be expressed as: iωv + yu + ∂h ∂y = 0, iωh + iku + ∂v ∂y = 0 . Matsuno (1966) showed that this system reduces to a single equation for v, namely, By analogy with the Schrödinger equation of a simple harmonic oscillator, the boundary condition v → 0 when |y| → ∞ requires with n ∈ N. As this is a third order equation, the eigenvalues for the frequency are labelled ω n,l with l = 0, 1, 2, and the corresponding eigenvectors are labelledX n,l = (u n,l , v n,l , h n,l ),. Finally, the case where n = 0 is treated separately, and the important case where v is identically null is similar to a coastal Kelvin wave, with ω = −k (Matsuno 1966). The form of the solutions in the y direction are expressed through the use of the parabolic cylinder functions ψ n , given by ψ n (y) = H n (y)e −y 2 /2 , where H n is the n th Hermite polynomial. Finally, simple mathematical arguments show that the eigenvalue ω is always real: the homogeneous solutions are only neutral modes. Matsuno (1966) also showed that the eigenvectors of Eqs.(7) to (9) form a complete, orthogonal set of the 2D beta-plane: at a given time, any function on the beta plane can be written as a linear combination of the ψ n (y) exp(ikx) functions. Matsuno (1966) and Gill (1980) obtained the steady state solution to Eqs.(1) to (3) under the inclusion of heating (and cooling) and drag. The completeness and orthogonality of the above functions allows one to write: where q n,l is the projection of Q ontoX n,l . Matsuno (1966) (their Eq.(34)) and Gill (1980) showed that a steady solution X to the forced problem with τ drag = τ rad is given by Showman & Polvani (2011) showed that, for a horizontal wavenumber one representing the asymmetric heating of hot Jupiters and τ drag = τ rad = 10 5 s, the steady linear solution exhibits a 'chevron' shaped pattern (in pressure, density or temperature), and has been denominated the Matsuno-Gill solution, leading to a net acceleration of the equator at the non linear order. However, it is not clear whether a linear steady state is relevant in a case where non-linearities are likely dominant, i.e. hot Jupiters where the extreme forcing is likely to trigger non-linear effects over short timescales, and further whether it is appropriate to choose equal values for both dissipation timescales. Therefore, we require the time dependent solution of Eqs.
(1) to (3) in the general case, which are expressed in section 2.3.

Non-Linear Accelerations from the Linear Steady State
Now that we have reviewed the main assumptions and equations for our basic framework, in this section, we move on to summarising the key results of Showman & Polvani (2011) and Tsai et al. (2014) A key conclusion of Showman & Polvani (2011) is that the Matsuno-Gill pattern is a linear steady state, but the non linear accelerations from this circulation trigger an equatorial superrotation. Consider a linear perturbation (a wave or a steady linear circulation) associated with velocities u , v , w in the longitudinal, latitudinal and vertical directions, respectively. The non linear momentum fluxes per unit mass from this perturbation scale as where φ l is the latitudinal flux of momentum, φ v the vertical and an overline denotes a longitudinal average. When φ l is positive, there is a net transport of eastward momentum to the North, with a negative value resulting in a net transport to the South. When φ v is positive, there is a net transport of eastward momentum upward, or downwards when it is negative. Therefore, if φ l is negative in the mid latitudes of the Northern hemisphere and positive in the Southern hemisphere, there is a net meridional convergence of eastward momentum (a similar argument applies in the vertical coordinate for a 3D systems). For the shallow water βplane system used in Showman & Polvani (2011) the vertical momentum flux is accounted for by the addition of a coupling term between the deeper (high pressure), quiescent atmosphere and the dynamically active (lower pressure) atmosphere (R term in Eqs.(9) and (10) of Showman & Polvani 2011).
In Figure 1a, we present the temperature (colour scale, K) and wind vectors (vector arrows) as a function of latitude and longitude for a typical Matsuno-Gill circulation. This 3D linear steady state was obtained using ECLIPS3D (see Debras et al. 2019, for details and benchmarking of ECLIPS3D), a linear solver for waves, instabilities and linear steady states of an atmosphere under prescribed heating and drag. The initial state around which the equations are linearised is an axisymmetric, hydrostatically balanced state at rest. The bottom pressure is set to 220 bars and the temperature profile follows that of Iro et al. (2005), with the polynomial fit in the log of pressure of Heng et al. (2011), assuming an ideal gas equation of state. Due to the very high inner boundary pressure, the choice of the inner boundary condition does not impact our results. The physical parameters relevant for HD 209458b used in this setup, namely the radius R p , the rotation rate Ω, the depth of the atmosphere R top , the surface gravity g p , the inner boundary pressure p max , the specific heat capacity c p and the ideal gas constant R are given in Table 1. Finally, the heating as well as drag and radiative timescales were prescribed as in Komacek & Showman (2016) with the addition of an exponential decay of the heating in the upper, lower pressure, part of the atmosphere. The exponential damping acts to mimic the damping of vertical velocities close to the outer boundary, or 'sponge layer' applied in 3D GCMs (see for example . In turn, this damping layer allows the adoption of a 'no escape' or reflective outer boundary condition, which would otherwise reflect waves back into the numerical domain. The equilibrium temperature towards which the atmosphere relaxes is a sinusoidal function of latitude and longitude, and ∆T eq , the equilibrium day side-night side temperature difference, decreases logarithmically in pressure between 10 −3 bar where ∆T eq = ∆T eq,top (the value at the top of the atmosphere which is set to 10K in this case) to 10 bars where ∆T eq = 0. The drag timescale, τ drag , is constant throughout the atmosphere at 10 5 s and the radiative timescale, τ rad , is a logarithmically increasing function of pressure (see Eq.(7) of Komacek & Showman 2016) between 10 −2 bar where τ rad = τ rad,top = 10 5 s and 10 bar where τ rad = 10 7 s.
As shown in Figure 1a the maximum temperature at the equator is shifted eastward from the substellar point (the substellar point is set at a longitude of 180 degrees) in our results consistent with observations (Knutson et al. 2007;Zellem et al. 2014). The meridional circulation exhibits a Rossby wave-type circulation at mid latitudes, with clockwise or anticlockwise rotation around the pressure maxima, and a Kelvin wave type circulation at the equator, with no meridional velocities. The combination of both of these circulations brings eastward momentum to the equator to the East of the substellar point, and advects westward momentum to the mid latitudes to the West of the substellar point. Globally, it is easily shown that φ l is indeed negative in the Northern hemisphere and positive in the Southern hemisphere: there is a net convergence of eastward momentum at the equator. According to Showman & Polvani (2011), this convergence is associated with divergence of vertical momentum flux, and equilibration occurs when the vertical and meridional terms balance. Tsai et al. (2014) have further extended this understanding by expanding to include the vertical transport more completely. Tsai et al. (2014) projected the heating function onto equivalent height β-plane solutions (see Section 2.1), showing that the vertical behaviour of the waves can be linked to the equilibration of . Temperature (colourscale in K) and horizontal wind (arrows) as a function of longitude (x axis) and latitude (y axis) at the 40 mbar pressure level of the linear steady state (denominated Matsuno-Gill circulation) obtained using ECLIPS3D (Debras et al. 2019) with heating function, drag and radiative timescales following the definitions of Komacek & Showman (2016). Following the notation of Komacek & Showman (2016): (a) ∆T eq,top = 100 K, τ drag,top = 10 5 s and τ rad,top = 10 5 s. The maximum speed at this pressure range is 10 m.s −1 . (b) ∆T eq,top = 100 K, τ drag,top = 10 6 s and τ rad,top = 10 4 s. The maximum speed at this pressure range is 100 m.s −1 . Note that the maximum speed has been multiplied by ten as the drag timescale has been multiplied by ten.
the jet (a synonym here, and throughout this work for equatorial superrotation). More precisely, Tsai et al. (2014) show that in the limit of slow evolution, or strong dissipation, their linear development around a steady flow with constant background zonal velocity reproduces the wave processes occurring in 3D simulations extremely well. Tsai et al. (2014) show that the wave response of the atmosphere is shifted from West to East when the background zonal velocity is increased (their figure 10): this is interpreted as a convergence towards a single equilibrium state, where the non linear acceleration from the linear processes cancel. Although very detailed and physically relevant, the results of Tsai et al. (2014) are, as they state, only applicable to the strong or modest damping scenario, dictated by the fact that the waves must have the time to reach a stationary state before non linearities become significant. Throughout this work we define 2.06 × 10 −5 Depth of the atmosphere, R top (m) 1.1 × 10 7 Surface gravity, g p (ms −2 ) 9.42 Inner boundary pressure, p max (Pascals, Pa) 220 × 10 5 Specific heat capacity (constant pressure), c p (Jkg −1 K −1 ) 14 308.4 Ideal gas constant, R (Jkg −1 K −1 ) 4593 the drag regime relative to the initial acceleration of the jet: in the 'weak' drag regime, non-linear terms become non negligible before a linear steady state (MG) could have been reached, in the 'modest' drag regime the time to reach the MG state is comparable with the time to depart from this linear steady state, and in the 'strong' drag regime we can decouple the linear and the non linear evolution of the planet, as considered by Showman & Polvani (2011). Once an initial jet has been accelerated, the evolution of the atmosphere is much slower than its initial acceleration and the results of Tsai et al. (2014) therefore apply, even in a weak drag regime, explaining the consistency of their work for the evolution of the jet towards an equilibrated state.
Komacek & Showman (2016) compare the steady states from various 3D GCM simulations across a range of τ rad and τ drag values (their figures 4 and 5). The simulations of Komacek & Showman (2016) extend from low forcing, hence a linear steady state, to strong forcing, hence a non linear steady state. Contrary to the conclusions of Showman & Polvani (2011), Komacek & Showman (2016) also show that when the linear steady state resembles that of Figure 1a, the associated non linear steady state is not (or weakly) superrotating. This can be understood by the fact that τ drag is smaller than the characteristic timescale of advection by the superrotating jet over the whole planet, hence the jet is dissipated before it can reach a steady state. In Figure 1b, we present the results from ECLIPS3D obtained when reproducing a particular setup of Komacek & Showman (2016), namely with τ drag = 10 6 s and τ rad,top = 10 4 s. According to the analysis of Komacek & Showman (2016) Figure 1b would lead to a removal of momentum at the equator. Komacek & Showman (2016) acknowledge this: "these phase tilts are the exact opposite of those that are needed to drive superrotation". For the explanation of Showman & Polvani (2011) the stationary wave pattern obtained from the heating is postulated to accelerate superrotation, but Komacek & Showman (2016) present results which oppose this scenario: when the linear steady state accelerates the equator (Figure 1a, with τ drag = 10 5 s) the associated non linear steady state is not superrotating. However, when the linear steady states takes momentum away from the equator (Figure 1b, with τ drag = 10 6 s), the non linear steady state is superrotating. Thus, there is an inherent discrepancy between Showman & Polvani (2011) and Komacek & Showman (2016). In order to understand this discrepancy, we need to go a step beyond the sole consideration of a linear steady state, and study the evolution of the linear solution with time. This is the objective of the sections 2.3 and 4.

Time dependent solutions
Now that we have established the basic mathematical system, and summarised the current picture of superrotation in hot Jupiter atmospheres, we move to expressing the time dependent solution to the forced problem which provides us with the shape of the atmospheric wave response. Our main assumption is that the heating function can be decomposed onto the homogeneous solutions of Eqs.(17) to (19). When τ rad = τ drag , the horizontal part (defined as X n,l (x, y, t = 0) in Eq. (20)) of the eigenvectors still form a complete set of the equatorial β-plane because of the orthogonality and completeness of the Hermite functions, as in Matsuno (1966). However, when τ rad τ drag the eigenvectors are no longer orthogonal, as shown in Appendix A, and a rigorous proof would be needed to show that they still form a complete set of solutions 1 . From a physical perspective, it is expected that the heating function will trigger linear waves which are solutions of the homogeneous equation, and such a decomposition of the heating function onto these waves therefore probably exists, although it is no longer simply given by a scalar product. Finally, it is worth stating that solving ∂X/∂t = aX + Q is straightforward (except that we don't know the eigenvalues and eigenvectors of the homogeneous equation), however, employing a Green's function to solve this equation provides a more physically intuitive result in terms of wave propagation and dissipation.
With the addition of a drag timescale, τ drag , and a radiative timescale, τ rad , Eqs.(7) to (9) can be modified to yield iω iω Indexing again the solutions by n and l as in Matsuno (1966), we define X n,l = (u n,l , v n,l , h n,l ) =X n,l (y)e ikx+(iν n,l −σ n,l )t (20) as an eigenvector of Eqs.(17), (18) and (19), withX n,l (y) the amplitude of the wave, k the horizontal wave number, ν n,l its frequency and σ n,l its damping (or growing/growth) rate (note that ω n,l = ν n,l + iσ n,l ). We also define L as the operator of the same equations, such that LX n,l = 0 for all n and l. The general equation can then be written as LX F = Q, where Q is the forcing which, as it is only present in the third individual equation is A&A proofs: manuscript no. main given by Q = (0, 0, Q), and X F is the forced, time dependent solution. A homogeneous solution X H can be written in its general form as where α n,l are scalars. When τ drag = τ rad , ν n,l are similar to the ω n,l of Matsuno (1966) and σ n,l = τ −1 drag for all (n, l). When τ drag τ rad , the analytical expressions for ν n,l and σ n,l are not known a priori. In order to solve the general equation, we seek the causal Green function X G that represents the solution at time t due to switching on the forcing at time t only. Therefore, for all t and t : where δ(t) is the Dirac distribution and F is the heating function. In the case of simulated hot Jupiter atmospheres, the star is effectively 'switched on' at t = 0 after which the heating is constant with time (in the linear limit). F can then simply be expressed as F(x, y, t ) = Θ(t )Q(x, y), where Θ(t) is the Heavyside function (null when t < 0 and equal to 1 otherwise). The forced solution of Eqs. (17) to (19) is simply the integral over t of the causal Green function, hence the sum of the responses of the atmosphere excited by a Dirac function of the forcing at time t : where the change in the upper limit of integration can be made due to the fact that the Green function is causal, and is therefore zero when t − t is negative . From the definition of the Green function (Eq.(22)), for (t − t ) > 0 we have LX G = 0. Hence X G is a homogeneous solution of Eqs.(17) to (19) when t > t . It is then logical to choose for X G : and it is easily verified that where we have used the fact that the derivative of the Heavyside function is the Dirac distribution, L = ∂/∂t + L h where L h is an operator acting on the horizontal coordinates only and LX n,l = 0. Therefore, in order to solve the forced problem we can project the forcing onto the homogeneous solutions and write for t = t : n,l α n,l X n,l (t = t ) = n,l α n,lXn,l = Q(x, y).
The first equality simply arises from the definition ofX n,l , whereas the second uses Eq.(22). Hence if we know the projection of Q onto theX n,l , the α n,l quantities, the final solution can be obtained. By setting α n,l = q n,l , we recover the results of the previous section (these are termed b m and b m,n,l in Matsuno (1966) and Tsai et al. (2014), respectively where the latter is in 3D). The solution to the forced problem is given by injecting the Green function (Eq.(24)) into Eq.(23): Under this integral form, it is clear that the solution is the continual excitation (the Θ term) of waves with a characteristic frequency ν n,l and time of decay of σ −1 n,l . The amplitude of the excited waves is proportional to their projection onto the forcing function Q, as explained by Gill (1980) and Tsai et al. (2014). Solving this integral yields: In this form we simply recover the results of Matsuno (1966), and notably their Eq.(34) or our Eq. (14), where X MG is the Matsuno-Gill solution, hence the steady solution to the forced problem. The time dependent part of the solution could have been obtained from a simple first order equation solution. However, the Green function formalism allows us to determine that the solution consists of permanently forced waves that are damped with time, and that the shape of the atmosphere is given by the interactions between these waves. Notably, with Eq. (27), we can write the form of which confirms the interpretation of the stationary solution as an infinite interaction of waves. Additionally, it shows that for a given heating function, changing the value of τ drag (but keeping τ rad = τ drag ), hence not altering theX n,l and ν n,l but only σ n,l = τ −1 drag , will change the linear steady solution. This is as the excited waves will not propagate to the same length before being damped. This was first realised by Wu et al. (2001), where they show that the zonal wave decay length is of the order of τ −1 rad τ −1 drag for arbitrary τ rad and τ drag . From Eq.(27), we see that the linear solution is controlled by three main parameters: the shape of the forcing function (q n,l ), the global behaviour of the waves (horizontal shape ofX n,l and ν n,l ) and the dissipation of the waves (σ n,l ). Although Eq.(27) provides the form of the solution to the time dependent problem, the three main parameters we have detailed remain unknown. Therefore, we move to quantifying the sensitivity of the solution to the forcing function, hence the influence of the q n,l in the next section.

Insensitivity of the Matsuno-Gill solution to the differential heating
The interpretation of Showman & Polvani (2011) relies on a simplified treatment of the forcing, the impact of which we must first assess before discussing the impact of the linear evolution of the atmosphere. Firstly, in order to derive analytical results, Showman & Polvani (2011) impose an antisymmetric (sinusoidal) heating where the night side of the planet is cooled as much as the day side is heated. In this case, the linear steady solution gives rise to the chevron shaped pattern they denominate the 'Matsuno-Gill' circulation. However, the actual structure of the heating is not just a sinusoidal function. Moreover, from Mayne et al. (2017) and Amundsen et al. (2016) we know that there are qualitative differences between the steady state of GCM simulations of hot Jupiters calculated using either a Newtonian heating or with a more sophisticated radiative transfer scheme. In that regard, the first intuitive idea to test is whether the MG pattern is robust when the heating function is changed. With the addition of the vertical dimension, Tsai et al. (2014) have shown that the linear solution strongly resembles the MG circulation at low pressures in the atmosphere. It is not clear whether this holds with realistic, three dimensional heating functions. From Eq. (14) or (30), as X n,l (x, y) =X n,l (y) exp(ikx), we know that the projection of the heating function onto different wavenumbers, k, can alter the resulting Matsuno-Gill circulation from that obtained at wavenumber one. To verify this point, using ECLIPS3D we solve for linear steady circulations across a range of prescribed heating rates, adopting the drag and radiative timescales of Komacek & Showman (2016), as used previously for the data presented in Figure 1a. We use three forms of heating, the first two modified from that used for Figure 1a, and a third one, inspired from 3D GCM simulations: -The same day-side forcing as Figure 1a, but no night-side cooling. -The same forcing as Figure 1a but with the night-side cooling enhanced by a factor of 3. -A heating profile matching that obtained from 3D GCM simulations (taken from Amundsen et al. 2016), including net day side heating and night side cooling.
The first two cases allow us to test the robustness of the pattern under extreme situations, with the third mimicking the GCM simulations. ECLIPS3D calculates the linear steady states by inverting the linear matrix obtained with the full 3D equations (see description in Debras et al. 2019). The outer boundary condition is a 'no escape' condition (zero vertical velocity w), and we have applied an exponential decay of the forcing in the high atmosphere (low pressure) to mimic the damping of vertical velocities, or sponge layer, employed in the UM and other GCMs. For the inner boundary, we adopt a solid boundary condition for the vertical velocity (w = 0), and a free slip (no vertical derivatives of the horizontal velocities, u and v) or no slip (no horizontal velocities) condition on u and v give qualitatively similar results because of the very high inner pressure. These assumptions are obviously simplifications of the real physical situation, but we have assessed their impact by changing the range of pressures at the inner and outer boundaries, and find no significant change in the qualitative results. The physical parameters adopted for HD 209458b are the same as in Mayne et al. (2017) and given earlier in Table 1. Figure 2 shows the perturbed temperature (steady temperature minus initial temperature, colorscale) and winds (arrows) as a function of longitude and latitude of the MG solutions for the three different heating profiles described previously, at a height where the initial pressure is 50 mbar. Figure 2 shows that the shape of the linear steady circulation does not qualitatively depend on the forcing: we always recover an eastward shift of the hot spot, associated with a tilt of the eddy patterns leading to a net acceleration of the equator for the non linear order. As long as there is a differential heating between the day and the night side, the linear steady solution of the atmosphere exhibits the "chevron-shaped" pattern of the MG circulation (we have verified that a constant heating across the whole planet or just a cooling on the night side does not lead to solutions of a similar form to the MG solutions). There is however a change in the quantitative values, without affecting the qualitative aspects of the non linear momentum transfer (see Section 5). Therefore, relaxing the approximation of a wavenumber one (e.g., day-night antisymmetric forcing) heating function has no influence on the non linear acceleration around the linear steady state: the projection onto different zonal wavenumbers changes the zonal amplitude of the MG circulation, but not its qualitative shape. We only presented results with a characteristic drag and radiative timescale of 10 5 s, but we have verified that changing these values affects the shape of the solutions in all cases in a similar way.
Globally, even for simulations with a proper treatment of the radiative transfer, as long as the planet is tidally locked we expect the linear steady circulation to be MG-like. In the paradigm proposed by Showman & Polvani (2011) and Tsai et al. (2014), the propagation of planetary waves impose this global MG circulation, exhibiting no superrotation, with the acceleration on to superrotation being due to eddy mean flow interactions around this primordial state. Therefore, the time to reach the linear steady state, which is by no means a non linear steady state, must be small relatively to other dynamical times in the system. In the case where the drag and radiative timescales are equal, this led Tsai et al. (2014) to conclude that their work could not apply to long diffusion timescales. We now seek to assess if similar conclusions can be made in cases where the drag and radiative timescales differ, in order to determine how the atmosphere reacts at first order. To develop this argument, we analytically restrict ourselves to the quasi geostrophic set of equations (2D Cartesian, shallow water β-plane), as detailed in section 2.1.

Decay time of damped waves
The first task is to derive an equation for the decay timescale, or growth rate, for a general damped wave in our framework. If τ drag = τ rad , using the complex frequency ω ∈ C (contrary to Matsuno (1966) where ω ∈ R), we can define iλ = iω+1/τ drag to transform Eqs. (17) to (19) into Eqs. (7) to (9), equations of which we know the eigenvalues from Matsuno (1966). Therefore, λ is real which implies that the real part of ω, the frequency, is unaltered from the original equations of Matsuno (1966), but the imaginary part of ω becomes, One can then express e.g., u n,l as: u n,l =ũ n,l e −t/τ drag e i(ν n,l t+kx) .
This shows that all modes decay over a characteristic timescale, namely, the drag (or radiative) timescale. For the case where τ rad = τ drag the time to converge to the Matsuno-Gill circulation is the decay timescale, as one would naively expect, and all waves have the same exponential decay in time.
If τ drag τ rad , Eq.(10) must be modified to obtain: It is easy to verify that setting τ −1 drag = τ −1 rad = 0 gives back Eq.(10). To go one step further, we define a complex number c such that: and choose for c the only root with positive real and imaginary part. This definition ensures that the real part of c 2 is always positive, which allows for the solutions to decay at infinity, see Appendix A. We then choose as a variable z = cy (the cases c = 0 and c = ∞ are of no physical interest). Using this new variable, ∂ 2 v/∂y 2 = c 2 ∂ 2 v/∂z 2 , Eq.(33) simplifies to Dividing this equation by c 2 (and recalling the definition of z, z = cy), we obtain Defining the multiplier of v in the second term as m, the equation can be expressed as, The major difference with the Matsuno case, Eq. (10) is that now z ∈ C, and so the boundary conditions are altered. As |z| → ∞ when y → ±∞, we have to solve this equation with the following boundary condition: v → 0 when |z| → ∞ .
As in the case where m is real, one could show (see, for example, Abramowitz & Stegun 1965) that the only solutions are the parabolic cylinder functions, Eq. (12): H n (cy)e −c 2 y 2 /2 where n ∈ N, hence the need for (c 2 ) > 0 so that the solutions decay when y → ±∞, and provided that: Defining x = iω + τ −1 drag and γ = τ −1 rad − τ −1 drag , followed by taking the square of Eq.(39) in order to obtain c 4 yields, Eq.(40) has already been obtained by Heng & Workman (2014) (their Eq.(121) with different notations: their F 0 is τ −1 rad in our work, ω R is our ω and ω I is −τ −1 drag ), in order to derive steady state solutions as performed in Wu et al. (2001) and Showman & Polvani (2011). Eq.(40) is a polynomial of order six, but a thorough study of Eq.(39) reported in Appendix B shows that only three different waves propagate, and two for n=0, as in Matsuno (1966) .
The horizontal shape of the solutions to Eqs. (17) to (19) in the general case are given by Eq.(A.7), but we would also require the solutions of Eq.(40) to obtain a fully analytic expression for the waves. Therefore, we have solved Eqs.(17) to (19) numerically over an extensive range of n, k, τ drag and τ rad values (we have verified that our numerical solutions recover the limits τ −1 drag = τ −1 rad = 0 and τ −1 drag = τ −1 rad ). First, as expected no mode can exponentially grow given a background state at rest. The cases of k ∼ 1 and n = 1, 3, 5 and 7 are the most important for hot Jupiters, as the heating function is dominated by wavenumber 1, i.e. a day and night side (non-dimensional value around 0.7, see Section 5). Here the n number represents the order of the Hermite polynomial, hence the number of zero nodes in latitude (note that as c 2 can be complex, the number of zeros in latitude is no more solely defined by the Hermite polynomials as in the c = 1 case). If the heating function is a Gaussian function, as chosen by Matsuno (1966); Showman & Polvani (2011);Tsai et al. (2014), then the projection of Q on to the parabolic cylinder function stops at the third order (this is not necessarily true when τ drag τ rad but we don't expect large amplitude in the n > 3 waves, as the forcing exhibits no zeros in latitude).
Using our numerical solutions we explore the behaviour of gravity (Section 4.1.2), Rossby (Section 4.1.3) and Kelvin waves (Section 4.1.4) before summarising our results (Section 4.1.5). For all cases the frequency of the waves remains within an order of magnitude of the free wave frequency (when τ −1 drag = τ −1 rad = 0). Hence the major changes, between cases, are obtained for the decay rate and horizontal shapes. The shapes of the waves with non zero τ drag and τ rad are shown in Appendix C.Regarding the decay rates, we express them as power laws fitting reasonably well the numerical values in the next section. We have also implemented the 2D-shallow water equations in ECLIPS3D (detailed in Debras et al. 2019) to verify our numerical results discussed in this section. For all cases for matching parameters (characteristic values, τ rad , τ drag ), the agreement between ECLIPS3D and the results discussed here (obtained using the Mathematica software) for both the decay timescales or growth rates and frequency of waves is excellent. Furthermore, inserting the numerical values obtained here in Eq.(A.7) recovers the exact modes as obtained using the shallow water version of ECLIPS3D.

Gravity waves
In this work we define "gravity waves" as the solutions to Eqs. (17) to (19) which tend to the standard definition of a gravity wave in the limit τ drag → τ rad (we have verified that the identification is unchanged for τ rad → τ drag ). As there are only three solutions to these equations, see e.g., Matsuno (1966), the Rossby wave is therefore the last mode.
Our numerical results gave a characteristic time of decay for gravity waves of ∼ τ drag , when τ drag ∼ τ rad , as expected. For cases where the drag is dominant over the radiative forcing, τ drag τ rad , the decay timescale obtained numerically is ∼ τ drag i.e., the drag controls the timescale of the convergence of the atmosphere. Physically, this is expected as drag will prevent the wave from propagating and damp the perturbed velocity efficiently, preventing the temperature and pressure to depart significantly from the forced equilibrium profile.
However, when the radiative forcing is dominant over the drag, τ drag τ rad , we find two cases. Firstly, when τ rad 1 the numerically obtained decay rate for the gravity waves (σ g1 ) is given by For this case, if τ rad → 0, σ −1 g1 (the decay timescale) is given by the drag timescale, as for the case of dominant drag. However, if τ rad is larger the behaviour is more complex and includes a dependence on the order of the Hermite polynomial n, although this still results in the decay timescale being the same order of magnitude as the drag timescale. We interpret it as the fact that, although the temperature and pressure perturbation will be dictated by the forcing, the time to damp the wave is still governed by the time for the velocities to be damped, hence the drag timescale.
Secondly, for the case where τ rad 1 the numerically obtained limit for the decay rate (σ g2 ) is given by In this case, the radiative timescale is long enough to be imposed as the characteristic time of decay even for the velocities, and the decay of the wave is only controlled by this timescale.

Rossby waves
The behaviour of the Rossby wave decay timescale is more complex than that of gravity waves. When |τ −1 drag − τ −1 rad | 0.5, for all individual values of τ −1 drag or τ −1 rad , the absolute value of the imaginary part of c 2 is much larger than that of the real component. This means that the Rossby wave modes oscillate in the y direction several times before being damped, in these conditions. Additionally, for increasing values of |τ −1 drag − τ −1 rad |, the amplitude at the equator becomes negligible, and the mode's peak amplitude moves to higher latitudes, where the equatorial β-plane approximation begins to break down. Therefore, our numerical results show that the simple shallow-water, equatorial β-plane framework adopted in this work is not usefully applicable to the case of Rossby waves where |τ −1 drag − τ −1 rad | 0.5. This is confirmed by the graphical representation of these waves in Appendix C. We will therefore rely on numerical results of section 4.2 for this region of the parameter space. However, for |τ −1 drag − τ −1 rad | 0.5 (which is the case for all τ −1 drag , τ −1 rad < 0.5), the decay rate for Rossby waves (σ R ) we have derived numerically can then be approximated by, Therefore, for long radiative and drag timescales, Rossby waves are equally sensitive to the damping of velocities and temperature. Such a result is expected from the conservation of potential vorticity, which gives rise to Rossby waves, and is defined in the shallow water system as (ξ + f )/h where ξ = ∇ ∧ u is the vorticity and f = 2Ω cos φ the Coriolis parameter. When neither ξ and h are strongly damped, we might therefore expect a combination of the drag and radiative damping to return to equilibrium. Comparing the decay timescale for Rossby waves with that obtained for gravity waves in Section 4.1.2 shows that the decay rates differ between these two cases.

Kelvin waves
Kelvin waves are a particular solution of the homogeneous Eqs. (17) to (19), as first pointed out by Matsuno (1966), where the meridional velocity is zero, and can be characterised analytically. Combining Eqs. (17) and (19) with v = 0 yields, and hence where A is a constant. If the boundary condition u = 0 for y → ±∞ is to be satisfied, the factor of y 2 /2 must have a negative real component. Additionally, in order for u and h not to be identically zero, ω must be a root of a second order polynomial given by that is, where the term under the square root can be negative, and therefore provide an imaginary component. Further algebraic manipulation then yields, In order to satisfy the boundary conditions the term under the square root in this equation must be positive, or the result is a pure imaginary number. In other words, Kelvin waves are able to propagate only when the condition, is met. Additionally, this simple estimation of the regimes where Kelvin waves can be supported in the atmosphere may well be an over estimate for the 3D, spherical coordinate case as the characteristic scale of the damping of the Kelvin wave must be smaller than the scale of the planet's atmosphere itself. The real part of ik/(iω + τ drag −1 ) must therefore be large enough (and negative). Finally, when Kelvin waves propagate their characteristic decay rate (σ K ) is given by, This result is similar to the decay timescale for Rossby waves (compare Eqs. (43) and (50)). τ drag and τ rad therefore have a symmetric contribution for Kelvin waves, as expected when considering Eqs. (17) and (19) for v = 0: they have a symmetric effect on u and h.

Decay Timescale Summary
We have now obtained expressions for the asymptotic values of the decay timescales for damped waves under the 2D shallowwater, β-plane framework (see Section 2.1). In particular, for the case of Kelvin waves we obtained an analytical expression for the decay rate, Eq.(50). We have also shown that for the regime where the analytical calculations are valid, Rossby waves exhibit the same decay rate as found for Kelvin waves. For the more general case, aside from considerations of whether the waves can be supported by the atmosphere we have two limits: 1. For τ drag ∼ τ rad and τ drag τ rad , simply, σ R ∼ σ K ∼ σ g ∼ τ −1 drag within a factor of ∼ 2. 2. For τ drag τ rad , the Kelvin decay rate is the invert of the radiative timescale. The Rossby decay rate is the invert of the drag timescale if |τ −1 drag − τ −1 rad | 0.5 and we obtained no semi-analytical solution if |τ −1 drag − τ −1 rad | 0.5. Finally, the gravity waves decay rate becomes: -For τ rad 1: σ g ∼ τ −1 drag + τ 1/(n+2) rad -For τ rad 1: σ g ∼ τ −1 rad /3 Some of the numerical values we used to derive these asymptotic evaluations are reported in plain lines on Figure 3. As our results have been derived for the 2D shallow-water, β-plane system, and for the case of Rossby waves in particular, the behaviour of the waves may not be captured correctly. Therefore, we next move to verifying and extending our approach into 3D using ECLIPS3D.

Extension to 3D with ECLIPS3D
So far we have determined the characteristic frequencies and decay timescales for various atmospheric waves in our 2D framework, introduced in Section 2.1. In this section we extend our analysis to full 3D spherical coordinates using ECLIPS3D. In 3D, spherical coordinates, the dependency of the waves on the stratification and the value of the drag and radiative timescales is difficult to predict theoretically. Therefore, we approach the problem numerically using ECLIPS3D, studying the modes which propagate in a stratified atmosphere at rest, as is used for the initial condition when simulating hot Jupiter atmospheres. The background temperature-pressure profile is set to that of Iro et al. (2005), employing the polynomial fit of Heng et al. (2011). The pressure at the bottom of the atmosphere is set to 10 bars (the depth of the atmosphere is now 7 × 10 6 m), capturing the dynamically active region of the atmosphere, driven by the forcing in the first phase of simulation but without detailing the innermost regions where the density is much higher. As before, we have varied the inner boundary condition to test its impact on our results, and find our conclusions to be robust to this choice. The selection of the modes of interest is performed by first excluding modes with unrealistic amplitudes at the pole or boundary, and then selecting modes with only one oscillation in longitude i.e. wavenumber 1, matching the heating function. Additionally, we restrict to modes with at most two nodes in the latitude direction which are the dominant modes (see discussion previously in this section). This selection process is also described in Debras et al. (2019).
For the first study we have verified that all modes supported when τ drag = τ rad = 10 6 s is adopted throughout the atmosphere are similar in form to when τ drag = τ rad = 0 s but exhibit a characteristic decay frequency of 10 −6 s −1 . Although Rossby and gravity waves are supported with characteristic heights of order the height of the atmosphere itself, Kelvin waves seem only to be supported at the pressure scale height, or smaller.
We have subsequently applied ECLIPS3D with τ rad prescribed as a function of pressure following Iro et al. (2005), and τ drag set as a constant between 10 5 s and 10 6 s. For this setup, we recover the usual (see e.g., Thuburn et al. 2002) Rossby and gravity modes over different vertical wavelengths. Therefore, although our 2D analysis breaks for Rossby modes with large τ drag (see Section 4.1.3), we recover them in the full 3D spherical coordinate treatment using ECLIPS3D, with their decay rate always comparable to the reciprocal of the drag timescale. However, in this setup, with a pressure dependent radiative timescale, we do not find Kelvin modes with atmospheric-scale characteristic heights (we use 40 points in the z coordinates, therefore we are unable to resolve modes with H 10 5 m). However, we obtain Kelvin modes with smaller characteristic heights in the deepest, highest pressure, region of the atmosphere where the radiative timescale is long, in agreement with our previous estimations (see Section 4.1.4, Eq. (49)). For this setup we also obtain mixed Kelvin-gravity modes, concentrated at the equator, as well as mixed Rossby-gravity modes, with the Rossby component dominating in the high latitudes and gravity component component near the equator. In the case of the pure gravity modes the resulting frequencies and decay rates, from ECLIPS3D in 3D spherical coordinates, are in good agreement with the 2D estimations (Section 4.1.2). However, for Kelvin modes although the decay rates obtained from ECLIPS3D are in good agreement with our 2D analytical expressions (Section 4.1.4), the obtained frequencies are a little larger than our analytical analysis would suggest. Finally, for pure Rossby modes, for the range where our 2D analysis is valid, the decay timescales are again in good agreement between our 2D estimations (Section 4.1.3) and the numerical results in 3D, but similarly to the Kelvin modes the frequencies are slightly underestimated.
From our ECLIPS3D outputs we have calculated the value of √ σ 2 + ν 2 for all modes supported in the simulated atmosphere (keeping τ rad prescribed following Iro et al. (2005) and τ drag = 10 6 s), as the amplitude of a given mode in the linear steady state of the atmosphere is inversely proportional to √ σ 2 + ν 2 (Eq. (28)). These results show that the value of √ σ 2 + ν 2 is an order of magnitude smaller for Rossby modes, compared with Kelvin or gravity modes. As discussed in Section 4.1.3 the frequencies of the modes are not significantly altered by the drag timescale, and Matsuno (1966) has shown that Rossby waves have frequencies ∼ 10 times smaller than gravity waves and ∼ 3 times smaller than Kelvin waves in this regime. Additionally, when τ drag is large but τ rad is modest, the decay rate of Kelvin waves is controlled by the radiative timescale, whereas the decay rate of the gravity waves and Rossby waves is conversely controlled by the drag timescale. Therefore, the value of √ σ 2 + ν 2 will be bigger for Kelvin waves, over that obtained for Rossby and gravity waves, for both of which this quantity will be of order the frequency, which is ten times smaller for Rossby waves compared to gravity waves. This demonstrates that the Rossby waves will propagate over greater timescales and lengths, and dissipate globally more energy in the linear steady state (see Eq. (27)). The influence of the Rossby waves in the steady linear circulation regime will be dominant over Kelvin and gravity modes, explaining the qualitative shape of Figure 1b.
As a summary, Figure 3, shows the decay timescales for gravity, Rossby and Kelvin waves, obtained from Eqs.(40), (50) as a function of the drag timescale for τ rad = 0.3 (in dimensional units, a few 10 4 s), a characteristic value in the superrotating regions of HD 209458b, or as a function of the radiative timescale for τ drag = 20 (about 10 6 s), a value which allows for superrotation in the non linear limit. We also plot values extracted from numerical exploration with ECLIPS3D, although the radiative timescale is not constant in these numerical results and the characteristic height might differ. We recover the fact that Kelvin waves are more damped than other waves for the timescales used in this section, thought to be representative of hot Jupiter atmospheres, but not for all timescales (notably when τ drag = τ rad ). Additionally, there are many regions of the parameter space where Kelvin waves don't actually propagate, as evidenced in section 4.1.4. This highlights the need to constrain the timescales to understand the spin up of superrotation, and to know the wave behaviour across different timescales. Figure 4 shows the pressure perturbations (colour scale, total pressure minus initial pressure) and horizontal winds (vectors) as a function of longitude and latitude, for four Rossby modes. Two of the modes in Figure 4 are from ECLIPS3D, 3D spherical coordinate calcuations, and two from the analytical studies (i.e. Article number, page 11 of 28 A&A proofs: manuscript no. main  (50) while crosses are results from seven ECLIPS3D calculations. As τ rad is not constant with depth in the ECLIPS3D results, we have chosen to use an arbitrary value of τ rad = 0.3 for comparison. However when τ drag increases the location where the wave exhibits its maximum perturbation moves to higher pressures which should correspond to an increase in the equivalent τ rad . For the Rossby waves, the low τ drag limit has not been studied numerically as it is irrelevant for superrotation. Kelvin waves of comparable height with Rossby and gravity waves are only clearly identified in four ECLIPS3D calculations. derived using equations in Appendix A; with Matsuno (1966) notations, they both have n = 1 and l = 3), shown as the left and right columns, respectively. The locations in longitude are arbitrary as the initial state is axisymmetric and at rest. The Rossby modes shown in Figure 4 from ECLIPS3D have been chosen such that the maximum amplitude was present in the deeper, high pressure, regions where drag is dominant (top panel), in one case, and for the other case the amplitude was maximum in the upper, low pressure, part of the atmosphere, where the radiation timescale is shorter than that of the drag (bottom panel). Figure  4 shows that the ECLIPS3D results and those from our 2D analytical treatments are in good agreement. Specifically, the 'tilt' of the modes in the latitude-longitude plane, and the location of the maximum perturbation in pressure are broadly consistent between the analytical 2D and numerical 3D results. This agreement is comforting given that one case is a simplification of an atmosphere on a 2D shallow water β-plane and the other one a restriction onto one height of a fully 3D, spherical mode. There are however discrepancies, notably at mid and high latitudes.
Interestingly, with ECLIPS3D we also recover Rossby waves with the opposite tilt in latitude than the results of the shallow water equations (right panel of Figure 4), as well as Rossby waves with no tilt (their horizontal shape being similar to the Rossby waves from Matsuno (1966)). All of these waves have comparable frequencies and decay rates, no matter the tilt. We attribute the existence of these waves to the density stratification and the dependence of the radiative timescale with height, not considered in the shallow water equations. Therefore, it is no easy task to predict what will be the shape of the linear steady state as it depends on the projection of the heating function on all these waves with different tilts, but also different characteristic heights. The fact that the waves are not orthogonal anymore (Appendix A) further prevents an easy evaluation of the projection of the heating function on each wave. However, it is primordial to note that Rossby waves always exhibit a much lower amplitude in the pressure/temperature perturbation at the equator rather than in mid latitudes. This provides additional confidence in the assertion that Figure 1b is really dominated by the Rossby wave component.
It is, therefore, difficult to conclude on the behaviour of the linear solutions solely from the results of the ECLIPS3D calculations in 3D, as we also require the projection of the heating function onto the waves. However, clearly we recover tilted Rossby waves, and an absence of Kelvin waves in the upper part of the atmosphere where superrotation develops (or they must have a very small characteristic height). This is in contradiction with Showman & Polvani (2011), where they link the acceleration of superrotation to the interaction between a standing equatorial Kelvin wave and mid latitude Rossby wave, but in agreement with our analytical estimations for the domain of existence of Kelvin waves. As already stated in this paper, this does not refute the theories of Showman & Polvani (2011) and Tsai et al. (2014) regarding the equilibration of superrotation, but shows that the spin-up of an initial jet is not due to a linear, chevron-shaped steady state and time dependent linear considerations must be taken into account. Globally, our 2D semi-analytical arguments seem to be a good approximation of the 3D linear evolution of the atmosphere of hot Jupiters, and we devote the next section to an application of these estimations to provide a physical understanding to the acceleration of superrotation.

Transition to superrotation
In Section 3, we have shown that the linear steady state of our atmosphere was not significantly altered when moving to a more realistic heating profile taken from a 3D GCM simulation. Therefore, to gain further insight into the acceleration of the superrotation we turn to the time-dependent, linear effects. This led us to develop expressions for the time-dependent linear solution to the problem in Section 4. In this current section we use these solutions to understand the transition to superrotation in simulated hot Jupiter atmospheres. We first assess the validity of our own results by comparing them with the simulations of Komacek & Showman (2016) in Section 5.1. This is followed, in Section 5.2, by an order of magnitude analysis, which allows us to conclude that we can not explain the acceleration to superrotation under the simplifications employed for the linear steady state (although such simplifications are well suited for studying the equilibration of the superrotation, see Tsai et al. 2014, for example). Finally, we test our interpretation against the results of 3D GCM simulations in Section 5.3, revealing that vertical accelerations are vital to the process.

Qualitative structure of linear steady state
Before discussing the transition to superrotation in the non linear limit, we first apply our understanding from Section 4 to interpret the form of the various linear steady states presented in Komacek & Showman (2016), their Figure 5. As shown by Wu et al. (2001) the zonal damping rate is proportional to τ −1 drag τ −1 rad . Therefore, if the two timescales (drag and radiative) are small, or one of them is vanishingly small, the zonal propagation of waves will be extremely limited in longitude. This is clearly seen in Komacek & Showman (2016), as when one or two of the timescales are short the temperature gradient is huge between the day and night side, and the zonal flows restricted largely to the day side. Essentially, in this case the waves excited on the day side, have been damped before reaching the night side and therefore can not lead to efficient wind generation and heat redistribution (in the linear limit). This is also discussed in Komacek & Showman (2016).
Let us suppose that one of the timescales (i.e. radiative or drag) is much shorter than the other one, hence is the dominant timescale, which we denote simply as τ. In order for Kelvin waves to propagate we require, from the dimensional form of Eq.(49) : Therefore, if the dominant (shorter) timescale is inferior to ∼ 2.5 × 10 4 s, Kelvin waves cannot propagate unless both timescales are equal. However, if both these timescales are 2.5 × 10 4 s, the dissipation time for Kelvin waves will be very short. Assuming a simple characteristic speed of waves to be √ gH, which in our case is roughly 2 × 10 3 m.s −1 , the time for a wave to travel around the whole planet is ∼ 2 × 10 5 s. Therefore, for drag or radiative timescales of 2.5 × 10 4 s, even in the cases where Kelvin waves exist they can not propagate around the whole planet and, thereby, generate the stationary chevron shaped MG pattern of Showman & Polvani (2011). The chevron shaped pattern of the linear steady state can therefore only exist when both timescales are 10 5 s and comparable in value. This is seen in Figure 5 of Komacek & Showman (2016), where this pattern is clearly not ubiquitous, and this restricts strongly the cases where the explanation of Showman & Polvani (2011) is applicable. In other words, acceleration of superrotation in hot Jupiters from the MG or chevron shaped, tilted linear steady state is only possible over a restricted parameter space, which is therefore not likely to provide an explanation for all exoplanet cases.
Additionally, Eq.(51) shows that for a given τ but for a varying H, there is a minimum height for the propagation of Kelvin waves. As shown by Wu et al. (2000) and later by Tsai et al. (2014), the three dimensional structure of the propagating waves can be decomposed onto waves in a shallow water system but with differing equivalent depths. The modes are solutions to the homogeneous equations but with a characteristic height which varies between modes. Tsai et al. (2014) also show, in their Figure 2 that the projection of the heating function of the vertical modes has a high amplitude for modes with equivalent height between 5H P and 0.2H P , where H P is the pressure scale height, roughly of the order of 4 × 10 5 m in hot Jupiter atmospheres. Therefore, adopting these limits we have H = 5H P ∼ 2 × 10 6 m and H = 0.2H P ∼ 8 × 10 4 m, and obtain 10 4 s < τ < 6 × 10 4 s .
Therefore, in our case, this implies that if τ < 10 4 s, Kelvin waves are unable to propagate with characteristic height less than 5H P : the linear steady state will have an almost null projection onto Kelvin waves. However, if τ > 6 × 10 4 s, the majority of the Kelvin waves excited by the forcing can propagate (we recall that if both timescale are equal all the wave can propagate, as in the neutral case, but we expect the radiative timescale to be at least an order of magnitude smaller in superrotating regions). Additionally, as introduced in Section 4, our estimates for the regimes where Kelvin waves are supported by the atmosphere is likely to be wider than the real situation, meaning that the criteria for Kelvin wave propagation are also likely to be stricter. The behaviour outlined in this section is readily apparent in Figure 5 of Komacek & Showman (2016). In the limit where the drag is strong, the waves are damped efficiently, the thermal structure strongly resembles the thermal forcing, and there is no planetary Kelvin wave structure evident at the equator. However, in the case of weak drag (τ drag > 10 5 s) the Kelvin wave component is visible in the temperature structure only when the radiative timescale is comparable (i.e. > a few 10 4 s, in other cases the temperature is almost uniform at the equator). Finally, in the limit of short radiative timescale the Kelvin waves do not propagate, the dynamical shape of the atmosphere is dominated by other components and the linear steady state strongly resembles Figure 1b. Interestingly, it appears that the cases that superrotate in the non linear limit all have a negligible Kelvin wave contribution in their linear steady state, in other words, either the equator is not dominated by Kelvin-type circulation or the high latitudes are dominant.
To conclude this section we detail further the case τ drag = 10 5 s and τ rad,top = 10 4 s. Following Eq.(9) of Komacek & Showman (2016), this choice for τ rad,top implies a radiative timescale of 8 × 10 4 s at P = 80mbar (the isobaric surface presented in their Figure 5) and the drag timescale 10 5 s. Using (Eq.(49)), Kelvin waves are able to propagate in this scenario. Additionally, the difference between the drag and radiative timescales enables us to properly consider the behaviour of the Rossby wave component. Solving Eq.(40) for these prescribed radiative and drag timescales yields a decay timescale of ∼ 0.38 (in nondimensional units) for both Rossby and Kelvin waves, while the decay timescale of the gravity waves is ∼ 0.36. Therefore, when τ drag = 10 5 s and τ rad,top = 10 4 s we have Kelvin waves propagation, with a decay timescale long enough for the waves to traverse the entire planet and comparable lifetimes for all three wave types considered (Kelvin, Rossby and gravity). In this instance we expect the steady state of the atmosphere to be a combination of planetary waves all with roughly the same magnitude (depending on the projection of the heating function), which leads to the the chevron shaped pattern predicted by Showman & Polvani (2011) in the Matsuno-Gill circulation. This is confirmed by Figure 5 of Komacek & Showman (2016), where, in the limit discussed, the linear steady state is similar to that shown in our Figure 1a.
These comparisons of our estimations with results from this work and previous studies show that our semi-analytical analysis is actually rather powerful in understanding the resulting linear steady state response of a hot Jupiter like atmosphere. The natural next step is to explore the implications for numerical simulations of a hot Jupiter atmosphere from the initial condition to the final superrotating state.

Order of magnitude analysis
In this section, we estimate the maximum forcing under which the consideration of a linear steady state is relevant, then we estimate the time dependent wave response in the weak drag regime.
As we are performing a linear study, for constant τ rad and τ drag the value of the maximum velocity is proportional to the amplitude of the forcing, which is well represented by the dayside-nightside equilibrium temperature difference we apply at the top of the atmosphere, ∆T eq,top . Assuming that the radiative timescale as a function of pressure within the atmosphere is appropriately represented by the polynomial fit of Heng et al. (2011), adapted from Iro et al. (2005), the amplitude of the MG-circulation will then only depend on ∆T eq,top and the drag timescale.
Denoting the zonal velocity of the linear steady state as u MG , the linear steady state can only be reached if the non-linear terms can be neglected when u = u MG , i.e., once the linear steady state is formed. The non-linear terms scale with the zonal advection, u MG ∂u MG /∂x ∼ u 2 MG /L, where L is a characteristic length. Whereas, the linear terms are of the order of u MG /τ drag . Therefore, equating these two estimates provides a maximum zonal speed for which the non-linear terms may be accurately neglected, u max , where for any ∆T eq,top at a constant τ drag . Combining Eqs. (53) and (54) then yields a maximum forcing temperature difference value for which the linear steady state can be reached: For equilibrium temperature contrast at the top of the atmosphere greater than the value in Eq.(55) non-linear effects can no longer be neglected during the acceleration to the linear steady state, which would not be reached by a GCM. This has already been noted in Section 3.3.2 of Tsai et al. (2014), where they acknowledge that their analysis is strictly valid only in the strong or moderate damping scenario. As we see in Eq.(55), if τ drag is too large, i.e. a low damping scenario, the maximum forcing will be very small, and the linear approximation becomes invalid for forcing amplitudes relevant to hot Jupiters. This analysis allows us to more rigorously define the weak, modest and strong damping regimes we had previously mentioned in Section 2.
In the strong or moderate damping scenario, ∆T eq,top ∆T max , the atmosphere first reaches the linear steady solution and then the subsequent evolution is controlled by non linear acceleration. In this regime, the non linear evolution from the Matsuno-Gill linear steady state is given by ∂u MG /∂t ∼ u 2 MG /L, where the u 2 MG /L term comes from advection, then the characteristic time τ evol for the atmosphere to significantly depart from the MG state is This limit is that studied in Tsai et al. (2014), where the waves change the mean flow in a quasi-static way leading to the emergence and equilibration of superrotation.
In the low damping scenario however, the atmosphere never reaches the MG steady state, and non linear considerations must be taken into account when the characteristic speed exceeds u max (hence Eq.(29) is irrelevant and only Eq. (27)-(28) can be used to understand the transition to superrotation). Let us suppose for example, that the atmosphere is composed of two waves: a slowly oscillating, slowly decaying Rossby wave, hence ν 1 , σ 1 1 and a quickly oscillating, strongly damped Kelvin or gravity wave with ν 2 , σ 2 1. Our analysis of Section 4 shows that when τ rad ∼ 10 4 s and τ drag ∼ 10 6 s Rossby waves indeed have small frequency and decay rate whereas Kelvin and gravity waves have order of magnitude higher frequencies and decay rates (provided they can propagate). In this simplified case, Eq.(28) simplifies to where X F is the time dependent solution vector. We know that in the linear steady state, assuming q 1 ∼ q 2 , the Rossby wave component will hold much more power than the Kelvin or gravity wave as |iν 1 − σ 1 | |iν 2 − σ 2 |. However, if we select a time t such that |iν 2 − σ 2 |t 1 and (necessarily) |iν 1 − σ 1 |t 1, Eq.(57) can be linearised to first order, yielding Therefore, in the limit of very early times in the evolution the two wave components in the time dependent solution of the atmosphere are comparable (provided q 1 ∼ q 2 ). The wave components remain comparable even when |iν 2 − σ 2 |t ∼ 1 but |iν 1 − σ 1 |t 1, where the exponential term for the Kelvin or gravity wave is almost zero, but the linearisation holds for the Rossby wave, hence, Dividing the amplitude of our first, Rossby wave, in this linear time dependent state, α 1 , by the amplitude of wave 2, α 2 yields, Provided that the projection of the heating function on the two wave components are similar i.e. q 1 ∼ q 2 , the time dependent solution will exhibit comparable amplitudes for both waves before the (slowest) Rossby wave has grown much larger than the asymptotic amplitude of the Kelvin or gravity wave. However, the steady state will be dominated by the Rossby wave component. Therefore, although the linear steady state strongly depends on τ rad and τ drag , in the limits of short timescales the structure of the atmosphere only depends on the projection of the heating function on the propagating waves. On Figure 5, we show the modulus of 1 − e (iν−σ)(t) /(σ − iν) for a Rossby , Kelvin and gravity waves obtained by ECLIPS3D (see Figure 3) when τ drag = 2 × 10 5 s and for a Rossby and gravity waves when τ drag = 10 6 s. In both cases, after a few 10 4 s (a day being ∼ 9 × 10 4 s), the amplitude of the different waves is of the same order of magnitude whereas Rossby wave contribution dominates at later times. This analysis suggests that the linear steady state is not responsible for accelerating superrotation in hot Jupiter atmospheres for the low drag limit, and allows us to resolve the problem explained in Section 2.2. As discussed in Section 2.2, superrotating atmospheres were found by Komacek & Showman (2016) despite structures implying negative convergence of momentum at the equator in the linear limit (a 'reverse' MG structure), in contradiction with the mechanism of Showman & Polvani (2011) which invokes the linear MG steady state solution.
This problem persists for physically plausible choices on the equilibrium temperature contrast and the drag and radiative timescale, typically, ∆T eq,top = 500 K, τ drag ∼ 5 = 10 5 s and τ rad = 5 × 10 4 s which yields u max ∼ 200m.s −1 . When solved numerically, we obtain u MG,1 ∼ 5m.s −1 hence ∆T max ∼ 150 K: the linear steady state cannot be reached. Specifically, after one day of simulation, our analysis shows that the linear steady state will not have been reached but numerically we already have u > u max , hence non linear effects can not be neglected anymore. Regarding the dissipation timescales for the waves (Section 4), after one day we have σ K t, ν K t ∼ 1,σ g t, ν g t ∼ 1 and σ R t, ν R t 1, where ν K , ν g , ν R are the Kelvin, gravity and Rossby waves frequencies, respectively, and σ K , σ g σ R the Kelvin, gravity and Rossby dissipation rate, respectively. Our estimates of this section, Eq.(60), show that this leads to an equivalent contribution of Rossby, gravity and Kelvin waves in the circulation.
In summary, after 1 day of simulation: the structure of the atmosphere exhibits similar contribution in Rossby, gravity and Kelvin waves, which is characteristic of the chevron-shaped pattern of Figure 1a. As shown by Showman & Polvani (2011), the eddies from such a circulation favour the meridional convergence of eastward momentum at the equator. -Additionally, simple orders of magnitude show that the non linear terms (hence the contribution from the eddies) can no longer be neglected as they are of the same order of magnitude of the linear terms.
Therefore, when the non linear terms become dominant, they lead to a net acceleration of the equator. Even though the eddy acceleration from the linear steady state (our Figure 1b) would tend to decelerate the equator, the equator is accelerated non linearly after one day of simulation because of this chevron shaped, non steady linear circulation. This initiates superrotation, and the later, slower evolution is well described by Tsai et al. (2014). We study this further in the next section using our own GCM (the Unified Model, UM, presented in Mayne et al. 2017). We must underline that this separation of scales between linear and non linear behaviour is obviously very simplified. As already noted by Showman & Polvani (2011), the non linear accelerations are mostly due to the wave-mean flow interactions (rather than wave-wave or mean flow-mean flow, as we confirm in the next section). Therefore, a quasi linear analysis or statistical studies of momentum transfer might allow more rigorous insight into the jet acceleration (see e.g., Srinivasan & Young (2012)

3D GCM simulations
In order to assess the applicability of the linear shallow water result developed in this work to a full 3D calculation, we have performed simulations using the UM across a range of forcing scenarios. The simulations employ Newtonian heating as discussed in , and adopt the baseline hot Jupiter setup presented in  which follows that of Heng et al. (2011) and for the radiative timescale, Iro et al. (2005). For our simulations we have then varied the day to night temperature contrast, ∆T from 1 to 100K (see Eqs.(B2) and (B3) of Heng et al. 2011). Obviously, this is only a toy model as we do not expect to find a tidally locked planet with an effective temperature of 1300K and a day night contrast of 0.1K, but it allows us to study the physical mechanisms controlling the atmospheric structure. Atmospheric drag has not been explicitly implemented in the UM, but is provided by a diffusion scheme as detailed in . We have verified that all of our simulations conserve mass and angular momentum to an order of 10 −6 . We use these simulations to first explore the resulting, qualitative structure of the atmosphere and then the accelerations within it, in Sections 5.3.1 and 5.3.2, respectively.

Qualitative structure of the atmosphere
As long as the linear effects are dominant, we expect all our simulations to have qualitatively similar features but with quantitative values that scale with the forcing. After one day of simulation this is indeed the case, where all of our simulations have the same qualitative structure matching Figure 1a, although the magnitude of the temperature differences and wind velocities vary between simulations (increasing with larger temperature contrast). The structure matches the 'chevron' shaped pattern of Showman & Polvani (2011), but we again state that this is not the steady Matsuno-Gill solution. It is a specific time in the evolution of the atmosphere when all waves have the same projection in the circulation, as discussed in the previous section.
Comparing the very low temperature contrast case with linear steady states from ECLIPS3D, the dissipation within our simulations is equivalent to τ drag ∼ a few 10 5 s. At P ∼ 80 mbar the radiative timescale of Heng et al. (2011), adapted from Iro et al. (2005), is of the order of 2.5 × 10 4 s. For these parameters the linear MG state can only be considered as reached after ∼ 10 days, the time for the gravity and Rossby waves (which are the most long lived components) to be completely dissipated (see Section 4.1.5). Figure 6 shows the temperature and wind structure for three calculations, the first one using ECLIPS3D with parameters set to those matching the GCM simulations ( Figure  6a), and the next two from GCM simulations after 10 days of simulations, at 80 mbar, for a small and large temperature contrast at the top of the atmosphere ∆T eq,top = 1 K ( Figure 6b) and ∆T eq,top = 100 K, (Figure 6c).
The ECLIPS3D calculation, hence the linear steady state, Figure 6a, recovers the dominant mid-latitude Rossby gyres, associated with equatorial winds but it clearly differs from Figure  1a in the sense that the equatorial circulation is not impacting the Rossby gyres significantly. We therefore have an intermediate between Figures 1a and 1b. For completeness, we also show the evolution of the atmosphere for the same simulations but with τ drag = 10 6 s in Appendix D, which leads to a linear steady circulation equivalent to Figure 1b. For the GCM simulations adopting ∆T eq,top = 1K, Figure 6b, the circulation of Figure 6a is recovered after 10 days of simulation. For the GCM simulations adopting ∆T eq,top = 100K, Figure 6c, respectively, the longitudinal extent of the westward wind is reduced after 10 days compared to the weak forcing regime. Hence, for the simulation with the larger temperature contrast, after 10 days, the atmosphere has already diverged from the linear evolution of the atmosphere. Although both simulations were qualitatively identical after 1 day of simulation, the low forcing simulations then reaches the linear steady state whereas higher forcing simulations never go through this state. Using the steady state wind velocities for the smaller temperature contrast simulation, ∼ 2ms −1 and Eq.(56), alongside the fact that we expect the linear steady wind to scale with the forcing, we can estimate the timescale to depart from the MG state in the stronger forcing simulation: ∼ 10 6 s, which is about 10 days. This timescale matches the timescale estimated above, from the analysis of the atmospheric waves, for convergence to the MG state and hence the MG state is never actually reached.
In our simulations, the case of not reaching the linear steady state occurs for day-night temperature contrasts of ∼ 100K or greater, at drag timescales of τ drag ∼ a few 10 5 s (as presented in Figure 6). If the drag timescale is further increased, the temperature contrast for which the linear steady state could not be reached would decrease further 2 , and HD 209458b is rather expected to experience a ∼ 500 K temperature contrast. Hence it does not seem possible to reconciliate the initial acceleration of superrotation with the consideration of steady linear effect. The linear considerations can only be used in the first day or so of simulation, and the linear steady state is irrelevant.
Therefore if superrotation does exist in hot Jupiter atmospheres, the steady linear considerations are not likely sufficient to explain the initial acceleration as non linear effects quickly dominate. Notably, the study of both Tsai et al. (2014) and Showman & Polvani (2011) (and more recently Hammond & Pierrehumbert 2018) would only apply in the limit of slow evolution, hence once the atmosphere is already close to a non linear steady state. This explains why Figure 16 of Tsai et al. (2014) which represents their linear consideration compares so well with their Figure 15, taken from 3D numerical simulations: when an initial superrotation is already settled, the further evolution is slow and can be understood in the linear limit. However, Tsai et al. (2014) provide no comparison between the linear expectation and the 3D simulations during the original acceleration phase of superrotation.
As we have seen in this section, linear considerations apply as long as we consider the time dependent solution. Although we cannot use our linear prescriptions to predict the evolution of the atmosphere in the non linear phase (as stressed in section 5.2), we can estimate the duration of validity of the linear approximation and show that the atmosphere does not go through a linear steady state during the acceleration phase. Therefore, it is worth noting that the westward shift of the hot spot in the steady linear limit studied by Hindle et al. (2019), with the addition of a magnetic field, is not a robust enough diagnostic to predict whether the atmosphere is superrotating.
Interestingly, our low forcing simulation also converged to a superrotating state after a much longer time (scaling as one would expect as the inverse of the forcing). This was unexpected as Figure 6b is not associated with a strong deposition of eastward momentum at the equator. More surprisingly, we also recover a superrotating jet for low forcing simulations when we further increase the drag timescale, and the atmosphere goes through a linear steady state resembling Figure 1b. To understand this phenomenon, we conclude our study with the considerations of 3D accelerations in the spin up and equilibration of superrotation for GCM results. . Temperature (color) and winds (arrows) as a function of longitude and latitude for (a) linear steady state calculated with ECLIPS3D with ∆T eq,top = 100 K, τ drag = 2 × 10 5 s and τ rad following Iro et al. (2005). Maximum speed is 400 m.s −1 (b): 3D GCM result at the 80 mbar level after 10 days of simulation, with ∆T eq,top = 1 K and the radiative timescale of Iro et al. (2005). The maximum speed is 2.5 m.s −1 ; (c): 3D GCM result at the 80 mbar level after 10 days of simulation, with ∆T eq,top = 100 K and the radiative timescale of Iro et al. (2005). The maximum speed is 350 m.s −1 Article number, page 17 of 28 A&A proofs: manuscript no. main

Accelerations
The final step is to study the acceleration of the mean flow in the initial stages of the acceleration of superrotation within the 3D GCM simulations and assess the relevance of the 2D studies. As discussed, once the jet is settled and evolving slowly, the studies of Tsai et al. (2014) and Hammond & Pierrehumbert (2018) describe the evolution of superrotation, but the initial acceleration is less clear as we have explained through this paper.
For this purpose, we study the acceleration of the jet in a simulation with ∆T eq,top = 100K and τ drag ∼ 10 5 s, where the first phases of the development of superrotation can be captured over about 30 days. Following Mayne et al. (2017) (who adapted the treatment of Hardiman et al. 2010), the acceleration of the zonal mean flow can be written as where G λ denotes the body forces acting in the longitudinal direction (not considered here), the subscripts denote partial derivatives, and every quantity X is defined as X =X + X where a bar denotes an average on longitude. In this section, we do not consider the mean flow-mean flow accelerations as they are negligible during the initial acceleration within our simulations. However, once the superrotating jet has formed, these accelerations should be taken into account as they balance the eddy accelerations and eventually lead to a non linear steady state (see notably the conclusions of Tsai et al. 2014). Following Showman & Polvani (2011), the meridional eddy accelerations, involving v and u , should lead to momentum convergence at the equator from the MG steady state whereas the vertical component acts to decelerate the equatorial region. In Figure 7 we show the value of (ρ u) for ∆T eq,top = 100K after 50 days of simulations as well as the vertical and meridional accelerations. After 50 days, the jet extends from roughly 1mbar to 1bar with the maximum of (ρ u) around 0.2bar. As in Figure  15 of Tsai et al. (2014), we observe that the vertical accelerations are slowing down the upper part of the jet, while extending the jet to deeper pressures. The meridional accelerations on the other hand compensate the vertical component in agreement with both Tsai et al. (2014) and Showman & Polvani (2011). It is interesting to note that the explanation for radius inflation of Tremblin et al. (2017) relies on the vertical wind in the deep atmosphere due to the equatorial jet, and that the spin up of the jet shows that the vertical accelerations are pushing the equatorial jet downwards. This seems to point towards a circulation in depth between the jet and the vertical velocities, that gets deeper with time.
We have also used these simulations to study the jet acceleration in more detail, during the earlier phases. Firstly, we observe that the jet sets up initially between 0.08 and 0.1 bar in about 15 days (not shown), and then extends upwards and downwards. We show in Figure 8 the meridional and vertical accelerations after 1, 5 and 20 days in the ∆T eq,top = 100K case. Figure 8 shows that in the region where superrotation is the strongest (see Figure 7), the vertical eddy acceleration always provides the maximum momentum convergence in the first  motions accelerate the jet gets deeper with time, i.e. moves to higher pressures. We believe that this can be understood in the following way: the superrotation does not affect the mid-latitude eddy circulation, which keeps acting to converge eastward momentum at the equator. However, as radiation penetrates deeper and the jet extends, the vertical circulation is changed and the vertical winds carry momentum away from the jet. This inhibits the deposition of eastward momentum at the equator, which was accompanied by a deceleration of westward winds on the night side of the planet. Globally, it appears that both meridional and vertical accelerations set up the initial superrotation (with the vertical accelerations being dominant), which then tends to decrease and even change the sign of the vertical acceleration. Then, the global meridional motions act to sustain the jet once the vertical eddy acceleration is negative.
A key question is whether one reaches a limiting level of the day-night temperature contrast, as a proxy for the radiative forcing, at which the superrotation would transition. Such a transition would occur once the non-linear terms become important and depend on the state of the atmosphere at that point. If the atmosphere is similar to Figure 1a, then superrotation would be favoured, but for a state such as 1b the superrotation would be impeded. In our simulations, it seems that there is no threshold to superrotation, because the vertical accelerations spin-up the initial jet in all cases. The only limiting aspect is the time to reach a non linear superrotating state as the forcing gets lower.
In the case of long drag and short radiative timescales Komacek & Showman (2016) have already noted that the meridional motion of the linear steady state is opposite to what is necessary to drive superrotation, although they do observe that the non linear steady state is actually superrotating. We have explained this by considering the time dependent linear state in Section 5.3.1. More precisely, Figure 9 shows the meridional and vertical accelerations as a function of time for two simulations adopting ∆T eq,top = 1K and ∆T eq,top = 100K and the same dissipation. Figure 10 shows the evolution of the zonally averaged zonal wind for comparison.
For both simulations, Figure 9a and Figure 9b, in the first two days meridional and vertical accelerations are both positive and of comparable magnitude. This was expected, as during the first days the atmosphere is comparable to the MG steady state explored by Showman & Polvani (2011). The zonally averaged zonal speed, Figure 10, is almost zero as expected from the propagation of waves with zonal wavenumber m = 1. In the low forcing case, the vertical accelerations eventually dominate by almost an order of magnitude and the meridional terms can even lead to opposing the creation of a jet (for a given pressure level, not seen on the vertically averaged Figure 9). This is due to the fact that the atmosphere has reached the linear steady state of Figure 6a, associated with strong vertical accelerations but almost zero (or negative) meridional accelerations. For the high forcing case, although vertical accelerations contribute to the initiation of superrotation, they get smaller with time as the jet is being created, and eventually become negative (after ∼ 60 days, not shown). This confirms that the vertical accelerations initiate the jet but then tend to extend it to deeper pressure, while meridional momentum convergence allows for an equilibrated state.
Globally, we can now resolve the discrepancy of Figure 4 and Figure 5 of Komacek & Showman (2016), discussed throughout this paper (notably Section 2.2 and this section): nonlinearly superrotating atmospheres can have linear steady states that seem to oppose the triggering superrotation, in contradiction with Showman & Polvani (2011). The study of Tsai et al. (2014), in the limit of strong dissipation, does not offer expla-nation for the apparent paradox. First, as we have shown, the treatment of the time dependent linear solution shows that the linear steady state is not relevant in the high forcing case. After 1 day, the non linear terms become dominant whereas the MG steady state would require linear effects to dominate for at least 10 days. The shape of the atmosphere after 1 day is again given by our Figure 1a: it is the usual chevron shape pattern of Showman & Polvani (2011). Therefore, when non linear effects become dominant, they tend to accelerate the equator whereas if the linear steady state had been reached the deceleration by meridional motion could have been dominant: for adequate forcing in hot Jupiter conditions, the state of the atmosphere after 1 day always leads to meridional momentum convergence at the equator. Then, as seen in Figure 8 and 9, the vertical accelerations also need to be taken into account: at the 80mbar pressure range, vertical motion triggers the emergence of superrotation contrary to what is proposed by Showman & Polvani (2011). Only when superrotation is settled (Figure 7b) do the vertical accelerations tend to decelerate the jet, and extend it deeper i.e. to higher pressures.
Later on, once the superrotation is settled, the study of Tsai et al. (2014) applies to the slower evolution of the atmosphere, leading to the possible existence of a unique steady state. This steady state is permitted by both meridional and vertical accelerations, as we have seen throughout this work. This explains why superrotation is reached even when the dissipation is very low, although the initial acceleration is not included in the explanation of Tsai et al. (2014).

Conclusion
In this study we have explored the initial acceleration of superrotation in the context of a hot Jupiter atmosphere. We have also focused on an inherent discrepancy between the works of Showman & Polvani (2011) and Komacek & Showman (2016). Showman & Polvani (2011) propose that the superrotation is triggered by non linear accelerations around the linear steady state, that converge momentum to the equator. On the other hand, Komacek & Showman (2016) show that certain configurations that exhibit superrotation are also associated with momentum divergence at the equator in the linear steady state limit.
In order to resolve this apparent contradiction, we have studied the general form of the time dependent linear response of the atmosphere to a constant, asymmetrical heating. This response depends on the shape of the forcing, the global shape and frequency of the waves it generates and the decay rate of these waves. Our first conclusion, through the use of ECLIPS3D, is that changing the longitudinal form of the forcing is not of prime importance in the qualitative understanding of superrotation, although quantitatively it does affect the results. The use of a Newtonian cooling scheme with a wavenumber 1 in longitude is therefore a reasonable approximation.
We have also obtained an equation for the frequency and decay rates of the propagating waves, as in Heng & Workman (2014). We could not solve this equation analytically for Rossby and gravity waves, and have therefore estimated the asymptotic behaviour of the waves numerically. For Kelvin waves on the other hand, the analytical solution has been obtained. The estimated decay rates were reported in section 4.1.5 and Figure 3.
From there, we have explained qualitatively the structure of the linear solutions with different drag and radiative timescales, as presented in Figure 5 of Komacek & Showman (2016) The orthogonality of the eigenvectors is easily proven, and the completeness is proved by use of the completeness of the Hermite polynomial functions.
(A.6) Therefore, the expression of the eigenvector (n,l) is: ik c n,l ψ n+1 (c n,l y) − n c n,l (iω n,l + τ −1 drag ) + ik c n,l ψ n−1 (c n,l y) Because of the dependency with c n,l in the parabolic cylinder function, the eigenvectors of Eq.A.7 are not orthogonal anymore (the calculation is cumbersome but with no difficulty). On the other hand, it is easily shown that the set of eigenvectors (n,l) are linearly independent (thanks to the Hermite polynomials). A rigorous proof would be needed to assess that they form a complete set, allowing for a projection of the heating function onto these eigenvectors. A graphical representation of these waves is provided in Appendix C.

Appendix B: Solutions to Eq.(40)
Appendix B.1: The argument principle Left hand side of Eq.(39) may be confused with a second order polynomial, but the dependency of c with ω actually leads to a polynomial of order 6. From Eq.(39) and Eq.(34), we obtain an equation for X = c 2 : with ∆τ = (τ −1 drag − τ −1 rad )/2. Determining the number of propagating waves therefore consists in determining the number of roots of Eq.(B.1) with a positive real part, as explained in section 4. We will make use of the argument principle: denoting P(X) the polynomial in X of Eq.(B.1), the number N of roots of P in a domain K is given by : where γ is a positively oriented contour encompassing K. Denoting C 1/2,r the semi-circle of radius r, centered on the origin and cut by the pure imaginary line ( Figure B.1), the number of zeros of P with positive real part is given by Eq.(B.2) with γ = C 1/2,r in the limit r → ∞. iR R r -r C 1/2,r The right hand side of Eq.(B.2) can be estimated by decomposing the integral on the pure imaginary line and on the circle of radius r. Namely: idt + π/2 −π/2 P (r exp(iθ)) P(r exp(iθ)) ri exp(iθ)dθ .

(B.3)
For large r, the second term of this expression can be calculated thanks to the asymptotical expansion of a polynomial of order 6: Hence the second term in the right hand side of Eq.(B.3) tends to 6iπ when r → ∞.
The first term deserves further consideration. If the image of the imaginary numbers by P does not cross a given half line D with origin at z = 0, said otherwise P(iR) ⊂ C\D, we can define a holomorphic function Ln such that Ln(exp(z)) = exp(Ln(z)) = z and in that case: where the limit comes from the fact that P is an even degree polynomial. In that case, our two expressions for the terms of the right hand side of Eq.(B.3) would yield: N(C 1/2,∞ ) = 3 , (B.6) confirming that Eq.(40) has exactly three roots of positive real part. We are therefore going to show that, except for n = 0, the image of the imaginary numbers by P is always included in C \ iR − or C \ iR + , where iR − and iR + are the imaginary numbers with negative and positive imaginary part respectively.

Appendix B.2: Image of the polynomial
With further calculation one can show that the polynomial P from Eq.(B.1) can be written: for Y ∈ R and with α n = (2n + 1)∆τ/k. Hence we can write the imaginary part of P(iY) as : (P(iY)) = 2k∆τ(Y 2 − 2α 0 Y + 1)(Y 2 + 2α 0 Y + 1) . (B.8) There are two cases: if |α 0 | = |∆τ/k| < 1, the imaginary part of P(iY) has no roots for Y ∈ R. Hence for all n, P(iR) ⊂ C\iR − or P(iR) ⊂ C\iR + depending on the sign of ∆τ, and we have the result: for all n, there are exactly three roots of positive real part of Eq.(40), hence only three waves propagate.
We therefore have confirmed that for |∆τ/k| < 1, only three waves propagate for all n and that this results holds for |∆τ/k| > 1 and n > 0. The case |∆τ/k| > 1 and n = 0 is more complicated as when the real part of P(iy) cancels its imaginary part cancels as well and we can't apply the same argument as for n > 0. However, it means that the polynomial P for n=0, P 0 , can be easily factorized: which yields: P 0 (Y) = (−Y 2 + 2iα 0 Y + 1) (1 − Y 2 ) 2 + 2ik∆τ(−Y 2 − 2iα 0 Y + 1) ≡ (−Y 2 + 2iα 0 Y + 1)Q(Y) . (B.13) Looking for the roots of P 0 with positive real part is therefore equivalent to looking for the roots of Q with positive real part.
If we can show that Q(iR) ⊂ C \ iR − or Q(iR) ⊂ C \ iR + , then we can use the argument principle on Q, a polynomial of order 4, and Eq.(B.3) will ensure that P has only 2 roots with positive real part. This is easily proven by looking at the real part of Q: (Q(iY)) = (1 + Y 2 ) 2 > 0 , (B.14) hence for all y ∈ R the real part of Q(iy) never cancels hence Q(iR) ⊂ C \ iR: P has only two roots of positive real part and only two waves can propagate. It might seem surprising to have a different behaviour for n = 0, but this was already obtained by Matsuno (1966) where only two of the three solutions of the n = 0 case were actual solutions of the linearised equations of motion. When considering that τ drag τ rad , this degeneracy is removed when |∆τ/k| > 1 where only two roots of the polynomial have positive real part. These findings have been tested and confirmed numerically.
Appendix C: Waves when τ rad , τ drag 0 This appendix intends to show the change in the structure of the waves when τ rad , τ drag 0 from the numerical solutions of equations 40 and A.7. Figure C.1 shows a Rossby, gravity and Kelvin wave when τ rad = τ drag = 0.35 which corresponds to ∼ 10 5 s in dimensional units. Their shape is similar to the waves studied by Matsuno (1966), as the only difference is in the decay timescale which is not zero but equal to τ drag .
Then, Figure C.2 and C.3 shows the same waves when the drag or radiative timescales has been multiplied by ten, respectively. The shape of the waves is almost unaltered for gravity and Kelvin waves, apart from a tilt of the perturbation with latitude. Rossby waves on the other hand are more affected by the change in the drag and radiative timescale.
Finally, Figure C.4 shows the same waves when τ rad = 1 and τ drag = 28, which is about 3.5 × 10 4 s and 10 6 s respectively. The tilt in the gravity and Kelvin modes is amplified compared to Figure C.2, and the shape of the Rossby wave is extremely altered. Because the majority of the energy of the Rossby wave is in very high latitudes, where the equatorial β-plane frameworks breaks, we have excluded these modes in our semi-analytical treatment. Physical Rossby waves are nonetheless recovered in 3D by ECLIPS3D with such timescales, see section 4.2.

Appendix D: Numerical development of the linear steady state
In this appendix, we show the development of the linear steady state for a simulation with ∆T eq,top = 1K, τ drag = 10 6 s and τ rad following the prescription of Iro et al. (2005) and the same simulation with ∆T eq,top = 100K. Our estimates show that such a drag timescale should lead to a linear steady circulation similar to Figure 1b after ∼ 20 days of evolution in the low forcing case. Whereas the time to depart from the linear evolution in the highly forcing case should be about ∼ 1 day. This is clearly recovered in Figure D.1. Figure D.1 shows the temperature and winds of both simulations after 1, 3 and 50 days of evolution. We see that the lowforcing simulation resembles Figure 1b after 50 days while the highly forced simulation is superrotating.
After 1 day of simulation a Matsuno-Gill like circulation is recovered in both cases. The maximum speed in the high forcing case is a hundred times the maximum speed of the low forcing case. After 3 days already, the maximum speed of the high forcing case is more than 100 times that of the low forced case. After 50 days, there is a factor 200 difference between the two cases, highlighting the influence of non linear terms.
In the low forcing case, the propagation and dissipation of mid latitude Rossby waves shifts the circulation obtained after 1 day towards a reverse Matsuno-Gill state after 50 days, whereas in the highly-forced case the eddies from the circulation after 1 day accelerate the equator towards a superrotating state. Notably, we see on the middle panel of Figure D.1 that the winds in the low-forced simulation are slowly shifted westward by the mid-latitude Rossby waves, whereas the highly forced simulation leads to a shrink of the westward winds that disappear after ∼ 30 days.