Disk origin of the Milky Way bulge: the necessity of the thick disk

In the Milky Way bulge, metal-rich stars form a strong bar and are more peanut-shaped than metal-poor stars. It has recently been claimed that this behavior is driven by the initial (i.e., before bar formation) in-plane radial velocity dispersion of these populations, rather than by their initial vertical random motions. This has led to the suggestion that a thick disk is not necessary to explain the characteristics of the Milky Way bulge. We discuss this issue again by analyzing two dissipationless N-body simulations of boxy or peanut-shaped bulges formed from composite stellar disks that consist of kinematically cold and hot stellar populations. These two models represent two extreme cases: one where all three components of the disk have a ﬁxed vertical velocity dispersion and di ﬀ erent in-plane radial dispersion, and another where they all have a ﬁxed radial dispersion and di ﬀ erent vertical random motions (thickness). This is intended to quantify the drivers of the main features that are observed in composite boxy or peanut-shaped bulges and their origin. We quantify the mapping into a boxy or peanut-shaped bulge of disk populations in these two cases, and we conclude that initial vertical random motions are as important as in-plane random motions in determining the relative contribution of cold-and hot-disk populations with height above the plane, the metallicity and age trends. Previous statements emphasizing the dominant role of in-plane motions in determining these trends are not conﬁrmed. However, signiﬁcant di ﬀ erences exist in the morphology and strength of the resulting boxy or peanut-shaped bulges. In particular, the model where disk populations initially have only di ﬀ erent in-plane random motions, but similar thickness, results in a boxy or peanut-shaped bulge where all populations have a similar peanut shape, independent of their initial kinematics or metallicity. This is at odds with the trends observed in the Milky Way bulge. We discuss the reasons behind these di ﬀ erences, and also predict the signatures that these two extreme initial conditions would leave on the vertical age and metallicity gradients of disk stars outside the bulge region. As a consequence of this analysis, we conclude that given our current knowledge of the Milky Way bulge and of the properties of its main stellar components, a metal-poor, kinematically (radial and vertical) hot component, that is, a thick disk, is necessary in the Milky Way before bar formation. This supports the scenario that has been traced in previous works. Boxy or peanut-shaped bulges and their surrounding regions are fossil records of the conditions present at early times in disk galaxies, and by dissecting their stellar components by chemical compositions and / or age, it may be possible to reconstruct their early state.


Introduction
The Milky Way (hereafter MW) bulge is a boxy/peanut-shaped bulge (Okuda et al. 1977;Maihara et al. 1978;Weiland et al. 1994;Dwek et al. 1995;Binney et al. 1997;Babusiaux & Gilmore 2005;López-Corredoira et al. 2005;Rattenbury et al. 2007;Cao et al. 2013;Wegg & Gerhard 2013;Portail et al. 2015Portail et al. , 2017a;;Ciambur et al. 2017;Portail et al. 2017b), made up, for the vast majority, of stars originated in its disc.Several works indeed now agree in limiting the contribution of a classical bulge to a small percentage of its total mass (Shen et al. 2010;Kunder et al. 2012;Di Matteo et al. 2014;Kunder et al. 2016;Debattista et al. 2017;Gómez et al. 2018).Also the contribution of the stellar halo, whose density is expected to peak in the inner few kiloparsecs of the Galaxy, seems to be marginal, not exceeding few percent of the bulge total mass (Ness et al. 2013a).While there is growing consensus on these general results, the details remain more controversial.
One of the points still debated concerns the origin of the metal-poor (−1 dex ≤ [Fe/H] ≤ 0 dex), α-enhanced population observed in the bulge.Compared to the metal-rich population ([Fe/H] > 0), metal-poor stars appear kinematically hotter at latitudes b −4 deg, and colder closer to the midplane (Ness et al. 2013b;Rojas-Arriagada et al. 2014;Zoccali et al. 2017), not strongly peanut-shaped (Rojas-Arriagada et al. 2014, but with an evident difference between stars with metallicities below and above ∼ −0.5 dex, see Ness et al. (2013a)), and dominate the outer bulge region.Their proportion, relative to the total population, decreases with decreasing height from the Galactic plane until at least latitudes b −5 deg (Ness et al. 2013a), and then seems to show an inversion, becoming again dominant at very low latitudes (Zoccali et al. 2017).In Di Matteo et al. (2014Matteo et al. ( , 2015) ) we have proposed that the metal-poor population observed in the bulge is the Galactic thick disc, mapped into the boxy/peanut-shaped (hereafter B/P) bulge differently from the metal-rich stars, because of its hotter kinematics.In particular, we have suggested (see also Ness et al. 2013a, for this same interpretation) that stars with metallicity between −1 ≤ [Fe/H] ≤ −0.5 dex can be associated to the old Galactic thick disc (ages greater than about 10 Gyr, metallicities below -0.5 dex, see Haywood et al. 2013) -while bulge stars with metallicities in the range −0.5 ≤ [Fe/H] ≤ 0 would correspond to stars of the young Galactic thick disc (ages between 8 and 10 Gyr, metallicities between −0.5 dex and solar, see Haywood et al. 2013).The reasons behind this suggestion have been reviewed in Di Matteo (2016), and we refer the reader to that paper for a discussion on the disc-bulge connection in this picture.In particular, in Di Matteo (2016) and Fragkoudi et al. (2017a) we have shown that a composite disc galaxy made initially of a kinematically cold, intermediate and hot discmimicking respectively the thin disc, the young and the old thick disc of the Galaxy -can reproduce the morphological and populations trends observed in the MW bulge.Namely: the colder the population, the stronger the bar, and -at all heights where a B/P morphology is observed -the hotter the population, the weaker the B/P structure.Also the chemical patterns and the metallicity distribution function of bulge stars can be understood simply as a consequence of the differential mapping of the composite Galactic (thin+thick) disc into the B/P bulge, modulo their scale-lengths and heights (see Fragkoudi et al. 2017b;Haywood et al. 2018;Fragkoudi et al. 2018).In this context, the agreement of the metallicity map that this scenario predicts with that of the MW bulge, as derived from APOGEE data, is remarkable (Fragkoudi et al. 2018).Our findings, based on models tailored to match the MW bulge and inner disc populations, agree with the general result that in a composite disc, populations with different kinematics are mapped differently into a bar and B/P bulge -a process referred to as kinematic fractionation by Debattista et al. (2017).This result, firstly shown by Bekki & Tsujimoto (2011), has been since then indeed confirmed and extended in a number of other works (Di Matteo 2016;Fragkoudi et al. 2017a;Athanassoula et al. 2017;Buck et al. 2018).However, in Debattista et al. (2017) it is suggested that the main driver of this different response to the bar perturbation is the radial, in-plane random motion of stars, rather then their vertical ones.In their view, a composite disc made of stellar populations with the same initial thickness but different in-plane random motions constitutes a sufficient condition to reproduce the trends observed in the MW bulge, and they conclude that, as a consequence, a thick discthat is a disc both radially and vertically hot -is not necessary to explain the MW bulge main characteristics.
In this paper we re-investigate this issue, about the driver of the trends observed in the MW bulge.First, we aim at understanding whether the in-plane, radial random motions of disc stars constitute the main drivers of the trends observed in the MW bulge, as suggested, and the role played by vertical motions.Second, by comparing some of the current properties of the MW bulge with the models predictions, we aim at finding signatures and suggesting criteria to constrain the kinematic conditions present in early MW disc.
For this study, we analyse two dissipationless N-body simulations of composite disc galaxies.These two models represent two extreme cases, namely one where all three components of the disc have fixed vertical velocity dispersion and different planar/radial (in-plane) dispersion, as in Debattista et al. (2017), and another where they all have fixed radial dispersion and different vertical random motions (thickness).Their analysis reveals that disc populations with different initial vertical random motions and same in-plane random motions are subject, at first order, to a similar mapping into a B/P-shaped bulge, as that experienced by stellar populations with equal initial thickness and different in-plane motions.More specifically, both initial conditions lead to composite B/P bulges, where formerly disc stars with initially the highest velocity dispersions redistribute in a thicker B/P structure and weaker bar, than disc stars which have initially a cold kinematics.As a consequence of this similar mapping, in both models, the B/P bulges show vertical metallicity/age gradients, and a pinching of the metallicity maps along the bulge minor axis, as also observed Gonzalez et al. (2017).Despite these same trends, we note, however, some significant differences in the morphology of the resulting B/P structure in the two cases.In particular, when the disc is made of stellar populations with the same initial thickness, which are differentiated only in their inplane motions, as in Debattista et al. (2017), the peanut structure appears at the same height above the plane, for all populations, independently on their kinematics, and its strength and shape is remarkably similar.This is at odds with what is observed in the MW bulge.We discuss the reasons behind this different behaviour and how the coupling of radial and vertical motions acts in the two cases.
These two models constitute a first step to explore to what extent in-plane and vertical random motions reshape the properties of B/P bulges and their surrounding discs, but none of them evolves into properties compatible with those of MW disc.As we discuss, it is indeed necessary that disc populations both radially and vertically warm -such as those currently found in the MW thick disc -were present in the Galaxy at the beginning of its secular evolution, in order to reproduce some of the trends observed in the Galaxy.
The paper is organized as follows : in Sect. 2 we describe the two models analyzed in this paper, their properties, and the adopted numerical methods; in Sect. 3 we present the results, by discussing first some main general trends (morphology, metallicity, ages) of their B/P bulges and their surrounding discs (Sect.3.1), then the strength of the B/P shape and its dependence on the initial disc kinematics (Sect.3.2) and, finally, in Sect.4, we outline our conclusions.

Models
In this paper, we analyze two dissipationless simulations of galaxies made of the superposition of discs of different initial kinematics.In Model 1, the galaxy is made of three stellar disc components, of equal initial thickness (and hence initial vertical velocity dispersion), but different initial in-plane motions (i.e.different radial and tangential velocity dispersions).In Model 2, the galaxy is also made of three stellar discs, but with equal initial in-plane motions and different initial thickness.These two models do not aim at representing a realistic model for the MW at the time of the formation of the bar, but are intended to constitute two significantly different initial conditions, needed to explore the drivers of the main characteristics observed in the MW bulge.In particular Model 1 has a similar set-up to the models analyzed in Debattista et al. (2017) (i.e.equal thickness, different in-plane motions), thus allowing a comparison with this work.In each model, the three discs are represented by Miyamoto-Nagai density profiles (Miyamoto & Nagai 1975), with characteristic heights and scale lengths as given in Table 1.In each of the two models, the cold, intermediate and hot discs have masses which represent, respectively, 50%, 30% and 20% of the total disc mass.In this context, the choice of the mass of each of these discs is arbitrary, and the conclusions of this work do not depend on it.However, we use these percentanges because they are reminiscent of those of the thin, young, and old thick discs respectively (for a derivation of the Galactic mass growth and relation with the thin and thick discs we refer the reader to Snaith et al 2014Snaith et al , 2015)).To allow all comparisons with the models that will be presented in forthcoming papers, we have thus decided to keep the same mass repartition between the three discs as that used in those papers.
These composite discs are then embedded in a dark matter halo, modelled as a Plummer sphere (Plummer 1911), and whose parameters are reported in Table 1.The resulting rotation curves are shown in Fig. 1.Face-on and edge-on maps of stars in the bulge region at the initial time are given in Appendix A, where we also discuss further details about the adopted parameters and their impact on the final bulge morphology.
To generate initial conditions, we employed the iterative method described in Rodionov et al. (2009).This method allows to generate initial conditions at equilibrium with the required density profiles and/or kinematics constraint, avoiding the relaxation processes -and departure from initial conditions -often observed in N-body models of disc galaxies.It is particularly useful for building composite discs, each with a specific kinematic and/or density profile, allowing a full control on their initial state.For the models presented here, the initial velocity dispersion profiles of the cold, intermediate and hot populations are reported in Fig. 1.In Model 1 the initial thickness of the three discs is the same, and thus the corresponding vertical velocity dispersion profile; the in-plane motions (radial and tangential) are however different.On the contrary, in Model 2, the vertical velocity dispersions are different, while the in-plane motions are initially the same.We investigate in the following how these two models evolve secularly, and how these (initially different) composite discs are mapped into the B/P bulge.We refer the reader to Appendix B for a discussion on the velocity dispersion structure that is imposed for the initial discs and on the final convergence of the method.
Both simulations have been run with a recently developed parallel MPI Tree-code which takes into account the adaptive spatial decomposition of particle space between nodes.The multi-node Treecode is based on the 256-bit AVX instructions which significantly speed up the floating point vector operations and sorting algorithms (Khoperskov et al. in prep).Fifteen million particles have been employed for each model, 10 millions in the disc components, and 5 millions in the dark matter halo.A time step ∆t = 2 × 10 5 yr has been adopted, together with a gravitational smoothing length of 50 pc.Both simulations have been run over a time scale of 5 Gyr.In less than 1 Gyr from the beginning of the simulations, a bar and a B/P bulge forms in both models.We then analyze the final configuration, at t = 5 Gyr, to quantify the properties of the B/P bulge and surrounding disc.

Multiple populations with different in-plane or vertical kinematics: relative contribution to the B/P bulge, metallicity and age trends
The face-on and edge-on projection of stars in the bulge region for Model 1 are shown in Fig. 2. We recall that in this model, the simulated galaxy is made of three discs, with different in-plane kinematics (the radial velocity dispersion increases moving from the cold disc to the hot disc, see Fig. 1), but same  Fig. 3. First row: Bar strength A 2 as a function of time, for all stars (dashed black curve), stars initially in the cold (blue curve), intermediate (green curve), and hot (red curve) discs.Second row: Height profiles in the B/P region of all stars, and stars initially in the cold, intermediate, and hot discs.Models 1 and 2 are shown, respectively, on the left and right columns.In the panels on the first row, the grey area indicates the time of disc bending.At the end of this phase of instability, a B/Pshaped bulge is formed in both models, and its morphology does not change significantly after that time.
vertical velocity dispersions.This set-up is thus similar to those analyzed in Debattista et al. (2017), and indeed the morphology of the bulge in Fig. 2 agrees qualitatively with the idealized model presented in their paper1 .
Our analysis for this model confirms the findings of Debattista et al. (2017) (see also Combes et al. 1990;Athanassoula 2013, for earlier works on the strength of bars in single component discs, and their dependence on the initial disc kinematics).Stars with different in-plane kinematics are mapped differently in the B/P bulge.When viewed face-on, the colder the initial disc, the stronger the bar.This can be appreciated by looking at the isodensity contours that are more elongated for the initially cold disc, but is also quantified by means of the m = 2 coefficient, A 2 , of the Fourier analysis of the face-on density maps (Fig. 3, left column) which shows that the colder the disc is initially, the higher the corresponding A 2 value.In the edge-on projection, stars with initially the coldest kinematics show the thinnest distribution, as we have verified by estimating < h z >, the median of the absolute value of the z-component of their positions, as a function of the distance R from the galaxy centre (Fig. 2).
Because of this different response to the bar perturbation, stars belonging initially to discs with different in-plane kinematics contribute differentially to the B/P bulge, both in its face-on and edge-on projection.In agreement with Debattista et al. (2017), the colder the disc, the higher its relative contribution to the B/P structure is.Stars in the initially kinematically hottest disc dominate at larger heights above the disc.As a consequence their relative contribution along the B/P minor axis increases with height above the plane (see Fig. 2, fourth column.This trend -namely the larger thickness of the initially in-plane hot disc with respect to the colder ones -comes from the fact that in the inner regions of the simulated disc the azimuthal and vertical frequencies, Ω and ν, respectively, are similar (see Appendix C), and thus in-plane and vertical motions are not decoupled, as usually assumed in the case of very thin, or infinitely thin discs (see Binney & Tremaine 1987).As a result, a disc which has initially only a hot in-plane kinematics will rapidly relax into a disc also vertically hot, because of this coupling.
The analysis of Model 2, interestingly reveals that all the trends observed for Model 1 are satisfied also in this case (Fig. 4).When viewed face-on, the lower the initial vertical velocity dispersion of the disc, the stronger is the bar -as measured by means of the A 2 coefficient, as a function of time (Fig. 3, right column).When viewed edge-on, the lower the initial vertical velocity dispersion of the disc, the thinner the B/P structure.Note however the difference in the thickness of cold and hot populations in the two cases: in Model 2, outside the B/P region, the hot population has a higher thickness than that measured for the cold population, while in Model 1 the difference is milder.For example, at R = 6 kpc the hot population has a thickness about three times higher than that measured for the cold population, while in Model 1 the hot population is only 1.3 times thicker than the cold one, at the same radius.Finally, because the three discs in Model 2 have, by construction, different initial thickness, their relative contribution to the B/P bulge depends on their initial kinematics: in particular, the weight of stars with initially the hottest kinematics increases with the height above the plane.
The similarity in the trends of cold and hot populations with height above the plane, found for the two models, has also some important implications in the mean metallicity and age of stars across the B/P bulge, as we investigate in the following.To this aim, we assign to stars in each disc a metallicity and age, as follows: the cold, intermediate and hot discs have gaussian metallicity distributions with means respectively at [Fe/H] = 0.3, −0.2 and − 0.6 dex, and dispersions σ [Fe/H] = 0.1, 0.3 and 0.3; for the ages, we assume for the cold, intermediate and hot discs uniform distributions in the intervals [0., 8.], [8., 10.] and [10., 13.] Gyr, respectively.The metallicity mean values and dispersions are similar to those found by Ness et al. (2013a) for components A, B and C in the Galactic bulge.Since, as recalled in the introduction, in our scenario we associate these three bulge components2 to the Galactic disc(s)respectively the thin, young thick and old thick discs, which, as robustly established, show a decreasing metallicity with increasing (in-plane and vertical) velocity dispersions -the choice to assign these metallicity values to our three discs is natural.We emphasize however that other choices would have been possible, and that the trends discussed in the following are not impacted by the specific metallicity values employed, as far as the three modeled discs have different chemistry and ages (i.e. a trend between metallicity/age and velocity dispersion must exist).Since age distributions in the Galactic bulge -as well as in most of the Galactic disc -are still missing, we decide to assign them in the simplest way, simply relating our modeled discs to the thin, young and old thick discs, respectively, and making use of the age estimates of (thin and thick) disc stars at solar vicinity, as given by Haywood et al. (2013).With these choices of metallicities and ages, half of the disc stellar mass has metallicities below solar and ages below 8 Gyr, in agreement with the results presented in Snaith et al. (2014); Haywood et al. (2016) for the inner Galactic disc (inside 6-7 kpc from the Galactic centre).Before the formation of the bar and of the B/P structure, the edge-on metallicity and age maps of Model 1 and Model 2 are shown in Fig. 5.Because in Model 1 all discs have initially the same thickness, no trend is visible when the modeled galaxy is seen edge-on, and the vertical gradient along the bar major axis is null.For Model 2, however, because the discs have different initial vertical velocity dispersions, a vertical gradient is in place ab initio both in the metallicity and age maps, with mean metallicities(/ages) decreasing(/increasing) with height above the plane.As an example, for Model 2 along the bulge minor axis the initial metallicity gradient is -0.20 dex/kpc, and the age gradient is 1.72 Gyr/kpc3 .We note however that in Model 2 the metallicity (/age) gradient is mostly due the presence of very metalrich (/young) stars close to the galaxy midplane (at heights below 500 pc), while for larger heights the gradient is nearly flat also in this case.Once the B/P bulge is formed, the metallicity and age maps of Models 1 and 2 appear rather different from their initial state.In both models, X-shaped metallicity and ages maps are clearly visible (as found also in the models by Athanassoula et al. 2017;Debattista et al. 2017), with metal-rich (young stars) associated to the initially cold disc mostly redistributed in the peanut configuration.Along the bulge minor axis, as one moves vertically from the plane, the metallicity and age maps become rapidly dominated by the metal-poor, old intermediate and hot discs, giving rise to a vertical negative (/positive) metallicity (/age) gradient.It is striking to note that not only are the global metallicity and age trends similar in the two models (Fig. 5), but the absolute values and strengths of the gradients as well.Ex- cept for the outermost bulge region outside the B/P lobes, that we will discuss in the following section, at all x |4| kpc the vertical metallicity and age profiles appear very similar, with comparable slopes.As an example, along the bulge minor axis, between z = 0 and z = 2 kpc from the galaxy midplane, the metallicity gradient is equal to −0.089 dex/kpc in Model 1 and to −0.091 dex/kpc in Model 2. The age gradients are respectively 0.721 Gyr/kpc for Model 1 and 0.782 Gyr/kpc for Model 2. The age and metallicity maps generated in these two models are thus remarkably similar, despite the significantly different initial conditions adopted, and this is valid for all choices of the initial metallicity and age distributions in the disc populations, since here metallicity and age are simply tags of the stellar particles, that trace their spatial distribution.As a final remark, we emphasize that both models produces B/P bulges whose metallicity/age maps appear more pinched/peanutshaped than the stellar density distribution itself (cf.metallicity/age maps with isodensity contours in Fig. 5).This feature has been observed in external bulges (Gonzalez et al. 2017), and reported also in N-body simulations (Debattista et al. 2017;Aumer & Binney 2017), and here we show that it is not necessarily the signature of a pre-existing disc with stellar populations differentiated in their in-plane random motions, as previously suggested.
From this analysis we thus conclude that initial vertical random motions are as important as in-plane random motions in determining the overall bulge population, metallicity and age trends, and that previous statements emphasizing the dominant role of in-plane motions in determining these trends are not confirmed by our analysis.
Finally, we would like to shortly comment on the metallicity/age trends that these two models generate outside the B/P bulge, in the surrounding disc.In Fig. 6, we show the large-scale -i.e. over the whole extent of the galaxy -edge-on metallicity and age maps, for both models.In Model 1, as shown in Appendix D, during the whole evolution, outside the bulge region, the populations are never differentiated in their vertical velocity dispersion, and thus scale heights.This implies that the vertical metallicity and age profiles of the disc do not exhibit any vertical gradient, as result of combining different populations.The consequence of the absence of any vertical differentiation in kinematics present in Model 1 is that the mean metallicity and age of the disc do not vary with z: this feature is already evident at the borders of the B/P bulge, and persists all over the disc.So, while in the B/P bulge, the two models predict similar trends of metallicity/age with height above the plane, significant differences are expected outside the B/P region, and in the surrounding disc.In particular, the absence of any dependence of metallicity or age with height above the plane -as observed in Model 1 -is in sharp contrast to what we know about the MW disc : both at the solar vicinity and on a severalkpc scale metallicity and ages vary with height above the plane, and with the vertical velocity dispersion of the corresponding populations (Strömberg 1946;Spitzer & Schwarzschild 1951;Nordström et al. 2004;Seabroke & Gilmore 2007;Holmberg et al. 2007Holmberg et al. , 2009;;Bovy et al. 2012;Haywood et al. 2013;Sharma et al. 2014;Bovy et al. 2016;Martig et al. 2016;Ness et al. 2016;Mackereth et al. 2017).To generate vertical metallicity/age gradients in the disc and in the region outside the lobes of the B/P bulge, it is necessary that populations of different metallicities/ages are differentiated in their vertical velocity dispersion ab initio, before being trapped in the bar instability, otherwise only by secular evolution processes this relation cannot be put in place afterwards.

Multiple populations with different in-plane or vertical kinematics: on the shape and strength of the B/P bulge
Despite that Figs. 2 and 4 show similar trends with height from the plane, we notice that the initial disc kinematics imprints some significant differences in the intrinsic morphology of the final B/P bulges.
A first way to quantify these differences is shown in Fig. 7, where we estimate the strength of the B/P structure by means of the sixth Fourier component (B 6 ) of edge-on density isophotes in the bulge region (see Ciambur 2015; Ciambur & Graham 2016, for further details4 ).What we observe is that, while in Model 2, the amplitude of the B 6 coefficient depends on the initial kinematics of the population -the hotter the kinematics, the lower B 6 , and this over the whole bulge extent -in Model 1, the B 6 amplitude appears the same for all populations, independently on their initial kinematics -both as a function of the projected radial length of the peanut, R Π , and of its projected height, z Π (for a definition of R Π and z Π , see Fig. 1 in Ciambur & Graham 2016).In the regions where all three populations coexist in the bulge (about z Π ≤1 kpc), the trends of B 6 with R Π and z Π indeed are indistinguishable for Model 1.
A second way to appreciate the different B/P shapes in the cold and hot populations of the two models is illustrated in Fig. 8, where we show isodensity contour maps of the B/P bulge in Models 1 and 2. For both models and for all populations, we have normalized the isodensity contours to their central density value.The normalized density levels shown by the contours have been chosen in order to probe similar regions of the B/P bulge  for hot and cold populations.The difference between the two models appears very clearly: while in Model 1 the isodensity shapes inside the B/P lobes, i.e. x ≤ 2 kpc, are the same at all heights, for both cold and hot populations, in Model 2 the shape of the isodensity contours depends on the initial kinematics of the population, the hotter it is initially, the rounder and less peanuty the contour is, at all z.5 Fig. 8 thus shows that in Model 1 it is the thickness of the B/P structure that moderately changes with the initial kinematics of the population, but not its intrinsic shape.The trends observed for Model 1 in Fig. 2 namely the decreasing contribution of the cold disc population with height above the plane, when compared to the hot population -and in Fig. 5 -the X-shaped metallicity/age maps -are rather due to final different vertical density gradients of the cold and hot populations, and thus to different thicknesses, rather than to an intrinsic difference in their isodensity shape.This result is interesting, because it shows that X-shaped metallicity maps are not only the result of the spatial superposition of populations with different metallicities and different intrinsic peanut/X-shapes, but can also be produced when in a B/P bulge multiple populations with different metallicities, but similar X-shapes, co-exist.In this latter case, the X-shape metallicity and age maps result simply because of a different vertical density decay of these populations.
Another consequence of the similarity in the isodensity contours found for Model 1 is that the peanut appears at the same height above the plane, at about 600 pc, for all populations, independently on their initial in-plane kinematics (see also Fig. 9).This trend seems at odds with what is found in the MW bulge, where the strength and appearance of the B/P shape depends on the metallicity of bulge stars (Ness et al. 2013a;Rojas-Arriagada et al. 2014).According to our experiment, this observational result favors a scenario where metal-rich and metal-poor populations in the MW bulge initially had different initial vertical random motions, i.e. before being trapped by the B/P bar.If the metal-poor and metal-rich populations were different only in their in-plane random motions, but not in the vertical ones, the bimodality in the stellar density distribution would show a weak or even null dependence on the height above the plane, as we find for Model 1.
But why is the B/P shape of hot and cold populations so similar in a disc whose populations are only differentiated by their in-plane motions, as in Model 1, and how general this result is?While it is premature to generalize, and a larger statistics of Model 1-like simulations is required before deriving final conclusions, we can give here some first explanations of the observed similarities in the B/P morphology of cold and hot populations in Model 1.In this model, all populations have initially the same scale lengths, and thus similar guiding radii, and they differ only in their initial radial velocity dispersions.In the epicyclic approximation, orbits in an axisymmetric potential, with different radial velocity dispersions σ R but same guiding radius R g , have same azimuthal frequency Ω, where Ω = V c /R g , V c being the circular velocity at R = R g .Also when the epicyclic approximation is not valid, because the departure from the orbit circularity is significant, still kinematically hot and cold populations, with similar guiding radii, have similar azimuthal frequencies (with differences typically less than 10%), as we show in Fig. C.1 in Appendix C. The distribution of Ω being similar for hot and cold populations, and because in Model 1 the distribution of vertical frequencies ν is the same by construction -all populations have the same thickness -this implies that the fraction of stars that satisfies the inequality ν ≥ 2(Ω − Ω p ) (Merritt & Sellwood 1994), that is that can respond to the bending/buckling instability which initiates the B/P formation, is similar in all populations, for any given value of the bar pattern speed Ω p .Also, because of the initial similarity of Ω and ν at any given radius, the percentage of resonant material with the vertical Inner Lindblad resonance (Combes et al. 1990) is similar for all populations.Finally, in Model 1 we observe that the bar formation rapidly raises the radial velocity dispersion of the cold disc to values similar to those of the hot population (see Fig. D.1 in Appendix D), thus erasing their initial kinematic differences.The similarity of Ω and ν, and the convergence of the in-plane velocity dispersions to similar values, imply a similar B/P shapes in all populations, independently on their initial in-plane kinematics.

Conclusions
In this paper we have made use of dissipationless N-body simulations of boxy/peanut-shaped bulges formed from composite stellar discs, made of kinematically cold and hot stellar populations, to discuss the main drivers of the trends observed in a B/P bulge.To this aim, we have analyzed two extreme models, the first made of disc populations with the same initial vertical random motions, σ z but different in-plane random motions, σ R , the second made of disc populations with different initial thickness but equal in-plane random motions.We have chosen such extreme initial conditions to separately study the   6 4 2 0 2 4 6 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 [1.2,1.3]6 4 2 0 2 4 6 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 [1.3,1.4]6 4 2 0 2 4 6 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 [1.4,1.5]6 4 2 0 2 4 6 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 [1.5,1.6]x [kpc] Probability density Model 2 Fig. 9. Distribution of stars along the bar major axis at different heights above the plane, as indicated on the top of each panel.The distribution of cold, intermediate and hot stars in the B/P structure are indicated, respectively, with blue, green and red histograms.The distribution appears bimodal for all populations at the same height above the plane in Model 1 (left panel) , while in Model 2 (right panel) the bimodality appears at larger heights for the hot population than for the colder ones.role of in-plane versus vertical kinematics in determining the final properties of a B/P bulge.
While a larger sample of N-body simulations is needed to understand how representative our findings are of the two classes of models studied, we can derive some first conclusions.At first order, disc populations with different initial σ z and same σ R are subject to a similar differential mapping into a boxy/peanut-shaped bulge, as that experienced by stellar populations with equal initial σ z and different σ R .In both cases: when viewed face-on, the colder initially the disc population, the stronger the bar; when viewed edge-on, the hotter initially the disc population, the thicker the B/P structure.at a given height above the plane, the relative contribution of stars in the B/P-shaped bulge depends on their initial kinematics, and the relative weight of stars initially in the cold disc population decreases with height (and viceversa, that of the hot disc population increases with height).because of this differential distribution of initially cold and hot disc populations in the B/P-shaped bulge, vertical metallicity(/age) gradients are naturally formed along the bulge minor axis, and X-shape metallicity/age distributions as well, if a velocity dispersion-metallicity(/age) relation exists in the disc, initially.
As a consequence, the vertical, as well as the radial velocity dispersions, are both responsible for generating similar trends in a boxy/peanut-bulge, like that of the Milky Way, and we cannot conclude that in-plane random motions play a more dominant role than vertical random motions in determining these trends, as recently suggested.However, some important differences exist in the characteristics of the B/P bulges, and their surrounding discs, as generated with these two models.
-We find that the model where disc populations have initially only different in-plane random motions, but similar thickness, results into a boxy/peanut bulge where all populations have a similar peanut shape, independently on their initial kinematics, or metallicity.As a consequence, the peanut appears at the same height above the plane, for all populations.This is at odds with the trends observed in the Milky Way bulge.
-Finally, a model where the disc stellar populations are initially differentiated only in their in-plane kinematics, generates discs with no relation between metallicity/age and vertical velocity dispersion and where all populations converge to the same final radial velocity dispersion.This model is thus not applicable to the Milky Way.
On the basis of these two models, we conclude that a metal-poor, kinematically (radial and vertical) hot component, that is a thick disc, is necessary in the Milky Way before bar formation.

Appendix D: Evolution of the in-plane and vertical velocity dispersions with time
In this Section, we show the temporal evolution of the radial and vertical velocity dispersion of the initially kinematically cold and hot discs, for Models 1 and 2, in Figs.D.1 and D.2, respectively.In both plots, these velocity dispersions are shown for 8 different radial annuli, covering the whole extent of the simulated galaxy, from its inner regions to its outskirts.For clarity, in these plots we do not show the intermediate disc, but we checked that its temporal behavior is always bracketed between that of the initially hot and cold populations.We refer the reader to Sect.3.1 for a discussion about the implications of these plots on the vertical metallicity gradients outside the B/P bulges of Model 1 and 2,      Probability density 0.9, 1.9, 3.1 0.9, 2.0, 3.1 0.9, 2.0, 3.0

Fig. 2 .
Fig. 2. Model 1, morphology of the bar-B/P region at t = 5 Gyr.First and third column, from top to bottom: face-on (col 1) and edge-on (col 3) absolute stellar densities of all stars (top panel), and stars initially in the cold (second panel), intermediate (third panel) and hot (bottom panel) discs.Second and fourth column, from top to bottom: Fraction of cold (second panel), intermediate (third panel), and hot disc (bottom panel) stars in the B/P bulge, seen face-on (col 2) and edge-on (col 4).The fraction is defined as the ratio of cold, intermediate or hot disc stars to the total (i.e.cold+intermediate+hot) disc stars in the B/P bulge.In the edge-on maps, only stars in the bar (|x| ≤ 5.5 kpc and |y| ≤ 3 kpc) have been selected.

Fig. 5 .
Fig. 5. Edge-on metallicity and age maps for stars in Model 1 (first and third row) and for Model 2 (second and fourth row), at time t = 0 (first column) and at time t = 5 Gyr (second column).Only stars in the bar (|x| ≤ 5.5 kpc and |y| ≤ 3 kpc) have been selected.In all these maps, when present the bar is oriented side-on.Isodensity contours are shown in black.

Fig. 6 .
Fig. 6.Edge-on metallicity and age maps at the final time of the simulation for Model 1 (respectively, first and second row), and Model 2 (third and fourth row).Both the B/P region and the surrounding disc are shown in the maps.Isodensity contours are shown in black.

Fig. 7 .
Fig.7.First row: B/P strength, quantified by means of the B 6 parameter, as a function of the projected radial length of the peanut, R Π , for Model 1 (left panel) and Model 2 (right panel).Second row: B/P strength as a function of the projected height of the peanut, z Π , for Model 1 (left panel) and Model 2 (right panel).In all plots, the strength of the peanut is shown at t = 5 Gyr for all stars (black curves), and stars initially belonging to the cold (blues curves), intermediate (green curves) and hot (red curves) discs.In all panels, error bars are represented by shaded areas and indicate the intensity rms scatter along each isophote after the subtraction of the best-fitting pure ellipse with harmonic terms up to, and including, the 6th order.

Fig. 8 .
Fig. 8. Isodensity contours of the B/P bulge in Model 1 (top panel) and Model 2 (bottom panel), for the initially kinematically cold (blue contours) and hot (red contours) disc populations, at t = 5 Gyr.The contours of the intermediates disc are not shown, but they are bracketed by those of the cold and hot disc populations.The numbers on the top left of each panel indicate the values of the isodensity contours, for cold and hot populations, normalized to their central density.Only stellar particles in the bar region (|x| ≤ 5.5 kpc and |y| ≤ 3 kpc) have been selected for these plots.

Fig. A. 1 .
Fig. A.1.Morphology of the bulge region at time t = 0, for Model 1 and Model 2. First and third column, from top to bottom: face-on absolute stellar densities of all stars (top panel), and stars initially in the cold (second panel), intermediate (third panel) and hot (bottom panel) discs for Model 1 ((col 1)) and Model 2 (col 3) .Second and fourth column, from top to bottom: edge-on absolute stellar densities of all stars (top panel), and stars initially in the cold (second panel), intermediate (third panel) and hot (bottom panel) discs for Model 1 ((col 2)) and Model 2 (col 4).In the edge-on maps, only stars with |x| ≤ 5.5 kpc and |y| ≤ 3 kpc have been selected.

Fig. A. 2 .
Fig. A.2. From left to right, First panel: Edge-on stellar density map of particles in the hot disc of Model 2 at t = 0.The density contours trace the whole density distribution of the hot disc in the central regions, the colored map corresponds to the bulge-like part of this distribution (see text for details).Second panel: Edge-on stellar density map, at the final time of the simulation, of hot disc particles initially inside the bulge-like region.Third panel: Relative contribution of hot disc particles initially inside the bulge-like region to the final B/P bulge, as defined by hot disc particles only.Fourth panel: Relative contribution of hot disc particles initially inside the bulge-like region to the final B/P bulge, as defined by all disc particles, i.. cold and intermediate discs also included.

Fig. B. 1 .
Fig. B.1.From top to bottom: Evolution of the radial, azimuthal and vertical velocity dispersion profiles, as a function of radius R, during the iterative procedure adopted to generate the initial conditions of Model 1.For each plot, and each inset in the plot, different colors represent different steps of the iteration, as indicated by the the color bar.In all plots, and insets, the black dotted cuve indicates the velocity dispersion after 20 iterations, when we consider the system converged to an equilibrium solution.Cold, intermediate and hot discs are shown, respectively, in the left, middle, and right columns, as indicated.

FigFig
Fig. C.1.From left to right: Distribution of guiding radii, azimuthal frequencies Ω, vertical frequencies ν and of the differences Ω − ν/2, for Model 1 (top panels), and Model 2 (bottom panels), at t=0.In each plot, the blue, green and red curves correspond, respectively, to the cold, intermediate and hot disc components, as indicated.In each plot, we also give the 25th, 50th and 75th percentile of the distributions, with colors corresponding to each of these components.

Table 1 .
Masses, characteristic scale lengths and heights and number of particles, for the different components in Model 1 and Model 2. All masses are in units of 2.3×10 9 M , distances in kpc.
component represents a significant fraction of the hot disc in the innermost B/P region, especially for |z| ≤ 1 kpc (seeFig.A.2,  third panel).However it does represent only a marginal fraction of the whole B/P bulge populations, which includes also particles formerly in the cold and intermediate discs (see Fig. A.2, fourth panel).Thus, while this central bulge-like component certainly dominates the morphology of the kinematically hot part of the B/P bulge made of hot disc stars only, it only marginally affects the final, whole B/P morphology and its spatial extension.