The Gaia DR3 view of dynamical substructure in the stellar halo near the Sun

The debris from past merger events is expected and, to some extent, known to populate the stellar halo near the Sun. We aim to identify and characterise such merger debris using Gaia DR3 data supplemented by metallicity and chemical abundance information from LAMOST LRS and APOGEE for halo stars within 2.5 kpc of the Sun. We utilise a single linkage-based clustering algorithm to identify over-densities in Integrals of Motion space that could be due to merger debris. Combined with metallicity information and chemical abundances, we characterise these statistically significant over-densities. We find that the local stellar halo contains 7 main dynamical groups, some of in-situ and some of accreted origin, most of which are already known. We report the discovery of a new substructure, which we name ED-1. In addition, we find evidence for 11 independent smaller clumps, 5 of which are new: ED-2, 3, 4, 5, and 6. These are typically rather tight dynamically and display a small range of metallicities; their abundances, when available, as well as their location in Integrals of Motion space, suggest an accreted origin. The local halo contains a substantial amount of substructure, of both in-situ and accreted origin.


Introduction
The Gaia mission has brought our Galaxy into sharper focus with every data release, revolutionising our understanding of our local environment and the field of Galactic Archaeology. Notably, the second data release (Gaia Collaboration 2018) significantly increased the number of stars with full 6D position and velocity information. This preponderance of information has brought on new insights into our Galaxy's past, such as evidence of an ancient major merger (known as Gaia-Enceladus, Helmi et al. 2018; see also Belokurov et al. 2018, the 'Sausage'), and the fine details of the dynamics of the Galactic disk (e.g., Antoja et al. 2018). The most recent third data release (Gaia Collaboration 2023b) promises to offer similar advancements in our understanding of the Galaxy.
Over the Milky Way's history, many galaxies must have been accreted in a series of minor and major mergers, following the hierarchical growth characteristic of the ΛCDM model (Springel et al. 2005). Inferring our assembly history from the accreted material means overcoming the challenge in identifying the accreted stars and attributing them to their progenitor. For all but the most recent events, the material has long since phase mixed, erasing cohesion in physical space. Instead, we may look to the space of integrals of motion (IoM), where some structure is preserved (see Helmi 2020, and references therein). Combined with the chemical abundances, this can help identify a star's progenitor.
The catalogue used in this paper is only available at the CDS via anonymous ftp to cdsarc.cds.unistra.fr (130.79.128.5) or via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/670/L2
This goal currently stands as a particularly broad endeavour within the Galactic community and many structures have already been identified in the stellar halo. Some of the larger ones have been studied for decades and are well established, such as Sagittarius and the Helmi streams, while others are more recent discoveries, such as Gaia-Enceladus/Sausage. However, the existence and extent of some other structures are still a matter of debate (see e.g., Naidu et al. 2020).
As the available data improves and expands in its breadth, the methods used to identify substructures have become increasingly sophisticated. Nonetheless, the interpretation and statistical soundness of the outcome have generally received less attention. With this in mind, in this Letter, we apply our previously developed clustering algorithm (Lövdal et al. 2022; Ruiz-Lara et al. 2022) to identify merger debris and in situ substructures in the new Gaia DR3 dataset. This work is organised as follows. Section 2 describes our selection of a Gaia DR3 halo sample, complemented with the relevant chemistry. We describe our methodology in Sect. 3. We present our results and offer a brief discussion in Sect. 4. In Sect. 5, we summarise our findings.

Data
Gaia DR3 has provided a significant increase (roughly a factor of 5) in the number of known stars with a radial velocity value (the RVS sample, Katz et al. 2023). Furthermore, Gaia DR3 has provided, for the first time, metallicities for over five million stars derived from the RVS spectra (Recio-Blanco et al. 2023). As we show below, the increase in the size and information content of this dataset offers new insights into the local stellar halo.
To construct a sample suitable for our purposes, we applied several quality and selection cuts to the RVS dataset. We first corrected each star's parallax ($\varpi$) by its individual zero-point offset ($\Delta\varpi$), determined following Lindegren et al. (2021). To obtain a distance, we inverted the parallax; hence, we require that the (total) relative parallax uncertainty is less than 20%, namely, $(\varpi - \Delta\varpi)/\sqrt{\sigma_{\rm parallax}^2 + \sigma_{\rm sys}^2} \geq 5$, where $\sigma_{\rm parallax}$ is the parallax_error, and $\sigma_{\rm sys}$ is the systematic uncertainty on the zero-point, which we take to be 0.015 mas (Lindegren et al. 2021). Furthermore, we selected stars with RUWE < 1.4 and $\sigma(V_{\rm los}) < 20$ km s$^{-1}$, after applying the correction to radial_velocity_error recommended by Babusiaux et al. (2023). Stars in this sample typically have radial velocity errors lower than this 20 km s$^{-1}$ cut, with a median $\sigma(V_{\rm los})$ of 3.9 km s$^{-1}$, corresponding to a median total velocity uncertainty of 1.4 km s$^{-1}$. We also followed these authors and removed a few stars with $(G_{\rm RVS} - G) < -3$.
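As an illustration only, the parallax quality cut and the distance estimate described above might be sketched as follows. The function names and the array-based interface are assumptions made for this example; only the 0.015 mas systematic term and the threshold of 5 come from the text.

```python
import numpy as np

SIGMA_SYS_MAS = 0.015  # systematic zero-point uncertainty quoted in the text (mas)


def parallax_quality_mask(parallax, zero_point, parallax_error, min_snr=5.0):
    """True where the zero-point-corrected parallax has total S/N >= min_snr."""
    corrected = np.asarray(parallax, dtype=float) - np.asarray(zero_point, dtype=float)
    # Add the statistical and systematic uncertainties in quadrature.
    total_err = np.hypot(np.asarray(parallax_error, dtype=float), SIGMA_SYS_MAS)
    return corrected / total_err >= min_snr


def distance_kpc(parallax, zero_point):
    """Distance in kpc from inverting the corrected parallax (given in mas)."""
    corrected = np.asarray(parallax, dtype=float) - np.asarray(zero_point, dtype=float)
    return 1.0 / corrected
```

A signal-to-noise threshold of 5 is equivalent to the 20% relative-uncertainty limit, which is why a simple parallax inversion is considered acceptable for the surviving stars.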
To make a kinematic selection of the local halo, we derived the velocities of the stars after correcting for the solar motion using $(U_\odot, V_\odot, W_\odot) = (11.1, 12.24, 7.25)$ km s$^{-1}$ (Schönrich et al. 2010) and for the motion of the local standard of rest (LSR) using a $|V_{\rm LSR}|$ of 232.8 km s$^{-1}$ (McMillan 2017). We also required $|V - V_{\rm LSR}| > 210$ km s$^{-1}$. We adopted $R_\odot = 8.2$ kpc (McMillan 2017) and imposed a distance cut of 2.5 kpc. For stars at low latitude ($|b| < 7.5$ deg), we required higher signal-to-noise ratios (S/N) for the spectra (rv_expected_sig_to_noise > 5) to avoid highly contaminated spectra and spurious velocities, following Katz et al. (2023, Sect. 9). The resulting sample contains 69 106 nearby halo stars.
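The kinematic halo selection can be illustrated with a short sketch. It assumes that heliocentric Cartesian velocities (U towards the Galactic centre, V in the direction of rotation, W towards the north Galactic pole) are already available as an (N, 3) array; the function name and this input layout are hypothetical. Adding the solar peculiar motion gives the velocity relative to the LSR, whose norm is the quantity compared with the 210 km/s threshold.

```python
import numpy as np

# Solar peculiar motion (U, V, W) in km/s, as quoted in the text.
SOLAR_PECULIAR = np.array([11.1, 12.24, 7.25])
V_CUT = 210.0  # km/s


def halo_mask(uvw_helio, v_cut=V_CUT):
    """True for stars whose velocity relative to the LSR exceeds v_cut."""
    # Velocity with respect to the LSR: heliocentric velocity plus solar motion.
    v_rel_lsr = np.asarray(uvw_helio, dtype=float) + SOLAR_PECULIAR
    return np.linalg.norm(v_rel_lsr, axis=1) > v_cut
```

In this convention a star co-moving with the Sun is rejected, while a star lagging the disc rotation by about 250 km/s is retained.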
To complement the dynamical information, we considered several sources for the stellar chemistry data. We recalibrated the Gaia DR3 GSP-SPEC [M/H] and [α/Fe] abundances according to the recipes given in Recio-Blanco et al. (2023) and we followed Gaia Collaboration (2023a) to define a 'medium quality' sample. This yields 4665 stars in our sample with a (reliable) [M/H] measurement. Additionally, our sample contains 1809 stars in APOGEE DR17 (Accetta et al. 2022) and 9797 stars in LAMOST LRS DR7 (Zhao et al. 2012), with 675 stars in common.

Methods
To identify accreted debris in the local halo, we applied a clustering algorithm to the three-dimensional (3D) integrals of motion (IoM) space of energy, along with the z and perpendicular components of angular momentum ($E$, $L_z$, $L_\perp$). We computed $E$ using the same potential as in Lövdal et al. (2022). This potential consists of a Miyamoto-Nagai disk with parameters $(a_{\rm d}, b_{\rm d}) = (6.5, 0.26)$ kpc, $M_{\rm d} = 9.3 \times 10^{10}\,M_\odot$, a Hernquist bulge with $c_{\rm b} = 0.7$ kpc, $M_{\rm b} = 3.0 \times 10^{10}\,M_\odot$, and an NFW halo with $r_{\rm s} = 21.5$ kpc, $c_{\rm h} = 12$, and $M_{\rm halo} = 10^{12}\,M_\odot$. We defined $L_z$ to be positive for prograde stars, while $L_\perp = \sqrt{L_x^2 + L_y^2}$. Whilst $L_\perp$ is not a true IoM, it is approximately conserved, and is therefore useful to identify halo substructure. We required that all stars are bound in this potential, resulting in a final nearby halo sample of 68 921 stars.
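For concreteness, a minimal sketch of the IoM computation is given below. Several choices are assumptions rather than details taken from the text: the value of G, the normalisation of the NFW potential by ln(1 + c) - c/(1 + c), a right-handed Galactocentric frame with the Sun at x = -8.2 kpc (so that the sign of L_z must be flipped to make prograde orbits positive), and all function names.

```python
import numpy as np

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun (assumed value)


def potential(x, y, z):
    """Disc + bulge + halo potential in (km/s)^2 for positions in kpc."""
    R2 = x**2 + y**2
    r = np.sqrt(R2 + z**2)
    # Miyamoto-Nagai disc
    a_d, b_d, M_d = 6.5, 0.26, 9.3e10
    phi_disc = -G * M_d / np.sqrt(R2 + (a_d + np.sqrt(z**2 + b_d**2)) ** 2)
    # Hernquist bulge
    c_b, M_b = 0.7, 3.0e10
    phi_bulge = -G * M_b / (r + c_b)
    # NFW halo; the concentration-dependent normalisation is an assumption
    r_s, c_h, M_h = 21.5, 12.0, 1.0e12
    norm = np.log(1.0 + c_h) - c_h / (1.0 + c_h)
    phi_halo = -G * M_h / norm * np.log(1.0 + r / r_s) / r
    return phi_disc + phi_bulge + phi_halo


def integrals_of_motion(pos, vel):
    """Return (E, L_z, L_perp) for (N, 3) Galactocentric positions and velocities."""
    x, y, z = np.asarray(pos, dtype=float).T
    vx, vy, vz = np.asarray(vel, dtype=float).T
    Lx = y * vz - z * vy
    Ly = z * vx - x * vz
    Lz = -(x * vy - y * vx)  # sign flipped so that prograde orbits are positive
    Lperp = np.hypot(Lx, Ly)
    E = 0.5 * (vx**2 + vy**2 + vz**2) + potential(x, y, z)
    return E, Lz, Lperp
```

Because each component of this potential tends to zero at large radius, a bound star in this sketch is simply one with E < 0.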
The clustering algorithm we used is described in detail in Lövdal et al. (2022) and Ruiz-Lara et al. (2022), where it was applied to a local halo sample from Gaia EDR3. We refer to those works for more information. It is based on the single linkage algorithm, which, at each step, joins together the two closest components until all components are linked. To determine where to stop the linkage and identify significant components and clusters, we determined, at each step, how over-dense a cluster is relative to a sample of 1000 artificial, smooth datasets obtained by re-shuffling the velocities of the stars. Re-shuffling two components of the velocities breaks the correlations in velocity space and, hence, the structure in the IoM space, while preserving the 1D velocity distributions and properties of the original dataset.
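The two ingredients described above, single-linkage merging and the construction of a smooth artificial halo, can be sketched with SciPy as follows. This is not the published implementation: the min-max scaling of the IoM axes, the choice of which two velocity components to permute, and the function names are all assumptions, and a sample of tens of thousands of stars would require a more memory-efficient linkage than this toy version.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage


def single_linkage_tree(iom):
    """Build the single-linkage merge tree of an (N, 3) array of (E, L_z, L_perp)."""
    iom = np.asarray(iom, dtype=float)
    # Put the three axes on a comparable scale before measuring distances.
    scaled = (iom - iom.min(axis=0)) / (iom.max(axis=0) - iom.min(axis=0))
    return linkage(scaled, method="single")


def reshuffled_velocities(vel, rng):
    """Artificial halo: independently permute two of the three velocity components."""
    vel = np.asarray(vel, dtype=float)
    shuffled = vel.copy()
    shuffled[:, 1] = rng.permutation(vel[:, 1])
    shuffled[:, 2] = rng.permutation(vel[:, 2])
    return shuffled
```

Each row of the returned merge tree records one linkage step, so candidate clusters can be evaluated at every step. The shuffled velocities keep the same one-dimensional distributions as the input, as required for the artificial datasets.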
In other words, we compared the number of stars, $N_{C_i}$, in an ellipsoidal region centred on each cluster, to the number of stars in our artificially generated smooth halos within the same region, $N^{\rm art}_{C_i}$. The significance is then: $S = (N_{C_i} - \langle N^{\rm art}_{C_i} \rangle)/\sigma_i$, where $\langle N^{\rm art}_{C_i} \rangle$ is the mean over the artificial datasets, $\sigma_i = \sqrt{N_{C_i} + (\sigma^{\rm art}_{C_i})^2}$, and $\sigma^{\rm art}_{C_i}$ is the standard deviation of the number of stars in the given region across all 1000 artificial datasets. Our final set of clusters was extracted at maximum significance and we kept clusters with a significance $S > 3$ and a minimum of ten members.
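A minimal numerical sketch of this significance estimate is given below. It assumes the form used by Lövdal et al. (2022), in which the excess over the mean artificial count is divided by the quadrature sum of the Poisson error on the observed count and the scatter among the artificial datasets; the function names are hypothetical and the population standard deviation is an assumed choice.

```python
import numpy as np


def cluster_significance(n_obs, n_art):
    """Significance of n_obs stars given the counts n_art in the artificial halos."""
    n_art = np.asarray(n_art, dtype=float)
    # Poisson variance of the observed count plus the variance of the artificial counts.
    sigma_i = np.sqrt(n_obs + n_art.std() ** 2)
    return (n_obs - n_art.mean()) / sigma_i


def is_significant(n_obs, n_art, s_min=3.0, n_min=10):
    """Apply the selection quoted in the text: S > 3 and at least ten members."""
    return bool(n_obs >= n_min and cluster_significance(n_obs, n_art) > s_min)
```

For example, 30 observed stars in a region where every artificial halo contains 10 would give a significance of 20/sqrt(30), which passes the threshold.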
This clustering method does not include the measurement uncertainties in IoM space; however, the uncertainties are very small and do not affect the structures found. The median uncertainties are $\sigma_E \sim 220$ km$^2$ s$^{-2}$ for the energy, and $\sigma_{L_z} \sim 11$ kpc km s$^{-1}$ and $\sigma_{L_\perp} \sim 12$ kpc km s$^{-1}$ for the angular momenta.
As demonstrated by Ruiz-Lara et al. (2022), the clusters identified by the algorithm are not necessarily physically independent from each other and can potentially be grouped together to form larger structures. To this end, we define the Mahalanobis distance between two clusters in IoM space as: $D = \sqrt{(\mu_1 - \mu_2)^T (\Sigma_1 + \Sigma_2)^{-1} (\mu_1 - \mu_2)}$, where $\mu_1$, $\mu_2$ and $\Sigma_1$, $\Sigma_2$ correspond to the means and covariance matrices of the two cluster (ellipsoidal) distributions, respectively. The value of $D$ thus gives a relative measure of how close clusters are in IoM space. This distance metric may be visualised in a dendrogram and can thus be used for a second stage of linkage between clusters. By introducing a preliminary distance cut, we can identify larger groups as well as individual clusters, which we then proceed to characterise dynamically and chemically.
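The second-stage grouping can be sketched as follows, assuming that the distance between two clusters combines their covariance matrices by summation and that clusters are joined by single linkage on the resulting distance matrix. Both points, along with the function names, are assumptions made for this illustration; the cut value is passed in as a parameter.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_distance(mu1, cov1, mu2, cov2):
    """Mahalanobis distance between two clusters described by mean and covariance."""
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)
    summed = np.asarray(cov1, dtype=float) + np.asarray(cov2, dtype=float)
    return float(np.sqrt(diff @ np.linalg.solve(summed, diff)))


def group_clusters(means, covs, d_lim):
    """Label clusters so that those linked below d_lim share a group label."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = cluster_distance(
                means[i], covs[i], means[j], covs[j]
            )
    # linkage expects the condensed (upper-triangular) form of the matrix.
    tree = linkage(squareform(dist), method="single")
    return fcluster(tree, t=d_lim, criterion="distance")
```

The same merge tree can be passed to `scipy.cluster.hierarchy.dendrogram` to visualise the relationships between clusters, with the cut shown as a horizontal line.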

Results
Figure 1 shows the distribution of stars in our halo sample in the IoM (top panels) and the velocity space (bottom panels), as well as the 91 significant clusters (in colour) identified by the algorithm. Compared to EDR3, the fraction of stars in clusters is similar (∼13%), as is their distribution in these spaces. The most striking difference is the presence of new clusters at low binding energy.
The purple and blue clusters in Fig. 1 with $v_z \sim 0$ and large $v_\phi$ have unexpected (possibly spurious) kinematics. Because their stars have $|b| < 10$ deg, exhibit higher-than-average radial velocity errors, and have spectra with S/N < 10, we suspect that their radial velocities are also unreliable (see Katz et al. 2023). We thus removed these clusters from our analysis, leaving 89 significant clusters.
Figure 2 shows a dendrogram linking the 89 clusters by Mahalanobis distance. Several large groups are formed by clusters linking at small $D$, as seen previously with EDR3 (Lövdal et al. 2022). We tentatively set a limit at a Mahalanobis distance ($D_{\rm lim}$) ∼ 3.3 (red dashed line in Fig. 2). This Mahalanobis distance is motivated by our analyses of EDR3 and what we know from the literature on the halo so far (Koppelman et al. 2019; Naidu et al. 2020). Certain regions of IoM space are sensitive to our choice of $D_{\rm lim}$. For example, imposing a $D_{\rm lim}$ larger than 3.3 linked together Sequoia and what we believe to be a separate smaller cluster, which we label as ED-3 in Fig. 3.
Fig. 2. Relationship in the IoM space between the significant clusters found, shown as a dendrogram using the Mahalanobis distance between the clusters in this space (see main text for details). Using this Mahalanobis distance, the clusters are further joined up to a cut-off threshold (red dashed line) taken to be 3.3. The labels for the clusters are given on the x-axis, and the clusters that are joined to form larger groups have links of the same colour. Using this distance cut, we find 7 preliminary main groups, 1 cluster pair and 19 individual clusters.

Main groups
Our tentative $D_{\rm lim}$ cut suggests we may identify 7 primary groups, 1 small pair of clusters, and 19 independent clusters. The majority of these groups correspond to previously identified substructures. To better characterise each of the groups and remaining individual clusters, we proceeded to add members with a Mahalanobis distance in the IoM space to each group or cluster of $D < 2.13$ (this cut corresponds to the value of $D$ containing 80% of the cluster or group members, and was found to minimise noise when adding tentative members; Lövdal et al. 2022; Ruiz-Lara et al. 2022). We also added tentative members by identifying all stars within 5 kpc from the Sun and, after applying the same quality criteria, adopting the same Mahalanobis distance cut to each group or cluster. This results in 33 233 stars (more than 3× the original number of members) in a group or individual cluster. We go on to discuss the properties of the different groups and clusters identified. The metallicity distributions (MDF) and their abundance patterns are shown in Figs. 4 and 5. The largest number of stars with a metallicity measurement stems from the LAMOST LRS set. It is reassuring to see in Fig. 4 that the MDFs obtained using GSP-SPEC and APOGEE are very similar, modulo the smaller number of stars (and possibly a small offset). We also note that the MDFs obtained with original or added members are very consistent with each other. We used the added members for characterising the MDFs (Fig. 4) and the chemical abundances (Fig. 5) of the groups, as well as the chemical information discussed for the smaller clusters in Sect. 4.2.
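The addition of tentative members can be illustrated with the sketch below, which assumes that each group is summarised by the sample mean and covariance of its original members in IoM space. The function names are hypothetical; only the threshold of 2.13 is taken from the text.

```python
import numpy as np

D_CUT = 2.13  # Mahalanobis distance cut quoted in the text


def mahalanobis_to_cluster(stars, members):
    """Distance of each star (rows of an (N, 3) array) to a cluster's distribution."""
    members = np.asarray(members, dtype=float)
    mu = members.mean(axis=0)
    cov = np.cov(members, rowvar=False)
    diff = np.asarray(stars, dtype=float) - mu
    # Row-wise quadratic form diff^T cov^{-1} diff.
    return np.sqrt(np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff))


def tentative_members(stars, members, d_cut=D_CUT):
    """True for stars lying within d_cut of the cluster in IoM space."""
    return mahalanobis_to_cluster(stars, members) < d_cut
```

A star located at the cluster mean has zero distance and is always selected, while stars far outside the cluster ellipsoid are rejected regardless of the direction of the offset.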
The largest group is Gaia-Enceladus, with 2872 stars and 36 linked clusters. These stars can be seen to trace the halo peak of the MDF (see panel 1 of Fig. 4) and they form a clear sequence in [α/Fe] space (see Fig. 5). We note that GSP-SPEC offers a slightly less clear distinction between the sequence defined by Gaia-Enceladus stars and the hot-thick disk (see also Recio-Blanco et al. 2023), which is why we separately plot, in the middle panel of Fig. 5, [Mg/Fe] vs. [Fe/H] from APOGEE. Following Horta et al. (2021), we also show [Mg/Mn] vs. [Al/Fe] based on APOGEE, which is a useful chemical space to separate more clearly accreted from in situ stars.
The second largest group is shown in cyan in Fig. 3 at low energy. This group is very similar to the Cluster 3 identified in Lövdal et al. (2022) and Ruiz-Lara et al. (2022); we therefore refer to it as L-RL3. It contains 1958 stars and is made up of three clusters. The MDF shows that this group is a mix of two populations: a high-metallicity population (akin to that of the hot-thick disk) and a well-populated low-metallicity tail (see Fig. 4). This can also be seen in the middle panel of Fig. 5, where the high-metallicity stars in this group populate both high-alpha sequences, while the low-metallicity stars seem to define a sequence parallel to that of Gaia-Enceladus, but with lower [Mg/Fe]. The mix of in situ and accreted populations is confirmed from their distributions in [Mg/Mn] vs. [Al/Fe] space.
The third-largest group corresponds to the heated (or 'hot') thick-disk stars, containing a total of 1450 stars. Shown in orange in Fig. 3, it is made up of 13 clusters, indicating a large amount of substructure (e.g., stripes in energy) in this component. Its MDF (third panel of Fig. 4) shows very little contamination from the metal-poor halo peak. The abundances seen in Fig. 5 (small orange triangles) show the characteristic high [Mg/Fe] at high metallicity of this in situ component.
Thamnos 1 and 2 can be seen as brown empty squares in Fig. 3.
Sequoia can be seen in green in IoM space in Fig. 3, consisting of 247 stars and made up of two joined clusters. Its MDF shows hints of multiple peaks (see also Naidu et al. 2020) which do not, however, appear to correspond to separate dynamical structures. Figure 5 suggests that these stars (green triangles) follow a distinct chemical sequence from other halo substructures; for a fixed metallicity, they have lower α-abundances, as can be seen from both the GSP-SPEC and APOGEE data.
One new group (we refer to it as ED-1) corresponds to the red structure seen below the 'hot' thick disk in energy in Fig. 3. It contains 246 stars and is made up of four clusters. This substructure is made up of two kinematic components with similar $v_z$ magnitude but differing signs. The negative $v_z$ group contains 146 stars with means of $(v_z, v_\phi, v_R) = (-156.6, 88.1, 29.0)$ km s$^{-1}$ and dispersions of $(\sigma_{v_z}, \sigma_{v_\phi}, \sigma_{v_R}) = (16.4, 15.7, 75.5)$ km s$^{-1}$. The positive $v_z$ group contains 100 stars with means of $(v_z, v_\phi, v_R) = (142.8, 88.6, 40.5)$ km s$^{-1}$ and dispersions of $(\sigma_{v_z}, \sigma_{v_\phi}, \sigma_{v_R}) = (13.9, 15.5, 70.8)$ km s$^{-1}$. The MDF spans a wide range of metallicities and appears to exhibit several metallicity peaks, roughly corresponding to the hot-thick disk, Gaia-Enceladus, and a more metal-poor, relatively prominent peak at [Fe/H] ∼ −1.8. The abundances reveal stars located in both the accreted and in situ regions of chemical space.

Remaining individual clusters
Of the 19 clusters left ungrouped with our D lim cut, most have only ten (or fewer) stars with metallicities (even after adding members within 5 kpc) and therefore we do not show their MDFs.
The tight pair and the three small clusters located between Gaia-Enceladus and the hot-thick disk in $L_z$−$L_\perp$ display metallicities and abundances that are mostly consistent with being in situ. Five small clusters overlap in IoM space with the region occupied by Gaia-Enceladus (see Fig. 3); however, the small number of stars with metallicity information makes any association inconclusive. The lowest energy cluster shown in Fig. 3 corresponds to the globular cluster M 4.
The most retrograde cluster (in blue in Fig. 3, indicated as L-RL64) contains 59 original members, with 8 stars having LAMOST LRS metallicities with a mean of −1.67. It has a higher energy than Sequoia and its kinematics are also clearly distinct. This cluster has been identified before by Ruiz-Lara et al. (2022, as cluster 64, see their Fig. 11, and re-discovered by Oria et al. 2022), where it was argued to be independent given Sequoia's estimated mass (from its mean metallicity), which would be inconsistent with such a broad extent in the IoM space (Koppelman et al. 2019). The three stars with APOGEE abundances have lower [Mg/Fe] than Gaia-Enceladus.
There is another smaller cluster located close to Sequoia in IoM space (ED-2, in pink in Fig. 3). Of its 32 original member stars, only 3 have a LAMOST LRS metallicity, but the values are similar, namely [Fe/H] = −1.88, −2.07, and −2.19. This cluster is extremely tight in velocity space, as can be seen in the bottom row of Fig. 3, and it appears to form a stream in the x−z space.
The last small, highly retrograde cluster located in this region of IoM space can be seen in Fig. 3 in orange. ED-3 contains 16 original member stars, of which 2 have LAMOST LRS [Fe/H] of −1.48 and −1.47, and 1 has [Fe/H] = −1.37 from APOGEE, also suggesting a rather small spread in metallicity. Intriguingly, the APOGEE star has a low [Mg/Fe], but it is located in the in situ part of the [Al/Fe]-[Mg/Mn] space.
Two clusters are located in Fig. 3 directly above the Helmi streams in energy and $L_\perp$, at similar $L_z$. The cluster with the lower energy, ED-4, contains 29 original member stars and has 5 stars with a LAMOST LRS metallicity ranging from <−1.6 to −1.05, and 2 stars with a low [Mg/Fe] abundance in APOGEE, suggesting an accreted origin. The cluster just above ED-4 in IoM (cyan in Fig. 3) overlaps with the recently reported Typhon (Tenachi et al. 2022). It contains 12 stars, 4 of which have a very similar LAMOST LRS metallicity, namely −1.23, −1.24, −1.32, and −1.38. Having stars with such similar metallicities makes this a very interesting cluster. Unfortunately, we do not have any abundance information for these stars.
Located at high energy and with retrograde motion, ED-5 (in yellow in Fig. 3) contains 12 stars. The LAMOST LRS metallicities for three members are −1.46, −1.21, and −1.22, with the latter two stars having the same metallicity in APOGEE. These two APOGEE stars are on the low-α track and both fall within the accreted region of the [Al/Fe]-[Mg/Mn] space. Therefore, this is a potentially interesting cluster to follow up on in the future.
We dub the cluster above it in IoM space ED-6 (shown in pink in Fig. 3). It contains 10 original members, 6 of which have a LAMOST LRS metallicity; these show a very small spread around −1.3, except for 1 star at −0.86. One overlapping star has an APOGEE abundance, placing it on the boundary between accreted and in situ in [Al/Fe]-[Mg/Mn] space, but this is the outlier star with [Fe/H] of −0.86 dex.
There are three remaining clusters: one at low energy overlapping with L-RL3 in IoM space, one overlapping with Thamnos, and one close to the Helmi streams in $L_\perp$. None of these have sufficient metallicity information to allow for further commentary. In summary, in this section, we have identified 7 main groups. Of the preliminary 19 individual clusters, 8 can be tentatively associated with the larger groups.

Discussion and conclusions
We constructed a sample of dynamically selected nearby halo stars based on the Gaia DR3 dataset. Using a single-linkage-based algorithm (and thanks to the excellent quality of this dataset), we have identified 89 clusters in Integrals of Motion space. By grouping these according to their Mahalanobis distance in this space, and by subsequently comparing the metallicities of the member stars using data from GSP-SPEC, APOGEE, and LAMOST LRS, we have been able to identify 7 groups and 11 individual independent clusters. Out of the 7 large groups, 6 have already been reported in the literature, namely Gaia-Enceladus, the hot-thick disk, Thamnos, Sequoia, Helmi streams, and L-RL3 (see e.g., Ruiz-Lara et al. 2022). The seventh, ED-1, is a new dynamical group and it contains a mix of populations. It probably includes contamination from the hot-thick disk and from Gaia-Enceladus, but its MDF reveals a peak at low metallicity, while the abundances of some of its stars suggest an accreted origin.
Of the 11 remaining independent clusters, 2 have been reported before: the globular cluster M4 and L-RL64 (Ruiz-Lara et al. 2022). A third cluster was reported as Typhon (Tenachi et al. 2022) at the time of writing. These three clusters do not have sufficient metallicity or abundance information to provide richer commentary. The five new clusters (ED-2 to ED-6) are all interesting in different ways: most are rather tight dynamically (especially ED-2) and some show a very small spread in metallicities (ED-2, ED-3, and tentatively ED-5 and ED-6), while all of them appear to have been accreted based on their location in the IoM space and the chemical abundances of a few member stars.
Given the complexity of the debris from large accretion events (see e.g., Koppelman et al. 2020; Amarante et al. 2022), some of these smaller clusters may ultimately be related to one another or to the larger groups. To make progress towards our goal of inferring the assembly history of the Milky Way, we need to probe beyond the immediate solar vicinity and especially to obtain metallicities and more precise chemical abundances for a greater number of stars.

Fig. 1. Members of the 91 significant clusters identified by our algorithm in our stellar halo sample, where different colours indicate a star's association with a cluster. Stars not attributed to a cluster are shown in the background in grey. The top and bottom rows show the IoM and velocity space, respectively. The relation between the clusters is given as a dendrogram in Fig. 2 and the joined groups in the same spaces in Fig. 3.

Fig. 3. Members of our joined groups and individual clusters, where different colours indicate a star's association.Stars that are not part of clusters or groups are shown in the background in grey.The top and bottom rows show the IoM and velocity space, respectively.

Fig. 4. Metallicity distributions for the 7 groups identified in this study that have metallicity information. Different line-styles and shades of colour are used for the different surveys: LAMOST LRS, APOGEE, and GSP-SPEC. The grey histogram in the background shows the LAMOST LRS metallicities of the entire halo sample, normalised for comparison with each group.