Free Access
Issue
A&A
Volume 645, January 2021
Article Number A69
Number of page(s) 17
Section Galactic structure, stellar clusters and populations
DOI https://doi.org/10.1051/0004-6361/202038178
Published online 15 January 2021

© ESO 2021

1. Introduction

Galactic cartography has gained a huge boost with the advent of astrometric, photometric, and spectroscopic data of unprecedented volume and quality from the Gaia mission (Gaia Collaboration 2016, 2018b), in combination with spectroscopic surveys such as APOGEE (Wilson et al. 2010; Abolfathi et al. 2018), RAVE (Kunder et al. 2017), and LAMOST (Cui et al. 2012). These datasets have revealed numerous substructures in different Galactic components. For example, large wave-like patterns and sharp ridges have been found in the stellar disc (Antoja et al. 2018; Kawata et al. 2018; Katz et al. 2018). These arches give rise to intricate structures in action-space (Trick et al. 2019), and their imprint varies with location in the disc (Ramos et al. 2018). They are likely due to a combination of perturbations of a satellite flying by (Minchev et al. 2009; Antoja et al. 2018; Laporte & Minchev 2018) and internal dynamical processes such as resonances with the bar and spiral structures (Monari et al. 2019; Hunt et al. 2019; Khanna et al. 2019; Chiba et al. 2019).

Studies, such as those mentioned above, show how complex and intertwined the phase-space structure of the Milky Way is. The expectation has been that the Galactic halo would be similarly complex as a result of the mergers experienced by the Milky Way (Helmi & White 1999). In fact, the analysis of Gaia DR2 has revealed that besides several small features, the local stellar halo is largely dominated by two structures, including a large, radially-anisotropic, slightly retrograde kinematic structure and a hot thick disc (Belokurov et al. 2018; Koppelman et al. 2018; Haywood et al. 2018). The origin of the former is related to the accretion of a massive (M ∼ 109M) dwarf galaxy known as Gaia-Enceladus (Helmi et al. 2018, or Gaia-Sausage, c.f. Belokurov et al. 2018). The accretion of Gaia-Enceladus took place 9–11 Gyr ago (Helmi et al. 2018; Di Matteo et al. 2019; Mackereth et al. 2019; Chaplin et al. 2020) and lead to heating of the proto-Milky Way disc (Helmi et al. 2018; Gallart et al. 2019; Chaplin et al. 2020).

A large fraction of the results listed above stem from the Gaia sub-sample containing full velocity information (Katz et al. 2019), also known as the 6D sample. Although impressive in size, the 6D sample is small relative to the 5D sample (i.e., without line-of-sight velocities). It comprises ∼7 million sources compared to a staggering ∼1.3 billion sources with parallaxes and proper motions. Most stars in Gaia DR2 have relatively large errors in their parallax measurements, as “only” ∼150 million sources have distances with relative errors <20%. Also the Gaia DR2 parallaxes are known to suffer from a zero-point offset of ∼−0.029 mas with a root mean square (RMS) of 0.03−0.05 mas, which varies with the location on the sky (Arenou et al. 2018; Gaia Collaboration 2018b; Lindegren et al. 2018). For brighter stars, the typical off-set may be closer to ∼0.05 mas (Schönrich et al. 2019; Leung & Bovy 2019; Zinn et al. 2019a; Chan & Bovy 2019). The poorer parallaxes in combination with the missing line-of-sight velocities therefore complicate the utilisation of the entire Gaia DR2 sample for dynamical studies.

A general approach to cope with poorly constrained or missing distances is to use the luminosity of the sources that are known as “standard candles”, such as RR-Lyrae. Their period-luminosity relation can be used to derive distances that are typically accurate up to ∼5%. This makes RR-Lyrae outstanding targets to study the morphology of the disc and stellar halo (e.g., Watkins et al. 2009; Sesar et al. 2010; Drake et al. 2012; Hernitschek et al. 2018; Iorio & Belokurov 2019; Zinn et al. 2019b). The downside of RR-Lyrae stars is that they are not abundant. Another approach is to identify stars of a specific type such as blue horizontal-branch (BHB) stars (Xue et al. 2008, 2011; Deason et al. 2012; Fukushima et al. 2017; Lancaster et al. 2019), or red-giant branch (RGB) stars (Morrison et al. 2009) whose absolute magnitude can be derived using isochrones when knowledge of their surface gravity (log g) or metallicity is available, such as from spectroscopic surveys (e.g., Leung & Bovy 2019; Conroy et al. 2019; Cargile et al. 2019).

In this paper, we use main sequence (MS) stars to study the Milky Way halo. The reason for focusing on MS stars is twofold. Firstly, they follow a relatively simple absolute magnitude relation as a function of colour, which can be used to calculate a photometric distance (e.g., Juric et al. 2008; Ivezic et al. 2008; Bonaca et al. 2012). Secondly, we can select them using only their photometry and proper motions (i.e., without knowing the distance to the stars) through a property known as the reduced proper motion (e.g., Jones 1972; Smith et al. 2009).

We describe the methods that we use in Sect. 2, and the selection of MS halo stars and calibration of their distances in Sect. 3. In Sect. 4 we explore the spatial distribution of the halo stars in the sample. We combine the spatial coordinates with the proper motions of the stars to calculate tangential velocities. In Sect. 5 we inspect the velocities of the full sample and in Sect. 6 we focus on the velocity distribution of a local sample. Finally, in Sect. 7 we summarise and discuss our results, and present our conclusions.

2. Methods

In this section, we describe the tools that we require to select halo MS stars and to calibrate photometric distances. The description of the data, selection, and calibration will be carried out in Sect. 3.

2.1. Photometric distance estimates for MS stars

Distances are notoriously difficult to measure in astronomy. Only about ∼10% of the parallaxes released in Gaia DR2 are precise (i.e., those with parallax_over_error >5). Another way of calculating distances is through the luminosity of a star. For specific types of stars, for which the intrinsic luminosity is known, we can calculate a photometric distance from the apparent luminosity. The relation between the intrinsic and apparent magnitude of a star in the Gaia G-band is given by

M G = m G 5 log 10 ( d kpc ) 10 A G , $$ \begin{aligned} M_G = m_G - 5\log _{10}\left(\frac{d}{\mathrm{kpc}}\right) - 10 - A_G, \end{aligned} $$(1)

where MG is the absolute magnitude of the star, mG is its apparent magnitude, d is its distance, and AG is the extinction in the G-band. For most sources, MG is unknown and Eq. (1) cannot be used to calculate a distance d.

In this work, we use the close to linear relation of the colour and absolute magnitude of MS stars to derive a distance. We note that the MS is only approximately linear in optical pass-bands, and in the near-infrared this approximation breaks down (e.g., Fig. 9 of Gaia Collaboration 2018a). Because of the strong correlation between MG and colour of the MS in the Hertzsprung-Russel diagram (HRD) we can find the distance independent relation

M G = f ( G G RP ) . $$ \begin{aligned} M_G = f(G-G_{\rm RP}). \end{aligned} $$(2)

We use the Gaia G − GRP colour because it is less prone to systematic effects than GBP − GRP, especially in crowded fields (Gaia Collaboration 2018b). Because G − GRP is distance independent, we can use Eq. (1) to calculate a distance that is a function of the apparent magnitude and colour only

d = 10 ( m G M G 10 A G ) / 5 . $$ \begin{aligned} d = 10^{(m_G - M_G - 10 - A_G)/5}. \end{aligned} $$(3)

When propagating the error in d, assuming that the error in mG can be neglected, we find that the relative distance error is

ϵ d / d = 0.2 log ( 10 ) ϵ M G , $$ \begin{aligned} \epsilon _d/d = 0.2\log (10) \epsilon _{M_G}, \end{aligned} $$(4)

where ϵMG is the error in the absolute magnitude.

2.2. Selecting MS stars

The method of determining distances that is described above is valid for MS stars. Giants and stars at the MS turn-off (MSTO) describe a sequence in the HRD that is too vertical or degenerate to find a reliable relation between MG and colour. Therefore, we need a (distance independent) way of selecting MS stars.

We identify MS stars using a combination of the proper motion and photometry known as a reduced proper motion (RPM, Jones 1972, see also Smith et al. 2009). The RPM of a star, in the Gaia G-band, is defined as

H G m G + 5 log 10 ( μ mas yr 1 ) 10 A G , $$ \begin{aligned} H_G \equiv m_G + 5\log _{10}\left(\frac{\mu }{\mathrm{mas\,yr}^{-1}}\right) - 10 - A_G, \end{aligned} $$(5)

where μ = μ 2 + μ b 2 $ \mu = \sqrt{\mu_\ell^2+\mu_b^2} $ is the amplitude of the proper motion. This equation is similar to Eq. (1) and the two can be combined to gain some insight

H G = M G + 5 log 10 ( v tan 4.74057 km s 1 ) , $$ \begin{aligned} H_G = M_G + 5\log _{10}\left(\frac{v_\mathrm{tan} }{4.74057 \,\mathrm{km\,s}^{-1}}\right), \end{aligned} $$(6)

where vtan is the tangential velocity of a star given by

v tan = 4.74057 km s 1 ( μ mas yr 1 ) ( d kpc ) , $$ \begin{aligned} v_\mathrm{tan} = 4.74057\,\mathrm{km\,s}^{-1} \left(\frac{\mu }{\mathrm{mas\,yr}^{-1}}\right) \left(\frac{d}{\mathrm{kpc}}\right), \end{aligned} $$(7)

where d is the heliocentric distance to the star.

When plotted as a function of colour, the HG−colour diagram (which we refer to as the RPM diagram) of a stellar population is equal to the HRD but with an offset that depends on vtan. If all the stars in the population have the same vtan, their sequence in RPM diagram and HRD will look exactly the same. However, if the stellar population has a mean vtan plus a few km s−1 dispersion, its sequence in the RPM diagram will be broadened by the logarithm of the velocity dispersion. Furthermore, populations with characteristic, specific tangential velocities will split into parallel sequences.

We exploit this splitting of the MS to select halo stars by identifying the region where the MS stars with high vtan are located. Since the vtan for the disc is small, even when considering the dispersion, the halo should appear as a separate sequence. Eqs. (5) and (6) imply that, for fixed vtan populations, HG will only be a function of G − GRP and can readily be computed. At the core of our selection method, we aim to locate high-vtan, MS stars in the RPM diagram.

This type of selection is conceptually not too different from a kinematic selection performed in the Toomre diagram, which is often used to identify halo stars in the 6D sample of Gaia (e.g., Nissen & Schuster 2010; Bonaca et al. 2017; Posti et al. 2018; Koppelman et al. 2018). Reduced proper motion diagrams are often used to select white dwarfs and classify them as belonging to the halo or the disc (e.g., Kalirai et al. 2004; Kilic et al. 2006; Fusillo et al. 2015; Torres et al. 2019; Geier 2020). However, it is important to note that the RPM selection has a clear bias: halo stars with a small tangential velocity will not be selected (e.g., halo stars moving only along the line-of-sight).

3. Data selection and calibration

3.1. Data and quality cuts

We start from the full sub-sample of Gaia DR2 with multi-band photometry. For the method outlined in Sect. 2, we have to rely on the photometry of the sources. Therefore, we impose the several cuts on the photometric quality of the stars. Firstly, we select stars with phot_g_mean_flux_over_error > 50 and phot_rp_mean_flux_over_error > 20. Secondly we require stars to have phot_bp_rp_excess_factor < 1.3 + 0.06·(bp_rp)2 and phot_bp_rp_excess_factor > 1.0 + 0.015· (bp_rp)2, where bp_rp = phot_bp_mean_mag - phot_rp_mean_mag. Besides cleaning the photometry, these cuts also remove sources with a bad astrometric solution (see Arenou et al. 2018).

We use the re-normalised unit weight error (RUWE) to further clean the sample. When DR2 came out, the Gaia Data Processing and Analysis Consortium (DPAC) recommended using the unit weight error (UWE) to filter sources with a bad astrometric solution (e.g., Lindegren et al. 2018; Arenou et al. 2018). However, the original UWE varies with colour and magnitude. Therefore, DPAC (Lindegren 2018) recommends the use of the re-normalised UWE (RUWE), which does not depend on stellar properties. Following their suggestion, we remove all the stars that have RUWE > 1.4.

Independently of the quality of the photometry, our sources are prone to extinction caused by absorption from dust in the interstellar medium (ISM) along the line-of-sight. We correct for the extinction using the Schlegel et al. (1998) maps.

These maps provide the extinction factor integrated along the entire line-of-sight. For distant stars we assume that all the absorbing ISM clouds lie in the foreground. However, for nearby stars we have to be careful, as some of the extinction in the Schlegel et al. maps might come from regions in the ISM behind the stars. Therefore, we use the approach outlined by Eqs. (10) and (11) from Binney et al. (2014) to calculate the amount of foreground dust for each star as a function of its parallax and the location on the sky. The amount of extinction is given by

A V ( , b , s ) = A V , ( , b ) 0 s ρ [ x ( s ) ] d s 0 ρ [ x ( s ) ] d s , $$ \begin{aligned} A_V(\ell ,b,s) = A_{V,\infty }(\ell ,b) \frac{\int _0^s \rho [\boldsymbol{x}(s^{\prime })] \mathrm{d}s^{\prime }}{\int _0^\infty \rho [\boldsymbol{x}(s^{\prime })] \mathrm{d}s^{\prime }}, \end{aligned} $$(8)

where AV, ∞(, b) is the extinction given by the Schlegel et al. maps, s is the heliocentric distance to the star, and x is the position vector of the star that lies at distance s in the direction of (,b) on the sky. For the dust density we follow the model presented by Eq. (16) of Sharma et al. (2011)

ρ Dust ( R , z ) = exp ( R R h R | z z warp | k flare h z ) $$ \begin{aligned} \rho _{\rm Dust}(R,z) = \exp {\bigg (\frac{R_\odot -R}{h_R}} {-\frac{|z-z_{\rm warp}|}{k_{\rm flare}h_z}\bigg )} \end{aligned} $$(9)

where zwarp and kflare describe the warping and flaring of the disc

k flare ( R ) = 1 + γ flare Min ( R flare , R R flare ) $$ \begin{aligned} k_{\rm flare}(R) = 1+\gamma _{\rm flare}\mathrm{Min}(R_{\rm flare},R-R_{\rm flare}) \end{aligned} $$(10)

and

z warp ( R , ϕ ) = γ warp Min ( R warp , R R warp ) sin ( ϕ ) $$ \begin{aligned} z_{\rm warp}(R,\phi ) = \gamma _{\rm warp}\mathrm{Min}(R_{\rm warp},R-R_{\rm warp})\sin {(\phi )} \end{aligned} $$(11)

with values hR = 4.2 kpc, hz = 0.088 kpc, γwarp = 0.18 kpc−1, Rwarp = 8.4 kpc, Rflare = 1.12 R, and γflare = 0.0054 kpc−1, which are based on the model of Robin et al. (2003). Effectively, we derive in this way an extinction fraction A V ( b , , s ) A V , ( b , ) $ \frac{A_V(b,\ell,s)}{A_{V,\infty}(b,\ell)} $ which encodes what fraction of the full extinction correction should be applied.

Following Binney et al. (2014), we scale the Schlegel et al. maps because the reddening in the regions E(B − V) > 0.15 is overestimated (e.g., Arce & Goodman 1999). The correction factor that we apply is

f ( E ( B V ) ) = 0.6 + 0.2 [ 1 tanh ( E ( B V ) 0.15 ) 0.3 ) ] . $$ \begin{aligned} f(E(B-V)) = 0.6 + 0.2\bigg [1-\tanh {\bigg (\frac{E(B-V)-0.15)}{0.3}}\bigg )\bigg ]. \end{aligned} $$(12)

This factor scales the highly reddened regions by a factor of 0.6. We note that the results that are presented in this work are not affected by this scaling.

For stars with parallaxes smaller than 0.1 mas we apply the full extinction fraction because they are likely to be distant stars. For all the other stars we invert the parallaxes to obtain an estimate for the distance. We ignore the error on the parallax because we only are looking for an estimate of the distance. On average, the parallax-errors increase with heliocentric distance. Nearby sources, for which the correction fraction is essential, will have relatively good parallaxes. The correction fraction is larger than 0.90 for more than 90% of the sources, only 1.6% of the stars receive a correction of smaller than 0.50. The resulting weighted AV values are transformed to AG, ABP, and ARP using the relations given by Malhan et al. (2018; they originate from the Padova model1 and are originally based on Cardelli et al. 1989).

As a final quality selection criterion we remove sources located in areas on the sky where the extinction is larger than AV >  2.0. These highly extinct sources are mostly found close to the plane of the disc, so the cut acts as a filter for the Galactic disc.

3.2. Fitting the MS

Our method is contingent upon having a reliable fit for the absolute magnitude of halo MS stars as a function of colour. Gaia Collaboration (2018a) have shown that tangential velocities can be used to identify nearby halo stars (i.e., vtan >  200 km s−1). This set of high-vtan stars is characterised by two sequences in the HRD, which are known as the blue and red sequence (e.g., Gaia Collaboration 2018a). The red sequence is kinematically reminiscent of the slower-rotating, hotter tail of the thick disc and the blue sequence of a classic halo that has a close to zero rotation (e.g., Koppelman et al. 2018; Haywood et al. 2018; Di Matteo et al. 2019; Gallart et al. 2019).

For the purpose of mapping the Galactic halo, we are mainly interested in the stars in the blue sequence. We expect the red sequence to be more important closer to the plane of the disc and in the inner Galaxy. It is currently not possible to avoid contamination from the red sequence (metallicity information could help, e.g., Gallart et al. 2019). Its stars are, both kinematically and photometrically, too similar to the stars in the blue sequence. However, for the fitting, we strive to keep the contamination from the red sequence to a minimum. One way of doing this is to increase the cut in vtan to a larger value. However, even at vtan > 300 km s−1 the contribution of the red sequence is still ∼22% (Sahlholdt et al. 2019).

Figure 1 shows an HRD of all the stars in Gaia DR2 that remain after imposing the quality cuts described in Sect. 3.1 and two additional criteria: (parallax_over_error > 50) & (vtan > 200 km s−1). For illustrative purposes we use here vtan > 200 km s−1 rather than 300 km s−1 because this brings out the two sequences better. However, for the fitting procedure we use vtan > 300 km s−1.

thumbnail Fig. 1.

Hertzsprung-Russel diagram (HRD) of a local sample of stars with large tangential motions (vtan > 200 km s−1) and very high-quality parallaxes (parallax_over_error > 50). Overlayed is an 11 Gyr age and [M/H] = −0.5 metallicity isochrone (red). This isochrone is shifted to the left by 0.01 mag in G − GRP to split the two sequences that are shown. The inset shows the two sequences without the red line overplotted to make the gap between them appreciable.

After the vtan cut we impose a MG-colour cut to remove the last bit of the red sequence contamination. For this cut, we use a synthetic isochrone that was first used by Gaia Collaboration (2018a) to describe the red sequence. The isochrone, overlayed in Fig. 1, describes a stellar population of a metallicity and age of [M/H] = −0.5 and 11 Gyr. This isochrone is obtained from Marigo et al. (2017), after enhancing the α elements by 0.23 (Salaris et al. 1993). The specific isochrone is chosen because it matches well with the red sequences as shown by Gaia Collaboration (2018a), and it has the alpha-enhancement characteristic of the halo. We note that this isochrone is not fitted, but simply describes well the valley between the two sequences when shifted by 0.01 mag in G − GRP. All the stars to the right of the isochrone (i.e., those belonging to the red sequence) are removed. As a final quality cut we remove a handful of sources that are offset from (i.e., are below) the MS. For this cut we remove the stars with ((G − GRP <  0.65) & (MG > 8)) OR ((GGRP < 0.8) 8)) & (MG >  10)).

For the final part of this section, we fit the cleaned MS in two ways: a simple three-component, piece-wise linear fit and a more accurate fit using a running mean and standard deviation of the MS. The three-component fit has a straight forward parametrisation, which is ideal for the construction of the MS selection. We select the MS stars from the full dataset of Gaia, so a computationally efficient parametrisation is beneficial. On the other hand, the running mean is the most accurate fit, which is crucial for the calculation of the photometric distances. We have tested using a synthetic isochrone like the one overlaid in Fig. 1 instead of fitting the MS. However, the isochrone does not perfectly trace the MS over the full colour range. The fit on the data is more precise, given that there are enough stars per bin.

The three components of the linear fit describe the MS in the absolute-magnitude ranges: (4 <  MG <  6), (6 <  MG <  8), and (MG >  8). For the running mean we split the MS into 128 bins in the range of 0.35 <  (G − GRP) < 1.1 and remove stars with (MG <  4), which is roughly where the MSTO occurs. In each colour-bin, we calculate the mean absolute magnitude and the standard deviation. The resulting fit closely describes the width and amplitude of the absolute magnitude as a function of G − GRP. Both fits are run on a sample of high-vtan stars with good parallaxes: (parallax_over_error > 50) & (vtan > 300 km s−1).

Figure 2 shows both the three-component linear fit (red, dashed) and the running mean (blue). In the background, we show the sample that is fitted. The two fitting procedures agree well with the data. The top panels show the MS sample that was used in for the fitting before (left) and after (right) the photometric cleaning described in this section.

thumbnail Fig. 2.

Top panels: raw data (left) and the cleaned data (right) that are used for the fit. Middle: two fits of the main sequence (MS). Bottom: error in distance is related to the measured thickness of the MS. The dashed lines indicate the range where the MS is reliable enough to calibrate photometric distances.

3.3. Inferring distances for MS stars

We use the running-mean-fit from Sect. 3.2 to calibrate photometric distances. For each star we find the colour-bin in which it falls. The bin sizes are sufficiently small so we do not perform any interpolation. We assume that the absolute magnitude of the star is the same as the mean value found for the specific bin. Using Eqs. (3) and (4), we calculate the distance and its relative error.

The bottom panel of Fig. 2 shows the typical error, based on the width of the MS in absolute magnitude. Overall, the expected error in the photometric distance is quite small, averaging 7% for a large fraction of the MS. The relative distance error of 10% given for stars at the MS turn-off (0.35 <  G − GRP <  0.45) is somewhat misleading. Since the sequence here is close to vertical, the range in possible magnitudes is larger than what is indicated by the error bars. The fit of the MS aims to trace the faint part of the MSTO, since we do not fit for stars brighter than MG <  4. The fitted absolute magnitude is comparable to or smaller than the true absolute magnitude. Therefore, distances for MSTO stars that are far away from the fitted sequence are typically underestimated and on average not overestimated. For comparison, if the intrinsic brightness of a source is underestimated by one magnitude, which is typical for the vertical extent of the MSTO, the distance will be underestimated by 37%. Another systematic bias that is not included in the relative error is the difference of half a magnitude that is typical for the offset between the red and blue sequence. This offset results in distances for red-sequence stars that are systematically underestimated by 20%.

For faint MS stars, at G − GRP ≈ 0.7, there is a break in the MS. This break is a known feature in the faint MS for low-mass stars (Saumon et al. 1994; Cassisi et al. 2000). It is caused by a low effective temperature in combination with collisionally induced absorption. The feature is observed in globular clusters (Bono et al. 2010, who use it to determine the age of NGC 3201) and in the Galactic bulge (Zoccali et al. 2000). The width of the MS increases in the region redder than this break. As a result, the photometric distances are less reliable for stars with G − GRP >  0.715. The brightest of these (MG ∼ 8) are only visible out to ∼4 kpc (assuming a Gaia limiting magnitude of 21 in the G-band). Our interest in the halo lies mainly with stars more distant than 4 kpc and with stars with reliable distances. Therefore, when using the newly calibrated photometric distances, we often constrain ourselves to stars in the range of 0.45 <  G − GRP <  0.715 only. This range is indicated in the bottom panel of Fig. 2 with vertical dashed lines. Of course, the distance calibration described in this section only works for MS stars. Therefore, the last step in the construction of our sample is to remove contamination from other types of stars (e.g., giants and white dwarfs).

3.4. Selecting MS stars

The method of selecting halo MS stars in the RPM diagram is best understood when visualised. Figure 3 shows the RPM as a function of colour for the full Gaia data set after imposing the quality cuts from Sect. 3.1. To find the location of MS stars with a large tangential velocity we place the three-component linear fit from Sect. 3.2 on top of the density map in Fig. 3. We add the vtan-based offset from Eq. (6) to the line. The upper line of the box in Fig. 3 is given by the fit plus an offset of 200 km s−1, the lower line has an offset of 800 km s−1. Therefore, the stars that fall between the horizontal lines are those that, based on their location in the RPM diagram, are on the MS and have a velocity in the range of 200 km s−1 <  vtan <  800 km s−1. The vertical lines of the box are set to 0.35 <  G − GRP <  1.1. The blue limit (0.35 <  G − GRP) is chosen because there are no MS stars bluer than this in the halo, see for example Fig. 1. At the other end, we truncate the selection at G − GRP = 1.1 because this is approximately where the MS ends. We note that the truncation is beyond the red limit where the distances become less reliable. However, these stars are still likely halo stars and therefore we add them to the sample.

thumbnail Fig. 3.

Reduced proper motion diagram for all sources in Gaia that pass the quality cuts of Sect. 3.1. The sample of sources within the red lines are selected to be tentatively halo stars. The red lines are drawn based on the three-component fit of the MS. The location of the box is chosen to select halo stars with 200 km s−1 < vtan < 800 km s−1.

3.5. Final quality checks

3.5.1. Removing white dwarfs

Figure 4 shows the HRD of a subset of the RPM sample with parallax_over_error >5, where the MG is calculated using the parallaxes. The red line is the three-component fit that is also shown in Fig. 2, and the blue is this same line, but offset by 2 mag in MG. Below the blue line, there is contamination from faint sources. Amongst these sources is a population of white dwarfs, visible at 12.5 <  MG <  15. The purpose of the blue line is to filter this contamination, that is, we remove all the sources below the line (and have parallax_over_error >5).

Based on the number of sources with reliable parallax_over_error > 5, we estimate that the contamination is at most 0.2%. White dwarfs are intrinsically faint objects, the brightest few in our sample have an absolute magnitude of MG ∼ 13. As a result, we expect no contamination of white dwarfs farther out than ∼0.40 kpc, beyond which they fall below the magnitude limit of Gaia. Because white dwarfs can only be observed in the close vicinity of the Sun, we expect that all of them have relatively good parallaxes. Therefore our filter shown in Fig. 4 should remove all contamination from white dwarfs.

thumbnail Fig. 4.

HRD of sources in the MS selection with precise parallaxes (parallax_over_error >5). The three-component fit of the MS of Sect. 3.2 is shown as a red, dashed line and with an offset of 2 mag as blue line. Contamination of white dwarfs and other faint sources is reduced by removing all the stars below the blue line.

Giants are an unlikely source of contamination because of their intrinsic brightness and even the uncertainties in the proper motions or photometry are not large enough. Therefore, we estimate that the fraction of contamination of non-MS stars in our final RPM sample is negligible.

3.5.2. Quality of photometric distances

We test the quality of the photometric distances in Fig. 5, which shows photometric distance quality for the same sample of stars used for fitting the MS. The quality is displayed as a function of the colour G − GRP, the amplitude of the proper motion, and G-magnitude. The vertical dashed lines in the top right panel of Fig. 5 indicate the limit for reliable distances that is described in Sect. 3.3. The stars that are outside of this range (in grey) are not included in the other panels. The typical parallax-distance error in the sample that is shown can be neglected, it is ≲2%. These figures confirm that there is no dependence and that the distance derivation works properly. The photometric distances agree well with the parallaxes. In each panel of Fig. 5, we overlay two blue horizontal dashed-lines that correspond to photometric distances that are 10% off from the parallaxes.

thumbnail Fig. 5.

Inspection of the quality of the photometric distances compared to the parallax (top, left), and as a function colour (top, right), proper motion (bottom, left), and extinction corrected G-magnitude (bottom, right). As a reference, we overlay dashed lines indicating a difference of ±10% in the distance. The quality of the photometric distances does not depend on any of the values showed here, as is to be expected.

3.5.3. Cross-matches with spectroscopic surveys

As a last check before investigating the properties of the RPM halo sample, we briefly explore the chemical properties of stars cross-matched to existing spectroscopic surveys. Figure 6 shows the results of the overlap of the RPM sample with APOGEE DR16 (Ahumada et al. 2020) and with LAMOST DR4 (Cui et al. 2012).

thumbnail Fig. 6.

Chemical abundances (left) and metallicities (right) of stars included in the RPM sample as derived from cross-matching to APOGEE DR16 and to LAMOST DR4. In both panels, we denote the distribution of the full samples provided by each of these surveys, after quality cuts, in black, while in blue we denote the subset that overlaps with the RPM sample. The vast majority of these stars are clearly all metal-poor, as expected for halo stars.

For the comparison with APOGEE we plot in the left panel of Fig. 6 the characteristic [Mg/Fe] vs [Fe/H] diagram, for the full dataset in black and with the halo RPM stars in blue. We show here only stars that pass the following quality criteria: VERR < 0.2, SNR > 70, TEFF > 3700, STARFLAG bitmask values of 0, 3, or 4 and ASPCAPFLAG bitmask values of 10 or 23 where SNR is the catalogue parameter for the signal-to-noise ratio (these criteria are based on Hayes et al. 2020). This figure shows that stars overlapping the RPM sample are all metal-poor and mostly lie on the halo sequence. Only a handful of stars lie on the high-α disc sequence. These stars are most likely part of the “hot” thick disc (i.e., the “red sequence” discussed in Sect. 3.3). These stars have halo kinematics and are unavoidable in a kinematically selected halo sample such as the RPM sample explored here.

The panel on the right in Fig. 6 shows the metallicity distribution for LAMOST in black, and for the overlap with the RPM sample in blue. In this case we have imposed the following quality cuts: snri > 20 and snrg > 20. Again, the RPM subset is clearly more metal-poor than the full sample. The overlapping subset (blue curve) shows a secondary bump at [Fe/H] ∼ −0.8 that most likely corresponds to stars on the “hot” thick disc sequence that are also visible in the cross-match with APOGEE. This interpretation is supported by the fact that when using the LAMOST line-of-sight velocity measurements, stars with [Fe/H] > −0.9 have more disc-like kinematics than the more metal-poor stars (which all have halo-like motions). This result is fully consistent with the analyses of the halo and “hot” thick disc stars of, for example, Haywood et al. (2018) and Di Matteo et al. (2019). We have also performed similar tests with data from GALAH DR2 (Buder et al. 2018), RAVE DR5 (Kunder et al. 2017), and APOGEE-astroNN (Leung & Bovy 2018) and reached similar conclusions as those presented here.

4. Spatial distribution of the RPM sample

The final sample obtained using the above described procedures comprises 11 711 399 tentative MS halo stars. The subset with reliable photometric distances (i.e., stars in the range 0.45 <  G − GRP <  0.715, which excludes MSTO and faint-end stars) comprises 7 117 555 stars.

We now inspect the spatial distribution of the stars in the RPM sample and check for signs of substructure. Because MS stars are intrinsically faint objects, Gaia can only observe them up to ∼16 kpc, assuming the brightest star has MG ∼ 5 and Gaia’s limiting magnitude in G is ∼21 mag. However, at the faint-end, Gaia is far from complete. Only 7384 sources ( ∼ 0.1%) in the sample have a photometric distance larger than 10 kpc – the median photometric distance of the sample is 4.39 kpc.

Figure 7 shows the distribution of the galactocentric distance, G-magnitude, and photometric distance. The top panel shows a centrally concentrated distribution. Close to 83% of the stars are located inside of the solar radius (dgc <  8.2 kpc). This high concentration towards the centre is to be expected, the stellar halo is known to have a steep density profile (e.g., Juric et al. 2008; Deason et al. 2011).

thumbnail Fig. 7.

Top: distribution of galactocentric distances. The dashed line indicates the location of the Sun. Bottom: distribution of G-magnitudes (left) and photometric distances (right) of the stars in the RPM sample.

Figure 8 shows the distribution of the stars in the RPM sample in a Mollweide projection, colour-coded by the logarithm of the number of stars per pixel. The maps are created at a healpix level of 7, resulting in 196 608 equal area pixels. The left panel shows the distribution for the complete sample and the right panel shows all sources with a photometric distance larger than 5 kpc (∼35% of the sample). Several globular clusters are visible in both panels as small yellow dots, they are highlighted with white markers. These globular clusters are all nearby, they are NGC 7099, NGC 362, NGC 5904, NGC 6341, NGC 5466, and NGC 288.

thumbnail Fig. 8.

Sky-maps of RPM selected sample of halo stars. The left panel shows the full RPM sample, and the right panel shows a selection of the 35% most distant stars. Strong signatures of the Gaia scanning pattern are visible in both panels. Sources in the plane of the disc are filtered by our quality cuts. The white circles mark the location of six globular clusters that are picked up by the RPM method and which are clearly visible when zooming in.

Gaia’s scanning-pattern (c.f. Lindegren et al. 2018; Arenou et al. 2018, where these patterns are shown and discussed) is prominent in both maps but is most clearly seen in the right panel. The sinusoidal band with an amplitude of ∼60° in b is a known artefact in the Gaia DR2 data related to insufficient subtraction of zodiacal light (c.f. Fig. 18 of Evans et al. 2018). Besides these systematic effects, the right panel shows significant signs of incompleteness, most clearly apparent in the number of bins with zero sources (white pixels). The pixels with no stars, in the left panel, are mostly coinciding with high-extinction regions. See, for example, the disc region, but also the area near the Magellanic Clouds.

The spatial distribution of the sample, shown in Fig. 9, displays a similar level of structure. The coordinate system is oriented such that the Galactic Centre is in the positive X-direction and the rotation of the disc is in the positive Y-direction. To mitigate the contamination from highly extinct sources and from the thick disc we have removed all the stars with |b| < 20°. The combined effects of the remaining dust extinction, the scanning pattern of Gaia, and the errors in the photometric distance create the radial features seen in the figure.

thumbnail Fig. 9.

Spatial distribution of the RPM sample in heliocentric Cartesian coordinates. The sample is distributed uniformly, and the density of sources decreases with heliocentric distance, as is expected for a magnitude-limited sample. A combined effect of dust extinction and the scanning pattern of Gaia creates stripes. Sources with |b| < 20° have been removed (i.e., highly reddened stars located close to the disc).

Maps such as those shown in Fig. 9 project 3D-structure onto a 2D-plane. As a result, most small-scale structure is smoothed out. A simple method to inspect the internal 3D-structure is to minimise the smoothing effect by dividing the sample in thin slices. In Fig. 10 we project wedges in in heliocentric cylindrical (R, Z) coordinates, each wedge is 36° in width. The maps are binned (128 × 256 bins) and coloured by the logarithm of the number of stars per bin. Low-latitude areas (|b| < 20°) are highly affected by extinction, which introduces systematic errors in the distance estimate. Therefore, these areas are coloured in greyscale to focus the attention to the less affected parts. We overlay overdensities found in SDSS (i.e., Table 4 of De Jong et al. 2010), see the caption of the figure for more information. Most of the large overdensities such as the Virgo Over Density (VOD; Newberg et al. 2002; Juric et al. 2008; Bonaca et al. 2012), the Hercules-Aquila Clouds (HAC; Belokurov et al. 2007; Simion et al. 2014), TriAnd (Majewski et al. 2004; Rocha-Pinto et al. 2004; Deason et al. 2014) and similar structures (e.g., De Jong et al. 2010; Grillmair et al. 2016) are too distant to be detected in our sample.

thumbnail Fig. 10.

Spatial distribution in wedges of Δ=36°, projected in cylindrical heliocentric (R, Z) coordinates. Low latitude (|b|< 20°) areas are desaturated because these are likely contaminated by thick disc stars. The labels correspond to the overdensities found by De Jong et al. (2010, i.e., their Table 4): Virgo (V), Monoceros (M), (6.5, 1.5) (J; detected originally by Juric et al. 2008), and the unlabelled structures (U). These overdensities lie mostly just beyond our sample’s reach, but might become visible with a similar analysis with upcoming Gaia data releases.

The distribution of stars is not isotropic as, for example, is shown in the lower-left panel of Fig. 10 which depicts a strong north-south asymmetry. This asymmetry is likely caused by incompleteness of Gaia because of its scanning pattern. There is an asymmetry in the number of faint stars (G >  19) when comparing the source number counts in that specific wedge for b >  0 versus the b <  0. Some of the asymmetries are related to the photometric quality cuts that we impose, or are linked to the Gaia scanning pattern. They, for example, correlate with the Gaia DR2 catalogue parameter visibility_periods_used. The complexity of the structure in the data makes it non-trivial to compare the structure against a smooth model, which makes the significance of the structure present in the maps here unclear.

We check for spatial trends with extinction in G-mag in Fig. 11 where the mean extinction is shown in bins projected on the XY- and XZ-plane. As expected, the extinction strongly correlates with Galactic latitude b. Stars with low-latitude (the disc region) are strongly affected by extinction. The pattern in the XY-plane is less intuitive to understand. The low-latitude features in b correlate non-uniformly with features in . Towards the anti-centre (X <  0) the extinction is less significant. From the right panel, it is clear that a significant fraction of the high-extinction sources is easily removed by applying a simple cut in latitude b. Although these low-latitude stars are removed in Fig. 9, some of the stripes still seem to correspond to the high-extinction features.

thumbnail Fig. 11.

Spatial distribution of the RPM sample projected onto the XY-plane (left) and XZ-plane (right). The sample is binned in 256 × 256 bins, each bin is colour-coded by the mean extinction in G-mag. As expected, low-latitude regions are the most affected. The extinction creates interesting features as a function of galactic (i.e., the fingers of God features).

5. Velocity content of the RPM sample

The spatial distribution of the RPM sample shows many features embedded in what is likely a smooth background. Much of the structure that is visible could be related to incompleteness and selection effects. In the next part we combine the spatial information with that encoded in the proper motions to filter halo structures from the smooth background. This will be easier when such structures move sufficiently different from the “background” (in the particular region of the sky). The degree of distinction will depend also on the magnitude of the velocity errors.

5.1. Binned velocity moments

A powerful, yet simple, tool is to bin moments of the velocity distribution on the sky. Using the photometric distances, we convert the proper motions to tangential velocities using

v j = 4.74057 km s 1 ( μ j mas yr 1 ) ( d kpc ) , $$ \begin{aligned} v_{j} = 4.74057\,\mathrm{km\,s}^{-1}\, \left(\frac{\mu _{j}}{\mathrm{mas\,yr}^{-1}}\right)\, \left(\frac{d}{\mathrm{kpc}}\right), \end{aligned} $$(13)

where j = (,b). These tangential velocities can be corrected for the solar reflex motion

v j = v j + v j , , $$ \begin{aligned} v_{j}^*= v_{j} + v_{j,\odot }, \end{aligned} $$(14)

using

v , = U sin + ( V + v LSR ) cos , $$ \begin{aligned} v_{\ell ,\odot } = -U_\odot \sin {\ell } + (V_\odot +v_{\rm LSR})\cos {\ell }, \end{aligned} $$(15)

v b , = W cos b sin b · ( U cos + ( V + v LSR ) sin ) , $$ \begin{aligned} v_{b,\odot } = W_\odot \cos {b} - \sin {b}\cdot (U_\odot \cos {\ell } + (V_\odot +v_{\rm LSR})\sin {\ell }), \end{aligned} $$(16)

where we use the Schönrich et al. (2010) solar motion (U, V, W) = (11.1, 12.24, 7.25) km s−1 and the McMillan (2017) motion of the local standard of rest (LSR) vLSR = 232.8 km s−1. The resulting tangential velocities are in the Galactic rest frame, but as observed from the solar position.

For our sample of MS stars, the median velocity errors taking into account the error in the distance as well as that in the proper motions, are ϵ(v)∼22 km s−1 and ϵ(vb)∼15 km s−1. For MSTO stars and those farther away than 6 kpc, the median errors are larger, namely ϵ(v)∼42 km s−1 and ϵ(vb)∼28 km s−1.

Figure 12 shows the mean v $ v_\ell^\ast $ and v b $ v_b^\ast $ velocities binned on the sky. We only consider stars with a photometric distance d >  6 kpc because we are interested in picking up distant streams (nearby streams will not appear as coherent, thin structures on the sky). Besides large-scale velocity patterns, which we discuss below, a few stream-like features stand out because members of a stream move in the same direction and with similar mean velocity, or at least sufficiently distinct from that of the background given the velocity errors. The most conspicuous structure is the GD-1 stream (Grillmair & Dionatos 2006), but also Jhelum is apparent as well as tentatively Leiptr (cf. Fig. 6 of Ibata et al. 2019).

thumbnail Fig. 12.

Distribution of the mean, solar-motion-corrected v (top) and vb (bottom) velocities binned on the sky for distant halo stars (d >  6 kpc). Large-scale streaming motions as well as several small streams stand out. This map includes MSTO stars, which might have large uncertainties in the distances. See Appendix A and Fig. A.1 for a version without MSTO stars.

As mentioned above, we have included all of the stars in our sample also those near the MSTO, even though their distances will on average be underestimated by up to ∼40% (see Sect. 3.5.2). Yet, because they are intrinsically brighter they allow us to probe the farthest into the halo. This means that the mean velocities shown in the figure might not be very accurate, yet including these MSTO stars allows us to probe distant streams whose motions are sufficiently different from those of the background stars in a similar portion of the sky, as appears to be the case for GD-1 and the Jhelum streams.

The large all-sky patterns that are seen in Fig. 12 are reminiscent of a rotation signal (particularly in v $ v_\ell^\ast $), and this is plausibly related to contamination from the hot thick disc. The pattern in v $ v_\ell^\ast $ aligns perfectly with the signal of the solar motion around the Galactic Centre (i.e., the change of sign at  = ±90°), further supporting the interpretation of rotation.

Although the correction for the solar motion, and in particular the contribution from the LSR motion, could potentially affect these velocities (since the correction is dependent on the photometric distance), the effect is unlikely to be significant. This view is supported by the large-scale pattern seen in v b $ v_b^\ast $ in the figure, which cannot be due to the solar motion correction (because of its dependence on galactic longitude).

Let us now inspect the cross-correlation between the velocity components projected on the sky. Previous work, by Iorio & Belokurov (2019) using a sample of RR-Lyrae stars, has shown that the Pearson correlation of μ and μb creates a pattern on the sky indicative of the halo being radially anisotropic, see for example their Fig. 5. Figure 13 shows the mean correlation coefficient in bins on the sky for our RPM sample. The coefficient is calculated for the proper motions after correcting for the solar motion (i.e., using Eq. (16) and the inverse of Eq. (13)), it is defined as

Corr ( μ , μ b ) = cov ( μ , μ b ) std ( μ ) std ( μ b ) . $$ \begin{aligned} \mathrm{Corr}(\mu _\ell ^*,\mu _b^*) = \frac{\mathrm{cov}(\mu _\ell ^*,\mu _b^*)}{\mathrm{std}(\mu _\ell ^*)\mathrm{std}(\mu _b^*)}. \end{aligned} $$(17)

thumbnail Fig. 13.

Mean Pearson correlation coefficient of the proper motions projected on the sky after correction of the solar reflex motion. The signal that is present was first observed in a sample of RR-Lyrae stars (Iorio & Belokurov 2019, see their Fig. 5). A radially anisotropic halo explains the pattern, mainly caused by the debris of Gaia-Enceladus.

The Pearson correlation ranges from [−1,1, given the Cauchy-Schwarz inequality. Figure 13 clearly shows that there exists a strong correlation between the two proper motion components. If the velocity ellipsoid of the stars in our sample were isotropic there would be no signal and if it were biased towards circular orbits it would have a very different signal. The correlation pattern that is observed is similar to that detected by Iorio & Belokurov (2019) and this is direct evidence that stars on radial orbits dominate the halo sample.

5.2. Global kinematic maps

Using the ( v $ v_\ell^\ast $,  v b $ v_b^\ast $) tangential velocities defined above, we can calculate pseudo-3D velocities. These are not the true 3D velocities because we assume that the line-of-sight velocities of all the stars are zero (i.e., v los = 0 $ v^\ast_{\rm los} = 0 $). This assumption is valid if the velocity distribution is centred on zero in the galactocentric rest frame. However, they will not be zero on average for local regions on the sky because of the imprint of the motion of the Local Standard of Rest around the Galactic Centre, following a sincosb pattern.

The equations for the pseudo-Cartesian velocities are

v x = v sin v b cos sin b , $$ \begin{aligned} \tilde{v}_x = -v_\ell ^*\sin {\ell } - v_b^*\cos {\ell }\sin {b}, \end{aligned} $$(18)

v y = v cos v b sin sin b , $$ \begin{aligned} \tilde{v}_y = v_\ell ^*\cos {\ell } - v_b^*\sin {\ell }\sin {b}, \end{aligned} $$(19)

v z = v b cos b . $$ \begin{aligned} \tilde{v}_z = v_b^*\cos {b}. \end{aligned} $$(20)

We adopt the notation (x, y,z) for this set of velocities to make clear they are not the true Cartesian velocities. Subsequently, we calculate galactocentric cylindrical velocities and adopt a similar notation (ϕ, R, z). To obtain these coordinates, we place the Sun at X = −8.2 kpc (which agrees well with the recently determined distance to the massive black hole in the centre of the Galaxy by GRAVITY Collaboration 2018; McMillan 2017).

Besides looking for narrow streams, we can also use the velocity information to investigate the footprint of some interesting selections in velocity space. The two regions that we inspect are the retrograde halo, selected as all the stars with ϕ < −150 km s−1, and the radial halo, selected as |ϕ| < 50 km s−1 and |R | > 200 km s−1. We do not select stars with a small pseudo space-velocity (i.e., |R| < 200 km s−1) because the missing line-of-sight velocity moves stars towards zero velocities, so this is where we expect to find significant contamination.

Figure 14 shows the distribution on the sky of the full sample (top), all retrograde stars (middle), and stars on radial orbits (bottom). These maps reveal a centrally concentrated halo, with most of the stars near the Galactic Centre (yellow colours). Both the retrograde maps (middle row) and radial maps (bottom row) show a clear footprint. At first sight, the footprint of the radial halo is very similar to the full-sky maps made of nearby Gaia-Enceladus stars, shown in Helmi et al. (2018; see also the skymaps presented in Iorio & Belokurov 2019). However, this footprint is affected by the selection effects of the RPM sample. The kinematic selections imposed in the middle and bottom panels can also impact the distribution of the stars on the sky, depending on what the intrinsic velocity distributions are.

thumbnail Fig. 14.

Distribution of halo stars projected on the sky. Top: all of the stars in the sample with reliable distances (see Sect. 3.3). Middle: a subset of retrograde halo stars. Bottom: a subset of halo stars on radial orbits. The footprints seen in these panels are mostly due to selection effects.

In Appendix B we explore how the selection biases that are introduced by the incomplete velocity information affect the maps shown in Fig. 14. We use the Gaia 6D sample, but apply a high-tangential velocity selection criterion and set the line-of-sight velocities to zero. This analysis shows that such selections can produce the footprints that are very similar to those shown in Fig. 14. Therefore, the maps shown in this figure cannot be simply interpreted at face value.

6. The velocity distribution of the local halo

After highlighting the streams and structures in the distant halo, we now focus on exploring the velocity distribution of the local halo. Velocity space is very suitable to look for local streams in the form of overdensities. In small volumes, in which the (orbital) velocity gradients are small, stars with similar velocities have similar orbits. In this section we inspect only stars with a heliocentric distance smaller than 2 kpc, or even smaller volumes when indicated.

6.1. Toomre selection

The RPM sample is selected to contain a high fraction of halo stars, but we have seen in Sect. 4 that there is still some fraction of contamination from thick disc stars. Therefore we now impose a second selection to filter out thick disc stars based on their kinematics, namely, we remove stars that have |VLSR| < 250 km s−1. This selection is similar to a “Toomre” cut, which is often used to differentiate disc from halo stars (e.g., Bonaca et al. 2017; Koppelman et al. 2018; Posti et al. 2018). We adopt a rather strict limit of 250 km s−1 here because the velocities are not the true 3D velocities. In total, we are left with 3 223 725 high-quality halo stars (∼30% of the total sample). This number is in concordance with the fact that the red sequence (hot thick disc) contributes about 50% to the sample of stars with vtan > 200 km s−1 (e.g., Sahlholdt et al. 2019, see also Amarante et al. 2020).

6.2. Consistency check with RVS sample

In Sect. 5.2 and Appendix B we have seen that the RPM selection method introduces selection effects in subsets that are selected kinematically. These features vary with location on the sky (i.e., the “blindspots”). We check if the missing line-of-sight velocities also create features in velocity space locally. For this check, we use a control sample of nearby halo stars selected from the Gaia 6D sample, which includes both giants and main sequence stars. See Appendix C for a summary of this halo sample, with stars within <2 kpc. We aim to compare the RPM selection method to one using the full phase-space information. After applying the (pseudo) Toomre selection, we find about 6700 halo stars based 6D information and ∼7700 when artificially setting the line-of-sight velocity to zero.

Figure 15 shows the velocity distributions of halo stars in the 6D sample. Both the true velocities (left) and the pseudo-velocities (right) are shown. Only those stars with line-of-sight velocities close to zero are found in the same location left and right. The disc in these spaces is centred on vϕ ≈ 230 km s−1, the horseshoe shape (top panel) is an artefact of its removal (the Toomre cut). Subtle structures present in the true velocity distributions are diluted in the pseudo-velocities. Even the footprint of Gaia-Enceladus (Helmi et al. 2018; Belokurov et al. 2018), which creates an elongated structure in vR at constant vϕ, is blurred. For a much larger sample of stars we should be able to find faint imprints of the structures present in the true velocities, simply because by chance there will be stars with close to zero radial velocities. Comforted by the fact that no artificial structures are created by the introduction of the pseudo-velocities we turn back to the RPM sample.

thumbnail Fig. 15.

Velocity distribution of halo stars in the solar neighbourhood (distance <2 kpc) selected from the Gaia 6D sub-sample. The velocities calculated using the full phase-space information of the stars are shown in the left panels, while on the right they have been computed by setting the line-of-sight velocities to zero. Most of the structures present in the full 6D sample are strongly diluted in the 5D case.

6.3. Local streams in velocity space

We start with the analysis of the velocity distribution, see Fig. 16. The sample that is shown results from the RPM sample after applying the Toomre criterion. The top row shows the 2D-histogram of the velocities in 256 × 256 bins. The elongation in R at constant ϕ of the velocity distribution is reminiscent of the footprint of Gaia-Enceladus (cf. Fig. 15). All three panels show relatively smooth distributions, with only subtle hints of substructures. To enhance these structures, we inspect the asymmetry of the distributions in the horizontal axis of each panel, see the middle row. We define the asymmetry as

Asymmetry = H + H H + + H , $$ \begin{aligned} \mathrm{Asymmetry} = \frac{H^{\prime }_{+} - H^{\prime }_{-}}{H^{\prime }_{+} + H^{\prime }_{-}}, \end{aligned} $$(21)

where H + , $ H_{+,-}^{\prime} $ are smoothed histograms given by

H + , = N ( H + , + 1 , 2 ) . $$ \begin{aligned} H^{\prime }_{+,-} = \mathcal{N} (H_{+,-}+1,2). \end{aligned} $$(22)

thumbnail Fig. 16.

Velocity distributions of local (d <  2 kpc) halo stars calculated without the line-of-sight component. For each combination of the cylindrical velocity components a 2D-histogram is shown in the top row, while the asymmetry (defined as in Eq. (21)) with respect to the velocity plotted on the x-axis is shown in the middle row. We compare the maps to the asymmetry originating from a few structures that have been identified in the 6D sample in the bottom row. The asymmetry fleshes out structures that are asymmetric in either R or ϕ. Structures that are clearly present in the RPM sample, see middle row, are the Helmi Streams (green dots) and Sequoia (red and purple dots).

Here H+ is the histogram as it appears in the top row, H is its mirrored counterpart with the sign of R or z flipped and 𝒩(H, 2) is a Gaussian filter of the histogram H with kernel-size 2. The histograms range from −500 km s−1 to 500 km s−1 in both directions with 250 × 250 bins. A Gaussian kernel of 2 corresponds to a standard deviation of ∼8 km s−1, which is roughly the mean velocity error of this sample. Positive values (red) indicate an overdensity and negative (blue) an under density. By construction, each overdensity is matched by a conjugate underdensity mirrored with respect to the panel’s x = 0 axis.

The asymmetry maps reveal both small overdensities and large-scale patterns. These overdensities imply the presence of non-phase-mixed debris. Any population in the halo that is sufficiently phase-mixed will not display an asymmetry in the (local) velocity distribution. To shed light on the origin of the asymmetries, we map several structures detected in the 6D sample in the bottom row. These maps are based on the structures found by Koppelman et al. (2018). We focus on these structures because they are asymmetric in the true velocities and therefore might also be in 5D sample.

To establish if there is a link between the structures in the 6D and 5D samples, we first need to bear in mind the following issues: (i) The substructures from the 6D samples only comprise few stars each and hence they do not sample the sky densely; (ii) The values of the pseudo-velocities depend on location on the sky, so we cannot use these stars directly to predict where stars in the streams in the 5D sample would be located in velocity space; (iii) We expect the streams to be broad enough for the member stars to be isotropically distributed on the sky. With these caveats in mind we aim to enhance the number of sources per group in the 6D sample by resampling each star using the following method:

  • Each of the 103 stars is re-sampled 1000 times,

  • For each realisation, we generate a random location on a sphere of 2 kpc in radius that is centred on the solar position (uniformly distributed on the surface),

  • We assign the random location with the unchanged velocity vector (in galactocentric coordinates),

  • We calculate the i velocities for each star based on its new location.

We assume here that the Galactic potential does not vary over the volume in which we resample the structures. In this approximation, the orbits of the stars are mostly determined by the amplitude and direction of the velocities since to first order the location-dependent potential term is constant. The resampling is thus a simple way of modelling more members of the stream. That is, to add stars on the same orbits as the detected streams.

This resampling shows what the footprint of the structures identified in the 6D sample could look like in the pseudo velocity-space. The stars are coloured by the original labels of Koppelman et al. (2018). The structure indicated with green dots has been associated with the Helmi Streams (e.g., Helmi et al. 1999; Koppelman et al. 2019a). Recent studies suggest that the structures indicated with red and purple dots are part of a structure in velocity space known as Sequoia (Myeong et al. 2019), and that the blue and orange structures are part of a structures labelled Thamnos (Koppelman et al. 2019b).

We overlay the expected asymmetry over the newly sampled members of the structures identified in the 6D set (blue and red lines in the bottom row). These asymmetry contours do a surprisingly good job in explaining the asymmetries seen in the 5D sample. Therefore, we conclude that the features that are shown in the middle row for this sample are mainly due to the structures previously detected in the Gaia sample with full phase-space information.

Apart from these structures, there are several other overdensities visible of a few km s−1 in size (small red and blue dots). To check the statistical significance of the asymmetries found we shuffle the data. We randomly shuffle the velocities of Fig. 16 (top panel) and for each random set we create an asymmetry map. The Helmi Streams and a handful of other groups (those with the darkest colours) have a strong asymmetry. However, similar levels of asymmetry are found in the random realisations of the data. We note that this is a very crude estimate of the statistical significance as only the amplitude and not the extent of the asymmetries are taken into account. For example, Sequoia (purple and red dots in the bottom panel) overlaps with several of the asymmetries seen in the middle row. It spans most of the retrograde part of the diagram (ϕ < −100 km s−1). Therefore, we surmise that this asymmetry is due to the debris of Sequoia (Myeong et al. 2019) and possibly also Thamnos (Koppelman et al. 2019b). Of course, the asymmetries seen in Fig. 16 can also be partly due to unidentified halo structures in velocity space.

We perform an additional test to measure the overall level of asymmetry in the dataset (rather than of a given feature). We compute the “total level of asymmetry” per map as the sum over all the bins of the absolute values of the asymmetries per bin both for the data as well as for 1000 randomly reshuffled samples. For these samples, we compute the average and its dispersion σ. Finally we measure a significance value as (TotalAsymmetrydata − ⟨TotalAsymmetry⟩rand)/σ. We find that the significance levels vary per map (i.e., combination of cylindrical velocities), taking values 10.2σ, 12.5σ, and 5.8σ from left to right in Fig. 16. These results imply that, while individual asymmetries might be caused by random clustering of the stars, the total level of asymmetry found in the data is far from random.

6.4. Excess of pairs

As final part of this work, we quantify the clustering of stars in velocity space with a two-point correlation function. That is, we check if the stars are randomly distributed in velocity space, or whether they tend to cluster on some velocity scale. We use a correlation function of the form

ξ ( Δ v ) = D D ( Δ v ) R R ( Δ v ) , $$ \begin{aligned} \xi (\Delta v) = \frac{DD (\Delta v)}{RR (\Delta v)}, \end{aligned} $$(23)

where DDv) is the number of pairs with a (pseudo) 3D velocity difference of Δv and, similarly, RRv) is the number of pairs in a randomised sample. We obtain the randomised sample by shuffling the velocities and we count the pairs in 64 bins ranging from 0 to 800 km s−1. For computational reasons, we only calculate the correlation function for stars inside a volume of 1 kpc.

Figure 17 shows the velocity correlation function ξ for the RPM sample (black) and for a control sample of stars with full phase-space information (blue). Similar to analysis presented in Sect. 6.2, we show the effect of the missing line-of-sight velocities with the control sample (grey curve). The error bars indicate the error in ξ, calculated as the Poisson error in the number of pairs. A correlation of ξ >  1 indicates grouping of stars in excess of random clustering. All curves show significant clustering on velocity scales Δv ≲ 100 km s−1. Clustering of velocities on small scales hints at the presence of streams.

thumbnail Fig. 17.

Two point correlation function for stars in a local volume (d <  1 kpc). Both the RPM sample (black) and the 6D (blue) control sample are shown, in addition to a 5D version of the latter (grey). There is an excess of stars at velocity scales smaller than 100 km s−1, and at scales larger than 400 km s−1 in the 5D case. The correlation in the RPM sample is fully compatible with the correlation in the 5D version of the control sample.

The correlation excess at large velocity scales is less intuitive to interpret. However, a similar effect is seen in the 5D version of the 6D control sample. The grey and black curve are strikingly similar. Thus, the missing velocity component artificially enhances the correlation of the velocities on large scales and to a lesser extent also on small scales.

It is interesting that the black and grey curves match so well. This must mean that in spite of the roughly 10 times smaller size of the 6D sample (2945 stars), compared to the RPM sample, in both cases, most of the local streams are well resolved and contain at least a few stars per stream. This would be consistent with the early predictions by (e.g., Helmi & White 1999) who estimated of the order of 300–500 streams should be present near the Sun for a halo with a pure accretion origin. The advantage of a larger sample is that the amplitude of the correlation will be measured more accurately, as the (Poisson) uncertainty decreases with sample size.

Recently, Simpson et al. (2019) have investigated the velocity correlation function for Milky Way-like galaxies in the Aurigaia mock catalogues (Grand et al. 2018). Several of the halos analysed by these authors show similar correlation functions, both on small and large scales, to the one observed in the RPM sample. An excess of pairs can be very difficult to interpret if there are too few particles, also if the volume is not localised. Pairs always appear as a consequence of substructure and this can be accreted but also in situ (mergers do produce features in the in situ population as well, e.g., Gómez et al. 2012; Jean-Baptiste et al. 2017).

7. Discussion and conclusions

We have explored the use of a reduced proper motion (RPM) diagram to identify tentative main sequence (MS) halo stars in Gaia DR2. This method makes only use of the Gaia photometry and proper motions. Most conventional methods rely on distance information or spectroscopic observations of the stars, which limits the sample size substantially. With the RPM selection method we find 11 711 399 tentative halo MS stars.

For these stars, we calculate photometric distances using the relatively simple colour-magnitude relation of the main sequence in the Gaia photometric bands. These distances have typical errors of ∼7%, which makes them much more reliable than inverted parallaxes. The distances are especially accurate for stars with colours 0.45 <  G − GRP <  0.715. Stars with bluer colours are near the MS turn-off and their magnitudes are a very steep function of colour. On the other hand, the MS broadens significantly for redder colours, which causes the error in the photometric distance calibration to increase. The above-mentioned colour range reduces our sample to 7 117 555 MS halo stars with good distances. The median velocity errors for these stars are ∼20 km s−1.

A limitation of dealing with a sample of halo MS stars is that they are intrinsically faint. The most distant MS star in our particular sample is found at ∼16 kpc. A star with MG  ≈  5 (the brightest given the colour cut) at this distance would have the limiting magnitude of the Gaia DR2 catalogue. Beyond a heliocentric distance of ∼5 kpc, the sample suffers from incompleteness, and thus halo overdensities such as the Virgo and Hercules-Aquila clouds cannot be studied with our dataset.

The spatial distribution of the RPM sample displays large-scale patterns and structures but it is not trivial to disentangle real structure from artefacts of the selection method. Nonetheless, full-sky maps of the mean velocities v and vb, reveal the presence of some known streams, such as GD-1 (Grillmair & Dionatos 2006). Since pixels on the sky that overlap with a stream may have a mean velocity that is different from that of the background, and a smaller velocity dispersion, this promises to be an interesting approach to identifying substructures. Future Gaia data releases will be less affected by systematics due to, for example, the scanning-pattern, and will provide more precise proper motions leading to easier distinction between stream and background stars.

Another promising approach is to calculate (pseudo) space velocities of the stars by assuming their line-of-sight velocity to be zero. Most of the structure in velocity space is smoothed out because of the missing velocity component. However, by chance, some stars can have a true line-of-sight velocity of zero, particularly if the sample is large enough. In that case, we may still find some structure in velocity space.

In fact, in pseudo-velocity space for nearby MS halo stars we find clear imprints of the Helmi Streams. Also the footprint of Gaia-Enceladus (i.e., the Gaia-Sausage) is found in several of the velocity maps that we explore. Moreover, the retrograde halo shows a strong asymmetry in the velocity distribution, reminiscent of the accreted structures that have been previously reported in the 6D sample (Koppelman et al. 2018, 2019b; Myeong et al. 2019). The total level of non-phased-mixed substructure in pseudo-velocity space, as measured by an asymmetry parameter, is very significant (≳6σ), in comparison to randomised samples.

Through a two-point velocity correlation function we measure a very significant excess of stars with small velocity differences (Δv <  100 km s−1). The amplitude of this clustering signal for the RPM sample is similar to that obtained when using 6D Gaia sample, implying that this sample is already large enough for a true quantification of the amount of clustering. This appears to be consistent with the expectation from theoretical models that hundreds of debris streams are crossing the solar vicinity (e.g., Helmi & White 1999), as only if the sample is large enough will the streams be populated by enough stars to produce a signal. It will be particularly interesting to study this excess of close velocity pairs in more detail and, in particular, to do spectroscopic follow-up of the stars in pairs as these likely originate from very localised regions in the phase-space of accreted galaxies.

The RPM catalogue we have built provides interesting targets for spectroscopic follow-up, for example for chemical tagging. Even low/intermediate resolution spectroscopy would be highly valuable because it would provide the missing line-of-sight velocity but also because even a metallicity and [α/Fe] abundance are extremely useful to disentangle merger events from one another and to construct basic chemical enrichment histories. Halo main sequence stars are easy to identify as shown here. Moreover, they have the advantage of being very numerous and that the elements found in their atmospheres are directly representative of the elements in their birth material (e.g., Tolstoy et al. 2009), which is not necessarily true for giants, which might have undergone mixing.

Finally, a sample such as the one presented here could be used to map the (local) mass-distribution of the Milky Way, the density profile of the stellar halo as well as its dynamical properties. In all applications it is important to bear in mind that our sample is kinematically biased by construction, and that it misses halo stars with small proper motions and large line of sight velocities.


Acknowledgments

We thank Eduardo Balbinot for helpful discussions regarding the methodology presented in this paper and for his comments on earlier versions of the manuscript. We also thank the anonymous referee for their constructive comments that helped improve the text. We gratefully acknowledge financial support from a VICI grant from the Netherlands Organisation for Scientific Research (NWO) and a Spinoza prize. This work has made use of data from the European Space Agency (ESA) mission Gaia (http://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, http://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. For the analysis, the following software packages have been used: vaex (Breddels & Veljanoski 2018), numpy (Oliphant 2006; Van Der Walt 2011), matplotlib (Hunter 2007), jupyter notebooks (Kluyver et al. 2016).

References

  1. Abolfathi, B., Aguado, D. S., Aguilar, G., et al. 2018, ApJS, 235, 42 [NASA ADS] [CrossRef] [Google Scholar]
  2. Ahumada, R., Prieto, C. A., Almeida, A., Anders, F., et al. 2020, ApJS, 249, 3 [CrossRef] [Google Scholar]
  3. Amarante, J. A. S., Smith, M. C., & Boeche, C. 2020, MNRAS, 492, 3816 [CrossRef] [Google Scholar]
  4. Antoja, T., Helmi, A., Romero-Gomez, M., et al. 2018, Nature, 561, 360 [NASA ADS] [CrossRef] [Google Scholar]
  5. Arce, H. G., & Goodman, A. A. 1999, ApJ, 512, L135 [NASA ADS] [CrossRef] [Google Scholar]
  6. Arenou, F., Luri, X., Babusiaux, C., et al. 2018, A&A, 616, A29 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  7. Belokurov, V., Evans, N. W., Bell, E. F., et al. 2007, ApJ, 657, 89 [CrossRef] [Google Scholar]
  8. Belokurov, V., Erkal, D., Evans, N. W., et al. 2018, MNRAS, 478, 611 [NASA ADS] [CrossRef] [Google Scholar]
  9. Binney, J., Burnett, B., Kordopatis, G., et al. 2014, MNRAS, 437, 351 [NASA ADS] [CrossRef] [Google Scholar]
  10. Bonaca, A., Jurić, M. J., Zeljko, I., et al. 2012, AJ, 143, 105 [NASA ADS] [CrossRef] [Google Scholar]
  11. Bonaca, A., Conroy, C., Wetzel, A., Hopkins, P. F., & Kereš, D. 2017, ApJ, 845, 101 [NASA ADS] [CrossRef] [Google Scholar]
  12. Bono, G., Stetson, P. B., Vandenberg, D. A., et al. 2010, ApJ, 708, 74 [NASA ADS] [CrossRef] [Google Scholar]
  13. Breddels, M. A., & Veljanoski, J. 2018, A&A, 618, A13 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  14. Buder, S., Asplund, M., Duong, L., et al. 2018, MNRAS, 478, 4513 [NASA ADS] [CrossRef] [Google Scholar]
  15. Cardelli, J. A., Clayton, G. C., & Mathis, J. S. 1989, A&A, 345, 245 [Google Scholar]
  16. Cargile, P. A., Conroy, C., Johnson, B. D., et al. 2019, ApJ, 900, 28 [Google Scholar]
  17. Cassisi, S., Castellani, V., Ciarcelluti, P., et al. 2000, MNRAS, 315, 679 [NASA ADS] [CrossRef] [Google Scholar]
  18. Chan, V. C., & Bovy, J. 2019, MNRAS, 000, 1 [Google Scholar]
  19. Chaplin, W. J., Serenelli, A. M., Miglio, A., et al. 2020, Nat. Astron., 4, 382 [Google Scholar]
  20. Chiba, R., Friske, J. K. S., & Schönrich, R. 2019, MNRAS, 500, 4710 [CrossRef] [Google Scholar]
  21. Conroy, C., Bonaca, A., Cargile, P., et al. 2019, A&A, 883, 107 [Google Scholar]
  22. Cui, X.-Q., Zhao, Y.-H., Chu, Y.-Q., et al. 2012, Res. Astron. Astrophys., 12, 1197 [NASA ADS] [CrossRef] [Google Scholar]
  23. Deason, A. J., Belokurov, V., & Evans, N. W. 2011, MNRAS, 416, 2903 [NASA ADS] [CrossRef] [Google Scholar]
  24. Deason, A. J., Belokurov, V., Evans, N. W., et al. 2012, MNRAS, 425, 2840 [NASA ADS] [CrossRef] [Google Scholar]
  25. Deason, A. J., Belokurov, V., Hamren, K. M., et al. 2014, MNRAS, 444, 3975 [NASA ADS] [CrossRef] [Google Scholar]
  26. De Jong, J. T. A., Yanny, B., Rix, H.-W., et al. 2010, ApJ, 714, 663 [NASA ADS] [CrossRef] [Google Scholar]
  27. Di Matteo, P., Haywood, M., Lehnert, M. D., et al. 2019, A&A, 604, A106 [Google Scholar]
  28. Drake, A. J., Catelan, M., Djorgovski, S. G., et al. 2012, ApJ, 763, 32 [NASA ADS] [CrossRef] [Google Scholar]
  29. Evans, D. W., Riello, M., De Angeli, F., et al. 2018, A&A, 616, A4 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  30. Fukushima, T., Chiba, M., Homma, D., et al. 2017, PASJ, 70, 12 [Google Scholar]
  31. Fusillo, N. P. G., Gänsicke, B. T., & Greiss, S. 2015, MNRAS, 448, 2260 [NASA ADS] [CrossRef] [Google Scholar]
  32. Gaia Collaboration (Prusti, T., et al.) 2016, A&A, 595, A1 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  33. Gaia Collaboration (Babusiaux, C., et al.) 2018a, A&A, 616, A10 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  34. Gaia Collaboration (Brown, A. G. A., et al.) 2018b, A&A, 616, A21 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  35. Gallart, C., Bernard, E. J., Brook, C. B., et al. 2019, Nat. Astron., 3, 932 [NASA ADS] [CrossRef] [Google Scholar]
  36. Geier, S. 2020, A&A, 635, A193 [CrossRef] [EDP Sciences] [Google Scholar]
  37. Gómez, F. A., Minchev, I., Villalobos, A., et al. 2012, MNRAS, 419, 2163 [NASA ADS] [CrossRef] [Google Scholar]
  38. Grand, R. J. J., Helly, J., Fattahi, A., et al. 2018, MNRAS, 481, 1726 [NASA ADS] [CrossRef] [Google Scholar]
  39. GRAVITY Collaboration (Abuter, R., et al.) 2018, A&A, 615, A15 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  40. Grillmair, C. J., & Carlin, J. L. 2016, in Tidal Streams in the Local Group and Beyond, eds. H. J. Newberg, & J. L. Carlin (Springer), 87 [NASA ADS] [CrossRef] [Google Scholar]
  41. Grillmair, C. J., & Dionatos, O. 2006, ApJ, 643, L17 [NASA ADS] [CrossRef] [Google Scholar]
  42. Hayes, C. R., Majewski, S. R., Hasselquist, S., et al. 2020, ApJ, 889, 63 [NASA ADS] [CrossRef] [Google Scholar]
  43. Haywood, M., Di Matteo, P., Lehnert, M. D., et al. 2018, ApJ, 863, 113 [NASA ADS] [CrossRef] [Google Scholar]
  44. Helmi, A., & White, S. D. M. 1999, MNRAS, 307, 495 [NASA ADS] [CrossRef] [Google Scholar]
  45. Helmi, A., White, S. D. M., De Zeeuw, P. T., et al. 1999, Nature, 402, 53 [NASA ADS] [CrossRef] [Google Scholar]
  46. Helmi, A., Babusiaux, C., Koppelman, H. H., et al. 2018, Nature, 563, 85 [NASA ADS] [CrossRef] [Google Scholar]
  47. Hernitschek, N., Cohen, J. G., Rix, H.-W., et al. 2018, ApJ, 859, 31 [NASA ADS] [CrossRef] [Google Scholar]
  48. Hunt, J. A., Bub, M. W., Bovy, J., et al. 2019, MNRAS, 490, 1026 [NASA ADS] [CrossRef] [Google Scholar]
  49. Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90 [NASA ADS] [CrossRef] [Google Scholar]
  50. Ibata, R. A., Malhan, K., & Martin, N. F. 2019, ApJ, 872, 152 [NASA ADS] [CrossRef] [Google Scholar]
  51. Iorio, G., & Belokurov, V. 2019, MNRAS, 482, 3868 [NASA ADS] [CrossRef] [Google Scholar]
  52. Ivezic, Z., Sesar, B., Juric, M., et al. 2008, ApJ, 684, 287 [NASA ADS] [CrossRef] [Google Scholar]
  53. Jean-Baptiste, I., Di Matteo, P., Haywood, M., et al. 2017, A&A, 604, A106 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  54. Jones, E. M. 1972, A&A, 173, 671 [Google Scholar]
  55. Juric, M., Željko, I., Brooks, A., et al. 2008, ApJ, 673, 864 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
  56. Kalirai, J. S., Richer, H. B., Hansen, B. M., et al. 2004, ApJ, 601, 277 [NASA ADS] [CrossRef] [Google Scholar]
  57. Katz, D., Antoja, T., Romero-Gómez, M., et al. 2018, A&A, 616, A41 [CrossRef] [EDP Sciences] [Google Scholar]
  58. Katz, D., Sartoretti, P., Cropper, M., et al. 2019, A&A, 622, 19 [Google Scholar]
  59. Kawata, D., Baba, J., Ciucă, I., et al. 2018, MNRAS, 479, L108 [NASA ADS] [CrossRef] [Google Scholar]
  60. Khanna, S., Sharma, S., Tepper-Garcia, T., et al. 2019, MNRAS, 489, 4962 [CrossRef] [Google Scholar]
  61. Kilic, M., Munn, J. A., Harris, H. C., et al. 2006, AJ, 131, 582 [NASA ADS] [CrossRef] [Google Scholar]
  62. Kluyver, T., Ragan-Kelley, B., Pérez, F., et al. 2016, Jupyter Notebooks-a Publishing Format for Reproducible Computational Workflows (IOS Press) [Google Scholar]
  63. Koppelman, H. H., Helmi, A., & Veljanoski, J. 2018, ApJ, 860, L11 [NASA ADS] [CrossRef] [Google Scholar]
  64. Koppelman, H. H., Helmi, A., Massari, D., et al. 2019a, A&A, 625, A5 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  65. Koppelman, H. H., Helmi, A., Massari, D., et al. 2019b, A&A, 631, L9 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  66. Kunder, A., Kordopatis, G., Steinmetz, M., et al. 2017, AJ, 153, 75 [Google Scholar]
  67. Lancaster, L., Koposov, S. E., Belokurov, V., et al. 2019, MNRAS, 486, 378 [NASA ADS] [CrossRef] [Google Scholar]
  68. Laporte, C. F. P., Minchev, I., et al. 2018, MNRAS, 485, 3134 [NASA ADS] [CrossRef] [Google Scholar]
  69. Leung, H. W., & Bovy, J. 2018, MNRAS, 483, 3255 [Google Scholar]
  70. Leung, H. W., & Bovy, J. 2019, MNRAS, 489, 2079 [NASA ADS] [CrossRef] [Google Scholar]
  71. Lindegren, L. 2018, Re-normalising the Astrometric Chi-square in Gaia DR2, Tech. rep., (Lund Observatory) [Google Scholar]
  72. Lindegren, L., Hernandez, J., Bombrun, A., et al. 2018, A&A, 616, A2 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  73. Mackereth, J. T., Schiavon, R. P., Pfeffer, J., et al. 2019, MNRAS, 482, 3426 [NASA ADS] [CrossRef] [Google Scholar]
  74. Majewski, S. R., Ostheimer, J. C., Rocha-Pinto, H. J., et al. 2004, ApJ, 615, 738 [NASA ADS] [CrossRef] [Google Scholar]
  75. Malhan, K., Ibata, R. A., & Martin, N. F. 2018, MNRAS, 481, 3442 [NASA ADS] [CrossRef] [Google Scholar]
  76. Marigo, P., Girardi, L., Bressan, A., et al. 2017, ApJ, 835, 19 [NASA ADS] [CrossRef] [Google Scholar]
  77. McMillan, P. J. 2017, MNRAS, 94, 76 [NASA ADS] [CrossRef] [Google Scholar]
  78. Minchev, I., Quillen, A. C., Williams, M., et al. 2009, MNRAS, 396, 56 [CrossRef] [Google Scholar]
  79. Monari, G., Famaey, B., Siebert, A., Wegg, C., & Gerhard, O. 2019, A&A, 626, A41 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  80. Morrison, H. L., Helmi, A., Sun, J., et al. 2009, ApJ, 694, 130 [NASA ADS] [CrossRef] [Google Scholar]
  81. Myeong, G. C., Vasiliev, E., Iorio, G., et al. 2019, MNRAS, 000, 1 [Google Scholar]
  82. Newberg, H. J., Yanny, B., Rockosi, C., et al. 2002, ApJ, 569, 245 [NASA ADS] [CrossRef] [Google Scholar]
  83. Nissen, P. E., & Schuster, W. J. 2010, A&A, 511, L10 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  84. Oliphant, T. E. 2006, Trelgol Publishing, 1, 378 [Google Scholar]
  85. Posti, L., Helmi, A., Veljanoski, J., & Breddels, M. A. 2018, A&A, 615, A70 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  86. Ramos, P., Antoja, T., & Figueras, F. 2018, A&A, 619, A72 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  87. Robin, A. C., Reylé, C., Derrière, S., & Picaud, S. 2003, A&A, 409, 523 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  88. Rocha-Pinto, H. J., Majewski, S. R., Skrutskie, M. F., et al. 2004, ApJ, 615, 732 [NASA ADS] [CrossRef] [Google Scholar]
  89. Sahlholdt, C. L., Casagrande, L., & Feltzing, S. 2019, ApJ, 881, L10 [NASA ADS] [CrossRef] [Google Scholar]
  90. Salaris, M., Chieffi, A., & Straniero, O. 1993, A&A, 414, 580 [Google Scholar]
  91. Saumon, D., Bergeron, P., Lunine, J. I., et al. 1994, ApJ, 424, 333 [NASA ADS] [CrossRef] [Google Scholar]
  92. Schlegel, D. J., Finkbeiner, D. P., & Davis, M. 1998, ApJ, 500, 525 [NASA ADS] [CrossRef] [Google Scholar]
  93. Schönrich, R., Binney, J., & Dehnen, W. 2010, MNRAS, 403, 1829 [NASA ADS] [CrossRef] [Google Scholar]
  94. Schönrich, R., Mcmillan, P., & Eyer, L. 2019, MNRAS, 487, 3568 [NASA ADS] [CrossRef] [Google Scholar]
  95. Sesar, B., Ivezić, Z., Grammer, S. H., et al. 2010, ApJ, 708, 717 [NASA ADS] [CrossRef] [Google Scholar]
  96. Sharma, S., Bland-Hawthorn, J., Johnston, K. V., & Binney, J. 2011, ApJ, 730, 20 [NASA ADS] [CrossRef] [Google Scholar]
  97. Simion, I. T., Belokurov, V., Irwin, M., & Koposov, S. E. 2014, MNRAS, 440, 161 [CrossRef] [MathSciNet] [Google Scholar]
  98. Simpson, C. M., Gargiulo, I., Gómez, F. A., et al. 2019, MNRAS, 490, L32 [CrossRef] [Google Scholar]
  99. Smith, M. C., Evans, N. W., Belokurov, V., et al. 2009, MNRAS, 399, 1223 [NASA ADS] [CrossRef] [Google Scholar]
  100. Tolstoy, E., Hill, V., & Tosi, M. 2009, ARA&A, 371 [NASA ADS] [CrossRef] [Google Scholar]
  101. Torres, S., Cantero, C., Rebassa-Mansergas, A., et al. 2019, MNRAS, 485, 5573 [NASA ADS] [CrossRef] [Google Scholar]
  102. Trick, W. H., Coronado, J., & Rix, H.-W. 2019, MNRAS, 484, 3291 [NASA ADS] [CrossRef] [Google Scholar]
  103. Van Der Walt, S., et al. 2011, Comput. Sci. Eng., 13, 22 [CrossRef] [Google Scholar]
  104. Watkins, L. L., Evans, N. W., Belokurov, V., et al. 2009, MNRAS, 398, 1757 [NASA ADS] [CrossRef] [Google Scholar]
  105. Wilson, J. C., Hearty, F., Skrutskie, M. F., et al. 2010, in Ground-based and Airborne Instrumentation for Astronomy III, eds. I. S. McLean, S. K. Ramsay, H. Takami, et al., Int. Soc. Opt. Photonics, 7735, 77351C [CrossRef] [Google Scholar]
  106. Xue, X. X., Rix, H. W., Zhao, G., et al. 2008, ApJ, 684, 1143 [NASA ADS] [CrossRef] [Google Scholar]
  107. Xue, X. X., Rix, H. W., Yanny, B., et al. 2011, ApJ, 738, 79 [NASA ADS] [CrossRef] [Google Scholar]
  108. Zinn, J. C., Pinsonneault, M. H., Huber, D., & Stello, D. 2019a, ApJ, 878, 136 [NASA ADS] [CrossRef] [Google Scholar]
  109. Zinn, R., Chen, X., Layden, A. C., & Casetti-Dinescu, D. I. 2019b, MNRAS, 492, 2161 [CrossRef] [Google Scholar]
  110. Zoccali, M., Cassisi, S., Frogel, J. A., et al. 2000, ApJ, 530, 418 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Velocity maps without MSTO stars

In Sect. 5.1 we show the velocity moments of the RPM sample binned on the sky. To flesh out distant streams, such as GD-1, we have included sources that are near the MSTO. However, the MSTO stars might have distances that are underestimated by up to 40%. For this reason, we show here the same figures, but without these MSTO stars, see Fig A.1 (and Fig. 12 for comparison). Clearly, the overall contours are the same with and without the MSTO stars. However, the figures shown in this appendix do not show any (clear) narrow streams.

thumbnail Fig. A.1.

Same as Fig. 12, but without MSTO stars.

Appendix B: Selection effects

If, for simplicity, we assume that all of the halo stars are on radial orbits, then there would be blind spots in the RPM sample towards the Galactic Centre and anti-centre. In these directions, stars on purely radial orbits move along the line-of-sight. Therefore, their proper motions are close to zero, and they will not be present in the RPM selection. The blind spot towards the Galactic Centre is smaller than its conjugate spot towards the anti-centre.

In Fig. B.1 we investigate this selection effect by showing the sky-distribution of halo stars in the 6D sample. We show a sample of halo stars selected using the full phase-space information that is available (left column) and a sample of halo stars selected using only tangential velocities (right column). The sample that is used is identical to the one described in Appendix C. The layout (per column) is the same as the layout of Fig. 14. The similarity between the footprints of the red dots and the distribution displayed in Fig. 14 are striking. We have to conclude that the large structures that are seen in Fig. 14 are dominated by selection effects.

thumbnail Fig. B.1.

Selection effects of the RPM halo-selection, illustrated with a sample of halo stars for which the full phase-space information is available (see Appendix C). The layout of the panels is based on Fig. 14. The left columns shows the distribution on the sky of halo stars selected using full phase-space information. On the right, we show the distribution of a halo sample selected without the line-of-sight velocity.

We do note that the distance distribution of the stars in the full phase-space sample is more local. Typically these stars are within 3 kpc, while the RPM sample has a median distance of 4.39 kpc. Therefore, the maps of Fig. B.1 might look different when a larger (volume-wise) sample is used. Although, up to first order they should be the same. These maps mostly show where the motion aligns the most with the line-of-sight, which is a sky-dependent phenomenon and not distance dependent.

Appendix C: RVS sample

In Sect. 6.2 we have used the RVS (Katz et al. 2019) subset of Gaia DR2 as a control sample. This sample of stars comprises full phase-space information. For the construction of this sample we start with the full RVS sample and impose the following quality criteria: firstly, phot_g_mean_flux_over_error > 50 and phot_rp_mean_flux_over_error > 20, secondly visibility_periods_used > = 5, and finally parallax_over_ error > 5.

We calculate Cartesian coordinates and velocities similarly as described in Sects. 5 and 6, as well as tangential velocities according to Eq. (7). We then select a halo sample as stars with vtan > 300 km s−1. Because for this set there are line-of-sight velocities available, we also calculate the true and pseudo-velocities, and then impose the following cut in velocity space:

| V V LSR | > 250 km s 1 . $$ \begin{aligned} |\mathbf V - \mathbf V _{\rm LSR}| > 250\,\mathrm{km\,s}^{-1}. \end{aligned} $$(C.1)

We calculate this cut separately for the true and pseudo-velocities. As a result we end up with two halo samples, one based on full phase-space information and one without line-of-sight velocities. For a local sample (<2 kpc), we find 6718 (using true velocities) and 7749 (using pseudo-velocities) tentative halo stars.

All Figures

thumbnail Fig. 1.

Hertzsprung-Russel diagram (HRD) of a local sample of stars with large tangential motions (vtan > 200 km s−1) and very high-quality parallaxes (parallax_over_error > 50). Overlayed is an 11 Gyr age and [M/H] = −0.5 metallicity isochrone (red). This isochrone is shifted to the left by 0.01 mag in G − GRP to split the two sequences that are shown. The inset shows the two sequences without the red line overplotted to make the gap between them appreciable.

In the text
thumbnail Fig. 2.

Top panels: raw data (left) and the cleaned data (right) that are used for the fit. Middle: two fits of the main sequence (MS). Bottom: error in distance is related to the measured thickness of the MS. The dashed lines indicate the range where the MS is reliable enough to calibrate photometric distances.

In the text
thumbnail Fig. 3.

Reduced proper motion diagram for all sources in Gaia that pass the quality cuts of Sect. 3.1. The sample of sources within the red lines are selected to be tentatively halo stars. The red lines are drawn based on the three-component fit of the MS. The location of the box is chosen to select halo stars with 200 km s−1 < vtan < 800 km s−1.

In the text
thumbnail Fig. 4.

HRD of sources in the MS selection with precise parallaxes (parallax_over_error >5). The three-component fit of the MS of Sect. 3.2 is shown as a red, dashed line and with an offset of 2 mag as blue line. Contamination of white dwarfs and other faint sources is reduced by removing all the stars below the blue line.

In the text
thumbnail Fig. 5.

Inspection of the quality of the photometric distances compared to the parallax (top, left), and as a function colour (top, right), proper motion (bottom, left), and extinction corrected G-magnitude (bottom, right). As a reference, we overlay dashed lines indicating a difference of ±10% in the distance. The quality of the photometric distances does not depend on any of the values showed here, as is to be expected.

In the text
thumbnail Fig. 6.

Chemical abundances (left) and metallicities (right) of stars included in the RPM sample as derived from cross-matching to APOGEE DR16 and to LAMOST DR4. In both panels, we denote the distribution of the full samples provided by each of these surveys, after quality cuts, in black, while in blue we denote the subset that overlaps with the RPM sample. The vast majority of these stars are clearly all metal-poor, as expected for halo stars.

In the text
thumbnail Fig. 7.

Top: distribution of galactocentric distances. The dashed line indicates the location of the Sun. Bottom: distribution of G-magnitudes (left) and photometric distances (right) of the stars in the RPM sample.

In the text
thumbnail Fig. 8.

Sky-maps of RPM selected sample of halo stars. The left panel shows the full RPM sample, and the right panel shows a selection of the 35% most distant stars. Strong signatures of the Gaia scanning pattern are visible in both panels. Sources in the plane of the disc are filtered by our quality cuts. The white circles mark the location of six globular clusters that are picked up by the RPM method and which are clearly visible when zooming in.

In the text
thumbnail Fig. 9.

Spatial distribution of the RPM sample in heliocentric Cartesian coordinates. The sample is distributed uniformly, and the density of sources decreases with heliocentric distance, as is expected for a magnitude-limited sample. A combined effect of dust extinction and the scanning pattern of Gaia creates stripes. Sources with |b| < 20° have been removed (i.e., highly reddened stars located close to the disc).

In the text
thumbnail Fig. 10.

Spatial distribution in wedges of Δ=36°, projected in cylindrical heliocentric (R, Z) coordinates. Low latitude (|b|< 20°) areas are desaturated because these are likely contaminated by thick disc stars. The labels correspond to the overdensities found by De Jong et al. (2010, i.e., their Table 4): Virgo (V), Monoceros (M), (6.5, 1.5) (J; detected originally by Juric et al. 2008), and the unlabelled structures (U). These overdensities lie mostly just beyond our sample’s reach, but might become visible with a similar analysis with upcoming Gaia data releases.

In the text
thumbnail Fig. 11.

Spatial distribution of the RPM sample projected onto the XY-plane (left) and XZ-plane (right). The sample is binned in 256 × 256 bins, each bin is colour-coded by the mean extinction in G-mag. As expected, low-latitude regions are the most affected. The extinction creates interesting features as a function of galactic (i.e., the fingers of God features).

In the text
thumbnail Fig. 12.

Distribution of the mean, solar-motion-corrected v (top) and vb (bottom) velocities binned on the sky for distant halo stars (d >  6 kpc). Large-scale streaming motions as well as several small streams stand out. This map includes MSTO stars, which might have large uncertainties in the distances. See Appendix A and Fig. A.1 for a version without MSTO stars.

In the text
thumbnail Fig. 13.

Mean Pearson correlation coefficient of the proper motions projected on the sky after correction of the solar reflex motion. The signal that is present was first observed in a sample of RR-Lyrae stars (Iorio & Belokurov 2019, see their Fig. 5). A radially anisotropic halo explains the pattern, mainly caused by the debris of Gaia-Enceladus.

In the text
thumbnail Fig. 14.

Distribution of halo stars projected on the sky. Top: all of the stars in the sample with reliable distances (see Sect. 3.3). Middle: a subset of retrograde halo stars. Bottom: a subset of halo stars on radial orbits. The footprints seen in these panels are mostly due to selection effects.

In the text
thumbnail Fig. 15.

Velocity distribution of halo stars in the solar neighbourhood (distance <2 kpc) selected from the Gaia 6D sub-sample. The velocities calculated using the full phase-space information of the stars are shown in the left panels, while on the right they have been computed by setting the line-of-sight velocities to zero. Most of the structures present in the full 6D sample are strongly diluted in the 5D case.

In the text
thumbnail Fig. 16.

Velocity distributions of local (d <  2 kpc) halo stars calculated without the line-of-sight component. For each combination of the cylindrical velocity components a 2D-histogram is shown in the top row, while the asymmetry (defined as in Eq. (21)) with respect to the velocity plotted on the x-axis is shown in the middle row. We compare the maps to the asymmetry originating from a few structures that have been identified in the 6D sample in the bottom row. The asymmetry fleshes out structures that are asymmetric in either R or ϕ. Structures that are clearly present in the RPM sample, see middle row, are the Helmi Streams (green dots) and Sequoia (red and purple dots).

In the text
thumbnail Fig. 17.

Two point correlation function for stars in a local volume (d <  1 kpc). Both the RPM sample (black) and the 6D (blue) control sample are shown, in addition to a 5D version of the latter (grey). There is an excess of stars at velocity scales smaller than 100 km s−1, and at scales larger than 400 km s−1 in the 5D case. The correlation in the RPM sample is fully compatible with the correlation in the 5D version of the control sample.

In the text
thumbnail Fig. A.1.

Same as Fig. 12, but without MSTO stars.

In the text
thumbnail Fig. B.1.

Selection effects of the RPM halo-selection, illustrated with a sample of halo stars for which the full phase-space information is available (see Appendix C). The layout of the panels is based on Fig. 14. The left columns shows the distribution on the sky of halo stars selected using full phase-space information. On the right, we show the distribution of a halo sample selected without the line-of-sight velocity.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.