Free Access
Issue
A&A
Volume 564, April 2014
Article Number A49
Number of page(s) 14
Section Celestial mechanics and astrometry
DOI https://doi.org/10.1051/0004-6361/201323037
Published online 03 April 2014

© ESO, 2014

1. Introduction

Open clusters have long been used as a testing ground for a large number of astronomical theories. Determining the distances to nearby open clusters is critical, since they have historically formed the first step in the calibration of the distance scale. Because all stars within an open cluster are expected to have the same age and metallicity, accurate distance estimates are highly useful in calibrating the main sequence and checking stellar evolutionary theory through comparison with theoretical isochrones.

Until recently, accurate distances to even the most nearby clusters have not been possible. The Hipparcos astrometric mission of 1989 (Perryman & ESA 1997) for the first time gave accurate parallax measurements for over one hundred thousand stars and has been used extensively to give direct distance measurements to more than 30 open clusters.

Still, many questions remain. While revolutionary in its time, the milliarcsecond astrometry and limiting magnitude (Hp < 12.5) of Hipparcos allow calculating distances to only the nearest open clusters, and even then do so with a precision no better than a few percent. Owing to the relatively small size of most open clusters compared with their distances and the precision of measurements, the Hipparcos data has been unable to give definitive answers about the internal structure and physical size of such clusters.

Additionally, the release of the Hipparcos catalogue in 1997 has led to some controversy. The most famous is the case of the Pleiades, where methods based on Hipparcos data (van Leeuwen & Hansen Ruiz 1997; Robichon et al. 1999a; Mermilliod et al. 1997) gave a distance estimate that was some 10% shorter than works based on photometric methods (Pinsonneault et al. 1998; Robichon et al. 1999b; Stello & Nissen 2001) (see Sect. 6).

With the launch of Gaia, the European Space Agency’s second major astrometric satellite, the situation is set to change. Building on the success of Hipparcos, Gaia will provide micro-arcsecond astrometric precision and a limiting magnitude of 20, which will revolutionise many aspects of astronomy.

With the above in mind, it is apparent that a new method is required that is capable of squeezing the maximum precision from the currently available data and is capable of utilising information from the full range of observed quantities (astrometric, photometric, and kinematic information). This will be particularly true after the launch of Gaia, which will produce a rich dataset that will include not only accurate parallax measurements, but also photometry at millimag precision and a full set of kinematics obtained from proper motions combined with radial velocity measurements from the on-board radial velocity spectrometer (for stars with GRVS < 17).

The use of trigonometric parallaxes can be problematic, and care must be taken to account for effects such as the Lutz & Kelker (1973) or Malmquist (1936) biases, sample selection effects and non-linear transformations, such as those highlighted by Arenou & Luri (1999) and Arenou & Luri (2002).

In Sect. 2 the rationale behind the method is described, followed by the exact mathematical formulation in Sect. 3. A description of the observational data used is given in Sect. 4, and the results of application of the method to the Pleiades and Hyades given in Sects. 5 and 7. The possible effects of correlated errors in the Hipparcos catalogue are discussed in Sect. 6. The use of the method with Gaia data is tested using simulations in Sect. 8.

2. Methods

Maximum likelihood estimation (MLE) has been used in relation to open clusters since 1958, where Vasilevskis et al. (1958) used MLE to perform cluster membership selection from proper motions.

For the Hyades and Pleiades, Chen and Zhao used a combination of the convergent point method with MLE in order to simultaneously determine mean parallax and kinematics for the Hyades (Zhao & Chen 1994) and later the Pleiades (Li & Junliang 1999). These early approaches applied the principles of MLE in its basic form, but improvements by Luri et al. (1996) resolved several shortcomings in its use. The improvements, described in full in Luri et al. (1996), have been used extensively in this work and allow the following improvements:

  • The use of numerical methods avoids the necessity ofapproximating or simplifying complex equations during theformulation of the MLE, thus avoiding any loss of precision.

  • Explicitly taking into account sample selection effects caused by observational constraints is required to correctly model the joint probability density function (PDF) of the sample. These selection effects introduce various biases into the sample, e.g. Malmquist, and their correct treatment within the formulation of the density laws avoids the need for a posteriori corrections, which are often poorly understood.

  • Through the construction of the MLE joint PDF, where all of a star’s available information is used simultaneously, posterior estimates of a star’s properties can be obtained with increased precision compared with the original data. This can be extended to calculate cluster membership probabilities without the need for external membership selection.

  • The three-dimensional spatial distribution of the stars in a cluster is modelled, along with their kinematic distribution, assuming all the members follow a single velocity ellipsoid. Additionally, their absolute magnitude is modelled assuming the stars absolute magnitudes can be described by an isochrone. This links each of a star’s observables to the properties of the cluster to which it belongs, giving tight constraints on the quantification of a clusters parameters. With a sufficient quantity of data, it is possible to extract higher order information, as can be achieved through a Bayesian hierarchical model.

3. Mathematical formulation

3.1. Definition

The MLE is a method for estimating the parameters of a statistical model. By specifying a joint PDF for all observables and taking an initial guess at the value of each parameter, the likelihood of having taken a set of observations is calculated. Maximising this function by varying the parameters results in the MLE of the parameters in the statistical model.

The likelihood function can be defined as (1)where the joint PDF is made up of the un-normalised PDF and a normalisation constant , such that (2)and θ is the vector giving the parameters of the model.

In MLE, the user is free to define any model. For reliable results, the models chosen must resemble the system being modelled, and can incorporate the a priori information one may have about the system. In our case, we have chosen that θ = (R,σR,M(V − I),σM,U,V,W,σUVW), where R is the distance to the centre of the cluster assuming a spherical Gaussian distribution, σR the intrinsic dispersion around the mean distance, M(V − I) the mean absolute magnitude as a function of colour, σM the intrinsic dispersion around the mean absolute magnitude, U, V, and W are the three components of the clusters velocity ellipsoid in galactic Cartesian coordinates, and σUVW is the intrinsic dispersion in the clusters velocity.

The vector (y = m,l,b,ϖ,μl,μδ,vr) describes the observed properties of each star, and (y0 = m0,l0,b0,r0,μα,0,μδ0,vr0) is the vector describing the “true” underlying stellar properties unaffected by observational errors.

We can then define the un-normalised1 PDF such that (3)where ϕM0, ϕrlb, ϕv are the PDFs describing the nature of the open cluster, and is the selection function, which takes the probability of observing a star into account, given the properties of the star and the instruments’ observational capabilities. To take the fact that Hipparcos is a magnitude limited sample into account, a step function is used with (4)The case is more complicatedin reality, with Hipparcos only being complete up to magnitude Hp < 7. At fainter magnitudes, stars were selected using a number of criteria and used as an input catalogue (Turon et al. 1992). Hipparcos had a physical limit in the number of stars observable in a single field of view, and it suffered from glare effects on the telescope when observing stars very close together on the sky. Therefore, decisions were made on a case-by-case basis as to which stars to observe, depending on the number and position of stars in each field of view. The Hipparcos mission pre launch status report states that: “once the list of likely cluster members had been established and the worst veiling glare cases excluded, a somewhat arbitrary compromise had to be found”. The hand selection of stars for observation makes it impossible to accurately model a selection function based on apparent magnitude that can describe the selection probability for the underlying cluster population, which does not strictly follow the definition given in Eq. (4). However, as the Hipparcos star selection mostly depended on the proximity of target stars to each other on the sky and not on magnitude, the effects on the results given in Sects. 5 and 7 are minimal.

3.2. Models

The distribution of absolute magnitudes in a given photometric band is assumed to be a Gaussian distribution around a mean colour-absolute magnitude relation, Mmean, with some dispersion, σM: (5)Here, the absolute magnitude is given by (6)where m is the apparent magnitude of the star, ϖ its parallax, and A is the interstellar extinction of the star in the same photometric band.

In the following work, the interstellar extinction is assumed to be known, although it can be left as a free fit parameter. For the Pleiades, a single extinction value of AHp = 0.0975 mag in the Hipparcos band H is used for all members. This is derived from a reddening of E(B − V) = 0.025 ± 0.003 found by Groenewegen et al. (2007), which is converted to an extinction estimate in the visual band through AV = 3.1E(B − V) and then into the Hipparcos band Hp. For the Hyades, the level of extinction is assumed to be negligible, and an extinction of AHp = 0 is assumed. If available, individual extinction estimates for each star could be used, in order to correctly take effects of differential reddening into account.

The term Mmean(V − I) gives the mean absolute magnitude of stars as a function of colour, while σM is the intrinsic dispersion. For simplicity, known binaries are removed. Additionally, the given distribution does not support the giant branch, since this would require the magnitude function to turn back on itself, giving a non-unique solution. Therefore, giants are also removed, enabling the application of this method to all clusters, irrespective of age.

The fitting procedure has two options:

  • 1.

    It fits the position of points on the colour-absolute magnitudediagram, to which a spline function is fitted in order to determine acolour-dependent absolute magnitude distributionapproximating the isochrone of the cluster; or

  • 2.

    If a theoretical isochrone is supplied as input, the shape of the magnitude distribution is taken from the isochrone. While the shape of the isochrone is conserved, the isochrone is free to be shifted in absolute magnitude.

If using the second option, the age and metallicity of the cluster must be assumed. This makes the first option preferable for clusters with little information on age and metallicity, because parameters can be determined without having to be concerned with models that depend on this information.

The spatial distribution we have assumed follows a spherical Gaussian distribution in Cartesian coordinates that, expressed in a rotated galactic coordinate system (see Appendix A), follows: (7)where the term r2cos(b′) is the Jacobian of the coordinate transformation, R the mean distance to the cluster, r the distance to the individual star, and l and b are the rotated coordinates of the star.

The Gaussian spatial distribution was chosen for ease of implementation, and as a first approximation of the cluster’s spatial structure. It is possible to use a distribution specifically suited to open clusters, such as King’s profile (King 1962), which could make the distribution more realistic. This is a possible improvement for later additions to this work.

Finally, the velocity distribution in Cartesian coordinates is defined as the velocity ellipsoid: (8)where Umean, Vmean, and Wmean are the components of the cluster’s mean velocity along each Cartesian axis.

The distribution in observational errors is given by (9)All observational errors are assumed to follow a Gaussian distribution with a variance given by the formal error ϵy. Here, ϖ is the parallax, μα and μδ the proper motions, and m the apparent magnitude. The delta function δ(m,l′,b′) describes the case for which a parameter’s observational error is small enough to be deemed negligible.

An additional benefit of using this formulation with “true” object parameters in the models and then linking these true parameters with the observed quantities is that observational data including negative parallaxes can be used directly without problems attempting to calculate the logarithm of a negative number (e.g. in Eq. (6)). The inclusion of stars with negative parallaxes is essential for avoiding biasing the sample by preferentially removing more distant stars and biasing the average distance by selecting the stars with only positive errors.

The normalisation constant required for is found by integrating the un-normalised joint probability distribution over all y: (10)The exact analytical solution has been found where possible, and the remaining integrals with no analytical solution are solved numerically (see Appendix B).

3.3. Formal errors

The Hessian matrix is constructed through numerical differentiation of the likelihood function at its global maximum. The inverse of the Hessian matrix is the covariance matrix, and the formal errors are calculated from the square root of the diagonal of the covariance matrix. After calculating the covariance matrix and formal errors, correlations between each of the parameters can easily be obtained.

3.4. Data binning

While some parameters, such as mean distance, are “global” parameters describing some general property of the open cluster, a number of the parameters show a dependence on colour. For example, the physical spatial distribution of some clusters is believed to change as a function of mass, hence colour, through the process of mass segregation.

With a sufficiently precise data set containing enough stars, it is possible to fit a smooth function describing a parameter’s colour dependence, if applicable, by estimating the parameters of, for example, a polynomial approximation of the dependence. Owing to the limited available data in the Hipparcos catalogue for the Pleiades and Hyades, there is insufficient information to constrain such distributions in all cases, so where necessary an approximation has been made by binning the data.

A star’s normalised PDF can be thought of as the sum of several other PDFs, such that (11)where is the normalisation constant and a selection function dependent on colour: (12)where w is a normalised coefficient that depends on the relative number of stars per colour bin.

The normalisation constant C is found by integrating the PDF over all y:

The strength of this approach is that parameters with a single value of interest (with no colour dependence) are described by only one parameter in θ. Where more information can be gained by producing an estimate of a parameter in each colour bin, it is possible to define a separate parameter for each colour bin.

The parameters to be estimated therefore become (15)where n is the number of bins, R the mean distance to the cluster (assuming a spherical Gaussian distribution), the intrinsic dispersion around the mean distance in each colour bin, Mn + 1 are the absolute magnitude values used for fitting the spline function of the isochrone, U, V, and W are the three components of the velocity ellipsoid, and σUVW is the velocity dispersion.

3.5. Testing with simulations

The method described in Sect. 3 was implemented and tested extensively using simulations. During development and testing of the MLE implementation, simulated catalogues were constructed using Monte Carlo techniques. Each star in the simulated population is given a position in Cartesian coordinates, which is then converted into a sky position and distance, and given a velocity in Cartesian coordinates, which is converted into proper motions and a radial velocity. The values are randomly chosen from a spherical Gaussian spatial distribution and a velocity ellipsoid, with the mean and variance of each distribution chosen by hand. The relationship between colour and absolute magnitude is assumed to be linear for simplicity.

Basic simulation of observational errors was achieved by adding an error value derived from a Gaussian random number generator to each parameter in the simulated catalogue, with the variance chosen to be a value similar to the errors found in the Hipparcos catalogue. Additionally, the more realistic AMUSE open cluster simulator (Pelupessy et al. 2013) was used to simulate open clusters with more realistic distributions, including a realistic isochone.

The final stage of testing with simulations uses a set of 500 simulated open clusters found in the Gaia Simulator (Masana et al. 2010; Robin et al. 2012) in order to test the suitability of the method in use with Gaia data (see Sect. 8). The Gaia Object Generator (Isasi et al. 2010) was used to simulate realistic errors as expected for Gaia catalogue data. Using the ML method on this simulated data showed the successful extraction of an isochrone-like sequence from error-effected data (Fig. 1) and could reproduce the input distance of the cluster, within formal error bounds and, after repeated testing, with no significant bias.

thumbnail Fig. 1

CMD for a simulated Pleiades like cluster found in the GaiaSimu library, with a distance of 2.6 kpc. Open circles show the “true” simulated absolute magnitudes without errors. Blue crosses are the points fitted with the ML method applied to the data after the simulation of observational errors. The solid line is the resulting isochrone-like sequence, showing accurate reconstruction of the isochrone-like sequence from error effected data.

Open with DEXTER

In the following sections (4, 5, 6, and 7), we apply the method to the best currently available data on the Pleiades and the Hyades. In Sect. 8 we use simulations to extrapolate the use of the method out to greater distances, in expectation of the Gaia data.

4. Data

The method described here has been applied to the new Hipparcos reduction (van Leeuwen 2007) data for 54 well known Pleiades members. The 54 cluster members are believed to be a clean sample and have been identified and used in numerous previous studies (e.g. van Leeuwen & Hansen Ruiz 1997; Robichon et al. 1999a and Makarov 2002).

The new Hipparcos reduction has several advantages over the original Hipparcos catalogue, including a reduction in the formal errors by up to a factor of 4 for the brightest stars and a claimed reduction, by up to a factor of 10, in the correlations between stars observed over small angles. This was achieved through a completely new method for formulating the Hipparcos satellite’s attitude model, replacing the previous “great-circles” reduction process with a fully iterative global solution. This new method of catalogue reconstruction supersedes an earlier attempt by Makarov (2002) to reduce correlation in the Hipparcos catalogue specifically for the Pleiades open cluster.

This reduction in correlations is particularly important in the study of open clusters. This is because correlated errors in the original Hipparcos catalogue were blamed for the cases of discrepancy between distances calculated with Hipparcos parallaxes, compared with photometric data or other methods.

Where radial velocity information has been used, it was obtained from (Mermilliod et al. 2009; Raboud & Mermilliod 1998; Morse et al. 1991; Liu et al. 1991) through the WEBDA database.

Table 1

Colour-independent results obtained for the Pleiades from the method applied to the new Hipparcos reduction.

5. Results – Pleiades

The results of the application of the method described in Sect. 3 can be seen in Tables 1 and 2. Parameters describing the general properties of the cluster, such as mean distance and mean proper motion are given in Table 1, whereas colour-dependent parameters are given in Table 2. For the latter, stars were divided into four color bins. The choice of bins is arbitrary and have been selected to give roughly the same number of stars per bin.

5.1. Distance

The mean distance to the Pleiades has been estimated to be 120.3 ± 1.5 pc. This agrees with van Leeuwen (2009), who finds 120.2 ± 2.0 pc using the same dataset. The formal errors assigned to each parameter are calculated from the square root of the covariance matrix of the likelihood function. The formal error is approximately 25% smaller than from van Leeuwen (2009), with the increased precision from the use of this method attributed to the inclusion of parallax, position, proper motion, colour, and apparent magnitude information.

The distance is given as a general property of the cluster alone (Table 1), because there is no physical reason to expect differing distances for different colour bins, unlike, for example, the size parameter, σR.

The mean distance to the cluster has been included as a colour-dependent parameter during testing, in order to check the ML method functions as expected and to check that there are no biases present in the Hipparcos catalogue. The mean distance to the cluster was found to be consistent across all colour bins, within the error bounds. This is a good test that there is no colour- dependent bias in the Hipparcos Pleiades data.

5.2. Kinematics

Testing was carried out for two cases: first where only Hipparcos data has been used, and second including the use of radial velocity data where available. Fifteen Hipparcos Pleiades members have radial velocity data (less than one third of the sample). Working in a mixed case, where some stars have radial velocity information and some do not, is possible through marginalisation of terms with missing data from the joint PDFs.

With the inclusion of radial velocity data, the variance in the space velocity has been found to be nearly 6 kms-1. We believe that a lack of homogeneity in the data is responsible for the large variance, because the data is compiled from several sources. While there is no change in the estimation of the mean distance to the cluster between the two cases, the formal errors are in fact larger with the inclusion of radial velocity data. Therefore, the lack of an accurate and homogeneous catalogue of radial velocities for a significant number of Hipparcos Pleiades stars means that radial velocity information has been ignored, and the results shown are for a fitting to a Gaussian distribution in proper motion rather than a three-dimensional space velocity.

The variance in the distribution in proper motions is found to be 1.7 ± 0.3 and 1.5 ± 0.2 mas year-1 for μα and μδ, respectively. Taking the distance to be 120.3 pc, this variance is equivalent to a variance in the velocity distribution of the cluster of 1.3 ± 0.4 km s-1.

5.3. Size

The spatial distribution of the open cluster is modelled using a spherical Gaussian distribution, with its center at some distance R and a variance around the mean distance σR. The variance has been included as a colour-dependent term and is estimated for each colour bin. The results show an increase in σR with an increase in colour (V − I), which corresponds directly to a decrease in stellar mass.

As the variance in the spatial distribution increases from 3.3 pc to 13.2 pc over the full colour range for the observed stars, we find strong evidence for mass segregation. This relationship between stellar mass and the spatial distribution of stars in a cluster has been reported for the Pleiades with a similar degree of segregation, using a star counting technique for the two-dimensional case of the cluster projected onto the plane of the sky (Converse & Stahler 2008).

Table 2

Colour-dependent results obtained for the Pleiades from the method applied to the new Hipparcos reduction.

5.4. Absolute magnitude distribution

An approximation to the isochrone of the cluster is found by fitting points in the colour-absolute magnitude diagram to find the absolute magnitude distribution of the star as a function of colour. A spline function converts these points into a smooth line, which can be thought of as the observed isochrone of the cluster. The points used for the spline function are found by the fitting method and labelled A and B for each colour bin in Table 2. The resolution of the resulting fit only depends on the number of stars, which limits the possible number of bins, and the precision of the data.

The intrinsic dispersion, σM, is the dispersion in absolute magnitude around the mean magnitude given by the magnitude distribution, over any narrow colour range (V − I) to (V − I) + δ(V − I). Since the dispersion is not constant over the whole colour range, the value of σM is given for each colour bin.

The results of the fitting can be seen in Table 2, where they have been plotted onto the colour magnitude diagram of the Pleiades in Fig. 2. The theoretical isochrone taken from the PARSEC library (Bressan et al. 2012) can be seen in green. The isochrone was generated using PARSEC v1.1, with an age of 100 Myr, and Z = 0.03.

Excluding the turn off (V − I < 0.1), the shape of the theoretical isochrone is accurately obtained by the fitting procedure (see Fig. 2). The ~0.3 mag difference in absolute magnitude between the theoretical isochrone and the sequence found by the fitting procedure is the discrepancy historically reported between Hipparcos and photometry-based methods.

Above the main sequence turn off, neither the theoretical isochrone nor the results from this work accurately fit the data. This is due to the presence of stars migrating to the giant branch.

The intrinsic dispersion of absolute magnitude around the isochrone is found for each colour bin. The large dispersion in the first bin is due to the presence of the turn off, and subsequently the dispersion around the main sequence decreases with increased V − I.

thumbnail Fig. 2

Colour-absolute magnitude diagram for the 54 Hipparcos Pleiades members. MH is the absolute magnitude in the Hipparcos photometric band calculated using the posterior parallax. The blue crosses are the results of the fitting, with the spline function giving the magnitude dependence as the thick line. The green dashed line is the theoretical isochrone generated using PARSEC v1.1, with an age of 100 Myr, and Z = 0.03 (Bressan et al. 2012).

Open with DEXTER

6. Correlations

The main argument against Hipparcos-based distance estimates of open clusters has been that Hipparcos trigonometric parallaxes have correlated errors on small angular scales. Indeed, this has been cited as the cause of the large (0.3 mag) discrepancy in the Pleiades distance between Hipparcos-based and photometry-based methods. Narayanan & Gould (1999) argue that correlated parallax errors cause a stark difference between Hipparcos trigonometric parallaxes and the Hipparcos kinematic parallaxes derived from proper motions.

thumbnail Fig. 3

Smoothed contours of the difference between the Hipparcos parallaxes and the proper-motion-based parallax (ϖHip − ϖpm) of the 54 Pleiades cluster members, with data taken from the original Hipparcos catalogue (1997). Thin contours are at intervals of 0.1 milli-arcseconds, thick contours at intervals of 1 milli-arcsecond.

Open with DEXTER

Table 3

Results obtained for the Pleiades after cutting all stars at or above the apparent location of the main sequence turn off: (V − I) > 0.1.

Table 4

Results obtained for the Pleiades after cutting all stars at or above the apparent location of the main sequence turn off.

According to Narayanan & Gould (1999), the proper-motion-based parallax can be determined from Hipparcos data using (16)where (Vt)i is the transverse velocity of the cluster in the plane of the sky at the position of the star i, Ci is the sum of the velocity dispersion tensor divided by the square of the mean distance to the cluster and the covariance matrix of the Hipparcos proper motion, and μHIP,i is the vector describing the proper motion of the star.

Narayanan and Gould used a plot showing the contours of the difference between Hipparcos parallaxes and the parallaxes derived from proper motions (Eq. (16)) to argue that the Hipparcos parallaxes are systematically larger by up to two milli-arcseconds throughout the inner 6 ° of the Pleiades. This figure has been recreated in Fig. 3 with the 54 member stars used in this work.

Figure 4 has been produced using the method described by Narayanan & Gould (1999), but using parallax data from the new Hipparcos reduction. While in the original plot from Narayanan and Gould there is clearly a region where the trigonometric parallaxes are larger than those derived from proper motions, this feature has been reduced in severity by a factor of two by using the new Hipparcos reduction.

This new figure confirms, for the case of the Pleiades, claims made by van Leeuwen (2007) that correlations in the new reduction have been reduced significantly. Additionally, this disagrees with claims that the shorter distance for the Pleiades derived from Hipparcos is due to the correlated errors in parallaxes for Pleiades stars. If this was the case, one would expect the distance estimate from the new Hipparcos reduction to be greater now that the correlations have been reduced.

In fact, the distance derived from the new reduction in this work and by van Leeuwen (2009) put the Hipparcos distance to the Pleiades at 2.5 pc longer than those derived from the original Hipparcos catalogue, a much smaller difference than the roughly 10% historic discrepancy. The method presented here has also been applied to the original Hipparcos catalogue. The difference between the estimated distance from the old and new Hipparcos reductions is only 2%. The small change in distance estimate despite the reduction in correlations by a factor of 10 implies that correlations cannot be responsible for the long-standing discrepancy.

Additionally, in the calculation of the proper-motion-based parallax, a mean distance to the cluster must be assumed. Narayanan & Gould (1999) derived a distance to the Pleiades of 131 pc using Hipparcos proper motions, which was then used in calculating the proper-motion-based parallaxes that form the basis of Fig. 3 and their argument against Hipparcos. Using the distance of 120 pc as found in this work and implied by studies using Hipparcos parallaxes, the baseline for the correlation plots is shifted, as can be seen in Fig. 5. In this final figure no significant residual biasing the results is apparent, contrary to the original claims.

thumbnail Fig. 4

Same as Fig. 3, but with parallax data taken from the new Hipparcos reduction (2007).

Open with DEXTER

thumbnail Fig. 5

Same as Fig. 4, but with an assumed mean cluster distance of 120 pc as implied by the Hipparcos data for the computation of the proper motion based parallax.

Open with DEXTER

7. Results – Hyades

This method has also been applied to the new Hipparcos reduction for the Hyades open cluster. Of the 282 potential Hyades members used by Perryman et al. (1998), a detailed study by de Bruijne et al. (2001) finds 218 probable members, which are used as the basis of this study. Radial velocities from ground-based observations have been collected in Perryman et al. (1998) and used extensively here.

The star HIP20205 was rejected as a giant, because the isochrone fitting does not currently support fitting of the giant branch.

Galli et al. (2012) reject the star HIP28774 in their analysis of the Hyades cluster due to conflicting data for this star in the old and new Hipparcos reductions. In the following sections, this star is not present in our membership list after selecting only the stars in the inner 10 pc. When using the full 218 probable members, the star is included, however its removal has no real effect on the results. This is due to the very low precision of the data on this star in the Hipparcos catalogue, and therefore its very low statistical weight within the method.

7.1. Distance

As in Perryman et al. (1998), we select only the stars within the inner 10 pc for the distance estimation, since the large spatial dispersion and presence of numerous halo stars is not modelled well by a spherical Gaussian distribution. The distance to the Hyades has been estimated as 46.35 ± 0.35 pc. This is slightly more than Perryman et al. (1998), who finds 46.34 ± 0.27 pc, and slightly smaller than van Leeuwen (2009), who finds 46.45 ± 0.50 pc. Perryman et al. (1998) used the original Hipparcos catalogue, whereas van Leeuwen (2009) used the new reduction from 2007. Both authors used differing membership selection and determination techniques.

The three-dimensional dispersion in the spatial distribution of the cluster’s core is estimated to be 3.4 ± 0.2 pc, which is consistent with the observed distribution in sky position. Assuming this dispersion in the spatial distribution, the selection of stars with cluster radius of less than 10 pc corresponds to a 3σ limit, and so the spatial distribution of the core of the cluster should not be significantly affected.

Using the ML method on the full 218 probable member stars, including halo stars, the mean distance is found to be 42.6 ± 0.5 pc. The decrease in precision with the increase in the number of stars is attributed to the clearly non-Gaussian distribution of a dense core and a large number of disperse halo stars. In this case the spatial distribution of the cluster’s core is estimated to be 15.8 ± 3.9 pc. The modelling of the spatial distribution could be improved through the use of a King’s profile (King 1962) or an exponential distribution. This is being considered for future improvements.

That member stars are found with distances from the center greater than its tidal radius of ~ 10 pc (Madsen et al. 2001) is reasonable, and is expected due to the large number of so-called halo stars. These stars have been found to exist in the Hyades at a radius of 10 to 20 pc (Brown & Perryman 1997), and although they are effected by the galactic gravitational field, they remain bound to the cluster for some significant time.

7.2. Kinematics

The space velocity of the Hyades has been found to be 46.5 ± 0.2 km s-1, with an internal dispersion on the velocity of 1.11 ± 0.05 km s-1. The internal dispersion is slightly larger than those in the literature reviewed by de Bruijne et al. (2001), who find a dispersion of ~0.3 km s-1.

As for the case of the Pleiades, the high velocity dispersion is attributed to an inhomogeneous radial velocity data. As highlighted by Brown & Perryman (1997), the radial velocity data for the Hyades comes from a combination of sources with greatly varying precision and zero points.

7.3. Absolute magnitude distribution

As described in Sect. 5.4, an estimate of the isochrone of the cluster is produced through fitting a smoothed line to the mean absolute magnitude Mmean in Eq. (5). The results of the fitting can be seen in Fig. 6, with the theoretical isochrone overlaid in green.

In contrast to the results for the Pleiades, the isochrone from the ML method and the theoretical isochrone from the PARSEC library are in strong agreement in both shape and position over most of the main sequence, with some divergence at the extreme ends of the colour range, where there are few stars to constrain the model fit.

The offset in the case of the Pleiades was caused by the long-standing discrepancy between Hipparcos and photometric methods. This is not present in the case of the Hyades, where Hipparcos-based distances generally agree with other methods.

Dispersion around the main sequence has been greatly reduced compared with computing the absolute magnitude from the data directly, by computing posterior distances for each star from the results of the fitting and the individual Hipparcos observations.

thumbnail Fig. 6

Colour-absolute magnitude diagram for the 128 Hipparcos Hyades members found in the inner 10 pc of the clusters core. MHP is the absolute magnitude in the Hipparcos band, and is calculated using the posterior parallax. The blue crosses are the results of the fitting, with the spline function giving the magnitude dependence as the thick line. The dashed line is the theoretical isochrone from the PARSEC library, with an age of 630 Myr and Z = 0.024.

Open with DEXTER

Table 5

Colour-independent results obtained from the method applied to Hyades stars in the new Hipparcos reduction.

Table 6

Colour-dependent results obtained from the method applied to the new Hipparcos reduction for the Hyades open cluster.

thumbnail Fig. 7

Results of the distance estimation to a simulated Pleiades-like cluster placed at a factor of 10 to 30 times the original distance. Filled circles show the percentage difference between the MLE-estimated distances and the true distance, open circles are the percentage difference between the inverse of the mean of the parallax and the true distance, and green shading highlights 1, 2, and 3σ errors extrapolated over the entire range.

Open with DEXTER

8. Outlook for Gaia

The method described in Sect. 3 will be particularly useful after the release of the Gaia astrometric catalogue. That the Gaia catalogue will include all of the information required for applying this method, including radial velocities for GRVS < 17, in one self-consistent catalogue makes Gaia ideal for studying open clusters to greater precision and at greater distances than was possible previously. Indeed, Gaia is expected to observe some one billion stars, including stars in numerous open clusters. This will allow application of this method to many more clusters, including those at much greater distances than was previously possible.

To test the performance of the ML method with Gaia data, a simulated open cluster was created, and simulated Gaia observational errors applied. To continue with the case study of the Pleiades, a simulated star catalogue for a Pleiades-like cluster was obtained from GaiaSimu. This (Masana et al. 2010; Robin et al. 2012) is a set of libraries containing the Gaia Universe Model and instrument models used by the Gaia Data Processing and Analysis Consortium (DPAC). It contains a database of 500 simulated open clusters, including the Pleiades, constructed from Padova isochrones and a Chabrier/Salpeter IMF.

To apply simulated Gaia observational errors, the data was processed using the Gaia Object Generator (GOG). It (Isasi et al. 2010) is a simulator of the Gaia end of mission catalogue, and is one of the major products of DPAC’s simulation efforts. It contains all of the currently available predicted error models for the Gaia satellite, and is capable of transforming an input catalogue of “true” stellar properties into simulated Gaia observations including predicted observational effects and the instrumental capabilities.

In the Gaia case, the selection function in Eq. (4) is modelled as a step function at G = 20 mag. Because Gaia is expected to be complete up to this magnitude, the step function should be a good approximation to the real case.

8.1. Pleiades with Gaia

The GaiaSimu and GOG simulated Pleiades contains some 1000 stars, placed at a distance of 130 pc and occupying the same region of the sky as the real Pleiades. With simulated parallax errors of between 10 and 100 μas, the vast majority of the star’s distances are very accurately measured.

With such a precise data set, both the estimated distance from MLE and the distance obtained directly from the mean of the parallax are both within 0.01 pc of the true value. This highlights that, for the nearest open clusters, it will be possible with Gaia data to go further beyond the current goal of determining the distance and kinematic and structural parameters, to having highly detailed information on many aspects of open clusters.

In such cases, the mean distance of a cluster determined through the ML method will not in itself be useful, although individual stars’ posterior distance estimates from the method will be unbiased and therefore preferable to distances found by inverting individual parallaxes. In terms of determining a cluster’s spatial distribution, the direct use of parallax information results in a bias in the results not present during the application of the ML method.

8.2. Distant clusters with Gaia

To test the performance of the ML method with open clusters at greater distances, the GaiaSimu simulated Pleiades was modified, increasing the distance while conserving all other properties. The open cluster was moved to a range of distances between 10 and 30 times the originally assumed distance of 130 pc (i.e. up to a distance of 4 kpc). Then the Pleiades-like clusters were processed with GOG to simulate Gaia observational errors.

Using the same simulated open cluster moved to different distances allows a direct comparison of the ML method performance at different distances. Figure 7 shows the results of the distance estimation using the ML method, showing the percentage difference between the “true” distance and the distance estimated from MLE, Δ(dreal − dMLE)/dreal, and comparing this with the percentage difference between the “true” distance with the inverse of the mean parallax, . As the distance to the cluster increases, the stars become fainter and the observational errors larger. As can be clearly seen in Fig. 7, the mean of the parallax is susceptible to large random error, in addition to Lutz-Kelker effects and other statistical biases, and is unsuitable for accurate distance determination in magnitude-limited data sets and those with significant observational errors.

As with the Hipparcos Pleiades and Hyades data, the CMD is plotted with the isochrone-like sequence obtained from the observational data and shown in Fig. 8. These plots have been created using the simulated clusters at 1300, 2600, and 3900 pc, showing that it is possible to obtain a reliable observational isochrone from Gaia observations even when individual parallaxes are strongly affected by observational errors.

thumbnail Fig. 8

CMD for the simulated Pleiades like cluster at distances of 1300 pc (top), 2600 pc (middle) and 3900 pc (bottom). The three clusters have 216, 143 and 114 observed members respectively. Open circles show the “true” simulated absolute magnitudes without errors, and filled circles show the absolute magnitudes calculated directly from the simulated observations including simulated gaia errors. Stars with negative parallaxes are omitted from the figure but included in the ML estimation. Reddening in V − I is assumed known. The blue crosses are the points fitted with the ML method and the solid line is the resulting isochrone-like sequence.

Open with DEXTER

With this simulated dataset, the ML method is confirmed as not suffering from significant statistical biases, and it is expected to perform well with real Gaia data.

8.3. Membership selection

When studying open clusters, especially those at greater distances, membership selection has always been important and problematic. When applying the ML method to distant open clusters in the Gaia catalogue, the density of stars on the sky could cause problems with misclassification and source confusion.

However, with an expected source density in the galactic plane of around 3 × 105 stars per square degree, Gaia’s windowing system for object detection is small enough to give a low probability of source confusion even when observing distant open clusters.

In terms of membership selection, the ML method’s estimation of an open cluster’s parameters can be used directly to perform membership probability tests. If the ML method is primed using a sample of probable members, a χ2 test can be applied to calculate membership probability for each star in the sample using the parallax, proper motion, and radial velocity information simultaneously: (17)where A is the vector (ϖi − ϖML,   μα,i − μα,ML,   μδ,i − μδ,ML,   vri − vr,ML), and C-1 is the sum of the catalogue’s covariance matrix and the variance on each parameter due to the clusters intrinsic dispersion in distance, proper motion, and radial velocity. Here, ϖML, μα,ML, μδ,ML, and vr,ML are the mean parallax, mean cluster proper motion and mean radial velocity of the cluster, determined from the ML method.

Stars with K2 > 16.25 are rejected, in correspondence with a χ2 test at three-sigma level with four degrees of freedom. This process should be performed using an iterative process, rejecting the worst outliers a few at a time and recalculating all fitting parameters, until no further outliers remain.

Here, the estimated cluster distance and space velocity, including intrinsic dispersion, are combined with the individual observations and their associated errors in order to distinguish between members and non-members in a single step.

Testing using a 1000-star sample of GOG simulated field stars added to the five simulated Pleiades samples used in Fig. 7, the χ2 test excluded field stars with a misclassification rate between 0.8 and 0.3%, assuming a worst case of zero radial velocity information.

9. Conclusions

An improved method for estimating the properties of open clusters has been presented, and tested using real data on two nearby and well studied open clusters. In addition to distance estimation, internal kinematics and spatial structure were probed, with mass segregation detected in the case of the Pleiades. These results confirm that the method performs as expected and highlight the potential future uses of such a method when high quality parallax information is available from the Gaia mission.

After revisiting the “Pleiades problem”, we find that an explanation cannot be found in error correlation problems in Hipparcos. Through the use of simulations we find that Gaia will measure the distance to Pleiades stars with precision of a fraction of a percent, enabling a conclusion to this long running discrepancy.

The ML method can be extended further to give more detailed information, such as including a model for cluster ellipticity and orientation. It is possible to include and compare different spatial and kinematic distributions, allowing one to test predictions on spatial structure, mass segregation, and peculiar motions, and to test for other properties such as cluster rotation. In the case of the absolute magnitude distribution, it would be possible to give age and metallicity estimates by fitting and comparing sequences of different theoretical isochrones.

Unresolved binaries, which complicate studies of open clusters, can be detected using the posterior distances calculated using the ML method and the resulting colourmagnitude diagram. It is possible to extend the method to use a distribution in absolute magnitude that is asymmetrical around the main sequence, in order to consider undetected unresolved binaries within the method.

As mentioned in Sect. 5, a lack of quality radial velocity data for Hipparcos Pleiades stars limits the application of the method in fitting the full three-dimensional kinematics of the open cluster. This is expected to change when Gaia comes to fruition, because all stars with GRVS < 17 will have radial velocity information from the on-board radial velocity spectrometer. Additionally, very high quality radial velocities for stars in more than 100 open clusters will become available through the Gaia ESO Survey (Gilmore et al. 2012), expanding the scope of the method’s application to clusters at much greater distances.


1

Although the PDF must be normalised, it is convenient to define the un-normalised PDF first, and then come back to the normalisation constant when all of the components of the PDF have been defined.

Acknowledgments

The authors would like to thank Carme Jordi for her useful comments, and Anthony Brown and Jos de Bruijne for their input on recreating the results of Narayanan & Gould (1999). We likewise thank CU2 for the use of GaiaSimu and the GOG simulator, and Erika Antiche for running GOG simulations. This work was supported by the Marie Curie Initial Training Networks grant number PITN-GA-2010-264895 ITN “Gaia Research for European Astronomy Training”, and MINECO (Spanish Ministry of Economy) – FEDER through grant AYA2009-14648-C02-01, AYA2010-12176-E, AYA2012-39551-C02-01, and CONSOLIDER CSD2007-00050.

References

Online material

Appendix A: Cartesian to galactic coordinate transformation

The spatial distribution of the cluster is given in Cartesian coordinates by a Gaussian distribution in each axis: where X0, Y0, and Z0 define the centre of the cluster in each axis, and σS gives the variance of the distribution.

To transform this Cartesian PDF into polar coordinates as required for our observables r, l, and b, we start with the relationship between our two sets of variables and find the inverse:

Then we have two equations and two unknowns: Then we require the Jacobian: r2cos(b).

By substituting the x, y, and z found above into the original PDF and multiplying by the Jacobian of the transformation, we find the PDF in the new coordinate system: (A.12)By rotating our coordinate system l,b → l′,b to align the cluster centre with the X axis, we have new coordinates l and b for all the stars. In this rotated coordinate system, the cluster has a position , and X is equivalent to the distance to the clusters centre. The above spatial probability distribution function can then be simplified as (A.13)where R is the distance to the cluster.

It should be noted that the two coordinate systems are used simultaneously. The rotated coordinates (l′,b′) are used in the integration over position to simplify the integrals as explained above. However, in the analytic solution to the integrals over μαμδvr, the unrotated coordinates l and b are used.

Appendix B: Integration of the likelihood function

To evaluate in Eq. (1) we must integrate over all y0, giving a multiple integral that can be split into three parts. First is the integral over variables with assumed zero error, second kinematics, and finally distance.

Appendix B.1: Integration over m0, l and b

As these variables have errors given by the delta function, so we can use (m,l′,b′). This avoids integrating over these three parameters.

Appendix B.2: Integration over μα,0, μδ,0 and vr0

The triple integral over μα,0, μδ,0 and vr0 is

(B.1)where (B.2)In order to perform the integral, the function ϕv(U,V,W) must be expressed in terms of μα,0, μδ,0, and vr0. This is achieved through the following expressions:

where

Therefore ϕv(U,V,W) can be written in terms of μα,0, μδ,0 and vr0 as (B.6)This can be integrated using the definite integral, (B.7)giving the solution (B.8)where K, X, Y, and Z are defined in Eq. (B.10).

Appendix B.3: Integration over R

The remaining integral has no analytical solution and will therefore be performed numerically:

Appendix C: Normalisation coefficient

Until now we have been using the un-normalised joint probability distribution. Normalisation is achieved by dividing by a normalisation constant, . The normalisation constant is found by integrating the un-normalised joint probability distribution over all y: (C.1)This integral can be performed in two parts, where I is defined such that(C.2)Substituting in the selection function and the PDF of the observational errors ℰ(y | y0) gives the following seven-dimensional integral: (C.3)This integral can be split into two parts. The integral over the delta function in ℰ(y | y0) that, by definition, gives one; and the integral over each Gaussian error, Here, θ(m − mlim) acts to provide an upper limit to the integral over all MG. Substituting I back into C we have (C.6)As with in the previous section, the integral can be split up into a number of parts.

Appendix C.1: Integration over MG

Evaluating first the integral over apparent magnitude gives (C.7)

where erfc is the complementary error function, and (C.8)

Appendix C.2: Integration over (U,V,W)

Integrating over (U,V,W) gives (C.9)

Appendix C.3: Integration over l, b, and r0

The remaining triple integral has no analytical solution and will be performed numerically: (C.10)with: .

All Tables

Table 1

Colour-independent results obtained for the Pleiades from the method applied to the new Hipparcos reduction.

Table 2

Colour-dependent results obtained for the Pleiades from the method applied to the new Hipparcos reduction.

Table 3

Results obtained for the Pleiades after cutting all stars at or above the apparent location of the main sequence turn off: (V − I) > 0.1.

Table 4

Results obtained for the Pleiades after cutting all stars at or above the apparent location of the main sequence turn off.

Table 5

Colour-independent results obtained from the method applied to Hyades stars in the new Hipparcos reduction.

Table 6

Colour-dependent results obtained from the method applied to the new Hipparcos reduction for the Hyades open cluster.

All Figures

thumbnail Fig. 1

CMD for a simulated Pleiades like cluster found in the GaiaSimu library, with a distance of 2.6 kpc. Open circles show the “true” simulated absolute magnitudes without errors. Blue crosses are the points fitted with the ML method applied to the data after the simulation of observational errors. The solid line is the resulting isochrone-like sequence, showing accurate reconstruction of the isochrone-like sequence from error effected data.

Open with DEXTER
In the text
thumbnail Fig. 2

Colour-absolute magnitude diagram for the 54 Hipparcos Pleiades members. MH is the absolute magnitude in the Hipparcos photometric band calculated using the posterior parallax. The blue crosses are the results of the fitting, with the spline function giving the magnitude dependence as the thick line. The green dashed line is the theoretical isochrone generated using PARSEC v1.1, with an age of 100 Myr, and Z = 0.03 (Bressan et al. 2012).

Open with DEXTER
In the text
thumbnail Fig. 3

Smoothed contours of the difference between the Hipparcos parallaxes and the proper-motion-based parallax (ϖHip − ϖpm) of the 54 Pleiades cluster members, with data taken from the original Hipparcos catalogue (1997). Thin contours are at intervals of 0.1 milli-arcseconds, thick contours at intervals of 1 milli-arcsecond.

Open with DEXTER
In the text
thumbnail Fig. 4

Same as Fig. 3, but with parallax data taken from the new Hipparcos reduction (2007).

Open with DEXTER
In the text
thumbnail Fig. 5

Same as Fig. 4, but with an assumed mean cluster distance of 120 pc as implied by the Hipparcos data for the computation of the proper motion based parallax.

Open with DEXTER
In the text
thumbnail Fig. 6

Colour-absolute magnitude diagram for the 128 Hipparcos Hyades members found in the inner 10 pc of the clusters core. MHP is the absolute magnitude in the Hipparcos band, and is calculated using the posterior parallax. The blue crosses are the results of the fitting, with the spline function giving the magnitude dependence as the thick line. The dashed line is the theoretical isochrone from the PARSEC library, with an age of 630 Myr and Z = 0.024.

Open with DEXTER
In the text
thumbnail Fig. 7

Results of the distance estimation to a simulated Pleiades-like cluster placed at a factor of 10 to 30 times the original distance. Filled circles show the percentage difference between the MLE-estimated distances and the true distance, open circles are the percentage difference between the inverse of the mean of the parallax and the true distance, and green shading highlights 1, 2, and 3σ errors extrapolated over the entire range.

Open with DEXTER
In the text
thumbnail Fig. 8

CMD for the simulated Pleiades like cluster at distances of 1300 pc (top), 2600 pc (middle) and 3900 pc (bottom). The three clusters have 216, 143 and 114 observed members respectively. Open circles show the “true” simulated absolute magnitudes without errors, and filled circles show the absolute magnitudes calculated directly from the simulated observations including simulated gaia errors. Stars with negative parallaxes are omitted from the figure but included in the ML estimation. Reddening in V − I is assumed known. The blue crosses are the points fitted with the ML method and the solid line is the resulting isochrone-like sequence.

Open with DEXTER
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.