Machine learning techniques to select Be star candidates
An application in the OGLE-IV Gaia south ecliptic pole field
1 Universidad de los Andes, Departamento de Física, Cra. 1 No. 18A-10, Bloque Ip, A.A., 4976 Bogotá, Colombia
e-mail: firstname.lastname@example.org; email@example.com; firstname.lastname@example.org
2 Universidad de los Andes, Departamento de Matemáticas, Cra. 1 No. 18A-10, Edificio H, Bogotá, Colombia
3 Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Science Park 105-107, 1098 XG Amsterdam, The Netherlands
4 Instituto de Astronomía, Universidad Nacional Autónoma de México, Unidad Académica en Ensenada, Ensenada BC 22860, México
Received: 15 May 2016
Accepted: 23 June 2017
Context. Optical and infrared variability surveys produce a large number of high-quality light curves. Statistical pattern recognition methods have provided competitive solutions for variable star classification at a relatively low computational cost. To perform supervised classification, a set of features is proposed and used to train an automatic classification system. Previous studies have chosen as features quantities related to the magnitude density of the light curves and their Fourier coefficients. However, some of these features are not robust to outliers, and calculating Fourier coefficients is computationally expensive for large data sets.
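To illustrate the robustness issue, the sketch below compares a non-robust spread estimate (the standard deviation) with a robust one (the median absolute deviation) on a toy light curve with and without a single spurious point. The magnitudes and the function names `std_dev` and `mad` are invented for illustration; these two statistics are not claimed to be the features proposed in this work.

```python
import statistics


def std_dev(mags):
    """Population standard deviation of the magnitudes (not robust)."""
    return statistics.pstdev(mags)


def mad(mags):
    """Median absolute deviation from the median (robust to outliers)."""
    med = statistics.median(mags)
    return statistics.median([abs(m - med) for m in mags])


# Hypothetical magnitudes of a nearly constant star (illustrative values only).
clean = [15.00, 15.02, 14.98, 15.01, 14.99, 15.03, 14.97]

# Same light curve with one spurious measurement replacing the last point.
contaminated = clean[:-1] + [12.00]

for name, mags in (("clean", clean), ("contaminated", contaminated)):
    print(f"{name:>12}: std = {std_dev(mags):.3f}  MAD = {mad(mags):.3f}")
```

In this toy case a single bad point inflates the standard deviation by more than an order of magnitude, while the median absolute deviation is unchanged.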
Aims. We propose and evaluate the performance of a new robust set of features using supervised classifiers in order to look for new Be star candidates in the OGLE-IV Gaia south ecliptic pole field.
Methods. We calculated the proposed set of features on six types of variable stars and also on a set of Be star candidates reported in the literature. We evaluated the performance of these features using classification trees and random forests along with the K-nearest neighbours, support vector machines, and gradient boosted trees methods. We tuned the classifiers with 10-fold cross-validation and a grid search. We then validated the performance of the best classifier on a set of OGLE-IV light curves and applied it to find new Be star candidates.
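The tuning step can be sketched as follows: for each candidate hyperparameter value, the mean accuracy over 10 held-out folds is computed, and the value with the highest mean is kept. This is a minimal standard-library sketch, not the pipeline used in this work. It assumes a K-nearest neighbours classifier tuned over the number of neighbours, unstratified folds, and synthetic two-feature data; the class labels, the data, and the names `knn_predict`, `cross_val_accuracy`, and `grid_search` are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, n_folds=10, seed=0):
    """Mean held-out accuracy of k-NN over n_folds shuffled folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold
        )
        scores.append(correct / len(fold))
    return sum(scores) / n_folds


def grid_search(X, y, k_grid, n_folds=10):
    """Evaluate every k in the grid; return the best k and all scores."""
    results = {k: cross_val_accuracy(X, y, k, n_folds) for k in k_grid}
    best_k = max(results, key=results.get)
    return best_k, results


# Synthetic, well-separated two-class data (hypothetical features and labels).
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
X += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(50)]
y = ["other"] * 50 + ["Be"] * 50

k_grid = [1, 3, 5, 7]
best_k, results = grid_search(X, y, k_grid)
print("best k:", best_k, "mean accuracy:", round(results[best_k], 3))
```

The same loop applies to any classifier and any grid. For imbalanced classes, stratified folds would be a natural refinement.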
Results. The random forest classifier outperformed the others. By using the random forest classifier and colour criteria we found 50 Be star candidates in the direction of the Gaia south ecliptic pole field, four of which have infrared colours that are consistent with Herbig Ae/Be stars.
Conclusions. Supervised methods are useful for obtaining preliminary samples of variable stars from large databases. As usual, the stars classified as Be star candidates must be checked against the colours and spectroscopic characteristics expected for them.
Key words: methods: statistical / stars: variables: general / stars: emission-line, Be / catalogs
© ESO, 2017