Volume 650, June 2021
Number of page(s): 13
Section: Numerical methods and codes
Published online: 15 June 2021
pyUPMASK: an improved unsupervised clustering algorithm
Instituto de Astrofísica de La Plata (IALP-CONICET), 1900 La Plata, Argentina
2 CENTRA, Faculdade de Ciências, Universidade de Lisboa, Ed. C8, Campo Grande, 1749-016 Lisboa, Portugal
3 Facultad de Ciencias Exactas, Ingeniería y Agrimensura (UNR), 2000 Rosario, Argentina
4 Instituto de Física de Rosario (CONICET-UNR), 2000 Rosario, Argentina
5 Facultad de Ciencias Astronómicas y Geofísicas (UNLP-IALP-CONICET), 1900 La Plata, Argentina
Accepted: 24 March 2021
Aims. We present pyUPMASK, an unsupervised clustering method for stellar clusters that builds upon the original UPMASK package. Its general approach makes it applicable to any analysis that deals with binary classes, as long as the fundamental hypotheses are met. The code is written entirely in Python and is available through a public repository.
Methods. The core of the algorithm follows the method developed in UPMASK but introduces several key enhancements. These enhancements not only make pyUPMASK more general, they also improve its performance considerably.
Results. We thoroughly tested the performance of pyUPMASK on 600 synthetic clusters affected by varying degrees of contamination by field stars. To assess the performance, we employed six different statistical metrics that measure the accuracy of probabilistic classification.
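To illustrate what a metric for the accuracy of probabilistic classification measures, the following minimal sketch computes two common ones, the Brier score and the logarithmic loss, for membership probabilities compared against known member/field labels. These two metrics and the toy data are assumptions chosen for illustration; the abstract does not name the six metrics used, and this is not the pyUPMASK code.

```python
import math


def brier_score(probs, labels):
    """Mean squared difference between each probability and its true label (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def log_loss(probs, labels, eps=1e-15):
    """Mean negative log-likelihood of the true labels given the probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        # Clip to avoid log(0) for probabilities of exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


# Hypothetical toy data: 1 = cluster member, 0 = field star.
true_labels = [1, 1, 1, 0, 0, 0]
# Two hypothetical sets of membership probabilities for the same stars.
sharp_probs = [0.9, 0.8, 0.9, 0.1, 0.2, 0.1]
vague_probs = [0.6, 0.5, 0.6, 0.4, 0.5, 0.4]

for name, probs in (("sharp", sharp_probs), ("vague", vague_probs)):
    print(name, brier_score(probs, true_labels), log_loss(probs, true_labels))
```

For both metrics, lower values indicate probabilities that agree more closely with the true classes, so the sharper assignment scores better than the vague one.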
Conclusions. Our results show that pyUPMASK outperforms UPMASK on every statistical performance metric while also being many times faster.
Key words: open clusters and associations: general / methods: data analysis / open clusters and associations: individual: NGC 2516 / methods: statistical
© ESO 2021