Volume 565, May 2014
Number of page(s): 4
Section: Numerical methods and codes
Published online: 06 May 2014
A fast version of the k-means classification algorithm for astronomical applications
1 Instituto de Física de Cantabria, Avenida de los Castros, s/n
2 Departamento de Astrofísica, Universidad de La Laguna, 38205, La Laguna, Tenerife, Spain
3 Instituto de Astrofísica de Canarias, 38205, La Laguna, Tenerife, Spain
Accepted: 28 March 2014
Context. K-means is a clustering algorithm that has been used to classify large datasets in astronomical databases. It is an unsupervised method, able to cope with very different types of problems.
Aims. We check whether a variant of the algorithm called single-pass k-means can be used as a fast alternative to the traditional k-means.
Methods. The execution times of the two algorithms are compared when classifying subsets drawn from the SDSS-DR7 catalog of galaxy spectra.
Results. Single-pass k-means turns out to be between 20% and 40% faster than k-means and provides statistically equivalent classifications. This conclusion can be scaled up to larger databases because the execution time of both algorithms increases linearly with the number of objects.
Conclusions. Single-pass k-means can be safely used as a fast alternative to k-means.
Key words: astronomical databases: miscellaneous / methods: data analysis / methods: statistical
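The abstract does not spell out how the two algorithms differ, so the following Python sketch is only illustrative. It assumes that the traditional k-means recomputes the cluster centers once per full pass through the data, whereas the single-pass variant updates the affected centers immediately each time an object changes cluster. The function names `batch_kmeans` and `incremental_kmeans`, the running-mean update, and the toy two-blob dataset are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def batch_kmeans(data, seeds, max_iter=100):
    """Traditional k-means: centers are recomputed only after every
    object has been assigned (one update per full pass)."""
    centers = seeds.astype(float).copy()
    labels = np.full(len(data), -1)
    for _ in range(max_iter):
        # Squared distance from every object to every center.
        dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no object changed cluster: converged
        labels = new_labels
        for k in range(len(centers)):
            members = data[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels, centers


def incremental_kmeans(data, seeds, max_iter=100):
    """Assumed single-pass-style variant: the centers involved are
    updated as soon as an object changes cluster (running means)."""
    centers = seeds.astype(float).copy()
    labels = np.full(len(data), -1)
    counts = np.zeros(len(centers), dtype=int)
    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(data):
            new = ((centers - x) ** 2).sum(axis=1).argmin()
            old = labels[i]
            if new == old:
                continue
            changed = True
            if old >= 0:
                # Remove x from the running mean of its old cluster.
                counts[old] -= 1
                if counts[old] > 0:
                    centers[old] -= (x - centers[old]) / counts[old]
            # Add x to the running mean of its new cluster. The first
            # object to join a cluster replaces the seed entirely.
            counts[new] += 1
            centers[new] += (x - centers[new]) / counts[new]
            labels[i] = new
        if not changed:
            break
    return labels, centers


# Toy data (hypothetical): two well-separated blobs in two dimensions.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                  rng.normal(5.0, 0.1, (50, 2))])
seeds = data[[0, 50]]  # one seed taken from each blob

labels_batch, centers_batch = batch_kmeans(data, seeds)
labels_incr, centers_incr = incremental_kmeans(data, seeds)
print(np.array_equal(labels_batch, labels_incr))
```

On such an easy dataset both variants should recover the same two clusters. Any difference in speed would come from how many passes each needs to converge, which this sketch does not measure.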
© ESO, 2014