| Issue | A&A, Volume 709, May 2026 |
|---|---|
| Article Number | A122 |
| Number of page(s) | 10 |
| Section | Astronomical instrumentation |
| DOI | https://doi.org/10.1051/0004-6361/202555748 |
| Published online | 07 May 2026 |
Applying vision transformers to the spectral analysis of astronomical objects
1. Harvard Extension School, Harvard University, Cambridge, MA 02138, USA
2. John A. Paulson School of Engineering and Applied Science, Harvard University, Cambridge, MA 02138, USA
3. Department of Computer Science, Universidad de Concepción, Edmundo Larenas 219, Concepción, Chile
4. Center for Data and Artificial Intelligence, Universidad de Concepción, Edmundo Larenas 310, Concepción, Chile
5. Millennium Institute of Astrophysics (MAS), Nuncio Monseñor Sotero Sanz 100, Of. 104, Providencia, Santiago, Chile
6. Millennium Nucleus for Galaxies (MINGAL), Chile
7. Heidelberg Institute for Theoretical Studies, Heidelberg, Baden-Württemberg, Germany
★ Corresponding author.
Received: 30 May 2025
Accepted: 1 March 2026
Abstract
We applied pretrained vision transformers (ViTs), originally developed for image recognition, to the analysis of astronomical spectral data. By converting traditional 1D spectra into 2D image representations, we enabled ViTs to capture both local and global spectral features through spatial self-attention. We fine-tuned a ViT pretrained on ImageNet using millions of spectra from the Sloan Digital Sky Survey (SDSS; Kollmeier et al. 2019, BAAS, 51, 274; Stoughton et al. 2002, AJ, 123, 485) and the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST; Luo et al. 2015, Res. Astron. Astrophys., 15, 1095) surveys, represented as spectral plots. We evaluated our model on key tasks, including stellar object classification and redshift (z) estimation, where it demonstrated strong performance and scalability. It achieved higher classification accuracy than support vector machines and random forests, and attained R² values comparable to those of AstroCLIP's spectrum encoder, even when generalizing across diverse object types. These results demonstrate the effectiveness of pretrained vision models for spectroscopic data analysis. To our knowledge, this is the first application of ViTs to large-scale astronomical datasets that uses real rather than synthetic spectroscopic inputs.
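The abstract's core idea, rendering a 1D spectrum as a 2D plot image so an ImageNet-pretrained ViT can process it, can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: the 224×224 image size matches standard ImageNet-pretrained ViT inputs, and the synthetic spectrum (a flat continuum plus one emission line) is invented purely for demonstration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

def spectrum_to_image(wavelength, flux, size=224):
    """Render a 1D spectrum as a (size, size, 3) uint8 RGB array,
    the input shape expected by a standard ImageNet-pretrained ViT."""
    dpi = 100
    fig = plt.figure(figsize=(size / dpi, size / dpi), dpi=dpi)
    ax = fig.add_axes([0, 0, 1, 1])  # no margins: the plot fills the image
    ax.plot(wavelength, flux, linewidth=0.5, color="black")
    ax.axis("off")
    fig.canvas.draw()
    img = np.asarray(fig.canvas.buffer_rgba())[..., :3]  # drop alpha channel
    plt.close(fig)
    return img

# Toy spectrum: unit continuum plus a Gaussian emission line near H-alpha
wl = np.linspace(4000, 9000, 3000)        # wavelength grid in Angstroms
flux = 1.0 + 0.8 * np.exp(-0.5 * ((wl - 6563) / 20) ** 2)
img = spectrum_to_image(wl, flux)
print(img.shape)  # (224, 224, 3)
```

The resulting array can be normalized and fed to any ViT that accepts 224×224 RGB input; the spatial self-attention mentioned in the abstract then operates on patches of the plotted curve, relating widely separated spectral features in a single attention step.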
Key words: methods: data analysis / methods: statistical / techniques: miscellaneous / techniques: spectroscopic
© The Authors 2026
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model.