| Issue | A&A Volume 709, May 2026 | |
|---|---|---|
| Article Number | A239 | |
| Number of page(s) | 15 | |
| Section | Numerical methods and codes | |
| DOI | https://doi.org/10.1051/0004-6361/202659116 | |
| Published online | 19 May 2026 | |
Comparative analysis of missing data imputation methods for CSST survey: Impact on photometric redshift estimation performance
1 Shanghai Key Lab for Astrophysics, Shanghai Normal University, Shanghai 200234, China
2 Center for Astronomy and Space Sciences, China Three Gorges University, Yichang 443000, PR China
3 South-Western Institute for Astronomy Research, Yunnan University, Kunming 650500, China
4 Department of Astronomy, School of Physics, Peking University, Beijing 100871, China
5 Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
6 University Observatory, Faculty of Physics, Ludwig-Maximilians-Universität, Scheinerstr. 1, 81679 Munich, Germany
7 National Astronomical Observatories, Chinese Academy of Sciences, 20A Datun Road, Chaoyang District, Beijing 100101, PR China
8 Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing 210023, China
9 School of Physics and Astronomy, Sun Yat-sen University, Zhuhai 519082, PR China
10 Shanghai Astronomical Observatory, Chinese Academy of Sciences, 80 Nandan Road, Shanghai 200030, PR China
11 CSST Science Center for the Guangdong-Hong Kong-Macau Greater Bay Area, Zhuhai 519082, PR China
12 School of Physics and Electronics, Hunan Normal University, 36 Lushan Road, Changsha 410081, China
13 Computer Network Information Center, Chinese Academy of Sciences, 2 East Kexueyuan South Road, Haidian District, Beijing 100083, PR China
14 Center for Astrophysics and Great Bay Center of National Astronomical Data Center, Guangzhou University, Guangzhou, Guangdong 510006, PR China
15 Faculty of Information Engineering and Automation, Kunming University of Science and Technology, No. 727 Jingming South Road, Kunming 650500, PR China
16 School of Astronomy and Space Science, University of Chinese Academy of Sciences, Beijing 101408, PR China
17 Key Laboratory of Space Astronomy and Technology, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China
Received: 25 January 2026
Accepted: 10 April 2026
Abstract
Improving the accuracy of photometric redshifts (photo-z) is essential for reliable statistical studies of cosmology and galaxy evolution. However, missing photometric bands are a common observational challenge that can significantly degrade photo-z estimation accuracy. In this work, we present a systematic evaluation of data imputation methods aimed at improving photo-z performance. We benchmark a range of representative machine learning and deep learning architectures, identifying k-nearest neighbors (KNN) and the attention-based SAITS model as the leading performers. These models are then applied to China Space Station Survey Telescope (CSST) mock data to assess their performance under realistic observational conditions. Our results show that KNN yields the highest accuracy under idealized missing completely at random (MCAR) conditions with complete training sets, whereas robustness tests reveal that SAITS significantly outperforms KNN when training data are incomplete or when applied to realistic mixed-mechanism scenarios. We find that domain consistency between training and testing missingness patterns is a prerequisite for optimal performance, highlighting the risks of domain shift in supervised regression tasks. Furthermore, our analysis demonstrates that while general imputation models are highly effective for MCAR and missing at random data, they are detrimental when applied to missing not at random data arising from flux limits, as statistical models fail to capture the physical information inherent in these nondetections. Consequently, we advocate for more sophisticated architectures capable of disentangling stochastic missingness from physical nondetection to address these distinct mechanisms individually.
Key words: methods: data analysis / methods: statistical / catalogs / galaxies: distances and redshifts / galaxies: photometry
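To make the missingness mechanisms named in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' pipeline. It contrasts an MCAR mask (every band dropped with the same probability) with a flux-limit mask of the missing-not-at-random kind, and fills a gap with a simple k-nearest-neighbor average. The function names (`mcar_mask`, `mnar_mask`, `knn_impute`), the toy magnitudes, and the distance definition are all assumptions chosen for illustration.

```python
import math
import random


def mcar_mask(n_rows, n_bands, frac, rng):
    """MCAR: every entry is dropped with the same probability `frac`."""
    return [[rng.random() < frac for _ in range(n_bands)] for _ in range(n_rows)]


def mnar_mask(mags, limits):
    """Flux-limit (MNAR) mask: a band is missing when the source is
    fainter (larger magnitude) than that band's limiting magnitude."""
    return [[m > lim for m, lim in zip(row, limits)] for row in mags]


def knn_impute(data, k=3):
    """Fill None entries with the mean of that band over the k nearest rows.

    The distance between two rows is the RMS difference over the bands
    observed in both rows (an illustrative choice, not the paper's).
    """
    filled = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(data):
                if other_idx == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = math.sqrt(sum(shared) / len(shared))
                    candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Toy three-band catalog (invented magnitudes); the last row lacks band 1.
catalog = [
    [20.0, 19.5, 19.0],
    [20.1, 19.6, 19.1],
    [24.0, 23.5, 23.0],
    [20.05, None, 19.05],
]
filled = knn_impute(catalog, k=2)
print(filled[3])
```

In this sketch the imputed value is drawn from the two bright neighbors, which is reasonable when the gap is random. Under a flux-limit mask the same averaging would replace a nondetection with a typical detected magnitude, discarding the information that the source is fainter than the limit; this is the failure mode the abstract describes for missing not at random data.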
© The Authors 2026
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model.