Issue | A&A Volume 692, December 2024 | |
---|---|---|
Article Number | A211 | |
Number of page(s) | 11 | |
Section | Cosmology (including clusters of galaxies) | |
DOI | https://doi.org/10.1051/0004-6361/202451367 | |
Published online | 13 December 2024 |
The impact of lossy data compression on the power spectrum of the high-redshift 21 cm signal with LOFAR
1 Kapteyn Astronomical Institute, University of Groningen, PO Box 800, 9700 AV Groningen, The Netherlands
2 Netherlands Institute for Radio Astronomy (ASTRON), Oude Hoogeveensedijk 4, 7991 PD Dwingeloo, The Netherlands
3 LERMA, Observatoire de Paris, PSL Research University, CNRS, Sorbonne Université, F-75014 Paris, France
⋆ Corresponding author; chege@astro.rug.nl
Received: 3 July 2024
Accepted: 7 November 2024
Context. Current radio interferometers output multi-petabyte-scale volumes of data per year, making the storage, transfer, and processing of these data a sizeable challenge. This challenge is expected to grow with next-generation telescopes such as the Square Kilometre Array (SKA), which will produce a considerably larger data volume than current instruments. Lossy compression of interferometric data post-correlation can abate this challenge, but any drawbacks from the compression should be well understood in advance.
Aims. Lossy data compression reduces the precision of data, introducing additional noise. Since high-redshift (e.g., cosmic dawn or epoch of reionization) 21 cm studies impose strict precision requirements, the impact of this effect on the 21 cm signal power spectrum statistic is investigated in a bid to rule out unwanted systematics.
Methods. We applied DYSCO visibility compression, a normalization and quantization technique designed specifically for radio interferometric data, to observed visibility datasets from the LOFAR telescope as well as to simulated ones. We analyzed the power spectrum of these data and established the level of the compression noise in the power spectrum relative to the thermal noise. We also examined the coherence behavior of the compression noise using the cross-coherence metric. Finally, to identify optimal compression settings, we compared the compression noise obtained from different settings to a nominal 21 cm signal power.
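To make the idea of "normalize, quantize, then measure the compression noise" concrete, the following is a minimal toy sketch. It is not the DYSCO algorithm and not the analysis pipeline of this work: the uniform quantizer, the RMS normalization, the function names `quantize_normalized` and `cross_coherence`, and the particular coherence formula |⟨ab*⟩|² / (⟨|a|²⟩⟨|b|²⟩) are all assumptions chosen for illustration, and the "visibilities" are simply Gaussian random numbers.

```python
import numpy as np


def quantize_normalized(vis, n_bits=8, clip=4.0):
    """Toy lossy compression (NOT DYSCO): scale by the RMS, then quantize
    the real and imaginary parts uniformly onto 2**n_bits levels."""
    scale = np.sqrt(np.mean(np.abs(vis) ** 2))   # normalization factor
    step = 2.0 * clip / (2 ** n_bits - 1)        # spacing of the levels

    def _quantize(x):
        x = np.clip(x / scale, -clip, clip)      # normalize and clip
        codes = np.round((x + clip) / step)      # integer codes 0 .. 2**n_bits - 1
        return (codes * step - clip) * scale     # decode back to data units

    return _quantize(vis.real) + 1j * _quantize(vis.imag)


def cross_coherence(a, b):
    """Assumed definition: |<a b*>|^2 / (<|a|^2> <|b|^2>), between 0 and 1."""
    num = np.abs(np.mean(a * np.conj(b))) ** 2
    den = np.mean(np.abs(a) ** 2) * np.mean(np.abs(b) ** 2)
    return num / den


rng = np.random.default_rng(0)
n = 200_000
# Hypothetical noise-like "visibilities": complex Gaussian samples.
vis = rng.normal(size=n) + 1j * rng.normal(size=n)

compressed = quantize_normalized(vis, n_bits=8)
noise = compressed - vis                          # compression noise

noise_to_signal = np.mean(np.abs(noise) ** 2) / np.mean(np.abs(vis) ** 2)
coherence = cross_coherence(noise, vis)

print(f"noise-to-signal power ratio: {noise_to_signal:.2e}")
print(f"cross-coherence of compression noise with the data: {coherence:.2e}")
```

In this toy setting the compression noise is a small fraction of the data power and its cross-coherence with the data is close to zero. The actual study makes the corresponding comparison in the power spectrum of real and simulated LOFAR data, against the thermal noise level.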
Results. From a single night of observation, we find that the noise introduced by the compression is more than five orders of magnitude lower than the thermal noise level in the power spectrum. The noise does not affect calibration. Furthermore, it remains subdominant to the noise that the nonlinear calibration algorithm itself introduces when its parameters are randomly initialized in different runs. The compression noise shows no correlation with the sky signal and has no measurable coherent component, and it therefore averages down optimally as more data are integrated. The level of compression error in the power spectrum ultimately depends on the compression settings.
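The statement that incoherent noise averages down with more data can be illustrated with a short numerical sketch. It assumes that the noise is independent from night to night and that nights are combined by a simple average; the variable names, sample sizes, and the Gaussian noise model are hypothetical and are not taken from the LOFAR analysis. Under these assumptions, the power of the averaged noise falls as 1/N for N nights.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 10_000


def mean_noise_power(n_nights):
    """Power of the night-averaged noise, assuming independent
    complex Gaussian noise on every night (no coherent component)."""
    shape = (n_nights, n_samples)
    noise = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    averaged = noise.mean(axis=0)                # combine the nights
    return np.mean(np.abs(averaged) ** 2)


p1 = mean_noise_power(1)
p100 = mean_noise_power(100)
ratio = p100 / p1                                # expected to be close to 1/100

print(f"noise power ratio (100 nights / 1 night): {ratio:.4f}")
```

A noise term with a coherent component would not follow this 1/N scaling, which is why the absence of measurable coherence matters for deep integrations.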
Conclusions. DYSCO visibility compression is found to be an insignificant concern for 21 cm power spectrum studies. Hence, data volumes can be safely reduced by factors of ∼4 with insignificant bias to the final power spectrum. Data from SKA-Low will likely be compressible by the same factor as data from LOFAR owing to the similarities of the two instruments. The same technique can be used to compress data from other telescopes, but a small adjustment of the compression parameters might be required.
Key words: instrumentation: interferometers / methods: data analysis / methods: observational / methods: statistical / techniques: interferometric / cosmology: observations
© The Authors 2024
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model.