Issue | A&A, Volume 688, August 2024
---|---
Article Number | A34
Number of page(s) | 28
Section | Catalogs and data
DOI | https://doi.org/10.1051/0004-6361/202449929
Published online | 02 August 2024
TEGLIE: Transformer encoders as strong gravitational lens finders in KiDS
From simulations to surveys★
1 National Centre for Nuclear Research, ul. Pasteura 7, 02-093 Warsaw, Poland
e-mail: margherita.grespan@ncbj.gov.pl
2 Astronomical Observatory of the Jagiellonian University, Orla 171, 30-001 Cracow, Poland
3 Department of Physics and Astronomy, University of the Western Cape, Bellville, Cape Town, 7535, South Africa
4 South African Radio Astronomy Observatory, 2 Fir Street, Black River Park, Observatory 7925, South Africa
Received: 10 March 2024 / Accepted: 16 May 2024
Context. With the current and upcoming generation of surveys, such as the Legacy Survey of Space and Time (LSST) at the Vera C. Rubin Observatory and the Euclid mission, tens of billions of galaxies will be observed, of which ~10⁵ are expected to exhibit lensing features. To detect these rare objects amidst the vast number of galaxies, automated techniques such as machine learning are indispensable.
Aims. We applied a state-of-the-art transformer algorithm to the 221 deg² of the Kilo Degree Survey (KiDS) to search for new strong gravitational lenses (SGLs).
Methods. We tested four transformer encoders trained on simulated data from the Strong Lens Finding Challenge on KiDS data. The best-performing model was fine-tuned on real images of SGL candidates identified in previous searches. To expand the dataset for fine-tuning, data augmentation techniques were employed, including rotation, flipping, transposition, and white noise injection. The network fine-tuned with rotated, flipped, and transposed images exhibited the best performance and was used to hunt for SGLs in the overlapping region of the Galaxy And Mass Assembly (GAMA) and KiDS surveys on galaxies up to z = 0.8. Candidate SGLs were matched with those from other surveys and examined using GAMA data to identify blended spectra resulting from the signal from multiple objects in a GAMA fiber.
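The augmentation scheme named above (rotations, flips, transposition, and white-noise injection) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, noise scaling, and cutout size are assumptions, and the noise level is an arbitrary placeholder.

```python
import numpy as np

def augment(image, rng, noise_sigma=0.01):
    """Generate augmented copies of a 2D cutout via rotations, flips,
    transposition, and white-noise injection.
    (Illustrative sketch only; not the TEGLIE code.)"""
    variants = []
    for k in range(4):                   # 0, 90, 180, 270 degree rotations
        rot = np.rot90(image, k)
        variants.append(rot)
        variants.append(np.fliplr(rot))  # horizontal flip of each rotation
    variants.append(image.T)             # transposition
    # White-noise injection: Gaussian noise scaled to the image's std
    noisy = image + rng.normal(0.0, noise_sigma * image.std(), image.shape)
    variants.append(noisy)
    return variants

rng = np.random.default_rng(0)
cutout = rng.random((101, 101))          # stand-in for a survey cutout
augmented = augment(cutout, rng)
print(len(augmented))                    # 10 variants per input image
```

Each input image thus yields ten training samples, which is the usual way a small set of confirmed real lenses is stretched into a fine-tuning set large enough to adapt a network trained on simulations.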
Results. Fine-tuning the transformer encoder on the KiDS data reduced the number of false positives by 70%. Additionally, applying the fine-tuned model to a sample of ~5 000 000 galaxies produced a list of ~51 000 SGL candidates. Upon visual inspection, this list was narrowed down to 231 candidates. Combined with the SGL candidates identified during model testing, our final sample comprises 264 candidates, including 71 high-confidence SGLs; of these 71, 44 are new discoveries.
Conclusions. We propose fine-tuning on real augmented images as a viable approach to mitigating false positives when transitioning from simulated lenses to real surveys. While our model shows improvement, it still does not achieve the same accuracy as previously proposed models trained directly on galaxy images from KiDS with added simulated lensing arcs. This suggests that a larger fine-tuning set is necessary for competitive performance. Additionally, we provide a list of 121 false positives that exhibit features similar to lensed objects, which can be used in the training of future machine learning models in this field.
Key words: gravitational lensing: strong / methods: data analysis / catalogs
Full Table 6 is available at the CDS via anonymous ftp to cdsarc.cds.unistra.fr (130.79.128.5) or via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/688/A34
© The Authors 2024
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.