Utilization of convolutional neural networks for H I source finding

Henrik Håkansson; Anders Sjöberg; Maria Carmen Toribio; Magnus Önnheim; Michael Olberg; Emil Gustavsson; Michael Lindqvist; Mats Jirstrand; John Conway

doi:10.1051/0004-6361/202245139

Volume 671 (March 2023)

A&A, 671 (2023) A39

Open Access

Issue: A&A Volume 671, March 2023
Article Number: A39
Number of page(s): 13
Section: Numerical methods and codes
DOI: https://doi.org/10.1051/0004-6361/202245139
Published online: 03 March 2023

A&A 671, A39 (2023)

Team FORSKA-Sweden approach to SKA Data Challenge 2

Henrik Håkansson¹, Anders Sjöberg¹, Maria Carmen Toribio², Magnus Önnheim¹, Michael Olberg², Emil Gustavsson¹, Michael Lindqvist², Mats Jirstrand¹ and John Conway²

¹ Fraunhofer-Chalmers Centre & Fraunhofer Center for Machine Learning, 412 88, Gothenburg, Sweden
² Department of Space, Earth and Environment, Chalmers University of Technology, Onsala Space Observatory, 439 92 Onsala, Sweden

Received: 5 October 2022
Accepted: 17 January 2023

Abstract

Context. The future deployment of the Square Kilometer Array (SKA) will lead to a massive influx of astronomical data, so the automatic detection and characterization of sources will prove crucial in utilizing its full potential.

Aims. We examine how existing astronomical knowledge and tools can be utilized in a machine learning-based pipeline to find 3D spectral line sources.

Methods. We present a source-finding pipeline designed to detect 21-cm emission from galaxies, which produced the second-best submission to SKA Science Data Challenge 2. The first pipeline step was galaxy segmentation, performed by a convolutional neural network (CNN) that took an H I cube as input and output a binary mask separating galaxy voxels from background voxels. The CNN was trained against a target mask constructed algorithmically from the underlying source catalog of the simulation: for each source in the catalog, its listed properties were used to mask the voxels in its neighborhood that capture plausible signal distributions of the galaxy. To make the training more efficient, regions containing galaxies were oversampled relative to background regions. In the subsequent source characterization step, the final source catalog was generated from the CNN-generated mask by the merging and dilation modules of the existing source-finding software SOFIA, together with some complementary calculations. To cope with the large size of H I cubes while also allowing for deployment on various computational resources, the pipeline was implemented with flexible and configurable memory usage.
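
The oversampling of galaxy regions mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sample_patch_corner`, the probability `p_source`, and the patch geometry are all assumptions made for illustration. With probability `p_source` the training patch is placed so that it contains a cataloged source position; otherwise it is placed uniformly at random in the cube.

```python
import random


def sample_patch_corner(cube_shape, patch_shape, source_positions,
                        p_source=0.8, rng=random):
    """Return the low corner of a training patch inside the cube.

    Hypothetical helper: `p_source` is the assumed probability of
    centering the draw on a cataloged source instead of sampling
    the cube uniformly. Requires patch_shape <= cube_shape per axis.
    """
    if source_positions and rng.random() < p_source:
        # Pick a source and shift the patch by a random offset so the
        # source lands at a random voxel inside the patch.
        center = rng.choice(source_positions)
        corner = [c - rng.randrange(p) for c, p in zip(center, patch_shape)]
    else:
        # Background draw: any valid corner is equally likely.
        corner = [rng.randrange(n - p + 1)
                  for n, p in zip(cube_shape, patch_shape)]
    # Clamp so the whole patch stays inside the cube.
    return tuple(min(max(c, 0), n - p)
                 for c, n, p in zip(corner, cube_shape, patch_shape))
```

In this sketch, setting `p_source` well above the fraction of the cube occupied by galaxies makes galaxy voxels appear in training patches more often than uniform sampling would.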

Results. We show that once the segmentation CNN has been trained, the performance can be fine-tuned by adjusting the parameters involved in producing the catalog from the mask. Using different sets of parameter values offers a trade-off between completeness and reliability.
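
The trade-off between completeness and reliability can be made concrete with a small sketch. It uses the standard definitions (completeness: fraction of true sources that are recovered; reliability: fraction of detections that are real). The single `threshold` parameter and the per-detection score are hypothetical stand-ins for the catalog-production parameters, and the toy data are invented for illustration, not results from the paper.

```python
def completeness_reliability(detections, n_true, threshold):
    """Compute (completeness, reliability) for a score cut.

    detections: list of (score, is_real) pairs, where is_real marks a
        detection matched to a true source (assumed at most one
        detection per true source).
    n_true: total number of true sources in the reference catalog.
    threshold: hypothetical cut; detections scoring below it are dropped.
    """
    kept = [(score, is_real) for score, is_real in detections
            if score >= threshold]
    n_matched = sum(1 for _, is_real in kept if is_real)
    completeness = n_matched / n_true if n_true else 0.0
    # An empty catalog contains no false detections; treating it as
    # fully reliable is a convention chosen for this sketch.
    reliability = n_matched / len(kept) if kept else 1.0
    return completeness, reliability


# Invented toy catalog: three real and three spurious detections,
# with four true sources in total (one is never detected).
toy = [(0.9, True), (0.8, True), (0.6, False),
       (0.5, True), (0.3, False), (0.2, False)]
for cut in (0.7, 0.4, 0.0):
    print(cut, completeness_reliability(toy, n_true=4, threshold=cut))
```

Lowering the cut admits more detections, so completeness can only stay the same or rise, while reliability tends to fall as spurious detections enter the catalog.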

Key words: methods: data analysis / methods: statistical / radio lines: galaxies

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.
