Issue | A&A, Volume 695, March 2025 | |
---|---|---|
Article Number | A65 | |
Number of page(s) | 21 | |
Section | The Sun and the Heliosphere | |
DOI | https://doi.org/10.1051/0004-6361/202449671 | |
Published online | 07 March 2025 |
Bypassing the static input size of neural networks in flare forecasting by using spatial pyramid pooling
1 Solar-Terrestrial Centre of Excellence – SIDC, Royal Observatory of Belgium, Avenue Circulaire 3, 1180 Brussels, Belgium
2 Center for Mathematical Plasma Astrophysics, Department of Mathematics, University of Leuven, KULeuven, Belgium
3 Columbia Astrophysics Laboratory, Columbia University, MC 5247, 550 West 120th Street, New York, NY 10027, USA
⋆ Corresponding author; philippe.vong@oma.be
Received: 20 February 2024
Accepted: 24 January 2025
Context. The spatial extent of active regions of the Sun (and hence the size of their associated images) can vary strongly from one case to the next. This inhomogeneity is a problem when using convolutional neural networks (CNNs) to study solar flares, as they generally require input images of a fixed size. Different processes can be applied to obtain a database of homogeneously sized images, such as coarse resizing, cropping, or padding of raw images. Unfortunately, key features can be lost or distorted beyond recognition during these processes. This can degrade the ability of CNNs to classify flares of different soft X-ray classes, especially those from active regions with structures of great complexity.
Aims. This study aims to implement and test a CNN architecture that retains features of characteristic scales as fine as the original resolution of the input images.
Methods. We compared the performance of two CNN architectures for solar flare prediction. The first one is a traditional CNN with convolution layers, batch normalization layers, max pooling layers, and resized input, whereas the other implements a spatial pyramid pooling (SPP) layer instead of a max pooling layer before the flattening layer and does not resize its input. Both were trained on the Spaceweather HMI Active Region Patch (SHARP) line-of-sight magnetogram database, which was generated from data collected by the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory from May 2010 to August 2021, using only images within 45° of the central meridian of the Sun. We also studied two cases of binary classification. In the first case, our model had to distinguish active regions producing flares of class ≥C1.0 in less than 24 h from active regions producing flares in more than 24 h or never. In the second case, it had to distinguish active regions producing flares of class ≥M1.0 in less than 24 h from active regions producing flares in more than 24 h or never, or flares in less than 24 h but of class < M1.0. The impact of using a score-oriented loss (SOL) function optimizing the true skill statistics (TSS) metric instead of a binary cross-entropy (BCE) loss function is also studied and discussed in this work.
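To illustrate why an SPP layer removes the need for input resizing, the following is a minimal sketch of spatial pyramid max pooling on a single-channel feature map, written in plain Python. The function name `spp_max_pool` and the pyramid levels `(1, 2, 4)` are assumptions chosen for illustration; they are not the configuration used in this study. The point is only that the output length depends on the pyramid levels and not on the input size.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (list of rows) into a fixed-length vector.

    For each pyramid level n, the map is divided into an n x n grid of bins
    and the maximum of each bin is kept. The output length is sum(n * n)
    over all levels, whatever the height and width of the input.
    The levels used here are hypothetical, not the authors' choice.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin edges: floor for the start, ceiling for the end, so that
            # every bin contains at least one pixel even when n > height.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(
                    max(feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1))
                )
    return pooled


# Two feature maps of different sizes give vectors of the same length.
small = [[r * 5 + c for c in range(5)] for r in range(3)]    # 3 x 5
large = [[r * 40 + c for c in range(40)] for r in range(17)]  # 17 x 40
print(len(spp_max_pool(small)), len(spp_max_pool(large)))
```

In a full network this pooling would be applied to every channel of the last convolutional feature map, and the concatenated vector would feed the fully connected layers, which is what allows images of arbitrary size to share one set of dense weights.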
Results. Our models implementing an SPP layer and trained using a BCE loss function outperform the traditional CNN models, with an average increase of 0.1 in TSS and 0.17 in precision when predicting flares ≥C1.0 within 24 h. However, their performance degrades sharply, along with that of the other models studied in this paper, when trained to classify images of ≥M1.0 flares.
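For readers unfamiliar with the two metrics quoted above, the sketch below computes them from binary labels using their standard definitions: TSS = TP/(TP+FN) − FP/(FP+TN) and precision = TP/(TP+FP). The function name `tss_and_precision` and the toy labels are hypothetical and carry no relation to the results of this study; the code also assumes that both classes are present and that at least one positive prediction is made.

```python
def tss_and_precision(y_true, y_pred):
    """Return (TSS, precision) for binary labels given as 0/1 sequences.

    Assumes at least one positive and one negative in y_true, and at
    least one positive in y_pred (otherwise a division by zero occurs).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    recall = tp / (tp + fn)             # fraction of flaring cases caught
    false_alarm_rate = fp / (fp + tn)   # fraction of quiet cases flagged
    tss = recall - false_alarm_rate
    precision = tp / (tp + fp)
    return tss, precision


# Toy example: 3 flaring and 5 non-flaring samples (made-up labels).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
tss, precision = tss_and_precision(y_true, y_pred)
print(round(tss, 3), round(precision, 3))
```

Because TSS is the difference of two rates computed separately on the positive and negative classes, it is insensitive to the class imbalance ratio, whereas precision is not.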
Conclusions. We prove the higher efficiency of a CNN model that includes an SPP layer in predicting solar flares. The degradation of prediction performance of this model when the images of active regions producing a C class flare are classified as negative may be attributed to its success in identifying features that appear in active regions only a few hours before the flare, independent of their soft X-ray class. The development of explainable artificial intelligence tools adapted to this architecture in future projects will be interesting for the study of solar flare-triggering mechanisms.
Key words: methods: data analysis / Sun: activity / Sun: flares / sunspots
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.