Issue | A&A, Volume 694, February 2025
---|---
Article Number | A161
Number of page(s) | 17
Section | Astrophysical processes
DOI | https://doi.org/10.1051/0004-6361/202452489
Published online | 11 February 2025
Stellar flare morphology with TESS across the main sequence
1 Konkoly Observatory, HUN-REN Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17., H-1121 Budapest, Hungary
2 HUN-REN CSFK, MTA Centre of Excellence, Budapest, Konkoly Thege Miklós út 15-17., H-1121 Budapest, Hungary
3 Eötvös University, Department of Astronomy, Pf. 32, H-1518 Budapest, Hungary
4 Gyula Bay Zoltán Solar Observatory (GSO), Hungarian Solar Physics Foundation (HSPF), Petőfi tér 3, H-5700 Gyula, Hungary
5 Eötvös Loránd University, Institute of Physics and Astronomy, H-1117 Budapest, Hungary
⋆ Corresponding author; seli.balint@csfk.org
Received: 4 October 2024
Accepted: 16 December 2024
Context. Stellar flares are abundant in space photometric light curves. As they are now available in large enough numbers, the statistical study of their overall temporal morphology is timely.
Aims. We use light curves from the Transiting Exoplanet Survey Satellite (TESS) to study the shapes of stellar flares beyond a simple parameterization by duration and amplitude, and we reveal possible connections to astrophysical parameters.
Methods. We retrained and used the flatwrm2 long-short term memory neural network to find stellar flares in 2-min cadence TESS light curves from the first five years of the mission (sectors 1–69). We scaled these flares to a comparable standard shape and used principal component analysis to describe their temporal morphology in a concise way. We investigated how the flare shapes change along the main sequence and tested whether individual flares hold any information about their host stars. We also applied similar techniques to solar flares, using extreme ultraviolet irradiation time series.
Results. Our final catalog contains ∼120 000 flares on ∼14 000 stars. Due to the strict filtering and the final manual vetting, this sample contains virtually no false positives, although at the expense of reduced completeness. Using this flare catalog, we detected a dependence of the average flare shape on the spectral type. These changes are not apparent for individual flares; they only appear when averaging thousands of events. We find no strong clustering in the flare shape space. We have created new analytical flare templates for different types of stars, and we present a technique to sample realistic flares and a method to locate flares with similar shapes. The flare catalog along with the extracted flare shapes and the data used to train flatwrm2 are publicly available.
Key words: Sun: flares / stars: activity / stars: flare / stars: statistics
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1. Introduction
The advent of space photometry has enabled detailed statistical studies of stellar flares in volumes never seen before (e.g., Hawley et al. 2014; Davenport 2016; Yang et al. 2017; Roettenbacher & Vida 2018; Yang & Liu 2019; Günther et al. 2020; Oláh et al. 2021; Feinstein et al. 2022). The most important instruments in this regard are the Kepler space telescope (Borucki et al. 2010) and the Transiting Exoplanet Survey Satellite (TESS; Ricker et al. 2014). Kepler observed the same field over four years and has provided accurate flare statistics for thousands of stars, while TESS observes the whole sky in 27-day long sectors and provides shorter light curves of even more objects. The Kepler observatory was designed to survey a portion of the sky in order to discover Earth-like exoplanets, and its targets were mainly solar-type stars, whereas TESS observes almost the whole sky and has more late-type stars as targets, including more flaring M dwarfs.
Stellar flares are the most easily observable manifestations of magnetic activity (Pettersen 1989; Kowalski 2024). They appear on light curves as sudden bursts, lasting for minutes or hours. By using photometric data in a single filter, it is possible to determine the time of the flare peak, its amplitude, duration, and the energy released in the given filter.
Most flare studies focus on the flaring rate and energy distribution of different kinds of active stars (see, e.g., Candelaresi et al. 2014; Yang & Liu 2019; Günther et al. 2020; Yang et al. 2023b; Feinstein et al. 2022, 2024; Petrucci et al. 2024), looking for a dependence on spectral type, age, rotational period, and other stellar parameters. Other applications of basic flare properties include the search for changes in flare rate (Crowley et al. 2022), the search for periodicity in flaring times (Howard & Law 2021), or the study of waiting time and rotational phase distributions (Hawley et al. 2014; Doyle et al. 2020).
When the emphasis is on studying the temporal morphology of flares, higher-cadence observations are necessary where the flare events are resolved in time. Using space photometry, the following options are available: 1- and 30-min cadence for Kepler; 20-s and 2-min for TESS short cadence mode (for pre-selected targets); and 200-s, 10-min, and 30-min for TESS full frame images. While Kepler and TESS are the most popular space-based options, there are flare-related studies using Microvariability and Oscillations of Stars (MOST; Hunt-Walker et al. 2012; Davenport et al. 2016) and Characterising Exoplanets Satellite (CHEOPS; Bruno et al. 2024) data.
One of the most influential studies about stellar flare profiles was presented by Davenport et al. (2014). Using a 1-min cadence Kepler light curve of the M4 dwarf GJ 1243, Davenport et al. (2014) created a flare profile template combining a polynomial rise phase with a double exponential decay phase. This template was extensively used to model stellar flares observed with different instruments (e.g., Maas et al. 2022; Hübner et al. 2022; Medina et al. 2022; Murray et al. 2022; Jackman et al. 2023). An updated flare model of the same star was introduced by Mendoza et al. (2022) using the convolution of a Gaussian and a double exponential to create a profile that is differentiable at the peak. A detailed study of flare shapes was carried out by Pietras et al. (2022) on a sample of 140 000 TESS flares. They used multiple flare profile models, employing the convolution of a Gaussian and an exponential decay and combining two of these profiles. Oláh et al. (2022) contrasted flares of dwarfs and giants, and they found that while the distributions of their durations are different, their profiles are similar, at least with 30-min cadence Kepler data.
Bruno et al. (2024) comprehensively analyzed 20-s cadence TESS and 3-s cadence CHEOPS light curves and revealed that a significant fraction of flares appear complex when observed with adequate time resolution. They separated multi-peak flares to study the individual components and also identified possible quasi-periodic pulsations and a pre-flare dip. Based on TESS 20-s data, Howard & MacGregor (2022) showed that a large fraction of flares have a substructure during the rising phase, and short-period quasi-periodic pulsations are quite common. They also found that a significant fraction of flares have a gradual Gaussian peak following the primary impulsive peak.
The previous studies have presented results on large ensembles of flares, but in some cases, individual events were also analyzed, for example, in the case of quasi-periodic pulsations (Pascoe et al. 2020; Doyle et al. 2022). To study the variability of stellar flares on a time scale of seconds, there are ground-based measurements available with much smaller sample sizes (see, e.g., Kowalski et al. 2016; Aizawa et al. 2022). Notably, focusing on a limited number of targets also makes it possible to obtain multi-band or spectral data simultaneously (see, e.g., Boyd et al. 2023; Kowalski et al. 2013).
Flare profiles can also be analyzed on the Sun, where more detailed observations are possible. Kashapova et al. (2021) studied the temporal morphology of solar flares with Sun-as-a-star flux measurements from the Solar Dynamics Observatory’s Atmospheric Imaging Assembly (SDO/AIA). Gryciuk et al. (2017) studied solar X-ray flare light curves and defined the flare profile as the convolution of a Gaussian heating and an exponential decay, thus having a smoothly varying profile that fits high cadence solar observations well.
To explain the diversity of flare shapes, a few theoretical models have been proposed. Tovmassian et al. (2003) put forward a geometric model. They treated a flare as a short impulsive event that heats the base of the associated magnetic structure in the photosphere, which then radiates more gradually, giving rise to typical “peak-bump” shapes (Howard & MacGregor 2022). Then, depending on the position of the footpoint (or echo) on the stellar disk, different temporal morphologies may be observed. Yang et al. (2023a) also modeled “peak-bump” flares. They used one-dimensional hydrodynamic loop simulations and found that radiating plasma from the loop can contribute to the secondary peak in the optical.
In this work, we intend to gain further empirical insights about the temporal morphology of stellar flares. We use TESS light curves to compile a large, homogeneous, and pure sample of stellar flares. After scaling these flares to a standard shape, we apply dimensionality reduction techniques to summarize the information carried by the morphology of the flares beyond the simple parameterization by duration and amplitude. This representation of the flare shape is parameter-free, unlike models involving multiple polynomial, Gaussian, and other components. We then try to find regularities in the flare shapes, including clusters of different shapes or correlations with astrophysical parameters. Finally, we apply similar techniques for solar flares observed by the Extreme Ultraviolet Variability Experiment (EVE) instrument of SDO in order to look for obvious differences between flares produced under different conditions.
2. Data and methods
2.1. TESS
To search for stellar flares we used 2-min cadence Pre-search Data Conditioning Simple Aperture Photometry (PDCSAP) light curves provided by the Science Processing Operations Center pipeline (SPOC; Jenkins et al. 2016) from the first five years of the TESS mission, up to sector 69.
Since the goal of this work is not to compile a complete catalog of TESS flares, but to study their average shapes, we excluded the noisier light curves from the start. To this end, we smoothed each available TESS light curve with a 31-point (one hour) wide running median filter and kept it only if the ratio of the standard deviations of the smoothed and original datasets exceeded an empirically derived threshold of 0.4, indicating that astrophysical variation dominates over short time scale random noise:

$$\sigma_{\rm ratio} = \frac{\sigma_{\rm smoothed}}{\sigma_{\rm original}} > 0.4.$$
This way, based on the manually vetted training set (introduced in the following section) we can exclude ∼60% of the available TESS light curves to speed up the computation, while only losing ∼10% of the flaring stars.
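The noise cut described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used for the catalog: the function names `sigma_ratio` and `keep_light_curve`, the edge padding, and the removal of non-finite points before filtering are assumptions made here.

```python
import numpy as np

def sigma_ratio(flux, window=31):
    """Scatter of the median-smoothed flux divided by the scatter of the raw flux."""
    flux = np.asarray(flux, dtype=float)
    # Assumption: gaps (NaN) are simply dropped, so the window may span
    # slightly more than one hour across a gap.
    flux = flux[np.isfinite(flux)]
    half = window // 2
    # Repeat the edge values so that the smoothed series keeps the input length.
    padded = np.pad(flux, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    smoothed = np.median(windows, axis=1)
    return np.std(smoothed) / np.std(flux)

def keep_light_curve(flux, threshold=0.4):
    """True if slow (astrophysical) variation dominates point-to-point noise."""
    return sigma_ratio(flux) > threshold
```

For pure white noise the ratio of a 31-point running median is well below the 0.4 threshold, while a light curve dominated by rotational modulation gives a ratio close to one.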
2.2. Flare detection method
Several automated tools exist to identify stellar flares in light curves, including the use of convolutional neural networks (Feinstein et al. 2020a; Tu et al. 2022; Jia et al. 2024), Bayesian odds ratio (Pitkin et al. 2014), differencing (Bicz et al. 2022), RANdom SAmple Consensus (RANSAC; Vida & Roettenbacher 2018), multi-algorithm voting (Lin et al. 2024), and hidden Markov models (Esquivel et al. 2024; Zimmerman et al. 2024).
In this work we used flatwrm2 (Vida et al. 2021), a long short-term memory (LSTM) neural network originally developed to find flares in Kepler light curves, with an emphasis on a low rate of astrophysical false positives from known variable stars, such as RR Lyrae or eclipsing binaries.
We retrained flatwrm2 specifically to TESS 2-min cadence data with the original architecture and an augmented training set. Apart from the original training set, we added 4631 TESS light curves from sectors 1–69 with flares identified manually. These include random stars, stars that are expected to flare, and also typical false positives. We collected flaring candidates from Günther et al. (2020), previous runs of flatwrm2, and also from the following TESS Guest Observer proposals: G011266, G04039, G04139, G05105, G03227, G04051, G04234. For the false positives, we added a few hundred stars from these sources: rapidly oscillating Ap stars from Sikora et al. (2019), δ Scuti hybrids from Skarka et al. (2022), solar oscillators from Schofield et al. (2019), RR Lyrae stars from the TESS G03169, G04106 and G04184 proposals, and stars with solar system asteroids moving through the aperture (as identified by the ephemd tool, see Pál et al. 2020). Figure 1 shows a few examples of the “astrophysical noise” set. We inspected each light curve and flagged flaring points using a box selection tool, resulting in an array of ones and zeros for flaring and nonflaring points. Most of the manually vetted light curves included no flares, so to balance the training set we excluded ∼2/3 of the nonflaring stars. The final set of 4631 light curves includes the following: 50% flaring, 34% nonflaring with σratio > 0.4, 2% nonflaring with σratio < 0.4 and 14% false positives.
Fig. 1. Example light curves from the training set. The upper-left panel shows a real flare, and the others are false positives. All panels show one-day long segments.
We trained flatwrm2 on this new training set using k-fold cross-validation following the same procedures as Vida et al. (2021). We used this retrained version of flatwrm2 to find flares in the first 69 sectors of TESS 2-min data. In the following sections we describe the post-processing steps and the content of the final flare catalog.
To facilitate future data-driven efforts for stellar flare detection, we make the manually flagged light curves publicly available on Zenodo, as a series of time, flux, and 0/1 flags for each light curve. The fully trained model and the weight file are also available on GitHub.
2.3. Post-processing of the flatwrm2 results
The raw output of flatwrm2 is a flare probability time series (see Fig. 2 for an example). To extract individual flare events from this output we ran the flatwrm2 validation step (see Vida et al. 2021 for details). Running this validation step on the 444 963 TESS light curves with σratio > 0.4 resulted in 3 103 728 flare candidates.
Fig. 2. Example light curve color coded with the flatwrm2 prediction. Gray lines show the positions of the validated flares from the final catalog.
As this initial candidate list contains many false events, we filtered it based on the flare amplitude (A), the equivalent duration (ED, measured in days), and the S/N, all of which are available from flatwrm2. After manually inspecting a few hundred candidates we used the following criteria to remove smaller events and brightenings from other astrophysical sources:
- S/N > 5
- A > 0.001
- 0.001 ⋅ A < ED < 0.1 ⋅ A
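As an illustration, the three cuts listed above can be applied to arrays of candidate parameters as follows. The function name `passes_basic_cuts` and its argument names are hypothetical; the thresholds are the ones quoted in the list, with ED in days.

```python
import numpy as np

def passes_basic_cuts(snr, amplitude, ed_days):
    """Boolean mask of candidates that satisfy the S/N, amplitude, and ED cuts."""
    snr = np.asarray(snr, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    ed_days = np.asarray(ed_days, dtype=float)
    return (
        (snr > 5)
        & (amplitude > 0.001)
        & (ed_days > 0.001 * amplitude)   # lower bound on ED relative to A
        & (ed_days < 0.1 * amplitude)     # upper bound on ED relative to A
    )
```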
To clean the flare catalog further we employed three more criteria for each candidate: i) the peak must rise above 3 standard deviations from the median of the quiescent light curve, after removing a parabolic trend; ii) there must be no NaN points in the 15 min vicinity of the peak; iii) there can be no more than one point flagged with the bitmask 6591. This bitmask includes the following quality flags (Twicken et al. 2020): Attitude tweak, Safe mode, Coarse point, Earth point, Argabrightening, Desaturation event, Manual exclude, Discontinuity corrected, Straylight and Straylight2. We did not use the following quality flags, as they would sometimes remove real flare peaks, as also noted by Feinstein et al. (2020b): Impulsive outlier, Cosmic ray in collateral data, Cosmic ray in optimal aperture.
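The bitmask value follows directly from the bit values of the listed quality flags, and criteria ii) and iii) can be checked per candidate as sketched below. The helper name `passes_quality` is hypothetical, and two details are assumptions made here rather than statements from the text: the "15 min vicinity" is read as ±15 min around the peak, and the flagged points are counted in that same window.

```python
import numpy as np

# Bit values of the TESS quality flags that are used; they sum to 6591.
EXCLUDED_FLAGS = {
    "Attitude tweak": 1,
    "Safe mode": 2,
    "Coarse point": 4,
    "Earth point": 8,
    "Argabrightening": 16,
    "Desaturation event": 32,
    "Manual exclude": 128,
    "Discontinuity corrected": 256,
    "Straylight": 2048,
    "Straylight2": 4096,
}
BITMASK = sum(EXCLUDED_FLAGS.values())

def passes_quality(time, flux, quality, t_peak, half_window_min=15.0, max_flagged=1):
    """Check criteria ii) and iii) for one candidate (time in days)."""
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    quality = np.asarray(quality, dtype=np.int64)
    near = np.abs(time - t_peak) <= half_window_min / 1440.0
    no_gaps = not np.any(np.isnan(flux[near]))
    # A point counts as flagged if any of the selected bits is set.
    n_flagged = np.count_nonzero(quality[near] & BITMASK)
    return bool(no_gaps and n_flagged <= max_flagged)
```

Flags that are left out of the mask, such as the impulsive outlier bit (512), do not count against a candidate in this sketch.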
As a next step we extract and scale the flare events in time and flux. Table 1 shows the number of flare candidates after each processing step.
Table 1. Sample size after each processing step.
2.4. Flare extraction from the light curves
Once we know where the flares are, we need to extract them from the quiescent baseline variation along with some basic parameters, including amplitude, some measure of length, and ED. To model the baseline variation, local polynomial fits, linear fits (Davenport et al. 2014) and Gaussian processes are commonly used (e.g., Mendoza et al. 2022; Gilbert et al. 2022).
To start, we centered the flare peak time to zero. As there might be a slight offset from the peak time from flatwrm2, we repositioned it to the maximum in the 30 min vicinity. Then, we cut a ±0.1 days segment around it and masked out the duration of the flare (tpeak ± 30 min). We clipped the remaining segment at 2σ to remove any residual variation by the flare. Then, we fitted a polynomial to this quiescent light curve with a degree between zero and four as favored by the minimal Bayesian information criterion (BIC; Liddle 2007), defined as

$$\mathrm{BIC} = n \ln\left(\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2\right) + k \ln n,$$
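where n is the number of data points, k is the degree of freedom (polynomial degree plus one), and $y_i$ and $\hat{y}_i$ are the measured and modeled points. After removing this polynomial trend, the flare was fitted with the single-peaked flare template of Davenport et al. (2014) to estimate the t1/2 time scale of the flare, which is the full width at half maximum of the template. This template is given in the following form, after transforming the measured time t to $t' = (t - t_{\rm peak})/t_{1/2}$:

$$f(t') = A \cdot \begin{cases} 1 + 1.941\,t' - 0.175\,t'^2 - 2.246\,t'^3 - 1.125\,t'^4, & -1 \le t' \le 0, \\ 0.6890\,e^{-1.600\,t'} + 0.3030\,e^{-0.2783\,t'}, & t' > 0. \end{cases}$$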
After centering the flare peak time to zero and scaling time with the fitted t1/2, we linearly interpolated the segment to a grid of 200 points between −3 and 10 t1/2. As a final detrending step, we removed a linear fit from the interpolated segment fitted before −t1/2 and after 8t1/2. The amplitude of the flare was scaled to unity (see Fig. 3). Similar steps were followed by Oláh et al. (2022). We calculated the ED from the baseline-removed light curve using the trapezoidal rule for integration. Also, we calculated the signal-to-noise ratio (S/N) of the flare by dividing the amplitude by a local measure of scatter. To calculate the scatter, we differenced the 0.2 days vicinity of the flare, and took half the difference between the 16th and 84th percentiles (1σ in the Gaussian case).
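The selection of the baseline polynomial degree described in this section can be illustrated with a short sketch. It assumes Gaussian errors and uses the least-squares form of the criterion, BIC = n ln(RSS/n) + k ln n, with k equal to the polynomial degree plus one. The function names and the boolean `quiescent` mask argument are hypothetical, and the 2σ clipping is left to the caller.

```python
import numpy as np

def bic(y, y_model, k):
    """Bayesian information criterion for a least-squares fit with k parameters."""
    n = len(y)
    rss = np.sum((y - y_model) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

def fit_baseline(t, y, quiescent, max_degree=4):
    """Fit polynomials of degree 0..max_degree to the quiescent points
    and return the degree and coefficients with the lowest BIC."""
    tq, yq = t[quiescent], y[quiescent]
    best = None
    for degree in range(max_degree + 1):
        coeffs = np.polyfit(tq, yq, degree)
        score = bic(yq, np.polyval(coeffs, tq), degree + 1)
        if best is None or score < best[0]:
            best = (score, degree, coeffs)
    return best[1], best[2]
```

A typical call masks out the flare first, for example `quiescent = np.abs(t) > 30 / 1440` for a peak centered at zero with time in days, and then subtracts `np.polyval(coeffs, t)` from the whole segment.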
Fig. 3. Illustrative example of the extraction of a scaled flare shape. Left: Light curve segment around the flare. Gray shows the points used for the baseline fit; the red line shows the fitted polynomial. Right: Scaled flare shape. The red line shows the flare template used for the time scaling. The large black dots are from the original light curve; the small black dots are the interpolated points.
The given flare was discarded if t1/2 < 2 min or if S/N < 3, removing ∼30% of the candidates. Finally, we saved the scaled and interpolated flare shape, amplitude, t1/2, ED and S/N. The amplitude and t1/2 were also measured by flatwrm2, but we recalculated them here to match the extraction method.
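The scaling and measurement steps of this section can be summarized in the following sketch. The template coefficients are the published values of Davenport et al. (2014); everything else (function names, normalizing by the maximum of the interpolated shape, and the handling of the segment edges by `np.interp`) is an assumption made for illustration, and the fit of t1/2 itself is not shown.

```python
import numpy as np

def davenport_template(t_scaled):
    """Single-peak template of Davenport et al. (2014).
    Time is in units of t_1/2, with the peak (value 1) at zero."""
    t = np.asarray(t_scaled, dtype=float)
    rise = 1 + 1.941 * t - 0.175 * t**2 - 2.246 * t**3 - 1.125 * t**4
    decay = 0.6890 * np.exp(-1.600 * t) + 0.3030 * np.exp(-0.2783 * t)
    in_rise = (t >= -1) & (t <= 0)
    return np.where(in_rise, rise, np.where(t > 0, decay, 0.0))

def scale_flare(t, flux, t_peak, t_half, n_grid=200):
    """Interpolate a baseline-removed flare onto 200 points between -3 and 10 t_1/2."""
    grid = np.linspace(-3, 10, n_grid)
    shape = np.interp(grid, (t - t_peak) / t_half, flux)
    # Final detrending: line fitted before -t_1/2 and after 8 t_1/2.
    wings = (grid < -1) | (grid > 8)
    slope, intercept = np.polyfit(grid[wings], shape[wings], 1)
    shape = shape - (slope * grid + intercept)
    return grid, shape / shape.max()   # amplitude scaled to unity

def equivalent_duration(t, flux):
    """Trapezoidal integral of the baseline-removed, normalized flux (in units of t)."""
    return np.sum(0.5 * (flux[1:] + flux[:-1]) * np.diff(t))

def local_scatter(flux):
    """Half the 16th-84th percentile range of the differenced light curve."""
    p16, p84 = np.percentile(np.diff(flux), [16, 84])
    return 0.5 * (p84 - p16)
```

The S/N of an event is then the measured amplitude divided by `local_scatter` evaluated on the 0.2-day vicinity of the flare.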
We note that using a flare template for scaling could cause a bias in the resulting average shape, as it enforces a given shape onto the flare events. For example, using a triangle-shaped template results in “boxy” flares. However, since we are interested in the relative differences, it is not a problem as long as the same template is used for all the events.
2.5. Manual vetting
After the filtering and extraction steps described in the previous sections we are left with 148 887 scaled flare profiles. This list is already relatively pure; however, defects can occur during the extraction process, resulting in deformed flare shapes. As we are interested in subtle differences in the flare profiles, we added a final manual vetting step to the analysis. We visually inspected every single extracted flare and classified them into three groups: i) correctly extracted real flare, ii) incorrectly extracted real flare (e.g., with defects in the baseline removal, too many missing points), iii) nonflare (e.g., nova-like flickering). For each candidate, we made this judgment by plotting the 2 days, 7 hours and 4 hours vicinity of the event, and also the final scaled and interpolated profile. We use the correctly extracted real flares in the remainder of the manuscript without any modifications. We made no further attempt to correct the erroneously extracted flares, but we list their tpeak in the final catalog. The nonflares were removed immediately from the sample.
A noteworthy case is that of the complex flares. It is still debated whether these events are just by-chance alignments of individual flares, or whether there is a physical connection between them (see, e.g., Török et al. 2011), so we do not remove them from the sample by default. However, as many of them are hard to extract, a large fraction of the complex flares will be missing from the final sample anyway.
As stars on the Hertzsprung–Russell diagram are not equally likely to produce observable flares, certain types of stars, for which flaring activity is less expected, warrant extra scrutiny (see Oláh et al. 2021 for a discussion on flaring giants). We selected the following objects based on their Gaia properties and repeated the classification of their correctly extracted flares:
- hot stars: GBP − GRP < 0.5
- giant stars: 3 ⋅ (GBP − GRP) − 1.5 > MG
- subdwarfs and white dwarfs: 3 ⋅ (GBP − GRP) + 3.5 < MG.
This selection included 1433 flares, out of which we discarded 612. After this check, a few flares still remained on these objects. It is still possible that these flares do not originate from the given stars, but are from unresolved companions or other contaminating objects. However, we note that the flare amplitudes are noticeably higher on white dwarfs, which is consistent with their low luminosities producing higher flare contrast. We make no further attempt to validate these flares, as it is not the main focus of this paper, but we encourage interested readers to further examine these objects.
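The color-magnitude cuts used for this extra scrutiny are simple linear boundaries and can be written as boolean masks. The sketch below reads the selection as 3 ⋅ (GBP − GRP) − 1.5 > MG for giants and 3 ⋅ (GBP − GRP) + 3.5 < MG for subdwarfs and white dwarfs; the function name is hypothetical, and the groups are allowed to overlap.

```python
import numpy as np

def extra_scrutiny_masks(bp_rp, abs_g):
    """Return boolean masks (hot, giant, compact) from Gaia color and absolute magnitude."""
    bp_rp = np.asarray(bp_rp, dtype=float)
    abs_g = np.asarray(abs_g, dtype=float)
    hot = bp_rp < 0.5
    giant = 3 * bp_rp - 1.5 > abs_g      # brighter than the upper boundary
    compact = 3 * bp_rp + 3.5 < abs_g    # fainter than the lower boundary
    return hot, giant, compact
```

Taking the union `hot | giant | compact` gives the set of stars whose flares are inspected a second time.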
The final occurrence rate of the three categories is 83% for correctly extracted real flares, 10% for incorrectly extracted real flares, and 7% for nonflares. This leaves us with more than 120 000 flares.
2.6. Duplicate flares
The manual vetting process revealed that the catalog contains duplicate entries. This can happen for two different reasons. The first is that there are duplicate light curves in the TESS 2-min cadence data, due to the large (21″) pixel size of TESS: a bright flaring star can contaminate the neighboring pixels, causing (almost) the same flares to appear on different stars at the exact same time. Getting rid of these flares is beyond the scope of this paper. Apart from identifying the duplicates, the main challenge is deciding to which star the flare should be attributed. The solution would require the use of pixel-level data, as in the case of Tu et al. (2022) and Higgins & Bell (2023).
The second reason is that during the extraction procedure (see Sect. 2.4), the flare peak was shifted from its initial position from flatwrm2 to the light curve maximum in a 30 min window, to center the peak. This way, some neighboring flare candidates have been merged. We removed these duplicates from the catalog, totaling 1065 events, out of which 688 were correctly extracted real flares (see Sect. 2.5).
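Duplicates of this second kind share the same star and the same recentered peak time, so they can be found with a simple pass over the candidate list. The following is only a sketch: the function name and the one-minute matching tolerance are assumptions, not values from the text.

```python
def drop_merged_duplicates(tic_ids, t_peaks, tol_days=1.0 / 1440.0):
    """Return a list of booleans that is True for the first occurrence of each
    (star, recentered peak time) pair and False for later repeats."""
    seen = set()
    keep = []
    for tic, t_peak in zip(tic_ids, t_peaks):
        # Peak times are binned with the tolerance to absorb rounding differences.
        key = (tic, round(t_peak / tol_days))
        keep.append(key not in seen)
        seen.add(key)
    return keep
```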
2.7. Blacklisted objects
One needs to be careful when interested in the flares with the highest amplitude or energy. The flare amplitude (and also the ED) is measured as a flux increase compared to the quiescent level. If the quiescent level is erroneous for some reason, for example, due to defects in the background removal in the TESS photometry pipeline, one can measure extremely high flare amplitudes. This was the case for TIC 231799463 and TIC 1801578770, where the flare amplitudes ranged from a few to even a hundred times the quiescent level. We identified these objects on the TESS magnitude vs. light curve noise plot as outliers. We extracted their light curves from the target pixel files with a different pipeline (eleanor; Feinstein et al. 2019; Brasseur et al. 2019). We got a higher quiescent flux baseline and thus smaller, more realistic flare amplitudes. We removed the measured flare parameters of these two stars from the catalog. We note that there are probably more stars affected by erroneous background removal, but these two were the most extreme.
2.8. Astrophysical parameters and flare energies
We collected the following astrophysical parameters from version 8.2 of the TESS Input Catalog (TIC; Stassun et al. 2019): effective temperature (Teff), surface gravity (log g), bolometric luminosity (Lbol), Gaia DR2 (Gaia Collaboration 2018) GBP − GRP color index, Gaia DR2 MG absolute G magnitude calculated from the observed G magnitude and the parallax.
To calculate flare energies from EDs (area below the flare on the normalized light curve), we used the same approach as in Oláh et al. (2022). We collected BT-NextGen model spectra (Hauschildt et al. 1999) in the Teff and log g range of the sample with solar metallicity. For each star we selected the closest spectrum in the Teff–log g grid and integrated it over the whole wavelength range with and without convolving it with the TESS response function, so that the ratio of the TESS to bolometric luminosity can be calculated. Using Lbol from TIC, the LTESS quiescent luminosity in the TESS band can be calculated. Then, the flare energy in the TESS band is given by:

$$E_{\rm TESS} = \mathrm{ED} \cdot L_{\rm TESS}.$$
For stars with no log g in TIC we assumed log g = 4.7, the sample median. For stars with missing Lbol, we estimated it from the absolute G magnitude. These guesses are only used for the estimation of LTESS and not used anywhere else. They affected 13% of the sample and are estimated to increase the uncertainty of LTESS by ∼10%.
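As a worked example of the energy calculation, the sketch below converts an ED in days into a TESS-band energy in erg. The function name is hypothetical, the fraction of the bolometric luminosity emitted in the TESS band is passed in as an input (it would come from the model spectrum), and the nominal solar luminosity of 3.828 ⋅ 10^33 erg s^−1 is used for the unit conversion.

```python
SECONDS_PER_DAY = 86400.0
L_SUN = 3.828e33  # erg/s, nominal solar luminosity

def tess_flare_energy(ed_days, l_bol_solar, tess_fraction):
    """Flare energy in the TESS band in erg.

    ed_days       : equivalent duration in days
    l_bol_solar   : bolometric luminosity in solar units
    tess_fraction : ratio of TESS-band to bolometric luminosity
    """
    l_tess = tess_fraction * l_bol_solar * L_SUN   # quiescent TESS-band luminosity
    return ed_days * SECONDS_PER_DAY * l_tess
```

For instance, ED = 10^−4 days on a star with Lbol = 0.01 L⊙ and an assumed TESS-band fraction of 0.2 corresponds to roughly 6.6 ⋅ 10^31 erg.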
We also tried two different methods for the calculation of LTESS. One is simply using a black body spectrum with the Teff from TIC, instead of using a BT-NextGen model spectrum. The other method uses the apparent T magnitude and distance of each star to calculate LTESS directly. For this we need the apparent T magnitude and the TESS band luminosity of the Sun. Using a standard solar spectrum and the TESS response function we get $L_{\rm TESS,\odot} = 1.03 \cdot 10^{33}$ erg s$^{-1}$, and by transforming the solar Gaia magnitudes (Stassun et al. 2019) we get $T_\odot = -27.3$ mag. All three methods agree within a few percent down to Teff ≈ 4000 K, below which 20–40% differences can occur.
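The magnitude-based estimate can be written as a flux ratio relative to the Sun combined with the inverse-square law. The sketch below uses the solar values quoted above and assumes that the solar T magnitude refers to the Sun as seen from 1 au; the function name is hypothetical and extinction is ignored.

```python
import math

L_TESS_SUN = 1.03e33     # erg/s, solar luminosity in the TESS band
T_MAG_SUN = -27.3        # apparent TESS magnitude of the Sun (assumed at 1 au)
AU_PER_PC = 206264.806   # astronomical units in one parsec

def l_tess_from_magnitude(t_mag, distance_pc):
    """TESS-band luminosity in erg/s from apparent T magnitude and distance."""
    flux_ratio = 10.0 ** (-0.4 * (t_mag - T_MAG_SUN))   # stellar flux / solar flux
    distance_au = distance_pc * AU_PER_PC
    return L_TESS_SUN * flux_ratio * distance_au ** 2
```

By construction, a solar twin placed at 10 pc, which would have T ≈ 4.27 mag, returns the solar TESS-band luminosity.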
2.9. Dimensionality reduction
2.9.1. Weighted principal component analysis
To summarize the information contained in the scaled flare shapes and to visualize any trends with astrophysical parameters, we used principal component analysis (PCA; Pearson 1901, or for recent applications, see, e.g., Hajdu et al. 2018; Csörnyei et al. 2021; Seli et al. 2022). The PCA is a linear method that is able to reduce the dimensionality of the interpolated flare shapes from 200 to a few dimensions. It defines a new basis with vectors, principal components (PCs), pointing in the direction of the highest variance. Using this new basis, a large fraction of the sample variance can be recovered by using only the first few PCs. This way, it is possible to describe the shape of the flares in a model-free way, without assuming any analytical functional form.
Since the scaled shapes of longer flares are better sampled than shorter ones, they are more valuable in the analysis. Similarly, flares with a higher S/N are also more important. To account for this, we employed the weighted PCA (WPCA; Delchambre 2015). We used the following weighting factor for each flare, with t1/2 measured in days:
This is only slightly different from using uniform weights: the 1st, 50th and 99th percentiles are 0.38, 0.96 and 2.24, respectively. This means that the weights span roughly one order of magnitude. We note that WPCA could weigh each point of each flare differently, but we use uniform weights across the points of individual flares.
Figure 4 shows the first five PCs and the importance of each PC to explain the sample variance (upper right panel). The fact that all PCs are flat before −t1/2 and after 8t1/2 is an artifact of the extraction, as the final detrending line was fitted to those regions (see Sect. 2.4). It can be seen that even the first three PCs can recover 47% of the variance and that there is an “elbow point” at three PCs, after which new PCs contain less information. This suggests that the flare profiles can broadly be described by only a few parameters. There is possibly an additional elbow point at six PCs, and as we show later, it might be linked to the stellar Teff. The lower panels of Fig. 4 show the PCA reconstruction of a few flare events. For the low S/N cases, PCA also acts as a simple denoising, keeping only the more important features. Throughout the paper we use 5 PCs for visualization purposes (56% explained variance), and 20 PCs for calculations (77% explained variance, e.g., for the astrophysical parameter estimation).
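When every point of a given flare carries the same weight, a weighted PCA reduces to an eigen-decomposition of the weighted covariance matrix. The sketch below implements this special case with a singular value decomposition. It is a simplified stand-in and not the general WPCA algorithm of Delchambre (2015), and the function names are hypothetical.

```python
import numpy as np

def weighted_pca(shapes, weights, n_components):
    """PCA with one weight per flare.

    shapes  : array of shape (n_flares, n_points)
    weights : array of shape (n_flares,)
    Returns the weighted mean shape, the principal components,
    the projection of every flare, and the explained variance ratios.
    """
    X = np.asarray(shapes, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean = np.average(X, axis=0, weights=w)
    # Scaling each centered row by sqrt(w) makes X^T X the weighted scatter matrix.
    Xw = (X - mean) * np.sqrt(w)[:, None]
    _, s, vt = np.linalg.svd(Xw, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    components = vt[:n_components]
    scores = (X - mean) @ components.T
    return mean, components, scores, explained[:n_components]

def reconstruct(mean, components, scores):
    """Rebuild (denoised) flare shapes from a truncated set of PCs."""
    return mean + scores @ components
```

Calling `weighted_pca(shapes, weights, 20)` followed by `reconstruct` gives the kind of truncated reconstruction shown for the example events, and the cumulative sum of the returned ratios gives the explained variance for a given number of PCs.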
Fig. 4. Weighted PCA basis. Upper left: First five PCs, with a dashed line denoting the average flare profile. Upper right: Ratio of the sample variance that a given PC can recover. A single feature from the original 200-dimensional dataset would amount to 0.5%. Lower panels: Example light curves with the PCA reconstruction using 20 PCs.
2.9.2. Uniform manifold approximation and projection algorithm
As PCA is a linear algorithm, it might struggle to find inherently nonlinear relationships (e.g., a Swiss roll shape in 3D). A powerful nonlinear dimensionality reduction algorithm is the uniform manifold approximation and projection algorithm (UMAP; McInnes et al. 2018), which is fast and scalable to larger datasets. Other popular options include Isomap, t-SNE, and autoencoders (see Baron 2019 for an overview). Nonlinear dimensionality reduction methods can provide a more compact, lower dimensional representation of the dataset than PCA, although they are less robust, more sensitive to noise, and depend strongly on the random seed. We used UMAP only for visualization purposes.
3. TESS results
3.1. The final flare catalog
Following the filtering steps on the flatwrm2 results, our final sample includes 121 895 correctly extracted flares on 14 408 stars (see Table 1 for the sample size after each step). With the final manual vetting step, the catalog has virtually no false positives, at the expense of relatively low completeness. This makes it suitable for our purposes, but it might not be the optimal choice for many other studies (e.g., calculating flare rates, creating flare frequency distributions), as weak flares with low S/N, or even larger flares with incorrect extraction, are removed.
The stellar astrophysical parameters are presented in Table 2, and the flare catalog is presented in Table 3. These tables also include the incorrectly extracted flares identified during the manual vetting, but in those cases, we omit the t1/2, amplitude, ED and ETESS parameters, as those are likely erroneous. The scaled flare shapes are also available at Zenodo.
Table 2. Astrophysical parameters of the flaring stars.
Table 3. Final flare catalog.
Figure 5 shows the Teff distribution of the sample compared to the Teff distribution of all the stars observed by TESS (9% of which have no Teff in TICv8.2). Since the number of detectable flares depends on the total observing time, the number of one sector long light curves is also shown. It can be seen that the observed flare rate declines with Teff. However, the target selection for TESS 2-min cadence observations is not random, as stressed by Günther et al. (2020).
Fig. 5. Histogram of stars observed with TESS 2-min cadence up to sector 69. The Teff values are from TICv8.2. The distribution is truncated at 18 000 K.
Figure 6 puts the sample size into context by showing other published flare catalogs. In the lower-left corner we find more focused lists (e.g., TESS objects of interest, superflare stars), while in the upper-right corner there are more general catalogs.
Fig. 6. Sample size comparison between different stellar flare catalogs created from Kepler and TESS data. The color indicates the observing cadence. Filled circles are catalogs that are publicly available. The following catalogs are shown: Balona (2015), Davenport (2016), Van Doorsselaere et al. (2017), Roettenbacher & Vida (2018), Yang et al. (2018), Yang & Liu (2019), Feinstein et al. (2020b, 2022, 2024), Günther et al. (2020), Howard (2022), Howard & MacGregor (2022), Pietras et al. (2022), Tu et al. (2022), Yang et al. (2023b), Bruno et al. (2024), Lin et al. (2024), Zhang et al. (2024).
Figure 7 shows the Gaia color-magnitude diagram of the flaring sample. While there are a few flaring stars on the red giant branch and also a few among white dwarfs, most of them are on the main sequence (MS). The majority of the sample consists of M-dwarfs, and the rate of activity declines for earlier type stars on the MS. The unresolved binary MS is prominent, 0.75 mag above the MS. Most stars only have a few flares in the catalog: the median is 3 flares per star, and only 10% of the stars have more than 20 flares. The three stars with the largest number of detected flares are TIC 150359500, 272232401 and 220433364, with over 500 flares each. Most stars in the sample are nearby: 50% are closer than 92 pc, and 90% are closer than 202 pc.
Fig. 7. Flaring stars on the Gaia color-magnitude diagram colored with the flare rate. We note that the stars are plotted in order of their flare rates to show the most active stars on top. Gray points show all the stars prior to manual vetting (Sect. 2.5) in order to make the position of the red giant branch more discernible.
Figure 8 shows six interesting flares found during the manual vetting. They all have large amplitudes and show complex behavior, including quasi-periodic modulation. These events are also relatively long: the flare on TIC 323292484 lasted for more than a day.
Fig. 8. Some interesting complex flares identified during manual vetting. The left panels show flares with possible quasi-periodic modulation.
3.2. Binning stars on the color-magnitude diagram
In the following sections, we explore the possibility of systematic variation of flare properties with stellar astrophysical parameters. For this we need to place the flares into bins of stellar parameters.
We grouped stars into bins along the MS, based on the Gaia color-magnitude diagram. Since almost all stars in the sample are on the MS (see Fig. 7), most of the parameters that are available in bulk are tightly correlated (e.g., Teff, log g, luminosity, radius). Thus, we only show the effect of a single parameter, as many others would lead to similar results, and it is hard to distinguish which is the parameter that directly causes the change.
To trace the MS, we used the “Modern Mean Dwarf Stellar Color and Effective Temperature Sequence” from Pecaut & Mamajek (2013). We parameterized the sequence with Teff, but the binning was performed in the color-magnitude space, similar to the sample selection in Seli et al. (2021). We chose this approach over binning by the Teff values from TIC, as those are compiled from different sources and are thus less homogeneous. First, we interpolated the sequence to 100 uniform Teff values between 3000 and 6500 K. Then, for each Teff value, we linearly interpolated the corresponding Gaia (GBP − GRP)i color and MG,i absolute magnitude from the sequence and drew an ellipse on the color-magnitude diagram as follows:
The stars inside this ellipse make up the sample in the ith bin. Figure 9 illustrates this binning procedure, which we use for the basic flare parameters and the average flare shapes. In the following – where applicable – we use the color coding from Fig. 9.
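The binning procedure can be sketched as follows. This is only an illustration: the semi-axes `a_color` and `a_mag` are hypothetical placeholder values (the ellipse equation is not reproduced here), and the short sequence table is a toy stand-in for the Pecaut & Mamajek (2013) values.

```python
import numpy as np

def ellipse_bin_members(color, abs_mag, color_i, mag_i, a_color=0.1, a_mag=0.5):
    """Boolean mask of stars inside the ellipse centred on (color_i, mag_i).

    a_color and a_mag are HYPOTHETICAL semi-axes (in mag), chosen only for
    illustration; they are not the values used for the catalog.
    """
    color = np.asarray(color, dtype=float)
    abs_mag = np.asarray(abs_mag, dtype=float)
    return ((color - color_i) / a_color) ** 2 + ((abs_mag - mag_i) / a_mag) ** 2 <= 1.0

# Toy main sequence (NOT the Pecaut & Mamajek values): Teff, colour, M_G.
seq_teff = np.array([3000.0, 4000.0, 5000.0, 6500.0])
seq_color = np.array([3.0, 1.9, 1.1, 0.6])
seq_mag = np.array([12.0, 8.0, 5.8, 3.5])

# 100 uniform Teff values, then linear interpolation of colour and magnitude.
teff_grid = np.linspace(3000.0, 6500.0, 100)
color_grid = np.interp(teff_grid, seq_teff, seq_color)
mag_grid = np.interp(teff_grid, seq_teff, seq_mag)

# Three toy stars: one on the sequence, one slightly off, one far away.
star_color = np.array([3.0, 3.05, 1.0])
star_mag = np.array([12.0, 12.2, 12.0])
mask_first_bin = ellipse_bin_members(star_color, star_mag, color_grid[0], mag_grid[0])
```

With closely spaced bin centres such as these, neighbouring ellipses overlap in this sketch, so a star can contribute to more than one bin.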
Fig. 9. Binning on the Gaia color-magnitude diagram for the calculation of the average flare shapes. Around each point, the stars inside an ellipse are counted.
3.3. Basic flare properties
Before analyzing the nonparametric flare shapes, we examine the basic parameters estimated for each flare. These include A, t1/2, ED, and energy.
The t1/2 distribution of the TESS flares follows a log-normal distribution with location and scale parameters μ = 0.83 and σ = 0.21 in minutes, giving a median t1/2 of 7 minutes. The distribution is truncated at 2 minutes due to our selection criteria described in Sect. 2.4.
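As a quick consistency check, the quoted median follows from the fitted parameters if μ and σ describe the base-10 logarithm of t1/2 in minutes. That base is an assumption here, inferred from the quoted median rather than stated explicitly.

```python
# Quoted log-normal parameters, ASSUMED to describe log10(t_half / minute).
mu, sigma = 0.83, 0.21

# The median of a log-normal distribution is the back-transformed location.
median_minutes = 10.0 ** mu

# Approximate 16th and 84th percentiles (one sigma on either side in log space).
lower_minutes = 10.0 ** (mu - sigma)
upper_minutes = 10.0 ** (mu + sigma)
```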
Figure 10 shows how the basic flare parameters change across the MS, using the binning described in Sect. 3.2. We argue that the correlation between Teff and ED or amplitude (as seen in, e.g., Yang et al. 2023b or Liu et al. 2023) could arise due to the increasing luminosity with Teff on the MS, as the contrast of the flare depends on the quiescent luminosity of the star. Also, flares with smaller amplitude and ED are more numerous on any star, as they follow a power-law distribution (see, e.g., Kowalski 2024). However, due to the larger photometric errors on the fainter late type stars and the finite time resolution, the distribution of detectable flares will peak at a given ED and amplitude for stars with different Teff. The relationship between t1/2 and Teff can probably also be attributed to a sampling bias. As the Teff increases, only the more energetic flares can be detected, which will have longer durations (see, e.g., Maehara et al. 2015; Namekata et al. 2017). However, it is also consistent with the results of Balona (2015), Kővári et al. (2020) and Oláh et al. (2022), who found that flares on giant stars last longer than on the MS, as the flare duration increases with the stellar radius. Also, Reep & Airapetian (2023) showed that the flare decay time scales with the loop length in the corona, which scales with the radius of the star. On the MS, the stellar radius increases with Teff, so the right panel of Fig. 10 can also be interpreted as t1/2 increasing with the stellar radius. However, as there are only a few bona fide giant stars in our sample, it is not possible to break the degeneracy and pinpoint the underlying variable behind the correlation.
Fig. 10. Change of basic flare parameters across the MS with the same binning as in Fig. 9. The black line shows the median value, and gray shading is shown between the 16th and 84th percentiles.
Using our homogeneously estimated flare parameters, we can study the relationship between A, t1/2, and ED. The template provided by Davenport et al. (2014) can be analytically integrated to calculate the ED as follows:
$$\mathrm{ED} = 1.827 \, A \, t_{1/2}. \tag{7}$$
As a more general form, we fitted the following power law to the data in bins of stellar parameters:
$$\mathrm{ED} = \alpha \, A^{\beta} \, t_{1/2}^{\gamma}. \tag{8}$$
For the case of the analytic template of Davenport et al. (2014), α = 1.827, β = 1, and γ = 1.
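The value α = 1.827 can be checked by integrating the template directly. The sketch below uses the template coefficients as published by Davenport et al. (2014); they are quoted from that paper rather than from the text above and should be verified against it before reuse. Time is in units of t1/2 and flux in units of A.

```python
# Davenport et al. (2014) template coefficients (quoted from that paper,
# an outside input to this sketch). Time in units of t_half, flux in units of A.
RISE = [1.0, 1.941, -0.175, -2.246, -1.125]   # 1 + a1 t + ... + a4 t^4 on -1 <= t <= 0
DECAY = [(0.6890, 1.600), (0.3030, 0.2783)]   # sum of b_k * exp(-c_k t) for t > 0

# Rise: integrate the polynomial term by term from -1 to 0.
rise_area = sum(a * (0.0 - (-1.0) ** (n + 1)) / (n + 1) for n, a in enumerate(RISE))

# Decay: each exponential integrates to b / c on [0, infinity).
decay_area = sum(b / c for b, c in DECAY)

alpha = rise_area + decay_area

def equivalent_duration(amplitude, t_half):
    """ED of the analytic template; same time unit as t_half."""
    return alpha * amplitude * t_half
```

Most of the integral (about 1.52 of 1.83) comes from the decay phase, which is why changes in the late decay have a visible effect on the ED(A, t1/2) relation.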
Figure 11 shows how the fitted parameters change along the MS, using the binning described in Sect. 3.2. It hints at systematic differences in the flare shape and a more complex ED(A, t1/2) relationship than Eq. (7).
Fig. 11. Fitted parameters for the power law in the form ED = α ⋅ A^β ⋅ t1/2^γ.
3.4. Flare shape space
Each flare in the sample was extracted between −3 and 10t1/2 and contained tens to hundreds of data points in this region, with a median of 44 points. They were then linearly interpolated to 200 points, making the scaled flare shape dataset 200 dimensional, which is too high for visualization. We used WPCA to transform the data into a lower-dimensional space. Figure 12 shows a five-dimensional representation of the dataset with two-dimensional projections. The distribution in this PC space appears to be smooth, and the distributions of higher PCs appear to be more symmetrical. This symmetry suggests that the higher PCs only describe random noise and are not physically interesting.
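The scaling and dimensionality reduction steps can be sketched as follows. This is a simplified illustration on synthetic flares: it uses plain, unweighted PCA through a singular value decomposition instead of WPCA, and the function names are hypothetical.

```python
import numpy as np

N_GRID = 200
GRID = np.linspace(-3.0, 10.0, N_GRID)   # scaled time, in units of t_half

def scale_flare(time, flux, t_peak, t_half, amplitude):
    """Shift to the peak, scale by t_half and amplitude, resample onto GRID."""
    t_scaled = (np.asarray(time, dtype=float) - t_peak) / t_half
    f_scaled = np.asarray(flux, dtype=float) / amplitude
    return np.interp(GRID, t_scaled, f_scaled)

def pca_fit(shapes, n_components):
    """Plain (unweighted) PCA via SVD; returns mean, components, coefficients."""
    mean = shapes.mean(axis=0)
    _, _, vt = np.linalg.svd(shapes - mean, full_matrices=False)
    components = vt[:n_components]
    coeffs = (shapes - mean) @ components.T
    return mean, components, coeffs

# Toy sample: linear rise, exponential decay with a random rate, small noise.
rng = np.random.default_rng(0)
shapes = np.array([
    np.where(GRID < 0, np.clip(1 + GRID, 0, 1), np.exp(-GRID * rng.uniform(0.5, 1.5)))
    + rng.normal(0.0, 0.01, N_GRID)
    for _ in range(300)
])

mean, components, coeffs = pca_fit(shapes, n_components=5)
reconstructed = mean + coeffs @ components   # back-projection to the 200-point grid
```

The rows of `components` are orthonormal, so projecting a new scaled flare onto the same basis is a single matrix product with `components.T` after subtracting `mean`.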
Fig. 12. Distribution of flares in the PC space. Each panel shows a two-dimensional histogram. Below the diagonal, the shading indicates the number density of flares. Above the diagonal, the color code indicates the average Teff from TICv8.2 in each bin. The Teff dependence is most apparent in PC5.
Figure 13 shows the two-dimensional representation of the dataset with UMAP. The distribution is again smooth, with no clear sign of clustering.
Fig. 13. Two-dimensional UMAP projection of the scaled flare shapes. Each panel shows a two-dimensional histogram color-coded by the density of points, Teff, and log g from TICv8.2.
We briefly mention that these two-dimensional histograms are not the only way to visualize the flare shape space. García et al. (2022) visualized the parameter space of 2500 pulsars as a graph, using the minimum spanning tree. This way, similar objects can be grouped close to each other, and the structure of the graph can be analyzed.
One way to look for distinct flare shapes is to find clusters in a lower dimensional representation, for example the PC space. Each of the marginal distributions in Fig. 12 appears to be unimodal, and to test this, we can use different clustering algorithms.
First, we used Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN; Campello et al. 2013; McInnes et al. 2017) on the first 20 PCs. HDBSCAN is a sophisticated clustering algorithm that aims to identify clumps in the data above a less dense background. It is able to separate the optimal number of clusters automatically, without the need to specify the number. Unlike partitioning algorithms such as k-means, it does not assign every data point to a cluster. It is frequently used for finding open clusters and associations surrounded by field stars in Gaia data (see, e.g., Hunt & Reffert 2023). When requiring at least ten points in each cluster, HDBSCAN found only a single cluster above the background, at the core of the distribution, indicating no signs of multimodality. The result was the same when using only five PCs.
Then, we applied Gaussian mixture models (Ivezić et al. 2014) to the first five PCs. This method is a density estimation technique that fits the data with a given number of N-dimensional Gaussians. It can also be used for soft clustering, assigning each data point to clusters probabilistically. The number of Gaussian components can be selected using BIC. By varying the number of Gaussians between 1 and 30, the preferred values ranged from 15 to 20, with slight differences in BIC. However, these are not distinct clusters, but rather overlapping ones. We calculated the silhouette score (Rousseeuw 1987) for different numbers of Gaussians and found values below 0.1, which hints at the absence of distinct clusters. Silhouette scores close to one indicate strong, nonoverlapping clumps, while values close to zero indicate totally overlapping clusters.
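To illustrate how the silhouette score separates the two regimes, the following sketch implements the mean silhouette coefficient with NumPy and applies it to two toy datasets. The labels here are assigned by hand rather than by a Gaussian mixture model, so this shows only the behavior of the metric, not the clustering step itself.

```python
import numpy as np

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Rousseeuw 1987) with Euclidean distances."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: the score is defined as 0
        # Mean distance to the other members of the same cluster (self excluded).
        a = dist[i, same].sum() / (same.sum() - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == k].mean() for k in np.unique(labels) if k != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()

rng = np.random.default_rng(1)
blob = rng.normal(0.0, 1.0, size=(200, 5))
labels = np.repeat([0, 1], 200)

# Two clumps far apart versus two groups drawn from the same distribution.
separated = np.vstack([blob, blob + 20.0])
overlapping = np.vstack([blob, rng.normal(0.0, 1.0, size=(200, 5))])

score_separated = silhouette_score(separated, labels)
score_overlapping = silhouette_score(overlapping, labels)
```

The well-separated case gives a score close to one, while two groups drawn from the same distribution give a score close to zero, which is the regime found for the flare shapes.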
While these methods do not prove that flare shape clusters do not exist, they suggest a more gradual, continuous change in the scaled flare profiles. Figure 14 illustrates that the shapes indeed change in the dimensionality-reduced space, by showing the median profiles from six different regions in the UMAP space.
Fig. 14. Average flare shapes from different positions in the UMAP space. Different colors show the median profiles and the range between the 16th and 84th percentiles inside the given circles in the UMAP space. Each circle includes approximately 1000 flares. Dashed lines in the subplots show the whole sample median for comparison.
3.5. Changing flare shapes with astrophysical parameters
Figure 12 already hinted at a systematic variation of flare shapes with Teff, with a clearly visible color gradient mostly in PC5. Before attempting to extract astrophysical information from individual flares, we considered the changes in the average shape of a large number of flares from similar stars. For this, we binned the stars by their position on the Gaia MS, as described in Sect. 3.2.
Figure 15 shows the position of the median flare shape in the PC space for each bin, color-coded with Teff. There is a systematic “wandering” of the points in the PC space, following the color gradients in Fig. 12. The simplest trajectories with the fewest turns lie along PC5, indicating that it is directly proportional to Teff. The Pearson correlation coefficient between Teff and PC5 is 0.15 with p < 10−200, which is the strongest correlation among the PCs.
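A weak correlation can still be extremely significant when the sample is this large. The sketch below shows the arithmetic under two assumptions that are not taken from the text: that all 121,895 catalog flares enter the correlation, and that a Gaussian approximation to the test statistic is adequate at this sample size.

```python
import math

r = 0.15
n = 121_895   # ASSUMPTION: every catalog flare is used in the correlation

# Test statistic of the Pearson correlation under the null hypothesis r = 0.
t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))

# Two-sided Gaussian tail probability, p ~ 2 * phi(z) / z for large z,
# evaluated in log10 to avoid floating-point underflow.
log10_p = (
    math.log10(2.0)
    - 0.5 * t_stat**2 * math.log10(math.e)
    - math.log10(t_stat * math.sqrt(2.0 * math.pi))
)

# Fraction of the PC5 variance that a linear relation with Teff would explain.
explained_variance = r * r
```

Under these assumptions the statistic is roughly 53 standard deviations from zero, consistent with the quoted p < 10−200, while only about 2% of the variance of PC5 is explained for individual flares.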
Fig. 15. Changes in the flare shapes in the PC space along the MS. The points denote the median value of the PC coefficients for the stars in the given bins from Fig. 9, colored with Teff. The error bars show the standard error of the median.
Figure 16 shows how the median flare shape changes along the MS. The upper panel shows the scaled flare shapes, and the lower panel shows the residual after removing the median flare shape of the whole sample. There are striking variations in the shape, albeit with relatively small (i.e., a few percent) amplitude. The main feature is that the flares of hotter stars are “fatter” and wider for a few t1/2, but they decay more quickly after ∼2t1/2. The zero residuals around −3t1/2 and 10t1/2 are artifacts of the flare extraction, caused by the final linear detrending, which is performed on those regions.
Fig. 16. Flare shapes along the MS using the binning from Fig. 9. Top: Median shapes colored with Teff. Middle: Residual flare shapes made by removing the whole sample median from the median shape in each bin. Left: Number of flares in each bin along the MS parameterized by Teff.
To explore whether we could have revealed any other type of flare shape variability, we present mock retrieval tests in Appendix A. We injected three different kinds of trends into real light curves and found that the only kind of variability that we can safely recover is similar to what we have found in Fig. 16.
3.6. Astrophysical parameter prediction from individual flares
To find nontrivial relationships between astrophysical parameters and flare shape, we can also use flexible machine-learning models. If we can find an algorithm that predicts a certain astrophysical parameter of the star only from the shape of its flares, we can prove that such a dependence exists.
To parameterize the flare shapes we used 20 components from WPCA and also added the t1/2, amplitude, and ED. We used this 20+3 dimensional dataset as input and Teff from TICv8.2 as output labels. We removed stars with Teff > 8000 K. We used the base-10 logarithm of t1/2, amplitude and ED, as they span multiple orders of magnitude. Due to missing Teff values and some negative EDs, we were left with a sample size of 108,687. Before training, we subtracted the median from each feature and scaled by the interquartile range to normalize the data. We trained different regression models and compared the results to a dummy regressor that outputs the mean label for each input. To assess the performance of the models, we used k-fold cross-validation with root mean squared error as a metric. We split the dataset into k = 5 partitions (folds), and used each of them in turn as the test set, with the other four as the training set. The final metric was calculated as the average of the folds.
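The evaluation scheme can be sketched with NumPy alone. The example below uses synthetic features in place of the 20+3 flare parameters and a linear least-squares model as the only non-trivial regressor; the function names are hypothetical, and the scikit-learn implementations used for the actual analysis are not reproduced here.

```python
import numpy as np

def robust_scale(features):
    """Subtract the median and divide by the interquartile range, per column."""
    q25, q50, q75 = np.percentile(features, [25, 50, 75], axis=0)
    return (features - q50) / (q75 - q25)

def kfold_rmse(features, labels, fit_predict, k=5, seed=0):
    """Average test RMSE over k folds; fit_predict(X_tr, y_tr, X_te) -> y_pred."""
    order = np.random.default_rng(seed).permutation(len(labels))
    rmse = []
    for test_idx in np.array_split(order, k):
        train_idx = np.setdiff1d(order, test_idx)
        pred = fit_predict(features[train_idx], labels[train_idx], features[test_idx])
        rmse.append(np.sqrt(np.mean((pred - labels[test_idx]) ** 2)))
    return float(np.mean(rmse))

def dummy(x_tr, y_tr, x_te):
    """Baseline that always predicts the mean training label."""
    return np.full(len(x_te), y_tr.mean())

def linear(x_tr, y_tr, x_te):
    """Multivariate linear regression with an intercept, via least squares."""
    design = np.column_stack([x_tr, np.ones(len(x_tr))])
    coef, *_ = np.linalg.lstsq(design, y_tr, rcond=None)
    return np.column_stack([x_te, np.ones(len(x_te))]) @ coef

# Synthetic stand-in for the features; the label depends weakly on one of them.
rng = np.random.default_rng(42)
X = robust_scale(rng.normal(size=(2000, 23)))
teff = 3500.0 + 300.0 * X[:, 0] + rng.normal(0.0, 500.0, size=2000)

rmse_dummy = kfold_rmse(X, teff, dummy)
rmse_linear = kfold_rmse(X, teff, linear)
```

Comparing against the dummy baseline on identical folds shows how much of the label scatter a model actually explains, which is more informative than the raw error when the labels are dominated by scatter.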
We tried the following regression models, using their scikit-learn (Pedregosa et al. 2011) implementations: multivariate linear regression, random forest (Breiman 2001) and gradient boosting (Friedman 2001). Random forest is an ensemble method that averages the results of multiple decision trees, where each tree has the same number of branches (depth), splitting the input data by a single feature at each branch. Gradient boosting also works with a given number of trees, but the results of the trees are not averaged. Instead, they are built on top of each other, boosting the performance of previous trees. We used the following fine-tuned parameters for the models: 20 trees with depth = 10 for random forest and 20 trees with depth = 10 for gradient boosting.
Table 4 summarizes the results. For the prediction of Teff, the best model outperforms the dummy regressor by only 5–30%. The prediction accuracy improves when the scaling parameters t1/2, A and ED are included. The best-performing models are the random forest and gradient boosting with similar scores.
Table 4. Results of the Teff prediction from flare shapes.
Based on these results we concluded that the shapes of individual flares do not carry enough information to determine the astrophysical parameters of their stars, although there is a marginal improvement over the dummy regressor even in the PCA-only case. Adding the flare amplitude as input improved the accuracy, due to the different flare contrast on stars with different luminosities. By shuffling the data points of single input features, the permutation feature importance score can be calculated (i.e., how much the regressor relies on a given input feature). Based on this, the most important feature is A, followed by t1/2, ED, PC2, and PC5.
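The permutation feature importance can be illustrated with a short NumPy sketch. It uses a toy linear model and evaluates the score on the training data for brevity; in practice the score would normally be computed on held-out data, and the function name is hypothetical.

```python
import numpy as np

def permutation_importance(predict, features, labels, n_repeats=10, seed=0):
    """Mean increase in RMSE when one feature column is shuffled at a time."""
    rng = np.random.default_rng(seed)

    def rmse(x):
        return np.sqrt(np.mean((predict(x) - labels) ** 2))

    baseline = rmse(features)
    importance = np.zeros(features.shape[1])
    for j in range(features.shape[1]):
        for _ in range(n_repeats):
            shuffled = features.copy()
            shuffled[:, j] = rng.permutation(shuffled[:, j])
            importance[j] += (rmse(shuffled) - baseline) / n_repeats
    return importance

# Toy data: the label depends strongly on column 0, weakly on column 1,
# and not at all on column 2.
rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.1, size=1000)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

scores = permutation_importance(lambda x: x @ coef, X, y)
ranking = np.argsort(scores)[::-1]   # most important feature first
```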
We also tried to average the flares for each star, to see whether it improved the regression performance. We could achieve only a few percent improvement in root mean squared error compared to the best model using individual flares, and only when the number of flares per star was included as an input parameter. For reliable prediction of Teff, we estimate that one would need to average at least a few hundred flares for each star (as in the case of Fig. 16), which is only available for a handful of objects.
4. Application of the results
4.1. New flare templates
Using a large number of high-quality, manually selected 1-min cadence Kepler flares from the M4 dwarf GJ 1243, Davenport et al. (2014) created a flare shape template that has been used in many cases since its release. Here, we aim to study how this flare template would change for different types of stars.
We adopted the same parameterization as Davenport et al. (2014), using a 4th-order polynomial for the rise phase, and the sum of two exponentials for the decay phase, as follows:
$$F_{\mathrm{rise}}(t) = 1 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4, \tag{9}$$
$$F_{\mathrm{decay}}(t) = b_1 e^{-c_1 t} + b_2 e^{-c_2 t}. \tag{10}$$
We also forced both fits to go through the peak (t = 0, F = 1), and the rise phase to end at −t1/2 (t = −t1/2, F = 0). For t < −t1/2, the template is zero by definition. We used the scaled and interpolated flares for the fit, grouped together along the MS as described in Sect. 3.2. We used all the flaring points aggregated for the fit, not just an average curve. We made no distinction between simple and complex flares.
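One possible way to build the two constraints into the template is to eliminate one parameter in each phase, as sketched below. This is an assumption about the implementation, not a description of the actual fitting code, and the parameter values are illustrative rather than fitted. Time is in units of t1/2.

```python
import numpy as np

def flare_template(t, a1, a2, a3, b1, c1, c2):
    """Scaled flare template; t in units of t_half, peak of 1 at t = 0.

    The constraints are built in: a4 follows from F_rise(-1) = 0, and
    b2 = 1 - b1 follows from F_decay(0) = 1.
    """
    t = np.asarray(t, dtype=float)
    a4 = -1.0 + a1 - a2 + a3
    b2 = 1.0 - b1
    rise = 1.0 + a1 * t + a2 * t**2 + a3 * t**3 + a4 * t**4
    decay = b1 * np.exp(-c1 * t) + b2 * np.exp(-c2 * t)
    # Zero before -t_half, polynomial rise up to the peak, double exponential after.
    return np.where(t < -1.0, 0.0, np.where(t <= 0.0, rise, decay))

# Illustrative parameters only (rounded by hand, NOT fitted values).
params = dict(a1=1.9, a2=-0.2, a3=-2.2, b1=0.7, c1=1.6, c2=0.28)
t = np.array([-2.0, -1.0, 0.0, 1.0, 5.0])
model = flare_template(t, **params)
```

With this parameterization six free parameters remain, and a function of this form can be passed directly to a standard least-squares fitter.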
Figure 17 shows these templates plotted for different Teff, along with the original template from Davenport et al. (2014). Since the dataset used to create that template differs from our 2-min cadence TESS dataset in both cadence and passband, any comparison should be made with caution. The templates vary by only a few percent for different types of stars, but they broadly recover the trends visible in the residual image in Fig. 16. Figure 18 shows how the parameters of the flare template change along the MS. The c parameters have the most physical relevance, as they are the exponents of the exponentials. While c1 changes erratically and can be considered constant, c2, the exponent of the late decay phase, changes almost monotonically with Teff. A similar effect was seen in the residual map of Fig. 16. The fit was also repeated for broad Teff ranges, and the parameters are reported in Table 5.
Fig. 17. Flare templates fitted along the MS using Eqs. (9)–(10) and color coded with Teff. The shaded region shows the 16th and 84th percentiles of the dataset. The black dashed line shows the template of Davenport et al. (2014).
Fig. 18. Parameters of the flare template fitted along the MS in the following form: Frise(t) = 1 + a1 ⋅ t + a2 ⋅ t2 + a3 ⋅ t3 + a4 ⋅ t4 and Fdecay(t) = b1 ⋅ e−c1t + b2 ⋅ e−c2t. The shaded regions show the formal uncertainty of the fit.
4.2. Sampling flare shapes
When simulating flaring stars’ light curves, it is necessary to use realistic flare shapes. Such a situation would arise during the training of data-driven flare detection algorithms, injection-recovery tests, simulating realistic light curves when stellar flares are a source of astrophysical noise, and so on. Apart from using analytical templates, a more sophisticated approach would be to sample from some low-dimensional representation. Such a representation is the PC space of our flare catalog. One could directly sample real flares from the catalog itself, but another solution would be to use a density estimation on the PC space, sample from that, and transform them to flare shapes using the PCA basis. Normalizing flows and variational autoencoders are specifically designed for this task (see, e.g., Lim et al. 2024; Srinivasan et al. 2024); however, as the topology of the PC space in this case is simple enough, it is reasonable to use a simpler model. This can either be a Gaussian mixture model or a kernel density estimator (KDE; Chen 2017). Once such an estimator is trained, it is fast to sample from it. One important consideration is that we want to generate smooth noiseless flares, and thus we cannot use too many PCs, as the higher PCs mostly describe noise. Figure 19 shows random flare shapes drawn from a Gaussian KDE fitted to the two-, five-, and ten-dimensional PC space. Using 10 PCs results in more “wiggly” flares, as the later PCs mostly describe the noise in the data.
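The sampling idea can be sketched as follows. This is a minimal illustration on a toy PCA model, not the notebook distributed with the catalog: the bandwidth follows Scott's rule, which is an assumption, and the mean shape, basis, and coefficients are synthetic.

```python
import numpy as np

def sample_gaussian_kde(coeffs, n_samples, rng):
    """Draw samples from a Gaussian KDE with a Scott's-rule bandwidth."""
    n, d = coeffs.shape
    factor = n ** (-1.0 / (d + 4))                          # Scott's rule
    kernel_cov = np.atleast_2d(np.cov(coeffs, rowvar=False)) * factor**2
    centers = coeffs[rng.integers(0, n, size=n_samples)]    # pick random data points
    noise = rng.multivariate_normal(np.zeros(d), kernel_cov, size=n_samples)
    return centers + noise                                  # jitter them with the kernel

# Toy PCA model standing in for the catalog: mean shape, two smooth
# orthonormal components, and coefficients with decreasing scatter.
rng = np.random.default_rng(3)
grid = np.linspace(-3.0, 10.0, 200)
mean_shape = np.where(grid < 0, np.clip(1 + grid, 0, 1), np.exp(-0.8 * grid))
basis = np.column_stack([np.exp(-0.5 * np.abs(grid)), grid * np.exp(-np.abs(grid))])
components = np.linalg.qr(basis)[0].T                       # orthonormal rows
catalog_coeffs = rng.normal(0.0, [0.3, 0.1], size=(5000, 2))

new_coeffs = sample_gaussian_kde(catalog_coeffs, 10, rng)
new_flares = mean_shape + new_coeffs @ components           # back to flare shapes
```

Sampling from a Gaussian KDE reduces to picking a random catalog flare and perturbing its coefficients, which is why it is fast once the covariance is known.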
Fig. 19. Flare shapes randomly sampled from a kernel density estimator trained on the given number of PCs.
It is also possible to sample from a joint PC–physical parameter distribution, including t1/2 and amplitude. Restricting the input data to flares from stars with given characteristics (e.g., spectral type, brightness), it is possible to simulate even more realistic flares for special use cases. To facilitate the use of our flare catalog for sampling, we provide a short Jupyter notebook on Zenodo that demonstrates how synthetic light curves can be generated with flares drawn from a KDE.
4.3. Locating similar flares
Another application of the PCA representation is that we can use it to locate real flares that are similar to a given (real or artificial) input shape. The task can be reduced to a nearest-neighbor search in the PC space, after transforming the input shape into this space. A similar technique was presented by Seo et al. (2023), using autoencoders to find galaxies with similar morphologies.
Figure 20 shows an example. Double-peaked flares were generated by adding a secondary peak to a template at different positions. These are the input shapes, and we would like to find real flares that look similar to them. For fast nearest neighbor search, a KDTree (Maneewongvatana & Mount 1999) was built with Euclidean metric from the first 20 PCs, after normalizing the dataset to have zero mean and unit variance in each dimension. Using the PCA basis, the input shapes were transformed to this space (applying the same scaling). Then, the KDTree was queried with these inputs to locate the 30 closest real flares from the catalog. The red lines in Fig. 20 show the input shapes, and the black lines show the retrieved flares, which are indeed similar to the input.
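The query itself can be sketched in a few lines. For simplicity, the example below replaces the KDTree with a brute-force search, which returns the same neighbors for a Euclidean metric but scales linearly with the catalog size. The coefficients are synthetic, the function name is hypothetical, and the query is assumed to be already projected onto the PCA basis.

```python
import numpy as np

def nearest_flares(catalog_coeffs, query_coeffs, k=30):
    """Indices and distances of the k catalog flares closest to the query.

    Distances are Euclidean in the standardized PC space (zero mean and
    unit variance in each dimension), with the same scaling applied to the query.
    """
    mean = catalog_coeffs.mean(axis=0)
    std = catalog_coeffs.std(axis=0)
    scaled_catalog = (catalog_coeffs - mean) / std
    scaled_query = (np.asarray(query_coeffs, dtype=float) - mean) / std
    dist = np.linalg.norm(scaled_catalog - scaled_query, axis=1)
    order = np.argsort(dist)[:k]
    return order, dist[order]

# Toy PC coefficients with decreasing scatter in the higher components.
rng = np.random.default_rng(5)
catalog_coeffs = rng.normal(size=(10_000, 20)) * np.linspace(2.0, 0.1, 20)

# A query that is almost identical to catalog flare number 123.
query = catalog_coeffs[123] + 1e-6
indices, distances = nearest_flares(catalog_coeffs, query, k=30)
```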
Fig. 20. Nearest neighbor search in the flare shape space. The red curve on each panel is the injected flare shape, and the 30 closest flares are shown in black.
One important consideration with this method is the number of PCs to keep. On one hand, the search algorithm is affected by the curse of dimensionality. This makes the query less efficient in high dimensional space, where the query reduces to a simple linear search over the whole flare sample (Goodman & O’Rourke 2004). The Euclidean metric is also less relevant in higher dimensions (Aggarwal et al. 2001). On the other hand, using too few PCs gives poor results, as they cannot describe the input shape adequately. The optimal number of PCs is different for each input shape, depending on which PCs describe the variation the best. After experimenting with different inputs, a number between 10 and 20 seemed adequate. We provide a short Jupyter notebook on the Zenodo page of the paper that allows the reader to query the catalog for arbitrary input shapes.
5. Solar flare shapes
To gain insights into the stellar case, we made a simple attempt to study solar flare shapes. To obtain a dataset comparable to the stellar case, we chose an instrument that provides disk-integrated light curves. As solar white light flares are rare, we used an instrument operating in the ultraviolet regime. To have a hypothesis to test, we compared light curve shapes of flares with and without accompanying coronal mass ejections (CMEs). If we could find any differences between them based solely on the light curves, we could hope to find similar differences on other stars. If not, that would hint at a universal flare profile.
To compile the sample, we used data from the Extreme ultraviolet Variability Experiment (EVE; Woods et al. 2012) instrument of the SDO. EVE is an ultraviolet photometer with four different channels, centered on 182, 256, 304, and 366 Å, measuring full disk irradiances with 0.25 s cadence since 2010. We used the 304 Å channel, which is centered on a He II line from the chromosphere.
5.1. Flare extraction
To locate the flares on the SDO/EVE light curves, we collected all the M and X class flares between 2010 and 2023 from the catalog of the Geostationary Operational Environmental Satellite (GOES), resulting in 1136 events. Using this list, we downloaded Level 1 SDO/EVE time series from the LASP archive for the given days when flares occurred. Then, using the peak times from the GOES catalog, we extracted 9-hour cutouts, starting from 3 hours before the GOES peak. A 20-point (5 second) running mean filter was applied to smooth the time series, and the negative flux values were removed.
The flare extraction from the SDO/EVE time series was carried out similarly to that of the TESS light curves. Using the nominal start and end times from the GOES catalog, we defined a quiescent region around the flare and fitted it with a low-order polynomial, with the order (between zero and four) determined by the BIC. After subtracting this polynomial baseline, we fitted the flare template of Davenport et al. (2014) to determine the t1/2 time scale of the event. After shifting the flare peak time to zero and scaling the amplitude to unity, we linearly interpolated each flare to a uniform time grid of 1000 points between −2 and 6t1/2 and removed a final linear trend fitted before −1t1/2 and after 5t1/2. During the extraction, 203 flares were discarded (e.g., due to too many missing points, no dominant peak, or a failed baseline fit), resulting in a sample size of 933. We then visually inspected all of these flares, similar to the stellar case described in Sect. 2.5, and removed all the incorrectly extracted ones. This resulted in a final sample size of 539. The main properties of these flares are summarized in Table 6, and the scaled flare shapes are available at Zenodo.
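The smoothing and baseline selection steps can be sketched as follows. The BIC is written here for Gaussian errors with unknown variance, which is an assumption about the exact form, and the light curve is synthetic; the function names are hypothetical.

```python
import numpy as np

def running_mean(flux, width=20):
    """Boxcar smoothing; 20 points correspond to 5 s at 0.25 s cadence.

    With mode="same" the first and last few points are biased low, because
    the window runs past the edges of the series.
    """
    return np.convolve(flux, np.ones(width) / width, mode="same")

def best_baseline(time, flux, max_order=4):
    """Polynomial baseline with the order (0..max_order) that minimizes the BIC."""
    n = len(time)
    best = None
    for order in range(max_order + 1):
        coeffs = np.polyfit(time, flux, order)
        rss = np.sum((flux - np.polyval(coeffs, time)) ** 2)
        # BIC for Gaussian errors with unknown variance (ASSUMED form).
        bic = n * np.log(rss / n) + (order + 1) * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, order, coeffs)
    return best[1], best[2]

# Toy quiescent light curve: a linear trend plus white noise.
rng = np.random.default_rng(11)
time = np.linspace(-1.0, 1.0, 2000)
flux = 1.0 + 0.3 * time + rng.normal(0.0, 0.02, size=time.size)

smoothed = running_mean(flux)                 # illustrates the smoothing step only
order, coeffs = best_baseline(time, flux)     # order selection on the unsmoothed data
detrended = flux - np.polyval(coeffs, time)
```

For a purely linear trend the selected order is expected to be one in most noise realizations, since the logarithmic penalty outweighs the small reduction in the residuals that higher orders provide.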
Table 6. Parameters of the solar flares.
The t1/2 distribution of the SDO/EVE solar flares follows a log-normal distribution. The fitted parameters are μ = 0.88 and σ = 0.35 in minutes, giving a median t1/2 of 7.5 minutes.
5.2. The effect of coronal mass ejections
After collecting and scaling the solar flares from the 304 Å channel of SDO/EVE, we looked for differences in flare shape caused by different physical processes. One way to separate flares is whether they were accompanied by a CME.
We tested whether an accompanying CME influences the flare shape by contrasting the average shapes of flares with and without CMEs (so-called eruptive and confined flares in Li et al. 2021). We used the CME catalog of Gopalswamy et al. (2009), created with data from the Solar and Heliospheric Observatory satellite (SOHO), and flagged the flares where a CME was reported in a ±2 hours interval near the flare peak time, and it was not labeled as a “(very) poor event” in the catalog. This resulted in 291 flares with CMEs and 248 flares without CMEs. The left panel of Fig. 21 shows the median flare shapes, with only a few percent difference between flares with and without CMEs. The average stellar flare shape from TESS is also shown, but due to the lower observing cadence and the different passband, it is hard to make a direct comparison. Following Oláh et al. (2022), we quantified the difference between the average shapes of flares with and without CMEs using the sum of squared differences as a similarity metric. We calculated it for the median flare shapes and compared it to a distribution of values from random shuffles of the dataset, mixing flares with and without CMEs together (see Sect. 3.2 of Oláh et al. 2022, and also their Appendix B). The resulting sum of squared differences is around the 70th percentile of the distribution from the random shuffles, indicating that the difference (if any) is weak; we get similar results from just randomly partitioning the flares into two groups. The right panel of Fig. 21 shows the UMAP projection of the 1000-dimensional dataset, and there is again no distinction between the two classes, in accordance with the previous result. If the two classes were noticeably different, they would separate more in the dimensionality-reduced space. The PCA representation of the dataset shows no distinction either.
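The shuffle test can be sketched as follows. The data are synthetic noisy copies of a single profile, split into groups of 291 and 248 as in the text; the grid is coarser than the 1000-point grid used for the solar flares, and the function names are hypothetical.

```python
import numpy as np

def ssd_between_medians(shapes, is_cme):
    """Sum of squared differences between the median shapes of the two groups."""
    diff = np.median(shapes[is_cme], axis=0) - np.median(shapes[~is_cme], axis=0)
    return float(np.sum(diff**2))

def shuffle_percentile(shapes, is_cme, n_shuffles=500, seed=0):
    """Percentile of the observed SSD within the distribution from shuffled labels."""
    rng = np.random.default_rng(seed)
    observed = ssd_between_medians(shapes, is_cme)
    shuffled = np.array([
        ssd_between_medians(shapes, rng.permutation(is_cme)) for _ in range(n_shuffles)
    ])
    return 100.0 * np.mean(shuffled < observed)

# Toy data: 539 noisy copies of one profile, split 291 / 248.
rng = np.random.default_rng(2)
grid = np.linspace(-2.0, 6.0, 100)
profile = np.where(grid < 0, np.clip(1 + grid, 0, 1), np.exp(-grid))
shapes = profile + rng.normal(0.0, 0.1, size=(539, grid.size))
is_cme = np.arange(539) < 291

# Both groups drawn from one population: the percentile is unremarkable.
pct_same = shuffle_percentile(shapes, is_cme)

# Inject a 10% bump near the peak into one group only: the percentile saturates.
shifted = shapes + np.where(is_cme[:, None], 0.1 * np.exp(-grid**2), 0.0)
pct_different = shuffle_percentile(shifted, is_cme)
```

In this toy setup a difference of ten percent near the peak is detected unambiguously, which gives a rough sense of the sensitivity of the test for a sample of this size and noise level.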
Fig. 21. Morphology of the solar flares observed in the 304 Å channel of SDO/EVE. Left: Median shapes of solar flares with and without CMEs and their difference. The median flare shape from TESS is also shown. Right: UMAP projection of the scaled solar flare shapes. Blue and red points show events with and without CMEs, respectively.
Thus we failed to find any (simple) difference in the light curves of solar flares with and without accompanying CMEs. This hints that the diversity of stellar flares should probably not be attributed to CMEs, and that using high-resolution spectral time series remains the easiest way to reliably detect stellar CMEs (see, e.g., Vida et al. 2019; Leitzinger et al. 2020; Namekata et al. 2021). A similar conclusion was reached by Harra et al. (2016) for a sample of 42 X-class solar flares. They found that the only difference between flares with and without CMEs is coronal dimming in EUV. Coronal dimmings appear after flares with accompanying CMEs and are mainly observable in hot coronal lines (e.g., Fe XII line at 193 Å; Harra et al. 2016). In the stellar context, Veronig et al. (2021) and Loyd et al. (2022) presented observations of coronal dimmings on G–K–M stars, using X-ray and far-UV data. However, as coronal dimmings last for several hours (an order of magnitude longer than the flaring time scale), we cannot see them in the scaled flare shapes, as any variation after ∼10t1/2 is removed with the baseline. We also note that narrow-band EUV observations of the Sun are quite different from the white light time series we can study with TESS, so any implications should be interpreted with caution. Nevertheless, solar irradiation time series should be studied in greater detail, using data from different channels and instruments.
6. Discussion
6.1. Physical background
The evolution of a flare is usually divided into a heating and a cooling phase, separated by the peak time. In the heating phase, the heating rate exceeds the absorption rate, and thus the temperature of the plasma rises. During the decay phase the heating rate drops below the sum of the conduction and radiative loss, and the plasma temperature drops. The shape of the flare decay light curve is determined by the two cooling processes during the decay phase: radiative cooling and thermal conduction. In the initial part of the flare decay (roughly the first 20% after peak time), conductive cooling plays the main role, while in the later part radiative losses dominate, typically resulting in a steep initial decay followed by a slower decline in flux (Aschwanden 2004). The behavior of the flux during the decay phase depends on the ratio between cooling by radiation and cooling by thermal conduction, governed by the temperature and density of the emission region. Kashapova et al. (2021) studied the decay phase of solar flares in several spectral bands: in the 1600 and 304 Å channels that they used as Sun-as-a-star data, and in the 1700 Å channel, where the emission is associated with a similar temperature to that usually ascribed to M4-dwarf flares. They found that the emission characteristics of various spectral bands are dependent on their formation temperatures and heights in the solar atmosphere. Namely, the decay rate during the first phase of cooling was slower for solar-like flares compared to M-dwarf flares, suggesting denser plasma in M-dwarf flares. Furthermore, the study found differences in cooling behavior between solar flares and M-dwarf flares, with solar flares exhibiting more complex cooling patterns in the second phase. This is exactly the behavior we observed in Fig. 16 for the first time on other stars: hotter, solar-like stars typically show a slower decay in the conductive cooling phase and more complex cooling patterns compared to M-dwarfs, suggesting that the plasma responsible for M-dwarf flare emission is denser than that of flares on hotter stars, and that M-dwarf flares potentially originate from a deeper layer of the stellar atmosphere.
Solar flares are seen as arcade-like structures, including multiple loops along the flare ribbon. Warren (2006) showed that while single-loop numerical models do not describe the X-ray light curve morphology and decay time scale of solar flares adequately, a multithread flare model works well. In these models, the decay time can be much longer than the single-loop cooling time scale, as we see the superposition of multiple flaring loops. The quasi-periodic modulation of flares can also be described in the context of multithread models (see, e.g., Reep et al. 2020). The change in the average shapes of stellar flares might also be related to multithread flares. In this context, the observed morphology can be linked to the physical structure of the flaring arcades, namely how many threads are formed and how frequently.
A possible source of flare shape variability may come from the different temperature evolution of flares on different stellar types, as suggested by Howard et al. (2020) on the basis of 44 flares observed simultaneously by TESS and Evryscope in g′ (29 of which are common with our sample). Their data suggest that the global color temperature of a flare does not depend on the stellar mass, while the color temperature at the flare peak increases with stellar mass and thus with stellar Teff. If we approximate the flare spectrum with blackbody radiation, the largest fraction of flux in the TESS band is measured at around 5000 K, and it declines for higher flare temperatures (see Fig. 3 in Howard et al. 2020). Thus if the flare temperature changes, the fractional flux in the TESS band will change during the flare, resulting in systematic morphological differences in TESS observations. This shows the importance of time resolved, multiband observing campaigns of stellar flares (see, e.g., Kowalski et al. 2013; Howard et al. 2020; Berger et al. 2024; Jackman et al. 2024).
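The blackbody argument above can be illustrated with a minimal sketch that estimates which fraction of the bolometric blackbody flux falls into the TESS band at a given temperature. It assumes a top-hat bandpass between 600 and 1000 nm, which only approximates the real TESS response; the function names and the temperature grid are illustrative choices and are not taken from the analysis itself.

```python
import math

# Physical constants (SI units)
H = 6.62607015e-34       # Planck constant [J s]
C = 2.99792458e8         # speed of light [m/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant [W m^-2 K^-4]


def planck_lambda(wavelength, temperature):
    """Blackbody spectral radiance B_lambda [W m^-3 sr^-1]."""
    x = H * C / (wavelength * K_B * temperature)
    return 2.0 * H * C**2 / wavelength**5 / math.expm1(x)


def band_fraction(temperature, lam_min=600e-9, lam_max=1000e-9, n=500):
    """Fraction of the bolometric blackbody flux inside a top-hat band.

    ASSUMPTION: the TESS response is approximated by a top hat between
    600 and 1000 nm; the real response curve is not flat.
    """
    step = (lam_max - lam_min) / n
    # Trapezoidal integration of B_lambda over the band
    total = 0.5 * (planck_lambda(lam_min, temperature)
                   + planck_lambda(lam_max, temperature))
    for i in range(1, n):
        total += planck_lambda(lam_min + i * step, temperature)
    in_band = total * step
    # Integral of B_lambda over all wavelengths
    bolometric = SIGMA * temperature**4 / math.pi
    return in_band / bolometric


temperatures = list(range(3000, 20001, 100))
fractions = [band_fraction(t) for t in temperatures]
best_temperature = temperatures[fractions.index(max(fractions))]
print(best_temperature, round(max(fractions), 3))
```

Under this simplified bandpass, the in-band fraction is maximal near 5000 K and declines toward higher temperatures, so a flare that changes temperature also changes the share of its flux that TESS records.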
6.2. Possible future work
In this work, we scaled and interpolated the extracted flares to make them comparable. This way the whole analysis was simplified, and we could use dimensionality reduction algorithms such as PCA. However, this is not the only possible approach.
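For reference, the scaling and interpolation step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for the catalog: the function names, the grid limits (−2 to 10 t1/2), and the toy flare shape are hypothetical, and a baseline-subtracted light curve with known peak time, amplitude, and t1/2 is taken as given.

```python
import math


def scale_flare(times, fluxes, t_peak, amplitude, t_half,
                grid_min=-2.0, grid_max=10.0, n_grid=200):
    """Scale a baseline-subtracted flare and resample it on a common grid.

    Time is measured from the peak in units of t_half, flux in units of the
    amplitude. ASSUMPTION: the grid limits and size are illustrative values.
    The input times must be strictly increasing.
    """
    scaled_t = [(t - t_peak) / t_half for t in times]
    scaled_f = [f / amplitude for f in fluxes]
    step = (grid_max - grid_min) / (n_grid - 1)
    grid = [grid_min + i * step for i in range(n_grid)]

    profile = []
    j = 0
    for g in grid:
        if g <= scaled_t[0]:
            profile.append(scaled_f[0])      # before the first data point
        elif g >= scaled_t[-1]:
            profile.append(scaled_f[-1])     # after the last data point
        else:
            while scaled_t[j + 1] < g:       # find the bracketing samples
                j += 1
            w = (g - scaled_t[j]) / (scaled_t[j + 1] - scaled_t[j])
            profile.append((1 - w) * scaled_f[j] + w * scaled_f[j + 1])
    return grid, profile


def toy_light_curve(t_peak, amplitude, t_half):
    """HYPOTHETICAL toy flare sampled between -3 and 12 t_half."""
    xs = [-3.0 + 0.05 * k for k in range(301)]
    times = [t_peak + x * t_half for x in xs]
    fluxes = [amplitude * math.exp(-abs(x)) for x in xs]
    return times, fluxes


# Two flares with different durations and amplitudes but the same shape
grid, profile_a = scale_flare(*toy_light_curve(10.0, 0.05, 2.0),
                              t_peak=10.0, amplitude=0.05, t_half=2.0)
_, profile_b = scale_flare(*toy_light_curve(3.0, 0.20, 0.5),
                           t_peak=3.0, amplitude=0.20, t_half=0.5)
print(max(abs(a - b) for a, b in zip(profile_a, profile_b)))
```

After scaling, the two toy events map onto practically the same profile, which is what makes a simple Euclidean comparison between flares meaningful.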
The analysis of one-dimensional shapes has a rich literature, although astronomical applications are rare (see, e.g., Loredo et al. 2024). It belongs to functional data analysis (Ramsay 2005), a field of statistics that deals with curves and surfaces, where measurements are not isolated points, but continuous functions. One specialty of shape data is the invariance to certain transformations (e.g., translation, scaling). When comparing shapes, we should take these into account by selecting a suitable distance metric. In this work, we only used the simple Euclidean distance metric (L2 norm), which is the assumption behind many algorithms. This was only possible after prior scaling of the data. There are alternative metrics for functional data, such as the dynamic time warping distance, the cross-correlation distance, and the Procrustes distance. These are invariant to certain transformations; however, they are more computationally expensive to calculate. In functional data analysis, many classical methods have their counterparts, such as functional PCA, functional regression, and k-shape clustering (Paparrizos & Gravano 2016). The last of these is an alternative to the popular k-means clustering algorithm, tailored to one-dimensional shapes. It was recently used by Moe et al. (2023) to cluster Stokes profiles from solar atmospheric simulations. One possible future avenue would be the application of functional data analysis techniques to the flare shape data using different methods to represent or cluster the data.
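To make the difference between the metrics concrete, the sketch below compares the Euclidean distance with a textbook dynamic time warping (DTW) distance on two toy pulses that share the same shape but are shifted in time. The pulses and function names are hypothetical, and the quadratic-time DTW recursion is the classic formulation, shown here only as an illustration of the invariance and of the extra computational cost.

```python
import math


def euclidean_distance(a, b):
    """L2 distance between two equally sampled curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic dynamic time warping distance with squared-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return math.sqrt(cost[n][m])


# HYPOTHETICAL toy pulses: same Gaussian shape, shifted by five samples
samples = range(60)
pulse_a = [math.exp(-0.5 * ((k - 20) / 3.0) ** 2) for k in samples]
pulse_b = [math.exp(-0.5 * ((k - 25) / 3.0) ** 2) for k in samples]

print(euclidean_distance(pulse_a, pulse_b), dtw_distance(pulse_a, pulse_b))
```

The Euclidean distance between the two pulses is large, while the DTW distance is close to zero because the warping absorbs the shift. The price is a cost that grows with the product of the two series lengths.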
Another way would be to keep the scaled profiles and experiment with different dimensionality reduction algorithms. In this work, we applied PCA and UMAP. Another powerful method is the use of autoencoders (Kramer 1991). These are special neural networks, where the output to predict is the input itself. Autoencoders consist of an encoder part that transforms the input into a compact, low-dimensional latent space, and a decoder part, which transforms the data back. Once trained, the encoder can be used as a powerful dimensionality reduction algorithm. As the neural network can be arbitrarily complex, it can – in theory – learn complex representations, beyond the capability of PCA. Thus, more sophisticated dimensionality reduction algorithms might reveal more insights about the flare shape space. With a flexible algorithm, the scaling in duration and amplitude could also be omitted, making it possible to find more general correlations. Lousto et al. (2022) studied the temporal morphology of radio pulses of the Vela Pulsar with similar goals. They worked with time series similar to flaring light curves, using variational autoencoders for dimensionality reduction, and self-organizing maps for the clustering of different pulse shapes. They succeeded in identifying different clusters and interpreted them as pulses originating from different heights in the pulsar magnetosphere.
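The encoder and decoder idea can be shown in its simplest form with a linear autoencoder that has tied weights and a single latent unit, trained by plain gradient descent. The following is a toy sketch under stated assumptions: the synthetic profiles, the learning rate, and the number of iterations are hypothetical, and the data are noise-free and vary along a single direction. In this linear case the network is expected to recover the same direction as the first principal component; the appeal of real autoencoders lies in the non-linear layers that are omitted here.

```python
import math
import random

random.seed(0)

# ASSUMPTION: synthetic stand-ins for scaled flare profiles, built from a
# fixed decay profile plus a bump of random strength.
grid = [0.5 * k for k in range(20)]
base = [math.exp(-t) for t in grid]
bump = [math.exp(-0.5 * (t - 3.0) ** 2) for t in grid]
strengths = [random.uniform(-1.0, 1.0) for _ in range(50)]
data = [[b + s * u for b, u in zip(base, bump)] for s in strengths]

# Center the data, as is also done before PCA
n_dim = len(grid)
mean = [sum(row[k] for row in data) / len(data) for k in range(n_dim)]
centered = [[row[k] - mean[k] for k in range(n_dim)] for row in data]
total_var = sum(sum(v * v for v in row) for row in centered) / len(centered)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reconstruction_loss(w):
    """Mean squared error of encoding z = w.x and decoding x_hat = z * w."""
    loss = 0.0
    for x in centered:
        z = dot(w, x)
        loss += sum((xi - z * wi) ** 2 for xi, wi in zip(x, w))
    return loss / len(centered)


# Tied-weight linear autoencoder with one latent unit
w = [0.1] * n_dim
learning_rate = 0.05 / total_var   # step size scaled by the data variance
for _ in range(300):
    grad = [0.0] * n_dim
    for x in centered:
        z = dot(w, x)                                # encoder
        r = [xi - z * wi for xi, wi in zip(x, w)]    # reconstruction residual
        rw = dot(r, w)
        for k in range(n_dim):
            grad[k] -= 2.0 * (x[k] * rw + z * r[k])  # d(loss)/d(w_k)
    w = [wi - learning_rate * g / len(centered) for wi, g in zip(w, grad)]

print(reconstruction_loss(w) / total_var)
```

After training, the single latent value z encodes the bump strength of each profile, and the weight vector is aligned with the bump shape, which is the only direction of variance in this toy data set.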
Also, we did not devote much time to the analysis of interesting individual flaring objects. Apart from highly active stars, these also include hot or compact stars, where the tentative detection of flaring is intriguing in itself; however, its confirmation is more involved. It would require careful and detailed analysis, similar to the work of Xing et al. (2024) about flaring hot subdwarfs and white dwarfs with TESS.
7. Summary
In this study, we have explored the information contained in the shapes of stellar flares. Our findings can be summarized as follows:
- We searched for flares in the first five years (sectors 1–69) of the TESS mission using 2-min cadence PDCSAP light curves.
- We used flatwrm2, a neural network-based algorithm, to find flares. We retrained flatwrm2 specifically for TESS 2-min cadence data by extending the previous training set with more than 4000 real TESS light curves, in which we identified flares manually. We have made this training set available at Zenodo; it includes not only flaring stars but also known astrophysical false positives.
- After filtering the flatwrm2 flare candidates and manual vetting, we ended up with a high-purity catalog of ∼120 000 flare events on ∼14 000 stars (available as an online supplement). Besides the basic parameters, we also extracted the scaled profile of each flare.
- We found that the flare parameters – equivalent duration, amplitude, and t1/2 – correlate with Teff along the MS: with increasing temperature, the equivalent duration and amplitude decrease, while t1/2 increases.
- Flare shapes change with Teff as well – flares of hotter stars are “fatter” and wider for a few t1/2, but they decay more quickly, that is, hotter solar-like stars typically show a slower decay in the conductive cooling phase and more complex cooling patterns compared to M dwarfs. This suggests that the plasma responsible for M-dwarf flare emission is denser than that of flares on hotter stars, and that M-dwarf flares probably originate from a deeper layer of the stellar atmosphere.
- There was no indication of clustering in the flare shapes, and the shapes seem to change gradually.
- The shapes of individual flares do not carry enough information to determine the physical parameters of their host.
- We created new flare templates for different Teff ranges.
- The PCA representation can be used to simulate realistic flare shapes and to find flares similar to given input shapes.
- Using SDO/EVE data on solar flares, we analyzed the effect of CMEs on flare shapes and found no obvious difference between flares with or without CMEs, suggesting that the diversity of the flares is not connected to CMEs.
As a future avenue, new observations from TESS and from the upcoming PLATO mission (Rauer et al. 2014) will provide an ever growing catalog of stellar flares. Using a larger and more diverse sample, knowledge of magnetically active stars can be deepened.
8. Data availability
Supplementary material is available on the Zenodo service: https://zenodo.org/records/14179313. It includes the manually vetted flare catalog, the extracted flare shapes, the training set, and example Jupyter notebooks.
Implemented in the wpca python package: https://github.com/jakevdp/wpca
Implemented in the umap python package: https://github.com/lmcinnes/umap
Version 2022.04.16 retrieved from https://www.pas.rochester.edu/~emamajek/EEM_dwarf_UBVIJHK_colors_Teff.txt
Acknowledgments
We would like to thank the anonymous reviewer for the helpful comments, especially regarding the physical background of flare shapes. We thank G. Csörnyei for the helpful discussions and suggestions regarding data analysis methods. This research was funded by the Hungarian National Research, Development, and Innovation Office grants KKP-143986 and K-138962. The authors acknowledge the financial support of the Austrian–Hungarian Action Foundation grant 117öu4. K.V. is supported by the Bolyai János Research Scholarship of the Hungarian Academy of Sciences. B.S. was supported by the ÚNKP-22-3 New National Excellence Program of the Ministry for Culture and Innovation. On behalf of the “Looking for stellar CMEs on different wavelengths” project, we are grateful for the possibility of using the HUN-REN Cloud. Sz.S. acknowledges the support (grant No. C1791784) provided by the Ministry of Culture and Innovation of Hungary of the National Research, Development and Innovation Fund, financed under the KDP-2021 funding scheme. This work made extensive use of numpy (van der Walt et al. 2011), scipy (Virtanen et al. 2020), pandas (McKinney 2010), scikit-learn (Pedregosa et al. 2011), matplotlib (Hunter 2007), and cmasher (Twicken et al. 2020).
References
- Aggarwal, C. C., Hinneburg, A., & Keim, D. A. 2001, in International Conference on Database Theory (Springer Berlin Heidelberg), 420
- Aizawa, M., Kawana, K., Kashiyama, K., et al. 2022, PASJ, 74, 1069
- Aschwanden, M. J. 2004, Physics of the Solar Corona. An Introduction (Praxis Publishing Ltd.)
- Balona, L. A. 2015, MNRAS, 447, 2714
- Baron, D. 2019, arXiv e-prints [arXiv:1904.07248]
- Berger, V. L., Hinkle, J. T., Tucker, M. A., et al. 2024, MNRAS, 532, 4436
- Bicz, K., Falewicz, R., Pietras, M., Siarkowski, M., & Preś, P. 2022, ApJ, 935, 102
- Borucki, W. J., Koch, D., Basri, G., et al. 2010, Science, 327, 977
- Boyd, D., Buchheim, R., Curry, S., et al. 2023, J. Am. Assoc. Var. Star Obs., 51, 14
- Brasseur, C. E., Phillip, C., Fleming, S. W., Mullally, S. E., & White, R. L. 2019, Astrophysics Source Code Library [record ascl:1905.007]
- Breiman, L. 2001, Mach. Learn., 45, 5
- Bruno, G., Pagano, I., Scandariato, G., et al. 2024, A&A, 686, A239
- Campello, R. J. G. B., Moulavi, D., & Sander, J. 2013, Pacific-Asia Conference on Knowledge Discovery and Data Mining (Berlin, Heidelberg: Springer)
- Candelaresi, S., Hillier, A., Maehara, H., Brandenburg, A., & Shibata, K. 2014, ApJ, 792, 67
- Chen, Y. C. 2017, arXiv e-prints [arXiv:1704.03924]
- Crowley, J., Wheatland, M. S., & Yang, K. 2022, ApJ, 941, 193
- Csörnyei, G., Dobos, L., & Csabai, I. 2021, MNRAS, 502, 5762
- Davenport, J. R. A. 2016, ApJ, 829, 23
- Davenport, J. R. A., Hawley, S. L., Hebb, L., et al. 2014, ApJ, 797, 122
- Davenport, J. R. A., Kipping, D. M., Sasselov, D., Matthews, J. M., & Cameron, C. 2016, ApJ, 829, L31
- Delchambre, L. 2015, MNRAS, 446, 3545
- Doyle, J. G., Shetye, J., Antonova, A. E., et al. 2018, MNRAS, 475, 2842
- Doyle, L., Ramsay, G., & Doyle, J. G. 2020, MNRAS, 494, 3596
- Doyle, J. G., Irawati, P., Kolotkov, D. Y., et al. 2022, MNRAS, 514, 5178
- Esquivel, J. A., Shen, Y., Leos-Barajas, V., et al. 2024, arXiv e-prints [arXiv:2404.13145]
- Feinstein, A. D., Montet, B. T., Foreman-Mackey, D., et al. 2019, PASP, 131, 094502
- Feinstein, A., Montet, B., & Ansdell, M. 2020a, J. Open Source Software, 5, 2347
- Feinstein, A. D., Montet, B. T., Ansdell, M., et al. 2020b, AJ, 160, 219
- Feinstein, A. D., Seligman, D. Z., Günther, M. N., & Adams, F. C. 2022, ApJ, 925, L9
- Feinstein, A. D., Seligman, D. Z., France, K., Gagné, J., & Kowalski, A. 2024, AJ, 168, 60
- Friedman, J. H. 2001, Ann. Stat., 29, 1189
- Gaia Collaboration (Brown, A. G. A., et al.) 2018, A&A, 616, A1
- García, C. R., Torres, D. F., & Patruno, A. 2022, MNRAS, 515, 3883
- Gilbert, E. A., Barclay, T., Quintana, E. V., et al. 2022, AJ, 163, 147
- Goodman, J. E., & O’Rourke, J. 2004, Handbook of Discrete and Computational Geometry, Second Edition (Chapman and Hall/CRC)
- Gopalswamy, N., Yashiro, S., Michalek, G., et al. 2009, Earth Moon Planets, 104, 295
- Gryciuk, M., Siarkowski, M., Sylwester, J., et al. 2017, Sol. Phys., 292, 77
- Günther, M. N., Zhan, Z., Seager, S., et al. 2020, AJ, 159, 60
- Hajdu, G., Dékány, I., Catelan, M., Grebel, E. K., & Jurcsik, J. 2018, ApJ, 857, 55
- Harra, L. K., Schrijver, C. J., Janvier, M., et al. 2016, Sol. Phys., 291, 1761
- Hauschildt, P. H., Allard, F., & Baron, E. 1999, ApJ, 512, 377
- Hawley, S. L., Davenport, J. R. A., Kowalski, A. F., et al. 2014, ApJ, 797, 121
- Higgins, M. E., & Bell, K. J. 2023, AJ, 165, 141
- Howard, W. S. 2022, MNRAS, 512, L60
- Howard, W. S., & Law, N. M. 2021, ApJ, 920, 42
- Howard, W. S., & MacGregor, M. A. 2022, ApJ, 926, 204
- Howard, W. S., Corbett, H., Law, N. M., et al. 2020, ApJ, 902, 115
- Hübner, M., Huppenkothen, D., Lasky, P. D., et al. 2022, ApJ, 936, 17
- Hunt, E. L., & Reffert, S. 2023, A&A, 673, A114
- Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90
- Hunt-Walker, N. M., Hilton, E. J., Kowalski, A. F., Hawley, S. L., & Matthews, J. M. 2012, PASP, 124, 545
- Ivezić, Ž., Connolly, A. J., VanderPlas, J. T., & Gray, A. 2014, Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data (Princeton University Press)
- Jackman, J. A. G., Shkolnik, E. L., Million, C., et al. 2023, MNRAS, 519, 3564
- Jackman, J. A. G., Shkolnik, E. L., Loyd, R. O. P., et al. 2024, MNRAS, 529, 4354
- Jenkins, J. M., Twicken, J. D., McCauliff, S., et al. 2016, in Software and Cyberinfrastructure for Astronomy IV, eds. G. Chiozzi, & J. C. Guzman (SPIE), Int. Soc. Opt. Photon., 9913, 99133E
- Jia, M., Luo, A.-L., & Qiu, B. 2024, MNRAS, 536, 3123
- Kashapova, L. K., Broomhall, A.-M., Larionova, A. I., Kupriyanova, E. G., & Motyk, I. D. 2021, MNRAS, 502, 3922
- Kővári, Z., Oláh, K., Günther, M. N., et al. 2020, A&A, 641, A83
- Kowalski, A. F. 2024, Liv. Rev. Sol. Phys., 21, 1
- Kowalski, A. F., Hawley, S. L., Wisniewski, J. P., et al. 2013, ApJS, 207, 15
- Kowalski, A. F., Mathioudakis, M., Hawley, S. L., et al. 2016, ApJ, 820, 95
- Kramer, M. A. 1991, AIChE J., 37, 233
- Leitzinger, M., Odert, P., Greimel, R., et al. 2014, MNRAS, 443, 898
- Leitzinger, M., Odert, P., Greimel, R., et al. 2020, MNRAS, 493, 4570
- Li, T., Chen, A., Hou, Y., et al. 2021, ApJ, 917, L29
- Liddle, A. R. 2007, MNRAS, 377, L74
- Lim, S. H., Raman, K. A., Buckley, M. R., & Shih, D. 2024, MNRAS, 533, 143
- Lin, C.-L., Apai, D., Giampapa, M. S., & Ip, W.-H. 2024, AJ, 168, 234
- Liu, Q., Lin, J., Wang, X., et al. 2023, MNRAS, 523, 2193
- Loredo, T., Budavari, T., Kent, D., & Ruppert, D. 2024, arXiv e-prints [arXiv:2408.14466]
- Lousto, C. O., Missel, R., Prajapati, H., et al. 2022, MNRAS, 509, 5790
- Loyd, R. O. P., Mason, J. P., Jin, M., et al. 2022, ApJ, 936, 170
- Maas, A. J., Ilin, E., Oshagh, M., et al. 2022, A&A, 668, A111
- Maehara, H., Shibayama, T., Notsu, Y., et al. 2015, Earth Planets Space, 67, 59
- Maneewongvatana, S., & Mount, D. M. 1999, arXiv e-prints [arXiv:cs/9901013]
- McInnes, L., Healy, J., & Astels, S. 2017, J. Open Source Software, 2
- McInnes, L., Healy, J., & Melville, J. 2018, arXiv e-prints [arXiv:1802.03426]
- Medina, A. A., Winters, J. G., Irwin, J. M., & Charbonneau, D. 2022, ApJ, 935, 104
- Mendoza, G. T., Davenport, J. R. A., Agol, E., Jackman, J. A. G., & Hawley, S. L. 2022, AJ, 164, 17
- Moe, T. E., Pereira, T. M. D., Calvo, F., & Leenaarts, J. 2023, A&A, 675, A130
- Murray, C. A., Queloz, D., Gillon, M., et al. 2022, MNRAS, 513, 2615
- Namekata, K., Sakaue, T., Watanabe, K., et al. 2017, ApJ, 851, 91
- Namekata, K., Maehara, H., Honda, S., et al. 2021, Nat. Astron., 6, 241
- Oláh, K., Kővári, Zs., Günther, M. N., et al. 2021, A&A, 647, A62
- Oláh, K., Seli, B., Kővári, Zs., Kriskovics, L., & Vida, K. 2022, A&A, 668, A101
- Pál, A., Szakáts, R., Kiss, C., et al. 2020, ApJS, 247, 26
- Paparrizos, J., & Gravano, L. 2016, SIGMOD Rec., 45, 69
- Pascoe, D. J., Smyrli, A., Van Doorsselaere, T., & Broomhall, A. M. 2020, ApJ, 905, 70
- Pearson, K. 1901, London Edinburgh Philos. Mag. J. Sci., 2, 559
- Pecaut, M. J., & Mamajek, E. E. 2013, ApJS, 208, 9
- Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, J. Mach. Learn. Res., 12, 2825
- Petrucci, R. P., Gómez Maqueo Chew, Y., Jofré, E., Segura, A., & Ferrero, L. V. 2024, MNRAS, 527, 8290
- Pettersen, B. R. 1989, Sol. Phys., 121, 299
- Pietras, M., Falewicz, R., Siarkowski, M., Bicz, K., & Preś, P. 2022, ApJ, 935, 143
- Pitkin, M., Williams, D., Fletcher, L., & Grant, S. D. T. 2014, MNRAS, 445, 2268
- Ramsay, J. 2005, Functional Data Analysis (John Wiley & Sons, Ltd), 2368
- Ramsay, G., Kolotkov, D., Doyle, J. G., & Doyle, L. 2021, Sol. Phys., 296, 162
- Rauer, H., Catala, C., Aerts, C., et al. 2014, Exp. Astron., 38, 249
- Reep, J. W., & Airapetian, V. S. 2023, ApJ, 958, 9
- Reep, J. W., Warren, H. P., Moore, C. S., Suarez, C., & Hayes, L. A. 2020, ApJ, 895, 30
- Ricker, G. R., Winn, J. N., Vanderspek, R., et al. 2014, SPIE Conf. Ser., 9143, 914320
- Roettenbacher, R. M., & Vida, K. 2018, ApJ, 868, 3
- Rousseeuw, P. J. 1987, J. Comput. Appl. Math., 20, 53
- Schofield, M., Chaplin, W. J., Huber, D., et al. 2019, ApJS, 241, 12
- Seli, B., Vida, K., Moór, A., Pál, A., & Oláh, K. 2021, A&A, 650, A138
- Seli, B., Oláh, K., Kriskovics, L., et al. 2022, A&A, 659, A3
- Seo, E., Kim, S., Lee, Y., et al. 2023, PASP, 135, 084101
- Sikora, J., David-Uraz, A., Chowdhury, S., et al. 2019, MNRAS, 487, 4695
- Skarka, M., Žák, J., Fedurco, M., et al. 2022, A&A, 666, A142
- Srinivasan, R., Crisostomi, M., Trotta, R., Barausse, E., & Breschi, M. 2024, Phys. Rev. D, 110, 123007
- Stassun, K. G., Oelkers, R. J., Paegert, M., et al. 2019, AJ, 158, 138
- Török, T., Panasenco, O., Titov, V. S., et al. 2011, ApJ, 739, L63
- Tovmassian, H. M., Zalinian, V. P., Silant’ev, N. A., Cardona, O., & Chavez, M. 2003, A&A, 399, 647
- Tu, Z.-L., Wu, Q., Wang, W., et al. 2022, ApJ, 935, 90
- Twicken, J. D., Caldwell, D. A., Jenkins, J. M., et al. 2020, J. Open Source Software, 5, 2004
- van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, Comput. Sci. Eng., 13, 22
- Van Doorsselaere, T., Shariati, H., & Debosscher, J. 2017, ApJS, 232, 26
- Veronig, A. M., Odert, P., Leitzinger, M., et al. 2021, Nat. Astron., 5, 697
- Vida, K., & Roettenbacher, R. M. 2018, A&A, 616, A163
- Vida, K., Leitzinger, M., Kriskovics, L., et al. 2019, A&A, 623, A49
- Vida, K., Bódi, A., Szklenár, T., & Seli, B. 2021, A&A, 652, A107
- Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, Nat. Methods, 17, 261
- Warren, H. P. 2006, ApJ, 637, 522
- McKinney, W. 2010, in Proceedings of the 9th Python in Science Conference, eds. S. van der Walt, & J. Millman, 56
- Woods, T. N., Eparvier, F. G., Hock, R., et al. 2012, Sol. Phys., 275, 115
- Xing, K., Zong, W., Silvotti, R., et al. 2024, ApJS, 271, 57
- Yang, H., & Liu, J. 2019, ApJS, 241, 29
- Yang, H., Liu, J., Gao, Q., et al. 2017, ApJ, 849, 36
- Yang, H., Liu, J., Qiao, E., et al. 2018, ApJ, 859, 87
- Yang, K. E., Sun, X., Kerr, G. S., & Hudson, H. S. 2023a, ApJ, 959, 54
- Yang, Z., Zhang, L., Meng, G., et al. 2023b, A&A, 669, A15
- Zhang, L., Yang, Z., Su, T., Han, X. L., & Misra, P. 2024, A&A, 689, A103
- Zimmerman, R., van Dyk, D. A., Kashyap, V. L., & Siemiginowska, A. 2024, MNRAS, 534, 2142
Appendix A: Mock flare shape test
To test what kind of variations we can recover from the flare shapes, we added artificial flares to 1000 real light curves with low, but nonzero flaring rates, then extracted them with the same procedure as the real events. The base flare shape model was the template of Davenport et al. (2014), with an additional temperature-dependent feature that contains a trend we try to recover. To create realistic datasets, we drew the t1/2 and ED values of the base flare from a joint distribution (a Gaussian kernel density estimate of the real flare dataset), then calculated the amplitude A analytically from the Davenport et al. (2014) template, for which ED is proportional to the product of A and t1/2.
We injected ten flares into each light curve at random times, taking care that they did not overlap with each other or with real flare events. We considered three different artificial flare profiles, using the Teff of the host star as a parameter. The purpose of this exercise was to recover the Teff using only the information artificially encoded in the flare profiles, illustrated in Fig. A.1.
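The conversion from ED and t1/2 to the amplitude can be sketched as follows. For a template of the form A · f((t − tpeak)/t1/2), the equivalent duration is ED = A · t1/2 · ∫f(x)dx, so A follows once the integral of the template is known. The sketch uses the published coefficients of the Davenport et al. (2014) template; the function names are hypothetical, and the closed form below is a reconstruction from the template rather than necessarily the exact expression used for the injections.

```python
import math

# Coefficients of the Davenport et al. (2014) flare template
RISE = (1.0, 1.941, -0.175, -2.246, -1.125)   # polynomial in x for -1 <= x <= 0
DECAY = ((0.6890, 1.600), (0.3030, 0.2783))   # (b_i, c_i) in b_i * exp(-c_i * x)


def template(x):
    """Template in scaled time x = (t - t_peak) / t_half, with peak flux 1."""
    if x < -1.0:
        return 0.0
    if x <= 0.0:
        return sum(a * x**k for k, a in enumerate(RISE))
    return sum(b * math.exp(-c * x) for b, c in DECAY)


def template_integral():
    """Analytic integral of the template over all scaled times."""
    # Antiderivative of the rise polynomial evaluated between -1 and 0
    rise = -sum(a * (-1.0) ** (k + 1) / (k + 1) for k, a in enumerate(RISE))
    # Each exponential term b * exp(-c * x) integrates to b / c on [0, inf)
    decay = sum(b / c for b, c in DECAY)
    return rise + decay


def amplitude_from_ed(ed, t_half):
    """Relative amplitude A for a given ED and t_half (same time units)."""
    return ed / (t_half * template_integral())


print(template_integral())
print(amplitude_from_ed(ed=2.0, t_half=5.0))
```

Because the integral is a fixed number for a given template, drawing (t1/2, ED) pairs from the joint distribution determines the amplitude of each injected flare without any fitting.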
Fig. A.1. Change of the injected flare shapes with effective temperature.
A.1. Gaussian bump
The injected flare consists of a single-peaked template F(t, tpeak, t1/2, A) from Davenport et al. (2014) and an added Gaussian bump.
The bump has a width of t1/2, amplitude of 0.1 times the amplitude of the flare, and position determined by Teff. This dependence of the bump position on Teff is the trend hidden in the dataset that we try to recover.
A.2. Quasi-periodic pulsation
The injected flare is the sum of a single-peaked template F(t, tpeak, t1/2, A) from Davenport et al. (2014) and a localized sinusoid (a sine curve with a Gaussian envelope). Its period PQPP is drawn from a uniform distribution between 0.8 and 1.2 t1/2, and its phase ϕQPP is drawn from a uniform distribution between 0 and 2π. The Teff dependence lies in the amplitude of the pulsation, which peaks at 5000 K. The random period and phase of the pulsation make the problem more generic. Quasi-periodic pulsations are a matter of current research (see, e.g., Doyle et al. 2018; Ramsay et al. 2021).
A.3. Pre-flare dip
This is the most subtle alteration: the template of Davenport et al. (2014) with an added Gaussian dip before the rise phase. The amplitude of the dip is the largest at Teff = 5000 K. Such dips have been observed by, for example, Leitzinger et al. (2014).
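The three injected profiles of Sects. A.1–A.3 can be sketched in scaled units as follows. Only the qualitative properties stated in the text are taken from the description (bump amplitude of 0.1 and width of one t1/2, period between 0.8 and 1.2 t1/2, random phase, and QPP and dip amplitudes that are largest at 5000 K). The specific Teff mappings, envelope parameters, and dip depth are hypothetical placeholders, as is the interpretation of the bump width as a Gaussian standard deviation.

```python
import math
import random


def template(x):
    """Davenport et al. (2014) template in scaled time x, with peak flux 1."""
    if x < -1.0:
        return 0.0
    if x <= 0.0:
        return 1.0 + 1.941 * x - 0.175 * x**2 - 2.246 * x**3 - 1.125 * x**4
    return 0.6890 * math.exp(-1.600 * x) + 0.3030 * math.exp(-0.2783 * x)


def gaussian(x, center, width):
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def teff_weight(teff, peak=5000.0, scale=1500.0):
    """HYPOTHETICAL Teff dependence that is largest at 5000 K."""
    return gaussian(teff, peak, scale)


def bump_flare(x, teff):
    """A.1: template plus a bump of amplitude 0.1 and width 1 (in t_half)."""
    # HYPOTHETICAL linear mapping of 3000-7000 K to positions of 1-5 t_half
    position = 1.0 + 4.0 * (teff - 3000.0) / 4000.0
    return template(x) + 0.1 * gaussian(x, position, 1.0)


def qpp_flare(x, teff, period, phase):
    """A.2: template plus a sine curve with a Gaussian envelope."""
    # HYPOTHETICAL envelope position, envelope width, and maximum amplitude
    envelope = gaussian(x, 2.0, 1.5)
    wave = math.sin(2.0 * math.pi * x / period + phase)
    return template(x) + 0.1 * teff_weight(teff) * envelope * wave


def dip_flare(x, teff):
    """A.3: template with a Gaussian dip before the rise phase."""
    # HYPOTHETICAL dip position, width, and maximum depth
    return template(x) - 0.05 * teff_weight(teff) * gaussian(x, -1.5, 0.3)


random.seed(1)
period = random.uniform(0.8, 1.2)           # in units of t_half
phase = random.uniform(0.0, 2.0 * math.pi)
xs = [-2.0 + 0.1 * k for k in range(121)]
qpp_profile = [qpp_flare(x, 5000.0, period, phase) for x in xs]
```

Evaluating these functions on the scaled grid for the Teff of each host star yields profiles in which the temperature is encoded only in the added feature, which is the information the dimensionality reduction is asked to recover.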
A.4. Recovery results
To visualize the recovered temperature dependence, Fig. A.2 shows the two-dimensional UMAP projections of the extracted flare shapes. A clear Teff gradient would show that the hidden trends can be revealed by the dimensionality reduction technique. Adding the Gaussian bump created the only clearly noticeable effect. The appearance of these bumps resembles the trends seen in the case of real TESS flares in Fig. 16. The PCA projection shows similar trends to UMAP. The fact that we cannot recover the other two types of variability (quasi-periodic pulsation and pre-flare dip) indicates the limitations of the methods used in this study.
Fig. A.2. Result of the uniform manifold approximation and projection algorithm on the recovered mock flares. Only the Gaussian bump shows a clear trend with Teff.
All Figures
Fig. 1. Example light curves from the training set. The upper-left panel shows a real flare, and the others are false positives. All panels show one-day long segments.
Fig. 2. Example light curve color coded with the flatwrm2 prediction. Gray lines show the positions of the validated flares from the final catalog.
Fig. 3. Illustrative example of the extraction of a scaled flare shape. Left: Light curve segment around the flare. Gray shows the points used for the baseline fit; the red line shows the fitted polynomial. Right: Scaled flare shape. The red line shows the flare template used for the time scaling. The large black dots are from the original light curve; the small black dots are the interpolated points.
Fig. 4. Weighted PCA basis. Upper left: First five PCs, with a dashed line denoting the average flare profile. Upper right: Ratio of the sample variance that a given PC can recover. A single feature from the original 200-dimensional dataset would amount to 0.5%. Lower panels: Example light curves with the PCA reconstruction using 20 PCs.
Fig. 5. Histogram of stars observed with TESS 2-min cadence up to sector 69. The Teff values are from TICv8.2. The distribution is truncated at 18 000 K.
Fig. 6. Sample size comparison between different stellar flare catalogs created from Kepler and TESS data. The color indicates the observing cadence. Filled circles are catalogs that are publicly available. The following catalogs are shown: Balona (2015), Davenport (2016), Van Doorsselaere et al. (2017), Roettenbacher & Vida (2018), Yang et al. (2018), Yang & Liu (2019), Feinstein et al. (2020b, 2022, 2024), Günther et al. (2020), Howard (2022), Howard & MacGregor (2022), Pietras et al. (2022), Tu et al. (2022), Yang et al. (2023b), Bruno et al. (2024), Lin et al. (2024), Zhang et al. (2024).
Fig. 7. Flaring stars on the Gaia color-magnitude diagram colored with the flare rate. We note that the stars are plotted in order of their flare rates to show the most active stars on top. Gray points show all the stars prior to manual vetting (Sect. 2.5) in order to make the position of the red giant branch more discernible.
Fig. 8. Some interesting complex flares identified during manual vetting. The left panels show flares with possible quasi-periodic modulation.
Fig. 9. Binning on the Gaia color-magnitude diagram for the calculation of the average flare shapes. Around each point, the stars inside an ellipse are counted.
Fig. 10. Change of basic flare parameters across the MS with the same binning as in Fig. 9. The black line shows the median value, and gray shading is shown between the 16th and 84th percentiles.
Fig. 11. Fitted parameters for the power law in the form ED
Fig. 12. Distribution of flares in the PC space. Each panel shows a two-dimensional histogram. Below the diagonal, the shading indicates the number density of flares. Above the diagonal, the color code indicates the average Teff from TICv8.2 in each bin. The Teff dependence is most apparent in PC5.
Fig. 13. Two-dimensional UMAP projection of the scaled flare shapes. Each panel shows a two-dimensional histogram color-coded by the density of points, Teff, and log g from TICv8.2.
Fig. 14. Average flare shapes from different positions in the UMAP space. Different colors show the median profiles and the range between the 16th and 84th percentiles inside the given circles in the UMAP space. Each circle includes approximately 1000 flares. Dashed lines in the subplots show the whole sample median for comparison.
Fig. 15. Changes in the flare shapes in the PC space along the MS. The points denote the median value of the PC coefficients for the stars in the given bins from Fig. 9, colored with Teff. The error bars show the standard error of the median,
Fig. 16. Flare shapes along the MS using the binning from Fig. 9. Top: Median shapes colored with Teff. Middle: Residual flare shapes made by removing the whole sample median from the median shape in each bin. Left: Number of flares in each bin along the MS parameterized by Teff.
Fig. 17. Flare templates fitted along the MS using Eqs. (9)–(10) and color coded with Teff. The shaded region shows the 16th and 84th percentiles of the dataset. The black dashed line shows the template of Davenport et al. (2014).
Fig. 18. Parameters of the flare template fitted along the MS in the following form: $F_{\rm rise}(t) = 1 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4$ and $F_{\rm decay}(t) = b_1 e^{-c_1 t} + b_2 e^{-c_2 t}$. The shaded regions show the formal uncertainty of the fit.
Fig. 19. Flare shapes randomly sampled from a kernel density estimator trained on the given number of PCs.
Fig. 20. Nearest neighbor search in the flare shape space. The red curve on each panel is the injected flare shape, and the 30 closest flares are shown in black.
Fig. 21. Morphology of the solar flares observed in the 304 Å channel of SDO/EVE. Left: Median shapes of solar flares with and without CMEs and their difference. The median flare shape from TESS is also shown. Right: UMAP projection of the scaled solar flare shapes. Blue and red points show events with and without CMEs, respectively.