Open Access
A&A, Volume 688, August 2024, Article Number A6
Section: Numerical methods and codes
DOI: https://doi.org/10.1051/0004-6361/202449495
Published online: 29 July 2024

© The Authors 2024

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1 Introduction

High spatial resolution and high signal-to-noise observations are prerequisites for most observational astrophysical problems. However, achieving both simultaneously is challenging, as space telescopes have limited collecting power, while large telescopes are ground-based and therefore affected by atmospheric turbulence. This is clearly illustrated by the two main missions of the next decade: the ESA-NASA Euclid space telescope (Euclid Collaboration 2022; Laureijs et al. 2011) and the Vera C. Rubin Observatory (Ivezić et al. 2019). Exploiting the best of both worlds is possible, provided reliable post-processing techniques are developed to remove blurring by the atmosphere and the instrument point spread function (PSF). To further complicate matters, sensor variations introduce noise into images. Hence, image deconvolution in astrophysics is an ill-posed and ill-conditioned inverse problem that requires regularisation to achieve a unique solution. This was realised very early in the field, and solutions were proposed, such as minimising the Tikhonov functional (e.g. Tikhonov & Arsenin 1977) or maximising the entropy of the solution (e.g. Skilling & Bryan 1984). Other algorithms, based on Bayesian statistics, include the Lucy-Richardson algorithm, used on the early data from the Hubble Space Telescope (HST; Richardson 1972; Lucy 1974). Magain et al. (1998) proposed the two-channel MCS method, which separates point sources from spatially extended ones and deconvolves with a narrow PSF, hence achieving a finite but improved resolution compatible with the sampling (pixel size) chosen to represent the solution. Improvements of the MCS method implemented wavelet regularisation of the extended channel of the solution (Cantale et al. 2016a), and this was further refined in STARRED by Michalewicz et al. (2023), who used an isotropic wavelet basis called starlets (Starck et al. 2015) to regularise the solution.

However, deep learning offers a completely different approach by learning the properties of the desired solution. Once trained, deep learning-based methods are also orders of magnitude faster than classical methods. Of note, UNets (Ronneberger et al. 2015) have become popular due to their highly non-linear processing and multi-scale approach. Building on the UNet, Sureau et al. (2020) developed the Tikhonet method for deconvolving galaxy images in the optical domain. They demonstrated that Tikhonet outperforms sparse regularisation-based methods in terms of the mean squared error and a shape criterion, where a measure of the galaxy ellipticity is used to encode its shape. Nammour et al. (2022) added a shape constraint to the Tikhonet loss function and achieved better performance. In our recent work (Akhaury et al. 2022), we proposed a deconvolution approach that employs the Learnlet decomposition (Ramzi et al. 2023). It uses the same two-step approach as Sureau et al. (2020) but substitutes the UNet denoiser with a Learnlet.

With the recent advent of Vision Transformers (Dosovitskiy et al. 2021), significant progress has been made in the field of image restoration (Liang et al. 2021; Zamir et al. 2022; Wang et al. 2022). This motivated the present work to investigate the performance of SUNet (Fan et al. 2022), a variant of the UNet in which Swin Transformer blocks (Liu et al. 2021) replace the convolutional layers. To our knowledge, SUNet has not yet been used as a denoiser in a deconvolution framework. We show that the neural network outputs introduce a bias, thereby limiting the scientific analysis. The bias appears as a small flux loss in the outputs. To counter this, we propose a third, debiasing step based on the active wavelet coefficients of the solution, known as the multi-resolution support (Starck et al. 1995) and explained in Sect. 2.3. Our experiments involve real HST images, and the network was trained on images extracted from the CANDELS survey (Grogin et al. 2011; Koekemoer et al. 2011).

Finally, to assess its generalisability, we tested our new code on a completely different sample of ground-based images obtained with the FORS2 camera at the Very Large Telescope (VLT). Spatially resolved images of galaxies can help address a multitude of topics involving their morphologies. In the present case, we tackle the question of how the galaxy cluster environment can impact the properties of the discs of spiral galaxies. From the same dataset, Cantale et al. (2016b) found that 50% of cluster spiral galaxies at redshift 0.5-0.9 have disc V – I colours that are redder by more than 1σ of the mean colours of their field counterparts. This result was obtained thanks to the deconvolution code Firedec (Cantale et al. 2016a). The VLT V- and I-band images, with initial spatial resolutions between 0.4" and 0.8", were deconvolved with a target final resolution of 0.1" on 0.05" pixels. Even though the gain in resolution was substantial, it was not sufficient to go beyond the global photometric properties of the discs. In this study, we go one step further and investigate their internal structure, specifically by identifying their star-forming clumps.

Compact star-forming clumps have been identified in distant galaxies, particularly with the aid of HST deep images (e.g., Wuyts et al. 2012; Guo et al. 2015; Sattari et al. 2023). They are understood to play a crucial role in galaxy assembly and star-formation activity. Recent research by Sok et al. (2022) explored clump fractions in star-forming galaxies from multi-band analysis in the COSMOS field. The match between HST and ground-based resolution was performed with Firedec. Their findings indicated a decline in the fraction of clumpy galaxies with increasing stellar masses and redshifts. Moreover, they observed that clumps contributed a higher fractional mass towards galaxies at higher redshifts. In our study, by employing a more powerful deconvolution algorithm capable of accurately recovering small-scale structures at high spatial resolution from ground-based multi-band observations (as demonstrated in Sect. 4.2), our goal is to quantify the number of clumps in EDisCS cluster galaxies and examine their relationship with disc colour.

In Sect. 2, we present the deconvolution problem and introduce our proposed deep learning method to address it. The process of generating our dataset and conducting experiments is outlined in Sect. 3. In Sect. 4, we present the results of our deconvolution algorithm. Finally, we draw our conclusions in Sect. 5. To support reproducible research, the codes and trained models utilised in this article are publicly accessible online (see Data availability). Additional studies and supplementary information can be found in Appendices A-C.

2 Deep learning-based deconvolution

The deconvolution problem can be summarised with a very simple (but hard to solve) equation. Let y ∈ ℝ^(n×n) be the observed image and h ∈ ℝ^(n×n) be the PSF. The observed image can be modelled as

$\mathbf{y} = \mathbf{h} * \mathbf{x}_t + \boldsymbol{\eta}$, (1)

where x_t ∈ ℝ^(n×n) denotes the target image, * denotes the convolution operation, and η ∈ ℝ^(n×n) denotes additive Gaussian noise. The goal is to recover the ground truth image x_t, given the PSF and the unknown noise. Such ill-posed inverse problems require regularisation of the solution in order to select the most appropriate one among the many that are compatible with the data. Sparse wavelet regularisation using the ℓ0 or ℓ1 norm was the most widely accepted regularisation in the past, but the recent advent of machine learning methods has changed the paradigm.

2.1 Tikhonov deconvolution

Tikhonov deconvolution is a two-step deep learning-based deconvolution technique. In the first step, the input images undergo deconvolution using a Tikhonov filter with quadratic regularisation. If $\mathbf{H} \in \mathbb{R}^{n^2 \times n^2}$ denotes the circulant matrix associated with the convolution operator h, the Tikhonov solution of Eq. (1) is expressed as

$\hat{\mathbf{x}} = \left(\mathbf{H}^\top \mathbf{H} + \lambda \boldsymbol{\Gamma}^\top \boldsymbol{\Gamma}\right)^{-1} \mathbf{H}^\top \mathbf{y}$. (2)

Here, $\boldsymbol{\Gamma} \in \mathbb{R}^{n^2 \times n^2}$ represents the linear Tikhonov filter, configured as a Laplacian high-pass filter to penalise high frequencies. The regularisation weight, denoted as λ ∈ ℝ+, is determined through a grid search. The Tikhonov step produces deconvolved images containing correlated additive noise, which is subsequently eliminated in the following step by an appropriate denoiser. The denoisers are trained to learn the mapping from the Tikhonov output $\hat{\mathbf{x}}$ to the ground truth image x_t by minimising a suitable loss function, such as the ℓ1 or ℓ2 norm.
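Because both H and Γ are circulant (they implement convolutions), Eq. (2) can be evaluated efficiently in the Fourier domain. The following minimal sketch illustrates this closed-form solution with NumPy; the function name, the assumption of a square image with a centred PSF, and the use of periodic boundary conditions are ours, not taken from the authors' released code.

```python
import numpy as np

def tikhonov_deconvolve(y, psf, lam):
    """Fourier-domain Tikhonov deconvolution (Eq. 2) with a Laplacian high-pass
    regulariser. Assumes a square image and a PSF centred in its array."""
    n = y.shape[0]
    # Discrete Laplacian placed at the origin (periodic wrap): the high-pass filter Gamma
    lap = np.zeros((n, n))
    lap[0, 0] = 4.0
    lap[0, 1] = lap[1, 0] = lap[0, -1] = lap[-1, 0] = -1.0
    H = np.fft.fft2(np.fft.ifftshift(psf))   # transfer function of the blur
    G = np.fft.fft2(lap)                     # transfer function of Gamma
    Y = np.fft.fft2(y)
    # Closed form of (H^T H + lambda Gamma^T Gamma)^-1 H^T y for circulant operators
    X = np.conj(H) * Y / (np.abs(H) ** 2 + lam * np.abs(G) ** 2)
    return np.real(np.fft.ifft2(X))
```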

The denoising performance is significantly influenced by the choice of the model architecture. To effectively capture distant correlations, it is crucial to incorporate multi-scale processing in the model design. This consideration leads to the adoption of a layout similar to that of a UNet (Ronneberger et al. 2015). Additionally, Mohan et al. (2020) demonstrated that biases in convolutional layers can result in a low generalisation capability. Consequently, for our experiments, we opted for bias-free networks.

2.2 SUNet denoising

In recent times, the UNet architecture has become popular in various image-processing applications due to its hierarchical feature maps, which facilitate the acquisition of multi-scale contextual features. It is widely employed in diverse computer vision tasks, including segmentation and restoration (Yu et al. 2019; Gurrola-Ramos et al. 2021). Evolved versions such as Dense-UNet (Guan et al. 2020), Res-UNet (nan Xiao et al. 2018), Non-local UNet (Yan et al. 2020), and Attention UNet (Jin et al. 2020) have also emerged. Thanks to its flexible structure, the UNet can adapt to different building blocks, enhancing its overall performance. Moreover, the evolution of image-processing methodologies has seen the introduction of transformer models (Vaswani et al. 2017), which were initially successful in natural language processing but have also demonstrated impressive performance in image classification (Dosovitskiy et al. 2021; Yuan et al. 2021). However, when directly applied to vision tasks, transformers face challenges: an image flattened into a one-dimensional sequence is far longer than a typical text sequence, and the cost of self-attention grows quadratically with sequence length, making transformers less effective at modelling such long sequences. Additionally, transformers are not well suited for dense prediction tasks such as instance segmentation at the pixel level. The Swin Transformer addresses these issues by introducing a shifted-window attention mechanism that keeps the computational cost manageable, establishing itself as a state-of-the-art solution for high-level vision tasks (Liu et al. 2021). Taking inspiration from these advancements, Fan et al. (2022) incorporated Swin Transformers as building blocks within the UNet architecture and showed that they could achieve competitive results compared to existing benchmarks for image denoising. The network architecture is depicted in Fig. 1, and the PyTorch code is publicly available on GitHub (details in the Data availability section).

While SUNet was originally developed for white Gaussian noise removal, the Tikhonov deconvolution step (Eq. (2)) alters the noise characteristics of the image, leaving correlated, or coloured, Gaussian noise (CGN). Consequently, it becomes crucial to assess the extent to which SUNet can effectively handle CGN, and our tests examine its generalisability in this regime.
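As an aside, spatially correlated (coloured) Gaussian noise for such a test can be produced by low-pass filtering white Gaussian noise. In the actual pipeline the correlations come from the Tikhonov step itself, so the smoothing kernel in the short sketch below is purely illustrative.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
white = rng.normal(0.0, 1.0, size=(128, 128))                   # white Gaussian noise
b3 = np.array([1, 4, 6, 4, 1]) / 16.0                           # B3-spline smoothing kernel
coloured = fftconvolve(white, np.outer(b3, b3), mode="same")    # spatially correlated noise
```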

Fig. 1

SUNet architecture with Swin Transformer blocks replacing the convolutional layers while preserving the multi-scale Unet backbone. Credits: Fan et al. (2022).

2.3 Debiasing with multi-resolution support

Our study reveals that the deep learning-based solution introduces a bias, evident in the form of positive structures in the residuals, as illustrated in Fig. 2a. This bias can have implications for the accuracy of scientific analyses, potentially influencing the flux estimation within the recovered features of the reconstructed images. The multi-resolution support (Starck et al. 1995) has proven effective in recovering the lost flux and enhancing image sharpness. Denoting the starlet transform as Φ and the SUNet output solution as x_0, an iterative process allows the flux to be recovered from the residual. At each iteration j, a debiasing correction term is applied by multiplying the multi-resolution support matrices M of the SUNet output x_0 at each scale with the starlet decomposition of the gradient of the residual, incrementally modifying the deconvolved image:

$\mathbf{x}_{j+1} = \mathbf{x}_j + \mathrm{prox}(\nabla_{\mathbf{x}})$, (3)

where

$\mathbf{r}_j = \mathbf{y} - \mathbf{H}\mathbf{x}_j, \quad \nabla_{\mathbf{x}} = \mathbf{H}^\top \mathbf{r}_j, \quad \mathrm{prox}(\nabla_{\mathbf{x}}) = \left(\Phi^\top \mathbf{M} \Phi\right) \nabla_{\mathbf{x}}, \quad \text{and} \quad \mathbf{M} = \mathrm{MRS}\left(\Phi(\mathbf{x}_0)\right)$.

The acronym 'MRS' stands for multi-resolution support and indicates a Boolean measure of whether an image I_0 contains information at a specific pixel and scale. If c represents the wavelet coefficient at a given scale and λ is the threshold value, the multi-resolution support operation is defined as

$\mathrm{MRS}(c) = \begin{cases} 1, & \text{if } |c| > \lambda \\ 0, & \text{otherwise.} \end{cases}$ (4)

The flux-recovery process is illustrated in Fig. 2. The iterative process was stopped once convergence was achieved in the standard deviation of the residual as a function of the number of iterations, as seen in Fig. 2d. A more detailed study on the impact of debiasing with multi-resolution support on neural networks is presented in Appendix B.
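A minimal sketch of this debiasing loop is given below. It assumes a B3-spline à trous (starlet) transform, estimates the per-scale threshold from a median absolute deviation with a 5σ factor, and runs for a fixed number of iterations instead of monitoring the residual standard deviation; all function and parameter names are ours, not the authors' implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def starlet_transform(img, n_scales):
    """A trous (starlet) decomposition into n_scales detail planes plus a coarse plane.
    The planes sum back to the input image."""
    h = np.array([1, 4, 6, 4, 1]) / 16.0            # B3-spline scaling kernel
    planes, smooth = [], img.astype(float)
    for j in range(n_scales):
        kernel = np.zeros(5 + 4 * (2**j - 1))
        kernel[::2**j] = h                           # insert 2^j - 1 zeros between taps
        k2d = np.outer(kernel, kernel)
        smoother = fftconvolve(smooth, k2d, mode="same")
        planes.append(smooth - smoother)             # detail coefficients at scale j
        smooth = smoother
    planes.append(smooth)                            # coarse residual
    return planes

def debias_mrs(x0, y, psf, n_scales=5, k_sigma=5.0, n_iter=50):
    """Iterative flux recovery (Eq. 3) guided by the multi-resolution support of x0.
    Noise estimate and stopping rule (fixed n_iter) are our assumptions."""
    support = []
    for w in starlet_transform(x0, n_scales)[:-1]:
        sigma = 1.4826 * np.median(np.abs(w - np.median(w)))   # robust noise estimate
        support.append(np.abs(w) > k_sigma * sigma)            # Eq. (4) per scale
    x = x0.copy()
    for _ in range(n_iter):
        r = y - fftconvolve(x, psf, mode="same")                # r_j = y - H x_j
        grad = fftconvolve(r, psf[::-1, ::-1], mode="same")     # H^T r_j
        planes = starlet_transform(grad, n_scales)
        corr = sum(m * w for m, w in zip(support, planes[:-1])) # Phi^T M Phi grad
        x = x + corr
    return x
```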

Fig. 2

Iterative recovery of lost flux through debiasing using multi-resolution support. (a) Original SUNet output. The red square highlights the residual flux lost. (b) Multi-resolution support matrices at each decomposed scale. (c) Debiased solution after iterative correction with multi-resolution support highlighting the reduction in structured residuals. (d) Standard deviation of the residual within the highlighted region as a function of the number of iterations. The process was stopped upon achieving convergence.

3 Dataset and experiments

3.1 Training dataset generation

We extracted HST cutouts measuring 128 × 128 pixels from CANDELS (Grogin et al. 2011; Koekemoer et al. 2011) in the F606W filter (V-band). These cutouts were then convolved with a Gaussian PSF having a full width at half maximum (FWHM) of 15 pixels. Following the convolution, we injected white Gaussian noise with a standard deviation denoted as σ_noise. This choice ensured that the faintest object in our dataset had a peak signal-to-noise ratio (S/N) close to one and was barely visible. With this particular value of σ_noise, our dataset exhibited a range of S/N values depending on the magnitude of each galaxy. To standardise the images, each image was normalised to the [-1, 1] range by subtracting its mean and dividing by its maximum value. Finally, the batch of images was randomly divided into training, validation, and test subsets in the ratio 0.8 : 0.1 : 0.1.
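A minimal sketch of this degradation pipeline is shown below; the numerical value of sigma_noise and the exact normalisation convention are assumptions on our part.

```python
import numpy as np
from scipy.signal import fftconvolve

def make_training_pair(hst_cutout, fwhm_pix=15.0, sigma_noise=4e-3, rng=None):
    """Degrade a 128x128 HST cutout: Gaussian PSF blur plus white Gaussian noise,
    then normalise to [-1, 1]. sigma_noise is a placeholder; the paper tunes it so
    that the faintest object has a peak S/N close to one."""
    rng = np.random.default_rng() if rng is None else rng
    n = hst_cutout.shape[0]
    sigma_psf = fwhm_pix / 2.355                       # FWHM = 2.355 * sigma
    yy, xx = np.mgrid[:n, :n] - n // 2
    psf = np.exp(-(xx**2 + yy**2) / (2 * sigma_psf**2))
    psf /= psf.sum()
    blurred = fftconvolve(hst_cutout, psf, mode="same")
    noisy = blurred + rng.normal(0.0, sigma_noise, size=blurred.shape)
    centred = noisy - noisy.mean()                     # subtract the mean ...
    return centred / np.abs(centred).max(), psf        # ... and scale into [-1, 1]
```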

3.2 Training the SUNet

The SUNet architecture was trained on a Titan RTX Turing GPU with 24 GB of RAM for each job. The training aimed at learning the mapping from the Tikhonov output $\hat{\mathbf{x}}$ (containing CGN) to the corresponding HST image x_t. The training used the Adam optimiser (Kingma & Ba 2014) with an initial learning rate of 10⁻³, gradually halved every 25 epochs until reaching a minimum of 10⁻⁵. As in the original SUNet paper (Fan et al. 2022), the ℓ1 loss was used for training. A more detailed discussion of the significance of the training loss function is given in Appendix C. The input images were processed in mini-batches of size 16. The dataset was augmented with random rotations in multiples of 90°, translations, and flips along the horizontal and vertical axes. Starting with 22 317 images in our initial training dataset, we increased its size by a factor of ten through augmentation.
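The optimisation schedule can be summarised by the sketch below. The network here is a small bias-free stand-in for SUNet and the tensors are random placeholders; only the optimiser, learning-rate schedule, mini-batch size, and ℓ1 loss follow the text.

```python
import torch
from torch import nn

# Bias-free stand-in for SUNet (the real model and dataloader are in the linked repository)
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1, bias=False), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1, bias=False))
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.L1Loss()                                   # l1 training loss

for epoch in range(100):
    lr = max(1e-3 * 0.5 ** (epoch // 25), 1e-5)           # halve every 25 epochs, floor at 1e-5
    for group in optimiser.param_groups:
        group["lr"] = lr
    tikhonov_batch = torch.randn(16, 1, 128, 128)         # stands in for Tikhonov outputs
    hst_batch = torch.randn(16, 1, 128, 128)              # stands in for HST targets
    optimiser.zero_grad()
    criterion(model(tikhonov_batch), hst_batch).backward()
    optimiser.step()
```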

3.3 Test dataset

We utilised the ESO Distant Cluster Survey (EDisCS; White et al. 2005) as our benchmark dataset to assess the performance of our deconvolution method. EDisCS is an extensive ESO Large Programme focused on the analysis of 20 galaxy clusters within the redshift range 0.4 < z < 1 and covering a diverse range of masses, with velocity dispersions ranging from approximately 200 to 1000 km s−1. All of the clusters benefited from deep B, V, R, and I photometry obtained with FORS2 at the VLT. Additionally, a subset of ten clusters was imaged with the ACS on the HST in the F814W filter (Simard et al. 2002). Previous work by Cantale et al. (2016b) employed the deconvolution technique Firedec to analyse spiral disc colours for EDisCS cluster galaxies, investigating trends with cluster masses and lookback time. For our study, we focused on analysing a subset of EDisCS clusters at three distinct redshifts (z ≈ 0.58, z ≈ 0.7, and z ≈ 0.79) using our proposed deconvolution method, and we went a step further by investigating the star-forming clumps in these galaxies. Table 1 provides a summary of the properties of these clusters and the number of galaxies in each of them.

Our analysis specifically considered the V (555 nm), R (655 nm), and I (768 nm) photometric bands, allowing us to evaluate the effectiveness of our deconvolution method in capturing variations across these photometric bands. To enable a more detailed analysis, we grouped these EDisCS galaxies based on their disc colour in order to study the different trends. Since the EDisCS clusters were solely observed in the F814W filter for HST, our deconvolution method presents a unique opportunity to extract high spatial resolution galaxy properties from the ground-based FORS2 multi-band observations. Furthermore, we demonstrate the superiority of SUNet over Firedec in generating cleaner deconvolved images, as detailed in Sects. 4.2 and 4.3.

Table 1

Summary of the EDisCS clusters considered for analysis.

Fig. 3

Distribution of the galaxy magnitudes in the HST F814W filter for the EDisCS samples, which were solely observed in the F814W filter for HST.

4 Results

The EDisCS images in the V-, R-, and I-bands served as inputs for the SUNet deconvolution framework, yielding corresponding deconvolved outputs. The algorithm operates with exceptional speed, requiring only ≈15.2 ms to deconvolve a single image on a Titan RTX Turing GPU with 24 GB of RAM, approximately 10⁴ times faster than Firedec on average. A histogram of the magnitudes of the same objects in the HST F814W filter is shown in Fig. 3. In Sect. 4.1, we detail our approach to measuring object sizes and detecting clumps in the deconvolved outputs. Section 4.2 then presents a comparative analysis of the quality of our SUNet outputs against Firedec (Cantale et al. 2016a). With the validity of our method established, Sect. 4.3 presents the deconvolved results, followed by a thorough analysis in Sect. 4.4.

4.1 Object size and clump detection

To detect clumps, or small-scale structures, and measure the sizes of galaxies, we employed the SCARLET Python package (Melchior et al. 2018). Utilising the Starlet transform from SCARLET, our approach involved decomposing images into different scales, where each scale captures a specific frequency component. To ensure consistency for an unbiased comparison, we maintained fixed algorithm parameters across different bands and objects. All images underwent decomposition into five scales, with the fourth scale chosen for size detection and the second scale for clump detection. A 5σ detection threshold was applied to each scale during the process. In the context of clump detection, the Starlet transform was computed using the standard deviation solely within the region enclosed by the size detection outline, rather than considering the entire image. This refined approach ensured more precise thresholding in the Starlet space. Finally, a clump was only considered valid if it lay within the size detection outline, ensuring that background artefacts were excluded. In Fig. 4, we present examples of size detection and their corresponding clump detection cases.
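The detection logic can be sketched as follows. The paper relies on the starlet transform shipped with SCARLET; the code below instead reuses the generic starlet_transform helper from the Sect. 2.3 sketch, since we do not reproduce SCARLET's own API here, and the thresholding details are our assumptions.

```python
import numpy as np
from scipy import ndimage

# starlet_transform: the helper defined in the Sect. 2.3 sketch
def detect_size_and_clumps(img, n_scales=5, k_sigma=5.0):
    """Illustrative size/clump detection in the starlet domain (not SCARLET's API):
    size from the 4th scale, clumps from the 2nd scale, 5-sigma thresholds."""
    planes = starlet_transform(img, n_scales)

    def mask_at(scale, region=None):
        w = planes[scale]
        vals = w if region is None else w[region]
        sigma = 1.4826 * np.median(np.abs(vals - np.median(vals)))  # robust sigma
        return w > k_sigma * sigma                                  # significant positive coefficients

    size_mask = mask_at(3)                        # 4th scale (index 3): galaxy extent
    clump_mask = mask_at(1, region=size_mask)     # 2nd scale, sigma measured inside the outline
    clump_mask &= size_mask                       # keep only clumps inside the galaxy
    labels, n_clumps = ndimage.label(clump_mask)  # count connected clump regions
    return size_mask, labels, n_clumps
```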

Fig. 4

Size detection (outer contour) and clump detection (inner contours) using SCARLET. The first row shows the FORS2 images in the V-, R-, and I-bands, with the corresponding SUNet outputs displayed directly below. For comparison, the HST image in the F814W filter is shown adjacent to the SUNet I-band output. All images are decomposed into five scales, with the fourth scale chosen for size detection and the second scale for clump detection.

4.2 Comparison with classical methods

We conducted a thorough performance comparison between SUNet and Firedec, a classical deconvolution method based on wavelet regularisation (Cantale et al. 2016a). For direct comparison with HST quality, we concentrated on the I-band outputs for each method, as the EDisCS clusters were exclusively observed in the F814W filter for HST. Both methods exhibited a better performance on low-magnitude (or high-S/N) images, with a gradual decline in performance in the high-magnitude (or low-S/N) regime. In this case, the mean squared error metric between the deconvolved outputs and the ground truth HST images is not a robust metric for indicating similarity since it is biased by the background noise in the HST images (Wang & Bovik 2009). Instead, we used the structural similarity index measure (SSIM), a full reference metric which quantifies the similarity between two images by comparing their structural information or spatial interdependencies (Wang et al. 2004). An SSIM of one implies identical images. The observed trends are illustrated in Fig. 5. We further assessed the ability of the deconvolution algorithms to accurately resolve small-scale structures. For this, we leveraged the SCARLET Python package, as detailed in Sect. 4.1. The fraction of area overlap between small-scale structure detections in the deconvolved outputs and HST is depicted in Fig. 6. Based on these metrics, SUNet clearly outperforms Firedec.
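For reference, the SSIM values can be computed with scikit-image, as in the short sketch below (array names are placeholders; both images are assumed to be registered on the same pixel grid).

```python
from skimage.metrics import structural_similarity

def ssim_score(deconvolved, hst_reference):
    """SSIM between a deconvolved output and the HST reference image."""
    data_range = hst_reference.max() - hst_reference.min()
    return structural_similarity(deconvolved, hst_reference, data_range=data_range)
```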

Building on the foundation of Firedec, an enhanced method named STARRED was recently introduced by Michalewicz et al. (2023). STARRED brings innovation by incorporating an isotropic wavelet basis known as Starlets (Starck et al. 2015) that can refine the regularisation process when solving the deconvolution problem. The outputs of Firedec, STARRED, and SUNet are shown in Fig. 7. Upon visual inspection of the outputs and the residuals (residual = noisy image – PSF * deconvolved image), it was evident that SUNet consistently generalises better than Firedec and STARRED.

Fig. 5

SSIM between the I-band deconvolved outputs and the HST images in the F814W filter as a function of object magnitude. An SSIM of one implies identical images.

Fig. 6

Fraction of area overlap between the small-scale structure detections of the I-band deconvolved outputs and HST images in the F814W filter.

Fig. 7

Visual comparison between the deconvolved outputs. The FORS2 image in the I-band is displayed in the top-left corner, with the HST image in the F814W filter directly below it. The Firedec, STARRED, and SUNet images in the I-band are shown in the second, third, and fourth columns of the first row, respectively. Beneath each output, the corresponding residual is depicted, which is defined as follows: residual = noisy VLT image – PSF * deconvolved image.

4.3 SUNet deconvolution results

The methods were put to the test using real ground-based images captured by the FORS2 camera at the VLT in Chile. Notably, SUNet showed a remarkable ability to generalise to images with entirely different noise properties from those present in the training dataset, as depicted in Fig. 8. As illustrated, we were able to successfully recover the morphology and lost small-scale structures. Figure 9 presents the trend in the relative flux error between the SUNet deconvolved outputs and the corresponding HST targets as a function of the total clump size. The uncertainty in the flux level is higher for smaller clumps.

Fig. 8

Images of a few SUNet outputs without clump and size detection outlines, emphasising the accuracy in recovering the shapes of galaxies. The first row shows the FORS2 images in the V-, R-, and I-bands, with the corresponding SUNet outputs displayed directly below. For comparison, the HST image in the F814W filter is shown adjacent to the SUNet I-band output.

4.3.1 Resolution recovery

To gauge the achieved resolution in the deconvolved outputs, we calculated the average ratio between the areas of the smallest detected clump in the SUNet output and its counterpart in the HST image. This ratio, approximately 2.58, implies an average SUNet output resolution of around 0.129″, considering the known HST resolution is 0.05″.

4.3.2 False positives and false negatives

To assess the reliability of our deconvolution method for real-world applications, we conducted an analysis to estimate the number of false positives and false negatives in our study. We ran SCARLET to detect clumps in both the SUNet outputs and the HST ground truths. For each clump identified in the SUNet output, we checked whether its centroid fell within a 5-pixel radius of an HST clump centroid. Clumps failing this criterion were considered false positives. The false positive rate was computed across our entire EDisCS dataset by tallying the total count of false positives and dividing it by the overall count of detected clumps in the HST images. However, it is important to note that this result may be biased by SCARLET's performance. Instances exist where SCARLET identifies a clump in the SUNet image but misses it in the corresponding HST image, leading to an elevated false positive count. To address this, we employed visual inspection to filter out falsely detected cases. The resultant false positive rate was approximately 4.16%. Using a similar approach, we also computed the false negative rate, indicating the probability of missing clumps in the deconvolved images that are present in the HST image. This rate was found to be 3.57%, signifying a very low probability of missing features. To obtain a more statistically robust evaluation, we tested the method on another dataset of 2232 galaxies extracted from CANDELS (Grogin et al. 2011) and compared it with other neural networks. This work is shown in Appendix C.
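The centroid-matching criterion can be sketched as follows; the normalisation by the number of HST clumps follows the description above, while the function and variable names are ours.

```python
import numpy as np

def fp_fn_rates(sunet_centroids, hst_centroids, radius=5.0):
    """Match clump centroids between a SUNet output and the HST ground truth.
    Centroids are arrays of (row, col) positions for one image.
    False positive: SUNet clump with no HST clump within `radius` pixels.
    False negative: HST clump with no SUNet clump within `radius` pixels."""
    sunet = np.atleast_2d(sunet_centroids)
    hst = np.atleast_2d(hst_centroids)
    # pairwise distances between every SUNet clump and every HST clump
    d = np.linalg.norm(sunet[:, None, :] - hst[None, :, :], axis=-1)
    false_pos = np.sum(d.min(axis=1) > radius)        # unmatched SUNet detections
    false_neg = np.sum(d.min(axis=0) > radius)        # missed HST clumps
    return false_pos / len(hst), false_neg / len(hst) # rates relative to HST clump count
```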

Fig. 9

Relative flux error between the SUNet I-band deconvolved outputs and HST images in the F814W filter. Each data point in the plot represents the mean value for a specific bin, while the error bars depict the upper and lower bounds within which 95% of the data points fall.

4.4 Analysis of deconvolved EDisCS galaxies

As a sanity check, we verified that any conclusions drawn regarding the internal properties of our sample galaxies are not influenced by biases in their sizes, either in relation to redshift or disc colours. To this end, a histogram of galaxy sizes grouped by their parent cluster redshift and disc colour is presented in Appendix A. Both plots show that all galaxies have the same global spatial extent.

Following Cantale et al. (2016b), our analysis focuses on the population of galaxies with discs redder than their field counterparts. Some of our sample galaxies have normal colours and hence fall into the colour distribution of the field galaxies, and some are bluer (by more than 1σ of the colour distribution). A few known physical processes can induce enhanced star-forming activity, but we were instead interested in possible evidence for quenching mechanisms. Therefore, the normal and blue disc galaxies form a common broad class of systems to which we compared the redder ones. Employing the clump detection method outlined in Sect. 4.1, we computed the histogram of the number of clumps in galaxies, categorised by disc colour, as shown in Fig. 10. We note that the one-clump case reflects the identification of the central and luminous part (bulge) of the galaxies. As illustrated in Figs. 4 and 8, clumps in the V-band are brighter than those in the R- and I-bands, in agreement with the spectral energy distribution of young stellar populations. In principle, it is therefore easier to detect clumps in the V-band at equivalent photometric depths of the images. This may explain the more continuous distribution in the number of clumps, from one to six, in the V-band. Even so, as seen in Fig. 10, the general trend is the same from one band to the other. This trend is clear and reveals that the red discs initially identified by Cantale et al. (2016b) have fewer clumps than their bluer counterparts, most likely due to an earlier cessation of star formation. This result opens promising prospects for future studies on larger samples and over larger lookback time intervals.

5 Conclusion

We have proposed a deconvolution framework involving a two-step process, namely Tikhonov deconvolution and post-processing with an SUNet denoiser, followed by an additional debiasing step using the multi-resolution support. SUNet was trained on galaxy images from the CANDELS survey and demonstrated superior performance compared to Firedec in the astrophysical context. After establishing the validity of our method, we applied it to deconvolve a set of galaxies from EDisCS clusters at three different redshifts. Using SCARLET, we provided further analysis of the galaxies in terms of their size and disc colour. We quantified the number of clumps in these galaxies, examining their relationship with disc colour. Our results, based on both quantitative metrics and visual assessments, highlight the effectiveness of SUNet and showcase its ability to generalise to unseen real images with diverse noise properties, which can be attributed to its transformer-based backbone and its self-attention mechanism.

In summary, this work introduces and evaluates an advanced deconvolution framework applied to ground-based astronomical images. The key findings and contributions include the following:

  • Resolution recovery: based on our SCARLET detection procedure, SUNet demonstrates the capability to recover small-scale structures, with an average resolution of approximately 0.129″, and it outperforms classical algorithms such as STARRED and Firedec (Sect. 4.3.1).

  • Generalisation to diverse noise properties: the method showcases robust generalisation to noise properties different from its training dataset, indicating its adaptability to various observational conditions.

  • Clump analysis: red discs exhibit fewer clumps than their bluer counterparts, affirming the lower presence of star-forming regions.

  • False positive and false negative rates: based on our SCARLET detection analysis on EDisCS, SUNet maintains a false positive rate of 4.16% and a false negative rate of 3.57%, ensuring reliable feature recovery (Sect. 4.3.2).

  • Computational efficiency: the AI-based framework proves to be highly efficient, with an execution time of approximately 15.2 ms per image, making it around 10⁴ times faster than traditional deconvolution methods such as STARRED and Firedec.

Our proposed technique can therefore be used with ground-based images to efficiently identify structures in the distant universe at high spatial resolution. The technique’s applicability to multi-band observations further enhances its utility in studying various astrophysical phenomena. The efficiency of SUNet in processing large datasets and accelerating the deconvolution process opens up opportunities for swift analyses. Access to such a fast and robust deconvolution framework holds the potential to facilitate numerous astrophysical investigations.

Fig. 10

Histogram of the number of clumps in galaxies in the V-, R-, and I-bands grouped by their parent disc colour. Each coloured bar in the plots corresponds to a specific disc colour, and the bars for different disc colours are stacked on top of each other. Galaxies are classified as ‘Red’ if they are redder, ‘Normal’ if they are comparable, and ‘Blue’ if they are bluer than the field members.

Data availability

For the sake of reproducible research, the codes and the trained models used for this article are publicly available online:
1. The ready-to-use version of our deconvolution method: https://github.com/utsav-akhaury/SUNet/tree/main/Deconvolution
2. The repository fork of the SUNet code used for training the network: https://github.com/utsav-akhaury/SUNet
3. The trained network weights: https://doi.org/10.5281/zenodo.10287213
4. The repository forks of the Learnlet and Unet-64 codes used for comparison in Appendices B and C: https://github.com/utsav-akhaury/understanding-unets/tree/candels

Acknowledgements

This work was funded by the Swiss National Science Foundation (SNSF) under the Sinergia grant number CRSII5_198674. This work was supported by the TITAN ERA Chair project (contract no. 101086741) within the Horizon Europe Framework Program of the European Commission, and the Agence Nationale de la Recherche (ANR-22-CE31-0014-01 TOSCA). The authors thank David Donoho for useful discussions.

Appendix A Supplementary figure

As a validity check, we ensured that any insights into the internal properties of our sample galaxies remain unaffected by size biases, whether concerning redshift or disc colours. This is illustrated in Figure A.1, where (a) represents the histogram of galaxy sizes grouped by their parent cluster redshift and (b) represents the same but grouped by disc colour. Both plots indicate that all galaxies share the same global spatial extent. Each legend category in the histograms has been normalised so that its probabilities sum to one.

Fig. A.1

Validity check to ensure that the properties of our sample galaxies remain unaffected by size biases. (a): Histogram of galaxy sizes in the V-, R-, and I-bands grouped by their parent cluster redshift: z ≈ 0.58, z ≈ 0.70, z ≈ 0.79. Each coloured bar in the plot represents a specific redshift value, with bars of different redshifts stacked on top of each other. (b): Histogram of galaxy sizes in the V-, R-, and I-bands grouped by their disc colour. Galaxies are classified as ‘Red’ if they are redder, ‘Normal’ if they are comparable, and ‘Blue’ if they are bluer than the field members. Each coloured bar in the plot represents a disc colour category, with bars of different disc colours stacked on top of each other.

Appendix B Impact of debiasing with multi-resolution support on neural networks

For a more rigorous study of the impact of debiasing with multi-resolution support on neural networks, we considered three different neural networks with and without the multi-resolution debiasing: Learnlet (Ramzi et al. 2023), Unet-64 (Ronneberger et al. 2015), and SUNet (Fan et al. 2022). To provide a quantitative comparison, we computed the SSIM and the flux error in small-scale structures between 2232 galaxies extracted from CANDELS and their simulated degraded versions, as done in Akhaury et al. (2022). The selection of galaxies based on their FWHM and magnitude is depicted in Figure B.1. To prevent the background noise in the HST images from biasing our metrics, we fit a Gaussian window around each object with an FWHM equal to its catalogue-derived value. The trends in the metrics, depicted in Figure B.2 as a function of the object magnitude, indicate an enhancement in SSIM and flux error for all three networks after debiasing.
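A sketch of such a windowed flux-error measurement is given below; the exact windowing convention (a Gaussian of the catalogue FWHM centred on the cutout) is our reading of the text.

```python
import numpy as np

def windowed_flux_error(deconv, hst, fwhm_pix):
    """Relative flux error inside a Gaussian window of the catalogue FWHM, centred
    on the cutout, to suppress the contribution of the HST background noise."""
    n = hst.shape[0]
    sigma = fwhm_pix / 2.355
    yy, xx = np.mgrid[:n, :n] - n // 2
    window = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    f_true = np.sum(hst * window)
    return (np.sum(deconv * window) - f_true) / f_true
```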

Fig. B.1

FWHM vs. magnitude plot for the CANDELS dataset. The red rectangle encloses the 2232 galaxies selected for the analysis. The limiting magnitude threshold was set at 25. To eliminate point-sized sources, we applied a minimum FWHM threshold of 10 pixels. A maximum FWHM threshold of 60 pixels was set to confine the objects within the 128 × 128 cutout window.

Appendix C Hallucinations and the impact of training loss function

To estimate the occurrence of unexpected artefacts or hallucinations introduced by neural networks, we applied the SCARLET detection procedure (as outlined in Section 4.1) with a tight 5σ detection threshold at each scale. To determine the number of false positives, we examined whether the centroid of a detection in the neural network’s output fell within a five-pixel radius of the HST detection centroid. Detections failing to meet this criterion were considered false positives. Figure C.1 illustrates the impact of different loss functions on the hallucination rate as a function of the galaxy FWHM and magnitude for three neural networks: Learnlet, Unet-64, and SUNet. We conducted the experiments on the same test dataset of 2232 galaxies detailed in Appendix B. Notably, Unet-64 consistently exhibited improved performance when trained with the ℓ1 loss.

Fig. B.2

Trends in (B.2a) SSIM and (B.2b) flux error as a function of the object magnitude for the three networks: Learnlet, Unet-64, and SUNet. The trends for the original outputs are shown with solid lines, and the trends for the debiased outputs using multi-resolution support are shown with dotted lines. After debiasing, a noticeable enhancement in flux error can be observed for all three networks across a range of magnitudes, with a slight improvement in SSIM.

Fig. C.1

Hallucination rate for the three networks (Learnlet, Unet-64, and SUNet) with ℓ1 and ℓ2 loss as a function of (C.1a) FWHM and (C.1b) magnitude.

References

  1. Akhaury, U., Starck, J.-L., Jablonka, P., Courbin, F., & Michalewicz, K. 2022, Front. Astron. Space Sci., 9
  2. Cantale, N., Courbin, F., Tewes, M., Jablonka, P., & Meylan, G. 2016a, A&A, 589, A81
  3. Cantale, N., Jablonka, P., Courbin, F., et al. 2016b, A&A, 589, A82
  4. Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. 2021, arXiv e-prints [arXiv:2010.11929]
  5. Euclid Collaboration (Scaramella, R., et al.) 2022, A&A, 662, A112
  6. Fan, C.-M., Liu, T.-J., & Liu, K.-H. 2022, in 2022 IEEE International Symposium on Circuits and Systems (ISCAS) (IEEE)
  7. Grogin, N. A., Kocevski, D. D., & Faber, S. M. 2011, ApJS, 197, 35
  8. Guan, S., Khan, A. A., Sikdar, S., & Chitnis, P. V. 2020, IEEE J. Biomed. Health Inform., 24, 568
  9. Guo, Y., Ferguson, H. C., Bell, E. F., et al. 2015, ApJ, 800, 39
  10. Gurrola-Ramos, J., Dalmau, O., & Alarcón, T. E. 2021, IEEE Access, 9, 31742
  11. Ivezić, Ž., Kahn, S. M., Tyson, J. A., et al. 2019, ApJ, 873, 111
  12. Jin, Q., Meng, Z., Sun, C., Cui, H., & Su, R. 2020, Front. Bioeng. Biotechnol., 8
  13. Kingma, D. P., & Ba, J. 2014, arXiv e-prints [arXiv:1412.6980]
  14. Koekemoer, A. M., Faber, S. M., & Ferguson, H. C. 2011, ApJS, 197, 36
  15. Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, arXiv e-prints [arXiv:1110.3193]
  16. Liang, J., Cao, J., Sun, G., et al. 2021, in Proceedings of the IEEE/CVF International Conference on Computer Vision, 1833
  17. Liu, Z., Lin, Y., Cao, Y., et al. 2021, in IEEE/CVF International Conference on Computer Vision (ICCV), 9992
  18. Lucy, L. B. 1974, AJ, 79, 745
  19. Magain, P., Courbin, F., & Sohy, S. 1998, ApJ, 494, 472
  20. Melchior, P., Moolekamp, F., Jerdee, M., et al. 2018, Astron. Comput., 24, 129
  21. Michalewicz, K., Millon, M., Dux, F., & Courbin, F. 2023, J. Open Source Softw., 8, 5340
  22. Mohan, S., Kadkhodaie, Z., Simoncelli, E. P., & Fernandez-Granda, C. 2020, arXiv e-prints [arXiv:1906.05478]
  23. Nammour, F., Akhaury, U., Girard, J. N., et al. 2022, A&A, 663, A69
  24. nan Xiao, X., Lian, S., Luo, Z., & Li, S. 2018, in 2018 9th International Conference on Information Technology in Medicine and Education (ITME), 327
  25. Ramzi, Z., Michalewicz, K., Starck, J.-L., Moreau, T., & Ciuciu, P. 2023, J. Math. Imaging Vision, 65, 240
  26. Richardson, W. H. 1972, J. Opt. Soc. Am., 62, 55
  27. Ronneberger, O., Fischer, P., & Brox, T. 2015, arXiv e-prints [arXiv:1505.04597]
  28. Sattari, Z., Mobasher, B., Chartab, N., et al. 2023, ApJ, 951, 147
  29. Simard, L., Willmer, C. N. A., Vogt, N. P., et al. 2002, ApJS, 142, 1
  30. Skilling, J., & Bryan, R. K. 1984, MNRAS, 211, 111
  31. Sok, V., Muzzin, A., Jablonka, P., et al. 2022, ApJ, 924, 7
  32. Starck, J.-L., Murtagh, F., & Bijaoui, A. 1995, Graph. Models Image Process., 57, 420
  33. Starck, J.-L., Murtagh, F., & Bertero, M. 2015, Starlet Transform in Astronomical Data Processing, ed. O. Scherzer (New York, NY: Springer New York), 2053
  34. Sureau, F., Lechat, A., & Starck, J.-L. 2020, A&A, 641, A67
  35. Tikhonov, A. N., & Arsenin, V. Y. 1977, Solutions of Ill-posed Problems, Scripta Series in Mathematics (Washington, D.C.: V. H. Winston & Sons; New York: John Wiley & Sons), translated from the Russian
  36. Vaswani, A., Shazeer, N., Parmar, N., et al. 2017, in Advances in Neural Information Processing Systems, 30, eds. I. Guyon, U. V. Luxburg, S. Bengio, et al. (Curran Associates, Inc.)
  37. Wang, Z., & Bovik, A. C. 2009, IEEE Signal Process. Mag., 26, 98
  38. Wang, Z., Bovik, A., Sheikh, H., & Simoncelli, E. 2004, IEEE Trans. Image Process., 13, 600
  39. Wang, Z., Cun, X., Bao, J., et al. 2022, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 17683
  40. White, S. D. M., Clowe, D. I., Simard, L., et al. 2005, A&A, 444, 365
  41. Wuyts, S., Förster Schreiber, N. M., Genzel, R., et al. 2012, ApJ, 753, 114
  42. Yan, Q., Zhang, L., Liu, Y., et al. 2020, IEEE Trans. Image Process., 29, 4308
  43. Yu, S., Park, B., & Jeong, J. 2019, in IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2095
  44. Yuan, L., Chen, Y., Wang, T., et al. 2021, arXiv e-prints [arXiv:2101.11986]
  45. Zamir, S. W., Arora, A., Khan, S., et al. 2022, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5728
