A nearby galaxy perspective on dust evolution-Scaling relations and constraints on the dust build-up in galaxies with the DustPedia and DGS samples

Context. The efficiency of the different processes responsible for the evolution of interstellar dust on the scale of a galaxy are, to date, very uncertain, spanning several orders of magnitude in the literature. Yet, precise knowledge of the grain properties is key to addressing numerous open questions about the physics of the interstellar medium and galaxy evolution. Aims. This article presents an empirical statistical study, aimed at quantifying the timescales of the main cosmic dust evolution processes as a function of the global properties of a galaxy. Methods. We modeled a sample of '800 nearby galaxies, spanning a wide range of metallicities, gas fractions, specific star formation rates, and Hubble stages. We derived the dust properties of each object from its spectral energy distribution. Through an additional level of analysis, we inferred the timescales of dust condensation in core-collapse supernova ejecta, grain growth in cold clouds, and dust destruction by shock waves. Throughout this paper, we have adopted a hierarchical Bayesian approach, resulting in a single large probability distribution of all the parameters of all the galaxies, to ensure the most rigorous interpretation of our data. Results. We confirm the drastic evolution with metallicity of the dust-to-metal mass ratio (by two orders of magnitude), found by previous studies. We show that dust production by core-collapse supernovae is efficient only at very low metallicity, a single supernova producing on average less than '0.03 M /SN of dust. Our data indicate that grain growth is the dominant formation mechanism at metallicity above '1/5 solar, with a grain growth timescale shorter than '50 Myr at solar metallicity. Shock destruction is relatively efficient, a single supernova clearing dust on average in at least '1200 M /SN of gas. These results are robust when assuming different stellar initial mass functions. In addition, we show that early-type galaxies are outliers in several scaling relations. This feature could result from grain thermal sputtering in hot X-ray emitting gas, which is a hypothesis supported by a negative correlation between the dust-to-stellar mass ratio and the X-ray photon rate per grain. Finally, we confirm the well-known evolution of the aromatic-featureemitting grain mass fraction as a function of metallicity and interstellar radiation field intensity. Our data indicate that the relation with metallicity is significantly stronger. Conclusions. Our results provide valuable constraints for simulations of galaxies. They imply that grain growth is the likely dust production mechanism in dusty high-redshift objects. We also emphasize the determinant role of local, low metallicity systems in order to address these questions.


Introduction
Characterizing the dust properties across different galactic environments is an important milestone in understanding the physics of the interstellar medium (ISM) and galaxy evolution , for a review). Interstellar grains have a nefarious role of obscuring our direct view of star formation (e.g., Bianchi et al. 2018). The subsequent unreddening of ultraviolet-(UV)-visible observations relies on assumptions about the constitution and size distribution of the grains, as well as on the relative star-dust geometry (e.g., Witt et al. 1992;Witt & Gordon 2000;Baes & Dejonghe 2001). Dust is also involved in several important physical processes, such as the photoelectric heating of the gas in photodissociation regions (PDR; e.g., Draine 1978;Kimura 2016) or H 2 formation (e.g., Gould & Salpeter 1963;Bron et al. 2014). The uncertainties about the grain properties have dramatic consequences on the rate of these processes and are the main cause of discrepancy between PDR models (Röllig et al. 2007). Finally, state-of-the-art numerical simulations of galaxy evolution are now post-processed to incorporate the full treatment of dust radiative transfer in order to reproduce realistic observables (e.g., Camps et al. 2016Camps et al. , 2018Trayford et al. 2017;Rodriguez-Gomez et al. 2019;Trčka et al. 2020). These simulations rely on assumptions about the dust emissivity, absorption and scattering properties, which change from galaxy to galaxy (e.g., Clark et al. 2019;Bianchi et al. 2019).
We are far from being able to predict the dust composition, structure, and size distribution in a given ISM condition. Ab initio methods are currently impractical because of the great complexity of the problem. Empirical approaches face challenges, too. The last four decades have shown that each time we were investigating a particular observable with the hope of constraining the grain properties, these properties were proving themselves more elusive because of their evolution. Firstly, the diversity of extinction curve shapes within the Milky Way (Fitzpatrick 1986;Cardelli et al. 1989) and the Magellanic Clouds  can be explained by variations of the size distribution and carbon-to-silicate grain ratio (Pei 1992;Kim et al. 1994;Clayton et al. 2003;Cartledge et al. 2005). Secondly, elemental depletion patterns strongly depend on the density of the medium, suggesting that dust is partly destroyed and reformed in the ISM (Savage & Mathis 1979;Crinklaw et al. 1994;Jenkins 2009;Parvathi et al. 2012). Thirdly, variations of the infrared (IR) to submillimeter (submm) emissivity as a function of the density of the medium, whether in the Milky Way (Stepnik et al. 2003;Ysard et al. 2015) or in the Magellanic Clouds (Roman-Duval et al. 2017), are interpreted as variations of the grain mantle thickness and composition, and graingrain coagulation (e.g., Köhler et al. 2015;Ysard et al. 2016). Fourthly, the wide variability of the relative intensity and bandto-band ratio of the mid-IR aromatic feature spectrum, witnessed within the Milky Way and nearby galaxies, is the evidence of the variation of the abundance, charge, and size distribution of the band carriers (e.g., Boulanger et al. 1998;Madden et al. 2006;Galliano et al. 2008a;Schirmer et al. 2020). Finally, the wavelength-dependent polarization fraction in extinction and emission provides additional constraints on the composition of the grains, as well as on their shape and alignment (e.g., Andersson et al. 2015;Fanciullo et al. 2017).
To understand the evolution of the global dust content of a galaxy, one must be able to quantify the timescales of the processes responsible for dust formation and destruction. These processes can be categorized as follows.
Stardust production takes place primarily in the ejecta of: (i) Asymptotic giant branch (AGB) stars; and (ii) type II supernovae (SN II; core-collapse SN). SNe II potentially dominate the net stardust injection rate (e.g., Draine 2009, for a review), but their actual yield is the subject of an intense debate. Most of the controversy lies in the fact that, while large amounts could form in SN II ejecta (e.g., Matsuura et al. 2015;Temim et al. 2017), a large fraction of freshly formed grains could not survive the reverse shock (e.g., Nozawa et al. 2006;Micelotta et al. 2016;Kirchschlager et al. 2019). Estimates of the net dust yield of a single SN ranges in the literature from 10 −3 to 1 M /SN (e.g., Todini & Ferrara 2001;Ercolano et al. 2007;Bianchi & Schneider 2007;Bocchio et al. 2016;Marassi et al. 2019).
Grain growth in the ISM refers to the addition of gas atoms onto pre-existing dust seeds (e.g., Hirashita 2012). It could be the main grain formation process, happening on timescales 1 Myr (e.g., Draine 2009). It is however challenged due to the lack of direct constraints and because we currently lack a proven dust formation mechanism, at cold temperatures.
Grain destruction can be attributed to: (i) astration, that is their incorporation into stellar interiors during star formation; (ii) sputtering and shattering by SN blast waves. The second process is the most debated, although it is less controversial than stardust production and grain growth efficiencies. Timescales for grain destruction in the Milky Way range from 200 Myr to 2 Gyr (Jones et al. 1994;Slavin et al. 2015).
In the Milky Way, there is a convergent set of evidence pointing toward a scenario where: (i) stardust accounts for at most 10% of ISM dust; (ii) most grains are therefore grown in the ISM (e.g., Draine & Salpeter 1979;Dwek & Scalo 1980;Jones et al. 1994;Tielens 1998;Draine 2009). The goal of this paper is to explore how particular environments could affect these conclusions, by modeling nearby extragalactic systems. By studying nearby galaxies, we also aim to provide a different perspective on these questions.
Several attempts have been made in the past to quantify the evolution of the dust content of galaxies via fitting their spectral energy distribution (SED; e.g., Lisenfeld & Ferrara 1998;Morgan & Edmunds 2003;Galliano et al. 2008b;Rémy-Ruyer et al. 2014De Vis et al. 2017a;Nersesian et al. 2019;De Looze et al. 2020;Nanni et al. 2020). Although these studies provided important benchmarks, most of them were limited by the following issues: (i) their coverage of the parameter space was often incomplete, especially in the low-metallicity regime, which is crucial to quantify stardust production (cf. Sect. 5); (ii) potential systematic effects, originating either in the SED fitting or in the ancillary data, were questioning some of the conclusions; (iii) dust evolution models were most of the time not fit to each galaxy, but simply visually compared 1 , which can lead to some inconsistencies (cf. Sect. 5.2.3).
The present paper is an attempt at addressing these limitations. We rely on the homogeneous multiwavelength observations and ancillary data of the DustPedia project (Davies et al. 2017) and of the dwarf galaxy sample (DGS; Madden et al. 2013). We adopt a hierarchical Bayesian 2 (HB; cf. Sect. 3.1) approach when comparing models to observations, in order to limit the impact of systematic effects on our results (Galliano 2018, hereafter G18). Finally, we perform a rigorous dust evolution modeling of individual objects in our sample, in order to unambiguously constrain the dust evolution timescales. Section 2 presents the data we have used. Section 3 presents our model and discusses the robustness of the derived dust parameters. Section 4 provides a qualitative discussion of the derived dust evolution trends. Section 5 describes the quantitative modeling of the main dust evolution processes. Section 6 summarizes our results. Several technical arguments are detailed in the appendices, so that they do not alter the flow of the discussion.

The Galaxy sample
This study focuses on global properties of galaxies. Integrating the whole emission of a galaxy complicates the interpretation of the trends, as we discuss in Sect. 4. However, it also presents some advantages: (i) we can include galaxies unresolved at infrared wavelengths; and (ii) we can provide benchmarks for comparisons to unresolved studies of the distant Universe or to one-zone dust evolution models.
Several upcoming studies on a subsample of resolved Dust-Pedia galaxies will discuss the improvements that the spatial distribution of the dust properties provides (Roychowdhury et al., in prep.; Casasola et al., in prep.).

The infrared photometry
We present here the integrated, multiwavelength photometry of our sample, used to constrain the global dust properties of each galaxy. We focus on the mid-IR-to-submm regime, as it is where dust emits.

The DustPedia aperture photometry
We use the photometry of the 875 galaxies of the DustPedia sample presented by Clark et al. (2018, hereafter C18) 3 . Since we focus on the mid-IR-to-submm regime, we restrain the wavelength range to photometric bands centered between 3 µm and 1 mm. The list of photometric bands we have used is given in Table 1. C18 has provided a dedicated reduction of the Herschel broadband data and an homogenization of the Spitzer, WISE and Planck observations. Foreground stars have been masked. Aperture-matched photometry has been performed on each image and a local background has been subtracted from each flux. A consistent noise uncertainty was estimated for each measurement. IRAS fluxes from Wheelock et al. (1994) were added to the catalog. We refer the reader to C18 for more details about the data reduction and photometric measurement.
The SED model (Sect. 3.1), that we have applied to our data, performs a complex statistical treatment, allowing us to analyze even poorly sampled SEDs. However, the efficiency of such a model can be affected by the presence of systematic effects not properly accounted for by the uncertainties. In order to be conservative, we have therefore excluded a series of fluxes, based on the following criteria.
1. The C18 catalog flags about 22% of its fluxes, for different reasons: artifacts, contamination by nearby sources, incomplete extended emission, etc. We have excluded all the fluxes that are flagged. For 89 galaxies, all IR fluxes end up flagged. These galaxies have therefore been excluded. Notes. During the inference process (Sect. 3.2), the observed fluxes are compared to the SED model integrated within the transmission of these filters, with the appropriate flux convention.
2. Several galaxies contain a significant emission from their active galactic nucleus (AGN). This emission is characterized by a prominent synchrotron continuum and copious amounts of hot dust (T dust 300 K), resulting in a rather flat mid-IR continuum. Our dust model (Sect. 3.1.1) is optimized for regular interstellar dust. Our distribution of starlight intensities (Sect. 3.1.2) is usually not flexible enough to account for the hot emission from the torus. We have therefore excluded 19 sources presenting such an emission at a significant level, following Bianchi et al. (2018), who used the criterion of Assef et al. (2018) based on the WISE1 and WISE2 fluxes to identify AGNs 4 . 3. We have performed a preliminary least-squares fit with our reference SED model (Sect. 3.3), in order to identify where the largest residuals are.
-A few short wavelength bands present a large deviation from their SED model and their adjacent fluxes 5 . After inspection of the images, the discrepancies are likely due to residual starlight contamination. -Several long-wavelength IRAS fluxes are significantly deviant from their nearby MIPS and PACS fluxes 6 . The reason of these discrepancies is obscure. However, Sect. 3.3 presents a test comparing SED results with and without IRAS data, showing they are not crucial to our study. -Three additional galaxies are not properly fitted 7 . These objects present the characteristics of an AGN and are indeed classified as such (Liu & Zhang 2002). They were not accounted for by the Assef et al. (2018) criterion. It is however consistent with the fact that this criterion has a 90% confidence level.
We have excluded all these problematic fluxes. Criterion (3) is more qualitative than (1) and (2), but it allows us to identify potential systematics that were missed.
In total, we are left with 764 DustPedia galaxies. The monochromatic fluxes (in Jy) were converted to monochromatic luminosities (in L Hz −1 ), using the distances from the HyperLeda database (Makarov et al. 2014).

The dwarf galaxy sample
Metallicity is a crucial parameter to study dust evolution (cf. Sect. 5 andRémy-Ruyer et al. 2014, 2015). In particular, the low-metallicity regime, represented by dwarf galaxies, provides unique constraints (e.g., Galliano et al. 2018), yet the DustPedia sample selected sources larger than 1 . In order to improve the sampling of the low-metallicity regime, we have thus included additional galaxies from the dwarf galaxy survey (DGS; Madden et al. 2013).
Among the 48 sources of the DGS, 35 are not in the DustPedia sample. We have added these sources to our sample. These galaxies have been observed with Spitzer, WISE and Herschel. We use the aperture photometry presented by Rémy-Ruyer et al. (2013. We do not expect systematic differences between the DustPedia and DGS aperture fluxes. Appendix B.1 compares the photometry of the DGS sources that are in Dust-Pedia. Both are indeed in very good agreement. Similarly to the DustPedia sample, we apply the three exclusion criteria of Sect. 2.1.1. 1. We exclude the fluxes that have been flagged. 2. No significant AGN contribution is present in this sample. 3. Rémy-Ruyer et al. (2015) advise to not trust the PACS photometry for HS 0822+3542. Since this source is neither detected with SPIRE, we simply exclude it.
In total, we are left with 34 DGS galaxies, in addition to those already included in DustPedia.

Photometric uncertainties
Our combined sample contains 798 galaxies. For each of them, we consider the following two sources of photometric uncertainty.
The noise: it includes statistical fluctuations of the detectors and residual background subtraction. It has been thoroughly estimated by C18 and Rémy-Ruyer et al. (2015). We assume that the noise of each waveband of each galaxy follows an independent normal distribution 8 .
Calibration uncertainties: they are systematics, that is they are fully correlated between different galaxies, and partially correlated between wavebands. We assume they follow a joint, multivariate normal distribution, whose covariance matrix is given in Appendix A. Table 1 gives the number of galaxies observed though each waveband, and the number of detections (flux greater than 3σ, where σ refers solely to the noise uncertainty). The number of available filters per galaxy ranges between 1 and 19, and its median is 11. There is a median number of two upper limits per galaxy.

The ancillary data
We present here the ancillary data gathered in order to characterize the ISM conditions in each galaxy.

The stellar mass
For the DustPedia sample, we adopt the stellar masses of Nersesian et al. (2019). Theses masses were derived from UVto-mm SED fitting, using the code CIGALE (Boquien et al. 2019), with two stellar populations. For the DGS, the stellar masses are given by Madden et al. (2014), using the Eskew et al. (2012) relation, based on the IRAC1 and IRAC2 fluxes. Eskew et al. (2012) emphasize that the largest source of systematic uncertainties in the stellar mass is the initial mass function (IMF). Both Nersesian et al. (2019) and Eskew et al. (2012) adopt a Salpeter (1955) IMF, therefore limiting potential biases between the two samples. Other potential biases, such as the form of the assumed star formation history, should not be an issue with our sample. For instance, both Mitchell et al. (2013) and Laigle et al. (2019) tested the reliability of stellar mass estimates using numerical simulations of galaxies, and showed that they gave consistent results at low redshift. We use M to denote the stellar mass. Nanni et al. (2020, hereafter N20) have reestimated the stellar masses of the DGS, with CIGALE. They report systematically lower values, compared to Madden et al. (2014), sometimes by an order of magnitude. We discuss the possible reasons of this discrepancy in Appendix B.2, concluding that our estimates are likely more reliable.

The metallicity
For DustPedia galaxies, we use the metallicities derived by De Vis et al. (2019), using the S calibration of Pilyugin & Grebel (2016, hereafter PG16_S). For the DGS, we use the metallicities derived by De Vis et al. (2017a), using the same PG16_S calibration. De Vis et al. (2017a) show that this particular calibration is the most reliable at low-metallicity.
We adopt the solar elemental abundances of Asplund et al. (2009): the hydrogen mass fraction is X = 0.7381, the helium mass fraction, Y = 0.2485, the heavy element mass fraction, Z = 0.0134, and the oxygen-to-hydrogen number ratio, 12+log(O/H) = 8.69±0.05. Throughout this study, we assume a fixed elemental abundance pattern. It implies that the total metallicity, Z, scales with the oxygen-to-hydrogen number ratio as: Z 2.04 × 10 −9 × 10 12+log(O/H) Z . (1) For that reason, in the rest of the paper, we refer to both Z and 12 + log(O/H), as metallicity. Molecular gas. Casasola et al. (2020) compiled observations of 12 CO lines for 245 late-type DustPedia galaxies. They converted these observations in molecular masses, assuming a constant X CO conversion factor (Bolatto et al. 2013), and corrected them for aperture effects. The uncertainties are the quadratic sum of the 12 CO line noise measurement and 30%, corresponding to the uncertainty on the X CO . We adopt these values when available. For the other DustPedia and DGS galaxies, we infer the H 2 mass, with the approximation used by De Vis et al. (2019, Eq. (7)). It uses the scaling relation between M H i /M and M H 2 /M H i , derived by Casasola et al. (2020). The scatter of this relation is propagated and accounted for in the uncertainties. These molecular gas masses are also corrected for helium and heavy elements. We use M CO H 2 to denote the total molecular gas mass derived from actual 12 CO line observations, and M H 2 , to denote the method-independent total molecular mass, either derived from 12 CO lines or from the De Vis et al. (2019) approximation.

The star formation rate
For DustPedia, the star formation rate (SFR), has been derived by Nersesian et al. (2019). The SFR was a free parameter of the SED fit they performed with CIGALE. For the DGS, we used the SFR estimated by Rémy-Ruyer et al. (2015), using a combination of the Hα 656.3 nm line and of the total infrared luminosity (TIR). Although the two estimators are different, we do not expect significant biases between DustPedia and the DGS. Indeed, both account for: (i) the escaping power from young ionizing stars, with the UV SED or the Hα 656.3 nm line; and (ii) the power reradiated by dust, with the IR SED or the TIR. In addition, both assume a Salpeter (1955)   The SED analysis we present in Sect. 3.1 includes the ancillary data, as a prior. For that purpose, it is preferable to consider the logarithm of quantities that can span several orders of magnitude. The independent ancillary data parameters we use as a constraint are listed in the first five lines of Table 2. The total gas mass is M gas = M H i + M H 2 and the 12 CO-line-estimated molecular fraction, f CO H 2 = M CO H 2 /M gas . We also list, in the second part of Table 2, parameters derived from these quantities, including the method-independent molecular fraction, f H 2 = M H 2 /M gas . All the extensive quantities (masses and SFR) have been homogenized to the distances adopted in Sect. 2.1.
We assume that the uncertainty of the five independent ancillary parameters (first part of Table 2) follows a split-normal distribution (G18, Sect. 3.2.3). We propagate the uncertainties and their correlations in the derived parameters (second part of Table 2).

The SED modeling approach
We now describe our modeling approach, as well as the consistency tests we have performed to assess the robustness of our results.

Our in-house hierarchical Bayesian SED model
HerBIE (HiERarchical Bayesian Inference for dust Emission; G18) is a hierarchical Bayesian model aimed at inferring the probability density functions (PDF) of dust parameters (dust mass, etc.), from their SED, rigorously accounting for the different sources of uncertainties. As any Bayesian model, HerBIE computes a posterior PDF as the product of a classical likelihood and a prior PDF. What makes this model hierarchical is that the prior depends on a set of hyperparameters. These hyperparameters are: (i) the average of each physical parameter; and (ii) their covariance matrix. The hyperparameters are inferred, together with the parameters of each individual galaxy. We are therefore sampling a single, large-dimension 10 , joint PDF. Since the shape of the prior is inferred from the data, the information of the whole sample is used, in this process, to refine our knowledge of each individual galaxy. In particular, it is relevant to keep in our sample even poorly-constrained SEDs, with upper limits or missing fluxes. Such a model is also efficient at suppressing the numerous noise-induced, false correlations between parameters, encountered when fitting SEDs with least-squares or nonhierarchical Bayesian methods (Shetty et al. 2009;Kelly et al. 2012;Galliano 2018;Lamperti et al. 2019). It has recently been used to study the anomalous microwave emission in the Milky Way (Bell et al. 2019).
From a technical point of view, HerBIE uses a Markov chain Monte-Carlo (MCMC), with Gibbs sampling (Geman & Geman 1984). As mentioned in Sect. 2.2, we include the ancillary data in our prior. This process is fully demonstrated in Sect. 5.3 of G18. These ancillary data do not enter the dust model, but they provide information that helps to better constrain the hyperparameters, in a holistic way. In other words, the information provided by the gas mass or the metallicity helps to better constrain the dust SED fit. It is a Bayesian implementation of Stein's paradox (Stein 1956;Efron & Morris 1977).

Parametrization of the dust model
To infer dust parameters with HerBIE, we adopt the framework provided by the grain properties of the THEMIS 11 model . THEMIS is built, as much as possible, on laboratory data, and reproduces dust observables of the Galactic ISM. One of its originalities is that it accounts for the aromatic and aliphatic mid-IR features with a single population of small, partially hydrogenated, amorphous carbons, noted a-C(:H). Consequently, it does not include polycyclic aromatic hydrocarbons (PAH) per se. Although largely dehydrogenated, small a-C(:H) are very similar to PAHs. The other main component of THEMIS is a population of large, a-C(:H)-coated, amorphous silicates, with Fe and FeS nano-inclusions.
THEMIS is designed to model the evolution of: (i) the size distribution, (ii) the a-C(:H) hydrogenation, and (iii) the mantle thickness, with the interstellar radiation field (ISRF) and gas density. With the observational constraints of Sect. 2.1, we can not reliably constrain the mantle thickness, as its effect on the shape of the far-IR SED is too subtle. Nor can we constrain the a-C(:H) hydrogenation, as broadband fluxes do not provide unambiguous constraints on the 3.4 µm feature. We can however study the variations of the size distribution of small a-C(:H), from galaxy to galaxy, as it will affect the strength of the bright mid-IR aromatic features.
We have therefore parametrized the size distribution in the following way. We have divided the a-C(:H) component into (cf. Fig. 1): -Very small a-C(:H) (denoted VSAC) of radius smaller than 7 Å, responsible for the mid-IR feature emission, with more weight in the short wavelength bands; -Small a-C(:H) (denoted SAC) of radius between 7 Å and 15 Å, responsible for the mid-IR feature emission, with more weight in the long wavelength bands; -Medium and large a-C(:H) (noted MLAC) of radius larger than 15 Å, carrying the featureless mid-IR continuum and a fraction of the far-IR peak.
The silicate-to-MLAC ratio is kept constant. Using q X to denote the mass fraction of component X, the size distribution is controlled by two parameters: (i) the mass fraction of aromatic-feature-carrying grains, q AF ≡ q VSAC + q SAC ; and (ii) the very-small-a-C(:H)-to-aromatic-feature-emitting-grain ratio, f VSAC ≡ q VSAC /q AF . Comparing these parameters to dust models with PAHs (e.g., Zubko et al. 2004;Draine & Li 2007;Compiègne et al. 2011), q AF is the analog of the PAH mass 11 www.ias.u-psud.fr/themis/ fraction, q PAH introduced by Draine & Li (2007, hereafter DL07). The difference here is that, q AF = 17%, while q PAH = 4.6%, in the diffuse Galactic ISM 12 . This is because a-C(:H) have a smaller fraction of aromatic bonds per C atom. More mass is therefore needed to produce the feature strength of a PAH. 12 We note here that the estimated mass fraction of PAHs in the diffuse Galatic ISM depends on the set of mid-IR observations used to constrain it. For instance, DL07 use COBE/DIRBE broadbands, while Compiègne et al. (2011) use a scaled Spitzer/IRS spectrum. It results in different levels of mid-IR emission and PAH fraction: Compiègne et al. (2011) find q PAH = 7.7%. Since THEMIS is calibrated with the same observations as Compiègne et al. (2011), it is possible to estimate the value of q PAH that would result in the same level of mid-IR emission as q AF : q PAH 0.45 × q AF (ratio of 7.7% to 17%). This parametrization has the great advantage of being linear. A simpler version, fixing f VSAC , has been implemented in CIGALE by Nersesian et al. (2019). A more physical way to vary the size distribution would be to vary the minimum cut-off size and the index of the power-law size distribution (cf. Jones et al. 2013, for the description of the size distribution of THEMIS). However, this alternate parametrization would be CPU time consuming, and would not produce noticeable differences in the resulting broadband SED (cf. Appendix C).

The mixing of physical conditions
A model, such as THEMIS, is not directly applicable to observations of galaxies. Indeed, the monochromatic flux of a whole galaxy comes from the superposition of grains exposed to a range of physical conditions. This well-known problem can be addressed assuming a distribution of starlight intensity inside each galaxy. We adopt the prescription proposed by Dale et al. (2001), where the dust mass, M dust , follows a power-law distribution, as a function of the starlight intensity, U: The parameter U is the frequency-integrated monochromatic mean intensity of the Mathis et al. (1983) diffuse ISRF. It is normalized so that U = 1 in the solar neighborhood (e.g., Eq. (2) of Galliano et al. 2011). This phenomenological composite SED is the powerU model component of G18 (Sect. 2.2.5). We add to the emission the starBB component of G18 (Sect. 2.2.6), in order to account for the residual diffuse stellar emission at short wavelengths.
The list of free SED model parameters, that HerBIE infers, is thus the following. Similarly to the ancillary parameters (Table 2), we use the natural logarithm of quantities varying over more than an order of magnitude. 1. ln M dust , the dust mass, scales with the whole dust SED. 2. ln U min ∈ [ln 0.01, ln 10 3 ] is the minimum cut-off in Eq. (2). 3. ln ∆U ∈ [ln 1, ln 10 7 ] controls the range in Eq. (2). 4. α ∈ [1, 2.5] is the power-law index in Eq. (2). 5. ln q AF ∈ [ln 10 −5 , ln 0.9] is defined in Sect. 3.1.1. 6. f VSAC ∈ [0, 1] is defined in Sect. 3.1.1. 7. ln L is the luminosity of the residual stellar emission (cf. G18, Sect. 2.2.6). These parameters are however not the most physically relevant. In the rest of the present article, we focus our discussion on the following three parameters, marginalizing over the other ones: (i) M dust ; (ii) U ; and (iii) q AF , where U (Eq. (9) of G18) is the mean of the distribution in Eq. (2). It quantifies the massaveraged starlight intensity illuminating the grains. It is a function of the three parameters U min , ∆U, and α (Eq. (10) of G18). It can be related to an equivalent large grain equilibrium temperature, T eq , through: U (T eq /18 K) 5.8 (e.g., Nersesian et al. 2019).

Questionable assumptions and residual contaminations
In our experience, the model of Sect. 3.1.2 is the most appropriate for galaxies observed with the typical spectral coverage of Table 1. It presents however the following limitations.
Shape of the ISRF. We assume that dust is heated by the Mathis et al. (1983) ISRF, scaled by a factor U. We assume in the model that the shape of this ISRF does not vary between galaxies nor within regions inside galaxies. It is obvious that this assumption is not correct. However, its consequences are minimal on the parameters we are interested in, for the following reasons. Firstly, M dust and U depend mainly on the far-IR peak emission, which is dominated by large grains. These grains are at thermal equilibrium. Their emission therefore does not depend on the shape of the ISRF, only on the absorbed power. Secondly, q AF controls the fraction of small, stochastically-heated grains. The emission from these grains depends on the shape of the ISRF (e.g., Camps et al. 2015). However, since small a-C(:H) are destroyed in H ii regions (cf. Sect. 4.2.3 of Galliano et al. 2018, and Sect. 4.2), they are effectively heated by a rather narrow spectral range (4 eV hν < 13.6 eV), where they exist. This effect is demonstrated on Fig. 7 of Draine (2011), with PAHs. We thus do not expect that actual variations of the ISRF shape will significantly bias our estimate of q AF .
Evolution of small a-C(:H). The abundance of small a-C(:H) and their properties evolve with the ISRF and the gas density. These grains are dehydrogenated and destroyed by intense ISRFs; they are also accreted onto large grains, in dense regions (e.g., Jones et al. 2013;Köhler et al. 2015). Our parametrization (Sect. 3.1.1) allows us to explore variations of q AF between galaxies, but we assume that q AF is constant, for all U, within each galaxy. However, this assumption will not bias our global estimate of q AF , as this parameter is merely a way to give a physical meaning to the observed L AF /TIR ratio (L AF denoting the power emitted by the aromatic features). This assumption would only be problematic if we were trying to estimate the local value of q AF in PDRs, for instance. Indeed, we would, in this case, underestimate q AF , by assuming that a fraction of the aromatic feature emission comes from H ii regions, which are generally hotter than PDRs, and thus more emissive.
Grain opacity. We assume that the grain opacity does not vary between galaxies, nor within regions inside galaxies. There are several indications that this hypothesis is not correct (e.g., Sect. 4.2.1 of Galliano et al. 2018). In particular, the far-IR opacity can typically vary by a factor of 3 with the accretion or removal of mantles and grain-grain coagulation . Such variations will bias our estimate of the dust mass. In addition, the submm and mm silicate emissivity implemented in THEMIS has an unphysical power-law dependence at long wavelengths. Accounting for a more realistic opacity, based on laboratory measurements such as those of Demyk et al. (2017a,b), will likely affect our dust mass estimates. This is an important effect that we can not avoid. We discuss its consequences in Sect. 4.1.3, when interpreting our results.
Residual contaminations. The following emission processes, not accounted for in our model, could be present in our observations.
Several gas lines contribute to the emission in the photometric bands of The submm excess is an excess emission of debated origin, appearing longward 500 µm (cf. Sect. 3.5.5 of Galliano et al. 2018, for a review). Since it is not accounted for in our model, it could bias our submm SED. This effect is however probably negligible in our sample, for the following reasons. Firstly, this A&A 649, A18 (2021) Fig. 2. Select SED fits. Each panel shows the inferred SED PDF, in yellow intensity scale, normalized by the MCMC-averaged TIR. The color violin plots represent the synthetic photometry PDF, for each waveband. The black error bars are the observations. Panels a-c: arbitrary chosen galaxies, representative of their classes (early-type, late-type, and irregulars, respectively). Panel d: case, where most constraints are only upper limits. submm excess is model-dependent. THEMIS has a flatter submm opacity than models based on the DL07 optical properties (e.g., Fig. 4 of Galliano et al. 2018). It is therefore more emissive in the submm regime and minimizes the contribution of the excess. Secondly, this excess is observed in dwarf galaxies, but is rarely detected in higher metallicity systems (e.g., Galliano et al. 2003Galliano et al. , 2005Galametz et al. 2009;Rémy-Ruyer et al. 2015;Dale et al. 2017). Yet, data longward 500 µm, in our sample, are available only for large objects, due to the large Planck/HFI beam. Finally, residuals of our model run (Sect. 3.2) do not show significant excesses in the submm bands.
Residual stellar emission could contaminate the shortest wavelength bands. Improper stellar subtraction could lead to positive or negative offsets, independently in the different bands.

The reference run
The results we present in Sect. 4 were obtained with the model described in Sect. 3.1. We call it our reference run. Figure 2 shows the SED fit of four representative objects, obtained with this run. In panels a-c, SEDs of three arbitrarily chosen galaxies, representative of early-type galaxies (ETG), late-type A18, page 8 of 42  Table 1. Each instrument is color-coded. The vertical dashed error bars represent the calibration prior, that is the ±1σ range given in Appendix A. The violin plots show the actual posterior PDFs of δ, for each broadband. The width of a single violin plot is proportional to the PDF as a function of δ (vertical axis). galaxies (LTG), and irregulars, respectively, are displayed. The PDF of the SED is shown as a yellow density plot. We also show the PDF of the synthetic photometry (violin plots 13 of different colors). The comparison of this synthetic photometry to the observed flux (circles with an error bar) reflects the quality of the fit. We discuss a more thorough and technical fit quality test in Appendix D. Panel d shows the example of a galaxy where most of the fluxes are 3σ-upper limits (displayed when F ν < σ noise ν ). When the evidence provided by the data is weak, which is the case when few detections are available, the posterior distribution becomes dominated by the prior. This is what we see here. The range spanned by this SED PDF is the extent of the prior. The global scaling of this SED is very uncertain, but its shape is realistic. This would not be the case with a nonhierarchical model. Figure 3 shows the inference of the calibration biases, δ(λ), of each waveband (Sect. 2.1.3). Basically, the model fluxes are corrected by factors 1+δ(λ). For each waveband, the dashed-line error bar represents the ±1σ interval of the prior on the calibration bias. This prior is a multivariate normal distribution, centered on 0, with the covariance matrix of Eq. (A.2). As discussed in Sect. 3.1.3, besides accounting for the uncertainty in the calibration of each instrument, these coefficients can account for external contaminations and fitting residuals. We can interpret the most deviant values of Fig. 3 in this light. The posterior values of δ lying outside their ±2σ prior are the following.
160 µm : both overlapping bands PACS3 and MIPS3 indicate an excess emission of about 15%. A significant fraction of this excess is likely due here to contamination by the brightest line of the ISM, [C ii] 158 µm . The [C ii] 158 µm intensity is about 1% the FIR (far-IR; 60−200 µm) power (Malhotra et al. 1997(Malhotra et al. , 200113 Violin plots are 90 • -rotated histograms. Brauher et al. 2008;Cormier et al. 2015). Its contribution to the PACS3 and MIPS3 is thus expected to be a few percent, as these bands are narrower than the FIR range.
12 µm and 22 µm : the deficit could be due to a systematic difference between the size distribution of medium a-C(:H) grains implemented in THEMIS and in the diffuse ISM of these galaxies. Although the model was different, Galliano et al. (2011, Appendix A.2) had to decrease the abundance of the grains carrying the 24 µm continuum in the Large Magellanic Cloud (LMC), for a similar reason.
Apart from these deviations, the two overlapping bands WISE1 and IRAC1 indicate that the model, on average, underestimates the observations by 10% around 3.5 µm. This could be due to either the continuum or the 3.3 and 3.4 µm features. The constraint of the size distribution of THEMIS by the SED of the Milky Way ( Fig. 3 of Jones et al. 2013), has been performed with a medium resolution IRS spectrum for all a-C(:H) features, except the two in question here. The global emissivity of these features is the most uncertain of the model. We also notice that the SPIRE calibration biases go in opposite directions, while they should be partially correlated (Appendix A.5). The HFI1 and HFI2 calibration biases agree very well with SPIRE. This is the sign that there is a systematic residual proportional to the flux in this wavelength range. It could mean that the slope of THEMIS is not steep enough in the 200−500 µm range 14 . The fact that the HFI3 is almost 0 would mean that the slope flattens again longward 800 µm. This is very speculative, but we note that this behavior is qualitatively consistent with the optical properties measured in the laboratory by Demyk et al. (2017b).
The parameters inferred with this run, for each galaxy, are given in Appendix H. We report only the parameter values and their uncertainties (mean and standard-deviation of the posterior PDF). The skewness and correlation coefficients can be retrieved from the DustPedia archive 15 . The length of the MCMC of this run is 1 000 000, where we have excluded the first 100 000 steps, to account for burn-in. The maximum integrated autocorrelation time (Eq. (43) of G18) is 65 000.

Additional runs: robustness assessment
In order to assess the robustness of our results, we have performed several additional runs, with different assumptions. We now discuss the comparison of these tests with our reference run. We focus on the derived dust mass and do not display the redundant comparisons with U . Indeed, TIR is usually well constrained and, by model construction, TIR ∝ M dust × U . We discuss the difference in q AF , when relevant. Figures 4-6 compares M dust or q AF derived by these tests to the same quantity derived by the reference run, M ref dust and q ref AF , respectively. Table 3 reports some statistics on this comparison. We color code the galaxies according to their Hubble stage, T : Early-type: T ≤ 0 (red); Late-type: 0 < T < 9 (green); Irregulars: T ≥ 9 (blue).

Influence of the fitting method
Least-squares. We have fit the full sample of Sect. 2 with the physical model of Sect. 3.1, using a standard least-squares method (cf. Appendix C of G18). The goal of this run is to  The scattered sources are underestimates, for similar reasons as the least-squares example. The amplitude of the scatter is however smaller than for the least-squares run. In addition, even the most extremely scattered sources are 2σ-consistent with the 1:1 relation. This is because a Bayesian method samples the likelihood as a function of the parameters, but stays conditional on the data, while a frequentist approach (e.g., least-squares) sam-ples the likelihood as a function of the data, thus considering data that have not actually been observed, leading to biases (e.g., Jaynes 1976;Wagenmakers et al. 2008).
No Planck and IRAS data. To understand the role of the far-IR coverage, we have fit the sample of Sect. 2 without the Planck and IRAS data, using the physical model of Sect. 3.1. In this case, the far-IR coverage is provided by the sole Herschel data. There is thus no data longward 500 µm. The result is shown in panel c of Fig. 4. The absence of this data does not bias the fit, it simply introduces some scatter, consistent with the 1:1 relation (cf. Sect. 5.2 of G18 for more discussion on this effect). The irregulars are marginally overestimated, due to the lack of sufficient SPIRE detections. Some of these sources do not have any submm detections, higher dust masses are therefore allowed.

Influence of the dust model assumptions
With the Galliano et al. (2011) dust mixture. We have fit the full sample of Sect. 2 with the HB model of Sect. 3.1, replacing the THEMIS grain properties by those of the amorphous carbon (AC) model of Galliano et al. (2011). The far-IR opacity of the two grain mixtures are comparable (cf. Fig. 4 of Galliano et al. 2018). The only fundamental difference is that the aromatic features are accounted for by PAHs, not by a-C(:H). Panel a of Fig. 5 compares M dust to our reference model. As expected, the two values are in good agreement. The moderate scatter is due to the mild difference between the two far-IR opacities. However, most of the ratios are 1σ-consistent with the 1:1 relation. Panel b of Fig. 5 compares the q PAH of this run to the q AF of the reference run. In Sect. 3.1.1, we noted that we should have q PAH /q AF 0.45. In the present case, we have a ratio of 0.6. A possible explanation of this difference is the following. The parametrization of the THEMIS's aromatic spectrum shape is controlled by f VSAC (Sect. 3.1.1), which alters the a-C(:H) mass. On the contrary, for the Galliano et al. (2011) AC model, the shape of the aromatic spectrum is controlled by the fraction of ionized PAHs, which does not alter the PAH mass. A systematically lower 8-to-12 µm ratio compared to the Galaxy's diffuse ISM would explain a q PAH /q AF ratio higher than expected.
Modified black body with β = 1.79. We have fit the photometry of Sect. 2, longward 100 µm, with a Modified Black Body (MBB; e.g., Sect. 2.2.2 of G18). In this first test, we fix the emissivity index β = 1.79 and the level of the opacity to mimic the far-IR opacity of THEMIS: κ(λ) = 0.64 m 2 kg −1 × (250 µm/λ) 1.79 (e.g., Sect. 3.1.1 of Galliano et al. 2018). Panel c of Fig. 5 compares M dust to our reference run. We can see that M dust is about 0.8 times lower. This value is consistent with what Galliano et al. (2011, Appendix C.2) found in the LMC. This difference is due to the fact that a MBB is an isothermal approximation. Since a SED fit is roughly luminosity weighted, the MBB does not account for the coldest, less emissive, but massive regions in the galaxy. It thus systematically underestimates the mass.
Modified black body with free β. Similarly to the previous test, we have fit the photometry of Sect. 2, longward 100 µm, with a MBB, but letting β free, this time. Such a model can potentially infer the grain optical properties through the value of β and the grain physical conditions through the temperature, T d . This potentiality is however limited by the mixing of physical conditions (e.g., Sect. 2.3.1 of Galliano et al. 2018). Panel a of Fig. 6 compares M dust to our reference run. We can see the dust mass is not extremelly biased, although there is some scatter. The β−T d relation of this run is presented in Appendix E.
With the DL07 ISRF distribution. We have fit the full sample of Sect. 2 with the physical model of Sect. 3.1, replacing the ISRF distribution of Eq. (2) by the DL07 prescription: This prescription is our Eq.
(2) plus a uniformly illuminated component at U = U min , with the parameter γ controlling the weight of the two components. We follow DL07 by fixing α = 2 and U min + ∆U = 10 6 . Consequently, the far-IR and submm slope, beyond the large grain peak emission ( 100 µm), is the slope of the large grain emission at U = U min , making this model very similar to a fixed-β MBB in the far-IR and submm range. In comparison, the ISRF distribution of our reference run allows us to account for a flattening of the far-IR and submm slope, by mixing different ISRF intensities, down to lower temperatures (Sect. 2.3.1 of Galliano et al. 2018). Apart from the ISRF distribution, the other features are similar to our reference run: THEMIS grain mixture and HB method. Panel b of Fig. 6 compares M dust to our reference run. We notice, as expected, the same systematic shift, here by a factor 0.7, as for the fixed-β MBB (Table 3). In summary, the comparisons of Sects. 3.3.1-3.3.3 have demonstrated that the discrepancies induced by different fitting methods or model assumptions can be well understood. We are therefore confident that our reference run is the most robust among the diversity of approaches we have tested.  . Secondly, they did not profit from the possibility to include ancillary data in the prior (Sect. 3.1). Finally, they did not include the remaining sources from the DGS sample. The comparisons of M dust and q AF are shown in Fig. 7.

Comparison to CIGALE results
Panel a of Fig. 7 shows that their dust mass is about a factor of two lower than ours. This can be partly understood in the light of the difference in grain mixtures. Indeed, Jones et al. (2017) revised the THEMIS model by including the more realistic Köhler et al. (2014) optical properties. To fit the same observational constraints, they compensated this update by changing the mantle thickness, as well as the dust-to-gas mass ratio: Table 2 of Jones et al. (2013) and M dust /M H = 7.4 × 10 −3 according to Table 1 of Jones et al. (2017). This sole modification explains why a mass derived with the 2013 THEMIS version would be a factor of 0.86 lower than with the 2017 version. However, this does not explain the whole extent of the discrepancy. The comparison of our dust masses with those derived with CIGALE (Sect. 3.3.3) certainly excludes such a large error, on our side. It can be noted that there is also some scatter around the median ratio of M dust /M ref dust . A part of this discrepancy can naturally be explained by the fact that several galaxies have a very poor spectral coverage. As demonstrated in Sect. 3.2, the prior becomes dominant in this case. In our case, the prior contains the information provided by the ancillary data, thus helping to reduce the dust parameter range.
Panel b of Fig. 7 shows the comparison of q AF . The two quantities are in good agreement. There is some scatter around the median, for the same reason as mentioned for M dust . However, the problem with this quantity is the way L19 discuss it. They improperly report the meaning of q AF that they call "QPAH" (L19, Sect. 3, 5th item). They write it represents "the mass fraction of hydrogenated amorphous carbon dust grains with sizes between 0.7 nm and 1.5 nm", while it actually is the mass fraction of a-C(:H) with sizes between 0.4 nm and 1.5 nm. Furthermore, they claim the Galactic value of this parameter is 7.1%, while it is q Gal AF = 18.6% for the 2013 version and q Gal AF = 17.0% for the 2017 version. The value of 7.1% is the mass fraction of a-C(:H) between 4 Å and 7 Å. Consequently, they mistakenly show that most of the DustPedia sample has a higher fraction of small a-C(:H) than the Milky Way, while it is not the case (cf. Sect. 4.2). In summary, we are confident that our derived parameters are both more precise and more accurate than L19's.

The derived dust evolution trends
In this section, we present the main dust evolution trends derived from the reference run (Sect. 3.2). These results are displayed as correlations between two inferred parameters, for each source in the sample. Displaying the full posterior PDF of each galaxy as density contours is visually impractical. Instead, we display its extent as a skewed uncertainty ellipse (SUE; Appendix F). SUEs approximately represent the 1σ contour of the PDF, retaining the information about the correlation and the skewness of the posterior, with a dot at the maximum a posteriori. When discussing parameter values in the text, we often quote the 95% credible range (CR 95% ), which is the parameter range excluding the 2.5% lowest and 2.5% highest values of the PDF. We also adopt the following terminology. We call extremelly lowmetallicity galaxy (ELMG), a system with Z Z /10. To simplify the discussion, since the heavy-element-to-gas mass ratio, Z, is usually called metallicity, we introduce the term dustiness to exclusively denote the dust-to-gas mass ratio: by specific, we denote quantities per unit stellar mass (similar to sSFR): (i) the specific dust mass is sM dust ≡ M dust /M ; (ii) the specific gas mass is sM gas ≡ M gas /M . We note that, in all the displayed relations, the number of objects depends on the availability of the ancillary data (Table 2).

Evolution of the total dust budget
We first focus on scaling relations involving the total dust mass, M dust , with respect to the gas and stellar contents, the metallicity and the star formation activity. Casasola et al. (2020) have explored additional scaling relations, focussing on DustPedia LTGs.

Qualitative discussion
as a function of the gas fraction: This well-known relation was previously presented by Clark et al. (2015), De Vis et al. (2017b), and Davies et al. (2019). It shows that: (i) at early stages ( f gas 0.7; mainly irregulars), there is a net dust build-up; (ii) it then reaches a plateau (0.2 f gas 0.7; mainly LTGs) where the dust production is counterbalanced by astration; (iii) at later stages ( f gas 0.2; mainly ETGs), there is a net dust removal. Several sources have peculiar positions relative to the above mentioned trend. Firstly, the irregular (blue SUE) with f dust 0.01 is PGC 166077. It is however technically not an outlier, as CR 95% ( f dust ) = [1.7 × 10 −3 , 2.8 × 10 −2 ], overlapping with the rest of the sample. Secondly, the two ETGs (red SUEs) with a low f dust , at f gas 0.3 and f gas 0.6, are NGC 5355 and NGC 4322, respectively. For NGC 4322, CR 95% ( f dust ) = [1.2 × 10 −6 , 1.4 × 10 −4 ], marginally overlapping with the rest of the sample. For NGC 5355, however, CR 95% ( f dust ) = [2.7 × 10 −6 , 6.8 × 10 −5 ], making it a true outlier. The ETG (red SUE) with f gas 1 is ESO 351-030. It is the Sculptor dwarf elliptical galaxy. This is not an outlier to the relation, but it indisputably lies outside of its group. Panel b of Fig. 8 presents the relation between the specific gas mass, and the dustiness. Such a relation was previously presented by Cortese et al. (2012), Clark et al. (2015), andDe Vis et al. (2017b). There is a clear negative correlation between these two quantities, showing that when a galaxy evolves, its ISM gets progressively enriched in dust and its gas content gets converted to stars. There is one notable feature deviating from this main sequence: a vertical branch at sM gas 0.1, exhibiting a systematically lower dustiness. This branch is mostly populated by ETGs and contains most of the ETGs of the relation. We discuss the likely origin of this branch in Sect.  Fig. 8 shows the relation between the specific dust mass and the specific star formation rate, sSFR. This relation was previously presented by Rémy-Ruyer et al. (2015) and De Vis et al. (2017b). It was also discussed in Cortese et al. (2012) with sSFR replaced by its proxy NUV-r. In the same way, but on resolved 140 pc scales, it was shown in M 31 by Viaene et al. (2014). There is a clear positive correlation between the two quantities. The different galaxy types are grouped in distinct locations: (i) ETGs are clustered around 10 −4 sSFR 0.01 Gyr −1 ; (ii) LTGs are clustered around 0.01 sSFR 1 Gyr −1 ; (iii) Irregulars tend to lie around 0.1 sSFR 10 Gyr −1 , but are more scattered than the other types, with a systematically lower sM dust than what the extrapolation of the trend would suggest. This scaling relation provides an interesting approximation to derive the dust content from SFR and M , two quantities which usually are easy to estimate. In particular, the relation is quasi-linear in the low sSFR regime, with: with CR 95% (M dust /SFR) = [0.12, 53] × 10 7 yr. Panel d of Fig. 8 shows the variation of the dustiness as a function of the metallicity (Eq. (1)). Such a relation is one of the most common benchmarks for global dust evolution models (cf. Sect. 5), and has been presented by numerous studies (e.g., Lisenfeld & Ferrara 1998;James et al. 2002;Draine & Li 2007;Galliano et al. 2008b;Galametz et al. 2011;Rémy-Ruyer et al. 2014;De Vis et al. 2017a). There is a clear correlation between the two quantities, reflecting the progressive dust enrichment of the ISM, built from heavy elements injected by stars at the end of their lifetime. Few ETGs are present, due to the lack of reliable gas metallicity estimates for these objects. In the four panels, the SUEs represent the posterior of each galaxy, using the reference run (Sect. 3.2). The SUEs are color-coded according to the Hubble stage of the object (Sect. 3.3). We show the Milky Way values, as a yellow star, for comparison. We emphasize that, although we are showing different parameters of the same sample, the different panels do not exactly contain the same number of objects (Table 2). In particular, there are fewer reliable metallicity measurements, especially for ETGs.
These would occupy the high-metallicity regime. Compared to Rémy-Ruyer et al. (2014), who presented a similar trend for the DGS sources, we notice that a few sources are missing and some metallicities have been updated. This comes from the fact that we have adopted the more robust DGS metallicity reestimates by De Vis et al. (2017a), with published nebular lines and using the PG16_S calibration (Sect. 2.2.2). The lowest metallicity source, at 12 + log(O/H) 7.15, is I Zw 18. The most dust-deficient source of the panel, at Z dust 1.9×10 −6 , is UGCA 20. It is a clear outlier to the trend with CR 95% (Z dust ) = [1.3 × 10 −6 , 3.0 × 10 −6 ].

Comparison to X-ray luminosities
The ETG outliers, visible in panel b of Fig. 8 and discussed in Sect. 4.1.1, are likely due to enhanced dust destruc-tion by thermal sputtering in their hot, X-ray emitting gas (e.g., Bocchio et al. 2012;Smith et al. 2012), as suggested by De Vis et al. (2017b). We could conceive a reverse causality where it is the low dust abundance that allows a higher fraction of escaping X-ray photons. However, this is unrealistic, as X-ray observations clearly indicate that ETGs are permeated by a coronal gas that is usually not found in later types of galaxies (Mathews & Brighenti 2003, for a review). In order to support the likeliness of the thermal sputtering scenario, we have compiled X-ray luminosities from the literature (Table 4). Figure 9 displays the dust-to-stellar mass ratio as a function of the X-ray photon rate per dust grain, L X /M dust , for the 256 galaxies in our sample with published X-ray luminosities. This figure shows a mild negative correlation between the two quantities, consistent with the enhanced dust destruction in X-ray-bright Notes. The total number of sources (last line) is fewer than the sum of each sample size (last column). This is because the same sources were observed by different instruments. When it is the case, we keep the most recent estimate.
galaxies. ETGs clearly lie in the bottom right quadrant of this figure. It is reminiscent of Fig. 10 in Smith et al. (2012), displaying L FIR /L B as a function of L X /L B . We note there is however a significant intrinsic scatter in this relation. This could be due to the fact that most of the studies listed in Table 4 quote a total X-ray luminosity, including both: (i) point sources (AGNs, binary systems); and (ii) the diffuse thermal emission that is sole relevant to our case 16 . In addition, the spectral range used to compute the X-ray luminosity varies from one instrument to the other (Table 4), leading to systematic differences. Nonetheless, accounting for these differences, which is beyond the scope of this paper, would likely not change the plausibility that dust grains are significantly depleted in ETGs due to thermal sputtering.

On the variation of the dust-to-metal mass ratio
Panel d of Fig. 8 shows that the dustiness-metallicity relation is nonlinear. Rémy-Ruyer et al. (2014) first argued, relying on insights from the models of Asano et al. (2013) and Zhukovska (2014), that such a trend was the result of different dust production regimes: (i) at low 12 + log(O/H), dust production is dominated by condensation in type II Supernova (SN) ejecta, with a low yield; (ii) around a critical metallicity 17 of 12 + log(O/H) 8, grain growth in the ISM becomes dominant, causing a rapid increase of Z dust ; (iii) at high 12 + log(O/H), the dust production is dominated by grain growth in the ISM, with a yield about two orders of magnitude higher than SN ii, and is counterbalanced by SN ii blast wave dust destruction. Rémy-Ruyer et al. (2014) argued that the intrinsic scatter of the relation, which could not be explained by SED fitting uncertainties, was due to the fact that each galaxy has a particular star formation history (SFH). Fig. 9. Relation of dust mass to X-ray luminosity. This figure shows the variation of the dust-to-stellar mass ratio as a function of the X-ray photon rate per dust grain, for the sample of Table 4. This is a subset of the reference run (Sect. 3.2). The color convention is similar to Fig. 8 We explore this aspect in Sect. 5. In the following paragraphs, we discuss the different biases that could have induced an artificial nonlinearity in our empirical trend.
Comparison to DLAs. Figure 10 shows the evolution of the dust-to-metal mass ratio (DTM) as a function of metallicity. It is another way to look at the data in panel d of Fig. 8. A constant DTM corresponds to a linear dustiness-metallicity trend. The SUEs in Fig. 10 are not consistent with a constant DTM (horizontal yellow line). However, there are reports in the literature of objects exhibiting an approximately constant DTM, down to very low metallicities: damped Lyman-α absorbers (DLA). We have overlaid the DLA sample of De Cia et al. (2016). These measures are performed in absorption on redshifted systems, along the sightline of distant quasars. The metallicity and DTM are derived from a combination of atomic lines of volatile and refractory elements. We can see that the DLA DTMs vary significantly less than in our nearby galaxy sample 18 . Such an evolutionary behavior requires a high SN-dust condensation efficiency coupled to a weak grain growth rate in the ISM (e.g., De Vis et al. 2017a). Alternatively, the DLA estimates could be biased. It is indeed not impossible that the hydrogen column density of most DLAs includes dust-and element-free circumgalactic clouds in the same velocity range. It would result in a typical solar metallicity object appearing to have a solar DTM, and at the same time, an artificially lower metallicity, diluted by the additional pristine gas along the line of sight.
Accounting for the gas halo. The contamination of the gas mass estimate by external, dust-and element-poor gas is also Fig. 10. Comparison to DLAs. This figure shows the relation between the metallicity and the dust-to-metal mass ratio, for the reference run (Sect. 3.2). This figure is very similar to panel d of Fig. 8: the only difference is that the y-axis has been divided by Z (related to the x-axis through Eq. (1)). We have overlaid in magenta the DLA measures from Table 6  a potential issue for our nearby galaxy sample. This is particularly important for low-metallicity, dwarf galaxies, where the IR-emitting region is usually small compared to the whole H i halo (e.g., Walter et al. 2007;Begum et al. 2008). For instance,  hinted that the trend between Z and Z dust was close to linear, when the gas mass used to estimate Z dust was integrated in the same region as the dust mass. They only had nine galaxies below 12 + log(O/H) < 8.1. We have addressed this issue by adopting the interferometric [H i] 21 cm observations of 20 of the lowest metallicity galaxies in our sample, including the lowest metallicity system, I Zw 18 (Roychowdhury et al., in prep.; Sect. 2.2.3). We have integrated the gas mass within the photometric aperture for these 20 objects. I Zw 18 still lies two orders of magnitude below the Galactic DTM, despite this correction. On the contrary, it is possible that the DTM of UGCA 20, the lowest SUE in Fig. 10, has been underestimated, as it has not been resolved in H i. We have to admit that there is still room for improvement as several of the ELMGs are barely resolved in the IR. Thus, although we corrected for a large fraction of the H i halo, there might still be residual gas not associated with the star forming region within our aperture. We might therefore be underestimating the dustiness of our ELMGs. This will have consequences in Sect. 5. The amplitude of this underestimation is not quantifiable as these sources are not resolved in the far-IR. It is however difficult to imagine that there would still be 99% of gas not associated with IR emission within our aperture. Indeed, for the most extreme case, I Zw 18, our aperture is only 1.5 times the optical radius (Rémy-Ruyer et al. 2013), which should be comparable to the IR radius. The overall rising DTM with 12 + log(O/H) of Fig. 10 is therefore unlikely due to an improper correction of the H i envelopes of ELMGs.
Variation of the grain opacities. We have noted in Sect. 3.1.3 that our dust mass estimates depend on the rather arbitrary grain opacity we have adopted. This grain opacity has been designed to account for the emission, extinction, and depletions of the diffuse Galactic ISM (cf. Sect. 3.1.1). A systematic variation of the overall grain opacity with metallicity could change the slope of the trend in Fig. 10. In order to move I Zw 18 up to the Galactic DTM, we would need to adopt a grain emissivity diminished by about two orders of magnitude 19 . Qualitatively, the ISM of an ELMG, such as I Zw 18, is (e.g., Cormier et al. 2019): (i) permeated by hard UV photons; (ii) very clumpy with a low cloud filling factor. With more UV photons to evaporate the mantles and less clouds to grow them back, we could assume that the grains in such a system would be reduced to their cores (cf. Fig. 16 of Jones et al. 2013). Demantled, crystalline, compact grains are indeed among the least emissive grains. However, to our knowledge, there are no interstellar dust analogs having a far-IR opacity two orders of magnitude lower than those used in the THEMIS model. The Draine & Li (2007) mixture, which resembles compact bare grains, is only a factor of 2 less emissive than THEMIS (e.g., Fig. 4 of Galliano et al. 2018). We could also imagine that the composition of the dust mixture itself changes. In particular, the fraction of silicates could be higher in ELMGs, as C/O and O/H are correlated (Garnett et al. 1995). This would have a limited effect, since: (i) a decrease of the large a-C(:H) abundance by 50% would decrease the global emissivity of THEMIS in the SPIRE bands by 15−30%; (ii) a variation of the forsterite-to-enstatite ratio would change the emissivity by less than 30%. It is therefore very unlikely that the dependence of DTM with metallicity is artificially induced by our grain opacity assumption.
Variation of the size distribution. Another factor that could affect the trend of Fig. 10 is that we have fixed the size distribution of the large grains. In principle, relaxing this assumption and allowing the size distribution to be dominated by very small grains (VSG) in low-metallicity sources, would raise the DTM of objects such as I Zw 18. Indeed, VSGs are stochastically heated. They spend most of their time at very low temperatures between successive photon absorptions. Their excursion at T 20 K will span only a fraction of their time. A mixture of VSGs would thus appear less emissive than a larger grain at an equilibrium temperature close to the high end of their temperature distribution. We have demonstrated this effect in Fig. 11. We have simulated a VSG-dominated SED mimicking a typical ELMG, peaking around λ 40 µm, and with very weak aromatic feature emission ( Fig. 11; panel b; blue curve). The size distribution needed to produce such a SED is made almost exclusively of a 1.5 nm radius grains (Fig. 11; panel a; blue curve). We have fit this synthetic SED with our reference model (Sect. 3.1.2), keeping the large grain size distribution fixed ( Fig. 11; panel a; red curve), but varying the ISRF distribution. This model reproduces very well the photometric fluxes ( Fig. 11; panel b; red circles), but requires a total dust mass a factor of 3 lower than the VSG model. This effect goes in the right direction but is not enough to explain the two orders of magnitude required to account for a constant DTM in Fig. 10. We could further decrease the emissivity of the VSG model by lowering the size of the grains. However, it would result in a SED peaking shortward Fig. 11. Demonstration of a VSG-dominated SED. Panel a: size distribution of the two models we are comparing here: the reference model in red, which is simply the THEMIS model, with the small grains scaled down to mimic the SED of a typical ELMG; and a VSG model, in blue, which is a log-normal size distribution, peaking at 1.5 nm with a logwidth of 0.2. The optical properties of the VSG model are those of THEMIS, and we have kept the same silicate-to-carbon grain mass ratio. In both cases, the curves represent the size distributions of a-C(:H) and silicates. Panel b: corresponding SEDs for the two models. The SED of the reference model has been scaled down by a factor of 0.30, to fit the VSG SED. It means the VSG SED reproduces the reference model with a dust mass 3.33 times higher. The VSG SED is uniformly illuminated by an ISRF with U = 10, while the reference model has a distribution of ISRFs with U = 200. The blue filled and red empty circles show the synthetic photometry of each model in the IRAC, MIPS1, PACS, and SPIRE bands.
λ 30 µm, inconsistent with our observations. Finally, VSGdominated ELMGs would not be in agreement with theoretical dust size distribution evolution models (e.g., Hou et al. 2017;Hirashita & Aoyama 2019;Aoyama et al. 2020). At early stages, these models predict the ISM being populated of large grains.
The reason is that the dust production at these stages is thought to be dominated by SN-ii-condensed dust, and grains condensed in core-collapse SN ejecta are essentially large as the small ones are kinetically sputtered (e.g., Nozawa et al. 2006).
Very cold dust. A significant fraction of the dust mass could have been overlooked, hidden in the form of very cold dust (T dust 10 K). The emission of this component would manifest as a weakly emissive continuum at submm wavelengths, that we have not accounted for. Very cold dust has been invoked to explain the submm excess in dwarf galaxies (cf. Sect. 3.1.3). It could thus be invoked to flatten our dust-tometal mass ratio trend, in principle. However, while the excess has been reported by numerous studies, predominantly in dwarf galaxies (e.g., Galliano et al. 2003Galliano et al. , 2005Dumke et al. 2004;Bendo et al. 2006;Galametz et al. 2009;Bot et al. 2010), very cold dust does not appear as one of its viable explanations. Indeed, to reach such a low temperature, very cold dust should be shielded from the general ISRF in massive dense clumps. In the LMC, Galliano et al. (2011) showed that the excess emission at 10 pc scale was diffuse and negatively correlated with the gas surface density, inconsistent with the picture of a few dense clumps. Similarly, Galametz et al. (2014) and Hunt et al. (2015) showed that this excess was more prominent in the outskirts of late-type disk galaxies, where the medium is less dense. In addition, other realistic physical processes to explain this excess have been proposed (Meny et al. 2007;Draine & Hensley 2012). Dust analogs also exhibit a flatter submm slope that could partly account for this excess (Demyk et al. 2017b). More qualitatively, it is difficult to conceive of the presence of massive dense clumps in ELMGs, where the dust-poor ISM is permeated by intense UV photons from the young stellar populations that are forming. In addition, this dust would likely be associated with dense gas that we have not accounted for, limiting its impact on the estimated dustiness.
Systematic uncertainty on the ELMG's DTM. Overall, it appears that none of our model assumptions could be demonstrated to be responsible for artificially inducing the observed variation of the DTM. The trend of Fig. 10, for nearby galaxies, is therefore likely real. It is however possible that the dustiness of the ELMGs has been underestimated. We can quote a rough systematic uncertainty for an object such as I Zw 18, the following way. Firstly, since the H i aperture is a factor of 1.5 times the optical radius, Z dust might be underestimated by a factor at most 1.5 2 = 2.25. Secondly, we have discussed that the grain opacity could have been overestimated by a factor of 2. The systematic uncertainty on Z dust is thus at most a factor of 2. Finally, we have discussed that our grain size distribution assumption could lead to an underestimate of Z dust by a factor of at most 3. Overall, we can consider that the dustiness of ELMGs could have been underestimated by a factor of √ 2.25 2 + 2 2 + 3 2 = 4.25.

Evolution of the aromatic feature emitters
We now focus our discussion on another important dust parameter, the mass fraction of aromatic feature emitting grains, q AF (Sect. 3.1.1). We note that, in the THEMIS model, all grains are either pure a-C(:H) or a-C(:H)-coated. All of them therefore potentially carry aromatic features. However, only the smallest ones (a 10 nm) will fluctuate to high enough temperatures (T 300 K) to emit these features. Finally, we remind the reader that, assuming aromatic features are carried by PAHs, q AF is formally equivalent to 2.2 × q PAH (Sect. 3.1.1).

The competing small a-C(:H) evolution scenarios
The strong spatial variability of the aromatic feature strength has been known for several decades. Dramatic variations have been observed within Galactic and Magellanic H ii regions (e.g., Boulanger et al. 1998;Madden et al. 2006;Galametz et al. 2013), as well as among nearby galaxies (e.g., Galliano et al. 2003Galliano et al. , 2005Engelbracht et al. 2005;Madden et al. 2006). The following scenarios have been proposed.
Photodestruction. The aromatic feature strength appears to be negatively correlated with the presence of intense or hard UV fields. For instance, Boulanger et al. (1998) showed that their strength in Galactic PDRs is decreasing when the intensity of the ISRF is increasing. Madden et al. (2006) showed that their strength, within Galactic and Magellanic regions, and within nearby galaxies, is negatively correlated with [Ne iii] 15.56 µm /[Ne ii] 12.81 µm , an ISRF hardness indicator.
Numerous studies confirmed these trends, showing the depletion of aromatic features around massive stars, as a result of the photodissociation and photosublimation of small a-C(:H). At the same time, it was also shown that the aromatic feature strength is empirically correlated with metallicity. This is clear in nearby galaxies, as a whole (e.g., Engelbracht et al. 2005;Madden et al. 2006), as well as within extragalactic H ii regions (e.g., Gordon et al. 2008;Khramtsova et al. 2013). This correlation is not inconsistent with photodestruction, as most low-metallicity galaxies detected at IR wavelengths to date are actively forming stars. Their ISM, less opaque due to a lower Z dust , is permeated by an intense UV field, potentially destroying small a-C(:H) on wider scales. It is also consistent with the increasing strength of the 2175 Å bump with metallicity, comparing sightlines in the Small Magellanic Cloud (SMC), LMC, and Milky Way (e.g., Gordon et al. 2003). The extinction bump is indeed thought to be carried by small carbon grains with aromatic bonds (Joblin et al. 1992).
Inhibited formation. Alternatively, several studies have explored the possibility that small a-C(:H) are less efficiently formed at low-metallicity. Galliano et al. (2008b) showed that, assuming small a-C(:H) are formed from the carbon produced by AGB stars (on timescales of 400 Myr) and the rest of the grains are mostly made out of the elements produced by massive stars (on timescales of 10 Myr), the aromatic feature emitters are under-abundant at early stages of galaxy evolution. Another scenario, developed by Seok et al. (2014), reproduces the trend of aromatic feature strength with metallicity, assuming small a-C(:H) are the product of the shattering of larger carbon grains (see also Rau et al. 2019;Hirashita & Murga 2020). Finally, Greenberg et al. (2000) have proposed that aromatic feature carriers could form on grain surfaces in molecular clouds and be photoprocessed in the diffuse ISM. Sandstrom et al. (2010) and Chastenet et al. (2019) show that the spatial distribution of q PAH in the Magellanic Clouds is consistent with this scenario. This process would also explain the fact that these grains are underabundant in ELMGs, as the molecular gas fraction rises with metallicity.

Empirical correlations within our sample
The different processes we have just listed are not exclusive and could very well compete within the ISM. It is however important to understand which one controls the overall abundance of small a-C(:H), at galaxy-wide scales. Figure 12 shows q AF as a function of the average starlight intensity, U (Sect. 3.1.2), and metallicity, in our sample. Similar relations were previously shown by , Galliano et al. (2008b), Khramtsova et al. (2013), and Rémy-Ruyer et al. (2015). The two SUEs at high U (panel a) are I Zw 18 and SBS 0335-052. Their SEDs indeed peaks around 30 µm. However, their q AF appear higher than the extrapolation of the general trend, with CR 95% (q AF ) = [0.012, 0.035] for I Zw 18 and CR 95% (q AF ) = [0.024, 0.064] for SBS 0335-052. Mid-IR Spitzer-IRS spectroscopy did not detect aromatic features in these galaxies (Wu et al. 2007;Houck et al. 2004). The true value of q AF is likely lower for these two objects. The reason of this overestimation lies in the difficulty to estimate aromatic feature strengths solely with broadband fluxes, when the featureto-continuum ratio is weak. Indeed, in the weak aromatic feature regime, q AF is biased by the color of the mid-IR continuum (cf. Fig. 1 of Galliano et al. 2008b).
Overall, we find clear correlations in both panels of Fig. 12, consistent with past studies. However, the relation appears more scattered with U , than with 12 + log ( We could also question the accuracy of U as an ISRF tracer. Indeed, U is a mass-averaged ISRF intensity, giving a large weight to the coldest regions within the beam. Rémy-Ruyer et al. (2015) and Nersesian et al. (2019) used sSFR, which is luminosity weighted, in place of U and found a good negative correlation with q PAH and q AF , respectively. In our sample, the correlation coefficient between ln q AF and ln sSFR is not improved: ρ = −0.488 +0.028 −0.024 with CR 95% (ρ) = [−0.53, −0.43]. Therefore, it appears that q AF correlates much more robustly with metallicity than ISRF indicators, in our sample. This result is worth noting, especially since several studies focussing on a narrower metallicity range concluded the opposite (e.g., Gordon et al. 2008;Wu et al. 2011). It probably relies on the fact the metallicities we have adopted (Sect. 2.2.2) correspond to well-sampled galaxy averages, while in the past a single metallicity, often central, was available and may have not been representative of the entire galaxy. This result suggests that photodestruction, although real at the scale of star-forming regions, might not be the dominant mechanism at galaxy-wide scales and that one needs to invoke one of the inhibited formation processes discussed in Sect. 4.2.1.

Approximate analytical metallicity trends
It can be interesting to have a simple analytical approximation describing how Z dust and q AF vary as a function of 12+log(O/H). To that purpose, we have performed polynomial fits of the relations in panel d of Fig. 8 and panel b of Fig. 12. These fits are displayed in Fig. 13. A18, page 20 of 42 This equation provides a good fit of our trend, for the metallicity range covered by our sources. However, it gives a rising dustiness trend with decreasing metallicity below I Zw 18. To improve this relation, we can assume that the dustiness will be proportional to 12 + log(O/H) at extremelly low-metallicity, as we show in Sect. 5.3 that dust evolution in this regime is dominated by SN ii production. We therefore modify Eq. (8), as: log Z dust −13.230 + x for x < 7.3.
This fit splits our data in two equal size samples above and below.

Empirical quantification of key dust evolution processes
We now analyze the trends of Fig. 8, in a more quantitative way, with the help of a grain evolution model. The goal here is twofold: (i) by fitting theoretical tracks to our sample, we intend to test if it is possible to account for the variation of the main galaxy parameters, with only a few evolutionary processes; (ii) we aim to give a self-consistent empirical quantification of the main ad hoc tuning parameters controlling these evolutionary processes.
Recently, such models have been used to post-process numerical simulations in order to provide a more comprehensive understanding of galaxy evolution (e.g., Aoyama et al. 2017).

Model hypotheses
We adopt the one-zone dust evolution model of Rowlands et al. (2014) updated by De Vis et al. (2017a 21 . It solves the coupled 20 They can also eventually compute the variation of the grain size distribution and composition with time (e.g., Hirashita et al. 2015). 21 We started from the public Python code provided by the authors at https://github.com/zemogle/chemevol. We have rewritten it in Fortran for numerical efficiency, since we needed to generate large grids of models. We have also implemented an adaptative temporal grid to ensure numerical precision at later stages of evolution. In both panels, the dashed lines display the envelope encompassing 68% of the sources, and the dotted lines, the envelope encompassing 95% of the sources. We show here the decimal logarithm of the Z dust and q AF . differential equations accounting for the time evolution of the mass of the four following quantities.
Stars are made out of the gas, which is partially returned to the ISM at the end of their lifetime. The model tracks their evolution as a function of their mass, which determines their lifetime and their elemental and dust yields. The role of this component therefore relies greatly on the form of the assumed IMF (e.g., Salpeter 1955;Chabrier 2003).
Gas is depleted by astration and outflow, and is replenished by stellar feedback and by inflow of metal-and dust-free gas.
Heavy elements are injected in the ISM by stars at the end of their lifetime. A fraction of them is recycled into stars through astration and lost via outflow.
Dust is produced by three main processes: (i) condensation in low-and intemediate-mass star (LIMS) ejecta (m < 8 M ) 22 ; (ii) condensation in SN ii ejecta (m ≥ 8 M ); (iii) grain growth in the ISM, by accretion of elements onto grain seeds. Dust is 22 Following De Vis et al. (2017a), we assume that LIMSs condense 15% of their heavy elements into dust.  (ii) outflow; (iii) astration. The drivers of the evolution of these quantities are: (i) the assumed SFH (i.e., the SFR as a function of time); (ii) the assumed inflow and outflow rates.

The tuning parameters
The efficiencies of individual dust evolution processes are poorly known (see reviews by Dwek 2005;Draine 2009;Jones 2016a,b,c;Galliano et al. 2018). From a theoretical point of view, these efficiencies depend on the most elusive detailed microscopic dust properties: chemical constitution, structure (crystalline, amorphous, aggregate, etc.), and size. Consequently, theoretical estimates usually span several orders of magnitude. From an observational point of view, unambiguous constraints of the evolution rates are also problematic, because it is nearly impossible to isolate a dust source or sink in a telescope beam. Dust evolution models therefore use simple parametric efficiencies controlled by tuning parameters.
Dust condensation in SN-II ejecta. Dust formed by massive stars (m ≥ 8 M ) is thought to dominate the dust production at early stages, below the critical metallicity (cf. Sect. 4.1.3). The dust evolution model we are using assumes the theoretical dust yields, m SN dust , of Todini & Ferrara (2001), modified by a general tuning parameter, δ SN (Table 5). These dust yields can also be integrated over the IMF, φ(m ): to provide a single averaged dust yield per SN ii. For both Salpeter (1955) and Chabrier (2003) IMFs, it is: Y SN 0.35 × δ SN M /SN. The dust condensation timescale, τ cond , can be expressed as a function of the SN ii rate, R SN : Grain growth in the ISM. The dust build-up by accretion of gas atoms onto pre-existing dust seeds is a potentially dominant production process at late stages, above the critical metallicity A18, page 22 of 42 (Sect. 4.1.3). We adopt the parametrization of , where the grain growth timescale, τ grow , is parametrized by a tuning parameter, grow : where ψ(t) is the SFR as a function of time t. This tuning parameter encompasses our uncertainty about grain sizes, sticking coefficients, and the fraction of cold clouds where dust growth occurs 23 .
Dust destruction by SN II blast waves. SN ii shock waves destroy dust by kinetic sputtering and grain-grain collisional vaporization (e.g., Jones et al. 1994). We adopt the dust destruction timescale, τ dest , of Dwek & Scalo (1980): where m dest gas is a tuning parameter quantifying the effective mass of ISM swept by a single SN ii blast wave within which all the grains are destroyed.

Modeling the evolution of individual galaxies
We use the dust evolution model of Sect. 5.1 to fit, for each galaxy: M dust , M gas , M , SFR, and Z. Dust destruction in X-ray emitting gas is not taken into account in this model. We therefore exclude the ETGs (T ≤ 0) which would not be consistently fitted (cf. Sect. 4.1.2). The sample we model here contains 556 sources.

The model grid
We have generated a large grid of models that we interpolate to find the best evolution tracks for each of our sources. We assume a delayed SFH (Lee et al. 2010): parametrized by the SFH timescale, τ SFH , and a scaling factor, ψ 0 . We assume a Salpeter (1955) IMF in order to be consistent with the observed SFR and stellar mass estimates (Sect. 2.2). Exchanges of matter between galaxies and their environment can have a significant role in regulating the global dustiness at all redshifts (e.g., Jones et al. 2018;Ohyama et al. 2019;Sanders et al. 2020;Burgarella et al. 2020). We account for inflow and outflow, assuming their rates are proportional to SFR: R in/out (t) = δ in/out × ψ(t). Table 6 gives the parameters of our dust evolution model grid. We perform log-log interpolation of this grid in order to estimate the tracks corresponding to any arbitrary combination of parameters. The SFH we have adopted here, to compute the chemical evolution, is different from the SFH used by Nersesian et al. (2019) to model the SEDs and estimate the observed SFR and M (cf. Sect. 2.2.1). Ideally, adopting the same SFH would be more consistent. However, adding a second functional form to Eq. (15), accounting for a potential recent burst, would push this model beyond our current computational capabilities, by increasing the number of free parameters. At the same time, the elemental and dust enrichment by this recent burst would probably be negligible. Alternatively, we could have also modeled the SED, using solely Eq. (15). The problem, in this case, is that the use of a single stellar population biases the estimate of M (cf. Appendix B.2).

Hierarchical Bayesian dust evolution inference
To constrain our dust evolution model, we consider the posterior inference of our reference run (Sect. 3.2) as a set of observational constraints. We fit the four independent, intensive quantities: sM dust , sM gas , sSFR, and Z. The complex PDF, including the intricate parameter correlations, is preserved in this process. We make the restrictive assumption that the three tuning parameters, δ SN , grow , and m dest gas are universal and are therefore identical in every galaxy. Consequently, we assume that the difference between galaxies is solely the result of their particular individual SFH. We vary the age of the galaxy, t, the two SFH parameters, τ SFH and ψ 0 , and the inflow and outflow rates, δ in and δ out .
On the universality of the tuning parameters. As we have seen, the three tuning parameters hide effects of the dust constitution, size distribution, fraction of cold clouds, etc. These quantities could vary across different environments. However, as we show in Sect. 5.3.1, the SN II dust condensation dominates in the low-metallicity regime, while grain growth and SN destruction are important above the critical metallicity. Thus, the parameters that we constrain are representative of the regime where they dominate the dust budget. Exploring their variation with the environment is probably premature. We are, however, able to infer the variation of the timescales across galaxies from Eqs. (11)-(14). In addition, if we were to infer one set of tuning parameters per galaxy, it would imply that they vary as a function of the galaxy's global parameters. In order to be consistent, we would then need to implement these variations of the tuning parameters in the dust evolution model, and change their value at each time step, accordingly.
The model prior. We have built a HB model to infer these parameters. The prior of the five SFH-related parameters is a multivariate Student's t distribution controlled by hyperparameters, similar to the treatment in HerBIE. We have assumed wide log-normal priors for the three common tuning parameters. These priors are centered at ln 0.1, ln 1000, and ln 320 M with standard-deviations, 0.8, 0.8, and 0.4, for δ SN , grow , and m dest gas , respectively. These priors were designed so that their ±3σ range roughly corresponds to the extent of the values reported in the literature: 1% δ SN 100%, 100 grow 10 4 , and 100 M m dest gas 1000 M . It does not mean that these parameters can not exceed these limits, but it will be a priori improbable. The reason to adopt weakly informative priors is to avoid unrealistic degeneracies. For instance, there is a well-known degeneracy between grain growth and grain destruction (e.g., De Vis et al. 2017a;De Looze et al. 2020). Indeed, the ratio of the two timescales, τ dest (t)/τ grow (t) ∝ grow /m dest gas × (Z(t) − Z dust (t)) is not constant, but the metallicity dependence is quite mild in the narrow range above the critical metallicity. This is enough to allow the MCMC to explore unrealistically high values of grow and m dest gas , if we assume a flat prior.
HB run specifications. Accounting for the missing ancillary data, we have a total of 1913 observational constraints, for our 556 sources. One may note that we are fitting a model with 5 × 556 + 3 = 2783 parameters and 5 + 5 + C 2 5 = 20 hyperparameters. This is not an issue in the Bayesian approach (e.g., Hogg et al. 2010). Model parameters are rarely completely independent. Furthermore, the hyperprior is the dominant constraint for very scattered parameters (G18). Finally, by sampling the full joint PDF, a Bayesian model clearly delineates the various degeneracies, provided the MCMC has converged toward the stationary posterior. We have run 12 parallel MCMCs, starting from uniformly distributed random initial conditions within the parameter ranges in Table 6, in order to ensure the uniqueness of our posterior. The results discussed below are from a 10 5 length MCMC, removing the first 10 4 steps to account for burn-in. The longest integrated autocorrelation time is 6200.

The fitted dust evolution tracks
The four panels of Fig. 14 present the same relations as in Fig. 8, for the subsample of 556 sources. The posterior PDF of the dust evolution tracks is displayed as colored density contours, marginalizing over the SFH of individual galaxies.
Qualitative inspection. At first glance, we can see that the tracks of Fig. 14 reproduce the data quite well. Apart from a few outliers, there are however two notable systematic discrepancies. Firstly, the sM dust -sSFR trend of panel c is poorly reproduced. The initial rise of sM dust at sSFR 0.3 Gyr −1 undershoots several of the starbursting dwarf galaxies. Overall, this discrepancy is quite mild, as most of the SUEs in this area are less than 2σ away from the trend. The recent burst of star formation, that is not accounted for by the chemical evolution model, is likely responsible for the enhanced observed SFR. On the contrary, a starburst occuring in more evolved objects, on the left of this trend, might simply contribute to the general scatter of the relation and go unnoticed. Below sSFR 0.1 Gyr −1 , there is a general decreasing trend of sM dust with decreasing sSFR, but our model is essentially flat. This is the most troublesome discrepancy of our analysis. We discuss it fully below. Secondly, in panel a, several high-gas-fraction sources are undershot by the model, although most of them are only less than 2σ away from the tracks. Those are the irregular galaxies around the critical metallicity regime. They are in the stage where the dustiness changes rapidly, due to the increased contribution of grain growth. These outliers can be seen in panel b, around sM gas 10, in panel c, around sSFR 3 Gyr −1 , and in panel d, around 12 + log(O/H) 8. The model poorly reproduces the rapid dustiness rise around the critical metallicity, in panel d.
On the sM dust -sSFR relation. The quasi-linear trend of sM dust with sSFR, for sSFR 0.1 Gyr −1 (panel c of Fig. 14), is not accounted for by our model. N20 argue that outflows is responsible for this trend 24 . Indeed, outflow depletes the dust content proportionally to SFR, without affecting the stellar content. It seems like a natural explanation. The outflow rate, δ out , is a free parameter in our model. However, it does not produce a linear sM dust -sSFR relation. Figure 15 shows the same data as in panel c of Fig. 14. We have overlaid several dust evolution tracks corresponding to our maximum a posteriori parameters, varying δ out . This figure is similar to Fig. 6 of N20. We note the following. Firstly, we can see that no single track can reproduce the trend. It could be a satisfactory explanation if galaxies were rapidly evolving off the main trend, staying only a small fraction of their lifetime on the horizontal branch. However, looking at the top axis of Fig. 15, we realize that a galaxy should spend about half of its lifetime on the horizontal branch. If our model was correct, there should have been, statistically, a significant number of sources deviating from the main trend. Secondly, the range of δ out values that can account for most of the trend is quite narrow (Fig. 15). It would mean that the outflow has to be precisely regulated. It seems unlikely. Finally, if this trend was solely due to outflow, the specific gas mass would follow it closely, which is not the case. This point has also been discussed by Cortese et al. (2012). In summary, it appears our failure at reproducing the sM dust -sSFR relation is rather an incapacity of our model to produce a realistic trend, rather than a fitting issue. As we see in Sect. 5.3.3, this is not an IMF-related issue. Since the other trends are acceptably reproduced, the problem might lie in the only quantity represented in panel c of Fig. 14 not appearing in the other panels: the SFR. Our adopted SFH (Eq. (15)) and our inflow and outflow prescriptions might be too simplistic to account for the wealth of data we have here (cf. e.g., Leja et al. 2019, for a discussion on the limitations of the delayed SFH).
The importance of fitting dust evolution models. We stress the importance of actually fitting, in a consistent way, dust evolution models, rather than simply performing visual comparisons. Here, we have attempted to fit M , M gas , M dust , Z, and SFR for each galaxy. This rigorous process highlights the model limitations. Most past studies have merely overlaid tracks on their data, producing a convincing but inconsistent interpretation. For instance, De Vis et al. (2017a), who used the same dust evolution model as we use, and compared it to a similar sample as ours, were able to produce tracks accounting for most of the observations in the panels a and d of our Fig. 14. However, two quantities of a given galaxy, such as the dust and stellar masses, were usually explained with different values of the dust evolution parameters, at different ages. On the contrary, our approach allows us to avoid mutually inconsistent explanations of different trends and correlations. Overall, performing a rigorous fit does not help getting better solutions, but it definitely helps avoiding bad ones.  Fig. 8. The black SUEs represent the 556 galaxies of our subsample. The yellow star is the Milky Way. The colored density contours represent the posterior PDF of dust evolution tracks, marginalizing over the individual SFH of each galaxy.

The inferred dust evolution parameters
We now discuss the parameters inferred from the fit of Sect. 5.2.3.

Parameter distribution
The fits of Fig. 14 allow us to infer the common dust evolution tuning parameters, as well as the individual SFH-related parameters (Table 6).
The dust evolution tuning parameters. The PDF of the three common dust evolution tuning parameters is displayed in Fig. 16. We infer the following values: Y SN 7.3 +0.2 −0.3 × 10 −3 M /SN; grow 4045 404 −354 ; m dest gas 1288 +7 −8 M /SN. The relatively small uncertainties on these parameters reflect the fitting uncertainties. They indicate the values we infer are not ambiguous. However, they do not include the assumption-dependent uncertainties. In particular, the precise value of Y SN relies mainly on our estimate of the dustiness of the few ELMGs in our sample. In Sect. 4.1.3, we have extensively discussed the different systematic effects that could have biased these estimates. If some of these effects are, at some point, proven to be relevant, our inference of Y SN would have to be revised accordingly. To first order, the SN dust yield is proportional to the dustiness of the lowest metallicity objects: We also notice that δ SN and m dest gas are peaking at the tail of their prior. It means our data carries a large weight evidence. It also means the values we infer are probably an upper limit for δ SN and a lower limit for m dest gas . Since m dest gas and grow are correlated, our inference of grow is also probably a lower limit. We have estimated in Sect. 4.1.3 that the dustiness of ELMGs could have been underestimated by a factor of at most 4.25, the conservative conclusions we can draw from our analysis are thus that: Y SN 0.03 M /SN; grow 3000; m dest gas 1200 M /SN.
The SFH-related parameters. Figure 17 displays the corner plot of the five parameters controlling the individual SFH of each galaxy. The displayed PDF is the posterior of the parameters of every galaxy. There is a relatively large scatter of these parameters, implying that our different galaxies have different SFHs. We note that the interpolation between models does not produce a perfectly smooth distribution. The edges of our model grid are visible in this corner plot. This does not however impact the results as our grid samples well enough the PDF.

The dust evolution timescales
It is possible to estimate the posterior PDF of the dust evolution timescales of each galaxy, from the inferred parameters of  Figure 18 displays these timescales as a function of metallicity. Although there is some scatter due to the different SFH and age of galaxies in a given metallicity bin, we note that these timescales evolve.
The SN II dust condensation timescale (panel a of Fig. 18) is around τ cond 100 Myr for ELMGs, implying that dust can be dominated by stardust in this regime. It rises up to τ cond 1000 Gyr around solar metallicity, indicating this process is not sufficient to account for all ISM dust in evolved systems.
The grain growth timescale (panel b of Fig. 18) is quite scattered. It starts around τ grow 1 Gyr in the ELMG regime and decreases down to τ grow 50 Myr around solar metallicity. The average value of our sample at 12 + log(O/H) ≥ 8.5 is τ grow 45 Myr. It is another way to show that dust formation is dominated by grain growth around solar metallicity.
The SN II dust destruction timescale (panel c of Fig. 18) is also scattered but stays around τ dest 300 Myr across our metallicity range.
The Milky Way value displayed in the three panels is consistent with the cloud of points in the highest metallicity domain.
There is a common misconception that dust destruction by SN II blast waves is unimportant at early-stages, because the dustiness is so low that few grains are destroyed by a single SN II. However, we show here that this is not the case as: (i) the SN II rate is on average higher at low-metallicity; (ii) the fraction of grains destroyed by a single SN II is dustiness-independent.

Sensitivity to the IMF
As we discussed in Sects. 2.2.3 and 2.2.4, the main uncertainty on our estimated M and SFR comes from our IMF assumptions. Those were derived with a Salpeter (1955) IMF. Here, we  Fig. 16. The displayed PDF is the distribution of the individual parameters of every galaxy, marginalizing over the dust evolution tuning parameters. explore the sensitivity of our results assuming a Chabrier (2003) IMF. To be consistent, we first need to correct our estimated M and SFR. According to Madau & Dickinson (2014, Sect. 3.1), we need to multiply SFR by 0.63 and M by 0.61. We have then generated another grid of models similar to Table 6  This agreement was expected. The reason is that, to first order, LIMSs represent a dead weight in both the SED and the chemical enrichment. The total power emitted by stellar populations is dominated by massive stars. This is the reason why SFR and M estimates are so dependent on the IMF assumption. From the dust evolution modeling, the elemental and stardust enrichment is also dominated by massive stars. The only quantity for which we infer a different value is logically ψ 0 . It is 50% higher to generate the same number of massive stars, compensating for the 0.63 correction factor. The three panels display the posterior PDF of the three dust evolution timescales. The SUEs represent the value of these timescales as a function of metallicity, for each galaxy. The yellow star represents the Milky Way, at the maximum a posteriori of the three tuning parameters.

On the low inferred SN II dust yield
Our SN II dust yield is lower than the most recent estimates in situ. Measuring the dust mass produced by a single SN II is quite difficult, as it implies disentangling the freshly-formed dust from the surrounding ISM. It also carries the usual uncertainty about dust optical properties. A decade ago, the largest dust yield ever measured was Y SN 0.02 M (in SN 2003gd; Sugerman et al. 2006). The Herschel space telescope has been instrumental in estimating the cold mass of supernova remnants (SNR). The yields of the three most well-studied SNRs are now an order of magnitude higher: Barlow et al. 2010;Arendt et al. 2014;De Looze et al. 2017;Bevan et al. 2017;Priestley et al. 2019 Matsuura et al. 2015).
In all these cases, the newly-formed grains have not yet experienced the reverse shock (Bocchio et al. 2016). The net yield is thus expected to be significantly lower.
Even if 10−20% of the dust condensed in an SN II ejecta survives its reverse shock (e.g., Nozawa et al. 2006;Micelotta et al. 2016;Bocchio et al. 2016), we have to also consider the fact that massive stars are born in clusters. The freshlyformed dust injected by a particular SN II, having survived the reverse shock, will thus be exposed to the forward shock waves of nearby SNe II (e.g., Martínez-González et al. 2018). This effect is not accounted for by Eq. (14), as it does not account for clustering, nor does it account for the excess dustiness around these stars due to the recent grain production. Our estimate of Y SN is therefore an effective empirical yield, that probably accounts for this effect.

The relevance of local low-metallicity galaxies
Our analysis confirms the long-lasting consensus that Milky Way dust is essentially grown in the ISM (Sect. 1). The apparently paradoxical fact here is that we have drawn this conclusion from the low-metallicity domain. It is because dust production is dominated by SN II condensation below the critical metallicity that we could constrain its efficiency and show it is unimportant at solar metallicity. The relevance of dwarf galaxies here is not necessarily that they can be considered as analogs of primordial distant galaxies, but that they sample a particular, key, dust production regime.

Implications for high-redshift systems
High-redshift (z 6) objects exhibiting copious amounts of dust ( 10 7 −10 8 M ), close to the reionization era, have been challenging grain formation scenarios (e.g., Dwek et al. 2007Dwek et al. , 2014Valiante et al. 2009;Laporte et al. 2017). These objects are indeed only a few 100 Myr old, but have a roughly Galactic dustiness. SN II dust condensation would need to have a high efficiency ( 1 M /SN) to account for the observed dust mass (Dwek et al. 2007. AGB star yield can explain this dust content for z 6 objects (Valiante et al. 2009), but not at z 8 (Laporte et al. 2017).
Our results imply that grain growth should be the dominant dust formation mechanism in these galaxies. The dustiness of these massive objects being roughly Galactic, their metallicity should thus be Galactic too. The grain growth timescale should therefore be shorter than 100 Myr (panel b of Fig. 18), well below the age of these systems. Consequently, these very distant galaxies may not be the best laboratories to constrain the SN II dust yield.

Comparison with recent studies
As stated in Sects. 1 and 5.2.3, past studies have not been rigorously fitting cosmic dust evolution models to observations of galaxies. Recently, N20 and De Looze et al. (2020, hereafter DL20) have addressed this issue. N20 have adopted a A18, page 28 of 42 relatively coarse dust evolution model grid with a frequentist fitting approach. Although interesting, their approach does not allow them to rigorously quantify parameter degeneracies and uncertainties. The comparison with their results is therefore limited. We have discussed the main discrepancies between the present work and theirs in Sects. 2.1.2 and 5.2.3. DL20 have analyzed the JINGLE galaxy sample, following a methodology similar to ours, fitting the IR SED of each galaxy, and subsequently fitting one-zone dust evolution tracks to their derived M , M gas , M dust , Z, and SFR. They adopted a nonhierarchical Bayesian approach. To our knowledge, this is the first article, similar to ours, to perform a rigorous fit of dust evolution models, with clearly quantified parameter uncertainties. Their results however qualitatively differ from ours. DL20 find an overall high SN II dust yield, and a low grain growth efficiency (the tuning parameters are not assumed universal in their study). In our opinion, the discrepancy between our results and those of DL20 comes from the two following points. Firstly, DL20's metallicity coverage is more limited than ours. They have only one source below 12 + log(O/H) = 8.0, and none below 12 + log(O/H) = 7.8 (cf. e.g., their Fig. 7). Consequently, they do not sample the stardust regime, below the critical metallicity. Disentangling SN II dust condensation, grain growth, and shock destruction, with their data, is therefore rendered difficult. Secondly, the posterior distribution of the dust evolution parameters inferred by DL20, as seen in their Figs. G.7-G.12 (they focus their analysis on their galaxy bins 3 and 4), rarely goes down to zero probability within the range allowed by their uniform prior. It means the weight of evidence provided by their data is relatively mild. Our inference of grow and m dest gas are consistent with their PDF, as they fall in a high probability domain in all their models. We simply have a smaller uncertainty, thanks to our low-metallicity coverage. This is more pronounced for their distribution of δ SN (their f survival ). DL20 would have also benefitted from extending their prior range down to smaller values, as they do not consider yields below δ SN = 10%, while we infer a value around δ SN 1%.

Summary and conclusions
This article has presented an observational study aimed at constraining the timescales of the main processes controlling the evolution of interstellar dust in galaxies. The principal highlights are the following.
1. We have gathered the 3 µm-to-1 mm photometry and ancillary data of 798 nearby galaxies from the DustPedia (Davies et al. 2017) and DGS  surveys. We have attempted to create the most conservative, homogeneous sample, by controlling the factors that could lead to systematic biases (Sect. 2): the DustPedia and DGS IR data reduction and photometry have been performed consistently; the stellar mass and SFR have been estimated using the same IMF; the metallicities have been estimated using one uniform calibration; the total gas masses have been derived from [H i] 21 cm and 12 CO(J = 1→0) 2.6 mm observations, when available. When CO data was not available, the molecular gas mass was estimated from a scaling relation.
-Resolved interferometric [H i] 21 cm observations of 20 of the lowest metallicity objects were used in order to extract the gas mass corresponding exactly to the IR photometric aperture.
2. We have performed a hierarchical Bayesian dust SED fit of the 798 galaxies, using the code HerBIE (G18) with the THEMIS grain properties .
-This allowed us to infer the dust mass, M dust , mean starlight intensity, U , and the mass fraction of aromatic-featureemitting grains, q AF , in each galaxy (Sect. 3.2). The inferred parameters are given in Appendix H. -We have compared our inferred parameters to a series of additional runs, as well as to independent literature studies, in order to demonstrate the influence of the different assumptions of our model and assess the robustness of our results (Sects. 3.3 and 3.4).
3. We have displayed several well-known scaling relations involving M , M gas , M dust , 12 + log(O/H), SFR, q AF , and U , for our sample (Sect. 4).
-We have shown that there is a drastic evolution with metallicity of the dust-to-metal mass ratio (by two orders of magnitude). We have extensively discussed the different biases that could artificially produce such a trend, concluding they were unlikely (Sect. 4.1.3). -We have noticed that early-type galaxies have a systematically lower dust-to-gas mass ratio than other types in the same gas-to-stellar mass ratio range. We have investigated the possibility that this was resulting from the enhanced dust destruction due to thermal sputtering in the hot X-ray emitting gas permeating these objects. This scenario is supported by a rough negative correlation between the dust-to-star mass ratio and the X-ray photon rate per dust grain (Sect. 4.1.2). -We have displayed the well-known trends of q AF with 12 + log(O/H), and U . Our data indicate the correlation with 12 + log(O/H) is significantly better (Sect. 4.2). It implies that, at the scale of a galaxy, the overall abundance of small a-C(:H) grains might be principally controlled by the efficiency of their formation (stardust production and/or shattering of larger carbon grains). The photodestruction of small a-C(:H) might overall be circumscribed around star forming regions.

4.
We have performed a hierarchical Bayesian fit of a onezone dust evolution model to the derived M , M gas , M dust , 12 + log(O/H), and SFR of a subsample of 556 late-type and irregular objects (Sect. 5).
-We have inferred the efficiency of the three main dust evolution tuning parameters (Sect. 5.3): (i) the IMF-averaged SN II dust yield is Y SN 0.03 M /SN; (ii) the grain growth efficiency parameter ) is grow 3000; (iii) the average gas mass cleared of dust by a single SN II shock wave is m dest gas 1200 M /SN. Our results therefore imply that dust production is dominated by grain growth in the ISM above a critical metallicity of 12 + log(O/H) 8.0. They also suggest that the massive amounts of dust detected in high redshift systems (z 6) likely grew in the ISM.
-We have shown that ELMGs were crucial in constraining these parameters, as they sample a regime where dust production is dominated by SN II condensation. A steep, strongly nonlinear, dustiness-metallicity relation, such as the one we have found, is the unambiguous evidence that stardust can not dominate the content of solar metallicity systems. -We have shown and explained why these conclusions were, to first order, independent of our IMF assumption (Sect. 5.3.3). -Our model fails at reproducing the relation between the sSFR and the dust-to-star mass ratio. We suspect this is due to the oversimplicity of our SFH, inflow, and outflow prescriptions.
Several of the limitations of our study could be addressed by spatially resolving star forming regions. Performing a similar analysis on 100 pc scales within galaxies would allow us to access another important parameter: the density of the ISM. This quantity drives mantle growth and coagulation. Local variations of the dust-to-metal mass ratio are good indications of grain growth and can help us break the degeneracy with SN II destruction (Sect. 5.2.2). In addition, spatial resolution would be a way to address the origin of the trend of q AF with metallicity (Sect. 4.2), as well as resolving ETGs to quantify the grain sputtering timescale (Sect. 4.1.2). Numerous spatially resolved dust studies have already been published (e.g., Galliano et al. 2011;Draine & Hensley 2013;Hunt et al. 2015;Roman-Duval et al. 2017;Aniano et al. 2020). We are currently preparing several papers of a resolved subsample, with the same consistent approach as employed in the present manuscript (Roychowdhury et al., in prep.). The main observational challenge in order to address these degeneracies is however to obtain spatial resolution in ELMGs, at far-IR wavelengths. The data are currently very limited, because the sensitivity of Herschel was not high enough. SPICA (van der Tak et al. 2018) was the only observatory on the horizon to have the sensitivity to produce resolved far-IR maps of ELMGs. Its sudden cancelation by ESA might leave the important open questions raised in this paper unanswered for several decades.
The HB approach is one of the only methods allowing a rigorous treatment of the photometric calibration uncertainties. These uncertainties are indeed systematic effects, with both spectral and spatial correlations. Some studies account for their spectral correlations, but ignore their spatial correlations (e.g., Gordon et al. 2014). In the present paper, following G18, we assume that calibration uncertainties are perfectly correlated between sources, and partially correlated between wavelengths. In this section, we discuss the values of these uncertainties and of their correlation coefficients.
The DustPedia photometric fluxes, given in the archive 25 , F ν (λ 0 ) at wavelength λ 0 , are accompanied with a total uncertainty being the quadratic sum of noise and calibration errors: where δ C18 cal (λ 0 ) is the calibration uncertainty (Table 1 of C18). HerBIE treats the noise and calibration uncertainties separately. Therefore, we keep the noise uncertainty, σ noise ν (λ 0 ), but we build a covariance matrix of the absolute calibration uncertainties, V cal , to replace the δ C18 cal coefficients.

A.1. The full covariance matrix
Using the separation strategy (Barnard et al. 2000), we decompose the covariance matrix of the absolute calibration uncertainties as: where S cal is the diagonal matrix of standard-deviations and R cal the correlation matrix. The nontrivial elements of S cal and R cal are given in Tables A.1 and A.2. We describe, in the following Appendices A.2-A.8, how we obtained these values from the literature.

A.2. Spitzer/IRAC
The calibration of IRAC data is presented by Reach et al. (2005). It is performed using a subsample of the stellar catalog of Cohen et al. (2003). The components entering in the uncertainty on the absolute calibration are given by Eq. (13) of Reach et al. (2005) and the corresponding values in their Table 7. The first term is the correlated uncertainty, σ abs = 1.5%, corresponding to the uncertainty in the predicted fluxes of Vega and Sirius. The second term is the uncorrelated uncertainty on the calibrator fluxes: (σ 2 groud − σ 2 abs )/n = 0.87%, with σ ground = 2.3% and n = 4. Finally, there is an uncorrelated term introduced by the dispersion of the IRAC observations of the calibrators: σ rms / √ n, where σ rms = 2.0%, 1.9%, 2.1% and 2.1% for IRAC1, IRAC2, IRAC3, and IRAC4, respectively.
In addition, one must multiply the point source calibrated fluxes by aperture correction factors. According to the IRAC instrument handbook 26 (Sect. 4.11), these factors have a 10% uncertainty. We include this extra source of uncertainty, assuming it is uncorrelated. These correction factors have been applied by C18 for all the DustPedia galaxies and by  http://dustpedia.astro.noa.gr/Photometry 26 https://irsa.ipac.caltech.edu/data/SPITZER/docs/ irac/iracinstrumenthandbook/ (2015) for half of the DGS galaxies, the other half being nearly point sources. The IRAC calibration uncertainties will thus be overestimated for about 15 of our sources.

A.3. Spitzer/MIPS
MIPS calibration has been independently performed for each of its three wavebands. The 24 µm calibration is presented by Engelbracht et al. (2007). It is primarily based on observations of A stars. The authors recommend adopting a net calibration uncertainty of 4%. The 70 µm calibration is presented by Gordon et al. (2007). It is primarily based on observations of B and M stars. The authors recommend using a calibration uncertainty of 5%, dominated by repeatability scatter. Finally, the 160 µm calibration, presented by Stansberry et al. (2007) is performed on observations of asteroids, simulatenously in the three MIPS bands. This 160 µm calibration uncertainty is therefore tied to 24 and 70 µm. The authors estimate a 7% uncertainty tied to these bands, for a total uncertainty of 11.6%. To simplify, we assume that the calibration uncertainty is the quadratical sum of 4% correlated with MIPS1, 5% correlated with MIPS2, and the remaining 9.7% uncorrelated.

A.4. Herschel/PACS
PACS calibration is performed on five stars, used as primary calibrators (Balog et al. 2014). The absolute calibration uncertainty for point sources appears to be dominated by the stellar model uncertainty of 5%, correlated between bands. In addition, there is a 2% repeatability uncertainty, uncorrelated between bands.  Notes. See Table 1 for waveband naming conventions.
The extended source calibration uncertainty is not quantified by the instrument team.
A.5. Herschel/SPIRE SPIRE calibration, presented by Griffin et al. (2013) is performed on Neptune. The uncertainty comes from three sources. There is a 4% correlated uncertainty on Neptune's model. There is also a 1.5% uncorrelated uncertainty due to the noise in Neptune's observations. Finally, there is a 4% uncorrelated uncertainty on SPIRE's beam area.
A.6. WISE WISE data have been calibrated by Jarrett et al. (2011) using observations of a network of stars in both ecliptic poles (Cohen et al. 2003) and an additional red source, the galaxy NGC 6552. These calibrators have been observed with Spitzer/IRAC and Spitzer/MIPS, and compared to WISE. It appears that the relative WISE calibration, tied to Spitzer, is accurate to 2.4%, 2.8%, 4.5%, and 5.7% (for WISE1, WISE2, WISE3, and WISE4, respectively). These uncertainties come from the scatter of individual calibrators and can therefore be considered independent. The red calibrator (NGC 6552) is reported to be too bright by 6% by Jarrett et al. (2011). However, Brown et al. (2014) revised the central wavelength of the WISE4 band to correct systematic discrepancies observed with red sources. The absolute calibration accuracy of WISE is not discussed by Jarrett et al. (2011). We can simply assume that each WISE band will be tied to the closest Spitzer band: WISE1 with IRAC1; WISE2 with IRAC2; WISE3 with IRAC4; WISE4 with MIPS1. We can thus quadratically add the Spitzer calibration uncertainties to these bands and assume that this term will be correlated with the Spitzer bands. However, we do not include the IRAC uncertainty on the extended source correction (Appendix A.2), as these calibrators are point sources.

A.7. IRAS
The IRAS flux calibration is presented by Wheelock et al. (1994, Sect. VI.C.2) 27 . The 12 µm flux is calibrated on observations of the star α-Tau, assuming the absolute flux value derived from the ground-based 10 µm photometry by Rieke et al. (1985), accurate within 3%. An additional source of error comes from the uncertainty of 2% on the model used to extrapolate the flux at 12 µm. The 25 and 60 µm fluxes are calibrated by extrapolating the 12 µm flux, using stellar models, normalized to the Sun. The uncertainty in this extrapolation, based on the scatter of individual stars, is estimated to be 4% and 5.3%, respectively (Table VI.C.1 of Wheelock et al. 1994). These uncertainties, which can be assumed to be independent, have to be quadratically added to the 12 µm uncertainty, which is a fully correlated term. The 100 µm calibration is based on observations of asteroids, where the 60 µm flux is extrapolated to 100 µm, assuming that the 25/60 and 60/100 color temperatures are identical. Wheelock et al. (1994) estimate that the total relative calibration uncertainty at 100 µm is 10%, due to the model uncertainty and scatter of the asteroid observations. This uncertainty is correlated with the 60 µm.
For extended sources, uncertainties in the frequency response and the effective detector area have to be considered (Wheelock et al. 1994, Sect. VI.C.4). The uncertainty on the frequency response is not fully quantified, except that it should be 7%/µm of the central bandwidth, with an uncertainty of 0.3 µm on the bandwidth (Sect. VI.C.3 of Wheelock et al. 1994). It is also specified that the 100 µm uncertainty can be significant for objects cooler than 30 K, which is our case. The uncertainty on the detector areas is given by Table IV.A.1 of Wheelock et al. (1994) showing the flux dispersion of each individual detector, during a cross-scan of the planetary nebula NGC 6543. The average dispersions of 6.4%, 6.5%, 11%, and 10.9% (for IRAS1, IRAS2, IRAS3, and IRAS4, respectively) can be considered as an additional, independent source of uncertainty.

A.8. Planck/HFI
The calibration of the HFI data used by C18 is presented by Planck Collaboration XI (2016). The two short wavelength bands (350 and 550 µm) are calibrated on Neptune and Uranus. The uncertainty on the model is 5% including 2% spectrally independent. The statistical errors on the observations of the calibrators are 1.4% and 1.1%, respectively, which can be considered independent. The models of Neptune used here and for SPIRE (Appendix A.5) are the ESA3 and ESA4 models 28 , respectively. Both models differ only by 0.3−0.6% in the SPIRE and HFI bands. They will thus induce correlation. To simplify, we assume that the HFI calibration factors derived from Neptune (correlated with SPIRE) and Uranus (not correlated with SPIRE) are averaged. We obtain calibration uncertainties 30% lower than Planck Collaboration XI (2016), as they have opted for the more conservative, linear addition of independent errors. We choose here the more rigorous quadratrical addition.
The four long wavelength bands of HFI are calibrated on the CMB dipole. The first source of uncertainty is the scatter of individual bolometers: 0.78%, 0.16%, 0.07%, and 0.09% at 850 µm, 1.38, 2.1, and 3.0 mm. The second source of uncertainty comes from the gain variations: 0.03%, 0.04%, 0.05%, and 0.05%, respectively. These different uncertainties can be considered independent.

B.2. Derivation of the stellar mass
To further investigate the discrepancy between the CIGALE fits of N20 and Eskew et al. (2012) approximation used by Madden et al. (2014), discussed in Sect. 2.2.1, we have compared the two estimators on our sample. To close the controversy, we have modeled the UV-to-mm SED of the 35 DGS galaxies that were not in DustPedia, with CIGALE, using the same settings as Nersesian et al. (2019, i.e., using two stellar populations). We therefore have CIGALE-derived M for our whole sample. This is shown in panel a of Fig. B.2. The stellar mass derived using CIGALE with two stellar populations are in very good agreement with the Eskew et al. (2012) approximation.
Panel b of Fig. B.2 compares the same CIGALE-twopopulation estimates with the values of M reported by N20. The main differences are the following, noting that the stellar masses quoted by N20 come from the SED modeling presented by Burgarella et al. (2020). Firstly, Nersesian et al. (2019) used the UV-optical photometry of Clark et al. (2018), which have been homogenized (cf. Sect. 2.1), whereas Burgarella et al. (2020) gathered UV-optical data from the NASA/IPAC Extra- galactic Database (NED). The latter is a compilation of data from the literature that can be inhomogeneous in terms of photometric aperture, extinction correction and removal of foreground stars. Secondly, Nersesian et al. (2019) used CIGALE with two stellar populations: (i) an old exponentially decreasing SFH; (ii) a young burst. In contrast, Burgarella et al. (2020) used a single delayed SFH, which can lead to underestimating the mass, because of the outshining effect of the recent star formation (e.g., Lopes et al. 2020). For instance, Buat et al. (2014) conducted a large comparison of stellar mass estimates and concluded that models using two stellar populations were providing the best fits. This can explain why the IRAC1 and IRAC2 fluxes are systematically underestimated by the mean model of N20 (cf. their Fig. A.1), while it is not the case for Nersesian et al. (2019) 29 . Yet, these bands sample the spectral range where the bulk of the stellar mass dominates the emission. In addition, most of the galaxies of the DGS are blue compact dwarf galaxies (BCD). These galaxies are known to be actively forming stars and have an old underlying stellar population (e.g., Hunter & Gallagher 1989;Grebel 1999;Kunth & Östlin 2000;Aloisi et al. 2007;Janowiecki et al. 2017).   (2003) IMF, by N20. In both panels, the blue symbols correspond to the values presented in Nersesian et al. (2019), and the red symbols correspond to the values estimated for this study, with the same CIGALE settings. The solid yellow line represents the 1:1 relation. The dashed yellow line in panel b represents the 1:0.61 relation that accounts for the difference in IMFs.

Appendix C: Additional discussion on the parametrization of THEMIS
We have presented the way we parametrize the THEMIS size distribution in Sect. 3.1.1. It consists in linearly scaling the fractions of a-C(:H) smaller than 7 Å, and between 7 and 15 Å. We call it the linear parametrization. It is controlled by the two parameters q AF and f VSAC , introduced in Sect. 3.1.1. In the present appendix, we demonstrate, on an example, that this parametrization is flexible enough to mimic the original parametrization of THEMIS. The THEMIS size distribution of small a-C(:H) is a power-law, with minimum cut-off size a min = 4 Å and index α = 5 (Table 2 of Jones et al. 2013). We call it the nonlinear parametrization. Varying a min and α is probably more physical than our linear method, as it preserves the continuity of the size distribution. However, it produces SEDs with very similar shapes.  Fig. 1. On each panel, the red curve shows THEMIS, nonlinearly parametrized, that is varying the minimum cutoff radius, a min , and the index of the power-law1 size distribution, α. The blue curve shows THEMIS, linearly parametrized, with the method described in Sect. 3.1.1. The dots on the two SED curves (panel b) represent the synthetic photometry, that is the model integrated in the filters of Table 1. This is demonstrated on Fig. C.1. This figure displays the same quantities as Fig. 1, but compares the two parametrizations. The red curves show the size distribution (panel a) and the corresponding SED (panel b), for the nonlinear method, with a min = 5 Å and α = 4. The blue curves show the same quantities, with the linear method. We see that the broadband fluxes (dots in panel b) produced by these two parametrizations are very similar. The fact that they are not exactly the same is not important here. What is relevant is that the two methods can alter the SED with the same dynamical range.

Appendix F: Uncertainty representation
We detail here the way we display uncertainties in correlation plots throughout this article.

F.1. Skewed uncertainty ellipses
Our HB model computes the full posterior distribution of the parameters of each source. Such a distribution is usually asymmetric and correlations between parameters can be strong. Displaying this marginalized posterior distribution as density contours, for several hundreds of sources, is visually ineffective. In 2D correlation plots, uncertainty ellipses are a widely-used device to display the extent of the posterior and show the correlation between parameters. However, it does not account for its skewness. To address this issue, we display each posterior as a skewed uncertainty ellipse (SUE), which is the 1σ contour of a bivariate split-normal distribution (BSN; Villani & Larsson 2006), having the same means, variances, skewnesses, and correlation coefficient as the posterior. We add a central dot corresponding to the mode of this BSN. Figure F.1 demonstrates the different ways of displaying uncertainties, for a typical PDF (orange density). Panel c shows the corresponding SUE. It has the advantage of retaining a lot of information from the posterior, with only one dot and one contour.
Displaying a SUE is not straightforward. Indeed, a BSN is parametrized by its position vector, X 0 = (x 0 , y 0 ), its scale vector, Λ = (λ x , λ y ), its shape vector, T = (τ x , τ y ), and its rotation angle, −π/2 < θ < π/2. If we call X = (x, y) our original coordinates and X = (x , y ) the coordinates in the centered, rotated, reference frame (R being the rotation matrix): widely-used uncertainty ellipse, which can be viewed as the mode and 1σ contour of a bivariate normal distribution having the same means, standard-deviations and correlation coefficient as the PDF. Panel c: concept of SUE introduced in Appendix F.1.

F.2. Robust estimate of the posterior distribution moments
We estimate the moments of Eqs. (F.4)-(F.10) numerically, from the marginalized posterior. Some outliers can be present due to incomplete burn-in removal, requiring robust estimators. We choose M-estimators (Mosteller & Tukey 1977), which are a generalization of maximum-likelihood estimators. They have been popularized in astrophysics by Beers et al. (1990). Location and scale M-estimators, using Tukey's biweight loss function were presented by Lax (1975). Posing u i = (x i − l x )/(c × s x ), where l x = med(X) is the median of the sample X = {x i } i=1...N , s x = 1.4826 × med(|X − med(X)|) is the median absolute deviation (MAD) of X, and c is a tuning parameter, the M-estimator of the mean is: (F.20) and for the variance: Considering a second sample Y = {y i } i=1...N , an M-estimator of the covariance is presented by Mosteller & Tukey (1977). Posing v i = (y i − l y )/(c × s y ), the covariance can be estimated as: These estimators are implemented in the Python astropy module (Astropy Collaboration 2013, 2018). We have designed our own skewness W-estimator: We have also slightly improved astropy's implementation by iterating on Eqs. (F.20)-(F.23), replacing l x , l y , s x , and s y witĥ µ(X),μ(Y), V (X), and V (Y), respectively, until a 10 −5 relative accuracy is reached. The tuning parameter is usually taken as c = 6 for the mean and c = 9 for the variance, covariance, and skewness.  (2003) IMF. This is the equivalent of Fig. 16.