Highlight
Open Access
Issue
A&A
Volume 679, November 2023
Article Number A31
Number of page(s) 27
Section Cosmology (including clusters of galaxies)
DOI https://doi.org/10.1051/0004-6361/202347556
Published online 01 November 2023

© The Authors 2023

Licence Creative CommonsOpen Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1. Introduction

The discovery of “MACS J1149 Lensed Star 1” (Kelly et al. 2018), informally known as Icarus (Kelly et al. 2018), represented the birth of a new branch of astronomy dedicated to the study of luminous stars at cosmological distances (redshift z > 1). This feat is only possible thanks to the boost provided by extreme magnification factors, μ, with values μ > 1000 at least for short periods of time. In the case of Icarus, the object caught the attention of astronomers because it brightened by more than a magnitude in observations two years apart (Kelly et al. 2018). This variability was interpreted as due to a microlens – an object with an approximate stellar mass in the z = 0.5444 lensing cluster – that momentarily intersected the path of the light emitted by a z = 1.49 background star and increased the star’s observed brightness. After the discovery of Icarus, other examples of stars observed through a similar technique quickly followed (Rodney et al. 2018; Chen et al. 2019; Kaurov et al. 2019). The discovery of Icarus led to the prediction that the James Webb Space Telescope (JWST) should find Population III stars at z > 7 through caustic transits (Windhorst et al. 2018). That prediction formed the basis of the JWST Guaranteed Time Observations (GTO) program Prime Extragalactic Areas for Reionization and Lensing Science (PEARLS: Windhorst et al. 2023, PIDs 1176, 2738, PI: Windhorst).

The search for lensed stars currently covers a wide range in redshift and includes objects that do not show variability. One example is Godzilla (Diego et al. 2022), which was identified thanks to its anomalous magnification. Godzilla was identified as a monster star because of its extraordinary brightness. At z = 2.37, it has apparent magnitude AB ≈ 22 in the visible bands and is within reach of modest ground-based telescopes. The monster nature of Godzilla is a result of the combination of three factors: (i) it is extremely luminous and likely undergoing a major outburst similar to the Great Eruption of Eta Carinae in the 19th century; (ii) it is close to the caustic of a powerful gravitational lens; and (iii) it is further magnified by a relatively large (yet invisible) substructure up to magnification factors exceeding several thousand. The mass of this substructure is estimated to be ∼108M, consistent with being a small dwarf galaxy. Another example is Earendel at an estimated z = 6.2 and currently holding the record for the most distant star ever observed (Welch et al. 2022a), although Earendel may be a binary system (Welch et al. 2022b). Similar to Godzilla, Earendel was identified not by its change in flux but by the lack of counterimages and its proximity to the cluster critical curve. The absence of a counterimage is believed to be because the separation between the image pair is smaller than the resolving power of the Hubble Space Telescope (HST) or JWST. The magnification of Earendel is estimated to be several thousands, making it one of the most (if not the most) extremely magnified objects known to date. Other examples of lensed stars have appeared recently in the literature (Kelly et al. 2022; Diego et al. 2022; Chen et al. 2022; Meena et al. 2023a,b). Several of the newly discovered stars were detected thanks to the superior sensitivity (and spatial resolution) of JWST, especially at wavelengths > 1 μm, where red supergiant stars at z > 1 are brightest (e.g., Diego et al. 2023a).

The galaxy cluster MACS J0416.1−2403 (“M0416”) at z = 0.396 has been particularly fruitful for the detection of lensed stars. Two galaxies strongly magnified by this cluster contain at least four lensed stars (Rodney et al. 2018; Chen et al. 2019; Kaurov et al. 2019; Kelly et al. 2022; Diego et al. 2023b). Rodney et al. (2018) reported two 2014 transient events consistent with being microlensed stars in a lensing arc (nicknamed the “Spock arc”). The same arc revealed two additional 2016 transients (Kelly et al. 2018). These transients are consistent with being supergiant stars at z ≳ 1 magnified by the combined effect of the galaxy cluster (macrolens) and stars or perhaps black holes in the intracluster light (microlenses, Diego et al. 2023a). More recently, Yan et al. (2023) examined JWST/NIRCam data in four epochs separated by up to 126 days and identified 14 transients. Among these, 12 are compatible with being lensing events of distant stars.

This paper focuses on one of the 12 lensing transients (Yan et al. 2023) in a strongly lensed z = 2.091 galaxy. This peculiar object (at RA = 4:16:08.84, Dec = −24:03:58.62) was originally identified as a lensed-star candidate in the first epoch of NIRCam observations, not as a transient but as a source with extreme magnification (similar to Godzilla). The variable, or transient, nature of the source was revealed in the subsequent epochs as explained in Sect. 5. The similarity between the source discussed in this paper and Godzilla lies in the fact that both are extremely luminous stars; they are both believed to be intrinsic variables, and both are highly magnified by a dark substructure that is not directly detected and with a mass in the range of dwarf galaxies. Because of this similarity, we place this new star in the category of monster (or kaiju) stars and nickname it Mothra1 in order to easily refer to it throughout the paper. Alternatively, the more official name is EMO J041608.838−240358.60, where the acronym EMO refers to extremely magnified object.

The paper is organized as follows. Section 2 describes the JWST and HST data used in this work. Section 3 describes and interprets the spectral energy distribution (SED) of Mothra. Section 4 discusses the relevant lensing properties such as the magnification and critical curve near the lensed star. The lens model was derived using the new constraints obtained from the JWST data and is discussed in detail in a separate paper. Section 5 presents the arguments showing that Mothra must be a stellar object. Section 5.3 shows in more depth how the alternative interpretation of Mothra being a projection effect is inconsistent with the data. In Sect. 6 we discuss the observed time variability of Mothra, expected in situations where microlensing is involved or when the star is intrinsically variable. Further evidence in support of the lensing hypothesis is presented in Sect. 7 where we identify the possible counterimage, predicted in the lensing scenario. Section 8 interprets the extreme magnification of LS1 as a possible microlensing or millilensing event. Section 8.2 discusses the maximum mass allowed for the millilens. Section 9 discusses the results and, in particular, the implications for several dark matter models: cold dark matter, warm dark matter, and fuzzy dark matter. Section 10 summarizes the conclusions. Throughout this paper, the nickname Mothra refers to the physical background source. Its observed image is referred to as LS1 and its possible counterimage as LS1′.

In this work, we adopt a flat cosmology with h = 0.7 and ΩM = 0.3. For this model, 1″ corresponds to 5.34 kpc at the redshift of the cluster (z = 0.396) and 8.44 kpc at z = 2.091. For the same model, the angular diameter distances that appear in the lens equation are 1102 Mpc, 1741 Mpc, and 1196 Mpc for Dd, Ds, and Dds, respectively. These are the distances to the deflector, to the source, and from deflector to the source, respectively. The luminosity distance modulus for a source at redshift z = 2.091 is 44.41. We use the term macrolens to refer to a lens with the approximate mass of a galaxy cluster and the term microlens for stars embedded in the intracluster light (or ICL). The term millilens is used for larger deflectors with masses comparable to dwarf galaxies. Similarly, the term caustic refers to the macromodel, and the term microcaustics refers to the microlenses. A microlens with mass 1 M at the redshift of the cluster deflecting light from a source at redshift z = 2.091 would have an Einstein radius of 2.25 microarcseconds (μas). Because the Einstein radius scales as the square root of the effective mass of the lens, and the effective mass scales as the true mass times the tangential magnification, μt, (Diego et al. 2018, where the magnification μ is the product of the tangential magnification μt and the radial magnification μr), a millilens–source system with the same redshifts but with mass 104M and embedded in a macromodel magnification μ = μt × μr = 250 × 3.3 = 825 would have an Einstein radius times larger, that is ≈3.5 milliarcseconds (mas). Similarly, a millilens with a mass 100 times larger (that is 106M) would have an Einstein radius 10 times larger, that is ≈35 mas, which is slightly more than the 30 mas pixel size in JWST/NIRCam images used here. This mass range (104M–106M) is relevant for the discussion below.

2. Data

This work uses new data from the James Webb Space Telescope (JWST) together with archival data from the Hubble Space Telescope (HST). HST observed M0416 in two epochs, six months apart, in 2014 as part of the Hubble Frontier Fields program (HFF, Lotz et al. 2017). (Observations of this cluster taken in 2012 as part of the CLASH program Postman et al. 2012 were too shallow to show Mothra and are not used here). Data used here are from the ACS filters F435W, F606W, and F814W and reach depths from 28.8 mag in F435W to 29.1 mag in F814W. JWST/NIRCam observed the cluster in four epochs in 2022–2023 over a time span of about four months. Epochs 1, 2, and 4 are from the PEARLS program (Windhorst et al. 2023), and Epoch 3 is from the CANUCS program (Willott et al. 2022). Each epoch included observations in eight filters: F090W, F115W, F150W, F200W, F277W, F356W, F410M, and F444W. The PEARLS epochs reached depths (5σ point source limits in AB mag) of 28.7, 28.8, 28.9, 29.1, 28.7, 28.8, 28.2, and 28.5, respectively, while the CANUCS epoch is about 0.2−0.3 mag deeper. The three PEARLS epochs combined reach ∼0.6 mag deeper than the individual epochs. Yan et al. (2023, their Table 1) provide exact dates, exposure times, and depths reached in each epoch and each filter. This work used images with a pixel scale of 30 mas.

The strongly lensed arc that is the focus of this paper (RA = 64.03675, Dec = −24.06624) was identified by the HFF program as a lensed galaxy. This arc has two discrepant spectroscopic redshift estimates obtained from MUSE data, z = 1.827 (Richard et al. 2021) and z = 2.091 (Bergamini et al. 2023, Table A.1). We adopted z = 2.091 because this redshift is closer to the value predicted by the lens model (Diego 2023), and it is consistent with the position of the previously unidentified third counterimage of the galaxy (the arc being a pair of counterimages). The third counterimage appears in the new NIRCam data as a small, round source rather than an arc. The source is just 065 from the position predicted by the lens model when z = 2.091. In contrast, a test lens model with the arc at z = 1.827 predicts that the third counterimage should be at RA = 64.0453942, Dec = −24.0746914, but there is no object meeting the criteria of SED compatibility, surface brightness, and photometric redshift within 3″ of this position (typically, counterimages are found within ∼1″ from the predicted positions). The higher redshift is also more consistent with the arc’s photometric redshift estimate z = 2.23 ± 0.032. The third counterimage’s photometric redshift has z = 2.1 ± 0.17, consistent with this. All in all, the z = 2.091 solution is a much better match to the available data.

Figure 1 shows HST and JWST images of the arc. Each multiply lensed feature is labeled with a single letter, and the counterimage of that feature has a prime. Mothra is in the middle of the arc. The 2014 HST data show Mothra as well as knots b and b′. Knots c and c′ are seen only in JWST data and are key for the interpretation because they constrain the position of the critical curve to be close to the midpoint between c and c′ independent of any specific lens model. The new knots c and c′ emphasize the anomaly of LS1 because a counterimage, LS1′, is expected between LS1 and c but is not seen yet LS1 was visible in 2014. If LS1′ is missing because the source is transient and the two paths have different light travel times, the time difference must be > 8 years.

thumbnail Fig. 1.

HST and JWST images of the arc hosting LS1. All panels are labeled with the image’s filter, and the positions of LS1 and two multiply imaged knots (b and b′, c and c′) are marked in some panels. Unlabeled arrows point to LS1. A scale bar is to the left of the top row. The top three panels show the ACS data taken in 2014 for the HFF project. The 2022–2023 NIRCam data (three epochs combined) are shown in the middle and bottom rows. These are the discovery images for knots c and c′.

3. SED fitting

Photometry of LS1 is complicated by its location in a strongly lensed arc and by the nearby point source c′ as shown in Fig. 2. At longer wavelengths, the instrument PSF blends LS1 and c′ (Fig. 1). To overcome these limitations, we fit a point-spread function (PSF) to LS1 and to each of the six nearby knots in the three-epoch combined image. The PSF model was derived from the stacked signal of nearby, unsaturated, unresolved sources on the same image. For the filters with λ > 1.5 μm, in which LS1 and c′ are partially blended, we subtracted a model of the arc prior to PSF fitting. The model was scaled from the residual in the F150W band after point source subtraction and smoothed to match the resolution of the longer wavelength filters. Details are given in the appendix.

thumbnail Fig. 2.

Enlarged color image of Mothra’s arc and its surroundings. Colors are a combination of HST and JWST filters; F435W + F606W + F814W + F090W + F115W, F150W + F200W, and F277W + F356W + F410M + F444W for blue, green and red respectively. LS1 and three multiply lensed knots and their counterimages are labeled. No counterimage for LS1 is visible. Numerous faint, unresolved objects are marked with unlabeled magenta circles. These could be globular clusters or compact galactic remnants in the galaxy cluster. The white dashed line is the inferred position of the critical curve based on the ratio of the b–c separation (039) to the b′–c′ separation (032). The solid white curve is the expected position of the critical curve based on the lens model. The two curves are apart. The third counterimage of the galaxy identified in the new JWST images is shown in the bottom-left. Its position is RA = 64.046155, Dec = −24.0752067.

The measured SED for LS1 is shown in Fig. 3. The SED is too broad to come from a single star, but a binary system with temperatures T ≈ 14 000 K and T ≈ 5250 K matches well. Finding a binary is not surprising because most massive stars in our Galaxy are binaries, and the binary fraction of massive stars seems to go up at lower metallicities (see e.g., the discussion of Windhorst et al. 2018, for some references on this). The upper limit of likely magnifications requires the stars to be massive (initial masses ≳10 M), and therefore only low surface gravities need to be considered. Dust reddening would demand higher luminosities, and we assume zero. The best fit is a hot star with Teff = 14 000 K and log(μL/L)≈8.9 plus a cool star with Teff = 5250 K and log(μL/L)≈8.4. For an adopted magnification of μ ≈ 5000, the intrinsic luminosity of the cooler star would be log(μL/L)≈4.7, which would correspond to a yellow super/hypergiant star inside the instability strip, with initial mass M ≈ 15 M (Ekström et al. 2012; Szécsi et al. 2022). If the two stars experience a similar magnification, then the higher–Teff B-type star must have an intrinsic luminosity a factor of ≈2.5 higher. A red supergiant (Teff ≈ 4000 K) would provide a better SED fit, but Mothra exhibits significant flux variations at 1.5 μm and longer wavelengths, yet not at 0.9 or 1.15 μm (Fig. 4). This means the red component is varying, and it has to contribute significant light to the F150W band. This requires Teff ≳ 5000 K. At peak brightness, Mothra becomes redder (Yan et al. 2023), which requires the cooler component to transition into the red supergiant regime or alternatively experience a boost in luminosity accompanied by increased circumstellar dust reddening.

thumbnail Fig. 3.

Mothra’s SED and the best matching binary model. The blue and red lines represent model stellar spectra for stars with Teff and μLbol as shown in the legend. Models are from the Lejeune et al. (1997) compilation of stellar atmosphere spectra at solar metallicity redshifted to z = 2.091. The green line shows the combined SED. Black circles with error bars represent the observed photometric data and green boxes the model photometry resulting from the best-fit compound spectrum. If the two stars experience the same magnification, the high–Teff star has a bolometric luminosity ≈0.4 dex (a factor of ≈2.5) higher than that of the low–Teff star.

thumbnail Fig. 4.

Time variability of LS1. The y-axis shows the difference in flux between different epochs in apertures of radius 009 centered on LS1 (error bars are 1-σ derived from random positions outside the arc). Each color corresponds to a different combination of epochs, as indicated by the legend inside the figure. The legend gives the time of each epoch in days after Ep1.

Under the assumption of single-star evolution and similar magnifications for the two components, their luminosity ratio creates an apparent age discrepancy, since the more luminous star is expected to evolve into the Teff < 20 000 K regime significantly ahead of the lower–L star, whereas both stars here seem to be observed in these short-lived states. Due to the degeneracies involved in this two-component fit, the current SED does not allow dust effects toward the two stars to be meaningfully constrained, but the effect of significant dust attenuation and reddening toward either star would raise its Teff and intrinsic bolometric luminosity beyond what is inferred from the current fit. In a scenario where the low–Teff star is more strongly affected by dust, as could happen in the case of significant circumstellar dust around red or yellow supergiants (e.g., Massey et al. 2005; Drout et al. 2012, but see Beasor & Smith 2022 for a different view), the luminosity gap between the two stars could be reduced or even reverted.

Another possibility is that the bluer, higher–L star is the rejuvenated result of a stellar merger, with extended lifetime as a result (Glebbeek et al. 2013; Schneider et al. 2016). Merger products of this type have recently been proposed to explain the extended main-sequence turnoff problem of young star clusters (Wang et al. 2022), in which massive stars with seemingly discrepant ages seem to coexist in the same cluster. While Lhigh Teff > Llow Teff pairs of stars do not appear in local samples of red supergiant binaries (Patrick et al. 2022), our constraints on the source size of Mothra do not necessarily require the two stars contributing to the SED to be a binary pair, as they could just be two bright members of the same star cluster.

Yet another potential solution is that the shorter-wavelength, non-varying part of the SED is dominated not by a single star but by the integrated light from many young stars in a star cluster that hosts the varying yellow or red supergiant. Because of its much larger source size (parsec-scale), the star cluster would be less magnified (μ in the hundreds) than the supergiant (μ in the thousands) and would still allow Mothra to appear point-like. This solution would not have any age discrepancy between the two components because the most luminous blue star in the cluster would have a bolometric luminosity similar to or below that of the red supergiant. A young, moderately magnified star cluster (age ≲20 Myr, mass ≲105M) plus a highly magnified supergiant star could also explain Mothra’s overall SED as long as there is enough dust reddening (AV > 0.5 mag) to produce the red SED slope at λ ≲ 1 μm. The main problem with scenarios of this kind is that it is difficult to find a solution where the cluster dominates the light at λ ≤ 1.15 μm, where no significant variabilitiy is seen, yet does not completely outshine the star at λ = 1.5 μm, where Mothra varies substantially (Fig. 4). As we have been unable to find a satisfactory solution of this kind, we consider this scenario less likely than the two-star explanation for the properties of Mothra. Even if a cluster model exists, the extra parameters it would require would not improve the fit to the observations. This consideration disfavors more complex solutions of this type.

4. Lens model

Diego (2023) described a new lens model for M0416. The model uses previous lensing constraints from HST (Zitrin et al. 2013; Jauzac et al. 2014; Diego et al. 2015, 2023b; Caminha et al. 2017; Richard et al. 2021; Bergamini et al. 2021, 2023) and a new set of constraints derived from the PEARLS JWST data. Only the small area containing the arc around Mothra (Fig. 1) is relevant for the present paper, but Diego (2023) gave full details of the model.

Figure 2 shows the strongly lensed z = 2.091 galaxy that hosts Mothra. The galaxy is imaged twice, forming an elongated arc with the cluster-lens critical curve (white solid line) passing close to the midpoint of the arc. With lensing constraints from b–b′ but not c–c′, the model critical curve is offset from the c–c′ midpoint by ≈03. Such offsets between predicted and observed positions are typical in lens models. For this location in particular, the proximity of the northern BCG makes the lens model less accurate because the BCG contributes significantly to the lensing potential, but this BCG has no radial arcs near it to determine an accurate mass for it. The test model with the arc at z = 1.827 is similar, but the model critical curve is displaced ≈04 toward the SE from the curve shown in Fig. 2. This has it passing between positions LS1 and c′. The strength of the critical curve is similar to the z = 2.091 case. At either redshift, the best estimate for the true position of the critical curve is obtained from the symmetry argument below, and hence our conclusions do not depend on the arc’s redshift.

A more precise estimate for the position of the critical curve can be obtained from basic lensing principles. Galaxies that intersect cluster caustics form arcs with distinctive features that repeat on either side of the critical curve. The magnification decreases as one moves away from the caustic, usually following a well defined law μ = A/D where D is the distance to the critical curve expressed in arcseconds, and A is a normalization constant that depends on the lens model. Different portions of the critical curve have different values of A. The largest values of A are often found on elliptical lenses at the extremes of their critical curves. These extremes correspond to the cusps of the caustics in the source plane. Near the critical curve, the law μ = A/D is relatively accurate, only departing from it when D ≳ 1″. For smaller separations, the critical curve is near the midpoint between an image pair. A more precise estimate can be obtained when there are two image pairs by taking magnification into account. Here the ratio of separations b − c/b′−c′ = 0.39/0.32 = 1.22 gives a good approximation for the magnification ratio on the two sides of the critical curve. Applying this ratio to the c–c′ separation (043) puts the critical curve from c′ (and 024 from c) as marked in Fig. 2. This is 007 from LS1. Images in the region to the northwest of the critical curve have positive parity (that is, similar orientation to the unlensed image) while images appearing to the southeast of the critical curve have negative parity. In the absence of micro- or millilenses, the magnification of both negative and positive parity images is often very similar as shown in Fig. 5. When micro- or millilenses are present, the magnification of microimages corresponding to very small sources (such as stars) can be very different depending on the parity. Microimages of lensed stars with positive parity are often magnified (with respect to the macromodel magnification) by micro- and millilenses while microimages with negative parity are often demagnified (with respect to the macromodel magnification) by small deflectors (micro- and millilenses). An important takeaway from this section is the fact that LS1 is observed on the side with negative parity, so demagnification is expected, yet LS1 is observed but not its counterimage LS1′ which has positive parity and hence expected to often have more magnification that the one predicted by the macromodel. This issue will come later on when we discuss the interpretation of Mothra. A major difference between LS1 and other extremely magnified stars such as Earendel is that LS1 is not on the critical curve. The implication is that there should be a counterimage of LS1 ≈07 on the other side of the critical curve, but no such image is obvious in the current data.

thumbnail Fig. 5.

Magnification versus distance. The solid black line represents the law μ = A/D, with D being the distance to the critical curve expressed in arcseconds and A = 62″. The blue and red points are measured magnification values from our lens model, at the position of the critical curve intersecting the arc and in a direction perpendicular to the critical curve. For the critical curve we assume the is at z = 2.091. The blue points correspond to magnification values measured on the side with positive parity (northwest of the white solid curve in Fig. 2) and the red points are for magnifications measured on the side with negative parity (southeast of the same curve). At a distance of D = 0.07″, the black curve predicts magnification ≈8855.

Another important parameter is the magnification at the position of LS1. The Diego (2023) model gives normalization factor A = 62″ at the position of the z = 2.091 arc. With LS1 007 from the critical curve, μ ≈ 885. LS1 appears unresolved, and the magnification sets an upper limit on source size R < 1 parsec. A compact group of stars such as R136a would fit the size and luminosity constraints but cannot explain the lack of counterimage. To explain this, the source must be very luminous and magnified by extreme values on one side of the critical curve, while on the other side its counterimage is magnified by a factor close to the most likely value from the macromodel, μ ≈ 885. This situation is similar to Godzilla, where among the several counterimages predicted, only one is observed (Diego et al. 2022) thanks to the extra magnification provided by a nearby millilens. The lack of counterimage detection at 29.5 mag together with μ ≈ 885 gives an upper limit on the source’s intrinsic flux density corresponding to ≈36.9 mag and absolute magnitude fainter than −7.5. Supergiant stars fall into this category and are a prime candidate to explain Mothra.

The detection of LS1 at ≈28 mag requires a boost in magnification of at least ≈1.5 mag above the magnification of the counterimage. Furthermore, this boost must have been maintained over at least 8 years because the source was already visible in 2014. If microlensing is involved to explain LS1, high sustained magnification can be obtained only if the star is moving nearly parallel to the direction of the microcaustic and very close to it. That is, such a scenario would require a very high degree of fine tuning between the relative direction of motion and the orientation of the microcaustic. This scenario is explored in more detail in Sect. 8.

Finally, another important variable from our lens model is the value of the radial magnification, which to first order can be considered more or less constant along the arc. The lens model predicts for the convergence and shear, κ ≈ 0.85 and γ ≈ 0.15 respectively, at the intersection between the critical curve and arc. This results in a radial magnification factor of μr ≈ 3.3. The tangential magnification changes rapidly as one moves away form the critical curve following the canonical μ ∝ D−1.

5. An extremely magnified and compact source

Although we have already established the possible interpretation of LS1 as a binary star undergoing extreme magnification, before proceeding any further it is imperative we consider other more mundane possible interprations. LS1 and the absence of a counterimage can be interpreted in only a few ways:

  • (i)

    LS1 is a transient event, and due to time delays, we have not yet seen the transient’s counterimage, or alternatively, the counterimage has already disappeared.

  • (ii)

    There is in fact an image pair, but the two are so close to each other that they appear as an unresolved image. This scenario would be similar to Godzilla or Earendel.

  • (iii)

    The source is a foreground object, and we are seeing a projection effect.

  • (iv)

    Microlensing is temporarily increasing the flux of LS1 but not its counterimage.

  • (v)

    Millilensing is temporarily increasing the flux of LS1 but not its counterimage. The difference from microlensing is that the timescale would be longer.

The next subsections explore these possibilities.

5.1. LS1: A transient event

Among the possible scenarios above, (i) can be easily discarded because the lens model gives a time delay between b and b′ between 120 days (for a lens model that uses only the available spectroscopically confirmed systems) and 230 days (for a lens model that in addition uses the newly discovered JWST lensed systems, and with estimated geometric redshifts from the lens model). For LS1 and its counterimage the time delay, would be even shorter because the separation between them is smaller than the separation between b and b′. Because LS1 has been observed for at least 8 years, any intrinsic change in flux that makes it detectable on one side of the critical curve should make it detectable on the other side as well within the time frame covered by our JWST observations. The same argument applies even for the NIRCam observations alone, which span 126 days.

5.2. LS1: An unresolved pair of images

A key element to interpret the different possibilities is the precise position of the critical curve in relation to the source. As discussed in tye previous section, the critical curve cannot be at the position of LS1 but it must be offset by . Its counterimage should be found at a similar distance form the critical curve but in the opposite direction. Finding LS1 007 from the critical curve (Sect. 4) rules out scenario (ii). The fact that the source is found on the side with negative parity is not surprising perse. Microcaustics on the negative-parity side are in general more powerful than the corresponding caustics on the positive-parity side (for the same macromodel magnification and microlens mass). This is consistent with flux conservation arguments, since the more powerful caustics for images with negative parity compensate for the areas of demagnification which are present on this side of the critical curve, but do not exist on the side with positive parity. What is more surprising is that the magnification must have remained nearly constant for at least 8 years. Comparison between the 2014 F160W ACS data and the 2022–2023 F150W NIRCam data shows no evidence of flux variation at the position of LS1 where LS1 is fully blended with the arc in the F160W image. However, this test is limited by the relatively low resolution of HST compared to JWST. Section 6 addresses the variability of LS1 during the four NIRCam epochs.

5.3. LS1: A globular cluster or dwarf galaxy

As shown in Fig. 2, the LS1 arc is surrounded by several compact sources. These could be globular clusters or remnant galactic cores that have had their outer envelopes tidally stripped, but for simplicity we will refer to them as GCs. In the area shown in the figure, the number density of GCs is ≈1.6 arcsec−2. The area between c and c′ (the separation between c and c′ times the arc’s thickness) is ≈0.05 arcsec−2. The probability of a compact source falling in this region is thus ≈8%. Although relatively small, this probability is large enough that the possibility of LS1 being a GC in M0416 needs to be considered seriously.

LS1 is brighter than any of the GCs, making the probability that it is one considerably smaller than 8%. LS1 is also bluer than the nearby GCs. This is evident in Fig. 1, where none of the GCs is detected in HST’s F606W, and only one can be seen in F814W. Figure 6 gives more quantitative comparison via a color–magnitude plot. LS1 stands out as the bluest object with F090W < 28.5 mag. Although this suggests LS1 is not a GC, it does not entirely rule out its being an object along the line of sight, for instance in M0416 (and hence not magnified).

thumbnail Fig. 6.

Color-color plot for the globular clusters. The red points correspond to globular clusters found in the central regions of M0416, and near LS1. The blue dot is for LS1. Magnitudes are computed after fitting the star-based PSF model. LS1 is clearly an outlier when compared with the globular clusters. Errors in the magnitudes have been omitted for clarity purposes. The horizontal dashed is the mean F200W − F090W of the globular clusters. The blue dotted ones represent 1 standard deviation.

If LS1 is one of the many globular clusters observed near it, the SED will reveal the stellar population parameters. We fit the observed SED with the BayEsian Analysis of GaLaxy sEds tool (BEAGLE: Chevallard & Charlot 2016). The fit used the Bruzual & Charlot (2003) stellar population models with a Chabrier (2003) initial mass function (IMF) and a single stellar population (SSP) at fixed redshift z = 0.396. Figure 7 shows the posterior distributions of the fit parameters. If LS1 is indeed a globular cluster, it must be massive (log(M/M) > 6.8) and heavily dust-obscured (τV > 2.7). These properties are more consistent with a tidally stripped galaxy nucleus than a globular cluster. That would make the mass even higher because a galaxy nucleus, as a compact stellar structure with large escape velocities, should contain a non-negligible amount of dark matter. A fundamental problem with this picture is that a mass as high as ≈107M would change the relative magnifications of image pairs c and c′. This is not observed: these two knots have flux ratios close to unity, ruling out the hypothesis that LS1 is a ≈107M stellar structure at z = 0.396.

thumbnail Fig. 7.

Posterior distributions of the four BEAGLE SSP fit parameters assuming Mothra is a globular cluster at z = 0.396. The four derived parameters are stellar mass M, stellar age tage, dust attenuation optical depth τV, and stellar metallicity Z. The blue contours represent the 68%, 95%, and 99% contours of the full joint posterior distribution respectively and the gray shaded are represents the (marginalized) 68% interval of each individual parameter.

LS1 could have a lower stellar mass if it is less distant than M0416. A globular cluster at z = 0.12, where a few galaxies are found in the field, would have stellar mass ≈106M and would not unduly distort the c/c′ magnification ratio. The problem is that the fit of this model to the LS1 SED is poor. In particular, the observed flux densities at λ > 3 μm are ≈0.85 mag above the observed ones. (At low redshift, rest-frame and observed wavelengths are nearly the same, and globular clusters do not exhibit infrared excesses). On the opposite direction, if LS1 is a foreground object behind the cluster but in front of the lensed galaxy, its mass needs to be even larger in order to fit the photometry, worsening the lensing constraints from c and c′. Only if the redshift of LS1 is very close to the lensed galaxy (but below it), could LS1 appear as not multiply lensed and have a small lensing effect on the background galaxy. But this would require an extraordinary (and possibly entirely new) type of object since it would also be extremely magnified yet unresolved. It must be also very dust-obscured in order to explain the observed SED. Such an object should not show time variability or have a counterimage, two features typical of lensed stars which we discuss in more detail in Sects. 6 and 73. All in all, the possibility of LS1 being a foreground object is ruled out.

5.4. LS1 as a microlensing event

Microlenses are expected to be present in relatively large numbers at LS1’s position near the northern BCG in M0416. However, stellar microlenses produce relatively small caustics that can boost the magnification by the necessary factor (≥4 with respect to the macromodel magnification) only for small periods of time (usually weeks to months for typical relative velocities). A possible solution to this small time scale is if the background star is moving paralell to a microcaustic. This would boost the image with negative parity (LS1) during more extended periods while leaving the counterimage with positive party (LS1′) undetected.

In order to test the microcaustic hypothesis, we simulated a microlens near the critical curve on the side with negative parity. We adopted values for the tangential and radial magnification μt = 250 and μr = 3.3, consistent with the lens model. This results in a total magnification μ = 825. For the microlens, we adopted a mass of 1 M. Most microlenses in the intracluster light are expected to be lighter, but a few could be even more massive. More massive objects include also remnants (neutron stars and black holes) and could potentially include massive but compact candidates for dark matter such as primordial black holes. A mass of 1 M offers a good compromise between all these scenarios. Figure 8 shows the magnification and caustics from this microlens from a standard ray-tracing technique at spatial resolution of 10 nanoarcseconds per pixel. The macrocaustic at LS1’s position is smooth: a nearly horizontal line a few parsecs (equivalent to a few hundred microarcsceconds) away from the microlens.

thumbnail Fig. 8.

Caustics around a 1 M microlens on the side with negative parity at the redshift of the cluster lensing a background source at z = 2.091. The macromodel magnification in this region is ≈850, and the gray scale indicates the combined magnification μ of the macromodel and the microlens. The black horizontal band corresponds to the demagnification region that exists only for images with negative parity. The microlens can demagnify a lensed star down to μ ≈ 10 in this band. Lighter triangular regions above and below show high-magnification regions. The yellow horizontal line illustrates a trajectory grazing the bottom horizontal caustic, and the black line is for a trajectory 0.1 μas south of the yellow line. These lines correspond to the yellow and black curves in Fig. 9.

The modeled caustics form two triangular shapes (Fig. 8). The largest magnification factors are found near the cusps of the caustics in two regions ≈10 μas in maximum size. Because the microlens is on the side with negative parity, there is a central ≈1 μas region adjacent to the caustics that can demagnify sources by several magnitudes. In order to be at least one magnitude brighter than the counterimage, the source must be close to the region of maximum magnification. Figure 8 illustrates two trajectories for a source moving parallel to the demagnification zone, and Fig. 9 shows the magnifications, i.e., light curves, corresponding to these trajectories and to two others. While it is possible for a 1 M microlens to maintain a nearly constant, high magnification for eight years (≈1 μas if the relative velocity vr = 1000 km s−1), to do that, the microlens’ relative motion must be perfectly aligned with one of the microcaustics. The degree of fine tuning required makes this scenario unlikely.

thumbnail Fig. 9.

One-dimensional magnification (light curve) of a star moving in a horizontal direction across the microlens shown in Fig. 8. The black and yellow curves correspond to the tracks shown in yellow and black respectively in Fig. 8. The track for the blue curve is also horizontal and is midway between the black and yellow tracks. The red curve corresponds to a track one pixel (10 nanoarcsec) above the yellow track, where the trajectory intersects both the caustic and the area of demagnification. The radial magnification from the macromodel is 3.3, and the tangential one is 250. At large distances from the microlens, the magnification converges to the macromodel value μ = 825. The small scale fluctuations are pixel noise from the simulation.

The requirements on the direction of motion of the microlens can be relaxed if the relative velocity is smaller or if the microlens is more massive. A smaller relative velocity keeps the source in the high-magnification region longer. For the redshifts of the lens and background source, one could consider smaller values for the velocity by a factor ≈2, but this requires fine tuning between the relative motions of the observer, lens, and source. Due to the high degree of fine tuning required to explain LS1 as a microlensing event we deem this possibility as unlikely. We conclude this section by leaving the microlensing hypothesis as the only viable possibility to explain LS1 as a long duration anomaly in the flux of a lensed star.

Increasing the mass of the perturber allows a wider range of distances between the source and microcaustic because the strength of the microcaustic scales with the mass of lens. This also eliminates the need for fine tuning in the relative direction of motion since multiple trajectories near the caustic of the perturber can maintain the needed magnification for at least 8 years. The next two sections present additional evidence in favor of millilensing, and Sect. 8 describes the millilens scenario in detail.

6. Time variability

If LS1 is a small galaxy or globular cluster along the line of sight, it would not vary in flux unless it hosts a variable source such as an AGN. Microlensing of a bright star in a globular cluster of MACS0416 is very unlikely as the microlens has to have z < 0.3 in order to be an effective lens, and no foreground galaxy is observed near the line of sight. A dark microlens (brown dwarf or stellar remnant) in our Galaxy could magnify a bright star in a globular cluster at this redshift, but this would be extremely unlikely at Galactic latitude −44° and only 40° from the Galactic anticenter. This scenario would also require exceptional fine tuning to perfectly align the microlens with one of the brightest stars in the globular cluster. Microlensing of more modest stars in the cluster would result in smaller than observed changes in the flux as discussed by Dai & Pascale (2021). If instead LS1 is a small source such as a (binary) star, variability is expected because supergiant stars (especially cool ones) are often variable (Kiss et al. 2006; Yang & Jiang 2011). Also, microlensing from stars in the intracluster medium (ICM) should cause small changes in observed flux over time. At macromodel magnification factors of order 1000, as expected for LS1, even a modest mass density of microlenses of a few M pc−2 should produce observable variations (Diego et al. 2018). Also, as shown by Diego et al. (2023b), M0416 is expected to lens a wealth of z > 1 red supergiant stars that happen to fall near cluster critical curves. These stars are often intrinsic variables.

To search for time variability of LS1, we measured its flux in fixed apertures of 009 radius on each of the four single-epoch images and also on difference images described in Appendix B. PSF subtraction was not used here because aperture photometry is less sensitive to position-angle differences between epochs, and the non-variable arc contamination does not matter. Yan et al. (2023, their Table 4) did photometry by a different method, and the results are consistent. Figure 4 shows LS1’s flux changes in the four epochs. A clear trend with time is apparent with LS1 showing a large increase in brightness from Ep1 to Ep2 and then smaller increases to Epc and then Ep3. The changes are largest in the F410M filter, where the change from first to last epoch is 5σ. Similar but smaller fluctuations are observed at 1.5 μm and longer wavelengths. In contrast, there are no significant variations at 0.9 or 1.15 μm. This chromatic effect can be easily understood if the source is a binary star (or an unrelated pair of stars at separations of 1 parsec or less) with a blue and red component, and the red component is intrinsically varying in flux. The variation is about 0.65 mag (Yan et al. 2023) in F410W and F444W (rest wavelengths 1.3−1.4 μm) in 41 rest-frame days.

Difference imaging shown in Fig. 10 confirms that the variability is coming from LS1 and not some unrelated source. While some filters such as F277W, F356W, and F444W show a small offset between the variability peak and the position of LS1, this can be understood as a PSF effect. The NIRCam PSFs are asymmetric, and epochs 1 and 3 were taken at position angles (PA) differing by 42°. PA differences have the biggest effect when the source is variable, as discussed further in Appendix D.

thumbnail Fig. 10.

Difference images between first and last epochs. Each panel shows a region of centered on LS1 and in filters as labeled in each panel. A source that brightened between the two epochs shows up as dark in the figure. White crosses mark the position of LS1, and black crosses mark the positions of knot b and its equally bright counterimage b′. All images have been smoothed with a Gaussian kernel with to increase contrast.

7. The likely counterimage of LS1

The final piece of evidence in favor of LS1 being a lensed star would come from the detection of its counterimage, LS1′, on the other side of the critical curve. In a classic lensing scenario, where micro- or milli-lensing is not perturbing either of the images, we would see both images of the lensed star and with similar magnitude. Seeing an image pair would favor the lensing scenario over the projection-effect scenario because an object in the foreground would not be multiply lensed. As mentioned above, no LS1′ is directly detected in the images, but the bright arc limits the sensitivity. To subtract the arc, we created difference images in every possible wavelength pair after matching the image resolutions and in some cases normalizing the images. The results of all possible differences are shown in Figs. B.1B.3. Appendix B gives details.

Most of the difference images show no evidence for additional point sources, but there are some differences in F200W − α * F090W in Figs. B.2 and B.3. For this difference and also in other F150W − F090W and F150W − F115W, there is a significant negative fluctuation near LS1 (Fig. 11) close to the expected position of LS1′ if the critical curve is at the position of the dashed line in Fig. 2. The significance of this negative fluctuation is ≈4σ, so we cannot rule out its being an unfortunate noise fluctuation, but the fact that similar (but less prominent) fluctuations can be observed in other differences suggests that this is possibly a real source. The difference shown in Fig. 11 was obtained after adding all three differences shown in Fig. B.2 in the column labeled F200W.

thumbnail Fig. 11.

Stacked difference , where αi was chosen to minimize the contribution from the arc to the difference, and FnnnWi are all filters with wavelengths below 2 μm, that is F090W, F115W, and F150W. These were degraded to the resolution of F200W using the star-derived PSF in Appendix B. The individual differences are shown in Col. 3 of Fig. B.2. The stacked image has been smoothed with a Gaussian of to increase the contrast. The position of LS1 is marked with a white arrow. The position of the possible counterimage LS1′, ≈01 from LS1, is marked with a yellow circle.

Seeing a negative signal, which for simplicity we refer to as LS1′, in the difference image implies a color difference from LS1, which is seen as a positive source. This is only possible for a counterimage of LS1 if LS1 is a composite object, for instance a binary with a blue and a red component. The negative signal implies that in LS1′, the blue component is being magnified more than the red component. This is possible if microlensing affects the blue component more than the red one, which is possible for wide binaries or for a small group of stars with two dominant supergiants, one red and one blue. Given the relatively low significance of LS1′, we cannot be certain this is the counterimage of LS1. Additional monitoring is needed in order to confirm or reject this hypothesis. There is so far no significant time variability at the LS1′ position between the different epochs with JWST data, but if microlensing is the cause of the chromatic effect, some variability should be observed in future observations at this position.

Figure 12 shows a different difference-image search for LS1′. Here the F090W was PSF-matched to the F200W one using a convolution kernel generated with WebbPSF models. The local background near LS1 is ∼1.87× higher in the F200W image than in F090W, and F090W was normalized by this factor and subtracted from F200W to minimize the local background in the neighborhood of LS1 and LS1′. The difference shows a clear negative signal ≈01 from LS1. There are even hints of a source at that location in the unsubtracted images. LS1 and LS1′ both show negative signals, meaning that they are bluer than the local background. PSF photometry using the WebbPSF model (details described by Yan et al. 2023) gives flux densities in the difference image of 6.30 ± 2.56 and 3.64 ± 2.56 nJy for LS1 and LS1′, respectively. This result is consistent with the one shown in Fig. 11. The difference between the two results, a positive residue in LS1 and a negative one in LS1′ in the first and a negative residue for both LS1 and LS1′ in the second, could in part result from the different PSFs (star model in the first case and WebbPSF in the second one). More likely, though, is the different normalization with LS1′ being compared to LS1 in the first case and to the local background in the second.

thumbnail Fig. 12.

Independent result confirming the likely counterimage. From left to right, the first and second panels show the (negative) original images of the arc in F090W and F200W; the third panel shows the convolved image from F090W to F200W using WebbPSF models; the fourth image shows F200W − 1.87 × F090W to subtract off the local background in the neighbor of LS1. In all panels, red and blue circles show the LS1 and LS1′ positions, respectively.

In summary, LS1′ is likely the counterimage of LS1, which would definitely confirm the strongly lensed nature of Mothra. In this case, the critical curve must pass very close to the middle point between LS1 and LS1′. If so, the separation between LS1 and the critical curve is instead of . This would raise the macromodel magnification for LS1 and LS1′ to and a physical separation for the binary of less than 0.75 pc.

8. Millilensing interpretation of LS1

Having excluded the possibility that LS1 is a transient event, an unresolved double image on the critical curve, a projection effect (for instance one of the globular clusters found nearby) or a microlensing event, the only surviving hypotheses from Sect. 5 is millilensing at the position of LS1. On this side of the critical curve, images have negative parity (saddle point) and have a relatively large probability of being demagnified with respect to the macromodel magnification (e.g., Diego et al. 2018). Because LS1 has been detectable for at least 8 years, while its counterimage LS1′ has remained undetected for the same period, the only possible explanation is that a substructure is boosting the magnification locally at the position of LS1 by at least an extra factor of 4 with respect to the macromodel value (μmacro ≈ 1000). That makes the total magnification at the position of LS1 ≥4000. Microlenses can attain this only if the relative direction of motion and velocity of the background star is fine tunned. A more massive millilens has a larger caustic region which does not require any fine tunning.

A millilens with 104 < ML < 2  ×  107M can explain the large differential magnification between LS1 and a putative LS1′. The source is observed only on the side with negative parity because only on this side can produce differential magnification > 2 mag. The more massive millilens also has a larger high-magnification region than a microlens, allowing the differential magnification to persist for more than eight years, independent of the relative direction of motion of Mothra. We therefore consider the millilens hypothesis the most likely explanation for the anomalous magnification of LS1. At the low end of the mass range, 104 ≲ ML ≲ 106M, a possible source would be a small globular cluster that would not be detected in JWST images. On the high end, the millilens could be a massive globular cluster or galactic core remnant. Tidal stripping by multiple close passes by the BCG (projected distance kpc) could have made the object very compact. A central black hole would make the galaxy’s mass even more compact. Remnant cores of very massive galaxies can, however, be ruled out because such a galaxy would have a central SMBH mass exceeding the upper limit derived below.

8.1. Minimum mass of the millilens

To study the millilens possibility in more detail, we constructed a set of lens models with ML between 104M and 107M. Each model had a millilens forming a smaller critical curve ≈70 mas from the macrolens critical curve, and we simulated an area large enough to contain the source and the critical curve. The pixel size was 50 μas for the higher-mass lens and 1 μas for the lower mass. The latter scenario considered three mass distributions for the millilens: a point source, a Gaussian with σ = 10 pc, and a Gaussian with σ = 20 pc. (These correspond respectively to an intermediate-mass black hole or core-heavy globular cluster, a loose globular cluster, and a small dwarf galaxy). The macromodel magnification was fixed to μ = 800 with radial component μr = 3.3 as predicted by the lens model. Microlenses from the intracluster medium were not included in the simulation, except for the most massive millilens (just for illustration purposes Fig. 13 shows a case with microlenses). Figure 14 shows the resulting magnifications near the critical curve for three sources at different distances from the critical curve, and for two different millilenses. All three sources are stretched into lines (because μt ≫ μr), but this effect is below JWST’s resolution. All three sources produce image pairs, one image on each side of the cluster critical curve, but the counterimages for the two sources most distant from the critical curve are outside the image boundary on the left. The source near the caustic of the macromodel shows two images near the critical curve with μ > 10 000. The source magnified by the millilens has μ ∼ 4000, but its counterimage has only the macrolens magnification of ≈800. The source farthest from the critical curve is magnified mostly by the cluster and has μ ∼ 1000. For the ML = 105M case, the critical curves increase in radius by , and the probability of magnifying a background source by factors of several thousand increases accordingly.

thumbnail Fig. 13.

Millilens model with mass 107M near the critical curve and magnifying LS1. The millilens is farther from the critical curve than in Fig. 14 but still forms critical curves at the position of LS1. This simulation includes both LS1 and i LS1′ as well as the knot image pair b and b′. The geometry of the simulation mimics the observations, with a separation between b and b′ of 045 and a tangential separation (i.e., in the vertical direction) between LS1 and knot b′ of 30 mas. The numbers in yellow indicate the approximate magnification at the position of the corresponding counterimage. In this configuration, LS1′ is ≈2.5 mag fainter than LS1 and therefore unobserved. For this mass, the effect on knots b and b′ is noticeable with the magnification of b′ affected by the millilens. Stars in the intracluster medium with a surface mass density of 5 M pc−2 blur the critical curves and reshuffle large magnifications away from the critical curve. The red double-arrow segment shows the width of the saturation region as predicted by Diego et al. (2018, their Eq. (15)).

thumbnail Fig. 14.

Millilens models near the critical curve. Gray scale shows the logarithm of the magnification at each location with the critical curve from the cluster being the white vertical line and the critical curves from the millilenses being white ovals. Each panel has a millilens 007 (about twice the resolving power of JWST) to the right of the macrolens critical curve. The top panel is for a millilens with 104M, and the bottom panel is for a millilens ten times more massive. (The critical curve in the bottom panel moved toward the right owing to the contribution of the millilens to the total convergence). Three sources, modeled as Gaussians with σ = 2 μas = 0.0168 pc and marked with arrows, are being lensed. The source labeled “normal image” is far from the critical curve and magnified by the cluster macrolens with only a small contribution from the millilens. The source labeled “Images magnified by millilens” is magnified by the millilens. The source labeled “Pair of images near cluster critical curve” is close to the caustic of the cluster and forms a pair of images.

Figure 15 shows the model magnifications computed in the image plane. The point source produces sharp magnification peaks and dips. The smoother Gaussian profiles can be subcritical, that is, not producing critical curves around the millilens. Criticality can be obtained for smaller values of σ. For instance for σ = 1 pc, the deflection field can produce real critical curves around the deflector. Despite being subcritical, a deflector with mass M = 104M and σ ≲ 10 pc can produce magnification factors large enough and lasting > 10 years independent of the direction of motion of the background source provided it is close enough to the region of maximum magnification.

thumbnail Fig. 15.

Simulated magnification across the arc in the lens plane. The critical curve is at the origin. A perturber with 104M is placed ≈007 from the critical curve. Colors show three assumptions perturber profiles: light blue for a point source, dark blue for a Gaussian profile with σ = 10 pc and red for a Gaussian profile with σ = 20 pc. Left panel shows a wide range, and the right panel is a magnified version around the perturber.

Figure 16 shows a high-resolution view of the caustic in the source plane. At distances ≲28 μas from the origin, μ ≈ 10, much smaller than the macromodel μ = 800. Just outside this region, the magnification can be very high, and there is a range ≈1 μas wide where μ > 4000. A star in this region would be brighter by > 1.75 mag compared with stars magnified only by the macromodel. At relative speed of 1000 km s−1, a star moving directly toward the millicaustic takes ≈8 years to cross this distance. If it is moving more nearly parallel to the millicaustic, it would take longer to leave the region of high magnification. It would also take longer if the transverse velocity is lower. If the mass of the millilens is larger, the thickness of the region parallel to the microcaustic with magnification greater than 4000 increases linearly with the mass of the millilens Palencia et al. (2023). This results in an increase of the thickness of the caustic region with μ > 4000 by a factor ≈2. Considering all these factors, millilens masses > 104M can easily provide μ > 4000 for ≥8 years.

thumbnail Fig. 16.

Magnification in the source plane in one of the two millicaustics. Curves are for ML = 104M (lower) and 4 × 104M (upper). The origin is at the projected position of the millilens. Both curves are for a Gaussian millilens profile with σ = 2 pc.

8.2. Maximum mass of the millilens

A simple upper limit on the millilens mass can be established from the lack of obvious lensing magnification on the nearby knot c′. If the Einstein radius of the millilens were ≳1/3 of the distance between LS1 and c′, c′ would be magnified by much more than c. This requires . for the 104M millilens, and the Einstein radius scales as the square root of the mass. This requires ML ≲ 2 × 107M in order to magnify LS1 but not c′. A millilens with this mass would have a large region with μ > 4000, but, as shown in Sect. 5.3, a stellar population with this much mass would be luminous enough to have as much flux as LS1.

A more detailed model sets a tighter limit on the maximum mass of the millilens. Figure 13 shows the case of a massive millilens with mass equal to 107M. The pixel scale for this simulation is 50 μas. The millilens was placed to put its critical curves 70 mas from the critical curve of the cluster and avoid merging the cluster and millilens critical curves. To simulate the background sources, we again adopted Gaussian profiles with LS1’s source (Mothra) having σ = 0.04 pc. Stars would be much smaller than this and better approximated by a disk, but the Gaussian model suffices for illustration. The model accounts for the effect of the millilens on the neighboring knots c and c′. The source of these knots was modeled as a Gaussian with σ = 0.85 pc, mimicking a compact star-forming region or globular cluster. The positions in the source plane were adjusted to make all lensed images resemble the observation. The result is shown in Fig. 13. The ratio of magnifications between LS1 and LS1′ is > 5, as required in order to observe LS1 but not LS1′.

For illustration purposes, the model includes the effect of microlenses, ignored so far. For the microlenses, we used a simple approximation4 that the deflection field is a Gaussian random field (Dai & Pascale 2021) with dispersion directly proportional to the surface mass density of microlenses, Σ*. Real N-body simulations in smaller areas have shown that σ ≈ 0.036Σ*/(M pc−2) μas, and the model used a conservative Σ* = 5 M pc−2. We included microlenses by adding a random Gaussian field to αx and αy with dispersion σ = 0.036Σ* = 0.18 μas. For this value of the surface mass density, A = 62″ and μr = 3.3, and the width of the saturation region, i.e., the region where microlenses fully disturb the critical curve region, is (Diego et al. 2018, their Eq. (15)). Our simple simulation (Fig. 13) reproduces this width. Even for point sources as small as stars, the magnification within the saturation region typically cannot exceed ∼100 000. In the presence of microlenses, the sharp critical curves are substituted by a corrugated network of less powerful microcritical curves. Extreme magnification factors of order one million or more for background stellar objects, possible when microlenses are not present, are prohibited when microlenses are included in the calculation. To compensate, regions that are farther from the critical curve, where without microlenses the macromodel magnification would be 𝒪(1000), can momentarily magnify a background star by 𝒪(10 000). The time-integrated flux of a background star crossing the entire network of microcaustics (taking hundreds or even thousands of years) is the same whether microlenses are present or not, as required by flux conservation.

The situation represented in Fig. 13 corresponds to an extreme situation where LS1 is the result of the merging of two microimages, each with magnification ∼10 000. The result is ≳3 mag difference between LS1 and LS1′. If the two microimages are close enough to each other, the pair appears as an unresolved single source. If the source of LS1 moves farther away from the caustic of the millilens, the magnification is reduced, but the separation between the pair of microimages responsible for LS1 increases. For magnification factors ≈10 000, the separation between microimages is comparable to the pixel size in NIRCam images, which is approximately the resolving power of the telescope. Smaller magnification factors (i.e., larger separations) would make the source appear resolved, in conflict with the observations.

The Fig. 13 model predicts magnification factors for knots c and c′ between 130 and 200. The observed flux ratio is c/c′ ≈ 1, while the magnification ratio derived from the geometric distances between knots b–c and b′–c′ is (b−c)/(b′–c′)≈1.25. This magnification ratio is given by the macromodel magnification and is basically unaffected by microlenses or millilenses with masses similar to those considered. However, the Fig. 13 simulation predicts μc/μc′ ≈ 1.4, inconsistent with the observations. This is better illustrated in Fig. 17, which shows a one-dimensional plot of the magnification pattern of the model. The magnification of c′ is clearly smaller than the magnification of c in all cases because c′ falls within the demagnification region of the millilens. Thus the millilens increases the flux ratio between c and c′, in contradiction with observations. Eliminating the discrepancy requires a smaller millilens mass, which will make c′ fall in the magnification region and bring μc/μc′ closer to the observed flux ratio of 1. This sets an upper limit ≪107M on the mass of the millilens. This simulation has LS1 in the “lower” branch of the oval critical curves around the millilens. Because the tangential separation between LS1 and c′ is fixed, putting LS1 in the “upper” branch would move c′ closer to the millilens, and the magnification would be even larger. This would require an even smaller mass for the millilens not to violate observational constraints.

thumbnail Fig. 17.

Magnification of knots c and c′ for the model shown in Fig. 13. The y-axis shows the magnification along lines perpendicular to the critical curve. The small-scale fluctuations are due to microlenses. The position of the critical curve is shown as a vertical dashed line and the positions of c and c′ by vertical solid lines. The black curve shows the magnification along a horizontal line in Fig. 13 crossing 35 mas south of the millilens. The other two lines are at 20 mas and 50 mas as labeled. The three lines cover the range of estimated distances between c′ and the possible millilens.

Even with a mass of 107M for the millilens, it is difficult to simultaneously explain LS1, its lack of counterimage above the detection limit, and the separation and relative flux of c and c′. In addition, a millilens stellar mass > 107M would be directly visible in the JWST images (Sect. 5.3). The distance from the millilens to its critical curves is > 60 mas (Fig. 13), and the millilens–LS1 pair would be resolved, contradicting observations. The JWST resolution limit is nearer 30 mas, and the limit on the size of the critical curves can be scaled accordingly. The mass goes as the square of the size of the critical curve, and therefore the mass limit goes down by a factor of four. If stellar system with a mass of 2.5 × 106M would be detectable by JWST, we can rule out masses above this one on the grounds of the unresolved nature of LS1. Hence we adopt an upper limit for the mass of the millilens of 2.5 × 106M. This is about the minimum stellar mass that would be detectable by JWST at M0416’s distance (see Sect. 5).

One final issue to consider is that some or all of the millilens mass could be in the form of a black hole. Black holes are commonly found at the centers of ultra-compact-dwarfs (UCDs) orbiting around massive galaxies, such as BCGs. If the millilens contains a SMBH, by the same arguments discussed above, this should have a mass smaller than 2.5 × 106M.

A model for a millilens with ML = 2.5 × 106M (and Gaussian profile as before) is shown in Fig. 18. The spatial configuration of sources and millilens reproduces the observations. The magnification ratio between LS1 and LS1′ is > 4 as required. A third counterimage is predicted closer to the millilens but with much smaller magnification and therefore undetectable. This third counterimage would be even smaller with a more cuspy profile but with the same total mass. The model flux ratio between c and c′ is 0.9, close enough to 1.0 to agree with observations. Finally the separation between the millilens and LS1 is comparable to the resolving power of JWST, so even if the millilens is luminous enough to be observable by JWST, the LS1+millilens pair would still appear unresolved. The model is thus fully compatible with the observations.

thumbnail Fig. 18.

Model for ML = 2.5 × 106M, the maximum mass allowed for a millilens. Microlenses will be present and distort the critical curves as in Fig. 13, but their effect is omitted for clarity. The magnification for each image is indicated in yellow. The millilens is 43 mas from the magnified image LS1 as shown by the labeled scale bar. A third predicted image of Mothra with magnification ≈50 (therefore undetectable) is marked by a yellow circle.

9. Discussion

Summarizing our main results, LS1 is detected with F200W apparent magnitude ≈27.8 AB at ≈005 from the critical curve. The macromodel magnification at this position is predicted to be 𝒪(1000). While the F200W and shorter-wavelength images show hints of LS1′, the counterimage of LS1, at ≈01 from LS1, LS1′ has an apparent magnitude > 29.5. This translates into a differential magnification > 4 between LS1 and LS1′. Like LS1, LS1′ should have a macromodel magnification of 𝒪(1000). Hence, the magnification of LS1 must be ≳4000. A millilens placed along the line of sight to LS1 provides the needed extra boost in magnification making the net magnification of LS1 ≳4000. At this magnification, the earlier size constraint R < 1 pc tightens to R < 0.25 pc. This smaller value tightens the constraint on the size of the source but still does not rule out the possibility that Mothra is composed of a very small group of stars containing at least one variable red supergiant and either one blue supergiant or a few luminous blue stars adding up to a blue supergiant’s luminosity.

The lensed background source Mothra is likely composed of at least two supergiant stars, one with T ≈ 5000 K and the other one with temperature T ≈ 14 000 K. Adopting an observed luminosity log10(μLbol/L) = 8.4 for the red SG, and assuming the magnification factor is 5000, this implies an intrinsic bolometric luminosity Lbol > 5 × 104L. The bolometric luminosity of the blue SG is approximately 2.5 times larger.

The only plausible scenario for the lens is for it to have a mass at least 104M in order to create a millicaustic with a region around the caustic having magnification factors at least 4000 (2.5log10(3500/825)≈1.5 mag boost in relation to LS1′) and thickness at least 1 μas (distance travelled by a source moving at 1000 km s−1) so LS1 can appear magnified by μ > 4000 during at least 8 years without requiring it to move exactly parallel to the millicaustic. On the other hand, the mass of the millines needs to be less than 2.5 × 106M in order to not introduce anomalous flux ratios in the counterimages c and c′, and/or be directly detectable by JWST and produce a resolved image of the pair LS1+millilens. A mass of ∼105M offers a good compromise because it can easily accommodate both constraints and produce negligible effects on the c/c′ flux ratio.

9.1. Time variability: Intrinsic or microlensing

The observed time variability could arise from intrinsic variability of the redder star in the Mothra binary. An alternative possibility is that the red SG star is moving close to one of many microcaustics, while the blue SG remains too far away to undergo any significant change in flux. If the two stars are separated by a relatively large distance, d > 0.01 pc (consistent with the size constraint, d < 1 pc derived from the fact that LS1 is unresolved at magnifications factor 𝒪(1000)), this translates to an angular separation > 1 μas. This is large enough to allow a 1 M microlens (Fig. 8) to temporarily magnify the red component but not the blue one. In this case, the red SG flux change ΔF ∝ 1/|t − to|−0.5, where to is the time the microcaustic was crossed. This interpretation is, however, challenged by the small flux change between Epc and Ep3. This constancy during a caustic crossing can be attained if the red SG portion of Mothra is moving with a relatively small velocity with respect to the microcaustic, and the radius of the red SG is large. A star with radius 300 R moving with relative velocity 100 km s−1 with respect to a microcaustic moves one stellar radius (≈1 nanoarcsec) in one month, the time separation between Epc and Ep3. While possible, this scenario requires fine tuning between the three planes (observer, cluster, and source) or the direction of motion of the star (for instance, close to parallel to the microcaustic) in order to produce such small relative velocities. Moreover, most of the velocity comes from the bulk motion of the three planes (observer, cluster, and source). As shown in earlier work (Kelly et al. 2022), this cluster the Warhol and Spock arcs show several quickly evolving microlensing events, suggesting a relatively high relative transverse velocity. This makes intrinsic variability of the red component of Mothra a more likely hypothesis. This is not surprising as red SGs are known to be variable. Intrinsic variability of the red component opens the possibility to estimate its intrinsic luminosity (and hence derive the magnification) based on its observed periodicity (e.g., Soszyński et al. 2007; Yang & Jiang 2012). Such a feat requires long-term monitoring of this arc, but repeated observations of M0416 are well motivated because lenses a wealth of background sources, some of which are expected to be intrinsically variable red SG stars (Diego 2023; Yan et al. 2023).

9.2. The millilens in cold dark matter models

The mass of the millilens is constrained to be between 104M and 2.5 × 106M. A natural possibility is to consider one of the globular clusters (marked in Fig. 2) near the lensed galaxy. The observed globular clusters are more massive than the millilens, but less massive clusters would be too faint to observe and could exist. If an intermediate-mass black hole (Seth et al. 2014) contributes part of the millilens mass, the luminous mass would be correspondingly smaller and the clusters even fainter. This picture is consistent with the standard CDM model that predicts a wealth of dark matter halos in this mass range. These halos would contain also baryons that can cool down more efficiently than dark matter and form compact nuclei at the center of the halos. As the low-mass halos fall into the deepest regions in the cluster potential well, the outer parts of the halo get stripped away by tidal forces (Chilingarian et al. 2023), leaving a compact core. This mechanism works in a wide range of masses and has been invoked to explain for instance the presence of ultracompact dwarf galaxies in galaxy clusters (Pfeffer & Baumgardt 2013; Mihos et al. 2015).

Below we consider two alternative models of dark matter, that can predict a different distribution of matter below the kiloparsec scale.

9.3. Implications for the warm dark matter model

Assuming the millilens has significant contributions from dark matter (DM) we can set constraints on models of warm DM (or WDM). If the free-streaming length of the DM particle is too large, DM halos do not form below this scale, and the resulting halo mass function exhibits a cutoff at a characteristic mass (Bond & Szalay 1983; Benson et al. 2013). In WDM, the half-mode mass of the halo mass function can be related to the DM particle mass by the relation (Schneider et al. 2013; Gilman et al. 2020)

(1)

If we assume the half-mode mass is 2.5 × 106M (the WDM mass function is suppressed by a factor 1/2 at the scale of the half-mode mass), this results in a minimum mass for the DM of 14 keV. This constraint pushes the limit of previous results by a factor ≈2 (Iršič et al. 2017; Hsueh et al. 2020; Gilman et al. 2020). This result does not take into account that a halo of dark matter orbiting near the BCG would have a fraction of its mass stripped away by tidal forces. Hence this constraint should be considered with caution because the original halo would have been more massive (hence lowering the minimum mass for the DM particle). Results based on simulations suggests that up to 80% of the dark matter mass can get stripped away from infalling satellites into massive clusters (Niemiec et al. 2019). If this much mass was lost, the mass of the DM halo before infall would have been 5 times larger, that is, 1.2 × 107M and the minimum mass for the DM particle decreases to 8.7 keV.

9.4. Implications for the fuzzy dark matter model

Another interesting possibility is fuzzy dark matter (or FDM, also known as wave dark matter or axion-like-particle dark matter among other names). In this model, the dark matter can be described by a scalar field with an associated particle that is extremely light. Owing to its low mass, the associated wavelength of the particle is at the astrophysical scale. Relevant for our work is that in this model, the density field of dark matter fluctuates in length scales of the de Broglie wavelength of the DM particle (λ = h/mav) on ∼Gyr timescales. These fluctuations are often referred to as granules, and they are characterized by their physical size and mass. The granules can produce distortions along the critical curve, naturally predicting anomalous flux ratios between pairs of images (Amruth et al. 2023; Powell et al. 2023). The time variability of these granules is many orders of magnitude larger than the time spanning our observations and can be ignored.

A proper study of FDM requires constructing models with the constraints provided by the macrolens model. This is well beyond the scope of this paper, but we can present a simple analysis. In the popular axion mass range of ma = 10−22 eV, the de Broglie wavelength of the DM granules is approximately 0.3 kpc (for velocity v = 400 km s−1, Laroche et al. 2022). This is small enough to remain unresolved at the cluster distance, and these granules could act as millilenses. In the classic picture of FDM, the density fluctuations of the granules are of the same order of magnitude as the density itself Schive et al. (2014), Dalal et al. (2021). In galaxy clusters at distances of 50−100 kpc from the BCG (comparable to the distance from Mothra to the BCG), the dark matter density is typically 1 − 10 × 106M kpc−3. Thus a DM granule could have the required mass and scale to act as the millilens discussed in Sect. 8. For a galaxy cluster as massive as M0416, there would be hundreds of granules with similar density fluctuations (half of them negative and half of them positive) projected along the line of sight. Their lensing distortions would partially compensate each other, so the net effect is expected to be smaller. Nevertheless, we expect random fluctuations equivalent to the contribution from a few granules along the line of sight. These fluctuations should be ubiquitous along the critical curves (Amruth et al. 2023) and leave imprints on other knots in the same arc such as c and c′ in the form of flux distortions. In fact, there is no evidence for significant flux distortions on c and c′. Larger axion masses would form granules with smaller de Broglie wavelength and smaller mass per granule, resulting in smaller lensing distortions. The mass of the granule scales as λ3, and an axion mass 5 times larger results in granule masses around two orders of magnitude smaller, below the lower limit for the millilens (Mmin ∼ 104M). Therefore axion masses above ≈5 × 10−22 eV cannot produce millilens-like fluctuations with the required mass to explain the observations. Lighter axion masses can create granules that are more massive and give larger lensing distortions, but in this case, the DM granules extend over larger scales. These granules can be described to first order as Gaussian perturbations, and Sect. 8 showed how millilens Gaussian profiles with large σ may become undercritical at the position of LS1 and unable to produce the required large magnification factors. For a particular case of an axion mass of 0.5 × 10−22 eV, the mass of the granule would be ∼107M. A Gaussian granule with this mass, placed at the right position to magnify Mothra to factors of several thousand, does not produce critical curves if σ > 225 pc. Masse < 0.5 × 10−22 eV are then in conflict with the de Broglie wavelength which would be λ ≈ 600 pc for this particular axion mass model. Therefore the mass of the axion cannot be much smaller than 10−22 eV in order to produce extreme magnification at the position of LS1. As a conservative approximation, we consider 0.5 × 10−22 eV to be the lower limit for the axion mass consistent with the observations. Masses in the range 5 × 10−23 < ma < 5 × 10−22 eV are then (in principle) consistent with the lensing constraints discussed in this paper. As noted earlier, this conclusion is based on significant approximations, so it should be regarded with caution. A better treatment should consider a range of momenta that affect the de Broglie wavelength and consider more carefully the effect on magnification after projecting multiple granules along the line of sight.

10. Conclusions

Mothra, a new kaiju (or monster) star at redshift z = 2.091, is being magnified to extreme factors by the combined effect of a massive galaxy cluster and a millilens along the line of sight. The star is best described as a binary system with two supergiant stars, a hot one with temperature T ≈ 14 000 K, and a cooler one with T ≈ 5000 K. Mothra’s flux changed at observed wavelengths ≥1.5 μm, which we attribute to intrinsic variability in the cooler star. A possible counterimage may exist (with low confidence) at the expected position on the opposite side of the lens-model critical curve, but it must have significantly lower magnification than the main image. Hypotheses to explain the anomalous magnification of Mothra and its counterimage include a transient event, a globular cluster or foreground object, a cluster-caustic crossing, microlensing, or a millilens. Among these, only millilensing offers a satisfactory explanation. In particular, the fact that Mothra was detected in 2014 as part of the HFF program is key to constraining the minimum mass of the millilens to > 104M. At the opposite extreme, the lack of distortions on a nearby image pair (observed for the first time by JWST) and the fact that no millilens is detectable in the JWST images set an upper limit on the mass of the millilens ≤2.5 × 106M.

One possibility for the millilens is a small globular cluster, undetected in JWST images. Because this millilens is undetected, it could be dominated by dark matter. Such a millilens sets significant constraints on models of dark matter. Our findings are consistent with expectations from the cold dark matter model, but warm dark matter models where the dark matter particle is lighter than 8.7 keV are excluded by the existence of such a millilens. Models of fuzzy dark matter could in principle reproduce the observations but only if the axion mass is 5 × 10−23 < ma < 5 × 10−22 eV.


1

As explained in Sect. 5, Mothra is found in the demagnification (low flux) area, similar to moths being found at night.

3

An AGN type-of-object could show variability but more in the blue tan in the red component, opposite to what we find in Sect. 6. Also, reddening would be incredibly high given the typical blue nature of this type of objects.

4

A real simulation over an area of 06 that resolves stellar microlenses would require a pixel size smaller than 1 μas. That would require > 109 pixels, which is computationally demanding. The simple approximation used here is accurate enough for present purposes.

Acknowledgments

This work is dedicated to the memory of our dear colleague Mario Nonino. A kind and gentle person, and an example for many. J.M.D. acknowledges the support of project PGC2018-101814-B-100 (MCIU/AEI/MINECO/FEDER, UE) Ministerio de Ciencia, Investigación y Universidades. J.M.D. acknowledges the hopsitality of University of Pennsylvania that hosted him during part of this work. A.Z., L.J.F. and A.K.M. acknowledge support by Grant No. 2020750 from the United States-Israel Binational Science Foundation (BSF) and Grant No. 2109066 from the United States National Science Foundation (NSF), and by the Ministry of Science & Technology, Israel. E.Z. acknowledges funding from the Swedish National Space Agency and grant 2022-03804 from the Swedish Research Council. This work is based on observations made with the NASA/ESA/CSA James Webb Space Telescope. The data were obtained from the Mikulski Archive for Space Telescopes at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-03127 for JWST. These observations are associated with JWST programs 1176 and 2738. R.A.W., S.H.C., and R.A.J. acknowledge support from NASA JWST Interdisciplinary Scientist grants NAG5-12460, NNX14AN10G and 80NSSC18K0200 from GSFC. Work by CJC acknowledges support from the European Research Council (ERC) Advanced Investigator Grant EPOCHS (788113). B.L.F. thanks the Berkeley Center for Theoretical Physics for their hospitality during the writing of this paper. M.A.M. acknowledges the support of a National Research Council of Canada Plaskett Fellowship, and the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE17010001. C.N.A.W. acknowledges funding from the JWST/NIRCam contract NASS-0215 to the University of Arizona. We also acknowledge the indigenous peoples of Arizona, including the Akimel O’odham (Pima) and Pee Posh (Maricopa) Indian Communities, whose care and keeping of the land has enabled us to be at ASU’s Tempe campus in the Salt River Valley, where much of our work was conducted. We thank the CANUCS team for generously providing early access to their proprietary data of M0416.

References

  1. Amruth, A., Broadhurst, T., Lim, J., et al. 2023, Nat. Astron., 7, 736 [NASA ADS] [CrossRef] [Google Scholar]
  2. Beasor, E. R., & Smith, N. 2022, ApJ, 933, 41 [NASA ADS] [CrossRef] [Google Scholar]
  3. Benson, A. J., Farahi, A., Cole, S., et al. 2013, MNRAS, 428, 1774 [NASA ADS] [CrossRef] [Google Scholar]
  4. Bergamini, P., Rosati, P., Vanzella, E., et al. 2021, A&A, 645, A140 [EDP Sciences] [Google Scholar]
  5. Bergamini, P., Grillo, C., Rosati, P., et al. 2023, A&A, 674, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  6. Bond, J. R., & Szalay, A. S. 1983, ApJ, 274, 443 [NASA ADS] [CrossRef] [Google Scholar]
  7. Bruzual, G., & Charlot, S. 2003, MNRAS, 344, 1000 [NASA ADS] [CrossRef] [Google Scholar]
  8. Caminha, G. B., Grillo, C., Rosati, P., et al. 2017, A&A, 600, A90 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  9. Chabrier, G. 2003, PASP, 115, 763 [Google Scholar]
  10. Chen, W., Kelly, P. L., Diego, J. M., et al. 2019, ApJ, 881, 8 [NASA ADS] [CrossRef] [Google Scholar]
  11. Chen, W., Kelly, P. L., Treu, T., et al. 2022, ApJ, 940, L54 [NASA ADS] [CrossRef] [Google Scholar]
  12. Chevallard, J., & Charlot, S. 2016, MNRAS, 462, 1415 [NASA ADS] [CrossRef] [Google Scholar]
  13. Chilingarian, I., Grishin, K., Afanasiev, A. V., et al. 2023, ArXiv e-prints [arXiv:2306.08049] [Google Scholar]
  14. Dai, L., & Pascale, M. 2021, ArXiv e-prints [arXiv:2104.12009] [Google Scholar]
  15. Dalal, N., Bovy, J., Hui, L., & Li, X. 2021, JCAP, 2021, 076 [CrossRef] [Google Scholar]
  16. Diego, J. E. 2023, ApJ, 1208 [Google Scholar]
  17. Diego, J. M., Broadhurst, T., Molnar, S. M., Lam, D., & Lim, J. 2015, MNRAS, 447, 3130 [NASA ADS] [CrossRef] [Google Scholar]
  18. Diego, J. M., Kaiser, N., Broadhurst, T., et al. 2018, ApJ, 857, 25 [NASA ADS] [CrossRef] [Google Scholar]
  19. Diego, J. M., Pascale, M., Kavanagh, B. J., et al. 2022, A&A, 665, A134 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  20. Diego, J. M., Meena, A. K., Adams, N. J., et al. 2023a, A&A, 672, A3 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  21. Diego, J. M., Kei Li, S., Meena, A. K., et al. 2023b, A&A, submitted [arXiv:2304.09222] [Google Scholar]
  22. Drout, M. R., Massey, P., & Meynet, G. 2012, ApJ, 750, 97 [NASA ADS] [CrossRef] [Google Scholar]
  23. Ekström, S., Georgy, C., Eggenberger, P., et al. 2012, A&A, 537, A146 [Google Scholar]
  24. Gilman, D., Birrer, S., Nierenberg, A., et al. 2020, MNRAS, 491, 6077 [NASA ADS] [CrossRef] [Google Scholar]
  25. Glebbeek, E., Gaburov, E., Portegies Zwart, S., & Pols, O. R. 2013, MNRAS, 434, 3497 [NASA ADS] [CrossRef] [Google Scholar]
  26. Hsueh, J. W., Enzi, W., Vegetti, S., et al. 2020, MNRAS, 492, 3047 [NASA ADS] [CrossRef] [Google Scholar]
  27. Iršič, V., Viel, M., Haehnelt, M. G., et al. 2017, Phys. Rev. D, 96, 023522 [CrossRef] [Google Scholar]
  28. Jauzac, M., Clément, B., Limousin, M., et al. 2014, MNRAS, 443, 1549 [Google Scholar]
  29. Kaurov, A. A., Dai, L., Venumadhav, T., Miralda-Escudé, J., & Frye, B. 2019, ApJ, 880, 58 [NASA ADS] [CrossRef] [Google Scholar]
  30. Kelly, P. L., Diego, J. M., Rodney, S., et al. 2018, Nat. Astron., 2, 334 [NASA ADS] [CrossRef] [Google Scholar]
  31. Kelly, P. L., Chen, W., Alfred, A., et al. 2022, ArXiv e-prints [arXiv:2211.02670] [Google Scholar]
  32. Kiss, L. L., Szabó, G. M., & Bedding, T. R. 2006, MNRAS, 372, 1721 [Google Scholar]
  33. Laroche, A., Gilman, D., Li, X., Bovy, J., & Du, X. 2022, MNRAS, 517, 1867 [NASA ADS] [CrossRef] [Google Scholar]
  34. Lejeune, T., Cuisinier, F., & Buser, R. 1997, A&AS, 125, 229 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  35. Lotz, J. M., Koekemoer, A., Coe, D., et al. 2017, ApJ, 837, 97 [Google Scholar]
  36. Massey, P., Plez, B., Levesque, E. M., et al. 2005, ApJ, 634, 1286 [NASA ADS] [CrossRef] [Google Scholar]
  37. Meena, A. K., Chen, W., Zitrin, A., et al. 2023a, MNRAS, 521, 5224 [NASA ADS] [CrossRef] [Google Scholar]
  38. Meena, A. K., Zitrin, A., Jiménez-Teja, Y., et al. 2023b, ApJ, 944, L6 [NASA ADS] [CrossRef] [Google Scholar]
  39. Mihos, J. C., Durrell, P. R., Ferrarese, L., et al. 2015, ApJ, 809, L21 [Google Scholar]
  40. Niemiec, A., Jullo, E., Giocoli, C., Limousin, M., & Jauzac, M. 2019, MNRAS, 487, 653 [NASA ADS] [CrossRef] [Google Scholar]
  41. Palencia, J. M., Diego, J. M., Kavanagh, B. J., & Martinez, J. 2023, A&A, submitted [arXiv:2307.09505] [Google Scholar]
  42. Patrick, L. R., Thilker, D., Lennon, D. J., et al. 2022, MNRAS, 513, 5847 [Google Scholar]
  43. Pfeffer, J., & Baumgardt, H. 2013, MNRAS, 433, 1997 [NASA ADS] [CrossRef] [Google Scholar]
  44. Postman, M., Coe, D., Benítez, N., et al. 2012, ApJS, 199, 25 [Google Scholar]
  45. Powell, D. M., Vegetti, S., McKean, J. P., et al. 2023, MNRAS, 524, L84 [NASA ADS] [CrossRef] [Google Scholar]
  46. Richard, J., Claeyssens, A., Lagattuta, D., et al. 2021, A&A, 646, A83 [EDP Sciences] [Google Scholar]
  47. Rodney, S. A., Balestra, I., Bradac, M., et al. 2018, Nat. Astron., 2, 324 [Google Scholar]
  48. Schive, H.-Y., Chiueh, T., & Broadhurst, T. 2014, Nat. Phys., 10, 496 [NASA ADS] [CrossRef] [Google Scholar]
  49. Schneider, A., Smith, R. E., & Reed, D. 2013, MNRAS, 433, 1573 [CrossRef] [Google Scholar]
  50. Schneider, F. R. N., Podsiadlowski, P., Langer, N., Castro, N., & Fossati, L. 2016, MNRAS, 457, 2355 [Google Scholar]
  51. Seth, A. C., van den Bosch, R., Mieske, S., et al. 2014, Nature, 513, 398 [CrossRef] [Google Scholar]
  52. Soszyński, I., Dziembowski, W. A., Udalski, A., et al. 2007, Acta Astron., 57, 201 [EDP Sciences] [Google Scholar]
  53. Szécsi, D., Agrawal, P., Wünsch, R., & Langer, N. 2022, A&A, 658, A125 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  54. Wang, C., Langer, N., Schootemeijer, A., et al. 2022, Nat. Astron., 6, 480 [NASA ADS] [CrossRef] [Google Scholar]
  55. Welch, B., Coe, D., Diego, J. M., et al. 2022a, Nature, 603, 815 [NASA ADS] [CrossRef] [Google Scholar]
  56. Welch, B., Coe, D., Zackrisson, E., et al. 2022b, ApJ, 940, L1 [NASA ADS] [CrossRef] [Google Scholar]
  57. Willott, C. J., Doyon, R., Albert, L., et al. 2022, PASP, 134, 025002 [CrossRef] [Google Scholar]
  58. Windhorst, R. A., Timmes, F. X., Wyithe, J. S. B., et al. 2018, ApJS, 234, 41 [NASA ADS] [CrossRef] [Google Scholar]
  59. Windhorst, R. A., Cohen, S. H., Jansen, R. A., et al. 2023, AJ, 165, 13 [NASA ADS] [CrossRef] [Google Scholar]
  60. Yan, H., Ma, Z., Sun, B., et al. 2023, ApJS, submitted [arXiv:2307.07579] [Google Scholar]
  61. Yang, M., & Jiang, B. W. 2011, ApJ, 727, 53 [NASA ADS] [CrossRef] [Google Scholar]
  62. Yang, M., & Jiang, B. W. 2012, ApJ, 754, 35 [NASA ADS] [CrossRef] [Google Scholar]
  63. Zitrin, A., Meneghetti, M., Umetsu, K., et al. 2013, ApJ, 762, L30 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Photometry

To estimate the flux of the point sources in the arc, including LS1, we adopted a data-driven model for the PSF in each band. The advantage of this approach over the use of a precomputed model (such as WebbPSF) is that the data-driven PSF model has to be consistent with the data. For example, the diffraction spikes of the stacked stars are exactly in the same orientation as in the targets. The constant perforation of the mirror by microasteroids results in degradation of the PSF, in particular its tails. Thermal variability across the mirror can result also in minute corrections to the PSF that may not be properly captured by the PSF model. Finally, because we are stacking three epochs with three different position angles, the use of stars instead of a PSF model guarantees the correct orientation of all instrumental effects.

We constructed our PSF model from six stars near LS1. These stars are not saturated so the flux in the central regions is accurate. These stars have spatial shifts between the HST and JWST epochs of less than 025. This translates into a negligible shift of < 10 mas between the JWST epochs. The positions of the six stars are given in Table A.1.

Table A.1.

Position of the six stars used to build the PSF.

We selected a region of 50x50 pixels around each star and supersample this region with a pixel size of 3 mas. We then aligned the stars using a simple Gaussian model centered in the central pixel for the main peak of the star and stacked their images. The resulting average signals for the star models are shown in Figure A.1 for the eleven filters considered in this work (three from HST plus eight from JWST).

thumbnail Fig. A.1.

Stacked signal of six unsaturated stars in the same field in different bands and after coadding all three epochs (for JWST). The stars have been aligned before stacking using steps of 3 mas. In order to better show the sidelobes and diffraction spikes, we plot the logarithm of the stacked signal. The stacked signal is used as a model for the PSF in each filter.

Using this model for the PSF, we estimated the flux of the seven point sources in the arc containing LS1. These seven point sources are shown in Figure A.2 for the case of F150W. The central source is LS1, and the other six point sources correspond to three objects in the arc that are multiply lensed. The flux for each source was obtained after subtracting the corresponding PSF model rescaled by the total flux. As an example, Figure A.2 (right) shows the residual image in the F150W filter. The two brightest sources in the arc show a residue, which is repeated on both sides of the arc, evidence that one is a counterimage of the other.

thumbnail Fig. A.2.

Example of flux estimation using the stacked-star PSF model. The circles mark the positions of seven sources for which the magnitudes were obtained after fitting to the stacked profile of six nearby (unsaturated) stars. The numbers show the estimated magnitude in the F150W band. The left panel shows the original data, and the right panel shows the data after point-source subtraction. LS1 is marked with an orange circle in the left panel. The two yellow circles on the right panel show two regions around the brightest multiply lensed point source, b and b′, where additional unresolved features (also multiply lensed) can still be observed after the source removal. The same feature is observed in other filters suggesting that this is a real feature and not an artifact.

The PSF-fitting was repeated for all filters. For the filters F200W, F277W, F356W, F410M, and F444W, it is more difficult to distinguish the point sources from the underlying arc. For these filters, we used the clean image obtained in the F150W (right panel of Figure A.2) as a model, or template, for the arc. Under the assumption that the arc has similar spectral features along the entire arc, we degraded the resolution of the template to the resolution of the other filters and subtracted it before performing the photometric measurement of the point sources. The degradation was done in Fourier space by rescaling each Fourier mode with wavenumber k by the factor , where PR(k) is the power spectrum of the raw data from which we want to subtract the arc, and PT(k) is the power spectrum of the template (in our case the F150W data after point source subtraction). Once the template was degraded to the desired resolution, we subtracted it from the raw data (at the same resolution), and we estimated the flux of the remaining point sources. This process is graphically shown in Figure A.3. The final photometry for LS1 and for the stacked three epochs of the PEARLS program are shown in Table A.2. Using the more sensitive F200W instead of F150W as the template gives consistent results in the redder bands, but F150W has better spatial resolution and allows the template from F200W to be subtracted. Yan et al. (2023) presented an alternative approach to photometry of Mothra using WebbPSF instead of the empirical PSF used here. Their results are consistent with ours within the uncertainties, and in particular, Yan et al. (2023) also found variability in the red component of Mothra.

thumbnail Fig. A.3.

Illustration of the process followed to estimate the magnitudes of unresolved sources in the arc at 2 micron and larger wavelengths. The left column shows the raw data in each filter. The second column shows the template degraded to the resolution of the raw data following the process described above. The third column shows the difference Raw-Template. In this difference, the point sources are evident in all bands, while some are not so clearly visible in the raw data. Finally, the last column shows the residue left after fitting the point sources with the corresponding star shapes from Figure A.1. The residue has very little signal left, specially in the longest wavelengths. In all cases the template is the point source subtracted F150W image after degrading the resolution to the one corresponding to each band. The template is shown in the right panel of Figure A.2, and at its native resolution.

Table A.2.

Photometry of LS1 in the stacked three PEARLS epochs.

Appendix B: Band differences

Image differences can enhance faint details compared to single-band images, especially when the details are hidden by brighter nearby features. To search for a possible LS1 counterimage or anything else hidden in the arc, we made difference images from each pair of NIRCam images. In all cases, the shorter-wavelength, higher-resolution image was first convolved with a Gaussian kernel to match the longer-wavelength, lower-resolution image in the pair, then subtracted. Three distinct methods were used:

  • M1:

    a simple image subtraction after matching the resolution. Results are shown in Figure B.1. In this scheme, any source with a flat SED will show zero flux. LS1 is very faint in all images because it has a nearly flat SED, but the arc is seen because its SED rises with wavelength.

  • M2:

    an attempt to subtract only the arc. The seven unresolved sources (a, b, c, LS1, c′, b′, and a′) were first subtracted from the shorter-wavelength image, that image was then normalized by a constant factor α and convolved to the resolution of the longer-wavelength image, and that image was subtracted. The factor α was chosen for each image pair to minimize the residual of the arc. Figure B.2 shows the result. The arc disappears, as expected, but LS1 shows up well. These images were used for photometry of LS1 (Section 3). A faint negative image just northwest of LS1 can be seen in the F200W column indicating the possible presence of a source bluer than the arc.

  • M3:

    an attempt to subtract both point sources and the arc. The seven unresolved sources were subtracted from both images, then the shorter-wavelength image was convolved, normalized, and subtracted. Figure B.3 shows the results. Residuals are noticeable for sources b and b′ because they are not well described by a single point source. The faint negative image again shows up in the F200W column and in F150W–F090W. LS1 has a very small residual in all images, showing that the photometry in Figure 3 is accurate.

thumbnail Fig. B.1.

Simple image subtraction M1 with all NIRCam combinations. The positive image is shown by labels above each column, and the negative image is shown by labels to the left of each row. For example, the top-left panel shows F115W–F090W. Sources are labeled in the top-left panel with an arrow showing the location of LS1. Crosses in all panels show the source positions.

thumbnail Fig. B.2.

Image subtraction M2 removing the arc. Other details are the same as Figure B.1.

thumbnail Fig. B.3.

Image subtraction M3 removing the arc and the point sources. Other details are the same as Figure B.1.

Appendix C: Profile along the arc

To find the one dimensional profile along the arc hosting Mothra, we found the circle that best fits the arc. It has radius 3795 and is centered at RA=64.0359850, Dec=−24.0669947. This circle passes through all seven point sources identified in the arc. Figure C.1 shows the profiles for all JWST bands along this circle. All the profiles are contaminated by the emission of the ICL, which varies with frequency and peaks at 2 μm. The right panel of Figure C.1 removes much of the contamination by rescaling the profiles. This reduces the impact of the host galaxy and ICL and better shows the relative brightness of the peaks. Because each band has a different resolution, and the amplitude of a point source is reduced with poorer resolution, dashed lines show the expected change in flux for a source with constant flux (and equal to the flux of the source in F090W).

thumbnail Fig. C.1.

One-dimensional profiles along the Mothra arc. The left panel shows the observed surface-brightness profiles along the arc. Image a′ is on the left, Mothra is the peak in the middle at 3″, and a is on the right. The inset zooms in around the position of Mothra. The right panel shows the profiles rescaled by a multiplicative factor to match the amplitude of the F090W profile at ≈42. The black dotted line shows a power-law model that traces the ICL with d being the distance to the BCG in arcseconds. For reference, Mothra is 63 from the BCG. The horizontal dashed lines in the inset plot show the combined effect of the PSF dilution and rescaling. A source with the same flux in all bands as in F090W that is convolved by the PSF of each band would have its amplitude at the corresponding dashed lines.

Appendix D: Offsets from VSVPA

The end of Section 6 mentioned a small offset in source positions found in the difference images for different epochs. Simulations identify an effect we call the varying source varying position angle (VSVPA). VSVPA manifests itself when images at different position angles are differenced and sources are varying in flux. The combined effect of varying flux and PSF results in asymmetric residuals in the difference between epochs, and the residuals are not necessarily centered at the position of the source. With an asymmetric PSF, offsets can be seen even when flux is constant, but offsets are more evident when the flux is changing between epochs.

To quantify the VSVPA effect, we created a simulation that mimics the real observations around the Mothra arc. First we simulate a thin arc at the same orientation as the real arc (428 counterclockwise from west). Then we added two point sources along the arc with approximately the same flux ratio as between Mothra and the underlying arc. Then we convolved the simulation with the epoch-1 PSF in different filters. We did the same for epoch 3 but with the flux of one of the point sources by increased by 50%, which is approximately the estimated relative increase in flux of Mothra between epochs 1 and 3. Then we added Gaussian noise to each epoch with similar standard deviation to the real data. Finally we subtracted the simulated epoch 3 from simulated epoch 1. The different steps and final result are shown in Figure D.1 for two filters. F356W (first two rows) shows an offset of ≈007 between the position of Mothra and the position of the excess flux, but F410M (bottom two rows) shows no offset.

thumbnail Fig. D.1.

Illustrations of the VSVPA effect with mock data. The top two panels show the mock data F356W, and the bottom two panels show F410M. The first row of each pair shows noise-free data, and the second row shows results with noise added. From left to right, panels show simulated epoch 1, simulated epoch 3, and the difference image. The difference image is smoothed with a 009 Gaussian as for the real data. All images are 15 across. The white cross marks the position of the varying point source. The arms of the cross are 006 end-to-end.

We repeated the VSVPA simulation 1000 times, each time with a different realization of the noise. The measured offsets are shown in Figure D.2. Offsets comparable in magnitude to the one observed in the real data show up in a significant fraction of the simulations. Sources that are not varying in time, such as the point source in the southern portion of the arc, do not show residuals. Hence, in order to see this effect in the actual JWST data, a varying source is needed. The effect can then be explained as a combination of a varying source and a PSF varying because of the varying JWST position angle. While we have not conducted an exhaustive study of the VSVPA effect, we expect it to depend on the specific position angles and filters. Also, we noticed that the offset is smaller for brighter sources that have larger variations in flux. In summary, the position offsets observed in the NIRCam data can be satisfactorily explained by the instrumental effect of varying position angle combined with an intrinsically varying source.

thumbnail Fig. D.2.

Distribution of offsets from 1000 realizations.

All Tables

Table A.1.

Position of the six stars used to build the PSF.

Table A.2.

Photometry of LS1 in the stacked three PEARLS epochs.

All Figures

thumbnail Fig. 1.

HST and JWST images of the arc hosting LS1. All panels are labeled with the image’s filter, and the positions of LS1 and two multiply imaged knots (b and b′, c and c′) are marked in some panels. Unlabeled arrows point to LS1. A scale bar is to the left of the top row. The top three panels show the ACS data taken in 2014 for the HFF project. The 2022–2023 NIRCam data (three epochs combined) are shown in the middle and bottom rows. These are the discovery images for knots c and c′.

In the text
thumbnail Fig. 2.

Enlarged color image of Mothra’s arc and its surroundings. Colors are a combination of HST and JWST filters; F435W + F606W + F814W + F090W + F115W, F150W + F200W, and F277W + F356W + F410M + F444W for blue, green and red respectively. LS1 and three multiply lensed knots and their counterimages are labeled. No counterimage for LS1 is visible. Numerous faint, unresolved objects are marked with unlabeled magenta circles. These could be globular clusters or compact galactic remnants in the galaxy cluster. The white dashed line is the inferred position of the critical curve based on the ratio of the b–c separation (039) to the b′–c′ separation (032). The solid white curve is the expected position of the critical curve based on the lens model. The two curves are apart. The third counterimage of the galaxy identified in the new JWST images is shown in the bottom-left. Its position is RA = 64.046155, Dec = −24.0752067.

In the text
thumbnail Fig. 3.

Mothra’s SED and the best matching binary model. The blue and red lines represent model stellar spectra for stars with Teff and μLbol as shown in the legend. Models are from the Lejeune et al. (1997) compilation of stellar atmosphere spectra at solar metallicity redshifted to z = 2.091. The green line shows the combined SED. Black circles with error bars represent the observed photometric data and green boxes the model photometry resulting from the best-fit compound spectrum. If the two stars experience the same magnification, the high–Teff star has a bolometric luminosity ≈0.4 dex (a factor of ≈2.5) higher than that of the low–Teff star.

In the text
thumbnail Fig. 4.

Time variability of LS1. The y-axis shows the difference in flux between different epochs in apertures of radius 009 centered on LS1 (error bars are 1-σ derived from random positions outside the arc). Each color corresponds to a different combination of epochs, as indicated by the legend inside the figure. The legend gives the time of each epoch in days after Ep1.

In the text
thumbnail Fig. 5.

Magnification versus distance. The solid black line represents the law μ = A/D, with D being the distance to the critical curve expressed in arcseconds and A = 62″. The blue and red points are measured magnification values from our lens model, at the position of the critical curve intersecting the arc and in a direction perpendicular to the critical curve. For the critical curve we assume the is at z = 2.091. The blue points correspond to magnification values measured on the side with positive parity (northwest of the white solid curve in Fig. 2) and the red points are for magnifications measured on the side with negative parity (southeast of the same curve). At a distance of D = 0.07″, the black curve predicts magnification ≈8855.

In the text
thumbnail Fig. 6.

Color-color plot for the globular clusters. The red points correspond to globular clusters found in the central regions of M0416, and near LS1. The blue dot is for LS1. Magnitudes are computed after fitting the star-based PSF model. LS1 is clearly an outlier when compared with the globular clusters. Errors in the magnitudes have been omitted for clarity purposes. The horizontal dashed is the mean F200W − F090W of the globular clusters. The blue dotted ones represent 1 standard deviation.

In the text
thumbnail Fig. 7.

Posterior distributions of the four BEAGLE SSP fit parameters assuming Mothra is a globular cluster at z = 0.396. The four derived parameters are stellar mass M, stellar age tage, dust attenuation optical depth τV, and stellar metallicity Z. The blue contours represent the 68%, 95%, and 99% contours of the full joint posterior distribution respectively and the gray shaded are represents the (marginalized) 68% interval of each individual parameter.

In the text
thumbnail Fig. 8.

Caustics around a 1 M microlens on the side with negative parity at the redshift of the cluster lensing a background source at z = 2.091. The macromodel magnification in this region is ≈850, and the gray scale indicates the combined magnification μ of the macromodel and the microlens. The black horizontal band corresponds to the demagnification region that exists only for images with negative parity. The microlens can demagnify a lensed star down to μ ≈ 10 in this band. Lighter triangular regions above and below show high-magnification regions. The yellow horizontal line illustrates a trajectory grazing the bottom horizontal caustic, and the black line is for a trajectory 0.1 μas south of the yellow line. These lines correspond to the yellow and black curves in Fig. 9.

In the text
thumbnail Fig. 9.

One-dimensional magnification (light curve) of a star moving in a horizontal direction across the microlens shown in Fig. 8. The black and yellow curves correspond to the tracks shown in yellow and black respectively in Fig. 8. The track for the blue curve is also horizontal and is midway between the black and yellow tracks. The red curve corresponds to a track one pixel (10 nanoarcsec) above the yellow track, where the trajectory intersects both the caustic and the area of demagnification. The radial magnification from the macromodel is 3.3, and the tangential one is 250. At large distances from the microlens, the magnification converges to the macromodel value μ = 825. The small scale fluctuations are pixel noise from the simulation.

In the text
thumbnail Fig. 10.

Difference images between first and last epochs. Each panel shows a region of centered on LS1 and in filters as labeled in each panel. A source that brightened between the two epochs shows up as dark in the figure. White crosses mark the position of LS1, and black crosses mark the positions of knot b and its equally bright counterimage b′. All images have been smoothed with a Gaussian kernel with to increase contrast.

In the text
thumbnail Fig. 11.

Stacked difference , where αi was chosen to minimize the contribution from the arc to the difference, and FnnnWi are all filters with wavelengths below 2 μm, that is F090W, F115W, and F150W. These were degraded to the resolution of F200W using the star-derived PSF in Appendix B. The individual differences are shown in Col. 3 of Fig. B.2. The stacked image has been smoothed with a Gaussian of to increase the contrast. The position of LS1 is marked with a white arrow. The position of the possible counterimage LS1′, ≈01 from LS1, is marked with a yellow circle.

In the text
thumbnail Fig. 12.

Independent result confirming the likely counterimage. From left to right, the first and second panels show the (negative) original images of the arc in F090W and F200W; the third panel shows the convolved image from F090W to F200W using WebbPSF models; the fourth image shows F200W − 1.87 × F090W to subtract off the local background in the neighbor of LS1. In all panels, red and blue circles show the LS1 and LS1′ positions, respectively.

In the text
thumbnail Fig. 13.

Millilens model with mass 107M near the critical curve and magnifying LS1. The millilens is farther from the critical curve than in Fig. 14 but still forms critical curves at the position of LS1. This simulation includes both LS1 and i LS1′ as well as the knot image pair b and b′. The geometry of the simulation mimics the observations, with a separation between b and b′ of 045 and a tangential separation (i.e., in the vertical direction) between LS1 and knot b′ of 30 mas. The numbers in yellow indicate the approximate magnification at the position of the corresponding counterimage. In this configuration, LS1′ is ≈2.5 mag fainter than LS1 and therefore unobserved. For this mass, the effect on knots b and b′ is noticeable with the magnification of b′ affected by the millilens. Stars in the intracluster medium with a surface mass density of 5 M pc−2 blur the critical curves and reshuffle large magnifications away from the critical curve. The red double-arrow segment shows the width of the saturation region as predicted by Diego et al. (2018, their Eq. (15)).

In the text
thumbnail Fig. 14.

Millilens models near the critical curve. Gray scale shows the logarithm of the magnification at each location with the critical curve from the cluster being the white vertical line and the critical curves from the millilenses being white ovals. Each panel has a millilens 007 (about twice the resolving power of JWST) to the right of the macrolens critical curve. The top panel is for a millilens with 104M, and the bottom panel is for a millilens ten times more massive. (The critical curve in the bottom panel moved toward the right owing to the contribution of the millilens to the total convergence). Three sources, modeled as Gaussians with σ = 2 μas = 0.0168 pc and marked with arrows, are being lensed. The source labeled “normal image” is far from the critical curve and magnified by the cluster macrolens with only a small contribution from the millilens. The source labeled “Images magnified by millilens” is magnified by the millilens. The source labeled “Pair of images near cluster critical curve” is close to the caustic of the cluster and forms a pair of images.

In the text
thumbnail Fig. 15.

Simulated magnification across the arc in the lens plane. The critical curve is at the origin. A perturber with 104M is placed ≈007 from the critical curve. Colors show three assumptions perturber profiles: light blue for a point source, dark blue for a Gaussian profile with σ = 10 pc and red for a Gaussian profile with σ = 20 pc. Left panel shows a wide range, and the right panel is a magnified version around the perturber.

In the text
thumbnail Fig. 16.

Magnification in the source plane in one of the two millicaustics. Curves are for ML = 104M (lower) and 4 × 104M (upper). The origin is at the projected position of the millilens. Both curves are for a Gaussian millilens profile with σ = 2 pc.

In the text
thumbnail Fig. 17.

Magnification of knots c and c′ for the model shown in Fig. 13. The y-axis shows the magnification along lines perpendicular to the critical curve. The small-scale fluctuations are due to microlenses. The position of the critical curve is shown as a vertical dashed line and the positions of c and c′ by vertical solid lines. The black curve shows the magnification along a horizontal line in Fig. 13 crossing 35 mas south of the millilens. The other two lines are at 20 mas and 50 mas as labeled. The three lines cover the range of estimated distances between c′ and the possible millilens.

In the text
thumbnail Fig. 18.

Model for ML = 2.5 × 106M, the maximum mass allowed for a millilens. Microlenses will be present and distort the critical curves as in Fig. 13, but their effect is omitted for clarity. The magnification for each image is indicated in yellow. The millilens is 43 mas from the magnified image LS1 as shown by the labeled scale bar. A third predicted image of Mothra with magnification ≈50 (therefore undetectable) is marked by a yellow circle.

In the text
thumbnail Fig. A.1.

Stacked signal of six unsaturated stars in the same field in different bands and after coadding all three epochs (for JWST). The stars have been aligned before stacking using steps of 3 mas. In order to better show the sidelobes and diffraction spikes, we plot the logarithm of the stacked signal. The stacked signal is used as a model for the PSF in each filter.

In the text
thumbnail Fig. A.2.

Example of flux estimation using the stacked-star PSF model. The circles mark the positions of seven sources for which the magnitudes were obtained after fitting to the stacked profile of six nearby (unsaturated) stars. The numbers show the estimated magnitude in the F150W band. The left panel shows the original data, and the right panel shows the data after point-source subtraction. LS1 is marked with an orange circle in the left panel. The two yellow circles on the right panel show two regions around the brightest multiply lensed point source, b and b′, where additional unresolved features (also multiply lensed) can still be observed after the source removal. The same feature is observed in other filters suggesting that this is a real feature and not an artifact.

In the text
thumbnail Fig. A.3.

Illustration of the process followed to estimate the magnitudes of unresolved sources in the arc at 2 micron and larger wavelengths. The left column shows the raw data in each filter. The second column shows the template degraded to the resolution of the raw data following the process described above. The third column shows the difference Raw-Template. In this difference, the point sources are evident in all bands, while some are not so clearly visible in the raw data. Finally, the last column shows the residue left after fitting the point sources with the corresponding star shapes from Figure A.1. The residue has very little signal left, specially in the longest wavelengths. In all cases the template is the point source subtracted F150W image after degrading the resolution to the one corresponding to each band. The template is shown in the right panel of Figure A.2, and at its native resolution.

In the text
thumbnail Fig. B.1.

Simple image subtraction M1 with all NIRCam combinations. The positive image is shown by labels above each column, and the negative image is shown by labels to the left of each row. For example, the top-left panel shows F115W–F090W. Sources are labeled in the top-left panel with an arrow showing the location of LS1. Crosses in all panels show the source positions.

In the text
thumbnail Fig. B.2.

Image subtraction M2 removing the arc. Other details are the same as Figure B.1.

In the text
thumbnail Fig. B.3.

Image subtraction M3 removing the arc and the point sources. Other details are the same as Figure B.1.

In the text
thumbnail Fig. C.1.

One-dimensional profiles along the Mothra arc. The left panel shows the observed surface-brightness profiles along the arc. Image a′ is on the left, Mothra is the peak in the middle at 3″, and a is on the right. The inset zooms in around the position of Mothra. The right panel shows the profiles rescaled by a multiplicative factor to match the amplitude of the F090W profile at ≈42. The black dotted line shows a power-law model that traces the ICL with d being the distance to the BCG in arcseconds. For reference, Mothra is 63 from the BCG. The horizontal dashed lines in the inset plot show the combined effect of the PSF dilution and rescaling. A source with the same flux in all bands as in F090W that is convolved by the PSF of each band would have its amplitude at the corresponding dashed lines.

In the text
thumbnail Fig. D.1.

Illustrations of the VSVPA effect with mock data. The top two panels show the mock data F356W, and the bottom two panels show F410M. The first row of each pair shows noise-free data, and the second row shows results with noise added. From left to right, panels show simulated epoch 1, simulated epoch 3, and the difference image. The difference image is smoothed with a 009 Gaussian as for the real data. All images are 15 across. The white cross marks the position of the varying point source. The arms of the cross are 006 end-to-end.

In the text
thumbnail Fig. D.2.

Distribution of offsets from 1000 realizations.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.