Performance of the joint LST-1 and MAGIC observations evaluated with Crab Nebula data

,


Introduction
Very-high-energy (VHE, ≳ 100 GeV) gamma rays cannot be observed directly in an efficient way due to their absorption by the atmosphere.In turn, observations with space-born instrument are marred by relatively low fluxes at those energies.In the last three decades, imaging atmospheric Cherenkov telescopes (IACTs) have proven useful as sensitive instruments in the study of VHE gamma-ray emission from cosmic sources (see e.g., Sitarek 2022 for a recent review).The combination of multiple telescopes at distances on the order of 100 m (comparable to the size of the gamma-ray Cherenkov light pool) allows for joint stereoscopic analyses of the events, thereby improving the performance of the system significantly (Kohnle et al. 1996).
The Cherenkov Telescope Array Observatory (CTAO) is an upcoming next-generation gamma-ray facility (Acharya et al. 2013), composed of two telescope arrays located in the Northern and Southern hemispheres.In order to cover a broad energy range (from a few tens of GeVs up to a few hundreds of TeVs), it is to be composed of telescopes of three different sizes: Large-Sized Telescopes (LSTs), Medium-Sized Telescopes (MSTs), and Small-Sized Telescopes (SSTs).The LSTs, with mirror diameters of 23 m, will be the most sensitive part of the system for the lowest energy range of CTAO (tens of GeV).The construction of the first LST telescope, named LST-1, was completed in October 2018.Since 2019, it has been receiving commissioning and engineering data (CTA-LST project 2021).It is located in Observatorio Roque de los Muchachos, La Palma (Spain), at the altitude of 2200 m a.s.l..It is set at a distance of only ∼ 100 m from the MAGIC (Major Atmospheric Gamma Imaging Cherenkov) telescopes, a pair of 17 m IACTs (Aleksić et al. 2016a).Both systems work independently, but their proximity allows for offline searches of common events and enables joint LST-1+MAGIC analysis.A similar array containing telescopes of different sizes is being operated by the H.E.S.S. Collaboration (Holler et al. 2015).However, in that case, the difference in mirror area (approximately by a factor of 5) causes a similar difference in the energy threshold.On the other hand, in the case of LST-1+MAGIC combination, the difference between the mirror area of the LST-1 and that of the MAGIC telescopes is only a factor of 2.
In this work, we report the common analysis chain of both instruments and its achieved performance using both Monte Carlo (MC) simulations and observations of the Crab Nebula.In Section 2, we describe both participating instruments.The applied data and Monte Carlo (MC) simulations are described in Section 3. We derive various performance parameters of the joint system and present them in Section 4. Our concluding remarks are given in Section 5.

Instruments and data analysis
The relative location of the LST-1 and the MAGIC telescopes, along with their basic parameters are compared in Fig. 1 and Table 1, respectively.While the telescopes have their main design concepts in common, there are some differences, such as the larger LST-1 mirror area, the higher quantum efficiency (QE) of its optical detectors, and its larger field of view (FoV).Despite the same parabolic dish shape (minimizing the time spread of the registered Cherenkov photons) LST-1 has larger f/d and larger camera FoV.The higher event rate in the case of LST-1 is Send offprint requests to: contact.magic@mpp.mpg.de,lst-contact@cta-observatory.org,* Corresponding authors

Parameter
LST a sum of multiple effects: lower threshold (due to higher QE and 59 mirror area), larger size of the trigger region, and monoscopic 60 operations.Since then, both telescopes have operated in a stereoscopic ob-66 servation mode (Aleksić et al. 2012).In the standard operation 67 mode, only events triggering both telescopes are saved.The tele-68 scopes have undergone a few upgrades (with the most recent 69 in 2012) and since then they share a nearly identical design 70 and comparable performance.While the nominal camera FoV 71 is 3.5 • , the part covered by the trigger is limited only to the in-72 ner half of the full camera area.At low zenith angle distance, 73 the energy threshold (defined as the peak of the differential true 74 energy distribution) at trigger level of the MAGIC telescopes for 75 a source with a −2.6 spectral index is ∼ 50 GeV (Aleksić et al. 2016b) for a standard digital trigger.

LST-1
LST-1 is the first of the four LSTs to be constructed in the CTAO Northern site (CTA- LST Project et al. 2022).The construction of LST-1 was completed in October 2018, after which its commissioning and validation period started.Currently, the telescope is performing both commissioning and scientific observations.The mirror area being twice as large, as well as the improved QE of the optical sensors (photomultipliers) compared to MAGIC, enable LST-1 to achieve an energy threshold of ∼ 20 GeV (Abe et al. 2023).However, as any IACT operating standalone, LST-1 suffers from a huge hadronic background, which is much more efficiently rejected in stereoscopic systems.Similarly, it also has worse accuracy in terms of the reconstructed shower geometry, which affects the angular and energy resolutions.Therefore, despite the larger light collection, at energies above 100 GeV, the sensitivity of LST-1 alone is a factor ∼ 1.5 worse than that of MAGIC (Abe et al. 2023).

Event matching
Currently MAGIC and LST-1 operate independently.Both systems are however equipped with GPS clocks that provide time stamps for each event.Those time stamps can be used for offline matching of events that originate from the same shower (similar the arrival times at MAGIC and at LST-1 needs to be taken into account.For each subrun (corresponding to about 10 s of LST-1 data), we match the events with a coincidence window of 0.6 µs.
The optimal delay is obtained using an iterative procedure.To also allow for the analysis of LST-1+M1 or LST-1+M2 event types, the procedure is done independently using time stamps in each of the MAGIC telescopes.For the typical rate of LST-1 and MAGIC (see Table 1), this procedure would result in a negligible rate of accidental coincidences of ≲ 1.8 s −1 .Anomalous coincidence combinations (e.g., matching two LST-1 events to one MAGIC event or two MAGIC events to one LST-1 event) were excluded from the data stream; however, they are very rare due to LST-1 and MAGIC deadtimes.

Data analysis
In their standalone operations, both MAGIC and LST-1 use independent analysis chains.The MAGIC data analysis is based on MARS (Moralejo et al. 2009;Zanin et al. 2013), a C++, ROOTbased library, and a package of analysis programs.The raw data are stored in a custom binary format and the data are generated at each processing step are stored using ROOT containers.
On the other hand LST-1 is using cta-lstchain (Lopez-  After the initial image cleaning (see Aleksić et al. 2016b), the 159 images are parametrized using the classical approach of (Hillas 160 1985) and a quality cut on "intensity" is applied (i.e., the total 161 number of p.e. in the image should be at least 50 p.e.)3 .Next, 162 the events are divided into different classes, depending on which 163 telescopes are triggered.We considered the following combina-164 tions: M1+M2, LST-1+M1, LST-1+M2, and LST-1+M1+M2.165 However, it should be noted, that due to the stereoscopic trigger 166 of the MAGIC telescopes, the event types LST-1+M1 and LST-167 1+M2 also correspond to events in which all three telescopes had 168 been triggered.Though in those events, one of the MAGIC tele-169 scopes provided an image that either did not survive the cleaning 170 or had an intensity that was too low.In Table 2, we report the per-171 centage of events of each kind for various data and MC samples 172 (the parameters of the MC simulations are given in Table 3).173 The dominating type of events are three-telescope events (3/4 174 of all gamma-ray events).The fraction of LST-1+M2 events is 175
about twice as large as those of LST-1+M1 and this is related 176 to the proximity of LST-1 to the M2 telescope.While the per-also allows for the detection of the shower by the third telescope.However, hadronic events show irregularities in their ground distribution of Cherenkov light, caused by individual high transverse momentum sub-showers.Such events can produce a significant signal in MAGIC telescopes without an LST-1 event counterpart.Considering the small fraction of MAGIC-only events, and their dominant background origin, we excluded those events from further analysis.
For convenience, as well as to exploit the information of telescopes not containing the image, the events were then divided into the combination types (see Table 2).For each event type and telescope participating in the combination, the gammahadron separation parameter (i.e., "gammaness," see Abe et al. 2023), estimated energy, and the estimated DISP parameter (estimated distance of the source position projected on the camera to the centroid of the image, Lessard et al. 2001;Aleksić et al. 2010) were computed.The training was done using a random forest (RF) method (Breiman 2001), implemented in the scikit-learn package (Pedregosa et al. 2011).The RF regressors used for energy and DISP estimations used 150 estimators, a maximum tree depth of 50, the squared error criterion To train the shower reconstruction algorithm and to evaluate 263 IRFs, the analysis of IACT data requires MC simulations.In  et al. 2005).Common LST-1+MAGIC observations require the analysis chain to be performed within the same framework, and the same should happen for the simulations.
We performed simulations of the same showers visible by both MAGIC and LST-1, using the sim_telarray program.
To achieve this, we translated the simulation parameters of the MAGIC telescopes from the reflector and camera simulation programs into the sim_telarray nomenclature.For most of the parameters (e.g., mirror dish geometry, angular dependence of light guide efficiencies, average quantum efficiency of PMT, jitter of single p.e. times, telescope trigger parameters, and readout pulse shape), the translation was direct and the same values and curves were used in both simulation chains.For some of the parameters, however, minor simplifications or averaging had to be applied due to intrinsic differences between the two softwares.For example, this was the case for the simulation of the mirrors reflectivity and the noise within the pulse integration window.Thanks to the usage of sim_telarray, the LST-1 simulation parameters could be taken directly from the standard configuration, the so-called LSTProd2 (as in Abe et al. 2023).The common parameters (atmospheric model, geomagnetic field) follow the LSTProd2 settings.The level of uniform night sky background (NSB) was adjusted at the analysis level in the case of LST-1, following the procedure described in Abe et al. (2023).In the case of MAGIC, the adjustment was done according to the same principle (matching noise in empty, the so-called pedestal, and events), but already at the telescope camera response simulation level.The validation process of the MC simulation settings on a dedicated MC production is described in Section B. As a final end-to-end check, we also compared the energies of MAGIC events reconstructed with both the MCP (based on the sim_telarray MCs) and MARS (based on standard MAGIC MCs) chains, achieving similar accuracy (see C).

Observation and simulation samples
We determined the performance of the joint analysis chain in two ways: first using observation data taken from the direction of the Crab Nebula and then using dedicated MC simulations.

Observations
In order to evaluate the performance of the joint analysis, we used 4 hrs of good-quality Crab Nebula data taken simultaneously by the LST-1 and MAGIC telescopes.The observations span the period between October 2020 and March 2021, and taken in wobble observation mode, with the source position offset by 0.4 • from the camera center.After every 20-min long run, the direction of the offset was flipped to maintain consistency between the source and the background control region.Only data in which the pointing direction of both systems matched within 0.1 • were used.The data was taken at low and medium zenith angles, namely 0.8 hrs in between 12 • − 30 • , 2.3 hrs in 30 • − 45 • and 0.6 hrs in between 45 • − 53 • .

MC simulations
For most of the analysis chain we re-use the same MC simulation samples of protons and gamma rays as in Abe et al. (2023).However, we also used additional simulations of helium and electrons, with reoptimized scaling of the simulation parameters (maximum impact parameter and offset angle from the camera center) with zenith angle distance, to improve the sample com-pleteness.The samples were generated at fixed pointings along the path of the source in the sky (training samples), or to cover the full-sky on a grid of pointings (test samples), see Abe et al. (2023) for details.All the MC samples are generated with spectral index of −2 and reweighted to specific particle spectra.The productions are summarized in Table 3.In the interest of studying the performance with MC as well, we divided the "TrainProton" sample into training and testing sub-samples.

Data/MC comparisons
To ensure the correct reproduction of observed data by the MC simulations, we performed an end-to-end comparison with the data.As the gamma-ray showers are more regular and (on average) less extended than hadronic ones, comparisons performed with gamma-ray events are sensitive to possible data-MC mismatches.Hence, we present a comparison with selected gammaray events, which also reflect the performance for gamma-ray observations.Nevertheless, for completeness of the study and to validate the analysis threshold, we also performed similar comparisons with the background events (see D).
We derived the parameter distributions obtained from the gamma-ray excess events.The distributions are extracted from Crab Nebula observations after subtraction of the residual background using a background control region.We compared these excess distributions to the simulated gamma rays weighted (and normalized) according to the spectrum that was measured by Aleksić et al. (2015).In this approach, the gamma-ray events are dominated by a background of much more abundant hadronic showers.Thus, to avoid large statistical (and systematic) errors, some kind of background suppression needs to be applied.In order to perform the comparison without introducing a large bias, we applied soft cuts corresponding to 95% "gammaness" efficiency (in each estimated energy bin), and considered only events with reconstructed direction up to 0.2 • away from the nominal source position.
The results of the comparison are shown in Fig. 3.The "intensity" distribution is roughly reproduced.We note that the MC simulation shape of the distribution of these parameters (in particular, the intensity) is dependent on the assumed spectral model of the Crab Nebula.The length distribution is well matching between the data and MCs.However, contrary to the background case (c.f.D), the width parameter is slightly underestimated in the case of MAGIC-I and MAGIC-II telescopes, which could be due, for instance to insufficiently accurate simulations of the optical PSF.Despite this, the "gammaness" distribution, both for individual telescopes and average, is still sufficiently well reproduced in the simulations to avoid introducing large systematic errors.The reconstructed height of the shower maximum is slightly shifted towards higher values in MC simulations than in the data.This could be due to a combination of various effects, for instance: a systematic uncertainty on the energy scale of the telescopes, a mismatch between the zenith and azimuth distributions, which are continuous for the data and discrete for the simulations, or a slight mispointing of the telescopes.Finally, while the reconstruction of the event direction is roughly consistent with the simulations, a slight increase of the high-values tail in the data is present as well.Similarly, a slight mismatch in such distributions has been observed in LST-1-alone and MAGICalone observations, and might be related to arcmin-scale mispointing of the telescopes (Aleksić et al. 2016b;Abe et al. 2023).

Performance parameters 388
Using Crab Nebula data and MC simulations, we evaluated vari-389 ous performance parameters of the joint analysis chain and com-390 pared those with the MAGIC-only analysis.We also compared 391 the sensitivity and flux reconstruction with the LST-1-only anal-392 ysis.As the performance of Cherenkov telescopes is strongly de-393 pendent on the zenith distance of the pointing, we investigated 394 the case of Zd < 30 • and 30 • < Zd < 45 • separately to provide 395 a comparison with the MAGIC-only performance.In Fig. 4, we present the differential true energy distribution for a 398 source with a −2.6 spectrum.In the case of MAGIC-only events, 399 the energy threshold (peak position of that distribution) at the 400 stereoscopic reconstruction level of ∼ 70 GeV is consistent with 401 the value obtained in Aleksić et al. (2016b).The addition of the 402 third telescope, while it cannot provide additional events at the 403 trigger level, can recover events in which one of the MAGIC im-404 ages displays an intensity level that is too low for further stereo-405 scopic reconstruction.As a result, the energy threshold at the 406 reconstruction level is reduced to ∼ 60 GeV.Additionally, the 407 collection area below the energy threshold is greatly improved, 408 by a factor of about 2 at 30 GeV.  2023), for the spectral analysis we 412 apply an increased cut for the intensity of > 80 p.e. for LST-413 1 images.Due to the significantly larger light yield of LST-1 414 compared to MAGIC, the effect of this cut on the stereoscopic 415 analysis is very small (e.g., for low zenith angle observations at 416 30 GeV only 10% of events are removed).For MAGIC images, 417 a standard intensity of > 50 p.e quality cut was applied.To re-418 construct the spectrum of the Crab Nebula, we derived the IRFs 419 for a number of simulated azimuth and zenith pointings close to 420 those followed by the source during the observations.To eval-421 uate the IRFs corresponding to individual data runs we employ 422 interpolation.We then divided the sample into an ascending and 423 descending branch (i.e., before and after the culmination).For 424 each branch separately, we performed a one-dimensional (1D) 425 interpolation (over the cosine of zenith distance angle) of the 426 IRFs.Subsequently a global, binned, joint likelihood spectral 427 fit was performed with gammapy 0.20.1 (Deil et al. 2017;Do-428 nath et al. 2022) to determine the best parameters of the spectral 429 model for a point-like source at the nominal position of the Crab 430 Nebula.Next, the same software was used to derive individual 431 spectral points by fitting the normalization of the global model 432 in energy bands.In Fig. 5, we present the resulting spectral en-433 ergy distribution reconstructed between ∼60 GeV and ∼10 TeV 434 from the total investigated data set.The spectrum is modeled in 435 gammapy with a log parabola spectrum defined as: with A = (3.48± 0.09 stat ) × 10 −11 cm −2 s −1 TeV −1 , E 0 = 1 TeV, 437 α = 2.49±0.03stat , β = 0.117±0.017stat6 .The resulting spectrum 438 is consistent with previous MAGIC and LST-1 measurements 439 within ∼ 10%.for the run-by-run calculations and 13.1/5 for night-by-night).
Such an observed instability of the flux is likely due to the systematic effects related to, for instance, the atmosphere varying during the observations.We investigated how the χ 2 /N dof statistics changes when a given level of systematic uncertainties is added in quadrature to the statistical uncertainty.To achieve the corresponding fit probability of 0.5, an additional 12.7% systematic uncertainty is required in the case of run-by-run analysis, as well as 7.9% in the case of night-by-night, which is at the level or even lower than what was estimated for MAGIC.

Differential flux sensitivity
Sensitivity is a measure of the minimum flux of a source that can be detected with an instrument in a given time exposure.In the case of differential sensitivity, the detection should be achieved independently in a particular energy bin.We followed the definition typically used in the IACT community, namely, the data are divided into 5 bins per decade of reconstructed energy and the event statistics are rescaled to 50 hrs of observation time.The sensitivity in a given bin then corresponds to the gamma-ray flux from a point-like source that provides 5σ significance signal, as computed via Eq.17 of Li & Ma (1983), with additional two conditions: the number of excess events should be greater than ten and also larger than 5% of the residual background.In the calculations of the significance, we assume that the background can be computed from five regions with the same acceptance of the signal region.
In order to optimize the usage of statistics, we applied a kfold cross-validation procedure (see e.g., Mosteller and Tukey 1968).Specifically, the sample is divided into four sub-samples, each of them using every fourth event in the sample.For each of them, we apply cuts in arrival direction and "gammaness," com- puted using the remaining sub-samples and optimized to pro-480 vide the best sensitivity.We then stack the events from all sub-481 samples and compute the final sensitivity that is not biased by 482 the cut selection.

483
We estimated the sensitivity both with the Crab Nebula 484 data and also with MC simulations.In the latter case, we used 485 the spectrum derived by Aleksić et al. (2015) to convert the 486 flux into Crab Nebula units (C.U.).In the calculations we as-487 sume the proton flux of Yue et al. ( 2019) and the electron 488 flux is a parametrized combination of Fermi-LAT and H.E.S.S. 489 all-electron spectrum applying the parametrization of Eq. 2 in 490 Ohishi et al. (2021).

491
We followed the approach of Aleksić et al. (2012) to include 492 the effect of the other elements.Specifically, we used helium 493 simulations and scaled the spectrum to 0.8 of the proton spec-494 trum to also roughly take the heavier elements into account.It 495 should be noted, however, that helium and higher elements have 496 very little effect on the sensitivity.As it can be seen in Fig. D.1, 497 at high "gammaness" values, their contribution to the estimated 498 background is smaller by about an order of magnitude than the 499 one of protons (see also Sitarek et al. 2018).It also drops very 500 fast with estimated energy.High rejection of proton events via 501 the "gammaness" cut at high energies results in severely reduced 502 background statistics.Therefore, we collected the background 503 statistics from the inner 1 • radius region and apply an energy-504 dependent correction factor between average proton density in 505 this region and the density at camera offset of 0.4  tion factor is computed using a loose (corresponding to 94% effi-507 ciency for gamma rays) "gammaness" cut and is typically ∼ 1.4.

508
When calculating the sensitivity using the Crab Nebula data, ing for a more efficient rejection of the background events.Second, the number of detected gamma-ray events is also increased.

541
Since the observations are performed in software-coincidence 542 mode, the trigger-level collection area cannot be increased with 543 the addition of the LST-1.However, during the regular analysis 544 of data from MAGIC, a fraction of the images do not survive 545 the quality cuts (small showers producing < 50 p.e. are typically rejected).In the MAGIC-only analysis the rejection of either M1 or M2 image is equivalent to the rejection of the whole event, since it is not possible to perform stereoscopic reconstruction with only one telescope.However the LST-1 image makes it possible to recover these kinds of events (as LST-1+M1 or LST-1+M2 event).On average, about 20% of the reconstructed gamma events has only one image from either M1 or M2 (see Table 2).
Finally, we also compared the joint analysis performance achieved with the Crab observations with the one obtained with MC simulations.In the case of low-zenith observations, the sensitivity obtained with data and MC simulations is compatible in the whole energy range.It should be noted that the data curve is based on only 0.8 hr of data (all the available Crab Nebula joint observation data taken with zenith distance below 30 • in the data period described by the used MCs); thus, it is affected by large statistical uncertainties.In the case of medium zenithangle observations (where also the statistical uncertainty of the sensitivity is smaller), a ∼ 30% mismatch is seen above a few hundred GeV.It might be caused by simplifications used in the MC simulations, or by an incompleteness in the background MC sample (e.g., missing events with large impact or angular offset, a simplification in the treatment higher elements, etc.).It should also be noted that a similar, ∼ 20% mismatch in sensitivity has been observed as well in the earlier simulations of the MAGIC telescopes (Aleksić et al. 2012, see also Arcaro et al. 2023).

Angular resolution
To evaluate the angular resolution of the joint LST-1+MAGIC analysis in every energy bin, the gamma-ray excess was computed as a function of the angular separation to the nominal source position.To evaluate it in typical circumstances, we apply a 90% efficiency cut in "gammaness" (the same cut as used for the derivation of the spectral energy distribution of the source).We followed the commonly used definition of the angular resolution as the angular distance from the source that corresponds to 68% containment of the point spread function (see Fig. 9).To facilitate data and MC comparisons, we used reconstructed energy in both cases, assuming that at 0.4 • , the containment is already 100% (MC simulations show that the actual containment at 0.4 • is 96% in the lowest plotted energy bin, 63 -100 GeV).
For the MAGIC-only analysis, the angular resolution is slightly higher (∼ 10%) in the MCP chain, as compared to the dedicated MARS analysis.It should be noted that the MARS analysis software is optimized for two-telescope observations and it employs slightly different shower reconstruction techniques than ctapipe.As a result, at medium and high energies, the angular resolution obtained in the joint analysis with MCP is still similar to the one from MAGIC-alone observations and MARS chain.However, when comparing instead the joint analysis to MAGIC-only analysis with the same chain, a slightly improvement is seen.At the lowest energies the joint analysis reaches a ∼ 10% improvement even with respect to the MARS analysis of MAGIC-only data (only visible in MC curves, as the statistics of the Crab Nebula sample are insufficient for precise evaluation in this energy range).This is likely due to the large fraction of dim images at those energies, which are better reconstructed with LST-1 than with MAGIC due to the higher light yield.
We also investigated whether further improvements can be achieved by selecting only the events in which all three images   7. Sensitivity in the units of Crab Nebula flux percentage of the joint LST-1+MAGIC analysis with MCP from the Crab observations (blue circles) and MC simulations (orange squares) compared to MAGIC-only analysis using MARS (green triangles) and LST-1 sensitivity (red crosses, Abe et al. 2023).Left panels: Zenith distance range of < 30 • , while the right panels for 30 − 45 • .Bottom panel: Ratio of sensitivity values of the joint analysis sensitivity to MAGIC-only (green triangles) and LST-1-only (red crosses), as well as with respect to the MC sensitivity performance (orange squares); e.g., the green points shows that the three-telescope system, relative to MAGIC alone, can in average detect sources with a flux ∼ 30% lower.For visibility in the bottom panels, the points are shifted by ±2% on the X scale.are present.A comparison of the red and orange curves in Fig. 9 607 indicates that such an improvement is negligible.

615
The small distance between the LST-1 and MAGIC telescopes 616 allows both instruments to see the same showers and to perform 617 joint analysis, but it can also be used to compare the light yield 618 of the telescopes (see Hofmann 2003).By selecting events seen 619 at similar impact parameters by LST-1 and one of the MAGIC 620 telescopes, we can compare the light yield of both instruments.In Fig. 10, we make such a comparison after applying angular and "gammaness" cuts to select gamma-ray events.The ratio of the total intensity of the LST-1 images to that of MAGIC-I (MAGIC-II) is 2.99 (2.60), with a standard deviation of 0.57 (0.37).This is in a rough agreement with the expectations from the larger mirror area and higher photodetector QE (see Table 1).
In order to investigate more accurately the possible relative miscalibration of the two instruments, we applied a procedure similar to that used in Aleksić et al. (2016b).Specifically, we compared the energy estimated using MAGIC camera image parameters to the one estimated for LST-1 using gamma-ray excess events.In both cases, stereoscopic parameters (impact and height of the shower maximum) were used as well.To avoid any bias present close to the energy threshold, we only used events in the estimated energy range of 0.3 -3 TeV, where the energy resolu- Comparison of the angular resolution (defined as 68% containment of gamma rays) as a function of the estimated energy of the joint LST-1+MAGIC analysis (orange) from the Crab observations (points) and MC simulations (line) compared to MAGIC-only analysis using MARS (blue) or using MCP (green).Joint analysis in which only all-three telescope events are kept is shown in red (in most of the energy range the curve merges with the orange one).Left panel is for observations at low zenith angle, right panel is for the medium zenith-angle distance.tion is almost constant and energy bias is negligible.We investi-637 gate the relative miscalibration of the two instruments as a func-638 tion of time or zenith distance of the observations (see Fig. 11).

639
The obtained difference in calibration of both instruments is be-

Energy resolution
In the absence of an external calibrator, the energy resolution of an IACT can only be derived using MC simulations.To evaluate the performance of the energy reconstruction, we divided the data into bins of true energy, E true , and in each bin, we determined the distribution of the energy dispersion, (E est − E true )/E true .We defined the energy bias as the median of this distribution (while confirming that using the mean instead would not affect the estimation considerably).Similarly to the angular resolution, we define the energy resolution as the 68% containment area of the dispersion.We first compute the difference between the median of the distribution and the 16% and 84% quantiles, which correspond to the underestimation and overestimation of the energy estimation.Next, we computed the 68% quantile energy resolution as the average (weighted with inversed variance) of these underestimation and overestimation components.For comparison we also computed the energy resolution as the standard deviation of a Gaussian fit to the (E est − E true )/E true distribution excluding the tails.The results are shown in Fig. 12 and summarized in Table E.3.
The energy estimation has nearly no bias down to ∼ 150 GeV.At the lowest energies, the energy resolution is improved with respect to MAGIC-alone observations (cf.Ishio & Paneque 2022).The energy resolution in the medium energy range is ∼ 14%.At multi-TeV energies, the energy resolution starts to worsen.The 68% containment definition provides very similar results as a narrow fit definition.As the standard deviation is increased by outliers, this definition reports worse energy estimation, in particular at the highest energies.It should be noted that most of the events above a few TeV in the joint analysis are not fully contained in the camera (i.e., they have pixels surviving the image cleaning in the outermost ring of the camera).We also note that the performance of the energy estimation at the highest energies is strongly affected by the details of the training.In particular, it varies whether the training is done on diffuse gamma rays or with gamma rays produced in a ring with radius equal to the expected offset used during observations of point-like sources.It also depends on the event statistics used in the training.The energy difference is normalized to weighted average energy obtained from all the telescopes (see Section 2.4).A 90% "gammaness" efficiency cut, 0.2 • angular cut between the nominal and reconstructed source position and 0.3 -3 TeV estimated energy cut are used.Each point corresponds to average from one run (with the error of the mean reported as error bar).Dependence on the zenith angle of the observation is shown in the left panel and on the LST-1 run number (increasing with time) is shown in the right panel.• .The resolution is calculated as an average 68% quantile (orange).Standard deviation of the energy misreconstruction is shown in green.The red line is the resolution value obtained from a tail-less (performed in the range mean ± 2 standard deviations) Gaussian fit.The bias of the energy estimation ((E est − E true )/E true ) is shown in blue.The thickness of the lines represents the statistical uncertainty.A "gammaness" efficiency cut of 90%, angular distance of < 0.2 • and "intensity" of > 50 p.e. cuts are applied.
In order to validate the accuracy of the claimed energy es-690 timation, we performed a similar test as the one described in

Comparison to single instrument analysis
The systematic errors affecting measurements performed with IACTs stem both from the complex hardware and from the imperfect knowledge of the atmosphere status, which is considered as part of the detector.The systematic uncertainties of MAGIC are divided into three categories: uncertainty of the energy scale of 15%, pure flux normalization uncertainty of 11 − 18% and uncertainty of the spectral slope of ±0.15 for an assumed powerlaw spectrum (Aleksić et al. 2016b).In the case of LST-1 standalone observations, the uncertainty of the background estimation plays an important role at the lowest energies (Abe et al. 2023) because single-telescope observations are characterized by much higher background rates (note: no detailed study of the systematic uncertainties of LST-1 has been performed yet).
In the joint analysis, we combined the data from two different IACT systems.Such a combination might on one hand increase the systematic uncertainties, in particular, due to simplifications needed to merge the simulation and analysis chains for both instruments.On the other hand the resulting systematic uncertainties might also decrease because of averaging.Moreover, the atmospheric transmission is one of the main sources of systematic uncertainty for IACTs (Aleksić et al. 2012) and it can also vary on different time scales.Due to the physical proximity of MAGIC and LST-1, both instruments share the same contribution from the atmospheric transmission uncertainty.
While some energy miscalibration between LST-1 and MAGIC has been observed, it is within the claimed systematic uncertainties of MAGIC.Moreover, the fact that the reconstructed spectrum agrees closely (within ∼ 10%) with the MAGIC and LST-1 standalone results further supports that the systematic uncertainties of the joint analysis are not larger than those of individual instruments.The light curve stability analysis requires 12.7% (7.9%) relative systematic uncertainties to be consistent with a constant flux on run (day) timescales.Those numbers are remarkably similar to those obtained from such a study for MAGIC-only data: 11% on run-by-run (Aleksić et al. 2016b) and 7.6% on day-to-day (Ahnen et al. 2017).In the case of LST-1-alone observations, the required variable systematic uncertainties on day-to-day time scale are even slightly lower: 6-7% (Abe et al. 2023); however, that study was limited only to low zenith distance observations.Therefore, we conclude that the systematic uncertainties of the joint analysis are similar to those of MAGIC.

Conclusions
In this work, we introduce a new pipeline enabling the joint analysis of LST-1 and MAGIC data.The LST Collaboration is aimed at performing 50% of the observations together with MAGIC in the coming years.The chain provides common stereoscopic analysis of images corresponding to the same physical shower.
This study is also a path-finder for a stereoscopic analysis chain to be applied to the future CTAO arrays.As the events are matched by arrival time from the individual LST-1 and MAGIC observations, the rate of events cannot be increased at the trigger level.However, the addition of the LST-1 allows us to reconstruct 20% more events in which one of the MAGIC images is rejected during reconstruction -and, thus, where no stereoscopic analysis is possible with MAGIC only.The analysis provides an improvement of the energy threshold at the reconstruction level by ∼ 15% with respect to MAGIC-only observations.As a result of such recovered events, and of the improved background rejection for events seen by all three telescopes, the performance of joint observations is greatly improved.In particular, the minimum detectable flux is 30% (40%) lower than MAGIC-alone (LST-1-alone) analysis.Since in the medium energy range the sensitivity is inversely proportional to the square root of observation time, this corresponds to 2-fold (nearly 3-fold) shortening of the required observation time to reach the same performance of MAGIC-only (LST-1-only) observations.Therefore the two systems work more efficiently together than individually and joint observations are highly valued.
The other performance parameters, in particular, the angular and energy resolution are not strongly affected by the addition of the LST-1 telescope, only minor improvements are seen at the lowest energies.The presented analysis is designed to improve the collection area of MAGIC-alone observations by allowing for the inclusion for events that would not survive the standard analysis chain of MARS due to one of the images not surviving the quality cuts.The additional reconstruction of these kinds of worse-quality events partially contributes to this lack of sig-777 nificant improvement.It is also possible that the peculiar geo-778 metrical placement of the three telescopes (obtuse-angled trian-779 gle) worsens the efficiency of the shower reconstruction.More-780 over the comparisons are performed with the MARS analysis of 781 MAGIC data, optimized for the two-telescope case, and cannot 782 be fully scaled to multiple telescopes.Further optimization of 783 the analysis for improved angular or energy resolution is possi-784 ble, possibly at the price of decreased collection area.

785
We performed comparison checks, both against the MAGIC 786 standard simulation software as well as comparisons of the 787 simulation results to the data and showed a rather good agree-788 ment.The Crab Nebula spectrum obtained from the joint 789 analysis can be described as dN/dE = (3.48± 0.09 stat ) × 790 10 −11 (E/1TeV) −2.49±0.03stat −(0.117±0.017stat ) ln(E/1TeV) cm −2 s −1 TeV −1 .791 It is within about 10% of the earlier measurements.Also, the 792 derived flux stability is comparable to the one of MAGIC, 793 pointing to similar systematic uncertainty of the joint analysis.

Fig. 1 .
Fig.1.Location of the LST-1 and MAGIC telescopes.The X and Y axes represent the geographical East and North direction, respectively.The diameter of the circle is equal to the diameter of the telescope's mirror dish.

62 MAGIC
is a system made up of two IACTs, separated by a dis-63 tance of 85 m.The first telescope, MAGIC-I (M1), was con-64 structed in 2003, while MAGIC-II (M2) was added in 2009.65 has been used in the first H.E.S.S. stereoscopic data, H. E. S. S. Collaboration 2006).Due to the different electronic pathways and different travel times of the Cherenkov light to individual telescopes, a pointing-dependent time delay between

264
the case of LST-1, the development of showers is simulated us-265 ing CORSIKA(Heck et al. 1998), while the response of the 266 telescope is simulated with the sim_telarray program (Bern-267 löhr 2008).On the other hand, within MAGIC, MC simulations 268 of showers are generated using a slightly modified version of 269 CORSIKA, but the response of the telescopes is obtained us-270 ing MagicSoft programs (reflector and camera) (Majumdar 409 4.2.Flux reconstruction 410 Since all the data used in this work were taken before August 411 2021, following Abe et al. (

Fig. 3 .Fig. 4 .
Fig.3.Comparison of image parameters between the gamma-ray excess in the data (blue) and MC simulations of gamma rays (orange).Only observations with zenith distance below 30 • are used.The top four rows of panels show intensity, length, width, and individual telescope "gammaness" (from top to bottom) for LST-1 (left), MAGIC-I (middle), and MAGIC-II (right).The bottom row shows stereoscopic parameters: height of the shower maximum (left), averaged "gammaness" (middle) and squared reconstructed distance to the source (right).

Fig. 5 .
Fig. 5. Spectral energy distribution of Crab Nebula obtained with joint LST-1+MAGIC analysis (blue points and fit line, with the statistical uncertainty of the fit shown as shaded region) compared to reference measurements from MAGIC-alone (orange dashed line, Aleksić et al. 2015), LST-1-alone (red dot-dashed, Abe et al. 2023), Fermi-LAT+LST-1 (violet long-dashed, Abe et al. 2023), and Fermi-LAT+IACT (green dotted line, Meyer et al. 2010).The bottom panel shows the ratio of the spectral model derived with the joint analysis to the individual reference spectra (see the legend).

Fig. 6 .
Fig. 6.Integral flux of Crab Nebula obtained with joint LST-1+MAGIC analysis binned day-by-day (blue empty points).The horizontal line shows the corresponding average flux from the integrated spectral model.

509
Fig. 8.The MAGIC-only sensitivity curve has been derived from 525

Fig.
Fig. 7. Sensitivity in the units of Crab Nebula flux percentage of the joint LST-1+MAGIC analysis with MCP from the Crab observations (blue circles) and MC simulations (orange squares) compared to MAGIC-only analysis using MARS (green triangles) and LST-1 sensitivity (red crosses,Abe et al. 2023).Left panels: Zenith distance range of < 30 • , while the right panels for 30 − 45 • .Bottom panel: Ratio of sensitivity values of the joint analysis sensitivity to MAGIC-only (green triangles) and LST-1-only (red crosses), as well as with respect to the MC sensitivity performance (orange squares); e.g., the green points shows that the three-telescope system, relative to MAGIC alone, can in average detect sources with a flux ∼ 30% lower.For visibility in the bottom panels, the points are shifted by ±2% on the X scale.
scale calibration is one of the main prob-610 lems affecting the observations with IACTs.While current 611 IACTs claim the systematic uncertainty on the energy scale of 612 ∼ 15% (Aleksić et al. 2016b), for the next generation LST-1 a 613 calibration down to 4% is possible, but it requires taking into 614 account ∼ 20 individual systematic effects (Gaug et al. 2019).
Fig.9.Comparison of the angular resolution (defined as 68% containment of gamma rays) as a function of the estimated energy of the joint LST-1+MAGIC analysis (orange) from the Crab observations (points) and MC simulations (line) compared to MAGIC-only analysis using MARS (blue) or using MCP (green).Joint analysis in which only all-three telescope events are kept is shown in red (in most of the energy range the curve merges with the orange one).Left panel is for observations at low zenith angle, right panel is for the medium zenith-angle distance.

Fig. 10 .
Fig. 10.Ratio of the reconstructed image intensities of LST-1 and MAGIC telescopes (blue: MAGIC1, orange: MAGIC2).Each point represents a single event.Only events with "gammaness" above 0.8 and angular distance to the Crab of below 0.1 • , as well as the reconstructed impact parameter to both telescopes within 10 m are used.Only data with zenith distance angle below 45 • , reconstructed impacts below 150 m and reconstructed energy above 300 GeV are used.
640 tween ∼ 6 − 16%, comparable to the systematic uncertainty on 641 the energy scale of MAGIC.LST-1 estimates a higher energy 642 than MAGIC, suggesting either an underestimation of the light 643 collection efficiency of LST-1 in MC simulations or an over-644 estimation in the case of MAGIC.No clear evolution in time 645 is seen, but a hint of zenith dependence (with the miscalibra-646 tion decreasing at medium zenith) is observed.Indeed, a con-647 stant fit to LST-1 to MAGIC-I (II) zenith dependence provides 648 χ 2 /N dof = 48.8/14(χ 2 /N dof = 49.5/14),while a simple linear fit 649 improves the χ 2 /N dof values to 27.7/13 (33.0/13). 650

Fig. 11 .
Fig. 11.Relative difference of estimated energy from LST-1 and MAGIC (blue for MAGIC-I, orange for MAGIC-II) camera images of the same event.The energy difference is normalized to weighted average energy obtained from all the telescopes (see Section 2.4).A 90% "gammaness" efficiency cut, 0.2 • angular cut between the nominal and reconstructed source position and 0.3 -3 TeV estimated energy cut are used.Each point corresponds to average from one run (with the error of the mean reported as error bar).Dependence on the zenith angle of the observation is shown in the left panel and on the LST-1 run number (increasing with time) is shown in the right panel.

Fig. 12 .
Fig. 12. Energy resolution obtained with MC simulations for gammaray showers at zenith angle distance of 23.6 • .The resolution is calculated as an average 68% quantile (orange).Standard deviation of the energy misreconstruction is shown in green.The red line is the resolution value obtained from a tail-less (performed in the range mean ± 2 standard deviations) Gaussian fit.The bias of the energy estimation ((E est − E true )/E true ) is shown in blue.The thickness of the lines represents the statistical uncertainty.A "gammaness" efficiency cut of 90%, angular distance of < 0.2 • and "intensity" of > 50 p.e. cuts are applied.

691
Fig. 13.Standard deviation of the relative difference in estimated energy between one of the MAGIC telescopes and LST-1 for the same event.Data, MC, and quality selection cuts as in Fig.11

Table 1 .
Comparison of LST-1 and MAGIC telescopes parameters.

Table 2 .
Percentage of different event types in different types of MC simulations and in the observations.Note: Only images surviving 50 p.e. cut in intensity are considered.Observations and MC simulations cover low zenith distance angle (< 30 • ).Proton MC are weighted to −2.7 spectral index, while gamma-ray MC to −2.6.Values for gamma-ray simulations are provided separated for showers at typical offset from the pointing direction (0.4 • ) and for isotropic distribution (within 2.5 • from the pointing direction) ing images to photoelectrons (p.e.) and individual pixel timing, 133 with the specific software of each instrument.Then the MAGIC 134 data are converted into HDF5 format, compatible with the LST-1 135 data using the dedicated ctapipe_io_magic package 1 .The rest 136 of the analysis chain is performed with the magic-cta-pipe 2 137 package using lstchain and ctapipe methods.In particu-138 lar, the magic-cta-pipe contains analysis scripts to, for ex-139 ample, apply the same image cleaning as in the MARS pack-140 age, but within a ctapipe-like environment and to match the 141 events produced by the same shower in the three telescopes.142 All the other higher-level analysis steps are performed with the 143 magic-cta-pipe package as well, with the help of other mod-144 ules for specific tasks (e.g., pyirf, Noethe et al. 2022, for the 145 calculation of instrument response function, IRF and gammapy 146 158

Apply RFs Calibration Image parametrization Stereo reconstruction sim_telarray MC DL1 + stereo (image + geometry) Test samples Train RFs Apply RFs Create the IRFs Create DL3 files MAGIC R0 (waveform) MARS Calibration Image parametrization MCP gammapy Fig
. 2. Schematic view of MCP analysis chain.Blue, red, violet, and gray boxed represent different stages of LST-1 data, MAGIC data, joint data, and MC simulations, respectively.Black boxes mark the auxiliary files.
• .The correc-506 Table E.1.Rates and sensitivity values for Crab Nebula observations and MC simulations at zenith distance < 30 • , as plotted in Figures 7 and 8 (left panels).Note: Rates are integrated in 0.2 decades centered on E 0 value.Data sensitivities are provided in the percentage of Crab Nebula flux, while MC sensitivities in SED units.Table E.2.As in Table E.1 but for zenith distance 30 − 45 • (see also Figures 7 and 8, right panels).Table E.3.Energy resolution at zenith distance 23.6 • corresponding to Fig. 12. Note: The columns report: true energy, 68% containment resolution, standard deviation (S.D.), and tail-less fit.