Free Access
Issue
A&A
Volume 625, May 2019
Article Number A10
Number of page(s) 8
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/201834938
Published online 30 April 2019

© ESO 2019

1. Introduction

The opening of new astronomical windows at multiple wavelengths in recent decades has made it evident that many astrophysical puzzles could be solved only by combining images obtained by different facilities. In the late 1970s a common format was developed to facilitate the image interchange between observatories, hence overcoming incompatibilities between the numerous operating systems. The Flexible Image Transport System (FITS) format was standardized in 1980 (Wells et al. 1981) and formally endorsed in 1982 by the International Astronomical Union (IAU), which in 1988 formed a FITS working group (IAUFWG) entrusted to maintain the format and review future extensions. In the mid-1990s the NASA High Energy Astrophysics Science Archive Research Center (HEASARC) FITS Working Group, also known as the OGIP (Office of Guest Investigator Programs), promoted multi-mission standards for the format of FITS data files in high-energy (HE) astrophysics and produced a number of documents and recommendations that were subsequently incorporated into the FITS standard format definition. Since its conception, the FITS format has been updated regularly to address new types of metadata conventions, the diversity of research projects and data product types. The last version of the FITS standard document (4.0) was released in 20181.

Today the FITS format is in widespread use among astronomers of all observing bands, from radio frequencies to gamma rays. For instance, the HE gamma-ray (E >  100 MeV) Large Area Telescope (LAT; Atwood et al. 2009), on board the Fermi satellite, publicly releases all its high-level analysis data in FITS format, which, processed with the science tools, are used to obtain scientific products as spectra, light curves and sky maps. However, as a branch of astroparticle physics, very-high-energy (VHE; E >  100 GeV) gamma-ray astronomy inherited its methodologies and standards from particle physics, where the ROOT2 framework (Brun & Rademakers 1997) and its associated file format is commonly used. Despite the common container format neither the internal data structure nor the software is shared among the different experiments. Very-high-energy gamma-ray astronomy is conducted by ground-based telescopes; among the most successful is the Imaging Atmospheric Cherenkov Telescopes (IACTs; de Naurois & Mazin 2015). Data from four of the currently operating IACTs were used in this project; i.e. the Major Atmospheric Gamma Imaging Cherenkov Telescopes (MAGIC; MAGIC Collaboration 2016a), the Very Energetic Radiation Imaging Telescope Array System (VERITAS; Holder et al. 2006), the First G-APD Cherenkov Telescope (FACT; Anderhub et al. 2013), and the High Energy Stereoscopic System (H.E.S.S.; Hinton 2004). Each of these is described in Sect. 2.

A new era in VHE gamma-ray astronomy is expected to start with the future Cherenkov Telescope Array (CTA; Acharya et al. 2013), the next generation IACT instrument, which is currently under construction. The future operation of CTA as an observatory calls for its data format and analysis software to be available to a wide astronomical community. This requirement led to a standardization of the IACT data format, adopting the FITS standard, and the development of open-source science tools, initiating the integration of the VHE discipline into multi-instrument astronomy. A first attempt to define a common data format for the VHE gamma-ray data is being carried out within the “Data formats for gamma-ray astronomy3” forum (Deil et al. 2017a, 2018). This is a community effort clustering together members of different IACT collaborations with CTA as driving force. In this paper we implement this prototypical data format for data samples by MAGIC, VERITAS, FACT, and H.E.S.S. and we combine, for the first time, observations by Fermi-LAT and these four IACTs relying on scientific analysis solely on open-source software, in particular on the gammapy 4 project (Donath et al. 2015; Deil et al. 2017b). We provide not only datasets but also all the scripts and an interactive environment to reproduce all the results. These resources are available on github5 and are referred to, from now on, as on-line material. This allows for the full reproduction of the results presented in the paper. The Crab nebula is selected as a target source for this work, being the reference source in the VHE gamma-ray astronomy (Aharonian et al. 2004, 2006; Albert et al. 2008; Aleksić et al. 2012; MAGIC Collaboration 2016b) owing to its brightness, apparent flux steadiness, and visibility from all the considered observatories.

The paper is structured as follows: We describe the selected datasets in Sect. 2, the process of extracting the spectral information with some considerations on the handling of statistical and systematic uncertainties in Sect. 3. In Sect. 4 we present the resources we use to ensure the analysis reproducibility, and in Sect. 5 we provide some prospects for the reuse of the methodologies of data release discussed and implemented in this work. This report is a technical paper to show the ease of multi-instrument results once the format standardization is reached. We do not seek to draw any scientific conclusion on the physics of the Crab nebula and of pulsar wind nebulae, in general.

2. Datasets

In contrast to other astronomical telescopes, instruments for gamma astronomy cannot directly scatter or reflect gamma rays, being photon-matter interactions dominated by pair production for Eγ >  30 MeV. The experimental techniques, either space-borne or ground-based (Funk 2015), rely on the direct detection of secondary charged particles or on the indirect detection of the Cherenkov emission of a cascade of charged secondaries they produce in the atmosphere. A detection, or event, cannot be unambiguously discriminated from the irreducible charged cosmic ray background, but can only be classified with a certain probability as a primary photon. Gamma-ray astronomy could therefore be labelled as “event-based” in contrast to “image-based” astronomy in which charge-coupled devices (CCDs) act as detectors. The input for the high-level analysis of gamma-ray astronomy data is typically constituted by two elements. The first is a list of events that are classified (according to selection cuts) as gamma rays, along with their estimated direction, P, estimated energy, E′, and arrival time. The second element consists of the instrument response functions (IRFs), quantifying the performance of the detector and connecting estimated quantities (E′, P) with their true, physical, values (E, P). The IRFs components are

  • Effective area, representing the effective collection area of the instrument, Aeff(E, P);

  • Energy dispersion, the probability density function of the energy estimator fE(E′|E, P);

  • Point spread function (PSF), the spatial probability distribution of the estimated event directions for a point source, fP(P|E, P).

Their formal definition is shared with lower energy instruments (e.g. x-ray; Davis 2001) and IRFs are computed from Monte Carlo simulations. Since the detector response is not uniform across the field of view (FoV), IRFs generally depend on the radial offset from the FoV centre (full-enclosure IRFs). When this dependency is not taken into account and a cut on the direction offset of the simulated events is applied, IRFs are suited only for the analysis of a point-like source sitting at an a priori defined position in the FoV (point-like IRFs). The components of IRFs, in this case, do not have a dependency on the event coordinate, P, but only on the energy and the PSF component is not specified. The differences between full-enclosure and point-like IRFs are illustrated in the interactive notebook 1_data.ipynb in the on-line material. For this publication we use all the datasets to perform a point-like analysis. In the IACT terminology, event lists and IRFs are dubbed Data Level 3 (DL3) products (Contreras et al. 2015). The datasets used in this work include observations of the Crab nebula by Fermi-LAT, MAGIC, VERITAS, FACT, and H.E.S.S. following the format specifications available in the “Data formats for gamma-ray astronomy” forum (Deil et al. 2017a). The IACT DL3 datasets were produced with proprietary codes that extracted the event lists and IRFs, and saved these in the requested format6. They are released in chunks, typically of 20–30 min, of data acquisition, named runs. The IACTs are ground-based, pointing instruments and their response varies with the observing conditions (atmospheric transmission, zenith angle, night sky background level, and position of the source in the FoV) therefore their data come with per-run IRFs. The Fermi-LAT telescope, orbiting around the Earth at ∼600 km, is generally operating in survey mode and has a stable set of IRFs shipped with the science tools. We use the Fermi-LAT science tools and gammapy to make this dataset spec-compliant. This work constitutes the first joint release of datasets from various instruments in VHE gamma-ray astronomy.

An interactive notebook illustrating the content of the datasets, 1_data.ipynb, is available in the on-line material. The datasets are presented in what follows in order of increasing instrument energy threshold (see Table 1).

Table 1.

Crab nebula datasets used in this work.

2.1. Fermi-LAT

The LAT detector, on board the Fermi satellite, is an imaging pair-conversion telescope, which has been designed to detect photons between 20 MeV and more than 300 GeV (Atwood et al. 2009). We analysed the publicly available observations spanning from 2008 August 8 to 2015 August 2, i.e. ∼7 yr of operations, in a 30° radius region around the position of the Crab nebula. We used all FRONT and BACK type events belonging to the source class with a reconstructed direction within 105° from the local zenith (to reject the emission from the Earth’s atmosphere) and a reconstructed energy between 30 GeV and 2 TeV. The lower energy cut was chosen to minimize the contamination of the Crab pulsar that is located at the centre of the nebula. We estimated that above 30 GeV the pulsar emission contributes to less than 10% to the detected flux and can be neglected given the technical purpose of the paper. An interactive notebook illustrating this calculation is available in the on-line material (5_crab_pulsar_nebula_sed.ipynb). To reduce the IRFs to a DL3-compliant format, we compute the PSF using gtpsf and estimated the effective area from the sky-coordinate and energy-dependent exposure computed with gtexpcube2. The effective area is simply the exposure scaled by the observation time. The energy dispersion at the energies considered in this analysis has an impact smaller than 5% on the reconstructed spectrum7. We approximate the energy dispersion with a Gaussian distribution with mean (bias) 0 and standard deviation 0.05 (resolution) independent of the estimated direction and energy. The event list and IRFs produced by the Fermi-LAT science tools are already FITS files and gammapy is used to store these in a DL3-compliant format.

2.2. MAGIC

A system of two 17 m diameter IACTs, MAGIC has 3.5° FoV and is located on the Canary island of La Palma, Spain at the Roque de Los Muchachos Observatory (2200 m above sea level). The first telescope worked in stand-alone mode from 2004 until 2009, when a second one started operations. In 2012 the two MAGIC telescopes underwent a major upgrade of the readout systems and the camera of the first telescope (MAGIC Collaboration 2016a) leading to a significant improvement of the instrument performance (MAGIC Collaboration 2016b). The FITS data released by the MAGIC collaboration includes 40 min of Crab nebula observations and their corresponding IRFs. The data were recorded in 2013, after the aforementioned major upgrade, at small zenith angles (< 30°) in wobble mode (Fomin et al. 1994) with the source sitting at an offset of 0.4° from the FoV centre. They are chosen from the data sample used for the performance evaluation in MAGIC Collaboration (2016b). The IRFs released by the MAGIC collaboration were generated using the MARS software (Zanin et al. 2013), for a point-like source response: i.e. they are calculated for a simulated source at 0.4° offset, applying a directional cut on the events direction of 0.14° around the source position.

2.3. VERITAS

A system with four 12 m diameter IACTs (Holder et al. 2006), VERITAS has 3.5° FoV and is based at the Fred Lawrence Whipple Observatory in Southern Arizona, USA. Since the start of full array operations in 2007, the sensitivity of VERITAS was improved by two major upgrades: the relocation of one of its telescopes in 2009 (Perkins 2009) and the upgrade of the camera with higher quantum efficiency photomultiplier tubes in 2012 (Kieda 2013). The DL3 data released by the VERITAS collaboration includes 40 min of archival observations of the Crab nebula taken in 2011, after the telescope relocation but prior to the camera upgrade. These observations were carried out in wobble mode with the standard offset of 0.5° at small zenith angles (< 20°). The released IRFs are valid for the analysis of point-like sources taken with the standard offset angle and were generated from events surviving the standard directional cut of 0.1°, using the VEGAS software package (Cogan 2008).

2.4. FACT

The FACT telescope (Anderhub et al. 2013; Biland et al. 2014) is a single IACT with a 4 m diameter reflective surface and a FoV of 4.5°, which is is located next to MAGIC at the Roque de Los Muchachos Observatory. The FACT telescope tests the feasibility of silicon photo-multipliers (SiPM) for use in VHE gamma-ray astronomy. It is the first fully automated Cherenkov telescope that takes data without an operator on site (FACT Collaboration 2018). For this work FACT released one full week of observations of the Crab nebula taken in November 2013, corresponding to 10.3 h8. The data were recorded in wobble mode with an offset angle of 0.6° at zenith angles smaller than 30°. The corresponding IRFs are of point-like type with a directional cut of 0.17°.

2.5. H.E.S.S.

The H.E.S.S. array is equipped with five IACTs and is located in Namibia on the Khomas Highland near Gamsberg mountain. The first four 12 m diameter telescopes, arranged in a square, became operational in December 2003 marking the start of what today is called H.E.S.S. Phase I with a FoV of 5°. Since July 2012 a fifth 28 m diameter telescope, located at the array centre, started operation (H.E.S.S. Phase II) both in stereoscopic and stand-alone mode. In this work, we used four observation runs of the Crab nebula carried out by H.E.S.S. Phase I in 2004, each of which has a duration of 28 min. These data were taken in wobble mode at zenith angles between 45° and 50°, half with a 0.5°, and the other half with a 1.5° offset angle. These data are the Crab runs part of the first FITS test data release (H.E.S.S. Collaboration 2018)9; H.E.S.S. released full-enclosure IRFs (Deil et al. 2017a), i.e. no directional cut is applied on the simulated events.

3. Data analysis

In this section we present a spectral analysis of the gamma-ray datasets described in Sect. 2. First, the gamma-ray event data and IRFs are reduced for each instrument (Sect. 3.1). Then, in Sect. 3.2, we perform a spectral likelihood fit, under the assumption of a log-parabola analytic model, for each datasets separately and for all the datasets together (joint fit). Finally, we present an analysis that includes a systematic error term, representing the uncertainty on the energy scale of each instrument, in a modified likelihood function.

3.1. Spectrum extraction

In order to estimate the energy spectrum of a gamma-ray source (, with Λ a the set of spectral parameters), a binned maximum likelihood method, with nE bins in estimated energy E′ is used. The observed data D for such a likelihood function are the number of events in a circular signal region (labelled as ON) containing the gamma-ray source and in a control region (labelled as OFF) measuring the background to be subtracted from the ON. Considering nruns observation runs from ninstr different instruments (or datasets), we can write the likelihood as

(1)

with each instrument contributing with a term, i.e.

(2)

where

  • Non, ijk and Noff, ijk are the number of observed events within the ON and OFF regions, respectively, in the energy bin k in the run j for the i-th instrument. These values are both characterized by a Poisson distribution;

  • gijk(Λ) and bijk are the expected number of signal and background events, respectively, in the energy bin k in the run j for the i-th instrument. The quantity gijk is computed with the forward folding technique: for a point-like analysis the assumed spectrum is convolved with the effective area and energy dispersion IRFs component. The quantity bijk, in absence of a background spectral model, is treated as a nuisance parameter and fixed to the value returning , for the mathematical details of gijk and bijk evaluation; see Appendix A in Piron et al. (2001).

  • αij is the ON to OFF exposures ratio, constant with energy in our case, in the run j for the i-th instrument.

The size of the circular ON region per each dataset is given as Ron in Table 1; the OFF region can be defined either as multiple regions mirroring the ON symmetrically with respect to the telescope pointing position (reflected regions background method in Aharonian et al. 2001; Berge et al. 2007) or as a ring around the source position (ring background method in Berge et al. 2007). Given the wobble mode observation strategy, and the small FoV, the reflected regions method is naturally suitable for the IACTs datasets. On the other hand, the ring background method is used for the Fermi-LAT datasets in this analysis with a circular signal region of 0.3° radius and a background ring with inner and outer radius of 1° and 2°, respectively. We choose, for all the instruments, 20 bins per decade for the estimated energy between 10 GeV and 100 TeV. For a given instrument i and run j the likelihood values are not computed in the energy bins outside the range [Emin,Emax] given in Table 1. The choice of the energy range for Fermi-LAT is already discussed in Sect. 2.1. For the IACT datasets, Emin is a safe energy threshold for the spectrum extraction computed by the collaboration software and hard coded in the DL3 files. This threshold is mostly dependent on the experiment performance and on the zenith angle of the observations. The FACT telescope has an energy threshold of 450 GeV, which is higher than MAGIC and VERITAS despite the observations carried out in the same zenith angle range because of its limited light-collection area and the single telescope observations. The larger zenith angle of the H.E.S.S. datasets is due to the low altitude at which the source culminates at the H.E.S.S. site. This yields an energy threshold of ∼700 GeV, which is higher than any other IACT. The maximum energy Emax is fixed to 30 TeV for all the IACTs and it is chosen to cover the whole energy range containing events.

The mean number of signal events, labelled as excess events, can be estimated via the equation Nex, ijk = Non, ijk − αijNoff, ijk. The distribution of the excess events in each estimated energy bin, summed over all the observational runs per each instrument is shown in Fig. 1. Table 1 also reports the total number of observed events in the ON region (NON = ∑ijkNon, ijk) per each experiment, and the number of background events in the ON region per each experiment, obtained scaling the OFF events with the exposures ratio αij (Nbkg = ∑ijkαijNoff, ijk).

thumbnail Fig. 1.

Histograms of the estimated mean number of signal events from the Crab nebula, number of excess events vs. estimated energy for each dataset.

Open with DEXTER

To perform a joint point-like analysis we reduced the Fermi-LAT and H.E.S.S. full-enclosure IRFs to a point-like format, removing the dependency from the source position in the FoV. For Fermi-LAT, under the assumption that the acceptance is uniform in a small sky region close to our target, we obtained a point-like effective area by taking the value at the source position. In each energy bin we corrected the effective area with a containment fraction computed integrating the PSF over the signal region. Similarly for the H.E.S.S. IRFs we took the value at the source offset and computed a correction based on the PSF containment fraction.

3.2. Spectral model fit

We assume a spectral model of log-parabolic form

(3)

since it was suggested as the best-function approximation for the Crab nebula spectrum in such a wide energy range (Aleksić et al. 2015). The spectral parameters Λ = (ϕ0, Γ, β) are left free to vary in the fit while the reference energy E0 is fixed at the value of 1 TeV. We refer to the result of the maximum likelihood using all the instrument datasets as joint fit. As a consistency check we also fit each instrument dataset separately (i fixed in Eq. (1)). The reference energy E0 is typically chosen to minimize the correlation between the other spectral. In this work instead, in order to directly compare the parameters Λ also for the fit with the individual instrument datasets, E0 is kept fixed at the same value of 1 TeV. This introduces larger errors and correlation for the datasets for which such value is close to one of the extremes of its energy range. The resulting spectral energy distributions (SEDs; E2dϕ/dE) are shown in Fig. 2, together with a theoretical model taken from Meyer et al. (2010). The values of the fit parameters are listed in Table 2. The joint fit inherently comes with an increase in statistical power, as evidenced by the shrinking of the confidence contours of the fitted spectral parameters for the joint fit in Fig. 3. We note that the Crab SED shape is not exactly represented by a log parabola across the 30 GeV to 20 TeV energy range, which is one reason for differences in the measured fit parameters from the different experiments. An interactive summary of the spectral results is available in the on-line material (2_results.ipynb).

thumbnail Fig. 2.

Crab nebula SED for individual instrument fits and from the joint fit. Single-instrument results are represented with dashed lines, the fit of all the datasets together, labelled as joint, is represented as a thick, solid red line. The shaded areas represent the SED error bands whose calculation is explained in Sect. 3.2. The dotted line shows the model in Meyer et al. (2010).

Open with DEXTER

thumbnail Fig. 3.

Likelihood contours corresponding to 68% probability content for the fitted spectral parameters (ϕ0, Γ, β), for the likelihood in Eq. (1). Results from the individual instruments and from the joint-fit are displayed.

Open with DEXTER

Table 2.

Spectral model best-fit parameters, as defined in Eq. (3).

The statistical uncertainty on the SED is estimated by using a sampling technique to propagate the errors from the fit parameters. We assume that the likelihood of the model parameters Λ is distributed according to a multivariate normal distribution with mean vector μ and covariance matrix Σ defined by the fit results. We assume , values of the fitted parameters and , covariance of the fitted parameters. We sample this distribution and compute the spectrum realization corresponding to each sampled Λ. The ±1σ uncertainty on the fitted spectrum is estimated by taking, at a given energy, the quantiles of the fluxes distribution that returns a 68% containment of all the realizations. The upper and lower limits of the error band estimated with our method are plotted in black against 100 realizations of the spectrum with sampled parameters in Fig. 4, for the example case of the VERITAS datasets.

thumbnail Fig. 4.

Error estimation methods for the measured SED using the VERITAS dataset, as the example case. The solid black lines indicate the upper and lower limits of the error band estimated with the multivariate sampling. These lines represent the 68% containment of 500 spectral realizations (100 shown as grey lines) whose parameters are sampled from a multivariate distribution defined by the fit results.

Open with DEXTER

3.3. Systematic uncertainties on different energy scales

Spectral measurements in gamma-ray astronomy are affected by multiple sources of systematic uncertainty. The DL3 data contains systematic uncertainties that originate from an imperfect modelling of the atmosphere, telescopes, and event reconstruction, resulting in a shift of the reconstructed energy scale and errors in the assumed IRF shapes. A second source of systematic error comes from the data reduction and model fitting, for example due to energy binning and interpolation effects, as well as from source morphology and spectral shape assumptions. Generally two approaches are used to evaluate and report systematic errors (Conrad et al. 2003; Barlow 2017): multiple analyses or bracketing as in Aharonian et al. (2006), Aleksić et al. (2012) or modified likelihood with nuisance parameters (Dickinson & Conrad 2013; Dembinski et al. 2017; Ballet et al. 2018). The first approach leads to the estimation of an overall systematic error on each spectral parameter, for instance for the flux normalization ϕ0 ± σϕ0, stat. ± σϕ0, syst., whereas the second method yields a global error including both statistical and systematic uncertainties, i.e. ϕ0 ± σϕ0, stat.+syst.. As an example of how to treat systematic errors, we present an analysis with a modified likelihood that includes the uncertainty on the energy scale. Following Dembinski et al. (2017), we define a new joint likelihood function that includes a constant relative bias of the energy estimator per each instrument zi, characterized by a Gaussian distribution with mean 0 and standard deviation δi ( in the following notation). δi represents the systematic uncertainty on the energy scale estimated by a single instrument. This parameter is defined as , where is the energy reported by an instrument and E the actual energy of each single event. The apparent spectral model we aim to fit for a single instrument would then be

(4)

and the overall joint likelihood is modified in

(5)

where now the energy biases are included in the spectral parameters to be fitted: Λ = (ϕ0, Γ, β, z1, ..., znisntr) and the energy spectrum in Eq. (4) is used to predict the gijk via forward folding. As in Eq. (1), i runs over the instruments, j on the runs, and k on the energy bin. The inclusion of the energy biases also allows, in addition to the variation of the global spectral parameters ϕ0, Γ, and β (the same for all datasets), an instrument-dependent energy adjustment (a shift) of the assumed model through the individual zi. This shift is not arbitrary: it is, in fact, constrained by its Gaussian distribution with a standard deviation given by the systematic uncertainty on the energy scale provided by the single experiment, δi. Hereafter we refer to this likelihood fit as stat.+syst. likelihood that is the generalized version of Eq. (1) (obtainable from Eq. (5) simply fixing all zi = 0). The result of the stat.+syst. likelihood joint fit is shown in Fig. 5 in blue against the result of the stat. likelihood (Eq. (1)) fit in red. We note that in this work we only account for the energy scale systematic uncertainty, as an example of a modified likelihood. A full treatment of the systematic uncertainty goes beyond the scope of this paper. It is possible to reproduce interactively the systematic fit in the on-line material (3_systematics.ipynb).

thumbnail Fig. 5.

Likelihood contours corresponding to 68% probability content for the fitted spectral parameters (ϕ0, Γ, β), for the likelihood in Eq. (1) (red) and the likelihood in Eq. (5) (blue). Results from the individual instruments with the likelihood in Eq. (1) are shown in grey.

Open with DEXTER

4. Reproducibility

This work presents a first reproducible multi-instrument gamma-ray analysis achieved by using the common DL3 data format and the open-source gammapy software package. We provide public access to the DL3 observational data, scripts used and obtained results with the GitHub repository mentioned in the introduction, along with a Docker container10 on DockerHub, and a Zenodo record (Nigro et al. 2018), which provides a Digital Object Identifier (DOI). The user access to the repository hosting data and analysis scripts represents a necessary, but not sufficient condition to accomplish the exact reproducibility of the results. We deliver a conda 11 configuration file to build a virtual computing environment, defined with a special care to address the internal dependencies among the versions of the software used. Furthermore, since the availability of all the external software dependencies is not assured in the future, we also provide a joint-crab docker container to guarantee a mid-term preservation of the reproducibility. The main results published in this work may be reproduced executing the make.py command. This script works as a documented command line interface tool, wrapping a set of actions in different option commands that either extract or run the likelihood minimization or reproduce the figures presented in the paper.

The documentation is provided in the form of Jupyter notebooks. These notebooks can also be run through Binder12 public service to access, via a web browser, the whole joint-crab working execution environment in the Binder cloud infrastructure. The Zenodo joint-crab record, the joint-crab docker container, and the joint-crab working environment in Binder may be all synchronized if needed with the content present in the joint-crab GitHub repository. Therefore, if eventual improved versions of the joint-crab bundle are needed (e.g. comments from referees, improved algorithms, or analysis methods), they may be published in the GitHub repository and then propagated from GitHub to the other joint-crab repositories in Zenodo, DockerHub, and Binder. All these versions would be kept synchronized in their respective repositories.

5. Extensibility

Another significant advantage of the common-format, open-source, and reproducible approach we propose to the VHE gamma-ray community is the possibility to access the ON and OFF events distributions and the IRFs, i.e. the results of the spectrum extraction, saved in the OGIP spectral data format13 (they can be interactively accessed in 1_data.ipynb). This would allow us to perform a maximum likelihood fit to any assumed spectral model dϕ/dE, that is otherwise impossible. This is of crucial interest for researchers not associated with experimental collaborations who usually only have access to the final spectral points (often published with no covariance matrix attached) and cannot properly test their theoretical models against the data. In the on-line material of this work, besides the analytically log-parabola function, we also considered a theoretical Synchrotron-Self Compton radiative model (4_naima.ipynb) obtained with the open-source naima Python package (Zabalza 2015). This is meant to emphasize, on one hand, the potential of the proposed approach, and, on the other hand, the easy interchange between open-source astronomical Python packages with different functionality.

6. Conclusions

This paper presents a multi-instrument reproducible gamma-ray analysis realized with open-source software. It also contains the first public joint release of data from IACTs. Such data dissemination offers the astronomical community the opportunity to gather knowledge of the VHE analysis techniques while waiting for the forthcoming CTA operations. Furthermore they can also be used in data challenges or coding sprints to improve the status of the current science tools.

On a technical note, the DL3 data producible at the moment allow only for joint analyses of a target source at a given position of the FoV (the Crab nebula in this case): no other potential source in the FoV (or multiple sources) can be analysed given the point-like IRFs computed accordingly with the source position that all the instruments, but H.E.S.S., made available. It is worth noting that even the more exhaustive full-enclosure IRFs format may require further development as the current radial offset dependency does not account for a possible non-azimuthal symmetry of the instrument acceptance (Prandini et al. 2015).

On a more general note, the objective of this publication is also to remark on a novel approach towards gamma-ray science, summarized through the three essential concepts of common data format, open-source software, and reproducible results. We illustrate that a common data format naturally allows multi-instrument analysis. Generating data samples compliant with the prototypical DL3 format defined in the “Data formats for gamma-ray astronomy” forum, we perform, for the first time, a spectral analysis of the Crab nebula using data from Fermi-LAT and four currently operating IACTs via the gammapy software package. Open-source software will be a key asset for the upcoming CTA, which, as an open-observatory, will share its observation time and data with the wider astronomical community. Reproducible results are seamlessly achieved once the data and software are publicly available. There are several tools and platforms on the market that can be used for this purpose. In particular, with the on-line material attached to this issue, we show a practical example of how a future gamma-ray publication can be released with long-term solutions. A Git repository suffices for the first period after publication, whereas a Docker accounts for the eventual loss of maintenance of the software packages needed for the analysis. We also provided some considerations on analysis procedures related to spectral analysis commonly performed in the VHE IACT-related astronomy. We proposed a method for computing error bands on the measured SED based on the sampling of a multivariate distribution. We also suggested a method to account for the systematic uncertainties on the energy scales of different gamma-ray instruments while performing a joint fit of their data. We also pointed out the advantages of publishing the outputs of the spectrum extraction, i.e. the distribution of the signal and background events and IRFs (similar to the OGIP spectral data in the joint-crab repository), instead of the spectral points. Mainly this grants the possibility to successively construct a likelihood using an arbitrary theoretical spectral model.


6

All the FACT software used to generate DL3 datasets is open source.

Acknowledgments

This work was supported by the Young Investigators Program of the Helmholtz Association, by the Deutsche Forschungsgemeinschaft (DFG) within the Collaborative Research Center SFB 876 “Providing Information by Resource-Constrained Analysis”, project C3 and by the European Commision through the ASTERICS Horizon2020 project (id 653477). We would like to thank the H.E.S.S., MAGIC, VERITAS, and FACT collaborations for releasing the data that were used. This work made use of astropy (Astropy Collaboration 2013) and sherpa (Freeman et al. 2001). The authors are indebted to Abelardo Moralejo and Hans Dembinski for their useful suggestions on the statistical and systematic uncertainty estimation. We are grateful to the anonymous referee for improving the paper with helpful comments.

References

All Tables

Table 1.

Crab nebula datasets used in this work.

Table 2.

Spectral model best-fit parameters, as defined in Eq. (3).

All Figures

thumbnail Fig. 1.

Histograms of the estimated mean number of signal events from the Crab nebula, number of excess events vs. estimated energy for each dataset.

Open with DEXTER
In the text
thumbnail Fig. 2.

Crab nebula SED for individual instrument fits and from the joint fit. Single-instrument results are represented with dashed lines, the fit of all the datasets together, labelled as joint, is represented as a thick, solid red line. The shaded areas represent the SED error bands whose calculation is explained in Sect. 3.2. The dotted line shows the model in Meyer et al. (2010).

Open with DEXTER
In the text
thumbnail Fig. 3.

Likelihood contours corresponding to 68% probability content for the fitted spectral parameters (ϕ0, Γ, β), for the likelihood in Eq. (1). Results from the individual instruments and from the joint-fit are displayed.

Open with DEXTER
In the text
thumbnail Fig. 4.

Error estimation methods for the measured SED using the VERITAS dataset, as the example case. The solid black lines indicate the upper and lower limits of the error band estimated with the multivariate sampling. These lines represent the 68% containment of 500 spectral realizations (100 shown as grey lines) whose parameters are sampled from a multivariate distribution defined by the fit results.

Open with DEXTER
In the text
thumbnail Fig. 5.

Likelihood contours corresponding to 68% probability content for the fitted spectral parameters (ϕ0, Γ, β), for the likelihood in Eq. (1) (red) and the likelihood in Eq. (5) (blue). Results from the individual instruments with the likelihood in Eq. (1) are shown in grey.

Open with DEXTER
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.