Analysis of Fermi gamma-ray burst duration distribution
Astronomical Observatory, Jagiellonian University, Orla 171, 30244 Cracow, Poland
email: mariusz.tarnopolski@uj.edu.pl
Received: 26 April 2015
Accepted: 24 June 2015
Context. Two classes of gamma-ray bursts (GRBs), short and long, have been firmly established and are usually ascribed to different physical scenarios. A third class, intermediate in T_{90} durations, has been reported in the datasets of BATSE, Swift, RHESSI, and possibly BeppoSAX. The latest release of >1500 GRBs observed by Fermi gives an opportunity to further investigate the duration distribution.
Aims. The aim of this paper is to investigate whether a third class is present in the log T_{90} distribution, or whether it is described by a bimodal distribution.
Methods. A standard χ^{2} fitting of a mixture of Gaussians was applied to 25 histograms with different binnings.
Results. Different binnings yield different values of the fitting parameters, as well as different shapes of the fitted curve. Among five statistically significant fits, none is trimodal.
Conclusions. Locations of the Gaussian components are in agreement with previous works. However, a trimodal distribution, understood in the sense of having three distinct peaks, is not found for any binning. It is concluded that the duration distribution in the Fermi data is well described by a mixture of three lognormal distributions, but it is intrinsically bimodal, hence no third class is present in the T_{90} data of Fermi. It is suggested that the lognormal fit may not be an adequate model.
Key words: gamma rays: general / methods: data analysis / methods: statistical
© ESO, 2015
1. Introduction
Gamma-ray bursts (GRBs) are most commonly classified based on their duration time T_{90} (time during which 90% of the burst fluence is accumulated, starting from the time at which 5% of the total fluence was detected). Mazets et al. (1981) first observed the bimodal distribution of T_{90} drawn for 143 events detected in the KONUS experiment. Kouveliotou et al. (1993) also found a bimodal structure in the log T_{90} distribution in the BATSE 1B dataset; based on these data GRBs are divided into short (T_{90} < 2 s) and long (T_{90} > 2 s) classes. Horváth (1998) and Mukherjee et al. (1998) independently discovered a third peak in the duration distribution in the BATSE 3B catalog (Meegan et al. 1996), located between the short and long peaks, and the statistical existence of this intermediate class was supported (Horváth 2002) with the use of BATSE 4B data. Evidence for a third component in log T_{90} was also found in the Swift data (Horváth et al. 2008a,b; Zhang & Choi 2008; Huja et al. 2009; Horváth et al. 2010). Other datasets, i.e., RHESSI (Řípa et al. 2009) and BeppoSAX (Horváth 2009), are both in agreement with earlier results regarding the bimodal distribution, and the detection of a third component was established on a lower significance level (compared to BATSE and Swift) owing to less populated samples. Hence, four different satellites provided hints about the existence of a third class of GRBs. Contrary to this, the durations observed by INTEGRAL have a unimodal distribution that extends to the shortest timescales as a power law (Savchenko et al. 2012).
Only one dataset (BATSE 3B) was truly trimodal in the sense of having three peaks. In the others, a three-Gaussian distribution was found to follow the histogram better than a two-Gaussian, but those fits yielded only two peaks. Thus, although statistical analyses support the presence of a third class, its existence is still elusive and may be due to selection and instrument effects, or might be described by a distribution that is not necessarily a mixture of Gaussians. The latest release is due to Fermi GBM observations (Gruber et al. 2014; von Kienlin et al. 2014) and consists of 1568 GRBs^{1} with calculated T_{90}. This sample, to the best of the author’s knowledge, has not yet been investigated for the presence of an intermediate class; only Horváth et al. (2012) and Qin et al. (2013) conducted research on a subsample consisting of 425 bursts from the first release of the catalog.
Previous analyses, although consistent with each other (except INTEGRAL), showed variations in the locations of the Gaussian components, their dispersions, relative heights, and the statistical significance of a trimodal fit compared to a bimodal distribution. The aim of this article is to perform a statistical analysis of the Fermi database in order to verify the existence of the intermediate-duration GRB class. The current sample consists of ~83% long GRBs (T_{90} > 2 s), which is a larger percentage than in previous catalogs. A conventional χ^{2} fitting is applied. Since BATSE 4B (Horváth 2002), the maximum log-likelihood method has been used. In the case of the Swift data this was fully justified because of the small size of the population (Horváth et al. 2008a,b). However, in the first analysis of the BATSE 3B data (Horváth 1998), which consisted of 797 GRBs, three classes were detected with the χ^{2} method, which was later supported by the maximum log-likelihood method on a bigger dataset (Horváth 2002). Hence, the number of GRBs under consideration in this study should not be too small to conduct a conventional fitting analysis. To verify whether the χ^{2} fitting is applicable, the BATSE 4B data (2041 GRBs) were reexamined and results similar to those of Horváth (2002) were obtained. To be sure that the method chosen does not rule out any feature in the data under consideration, the maximum log-likelihood method was also applied, but since the results were not different from those obtained using the χ^{2} fitting, they are not reported here or mentioned in what follows.
This article is organized as follows. Section 2 presents the χ^{2} fittings of different models. In Sect. 3 a comparison with previous results is discussed. Section 4 is devoted to discussion, and Sect. 5 gives concluding remarks. The computer algebra system Mathematica v10.0.2 is applied throughout this paper.
2. The χ^{2} fittings
2.1. Sample
In this section a dataset of 1566 GRBs is used. Two durations (the shortest and the longest) were excluded because, for every binning, they were separated from the rest of the distribution by empty bins.
2.2. Standard normal distributions
A least squares fitting of a multicomponent lognormal distribution to the dataset of durations T_{90} is performed, i.e., a mixture of Gaussians

f_{k}(x) = \sum_{i=1}^{k} \frac{A_{i}}{\sqrt{2\pi}\,\sigma_{i}} \exp\left[-\frac{(x-\mu_{i})^{2}}{2\sigma_{i}^{2}}\right] (1)

is fitted to a histogram of log T_{90}. A significance level of α = 0.05 is adopted. Twenty-five binnings are applied, defined by the bin widths w from 0.30 to 0.06 with a step of 0.01. The corresponding number of bins ranges from 15 to 69. A number of binnings is chosen rather than a single neutral bin width established according to some conventional rule (e.g., Freedman-Diaconis, Scott, Knuth) because Koen & Bere (2012) found two fits to Swift data that could not be rejected; hence restricting the analysis to only one binning might conceal alternative acceptable fits. For each binning the same fitting procedure is performed as described.
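The model and the set of binnings described above can be sketched in a few lines. This is only an illustration, not the author's Mathematica code: it assumes each component is a Gaussian with amplitude A_i, mean μ_i, and standard deviation σ_i (three parameters per component), and the function name `mixture` and the example parameter values are hypothetical.

```python
import math

def mixture(x, components):
    """Evaluate a sum of Gaussian components at x = log T90.

    components is a list of (A, mu, sigma) tuples, one per component.
    """
    total = 0.0
    for amplitude, mu, sigma in components:
        norm = amplitude / (math.sqrt(2.0 * math.pi) * sigma)
        total += norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
    return total

# The 25 bin widths w = 0.30, 0.29, ..., 0.06 (rounded to avoid float drift).
bin_widths = [round(0.30 - 0.01 * i, 2) for i in range(25)]

# Hypothetical two-component parameters, for illustration only
# (these are NOT the fitted values from the tables).
example_components = [(0.2, -0.1, 0.5), (0.8, 1.5, 0.45)]
value_at_long_peak = mixture(1.5, example_components)
```

In an actual fit the tuples in `components` would be the free parameters adjusted by the least squares routine for each of the bin widths in `bin_widths`.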
The χ^{2} of a fit is calculated in a standard way as

\chi^{2} = \sum_{i=1}^{N} \frac{(O_{i}-E_{i})^{2}}{E_{i}} (2)

where O_{i} is the observed ith value, E_{i} is the value expected based on the fit, and N is the number of bins. The number of degrees of freedom, d.o.f., of the χ^{2} test statistic is d.o.f. = N − m − 1, where m is the number of parameters used; for a k-Gaussian, m = 3k. First, a single-Gaussian fit is performed for all binnings. The χ^{2} values range from 12 180 to 54 548, with p-values being numerically equal to zero in all cases. It follows from the huge χ^{2} values that it is extremely unlikely that the single-Gaussian fit describes the log T_{90} distribution correctly.
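As a minimal sketch of the bookkeeping described above, the statistic and the number of degrees of freedom could be computed as follows. The function names are hypothetical, and the counts passed in the usage lines are made-up numbers, not data from the Fermi catalog.

```python
def chi_square(observed, expected):
    """Pearson chi-square: sum over bins of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def degrees_of_freedom(n_bins, k):
    """d.o.f. = N - m - 1, with m = 3k parameters for a k-Gaussian mixture."""
    m = 3 * k
    return n_bins - m - 1

# Made-up bin counts, for illustration only.
stat = chi_square([10.0, 20.0], [10.0, 25.0])

# Coarsest binning quoted in the text (15 bins) with a three-Gaussian model.
dof_coarsest = degrees_of_freedom(15, 3)
```

For the coarsest binning a three-Gaussian model leaves only a handful of degrees of freedom, which is one reason the outcome can depend noticeably on the binning.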
Next, a two-Gaussian fit is performed and the resulting χ^{2} are much lower; however, the majority of p-values indicate that the fitted distributions do not follow the data well at the α = 0.05 significance level. Fitted parameters are gathered in Table 1, with p-values greater than α written in bold, while the curves are displayed in Fig. 1. The bimodal structure is well represented by a two-Gaussian fit in four out of five cases displayed in Fig. 1. The statistically insignificant fit for w = 0.25 is also shown for comparison with the three-Gaussian fit performed further on.
Table 1. Parameters of a two-Gaussian fit.
Fig. 1 Two-Gaussian fits (solid black) to log T_{90} distributions. All except w = 0.25 are statistically significant. Colored dashed curves are the components of the mixture distribution.
Finally, a three-Gaussian fit is performed in the same manner. The resulting χ^{2} are naturally lower than in the previous case. Again, most of the corresponding p-values are smaller than α, which indicates that in general the duration distribution is not well described by a mixture of three lognormal components. Parameters of the five fits with p-values > α are gathered in Table 2, and the fitted curves are displayed in Fig. 2. It is important to note that these five fits are all statistically significant, including the fit for w = 0.25, for which the hypothesis that the histogram is well described by a two-Gaussian fit was rejected.
Table 2. Parameters of a three-Gaussian fit.
Fig. 2 Same as Fig. 1, but for a three-Gaussian fit.
The probability that the third component originates from statistical fluctuations may be estimated by comparing a three-Gaussian with a two-Gaussian fit, based on an observation (Band et al. 1997) that

\Delta\chi^{2} \doteq \chi^{2}(\Delta\nu) (3)

where Δχ^{2} is the difference in χ^{2} between the two fits, Δν = ν_{2} − ν_{1} is the difference in the degrees of freedom of the fits under consideration, equal to three when Δk = 1 (compare with Eq. (1)), and \doteq denotes equality in distribution.
To assess to what degree a three-Gaussian fit is better than a two-Gaussian, Eq. (3) is applied and the p-values are inferred from a χ^{2} distribution with three degrees of freedom. A small p-value indicates a small probability that the two-Gaussian model alone describes the data in comparison with a three-Gaussian. The results, gathered in Table 3, indicate that in all binnings there is a >20% probability (exceeding 50% for the remaining three statistically significant cases) that a three-Gaussian is a significant improvement over a two-Gaussian fit. This does not mean that the three-Gaussian models are a good description of the data, as the χ^{2} from Table 2 lead to rejection of the null hypothesis for only one out of five binnings with p-values greater than α in the three-Gaussian case. For this insignificant fit, the p-value computed from Δχ^{2} means that a bad three-Gaussian is better than a bad two-Gaussian. For the fits that are statistically significant (when fitting both a two- and a three-Gaussian) the conclusion is that the null hypothesis cannot be rejected for either, but a three-Gaussian describes the data better with a probability >50%. Nevertheless, this probability is insufficient to claim a detection; in Horváth (2009) it was concluded that even a 96% significance level is too small to be considered evidence.
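The p-value of a χ^{2} difference with three degrees of freedom has a closed form, so the comparison can be sketched without a statistics library. The function name `chi2_sf_3dof` and the two χ^{2} values below are hypothetical; they are not taken from Tables 1 to 3.

```python
import math

def chi2_sf_3dof(x):
    """Survival function P(X > x) for a chi-square variable with 3 d.o.f.

    Uses the closed form erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2),
    which holds only for exactly three degrees of freedom.
    """
    if x <= 0.0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Hypothetical chi-square values of a two- and a three-Gaussian fit
# to the same histogram (illustration only).
chi2_two_gaussian = 30.0
chi2_three_gaussian = 26.5

delta_chi2 = chi2_two_gaussian - chi2_three_gaussian
p_value = chi2_sf_3dof(delta_chi2)
```

A p-value obtained this way is the probability of a χ^{2} improvement at least as large as the observed one arising by chance when the extra component is not needed, under the assumption that the difference follows a χ^{2} distribution with three degrees of freedom.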
Table 3. Improvements of a three-Gaussian over a two-Gaussian fit.
The Fermi distribution is dominated by long GRBs, which constitute ~83% of the sample; this is manifested through a significantly higher dispersion of the short GRB distribution. Moreover, it was proposed that the distribution for short GRBs is nearly flat for T_{90} ≲ 2 s (Bromberg et al. 2011, 2013; compare with Savchenko et al. 2012); however, for smaller bin widths statistical noise starts to dominate. This is supported by the very small p-values of ~10^{-4} − 10^{-7} for the smaller bin widths, w = 0.11 − 0.06.
Three out of five statistically significant fittings (w = 0.27, 0.25, 0.13) located the third component beyond the main peak for long GRBs. One fit (w = 0.26) showed an excess at T_{90} ≈ 10 s that might be assigned to an intermediate class; however, the fitted curve is very similar to a two-Gaussian. The last fit, with w = 0.20, detected two components (in addition to a peak related to short GRBs) of approximately the same height and comparable standard deviations in a region of long GRBs. For these last two fits, the dispersion of a corresponding peak of a two-Gaussian is greater than the dispersion of both the components in the three-Gaussian, hence it gives a hint toward the bimodality of the GRBs (Schilling et al. 2002). This is in agreement with a very high probability that the third group in a three-component model is a statistical fluctuation.
3. Comparison with former results
A detailed comparison between the BATSE 4B and Swift catalogs has been conducted (Huja et al. 2009) and the results were found to be consistent, also with respect to the differences between the instruments. The RHESSI observations were also taken into account (Řípa et al. 2009) and roughly the same distribution as in previous catalogs was reported. The BeppoSAX data were of a relatively low population and the analysis showed the presence of three components on a significance level lower than in previous catalogs, but two classes (the intermediate and long) were detected with high certainty (Horváth 2009). As short GRBs were underpopulated, the three-Gaussian fit, despite following the observations better than a two-Gaussian, consisted of a component with a high dispersion. Overall, classification analyses found three components in all four samples. Interestingly, the dataset from INTEGRAL (Savchenko et al. 2012) yields a unimodal distribution. The latest well-populated sample, based on the Fermi data, was investigated some time ago (Horváth et al. 2012) using PCA and multiclustering analysis, and a three-group structure was found in a multidimensional parameter space including duration, total fluence, hardness ratio, and peakflux256. Hence, data from five satellites supported a three-component distribution of GRBs in terms of statistical significance.
In Fig. 3 all of the components’ locations found by the above-mentioned univariate analyses are plotted. Results from this work, i.e., locations of the three-Gaussian components found by a standard lognormal fitting, are consistent with previous results in terms of locations as well as relative separations; mean durations of the five statistically significant fits are centered at 0.745 s, 21.61 s, and 67.05 s for short, intermediate, and long GRBs, respectively. The current sample of 1566 GRBs comprises 17% of short GRBs and 83% of long ones, based on the conventional classification ≶2 s. Because of an insufficient separation of the intermediate and long components, it is impossible to conclude what the population of intermediate class GRBs in Fermi data is, as the distribution is apparently bimodal and shows no evidence for the third class being present, and one can associate the intermediate and long groups with a single class. It is important to note that in Fermi the sensitivity at very soft and very hard GRBs was higher than in BATSE (Meegan et al. 2009). Soft GRBs are intermediate in duration, and hard GRBs have short durations. Hence, an increase in intermediate GRBs relative to long ones might be expected as a consequence of improving instruments, yet the third class remains elusive. Swift is more sensitive in soft bands than BATSE was, hence its dataset has a low fraction of short GRBs. Therefore, the group populations inferred from Fermi observations are reasonable considering the characteristics of the instruments.
Fig. 3 Locations of the short (green squares), intermediate (blue crosses) and long (red triangles) GRBs from previous research and this work.
4. Discussion
Among the 25 three-Gaussian fittings performed, five (w = 0.27, 0.26, 0.25, 0.20, 0.13) turned out to be statistically significant, with p-values exceeding the significance level α = 0.05. Locations of the respective groups for the five fits are close to each other with means μ_{1} = −0.128 ± 0.082 (short GRBs), μ_{2} = 1.335 ± 0.156 and μ_{3} = 1.826 ± 0.185 (long GRBs), the error being the standard deviation of the average (compare with Fig. 3). Previous works on datasets from BATSE (Horváth 1998, 2002) and Swift (Horváth et al. 2008a; Huja et al. 2009) indicated that a three-Gaussian is a better fit than a corresponding two-Gaussian. On the other hand, a three-Gaussian fit to RHESSI (Řípa et al. 2009) data yielded only a 93% probability of being correct compared to a two-Gaussian, meaning that there is a remarkable 7% probability that the log T_{90} is well described by a two-Gaussian, while for BeppoSAX (Horváth 2009) the goodness-of-fit was not reported (only the maximum log-likelihoods).
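The mean locations quoted in logarithmic units can be converted back to seconds with a one-line calculation, assuming the logarithm is decadic. The variable names are hypothetical; the three μ values are the ones quoted in the text.

```python
# Mean component locations in log10(T90 / s), as quoted in the text.
log_means = {"short": -0.128, "intermediate": 1.335, "long": 1.826}

# Convert back to durations in seconds, assuming a base-10 logarithm.
durations_s = {name: 10.0 ** mu for name, mu in log_means.items()}

for name, seconds in durations_s.items():
    print(f"{name}: {seconds:.3f} s")
```

The converted values agree with the durations quoted in the paper (0.745 s, 21.61 s, and 67.05 s) to within a fraction of a percent; the small residual differences are presumably due to rounding of the quoted means.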
Moreover, the two greater means of the three-Gaussian from Fig. 2 (w = 0.20) do not satisfy the criterion of bimodality (Schilling et al. 2002)

|\mu_{A} - \mu_{B}| > S(r)(\sigma_{A} + \sigma_{B}) (4)

with σ_{A} ≤ σ_{B}, and S(r) being the separation factor equal to 0.98 in the case of w = 0.20 (assuming equal weights; arbitrary mixtures yield an even higher value of the factor) for σ_{2} and σ_{3} from Table 2. Hence long GRBs (T_{90} > 2 s) do not have a bimodal distribution and are described by a single peak, meaning that the distribution over the whole range of available durations T_{90} is bimodal, with peaks corresponding to short and long GRB populations. The fit for w = 0.26 does not fulfil this condition either, where the appropriate S(r) = 1.16 is taken from Schilling et al. (2002). The remaining three cases, although the shoulder is prominent, are also apparently bimodal.
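The bimodality check can be sketched as follows for the equal-weight case. Two things here are assumptions rather than statements from the text: that r is the variance ratio σ_A²/σ_B², and the closed-form expression for S(r), which is written as it is commonly quoted from Schilling et al. (2002) and should be checked against that reference. The function names and the standard deviations in the usage lines are hypothetical, since the fitted σ values are in Table 2, which is not reproduced here.

```python
import math

def separation_factor(r):
    """Separation factor S(r) for an equal-weight mixture of two normals.

    r is assumed to be the variance ratio sigma_A^2 / sigma_B^2.
    """
    inner = (-2.0 + 3.0 * r + 3.0 * r ** 2 - 2.0 * r ** 3
             + 2.0 * (1.0 - r + r ** 2) ** 1.5)
    inner = max(inner, 0.0)  # guard against tiny negative rounding errors
    return math.sqrt(inner) / (math.sqrt(r) * (1.0 + math.sqrt(r)))

def is_bimodal_equal_weights(mu_a, sigma_a, mu_b, sigma_b):
    """True if |mu_A - mu_B| > S(r) * (sigma_A + sigma_B)."""
    r = sigma_a ** 2 / sigma_b ** 2
    return abs(mu_a - mu_b) > separation_factor(r) * (sigma_a + sigma_b)

# Mean locations from the text with HYPOTHETICAL standard deviations.
long_part_is_bimodal = is_bimodal_equal_weights(1.335, 0.3, 1.826, 0.3)
```

For equal standard deviations the factor reduces to S(1) = 1, i.e., the familiar condition that the two means must be more than 2σ apart. Unequal weights require the tabulated values from the reference instead of this closed form.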
The relative improvements (Table 3) indicate an enormous probability, ranging from 14% to 77%, that the third component in a three-Gaussian fit is a chance occurrence compared to a two-Gaussian.
Finally, among previous research, only the BATSE 3B data revealed a truly trimodal log T_{90} distribution, i.e., having three local maxima, which was not present in the following release, BATSE 4B (where only a bump was present), and is nonexistent in the current BATSE catalog. Swift also observed two local maxima, although with a prominent shoulder on the left side of the long GRB peak (Horváth et al. 2008a), detected by means of the maximum log-likelihood method. Huja et al. (2009) also obtained a bimodal distribution with a bump on one side of the long GRB peak, although somewhat weaker. The explanation may be that for the sample of 388 GRBs, a maximum log-likelihood method is more robust than applying a χ^{2} fitting. The latter may be a drawback in undersampled populations. Fortunately, this is not the case in the Fermi data. The RHESSI distribution (Řípa et al. 2009) is also characterized by a bimodal fit with a shoulder, while in BeppoSAX clearly separated intermediate and long groups were found (Horváth 2009); however, no short GRBs were detected in the log T_{90} distribution there. This is most likely due to a low trigger efficiency for short GRBs, which are highly underpopulated in that sample. Interestingly, an early analysis of 222 GRBs from Swift (Horváth et al. 2008b) detected the short class, while the long GRBs were unimodal, again with a bump. Finally, the standard three-Gaussian fits passed the Anderson-Darling test, hence a mixture of lognormal distributions accounts well for the observed durations.
The distributions fitted in this paper are strongly dependent on the binning applied when the locations of components and their relative amplitudes are considered. No statistically significant trimodal fit was found, although a shoulder (in three out of five fits) was detected beyond the region where previous works found the intermediate class, which is a surprising result. Still, a three-Gaussian fit is a better fit than a two-Gaussian, statistically speaking. However, it is arguable whether this confirms the existence of the third class of GRBs. As the sum of two normal distributions is skewed, which is the apparent case in the Fermi sample, the underlying multimodal distribution may not necessarily be composed of Gaussians. Recently, Zitouni et al. (2015) suggested that the duration distribution corresponding to the collapsar scenario (associated with long GRBs) might not necessarily be symmetrical because of a nonsymmetrical distribution of envelope masses of the progenitors.
Moreover, the more components the mixture distribution has, the more parameters are available for the curve to fit the histogram. This may be the primary reason for the calculated goodness-of-fit. On the other hand, in terms of statistical accuracy, the hypothesis that three fundamental components are needed to describe observed durations is corroborated. The true underlying form of the distribution remains obscure. Short GRBs have always been underpopulated in observations, and some instrument biases have been proposed to account for the emergence of a third peak. The trimodality in the BATSE 3B catalog was greatly diminished in the 4B version, which may be directly attributed to collecting a more complete sample. The Fermi database contains 75% of the number of GRBs in the current BATSE catalog; hence further observations, as well as theoretical models that could account for the diversity of GRB events in more detail, may clarify the properties of the T_{90} distribution and determine whether an intermediate class of GRBs is a physical or a statistical phenomenon.
Finally, a mixture of lognormal Gaussians may not be a proper model for the T_{90} distribution, and a mixture of intrinsically skewed distributions may be a better explanation of the observed features of the histogram. On the other hand, the duration distribution might not be a sum of components defined on (−∞, +∞), but might be a piecewise function.
5. Conclusions

1. A mixture of three standard normal distributions was found to be statistically significant in describing the log T_{90} distribution for 1566 Fermi GRBs for bin widths w = 0.27, 0.26, 0.25, 0.20, 0.13. Average locations of the components are equal to 0.745 s, 21.61 s, and 67.05 s for short, intermediate, and long GRBs, respectively, or −0.128, 1.335, and 1.826 in log scale. These results are in agreement with values obtained from previous catalogs: BATSE, Swift, RHESSI, and BeppoSAX.
2. The relative improvements of a three-Gaussian fit over a two-Gaussian imply that there is a significant probability, varying from 14% to 77% among the fits, that the third component is a chance occurrence. Therefore, the third GRB class is unlikely to be present in the Fermi data.
3. None of the fits is trimodal (in the sense of having three distinct peaks). Although three out of five fits show a prominent shoulder on the right side of the long GRB peak, the evidence is not sufficient to claim detection of a third class in the Fermi data. Therefore, the distribution of Fermi durations is intrinsically bimodal, hence no evidence for an intermediate class of GRBs has been found.
4. The observed asymmetry may come from an underlying distribution composed of two skewed components, or it may be a piecewise function.
^{1} http://heasarc.gsfc.nasa.gov/W3Browse/fermi/fermigbrst.html, accessed on March 12, 2015.
Acknowledgments
The author acknowledges fruitful discussions with Michał Wyrȩbowski and Arkadiusz Kosior, and wishes to thank the anonymous referee for useful comments that led to significant improvements of the paper.
References
Band, D. L., Ford, L. A., Matteson, J. L., et al. 1997, ApJ, 485, 747
Bromberg, O., Nakar, E., & Piran, T. 2011, ApJ, 739, L55
Bromberg, O., Nakar, E., Piran, T., & Sari, R. 2013, ApJ, 764, 179
Gruber, D., Goldstein, A., Weller von Ahlefeld, V., et al. 2014, ApJS, 211, 12
Horváth, I. 1998, ApJ, 508, 757
Horváth, I. 2002, A&A, 392, 791
Horváth, I., Balázs, L. G., Bagoly, Z., & Veres, P. 2008a, A&A, 489, L1
Horváth, I., Balázs, L. G., Bagoly, Z., et al. 2008b, AIP Conf. Proc., 1065, 67
Horváth, I., Balázs, L. G., & Bagoly, Z. 2009, Ap&SS, 323, 83
Horváth, I., Bagoly, Z., Balázs, L. G., et al. 2010, ApJ, 713, 552
Horváth, I., Balázs, L. G., Hakkila, J., Bagoly, Z., & Preece, R. D. 2012, PoS(GRB 2012)046
Huja, D., Mészáros, A., & Řípa, J. 2009, A&A, 504, 67
Koen, C., & Bere, A. 2012, MNRAS, 420, 405
Kouveliotou, C., Meegan, C. A., Fishman, G. J., et al. 1993, ApJ, 413, L101
Mazets, E. P., Golenetskii, S. V., Ilinskii, V. N., et al. 1981, Ap&SS, 80, 3
Meegan, C. A., Pendleton, G. N., Briggs, M. S., et al. 1996, ApJS, 106, 65
Meegan, C. A., Lichti, G., Bhat, P. N., et al. 2009, ApJ, 702, 791
Mukherjee, S., Feigelson, E. D., Jogesh, B. G., et al. 1998, ApJ, 508, 314
Qin, Y., Liang, E.-W., Liang, Y.-F., et al. 2013, ApJ, 763, 15
Řípa, J., Mészáros, A., Wigger, C., et al. 2009, A&A, 498, 399
Savchenko, V., Neronov, A., & Courvoisier, T. J.-L. 2012, A&A, 541, A122
Schilling, M. F., Watkins, A. E., & Watkins, W. 2002, Am. Stat., 56, 223
von Kienlin, A., Meegan, C. A., Paciesas, W. S., et al. 2014, ApJS, 211, 13
Zhang, Z.-B., & Choi, C.-S. 2008, A&A, 484, 293
Zitouni, H., Guessoum, N., Azzam, W. J., & Mochkovitch, R. 2015, Ap&SS, 357, 7