Global Optimization and Broadband Analysis Software for Interstellar Chemistry (GOBASIC)

Mary L. Rad; Luyao Zou; James L. Sanders; Susanna L. Widicus Weaver

doi:10.1051/0004-6361/201322591

Home

All issues

Volume 585 (January 2016)

A&A, 585 (2016) A23

Full HTML

Free Access

Issue		A&A Volume 585, January 2016


Article Number		A23
Number of page(s)		8
Section		Numerical methods and codes
DOI		https://doi.org/10.1051/0004-6361/201322591
Published online		09 December 2015

A&A 585, A23 (2016)

Global Optimization and Broadband Analysis Software for Interstellar Chemistry (GOBASIC)^⋆

Mary L. Rad^⋆⋆, Luyao Zou^⋆⋆, James L. Sanders III and Susanna L. Widicus Weaver

Emory University Department of Chemistry Atlanta, GA 30322, USA
e-mail: susanna.widicus.weaver@emory.edu

Received: 2 September 2013
Accepted: 12 October 2015

Abstract

Context. Broadband receivers that operate at millimeter and submillimeter frequencies necessitate the development of new tools for spectral analysis and interpretation. Simultaneous, global, multimolecule, multicomponent analysis is necessary to accurately determine the physical and chemical conditions from line-rich spectra that arise from sources like hot cores.

Aims. We aim to provide a robust and efficient automated analysis program to meet the challenges presented with the large spectral datasets produced by radio telescopes.

Methods. We have written a program in the MATLAB numerical computing environment for simultaneous global analysis of broadband line surveys. The Global Optimization and Broadband Analysis Software for Interstellar Chemistry (GOBASIC) program uses the simplifying assumption of local thermodynamic equilibrium (LTE) for spectral analysis to determine molecular column density, temperature, and velocity information.

Results. GOBASIC achieves simultaneous, multimolecule, multicomponent fitting for broadband spectra. The number of components that can be analyzed at once is only limited by the available computational resources. Analysis of subsequent sets of molecules or components is performed iteratively while taking the previous fits into account. All features of a given molecule across the entire window are fitted at once, which is preferable to the rotation diagram approach because global analysis is less sensitive to blended features and noise features in the spectra. In addition, the fitting method used in GOBASIC is insensitive to the initial conditions chosen, the fitting is automated, and fitting can be performed in a parallel computing environment. These features make GOBASIC a valuable improvement over previously available LTE analysis methods.

Key words: methods: data analysis / astrochemistry / line: identification / molecular data

^⋆

A copy of the sofware is available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/585/A23

^⋆⋆

The first two authors contributed equally to this manuscript.

© ESO, 2015

1. Introduction

Broadband spectral line surveys are useful tools in astrochemistry, because they can be used to tightly constrain physical conditions in astronomical sources by probing a range of excitation energies for each observed molecule. Over 180 molecules have been detected in the interstellar and circumstellar medium¹ with a large number of those being complex organic molecules (COMs), i.e., those containing six or more atoms (Herbst & van Dishoeck 2009). Many astrochemical modeling networks strive to elucidate the chemistry which leads to the formation of these COMs (see Garrod et al. 2008; Laas et al. 2011 for examples). However, these models must be benchmarked against observational datasets to determine their level of accuracy. Spectral line surveys are well-suited for this purpose.

Spectral line surveys were historically time-prohibitive because of the limited bandwidth of heterodyne receivers and associated spectrometers. However, technological advancements have led to the development of broadband instruments that enable routine line survey observations. Vast spectral datasets are now available because of the implementation of this receiver and spectrometer technology at observational facilities such as the Herschel Space Observatory and the Atacama Large Millimeter/submillimeter Array (ALMA). With this new generation of observational capabilities, the spectral line survey is becoming a default part of nearly every spectral observation. The challenge now becomes how to most efficiently analyze these datasets so that the physical and chemical information can be extracted on a timescale that is comparable to observing proposal cycles. While the rotation diagram method is routinely used for this type of analysis, this approach has shortcomings in terms of dealing with blended lines and multiple physical components. Programs such as the XCLASS/MAGIX program suite (Möller et al. 2013), CASSIS², and WEEDS (Maret et al. 2011) are also available, but often take considerable amounts of time and user input. A more global fitting approach that requires little user input would be ideal for analysis of broadband line survey information.

To meet these needs, a spectral analysis program³ using global optimization methods and local thermodynamic equilibrium (LTE) assumptions was created in the MATLAB scientific computing environment. The Global Optimization and Broadband Analysis Software for Interstellar Chemistry (GOBASIC) program offers a simple and uniform data analysis platform that can be adapted to fit the specific needs of the user while being applicable to a variety of observational data analysis problems. The first generation of this program, presented herein, has been limited to the LTE approximation because that is the simplest model for lineshape analysis. While radiative transfer may be the most physically accurate model for interstellar sources, collision rates are not available for many COMs of interest (van der Tak et al. 2007). Therefore, the LTE approximation is a reasonable first approximation to use for the analysis of line surveys of chemically-complex sources. An overview of the GOBASIC program design and the results of benchmark tests are presented here.

2. Program design

The GOBASIC program code, written in the MATLAB scientific computing environment, has been deposited as supplementary information for this manuscript. A basic overview of the program design is given here; details of the program operation are given in a tutorial document that has been included with the program code. MATLAB is a commercially available program supported by multiple computing environments. All tests described in this paper were run in MATLAB 2014a.

The GOBASIC program accepts an observational dataset as a two column ASCII file, henceforth referred to as xy, formatted with frequency in MHz and intensity in antenna temperature, T_a (K). The spectrum is calibrated to main beam temperature, T_MB (K), using the conversion $T_{MB} (K) = T_{a} (K) / η,$ $\begin{equation} \label{EQ:TMB} T_\mathrm{MB}\mathrm{(K)} = T_\mathrm{a}\mathrm{(K)}/\eta, \end{equation}$ (1)where η is the efficiency of the telescope, which is defined in the program as a user-specified parameter.

Analysis of observational data is performed by comparing a spectrum to laboratory spectral information that is imported in a modified tab-delimited catalog file format, based on the standard format implemented in the JPL Spectral Line Catalog (Pickett et al. 1998) and the Cologne Database for Molecular Spectroscopy (CDMS; Müller et al. 2005). The catalog files are truncated to match the frequency range of the observational dataset. Users may also specify thresholds for line intensity, frequency uncertainty, and lower state energy when loading catalog files. Transitions that do not meet these thresholds are removed from the simulation to reduce computing time. All transitions are simulated if no thresholds are specified.

GOBASIC uses a rotational partition function of the form $Q (T) = \sqrt{\frac{π}{ABC} {(\frac{kT}{h})}^{3}},$ $\begin{equation} \label{EQ:par} Q(T) = \sqrt{\frac{\pi}{ABC}\left(\frac{k T}{h}\right)^3}, \end{equation}$ (2)where Q(T) is the partition function at temperature T, A, B, and C are the rotational constants of the molecule, k is Boltzmann’s constant, and h is Planck’s constant. The value of Q(T) is obtained through interpolation or extrapolation of the partition function information provided by the JPL and CDMS catalogs (Pickett et al. 1998; Müller et al. 2005). Vibrational partition functions are not considered. Users should be mindful of this when loading catalogs that contain transitions from multiple vibrational states.

The Einstein A coefficients of the transitions are determined using the partition function information provided at 300 K and catalog information $A_{ul} g_{u} = 2.7964 \times 10^{-22} \times \frac{I ν_{c}^{2} Q (300)}{e^{- E_{l} / 300 k} - e^{- E_{u} / kT}} (MHz),$ $\begin{equation} \label{EQ:EinsteinA} A_\mathrm{ul}g_\mathrm{u} = 2.7964\times10^{-22}\times\frac{I\nu^2_cQ(300)}{{\rm e}^{-E_\mathrm{l}/300k}- {\rm e}^{-E_\mathrm{u}/kT}} \text{ (MHz)}, \end{equation}$ (3)where g_u is the upper state degeneracy, I is the line strength (I = 10^(LGINT), where LGINT is the base 10 logarithm line strength provided in the catalog), ν_c is the transition frequency in MHz, Q(300) is the partition function value at 300 K, and E_l and E_u are the lower and upper state energies, respectively, in Joules.

The generating function for the spectrum in GOBASIC was derived from the equations used to obtain a population diagram assuming LTE conditions (Goldsmith & Langer 1999). The optical depth, τ, is adapted from Equation 6 in Goldsmith & Langer (1999): $\begin{matrix} τ (x) & = & \sqrt{\frac{4 \ln 2}{π}} \frac{h N_{u} B_{ul} ν_{c}}{c Δ ν} (e^{h ν_{c} / kT} - 1) \\ \times \exp (- \frac{4 \ln 2 (x - ν_{c} + ν_{off})^{2}}{Δ ν^{2}}), \end{matrix}$ $\begin{eqnarray} \label{EQ:tau1} \tau(x) &=& \sqrt{\frac{4\ln2}{\pi}}\frac{hN_\mathrm{u}B_\mathrm{ul}\nu_\mathrm{c}}{c\Delta\nu}\big({\rm e}^{h\nu_{\rm c}/kT}-1\big)\nonumber\\ &&\quad \times \exp\bigg(-\frac{4\ln2(x-\nu_\mathrm{c}+\nu_\mathrm{off})^2}{\Delta \nu^2}\bigg), \end{eqnarray}$ (4)assuming a Gaussian line profile, where x is the spectral frequency, ν_c is the center frequency of the transition, ν_off is the Doppler shift of the transition, Δν is the full width half maximum (FWHM) of the spectral line, N_u is the column density of the upper state, B_ul is the Einstein B coefficient of the transition, and c is the speed of light.

Assuming LTE conditions and substituting the relationships $B_{ul} = A_{ul} \frac{c^{3}}{8 πh ν_{c}^{3}}$ $\begin{equation} \label{EQ:EinsteinB} B_\mathrm{ul} = A_\mathrm{ul}\frac{c^3}{8\pi h \nu_\mathrm{c}^3} \end{equation}$ (5)and $N_{u} = N_{T} \frac{g_{u} e^{- E_{u} / kT}}{Q (T)},$ $\begin{equation} \label{EQ:NT} N_\mathrm{u} = N_\mathrm{T}\frac{g_\mathrm{u}{\rm e}^{-E_\mathrm{u}/kT}}{Q(T)}, \end{equation}$ (6)where N_T is the total column density gives τ in the form $τ_{c} (x) = \frac{1}{4} \sqrt{\frac{\ln 2}{π^{3}}} \frac{c^{2}}{ν_{c}^{2} Δ ν} \frac{N_{T} A_{ul} g_{u}}{Q (T)} e^{- E_{u} / kT} (e^{h ν_{c} / kT} - 1) g (x),$ $\begin{equation} \label{EQ:tau2} \tau_\mathrm{c}(x) = \frac{1}{4}\sqrt{\frac{\ln2}{\pi^3}}\frac{c^2}{\nu_\mathrm{c}^2\Delta\nu}\frac{N_\mathrm{T}A_\mathrm{ul}g_\mathrm{u}}{Q(T)}{\rm e}^{-E_\mathrm{u}/kT}\big({\rm e}^{h\nu_\mathrm{c}/kT}-1\big)g(x), \end{equation}$ (7)where g(x) is the Gaussian line profile $g (x) = \exp (- \frac{4 \ln 2 (x - ν_{c} + ν_{off})^{2}}{Δ ν^{2}}) \cdot$ $\begin{equation} \label{EQ:lineprof} g(x) = \exp\bigg(-\frac{4\ln2(x-\nu_\mathrm{c}+\nu_\mathrm{off})^2}{\Delta \nu^2}\bigg)\cdot \end{equation}$ (8)The optical depth is related to the main beam temperature by the relationship $T_{MB} = \frac{h ν_{c}}{k} \frac{1 - e^{- τ}}{e^{h ν_{c} / kT} - 1} η_{B},$ $\begin{equation} \label{EQ:TMB2} T_\mathrm{MB} = \frac{h\nu_\mathrm{c}}{k}\frac{1-{\rm e}^{-\tau}}{{\rm e}^{h\nu_\mathrm{c}/kT}-1}\eta_\mathrm{B}, \end{equation}$ (9)where η_B is the beam filling factor $θ_{s}^{2} / (θ_{b}^{2} + θ_{s}^{2})$ $\hbox{$\theta^2_\mathrm{s}/(\theta^2_\mathrm{b} + \theta^2_\mathrm{s})$}$ , assuming a Gaussian power profile for both source θ_s and beam θ_b. Source size θ_s can be obtained from interferometric observational studies. For the beam size, θ_b, the half-power beamwidth of a Gaussian illuminated dish with −10 db tapering at the edge is dependent on the radiation frequency, and is calculated in GOBASIC using the relationship $θ_{b} \approx \frac{\sqrt{- 8 \ln 2 \ln 0.1}}{π} \frac{λ}{D} \times \frac{180}{π} \times 3600 (arcsec),$ $\begin{equation} \label{EQ:BEAM} \theta_\mathrm{b} \approx \frac{\sqrt{-8\ln2\ln0.1}}{\pi}\frac{\lambda}{D}\times\frac{180}{\pi}\times3600\text{ (arcsec)}, \end{equation}$ (10)where λ is the radiation wavelength corresponding to the transition frequency ν_c, and D is the dish diameter of the telescope⁴. Equation (10) is appropriate for most single-dish telescopes, but the user may also specify their own beam size estimation and adapt the GOBASIC code directly for either single-dish or interferometeric observations. If no source size information is available, the user can set θ_s to 0, which in turn forces η_B to be unity.

The fit function is generated by summing the main beam temperature vector of each component j: $f (x; p p p) = \sum_{j} T_{{MB}_{j}} (x) = \frac{hν}{k} \sum_{j} \frac{1 - e^{- τ_{j} (x)}}{e^{hν / k T_{j}} - 1} η_{B_{j}},$ $\begin{equation} \label{EQ:FITFUNC} f(x;\pmb{p})=\sum_jT_{\mathrm{MB}_j}(x) = \frac{h\nu}{k}\sum_j\frac{1-{\rm e}^{-\tau_j(x)}}{{\rm e}^{h\nu/kT_j}-1}\eta_{\mathrm{B}_j}, \end{equation}$ (11)where $p p p$ $\hbox{$\pmb{p}$}$ is the fitting parameter vector p^pp = [p₁,p₂,...,p_4j]. The vector norm of the residual $| | y y y - f (x; p p p) - {y y y}_{previous} | |,$ $\begin{equation} ||\pmb{y}-f(x;\pmb{p})-\pmb{y}_\text{previous}||, \end{equation}$ (12)is minimized, where $y y y$ $\hbox{$\pmb{y}$}$ is the observational data, while ${y y y}_{previous}$ $\hbox{$\pmb{y}_\text{previous}$}$ is the previously fitted spectra. This is equivalent to a least-squares optimization. Here, a component refers to any unique spectral simulation that differs from all others in at least one characteristic, i.e., molecular identity, temperature, velocity, line width, or spatial distribution. In calculating the total simulation, GOBASIC sums the intensities of components, rather than their optical depths. In this way, each component can be treated separately for ease of fitting, and individual beam-dilution factors can be applied.

The fitting parameters for each component are the base 10 logarithm of column density N_T in cm^-2, the spectral line full with half maximum Δv in km s^-1, the temperature T in K multiplied by 0.01, and the velocity shift v_off in km s^-1. The rescaling of column density and temperature ensures all fitting parameters are the same order of magnitude, offering reliable fitting results. Line width and velocity shift are converted on the fly to frequency using the Doppler relation $ν - ν_{0} = \frac{ν_{0} v}{c},$ $\begin{equation} \label{EQ:Doppler} \nu - \nu_0 = \frac{\nu_0v}{c}, \end{equation}$ (13)where v is the shift or line width in km s^-1, and c is the speed of light in km s^-1.

Multiple components can be fitted simultaneously. The number of components fitted at once is dictated entirely by the computational cost, and this number can be increased if the program is implemented in a parallel computing environment. Initial guesses for each of the parameters and upper and lower boundaries are required before optimization.

The algorithm used is the pattern search algorithm in the MATLAB global optimization toolbox⁵. This algorithm is extremely useful for this type of dataset, because it is a derivative free method and therefore able to solve non-smooth multivariable optimization. Unlike most conventional derivative fitting routines, the algorithm GOBASIC uses is less likely to be trapped in a particular local minimum before it traverses the whole parameter space, and as such the result is not sensitive to the initial guess of the parameter matrix. This algorithm is also well-suited for parallel processing, which allows the program to take advantage of multicore CPUs and larger computer clusters.

Analysis of many blended features is one of the main challenges of current approaches to line survey analysis. GOBASIC is well-equipped to handle this challenge because it is designed to fit an arbitrary number of molecular components simultaneously. The only limit to this approach is the available computational power, since the fitted parameter space expands on the order of p^4j as the number of components j increases. GOBASIC also allows the approach of iterative fitting, where the whole spectral analysis can be split into several fitting jobs, each one fitting several components. Caution should be exercised, because non-Gaussian lineshapes are also fitted with multiple Gaussian components. Users have the option to edit the lineshape used for fitting if they wish to adapt the program for other types of analyses. An example of the iterative fitting routine is shown in Table 1. This approach requires that molecules with the most intense spectral features be analyzed first. Additional analysis should then be performed for molecules with weaker spectra, working down in order of spectral intensity. The optimization for each new molecule can be set to take into account the results from previously analyzed molecules.

Table 1

Example of the iterative fitting process.

The fit results for the parameters, as well as the associated uncertainties, are displayed in Matlab terminal, and the fitted spectra are plotted versus observational data in Matlab GUI. The results and spectra can also be exported into readable tables and comma-delimited files. More information of the estimation of uncertainties of the parameters are discussed in Appendix A.

3. Results and discussion

3.1. Performance test: synthesized spectra

The performance of the GOBASIC program was assessed by analyzing artificial spectra, which can be easily simulated in GOBASIC. Both a noiseless spectrum and a spectrum with artificial noise were simulated from four molecular components with known physical parameters. The four molecules used were methanol, methyl cyanide, sulfur dioxide, and methyl formate, which are common, highly-abundant molecules found in hot cores. Artificial noise was generated for the noisy spectrum by the Matlab random number generator. Random numbers had a Gaussian distribution, with a standard deviation of σ = 30 mK added to the baseline, plus a standard deviation of σ = 20% applied to spectral line intensity. The input parameters for spectral synthesis were chosen such that both optically thin (e.g., methanol, sulfur dioxide and methyl formate) and optically thick (e.g., methyl cyanide) conditions were included.

The fits of both the noiseless and noisy spectra were open to a large parameter space, where column density could vary across 10 orders of magnitude, temperature could vary from 0 to 1000 K, line width and shift could vary from −30 to +30 km s^-1, and the initial guesses for each parameter were set at the middle point of the parameter space. The fit of the noisy spectrum is shown in Fig. 1. The fitted results and reported uncertainties are shown in Table 2. Both sets of results show excellent agreement with the input values even across such a large parameter space, and are capable of accounting for both optically thin and optically thick spectral line features. Changing initial guess values did not change the results within the uncertainty, indicating the robustness of the GOBASIC fit algorithm to find the global minimum.

Fig. 1

Fit of synthesized noisy spectrum for four molecules. The top panel shows the entire spectrum and the bottom panel shows spectral detail. The synthesized spectrum is in black, methanol in red, methyl cyanide in green, sulfur dioxide in blue, methyl formate in purple, and total fit in cyan.

Table 2

Comparison of fit parameters and reported uncertainties (1σ in parentheses) of both noiseless and noisy simulated spectra.

3.2. Benchmark tests: comparison to rotational diagram method using Orion-KL survey

Performance of GOBASIC was compared with two analysis techniques: the rotation diagram method such as that used by Blake et al. (1987) in their survey of Orion-KL, and the MAGIX (Möller et al. 2013) attachment to XCLASS⁶ (Zernickel et al. 2012; Möller et al. 2013).

The comparison of GOBASIC with the rotation diagram method used by Blake et al. (1987) was conducted using a spectral line survey of the Orion-KL star-forming region (Widicus Weaver, Sumner, & Lis, unpublished data) collected at the Caltech Submillimeter Observatory (CSO)⁷. In this test, analysis was conducted for methanol, sulfur dioxide, methyl cyanide, ethyl cyanide, methyl formate, dimethyl ether, thioformaldehyde, HCCCN, HNCO, CH₃CCH, HDCO, and HDO. Two fit iterations were performed. The first fitting iteration included ¹³CS, ³⁴SO, C³⁴S, CS, formaldehyde, and OCS. These molecules have intense spectral features, but multiple transitions do not occur in the spectral range. This fit simply accounts for their contributions to the spectrum and provides no quantitative temperature or column density information. The second fitting iteration included all 12 molecules listed above, taking the first fitting iteration results into account. Lines from CO isotopologues were flagged throughout the fitting process. The fit was performed with no intensity cut-off, and was open to the large parameter space used for the synthesized spectra described above. Since the results were benchmarked against Blake et al. (1987), whose analysis did not consider beam-filling factors, the beam-filling factor was also not included in the GOBASIC analysis presented here. The beam size of Blake et al. (1987) was identical to the beam size used for the CSO observations. Therefore, the results are directly comparable. Results of the fits are shown in Figs. 2 through 6, and in Table 3.

Fig. 2

Total fit for Orion of all reported molecules listed in Table 3. The observational spectrum is shown in black and the total fit in cyan.

Fig. 3

Orion fit details for methanol, sulfur dioxide (SO₂), HDO, HNCO, and dimethyl ether (DME). The observational spectrum is shown in black, the total fit in cyan, and molecular components are illustrated by the colors listed in the figure legend.

As can be seen in Figs. 3 through 6, GOBASIC modeled the major features of the observational spectrum with few over-predictions, despite the fact that Orion-KL is an extremely complicated source with multiple temperature and velocity components for several molecules and severe line blending in some cases. The temperatures, line widths and velocity shifts of most molecules are comparable with those obtained in the previous study of Blake et al. (1987), even when the parameters were allowed to vary significantly during the optimization. Differing column densities are due to the different temperatures and line widths found between the two studies. The difference in temperatures of some molecules between the two analyses is not surprising because there is a propensity for over-prediction in the rotation diagram analysis due to blended lines, where it is difficult to determine the amount of flux arising from a single molecule. In a rotation diagram, any over-prediction of the integrated intensities that results from line blending may change the slope of the line, and hence the predicted temperature. It can also be seen with the co-adding of spectral simulations of many molecules that the low intensity transitions also contribute significant flux that cannot be accounted for with the single-line Gaussian fitting used to produce a rotation diagram. A global analysis such as that used in GOBASIC can take these components into consideration when determining the physical conditions for a particular molecule.

Another advantage of GOBASIC is that it can be used to identify new molecules, provided that the molecular catalog is available. Rotation diagrams can be unreliable for new molecule identification because reasonable rotation diagrams may be produced from a set of unrelated weak lines, especially for complicated asymmetric rotors (Snyder et al. 2005). In GOBASIC, however, the velocity shift is uniform for all lines in a given molecular component, enabling straightforward determination of missing predicted transitions.

Also, the results of this GOBASIC test indicate that the fit was improved by both fitting multiple components simultaneously and by including previous fits of other molecules, especially in the cases when molecular lines are blended. In this benchmark, we fitted 15 molecular components of 12 molecules simultaneously on a 24-core parallel computing cluster in 60 h of machine time. But it may be impossible to fit all molecules together due to the limitation of computational power. In such cases, molecules can also be fitted individually as long as the transitions of one molecule do not overlap with another, and then included into the total fit later. If one molecule has various temperature or velocity components, or two molecules have significant overlap in transition frequencies (e.g., methyl cyanide and CH₃CCH), simultaneous fitting is recommended.

Fig. 4

Orion fit details for sulfur dioxide (SO₂) and thioformaldehyde (H₂CS). The observational spectrum is shown in black, the total fit in cyan, thioformaldehyde in green, and the three components of sulfur dioxide in the colors listed in the figure legend.

Fig. 5

Orion fit details for CH₃CCH and methyl cyanide (CH₃CN). The observational spectrum is shown in black, the total fit in cyan, the CH₃CCH fit in red, and the methyl cyanide fit in blue.

Fig. 6

Orion fit details for ethyl cyanide (C₂H₅CN), methyl formate (HCOOCH₃), HCCCN, and sulfur dioxide (SO₂). The observational spectrum is shown in black, the total fit in cyan, the ethyl cyanide fit in red, the methyl formate fit in blue, the HCCCN fit in green, and the sulfur dioxide total fit in orange.

Fig. 7

Methyl formate fit of W3(H₂O). The observational spectrum shown in gray, total fit in cyan, methyl formate single-component fit without beam-filling correction in red, and methyl formate fit including other molecules and beam-filling correction in blue. The top panel shows the fit over the whole spectrum and bottom panel shows spectral detail.

Table 3

Comparison of Orion fit parameters and previous analysis results from Blake et al. (1987).

3.3. Benchmark tests: comparison to XCLASS/MAGIX using W3(H₂O) survey

The comparison between GOBASIC and the XCLASS/MAGIX code was conducted by analyzing a line survey of W3(H₂O) for methyl formate. The W3(H₂O) observations were collected at the CSO; these observations and the full analysis of the W3(H₂O) line survey will be presented in a forthcoming publication (Wehres et al., in prep.). In the first step of the comparison between GOBASIC and XCLASS/MAGIX, it was verified that the GOBASIC and XCLASS/MAGIX programs produced the same simulated spectra when using the same input parameters. Subsequent tests involved analysis of identical datasets using both GOBASIC and XCLASS/MAGIX. All XCLASS/MAGIX fits were performed using the Levenberg-Marquardt algorithm, and no correction was made for source size.

In the analysis of W3(H₂O), 15 small molecules such as CO and CS isotopologues were fitted in advance and included in GOBASIC as previous fits, whereas methyl formate was fitted simultaneously with five other larger molecules, including methanol, methyl cyanide, CH₃CCH, dimethyl ether and sulfur dioxide. Fits both with and without consideration of beam filling factor were performed. The source size used was 3.2′′ × 2.0′′ (Hernández-Hernández et al. 2014). A second fit including only methyl formate, expected to give a non-physical result, was also performed for comparison. The results are presented in Table 4, and the fits are shown in Fig. 7.

Table 4

Comparison of XCLASS/MAGIX and GOBASIC fit parameters for methyl formate on W3(H₂O).

The noise level of the dataset is around 0.05 K, and over half of the lines of methyl formate in the frequency range are blended or in the confusion limit. Because of the blended features, fitting methyl formate alone would produce non-physical results. It can be seen from Fig. 7 that without including other molecules, a fit of methyl formate alone tends to over-predict line widths. When including the fitted results of other molecules, GOBASIC gave a result with a reasonable set of parameters for methyl formate. Furthermore, the correction for beam-filling should give a more accurate estimate of column density, while the line width, temperature and velocity shift are not affected significantly. Although the source size information retrieved from Hernández-Hernández et al. (2014) is only the 1.3 mm continuum source size, which is expected to be different from the methyl formate source size, we can still see from Table 4 that the column density is rescaled based on the beam filling factor, while other parameters agree with the non-corrected results within the uncertainty limit. The maximum opacity of methyl formate determined in the full spectral analysis also increased because of the beam correction factor, but is still within optically thin approximation.

The results also agree with those from XCLASS/MAGIX. The linewidth differences arise because the fitting process is so different between the two programs. GOBASIC’s automatic fitting process requires much less manual input than XCLASS/MAGIX, which required an initial fit of strong methyl formate lines to constrain the initial guesses, and a tight parameter space for fitting. GOBASIC, on the other hand, was able to fit, unguided, from a large range of initial guesses, and demonstrated the ability to deal with heavily blended spectral line features when a sufficient number of molecules were included in the fit.

4. Conclusions

We have presented here a spectral analysis program for simultaneous global analysis of broadband line surveys, Global Optimization and Broadband Analysis Software for Interstellar Chemistry (GOBASIC). The GOBASIC program benefits from the use of a derivative-free optimization method which is appropriate for these types of non-linear, non-smooth optimization problems. The simplicity of the generating function and optimization algorithm make this program appropriate for use on most desktop and laptop computers. This program also uses parallel processing to decrease computing time, and can be adapted to run in a computing cluster environment.

The GOBASIC program has been benchmarked against both synthesized and observational datasets and the results compare favorably to those obtained using other methods. GOBASIC can be used for all molecules for which rotational spectral catalogs are available. The use of the LTE approximation allows for rapid analysis of a spectrum, enabling simultaneous spectral fitting over large spectral bandwidths for multiple molecules with multiple physical components. The program offers several advantages over other fitting routines, including robust determination of the global minimum, simple user input, simple correction for beam-filling, fast analysis time with parallel computing, and a simultaneous broadband analysis approach.

Appendix A: Uncertainty estimation of fitted parameters

Denoting the optimizing parameters log ₁₀(N_T),Δv,0.01T,v_off of all components fit as a parameter vector p^pp = [p₁,p₂,...,p_j], the fitting function in Eq. (11) is $y_{i} = f (x_{i}; p p p) = f (x_{i}; p_{1}, p_{2}, ..., p_{j})$ $\appendix \setcounter{section}{1} \begin{equation} \label{AEQ:fitfun} y_i=f(x_i;\pmb{p})=f(x_i;p_1,p_2,\ldots,p_{j}) \end{equation}$ (A.1)for all (x_i,y_i) pairs in the xy dataset.

We use the equations described in Burrell (1990) to estimate the error of the parameters. The curvature matrix $C C C$ $\hbox{$\pmb{C}$}$ is defined by $\begin{matrix} C_{jl} & = & \sum_{i} \frac{1}{σ_{i}^{2}} ((\frac{∂f (x_{i}; p p p)}{\partial p_{j}}) \cdot (\frac{∂f (x_{i}; p p p)}{\partial p_{l}}) - (y_{i} - f (x_{i}; p p p)) \\ \times \frac{\partial^{2} f (x_{i}; p p p)}{\partial p_{j} \partial p_{l}}), \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} C_{jl} &=& \sum_i \frac{1}{\sigma^2_i}\left(\left(\frac{\partial f(x_i;\pmb{p})}{\partial p_j}\right)\cdot\left(\frac{\partial f(x_i;\pmb{p})}{\partial p_l}\right)-\left(y_i-f(x_i;\pmb{p})\right) \right.\nonumber\\[2mm] &&\quad \times \left. \frac{\partial^2f(x_i;\pmb{p})}{\partial p_j\partial p_l}\right), \end{eqnarray}$ (A.2)summing all observational data points (x_i,y_i). It is also assumed that all y_i have the same distribution of error, so σ_i = σ = | | y^yy−f(x;p^pp) | | [N−dim(p^pp)] ^-0.5 is the rms of the residual of the spectra; this value can therefore be taken out of the summation.

Let matrix $ϵ ϵ ϵ$ $\hbox{$\pmb{\epsilon}$}$ be the inverse of $C C C$ $\hbox{$\pmb{C}$}$ , then $σ_{p_{j}}^{2} = \frac{1}{σ_{T_{a}}^{2}} \sum_{i} {(\sum_{l} ϵ_{jl} \frac{\partial}{\partial p_{l}} f (x_{i}; p p p))}^{2} .$ $\appendix \setcounter{section}{1} \begin{equation} \sigma_{p_j}^2 = \frac{1}{\sigma_{T_\mathrm{a}}^2}\sum_i\left(\sum_l\epsilon_{jl}\frac{\partial}{\partial p_l}f(x_i;\pmb{p})\right)^2. \end{equation}$ (A.3)The partial derivative of the fit function is extremely complicated. However, GOBASIC uses numerical derivatives to approximate the results. We define $\begin{matrix} \frac{\partial}{\partial p_{j}} f (x; p p p) \approx \frac{1}{δ} (f (x; p_{j} + δ / 2, p_{l}) - f (x; p_{j} - δ / 2, p_{l})^{)}, (l \neq j \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} \frac{\partial}{\partial p_j}f(x;\pmb{p}) \approx \frac{1}{\delta}\left(f(x;p_j+\delta/2,p_l)-f(x;p_j-\delta/2,p_l)\right), \,\, (l\neq j \end{eqnarray}$ (A.4) $\begin{matrix} \frac{\partial^{2}}{\partial p_{j}^{2}} f (x; p p p) \approx \frac{4}{δ^{2}} ({(f (x; p_{j} + δ / 2, p_{l}) - f (x; p_{j}, p_{l})^{)}}^{} \\ - {(f (x; p_{j}, p_{l}) - f (x; p_{j} - δ / 2, p_{l})^{)}}^{)}, (l \neq j) \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} &&\frac{\partial^2}{\partial p_j^2}f(x;\pmb{p}) \approx \frac{4}{\delta^2}\left(\left(f(x;p_j+\delta/2,p_l)-f(x;p_j,p_l)\right) \right. \nonumber\\[-1mm] &&\left. \hspace*{1.8cm} -\left(f(x;p_j,p_l)-f(x;p_j-\delta/2,p_l)\right)\right),\quad(l\neq j)\\[-8mm]\nonumber \end{eqnarray}$ (A.5)and $\begin{matrix} \frac{\partial^{2} f (x; p p p)}{\partial p_{j} \partial p_{l}} & \approx & \frac{1}{δ^{2}} [(f (x; p_{j} + δ / 2, p_{l} + δ / 2, p_{m})^{} \\ - f (x; p_{j} + δ / 2, p_{l} - δ / 2, p_{m})^{)} \\ - (f (x; p_{j} - δ / 2, p_{l} + δ / 2, p_{m})^{} \\ - f (x; p_{j} - δ / 2, p_{l} - δ / 2, p_{m})^{)}], (m \neq j,l) \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} \frac{\partial^2f(x;\pmb{p})}{\partial p_j\partial p_l}& \approx & \frac{1}{\delta^2}\Big[\left(f(x;p_j+\delta/2,p_l+\delta/2,p_m) \right.\nonumber\\[-1mm] &&\left.\quad -f(x;p_j+\delta/2,p_l-\delta/2,p_m)\right) \nonumber \\[-1mm] &&\quad -\left(f(x;p_j-\delta/2,p_l+\delta/2,p_m)\nonumber \right. \\[-1mm] &&\quad \left. -f(x;p_j-\delta/2,p_l-\delta/2,p_m)\right)\Big],\quad(m\neq j,l) \end{eqnarray}$ (A.6)where δ is a small increment. GOBASIC calculates the variance-covariance matrix and uncertainty of parameters from these values. In the program δ is by default chosen as 10^-3. Tests have shown that a value of δ between 10^-6 to 10^-2 usually gives reasonable and consistent results. Values of δ that are too small may cause singularity during division when calculating the numerical derivatives, due to the finite precision of float numbers in the computer, while values that are too large violate the numerical derivative approximation.

The standard uncertainty for the antenna temperature σ_{T_a} can be either determined based on actual observational data, or from the goodness of the fit. From observational data, the user determines the noise of the dataset and manually enters σ_{T_a} in GOBASIC. From the goodness of the fit, $σ_{T_{a}} [fromfit] = σ = \sqrt{\frac{\sum_{i = 1}^{N} (y_{i} - f (x_{i}; p p p))^{2}}{N - \dim (p p p)}},$ $\appendix \setcounter{section}{1} \begin{equation} \sigma_{T_\mathrm{a}}\text{[from fit]} = \sigma = \sqrt{\frac{\sum_{i=1}^N(y_i-f(x_i;\pmb{p}))^2}{N-\dim(\pmb{p})}}, \end{equation}$ (A.7)where dim(p^pp) is the dimension of $p p p$ $\hbox{$\pmb{p}$}$ , i.e., the number of parameters fitted.

The conversion of uncertainties of log ₁₀N_T and 0.01T to N_T and T uses standard uncertainty propagation equations.

¹

http://www.astro.uni-koeln.de/cdms/molecules (Müller et al. 2005).

²

http://cassis.irap.omp.eu

³

http://chemistry.emory.edu/faculty/widicusweaver/gobasic/GOBASIC_4.4.tar.gz

⁴

http://www.cv.nrao.edu/course/astr534/2DApertures.html

⁵

Global Optimization Toolbox: Direct Search: patternsearch User’s Guide (R2014a). Retrieved September, 2014 from http://www.mathworks.com/help/gads/patternsearch.html

⁶

https://www.astro.uni- koeln.de/projects/schilke/XCLASS

⁷

The CSO is operated by the California Institute of Technology, previously under contract from the National Science Foundation.

Acknowledgments

The authors thank both the referee and Malcolm Walmsley for very helpful comments that improved this manuscript. Funding for this work was provided by NASA Herschel OT1 Analysis Program RSA No. 1428755, the Emory Chemistry Early Career Research Grant, Emory SURE and SIRE programs (funding from Howard Hughes Medical Institute Grant No. 52005873), and the Emory College Undergraduate Research Matching Grant. The authors thank James G. Nagy and Yuanjun Dai for helpful discussion regarding the use of MATLAB. The authors also gratefully acknowledge Matthew Sumner for his help in collecting the Orion-KL dataset, Darek Lis for his help on spectral deconvolution and his useful input during development of the analysis program, Emerson Center for Scientific Computation at Emory University for running most of the tests, and Jacob Laas for his perl script for converting catalog files into tab-delimited text files.

References

Blake, G., Sutton, E., Masson, C., & Philips, T. 1987, ApJ, 315, 621 [NASA ADS] [CrossRef] [Google Scholar]
Burrell, K. H. 1990, Am J. Phys., 58, 160 [NASA ADS] [CrossRef] [Google Scholar]
Garrod, R., Widiucs Weaver, S., & Herbst, E. 2008, ApJ, 682, 283 [NASA ADS] [CrossRef] [Google Scholar]
Goldsmith, P., & Langer, W. 1999, ApJ, 517, 209 [NASA ADS] [CrossRef] [Google Scholar]
Herbst, E., & van Dishoeck, E. 2009, ARA&A, 47, 427 [NASA ADS] [CrossRef] [Google Scholar]
Hernández-Hernández, V., Zapata, L., Kurtz, S., & Garay, G. 2014, ApJ, 786, 38 [NASA ADS] [CrossRef] [Google Scholar]
Laas, J., Garrod, R., Herbst, E., & Widicus Weaver, S. 2011, ApJ, 728, 71 [NASA ADS] [CrossRef] [Google Scholar]
Maret, S., Hily-Blant, P., Pety, J., Bardeau, S., & Reynier, E. 2011, A&A, 526, A47 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Möller, T., Bernst, I., Panoglou, D., et al. 2013, A&A, 549, A11 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Müller, H., Schlöder, F., Stutzki, J., & Winnewisser, G. 2005, J. Mol. Struct., 742, 215 [NASA ADS] [CrossRef] [Google Scholar]
Pickett, H., Poynter, R., Cohen, E., et al. 1998, J. Quant. Spectr. Rad. Transf., 60, 883 [NASA ADS] [CrossRef] [Google Scholar]
Snyder, L., Lovas, F., Hollis, J., et al. 2005, ApJ., 619, 914 [CrossRef] [Google Scholar]
van der Tak, F., Black, J., Schlöier, F., Jansen, D., & van Dishoeck, E. 2007, A&A., 468, 627 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Zernickel, A., Schilke, P., Schmiedeke, A., et al. 2012, A&A, 546, A87 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

All Tables

Table 1

Example of the iterative fitting process.

In the text

Table 2

Comparison of fit parameters and reported uncertainties (1σ in parentheses) of both noiseless and noisy simulated spectra.

In the text

Table 3

Comparison of Orion fit parameters and previous analysis results from Blake et al. (1987).

In the text

Table 4

Comparison of XCLASS/MAGIX and GOBASIC fit parameters for methyl formate on W3(H₂O).

In the text

All Figures

	Fig. 1 Fit of synthesized noisy spectrum for four molecules. The top panel shows the entire spectrum and the bottom panel shows spectral detail. The synthesized spectrum is in black, methanol in red, methyl cyanide in green, sulfur dioxide in blue, methyl formate in purple, and total fit in cyan.
In the text

	Fig. 2 Total fit for Orion of all reported molecules listed in Table 3. The observational spectrum is shown in black and the total fit in cyan.
In the text

	Fig. 3 Orion fit details for methanol, sulfur dioxide (SO₂), HDO, HNCO, and dimethyl ether (DME). The observational spectrum is shown in black, the total fit in cyan, and molecular components are illustrated by the colors listed in the figure legend.
In the text

	Fig. 4 Orion fit details for sulfur dioxide (SO₂) and thioformaldehyde (H₂CS). The observational spectrum is shown in black, the total fit in cyan, thioformaldehyde in green, and the three components of sulfur dioxide in the colors listed in the figure legend.
In the text

	Fig. 5 Orion fit details for CH₃CCH and methyl cyanide (CH₃CN). The observational spectrum is shown in black, the total fit in cyan, the CH₃CCH fit in red, and the methyl cyanide fit in blue.
In the text

	Fig. 6 Orion fit details for ethyl cyanide (C₂H₅CN), methyl formate (HCOOCH₃), HCCCN, and sulfur dioxide (SO₂). The observational spectrum is shown in black, the total fit in cyan, the ethyl cyanide fit in red, the methyl formate fit in blue, the HCCCN fit in green, and the sulfur dioxide total fit in orange.
In the text

	Fig. 7 Methyl formate fit of W3(H₂O). The observational spectrum shown in gray, total fit in cyan, methyl formate single-component fit without beam-filling correction in red, and methyl formate fit including other molecules and beam-filling correction in blue. The top panel shows the fit over the whole spectrum and bottom panel shows spectral detail.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Blake, G., Sutton, E., Masson, C., & Philips, T. 1987, ApJ, 315, 621 [NASA ADS] [CrossRef] [Google Scholar]

[2] Burrell, K. H. 1990, Am J. Phys., 58, 160 [NASA ADS] [CrossRef] [Google Scholar]

[3] Garrod, R., Widiucs Weaver, S., & Herbst, E. 2008, ApJ, 682, 283 [NASA ADS] [CrossRef] [Google Scholar]

[4] Goldsmith, P., & Langer, W. 1999, ApJ, 517, 209 [NASA ADS] [CrossRef] [Google Scholar]

[5] Herbst, E., & van Dishoeck, E. 2009, ARA&A, 47, 427 [NASA ADS] [CrossRef] [Google Scholar]

[6] Hernández-Hernández, V., Zapata, L., Kurtz, S., & Garay, G. 2014, ApJ, 786, 38 [NASA ADS] [CrossRef] [Google Scholar]

[7] Laas, J., Garrod, R., Herbst, E., & Widicus Weaver, S. 2011, ApJ, 728, 71 [NASA ADS] [CrossRef] [Google Scholar]

[8] Maret, S., Hily-Blant, P., Pety, J., Bardeau, S., & Reynier, E. 2011, A&A, 526, A47 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[9] Möller, T., Bernst, I., Panoglou, D., et al. 2013, A&A, 549, A11 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[10] Müller, H., Schlöder, F., Stutzki, J., & Winnewisser, G. 2005, J. Mol. Struct., 742, 215 [NASA ADS] [CrossRef] [Google Scholar]

[11] Pickett, H., Poynter, R., Cohen, E., et al. 1998, J. Quant. Spectr. Rad. Transf., 60, 883 [NASA ADS] [CrossRef] [Google Scholar]

[12] Snyder, L., Lovas, F., Hollis, J., et al. 2005, ApJ., 619, 914 [CrossRef] [Google Scholar]

[13] van der Tak, F., Black, J., Schlöier, F., Jansen, D., & van Dishoeck, E. 2007, A&A., 468, 627 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[14] Zernickel, A., Schilke, P., Schmiedeke, A., et al. 2012, A&A, 546, A87 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

Global Optimization and Broadband Analysis Software for Interstellar Chemistry (GOBASIC)⋆

1. Introduction

2. Program design

3. Results and discussion

3.1. Performance test: synthesized spectra

3.2. Benchmark tests: comparison to rotational diagram method using Orion-KL survey

3.3. Benchmark tests: comparison to XCLASS/MAGIX using W3(H2O) survey

4. Conclusions

Appendix A: Uncertainty estimation of fitted parameters

Acknowledgments

References

All Tables

All Figures

Global Optimization and Broadband Analysis Software for Interstellar Chemistry (GOBASIC)^⋆

3.3. Benchmark tests: comparison to XCLASS/MAGIX using W3(H₂O) survey