Free Access
Issue
A&A
Volume 574, February 2015
Article Number A119
Number of page(s) 21
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/201323254
Published online 04 February 2015

© ESO, 2015

1. Introduction

Increasingly high-resolution and sensitive surveys at a range of wavelengths are revealing the complex structure of the sky emission with a high level of detail. As a consequence, the source extraction and photometry task is an increasingly challenging problem. There are at least three main situations that this task has to carefully manage: the sources are often observed on top of a complex, highly variable background; the sources are not necessarily isolated but are often blended into multiple, closely spaced objects; and the instruments are often designed to gather multi-wavelength emission from the sky, which is essential for understanding the underlying astrophysics, but the spatial resolution and sensitivity of the observations are then wavelength dependent.

For example, the Herschel Space Observatory (Pilbratt et al. 2010) has observed the far-infrared/sub-mm sky at various wavelengths in the range 70 ≤ λ ≤ 500μm with unprecedented sensitivity and resolution, requiring photometry routines tailored to its capabilities. So far, several different approaches have been produced to estimate the photometry in Herschel data, either adapted from previous existing algorithms or newly developed specifically for Herschel.

The standard Herschel data processing pipeline, HIPE (Herschel Interactive Processing Environment, Ott et al. 2010) allows the user to estimate the photometry by choosing between SUSSEXtractor (Savage & Oliver 2007) and DAOPHOT (Stetson 1987). SUSSEXtractor has been optimised in particular for SPIRE (Griffin et al. 2010), one of the two photometry instruments on-board Herschel. It convolves the image with a kernel derived from the point response function, and the convolved image is used to identify source peaks and to estimate source fluxes. DAOPHOT is based on a combination of find and aper IDL1 tasks. The sources are identified in the image convolved with a DAOPHOT convolution kernel, and the source flux is estimated with a standard aperture photometry assuming a uniform background.

Alternative algorithms such as starfinder (Diolaiti et al. 2000), which adopts the instrumental point spread function (PSF) as a template for the source identification, has been used in extragalactic Herschel surveys such as the PACS evolutionary programme (PEP, Lutz et al. 2011). On the other hand, some widely-used source identification algorithms such as clumpfind (Williams et al. 1994) do not work well in regions with extended emission, since the method ignores the presence of background emission (e.g. Pineda et al. 2011).

In addition, at least three approaches based on different techniques have been developed specifically for these new Herschel data. The MADX algorithm (Rigby et al. 2011) was designed to identify point-like sources in the Herschel extragalactic key-project survey ATLAS. The Cutex algorithm (Molinari et al. 2011) was also developed for the photometry of point-like and compact sources in crowded fields, whereas the getsources algorithm (Men’shchikov et al. 2012) was designed to extract sources ranging from point-like to very extended.

Cutex identifies compact sources in the second derivative image of the sky. The second derivative is very sensitive to strong signal variations, such as point sources, and not significantly affected by smooth signal variations, such as those due to diffuse structures. This allows source identification in fields with strong background contamination. The flux is then estimated by fitting each source with a 2D Gaussian model. The major limitation of this approach is with multi-wavelength analyses, as the Gaussian fit is constrained by the beam size. In the Herschel data the beam size varies by a factor of 7 (from 5′′ at 70 μm up to 35′′ at 500 μm), therefore the Gaussian fits encompass larger regions at longer wavelengths. To account for this bias, the flux of a source at longer wavelengths needs to be rescaled assuming a reference beam at a fixed wavelength. However, the rescaling procedure requires some assumed a-priori knowledge about the source properties (e.g. Nguyen Luong et al. 2011; Giannini et al. 2012).

On the other hand, getsources addresses the multi-wavelength problem based on a fine decomposition of the original images over a wide range of spatial scales and across all wavelengths. This approach appears to work in both test-cases and real data analysis, however it is time-consuming and may require considerable storage space (Men’shchikov et al. 2012) which could both be significant limitations in producing source catalogues across large regions.

An alternative approach is classical aperture photometry (Da Costa 1992). Aperture photometry can easily be adapted to multi-wavelength analysis, circumventing wavelength dependent resolution by using a fixed aperture at all wavelengths. This removes the need for any wavelength-resolution dependent correction factor when comparing source fluxes (unlike, for example, Cutex) or super-resolution of sources (unlike getsources). Aperture photometry does not require detailed source modelling, and the computation time is in general short. However, there are known problems in using classical aperture photometry in the presence of strong background contamination, as well as in crowded fields where sources are blended.

Here we present a new source extraction and photometry algorithm, Hyper (hybrid photometry and extraction routine), with the goal of providing an aperture photometry-based alternative method for the identification and multi-wavelength photometry of point-like and compact (elliptical) sources in regions (including crowded regions) where there is significant and variable background emission. Hyper uses an aperture matched to the source size and shape at a selected wavelength, usually one with the lowest resolution, to measure the flux of sources, providing a measurement of the emission from a common volume of material at all wavelengths.

  • 1.

    Two-dimensional Gaussian fitting is used to determine the size of the elliptical aperture region used for each source. The 2D Gaussian fitting takes the source elongation into account; however, it is only used to estimate the region over which to integrate the source flux. This approach minimises the flux contamination from the background or the companion sources with respect to circular aperture photometry.

  • 2.

    The background is estimated locally by evaluating different background models across regions of different sizes and modelling the background with different polynomial orders in each region. The code automatically chooses the background based on the lowest rms of the residual maps (evaluated for each estimated background).

  • 3.

    In the complicated and frequent case of crowded regions with blended sources, the flux estimation is done using a hybrid approach between aperture photometry and multi-Gaussian fitting. In the case of blended sources Hyper performs a simultaneous multi-Gaussian fit of the sources and separates the reference source from the (modelled) companions.

  • 4.

    Hyper is designed for multiple wavelengths photometry. It associates sources detected independently in different bands and allows the user to choose a fixed wavelength at which to initially identify sources and a different wavelength (typically, but not necessarily, the longest wavelength) at which to fit sources with a 2D Gaussian. The 2D Gaussian fit defines the aperture used to integrate the flux arising from the same volume of gas and dust at all wavelengths, bypassing the issue of wavelength dependent spatial resolutions.

  • 5.

    Finally, Hyper is very fast and it allows the user to identify hundreds of sources in wide fields in few minutes, making the code well suited to producing complete, multi-wavelengths source catalogue even in very complex and crowded fields.

The algorithm has been initially designed to produce a catalogue of proto- and pre-stellar clumps (Traficante et al. 2014) embedded in the infrared dark clouds (IRDCs) observed with the Herschel survey of the Galactic plane, Hi-GAL (Molinari et al. 2010). This survey has mapped the whole Galactic plane in the far-infrared (FIR) in 5 bands, using both PACS (Poglitsch et al. 2010) and SPIRE (Griffin et al. 2010) photometry instruments in parallel mode to observe the sky at 70, 160 μm (PACS), 250, 350, 500 μm (SPIRE). However, Hyper is highly adaptable and it can be used to produce source catalogues and photometry from different surveys or instruments.

Hyper is divided into two main blocks: source identification and source photometry. The source identification strategy is described in Sect. 2 while the aperture photometry is described in detail in Sect. 3. In Sects. 4 and 5 we describe the photometry results on both simulated and real data, respectively. Finally, in Sect. 6 we present our conclusions.

thumbnail Fig. 1

Hi-GAL 70 μm counterpart of the IRDC SDC19.073-0.602 (left panel). The blue crosses are the sources identified in the clean high-pass filtered image (right panel). The high-pass filtering emphasises the peaks while suppressing the diffuse emission.

Open with DEXTER

2. Source identification

The source identification, if done by eye, is a relatively simple exercise which allows experienced people to recognise source peaks even in very confused regions. On the other hand, setting up an automatic algorithm is a complicated task since the sources can be blended together, affected by noise and distributed on top of very variable background in the most complicated cases. In order to recognise sources in every environment, Hyper starts with the assumption that if we decompose each map into its spatial frequencies, the higher spatial frequencies describe the point-like and compact sources while all the background and environment variations are described by the lower frequencies. Following this hypothesis, Hyper identifies sources in a modified high-pass filtered map. The high-pass filtered map is obtained by convolving the map with a squared kernel designed to emphasise the brightness of the central pixel relative to neighbouring pixels. It is designed to have a positive peak value in the centre, surrounded by negative values. The filtering suppresses background variations and emphasises the source peaks, as well as small-scale noise and small-scale background variations, which, however, do not affect the source extraction as we show in Sect. 4. The size of the kernel used to filter the map is a user-defined parameter. It has to be set considering the beam size, the pixel size and the structures that the user wants to preserve. If the sky is properly Nyquist-sampled and the pixel size is (as usual) one third of the beam size, then a kernel of 57 pixels filters out most of the diffuse emission and it is able to isolate point-like, PSF-shaped sources, while a larger kernel is more suitable to find both PSF-shaped and slightly extended objects. The default value is set to 9, a value which also preserves the structure of the elongated sources. The convolution of the map with the chosen kernel is calculated in pixel space and it generates a ringing due to the nature of the kernel itself. The main ring is visible as negative pixels across bright sources in the filtered map, while the secondary rings give a negligible contribution to the map and do not affect the source identification step. Since these negative pixels cannot be real sources, they are masked before proceeding with source identification. Hyper sets the negative values to zero and we called the map “clean filtered” map. Note that since the filtered map is only used to identify the peak position of the sources, this filtered map can be modified to improve the identification without altering the flux estimation.

In the clean filtered map only PSF-shaped or the slightly elongated sources (plus noise and small-scale background variations) survive, set on top of a smoothed, or clean, residual background. This simplifies the peak identification on this map which is now easier and less dependent on the original complexity of the map. The peaks in the clean filtered map are identified by applying in sequence the find and gcntrd IDL routines. find searches for perturbations in the maps by means of Gaussian fitting, the most suitable procedure to identify compact source peaks. The gcntrd routine computes the source centroid starting with the find identification and it is used after the find routine to better constrain the source centroids. The find routine requires a threshold, σt, a user-defined parameter expressed as a multiple of the clean filtered map rms. We underline that the clean filtered map rms is unrelated to the true rms of the sky map. Instead, it depends on the distribution of sources in each field since in the clean filtered map the non-zero pixels only contain the source peaks and the background noise residuals. Therefore, σt determines the intensity of perturbation which is considered a source, and is the number of recovered sources in each field, as well as the likelihood of detecting false sources. As often occurs in approaches which require the selection of a threshold to analyse data which are potentially very different, there are no recipes to fix a priori the threshold, since it is influenced by the crowding of each specific field. For example, this approach is similar to that used in Cutex, since the source identification is done in the second derivative images and the threshold is fixed in the derivative domain (Molinari et al. 2011). Also, it is similar to σ-clipping algorithms which require the definition, a priori, of an unknown level to exclude outliers in the analysis. Users are therefore encouraged to test different values and choose a posteriori the most suitable value for their purposes. The algorithm evaluates the peak flux of the faintest identified source. This value is reported in the output file, providing the user feedback on the “depth” of the source extraction for the chosen threshold.

In Sect. 4 we discuss the effect of varying the threshold values for each Herschel wavelength. As we show with simulations in Sect. 4, the code correctly identifies the sources in the fields and the number of false positives identified can be very low, depending on the chosen threshold. An example of source identification in the 70 μm Hi-GAL counterpart of the IRDC SDC19.073-0.602 (Peretto & Fuller 2009) is shown in Fig. 1. The high-pass filter simply suppresses the diffuse emission while the point sources survived the filtering step, making them easily identifiable by the find and gcntrd IDL routines. In this example we used a threshold of σt = 5.5, which is suitable for this dataset (Traficante et al. 2014).

thumbnail Fig. 2

Upper and lower left: simulated map of a 2D Gaussian source added on top of a variable background, extracted from a region centred on longitude l = 55° in the Galactic plane from the Hi-GAL survey at 160 μm. The source itself is masked in the centre to improve the polynomial fit to the background. Centre: background estimate adopting a first-order (plane) approximation (upper centre) and a fourth-order approximation (lower centre). Right: residual map assuming a first-order polynomial fit for the background estimation (upper right) or a fourth-order (lower right) polynomial fit. The high-order polynomial takes the complexity of the background into account.

Open with DEXTER

3. Source aperture photometry

Once identified, the sources are modelled in the original (unfiltered) image with a 2D Gaussian profile. When analysing a single wavelength this Gaussian fit is used to define both the source position and size, as well as the elliptical region across which Hyper integrates the source flux. The process for analysing observations at multiple wavelengths is described in Sect. 3.4. This approach is a significant improvement over a standard circular aperture photometry with two main advantages: it minimises the integration of the residual emission which does not belong to the source, in particular for the most elongated sources; it also minimises the flux contamination arising from nearby sources.

The 2D Gaussian fitting is done using the mpfit2dpeak IDL routine (Markwardt 2009). For each source Hyper estimates seven parameters which can vary within specific ranges. The results are the best fit values of these parameters, for which an initial guess is needed. These parameters are broken down into four groups:

  • 1.

    background: the code cuts out a small region around each source and the background level is initially considered constant within the extracted region. The initial guess is fixed by taking the mean value of the region, evaluated with a sigma-clipping procedure to minimise the contribution of possible companion sources.

  • 2.

    source centroids: the centroid of each source is fixed to the value estimated in the source identification step. It can vary within a radius that can be set in the parameter file.

  • 3.

    source peak flux: the source peak flux is initially taken as the difference between the peak value and the background value and it can vary without bounds.

  • 4.

    2D Gaussian FWHMs and the position angle, φ: by default the FWHMs can vary from a minimum of Δθλ up to 2.0·Δθλ, where Δθλ is the beam FWHM of the observations. The range can be changed in a parameter file. The position angle can vary without limits.

The elliptical aperture used for the photometry is defined to have semi-minor and semi-major axes (the aperture radii) equal to the FWHMs obtained from the 2D Gaussian fit. In two distinct cases Hyper forces the semi-axes to be equal to the average of the minimum and maximum FWHM values chosen in the parameter file: if the semi-axes ratio is greater than 2.5, which can be the result of Gaussian fits forced by filaments where many sources are observed; if the mpfit2dpeak routine does not converge. In the output files the status of the fit is reported and it can assume three different values: 0 if the fit converged; 1 if the result of the fit was too elongated; 2 if the routine did not converge.

Before evaluating the source fluxes the background is estimated and removed and nearby sources are deblended as described in Sects. 3.1 and 3.2 respectively.

3.1. Background estimation

Hyper follows an iterative procedure to estimate and remove the local background. The background is evaluated in a rectangular region across each source and it is modelled with a function described by a 2D-polynomial. The size of the region can vary from a minimum of twice up to four times the major FWHMs of each source. The background is evaluated using different sized regions and different polynomial orders (from zero up to the fourth) with the mpfit2dfun routine (Markwardt 2009). The source is masked in the centre to avoid source contamination in the background estimation. All the different backgrounds are subtracted from the original map and the residual map with the lowest rms is selected to determine which is the best polynomial fit (and the region size) to model the background. This approach automatically accounts for both very smooth and highly variable local backgrounds. A keyword in the output file (“pol.”) identifies the best-order polynomial fit.

thumbnail Fig. 3

Left: simulation of seven blended sources with the corresponding Hyper profiles. Each ellipse represents the aperture area of the sources. The semi-axes are equal to the FWHMs obtained from the 2D Gaussian fit. Right: map of source 4 after companions removal done with the simultaneous multi-Gaussian fit and local background subtraction as described in the text. Only three sources contribute to the flux at the position of source 3 (sources 1, 3 and 6) and have been deblended. The rectangular region in the right panel is the local region where the background for the source 3 is evaluated after the deblending as described in Sect. 3.1. The residuals are low and they do not lead to severe flux errors for the reference source (Table 2).

Open with DEXTER

To show an example where a high-order polynomial improves the background estimation, we simulated a 2D Gaussian source that we added on top of a highly variable, real background extracted from a Hi-GAL region centred on longitude l = 55° at 160 μm. The source has a peak flux of 53 mJy and its integrated flux is 0.840 Jy. The peak flux is 4.5 × σ where σ = 12 mJy/pix is the standard deviation of the map evaluated after a sigma-clipping procedure. Figure 2 shows the simulated map, the first-order background and the best-order polynomial background (fourth-order) fits and the residuals respectively. Table 1 shows the fluxes evaluated using different background fits up to the fourth-order fit. Due to the high variation in the background, a simple first-order approximation gives rise to a severe overestimate of the flux and Hyper measured a total flux of 1.452 Jy, 73% higher than the true value. The discrepancy is significantly reduced using higher orders. With a fourth-order polynomial the code reduces the flux difference to 19%, with the flux measured equal to 1.003 Jy. Also the rms of the background residual after subtraction, evaluated in the same rectangular region used to fit the background, is lower for the higher degree polynomial fits (see Table 1).

Table 1

Hyper flux estimation for the injected test source (Fig. 2) using different polynomial fits to model the background.

3.2. Companion sources removal

The blending of overlapping sources is a major problem for aperture photometry and it can lead to a severe source flux overestimation if the companion sources are not carefully identified and disentangled.

The parameter which determines if a reference source is assigned one (or more) companions is the distance between its centroid and the centroids of other sources in the field. All sources with a distance less than a fixed value are flagged as companion sources by Hyper. The distance is a user specified parameter but by default is automatically set to twice the chosen maximum aperture radius (see Sect. 3). This is the maximum distance at which two sources can have their 2D Gaussian fits (and therefore their integration areas) partially overlapping.

In the case of a source with companions Hyper uses a hybrid combination of multi-Gaussian source fitting and aperture photometry. It simultaneously fits 2D Gaussians to all the sources plus a first-order polynomial background fit as a first guess. Hyper does not use a higher order in this step since the confusion due to the companion emission could affect the background fitting.

The 2D Gaussian parameters of the companions are used to subtract the corresponding Gaussians from the initial image. The final image contains the reference source plus the background and the residuals from the subtraction of each companion. The 2D Gaussian fitting of the source is then used to identify the region to integrate the flux as is in the single source case (Sect. 3.3), after the standard background subtraction described in Sect. 3.1.

To test the algorithm we have simulated a toy model in which we have injected 7 partially overlapping sources on top of a real background extracted from a patch of Hi-GAL map observed with PACS at 160 μm. Hyper recognises all the seven sources with a position accuracy of 1.14 ± 0.72″, therefore better than one third the pixel size (4.5 at 160 μm). Figure 3 shows the map of the seven sources and the map of the residuals after the companion and background subtraction for the brightest source. In this example, the flux of the source model is 35.772 Jy. The integrated flux estimated by Hyper without subtracting any companions is 45.076 Jy, a difference of 26%. After the companions removal (Fig. 4) the Hyper flux is 37.366 Jy, a difference of 4%. The model fluxes compared with the fluxes recovered with and without companions deblending for all the seven sources are shown in Table 2. The mean difference between the source model fluxes and the Hyper fluxes is 43% without companions removal. The difference drops substantially after the deblending, with a mean value of ~7.8% and in 5 out of 7 cases the flux recovered by Hyper is within 5% of the flux of the model.

thumbnail Fig. 4

Results of deblending the seven partially overlapping sources shown in Fig. 3. The flux of the 2D Gaussian source model for each source is shown as a black diamond. The source flux evaluated without any companions removal is shown as a blue triangle, and after companion removal is shown as a red asterisk.

Open with DEXTER

Table 2

Comparison between the integrated flux for the seven blended sources in Fig. 3.

3.3. Source photometry

Once the source background has been removed and the companions have been subtracted, the flux is evaluated by integrating all the flux within the elliptical aperture defined as described at the beginning of this Section.

The precision of the flux integral depends on the precision in evaluating the percentage of each pixel that belongs to the integration region, in particular for the pixels which are only partially included within the elliptical aperture. Hyper follows an oversampling procedure to evaluate the flux integral. Each pixel is expanded in a fixed number of sub-pixels using the frebin IDL routine, which preserves the total pixel flux. The frebin routine requires the number of bins per side of each pixel, which will determine the final number of sub-pixels. All the sub-pixels that are within the ellipse defining the aperture are counted in the evaluation of the flux integral.

The precision in the integration also depends on the number of bins and the number of points describing the ellipse. The higher the number of bins, the higher the number of points describing the ellipse required to include all the sub-pixels within the integration area.

We tested different configurations varying the number of bins, as well as the points describing the aperture ellipse. In order to determine the quality of the integration routine we produced a simple test map with constant pixel values of 10 (arbitrary units). We integrate in a circular region with a radius equal to 2 pixels, therefore the area of the circle is equal to 125.664. In Fig. 5 we report the percentage difference of the integral evaluated by Hyper for varying the number of ellipse points and bins.

thumbnail Fig. 5

Flux difference in percentage with varying the number of points describing the ellipse for different bins used to oversample each pixel with respect to the reference model flux. The differences are 1.5% using more than 200 points to describe each ellipse and with at least a bin value per side of 15 (green and blue lines). When using no bins (black line) the percentage is remarkably high and independent of the number of points describing the ellipse (using at least 200 points). In fact it implies integrating the entire pixel values for all the pixels with which the ellipse points is described.

Open with DEXTER

Without oversampling (no bins) the pixels can either fall completely within the area or fall completely outside, therefore the code does not converge to the correct value irrespective of the number of points describing the ellipse. At high numbers of bins and low numbers of points describing the ellipse, too many sub-pixels are associated with the region and the area results severely overestimated. However, increasing the number of points significantly improves the estimation and the difference from the true area and the estimated area is 1.5% using 1000 points for the ellipse and with bin value per side of 20 (e.g. oversampling the map of a factor 400). The number of bins, as well as the points which describe the ellipse, will determine the accuracy of the flux estimation. Increasing these values will enhance the flux estimation, but at the expenses of memory request and execution time. The default settings, used in the tests described in Sects. 4 and 5, are 20 bins and 1000 points.

3.4. Multiple wavelength source photometry

A major problem with multi-wavelength analysis is the different spatial resolution of the maps at the different wavelengths. Hyper (and the aperture photometry approach in general) can overcome this limitation by setting a fixed aperture radius over which to integrate the source flux regardless of the wavelength. We note that the algorithm leaves the user free to choose the minimum and maximum apertures as described above, however the aperture region should never be smaller than the beam size of the lowest resolution map.

This approach assures that the emission arising from the same volume of gas and dust is included in the flux integration at all wavelengths. The background is estimated at each wavelength using a square region with a length of a side equal to at least as the aperture, as described in Sect. 3.1. The flux measured for a source will include any perturbation in high resolution maps with a spatial frequency higher than the order of the best polynomial. This is appropriate as the emission from such perturbations contributes to the emission measured within the beam of the low resolution observations. In Sect. 4.4 we test the Hyper multi-wavelength approach by injecting sources in a complex field at each Herschel wavelengths and measuring the source flux in the region estimated at the longest Herschel wavelength, the 500 μm band. In Sect. 4.4 we also demonstrate that the method works well at all wavelengths and the largest uncertainties arise at 70 μm, where is strongest the contribution from background structures not subtracted within the 500 μm aperture.

If the user wants to obtain a multi-wavelength photometry with different apertures at each wavelength, Hyper can be easily run separately for each wavelength and the results combined a posteriori.

To process maps at various wavelengths Hyper can read a list of maps from its input file and four parameters can be set to control the multiple wavelength analysis. These parameters are:

  • 1.

    Multiple wavelengths: this parameter defines all the wavelengths at which the user wants to evaluate the flux within the same aperture region.

  • 2.

    Reference wavelength: this parameter defines the wavelength at which the sources are initially identified.

  • 3.

    Fitting wavelength: the wavelength at which the source size is determined and used to define the elliptical photometry aperture. It can be different from the reference wavelength.

  • 4.

    Common source distance: this radius sets the maximum distance at which sources observed at two different wavelengths are considered counterparts of the same object. By default it is equal to half the FWHM of the highest resolution map among the various selected wavelengths.

When running Hyper on multi-wavelength data the source identification and 2d Gaussian fitting is carried out independently at each wavelength. As well as providing the source positions, this procedure allows the deblending of the sources at each wavelength. The common source distance is then used to determine whether a source detected at the reference wavelength has a counterpart at the other wavelengths. The position of a source at the reference wavelength and of a source at another wavelength must be within the common source distance for them to be associated. The elliptical photometry aperture is then defined by the 2D Gaussian fit to the source at the fitting wavelength. The integrated flux is then evaluated within this same aperture at each wavelength. Therefore, Hyper only evaluates the source flux for the sources with counterparts at all the multiple wavelengths set in the parameter file. However, Hyper also produces a catalogue and a region file with the source peak positions of all the sources identified independently at each wavelength. The user can identify a posteriori sources which are detected at only some wavelengths and were therefore not included in the final output.

If there are multiple sources within the aperture in a high resolution map, the counterpart is assumed to be the source closest to the reference source. The fluxes of all the resolved sources contribute to the total flux if they are within the elliptical aperture. In the output file the cluster parameter indicates the number of the resolved sources observed at each wavelengths which fall within the aperture (see Sect. 5).

4. Simulated data

To test the quality of the extraction and photometry in presence of a highly variable, realistic background we produced several maps starting from real data extracted from the Hi-GAL survey. The Galactic plane seen in the FIR shows a strong background and it is crowded with sources (e.g., Molinari et al. 2010), so it is a suitable test of the code in extreme conditions.

We tested separately the source extraction routine, in order to explore the identification of false positives as function of the threshold σt (Sect. 2), and the full Hyper capabilities on different series of simulated maps.

4.1. Testing the source extraction routine

We first generated a series of map to test the effect of varying the threshold σt (Sect. 2) which determines the percentage of sources recovered, as well as of the number of false identifications. In order to show how the threshold σt influence the goodness of the source extraction, we extracted a squared region of the Galactic plane with smooth background and few point sources. The region has been extracted from the PACS 160 μm map centred on (l,b) = (20.6°,0.7°) and with a size of 15 per side. It contains a few identified point sources, depending on the value of the threshold σt. We produced 5 runs with different thresholds σt = [5.5,6.5,7.5,9.0,10.0]. The number of sources identified in the field are [12,3,3,1,1] respectively.

Table 3

Total number of sources injected at various wavelengths, for both T1 and T2 test cases.

We generated 80 random Gaussian sources modelled as 2D Gaussian with random FWHMs along their two axes and position angles. For this test we aimed to avoid confusion due to overlapping sources, therefore we generated sources with a minimum centroid to centroid separation of 50′′. The FWHM was allowed to vary in the range ΔθλFWHM ≤ 1.5·Δθλ. In order to improve the statistics, we generated 20 different templates, for a total of 1600 sources, divided in 2 groups of 800 sources.

In the first group (hereafter TS1), the source peak flux, Fp, varies from 1.5σtest up to 5σtest and in the second (TS2) 2.5σtestFp ≤ 25σtest, with σtest being the standard deviation of the flux in the map. TS1 has been carried out to test the ability of the code to identify faint objects. An example of TS1 realisation is in Fig. 6. We ran the code with different thresholds for both groups and the results are shown in Fig. 7. The false positives are due to perturbations in the background, therefore at low σt values the percentage of false positives is larger in TS1 than in TS2. At high σt values the number of false positives strongly decreases with less than 2% of false positives identified in both groups. This trend however can vary from observation to observation and there is no universal σt value (Sect. 2); it has to be optimised for each different source extraction analysis. In both TS1 and TS2 the percentage of true sources recovered is 100%.

Note that the threshold depends on both the intensity of the sources in the field and the wavelength. The next series of tests, aimed to demonstrate the quality of the overall Hyper capabilities, have been performed at all wavelengths and the thresholds have been chosen to minimise the false positives and maximise the true source recovery. The chosen threshold can be taken as a reference for each Herschel wavelength, however the users are encouraged to test different values to optimise the extraction for their own data.

thumbnail Fig. 6

One realisation of the TS1 run with 80 injected sources on top of true background, extracted from the Hi-GAL 160 μm map centred at on (l,b) = (20.6°,0.7°). The blue ellipses are the source identified by Hyper using a threshold σt of 6.5. The ellipses describe the aperture regions of the sources, evaluated as discussed in Sect. 3.

Open with DEXTER

thumbnail Fig. 7

Percentage of false positives identified for various thresholds σt for two simulated fields, TS1 with faint sources (red line) and TS2 with bright sources (blue dashed line). The fraction of false positives decreases with increasing threshold values.

Open with DEXTER

4.2. The photometry accuracy of Hyper

The full Hyper pipeline, from the source extraction up to the source photometry, has been tested and validated on complex, extended fields, with sources injected on top of real background extracted from Herschel observations. We have run two different series of tests: in the first we injected sources randomly across wide regions with variable backgrounds and in the second the sources has been injected on top of specific, highly background contaminated regions where real compact sources are more likely to be found, namely filaments.

In the first series of test we extracted a large patch for each wavelength of the Hi-GAL data in order to include different variable backgrounds. The patch belongs to a region included in the Galactic range 35° ≤ l ≤ 41° depending on the wavelength with an extent of 0.5° × 0.5°.

For each wavelength (except 350 and 500 μm), we generated 6 templates of 500 sources each modelled as 2D Gaussian as in the TS1 and TS2 tests. Due to the different pixel size of the longer Hi-GAL wavelengths, we injected fewer sources (100) in the 350 and 500 μm maps and produced 20 template realisations. In total, we generated 13 000 model sources (Table 3). In order to include more complex situations, in each realisation the source FWHMs could vary in the range ΔθλFWHM ≤ 2.0·Δθλ. The sources have minimum centroid to centroid separation to avoid excessive overlapping of 2.0·Δθλ. As in the case of TS1 and TS2, we ran two different series of tests. In the first run (hereafter T1), the source peak flux, Fp, varies from 1.5σtest up to 5σtest and in the second (T2) 5.5σtestFp ≤ 25σtest, with σtest being the standard deviation of the flux in each map. Examples of T1 realisations for PACS 70 μm and SPIRE 500 μm are shown in Fig. 8.

thumbnail Fig. 8

One realisation of T1 run for a field extracted from PACS 70 μm map (left panel) and SPIRE 500 μm map (right panel). There are 500 sources and 100 sources injected per realisation in the 70 μm and 500 μm maps respectively.

Open with DEXTER

We fixed different thresholds for each run, chosen after few trials (less than 5) and visual inspections of the extraction results. The thresholds have been chosen to recover as many sources as possible without extracting false positives. The number of sources recovered at each wavelength, with the corresponding thresholds, are in Table 4 for both the test cases. For T1, the most compelling test, the percentage of recovered sources is 100% in all the SPIRE bands and only 1 out of 1500 sources is not recovered in the PACS 160 map.

The only exception is the 70 μm map with 76 out of 1500 sources not recovered. A lower threshold would recovered all the sources, but at the cost of contamination with false sources. The chosen thresholds however circumvent this effect in both T1 and T2. For T2, the percentage of recovered sources is always 100% with the only exception of SPIRE 250 μm, in which Hyper misses 2 out of 1500 sources.

Table 4

Number and percentage of recovered sources at each wavelength for T1 and T2 runs using different thresholds.

The flux of each source, estimated as described in Sect. 3.3, is compared with the flux of the injected sources. The flux of the models is evaluated within the same area that we expect to recover with Hyper, therefore the flux integral is evaluated within a one FWHM radius. This area corresponds to 93.75% of the total area of a 2D Gaussian equal to 2 ∗ Aπσaσb, with A being the intensity of the source peak and σa and σb being the standard deviation of the Gaussian along the two semi-axes a and b respectively.

The flux differences between the value recovered by Hyper and each source model for all wavelengths, as function of the peak flux expressed as multiples of the map standard deviation (peak SDEV), are shown in Figs. A.1 and A.2 for T1 and T2 respectively.

Table 5

rms of the size distribution for both T1 and T2 runs at each wavelength.

We also compared the source sizes evaluated from the model and measured by Hyper. For the model, the source size is defined as the geometrical mean of the FWHM in each direction used to build the 2D Gaussian of each source. For Hyper, it is defined as the geometrical mean of the FWHMs estimated by the mpfit2dpeak routine, which is equivalent to the aperture radius. The comparison plots are shown in Figs. B.1 and B.2 for runs T1 and T2 respectively.

In Table 5 there is the rms of the size distribution as obtained from Hyper compared with the “true” size of the source models for both T1 and T2 runs. There are some sources for which Hyper cannot properly estimate the source shape and forces, internally in the mpfit2dpeak routine, the 2 FWHMs to be equal to the minimum or maximum aperture radius. These points are seen as the points distributed along fixed radii corresponding to the minimum and maximum aperture radius chosen at each wavelength. If the fit does not converge at all (the status of the fit is −1 or −2), Hyper forces the FWHMs to be equal to the geometrical mean of the minimum and the maximum aperture chosen in the parameter file. These points are distributed along fixed radii in the middle of the plots.

4.3. Testing Hyper in a complex environment: sources along filaments

We further validated Hyper across regions with strong local background variations where real compact objects are likely to be found. Herschel reveals that star formation mostly occurs along filamentary structures (e.g. Molinari et al. 2010; André et al. 2010; Arzoumanian et al. 2011), therefore for this test we selected real filaments observed in the Polaris Flare region. Polaris has been observed as part of the Herschel Gould Belt Survey programme (HGBS, André et al. 2010) and its reduced maps were downloaded from the HGBS Archives2. This region is rich in cirrus emission and filamentary structures (e.g. Men’shchikov et al. 2010; Ward-Thompson et al. 2010) but the absence of star formation activity makes it suitable for source injection. We isolated a wide sub-region at 160, 250, 350 and 500 μm. Since star formation has not yet begun, the region does not emit at λ = 70μm. We injected 50 2D Gaussians at each wavelength with random axes, orientations and fluxes along the filamentary structures and avoiding real sources. The 2D Gaussians have been generated at each wavelength using the same procedure described in Sect. 4.2. The fluxes vary in the range 6σPolFp ≤ 10σPol, with σPol being the rms of the map at each wavelength. The radii of the sources are in the range ΔθλFWHM ≤ 2.0·Δθλ, with Δθλ being the beam of each observation. Figure 9 shows the region and the injected sources at the four Herschel wavelengths.

We performed the test by injecting weak sources with the brightest source from 1.5 to 10 times fainter than the densest starless cores observed within the Polaris field, depending on the wavelength (Ward-Thompson et al. 2010). The minimum and maximum source fluxes, the average flux difference between the Hyper and the model fluxes and the 1-sigma dispersion of this difference at each wavelength are given in Table 6. Although the background structures generated by the filaments are difficult to model and the injected sources are relatively faint, the agreement between the recovered and the modelled fluxes is 20% on average, with a 1-sigma dispersion of 15% on average.

thumbnail Fig. 9

Section of the Polaris Flare region with filamentary structures as seen by Herschel at 160 (top left), 250 (top right), 350 (bottom left) and 500 (bottom right) μm (from HGBS, André et al. 2010). Fifty random 2D Gaussian sources have been injected at each wavelength and positioned on top of the filaments, as discussed in Sect. 4.3. The 250 μm map also shows the 50 sources positions and 2D Gaussian fit as obtained with Hyper.

Open with DEXTER

Table 6

Average flux difference between the Hyper fluxes and model fluxes for the 50 sources injected in the Polaris field as showed in Fig. 9 at 160, 250, 250 and 500 μm.

4.4. Testing Hyper multi-wavelength analysis

In order to test the Hyper multi-wavelength approach described in Sect. 3.4 we selected a complex region from Hi-GAL, namely a 2° × 2° region centred on (l,b) = (50°,0°) in the Galactic plane and we injected across the central region 200 identical 2D Gaussian sources at all Hi-GAL wavelengths: 70, 160, 250, 350 and 500 μm. We modelled the source spectral energy distribution (SED) at wavelengths 160 ≤ λ ≤ 500μm assuming a single-temperature greybody model with a spectral index of β = 2.0 (Elia et al. 2010; Giannini et al. 2012). We assumed a source with a temperature of 15 K, mass of 300 M in a radius of 0.5 pc and located at 4 kpc, typical values of compact protostars observed in the Galactic plane (e.g. Traficante et al. 2014). The corresponding fluxes are F = (31.50,29.84,17.37,7.28) Jy at λ = (160,250,350,500)μm respectively. At 70 μm the emission from young sources is usually observed in excess compared to a single-temperature greybody model (e.g. Motte et al. 2010; Giannini et al. 2012). Therefore we adopted a 70 μm flux of 3 Jy for this source (in comparison to the 0.9 Jy from the 15 K greybody model). We fixed the centroids of the sources equal at all wavelengths, but we let the FWHMs and the position angles at each wavelength vary. Each source realisation has been convolved with the appropriate Herschel beam before being injected into the corresponding map.

The aperture region was estimated at 500 μm and the flux is evaluated at all wavelengths within the same aperture. Hyper identified 194/200 sources using a threshold σt = 3.0. After a visual inspection of the 6 missing sources, 5 were randomly injected at the borders of the map and only 1 has been missed by the algorithm. An image of the complexity of the region in which we injected the sources at 250 μm and a zoom of a region including 5 injected sources at all wavelengths are in Figs. 11 and 12 respectively. The run at all five wavelengths simultaneously on this region requires 15 min on a 2.2 GHz Intel Core i7 with 8GB of RAM.

We have compared the Hyper fluxes with the injected fluxes for a subset of sources, namely all the sources which do not show any identified clustered companions. Clustered sources are in fact real sources identified within the integration region at shorter wavelengths and possibly overlapped with the injected ones at the longer wavelengths (see Sect. 3.4). We obtain 145 sources with no clustered companions. In Appendix C the flux difference distribution histograms at each wavelength are shown. We further excluded from the comparison the sources with an absolute flux difference greater than 5σd the mean of the flux difference, with σd being the dispersion of the distribution. These few sources (4 at 70 μm, 2 at 160 μm and 1 at 250 and 350 μm respectively) are close to bright, diffuse emission which contaminates the aperture region and was not fully removed by the background removal. We ended up with [141,143,144,145,145] sources at λ = (70,160,250,350,500)μm respectively. The mean flux differences between the injected models and the Hyper sources are shown in Fig. 10. Even if the background emission in the region is extremely complex and variable across the map, the absolute flux differences are [22,8,8,7,11] % with a σd of [20,7,8,6,9] % at λ = (70,160,250,350,500)μm respectively. The difference is slightly higher on average at 70 μm, likely due to high-spatial frequency variations of the background within the integration region that are not accounted in the background estimation.

If the user wants to measure the flux of each single resolved source in the high resolution images,and so more accurately remove the background on a smaller region in particular for the 70 μm counterparts, Hyper can be easily run separately at each different wavelength.

thumbnail Fig. 10

Average absolute flux differences between the flux models and the flux recovered by Hyper within the 500 μm fit aperture region at each wavelength. The accordance is on average 10% at all wavelengths except at 70 μm, where within the 500 μm aperture the flux estimation includes residual background contribution not removed with the background removal as discussed in Sect. 4.4.

Open with DEXTER

5. Real data applications

Testing photometry algorithms on real dataset is not an easy task since the correct value of the source flux is not known nor easily predictable, in particular in complex fields.It is not surprising that two different methods give different fluxes. The main reasons are the differences in the removal of the background and the deblending of the sources. A useful illustration has been shown in Pezzuto et al. (2012, Table 1), in which the fluxes of the two first hydrostatic core candidates (FHSC, e.g. Commerçon et al. 2012) identified by Herschel in the Perseus region can differ up to 50% if evaluated with Cutex or getsources. In that work the authors have chosen Cutex to estimate the flux and combined the Cutex and getsources photometry to estimate the flux uncertainties (Pezzuto et al. 2012). Recently, Sadavoy et al. (in prep.) found for several sources in Perseus star forming region differences in fluxes greater than a factor 2 when comparing the Spitzer c2d catalogue (Evans et al. 2003) with the Gutermuth et al. (2009) catalogue, also based on Spitzer data.

Nonetheless, testing Hyper on real data is crucial to validate the algorithm and, in order to compare Hyper with some other existing codes, we selected two different test-cases: in the first we selected a region of few relatively isolated sources for which we expect that the photometry should be least sensitive to the algorithm used (Sect. 5.1). In the second we selected a catalogue of 1000 compact sources across the Galactic plane, with integrated fluxes and background properties varying by several order of magnitudes (Sect. 5.2). The comparison of Hyper with other algorithms in such a complex case gives a good estimation of the intrinsic uncertainties in the flux estimation due to the different approaches.

Finally, we compared the Hyper photometry with the published catalogue of source fluxes extracted from a survey of the Galactic plane made with a ground-based telescope, the 1.1 mm Bolocam Galactic Plane Survey (BGPS, Aguirre et al. 2011) carried out with the Caltech Submillimeter observatory (Sect. 5.3). This test shows the versatility of the aperture photometry approach applied to different surveys and instrument specifications.

Table 7

Hyper output parameters.

5.1. Test on real data: isolated sources in B1 Perseus field

As region with relatively isolated sources we selected the B1 Perseus star forming region observed with Herschel at 70 μm as part of the HGBS programme (André et al. 2010).

The Perseus star-forming region is active in forming stars as demonstrated, e.g., by the large number of FHSC candidates reported in literature, e.g. L1451-mm (Pineda et al. 2011). In the B1 part of the Perseus region (see Sadavoy et al. 2012), Pezzuto et al. (2012) have recently found other two FHSC candidates, B1-bN and B1-bS, thanks to Herschel observations of the region. We have extracted from B1 a small square region where the sources are point-like, isolated and with a very faint background easily allowing a direct comparison of different photometry algorithms and philosophies. Importantly for this region the photometry is also available for both the Cutex and getsources algorithms.

thumbnail Fig. 11

Left: Hi-GAL map of the 2° × 2° region chosen centred on (l,b) = (50°,0°) observed at λ = 250μm chosen to test the Hyper multi-wavelength approach described in Sect. 4.4. Right: same region with overlapped the 200 injected sources as described in Sect. 4.4.

Open with DEXTER

thumbnail Fig. 12

Zoom of a region with 5 injected sources extracted from the map showed in Fig. 11. From left to right: the same region observed at 70, 160, 250, 350 and 500 μm. The black cross in the 70 μm image identify the 5 source centroids. The blue ellipses in the other maps represent the Hyper source fit done in the 500 μm map which determines the aperture region at all wavelengths as described in Sect. 4.4.

Open with DEXTER

Table 8

Fluxes and FWHM of the seven sources in part of the Perseus B1 region measured with Hyper, Cutex and getsources respectively.

The image of the region observed with PACS at 70 μm is shown in Fig. 13. Seven sources can be seen by eye and all of them have been identified by Hyper using σt = 6.5. The same sources have also been identified with Cutex and getsources. The background is very clean and different values of the Hyper threshold do not give rise to other, possibly false detections. The Hyper output file for these sources with all the parameters estimated by Hyper is shown in Table 7.

Table 8 compares the flux and size measurements from Hyper, Cutex and getsources. For the Hyper fluxes in this table we have been applied the aperture corrections, which are source-size and wavelength dependent. The correction curve is publicly available3. The Cutex fluxes have also been corrected for a size-dependent correction factor. The photometry for the different algorithms is compared in Fig. 14. The source FWHM has been evaluated as the geometrical mean of the FWHMs obtained from the 2d fits for both Hyper and Cutex algorithms. The agreement between the three codes is very good, with a mean difference in flux of <2% and 7% between Hyper and Cutex and getsources respectively. The source FWHMs differ by 6% and 7% with Cutex and getsources respectively.

thumbnail Fig. 13

A small area of the Perseus B1 region observed with PACS 70 μm as part of the HGBS (André et al. 2010). The background is flat all across the region and the seven sources (the blue ellipses) are easily identified by the code.

Open with DEXTER

thumbnail Fig. 14

Flux comparison between Hyper and Cutex (red crosses) and getsources (green triangles) for the seven sources identified in the Perseus 70 μm field. The Hyper flux are in very good agreement with the Cutex and the getsources fluxes, with a discrepancy on average of 7% and 13% with Cutex and getsources fluxes respectively.

Open with DEXTER

5.2. Test on complex real data: protostar clumps in IRDCs

As test case of a strongly variable background and various source fluxes, we selected the catalogue of protostellar clumps associated with IRDCs in the Galactic region 15° ≤ l ≤ 55°, | b | ≤ 1° (Traficante et al. 2014). These sources have been observed as part of the Hi-GAL survey (Molinari et al. 2010) for which the first generation of the compact sources catalogue produced with Cutex is being published (Molinari et al. 2014, in prep.), therefore allowing a direct comparison between these two algorithms in very complex and realistic fields. The Hyper catalogue contains 1000 clumps identified as Hyper compact sources at the reference wavelength of 160 μm and with counterparts at 70, 250 and 350 μm. The fitting wavelength is the 250 μm. Since Cutex evaluates the flux independently at each wavelength (Molinari et al. 2011), the two approaches are only directly comparable at the wavelength fixed as the fitting wavelength in the Hyper catalogue, therefore at 250 μm. In Fig. 15 we show the comparison between Hyper and Cutex fluxes for the 960 sources in common between the two 250 μm compact sources catalogues. Despite the complexity of the analysis the distributions are in a good agreement, although with a larger dispersion than for the simple case of the Perseus field (Sect. 5.1). The linear fit of the distribution has an intercept of 0.13 Jy which indicates very few systematics in the algorithms. The coefficient is m = 0.87, which indicates a slight overestimate of the Hyper fluxes compared to the Cutex fluxes on average. The scatter is likely due to the different approaches of the algorithms to estimate the background and to deblend the sources in crowded fields. However the rms of the distribution is 50%, in line with the differences observed by (Pezzuto et al. 2012) between Cutex and getsources in Perseus.

thumbnail Fig. 15

Comparison between Hyper and Cutex 250 μm integrated fluxes for 1000 compact sources extracted from the Traficante et al. (2014) survey of protostars associated with IRDCs in the Galactic range 15° ≤ l ≤ 55°. Blue dashed line: y = x straight line. Red line: linear fit of the distribution. The coefficient of the fit is m = 0.87, and the intercept is 0.13 Jy which indicate no systematics in the algorithms but a slight overestimate of the Hyper fluxes compared with the Cutex fluxes. The scatter is mostly due to the different background subtraction and deblending approaches.

Open with DEXTER

thumbnail Fig. 16

Flux comparison between the official BGPS flux catalogue and the Hyper flux estimation. The comparison includes 796 sources extracted from the public available BGPS v2 catalogue in the Galactic range 20° ≤ l ≤ 28°. The fluxes in the BGPS v2 catalogue are estimated with Bolocat within three different aperture radii: 20′′, 40′′ and 60′′ (for details of the BGPS v2 catalogue generation see Ginsburg et al. 2013). Hyper fluxes have been evaluated for all the sources fixing the three different apertures. The agreement is very good for both faint and bright sources at each aperture, with less than 1% of the source with a flux difference 50%.

Open with DEXTER

5.3. Test on the Bolocam Galactic Plane Survey data

To show the versatility of Hyper we have also tested the algorithm on real data obtained with a completely different instrument. For this purpose we have extracted a sub-region of the BGPS (Aguirre et al. 2011). The survey covers approximately the Galactic range −10.5° ≤ l ≤ 90.5°, | b | ≤ 0.5° and the effective beam FWHM is 33′′ (Aguirre et al. 2011). The public catalogue of compact source coordinates and flux densities has been produced with a specifically designed algorithm, Bolocat (Rosolowsky et al. 2010). We used the BGPS v2 catalogue (Ginsburg et al. 2013) which contains 8000 sources in total with flux densities estimated within three different aperture radii: 20′′, 40′′ and 60′′. We have compared the BGPS source catalogue fluxes with the Hyper fluxes measured on the public available BGPS map in a wide region in the range 20° ≤ l ≤ 28°, | b | ≤ 0.5°. The BGPS v2 catalogue in this region contains 796 sources. We evaluated the Hyper fluxes for all the BGPS sources at three fixed aperture radii, 20′′, 40′′ and 60′′, in order to allow a direct comparison with the BGPS fluxes. The flux comparison using these three aperture radii are shown in Fig. 16. The agreement is very good at all the three different apertures, with only few outliers at each aperture radius. Less than 1% of the sources have a flux difference greater than 50% at each aperture (67, 34 and 54 using aperture radii of 20′′, 40′′ and 60′′ respectively). Visual inspections show that these sources are either very weak or in crowded regions and/or on top of very variable backgrounds. The mean and rms of the flux differences at each aperture radius excluding these outliers are in Table 9. The average difference between the BGPS and Hyper fluxes is 9%, −8% and −10% at 20′′, 40′′ and 60′′ respectively, with a rms of 16.5% for all the apertures, likely due to the different strategies adopted to estimate the background and the source deblending.

Table 9

Mean flux difference and rms of the flux difference distribution between the flux of 796 BGPS sources extracted from the BGPS v2 catalogue and the flux estimated with Hyper using three different aperture radii.

6. Conclusions

We have developed a new source extraction and photometry algorithm, Hyper, an enhanced application of aperture photometry specifically designed for multi-wavelengths photometry on crowded fields in complex background. The extraction is done in a high-pass filtered map which amplifies the compact sources while suppressing the diffuse emission, allowing source identification in regions with highly variable background. The source photometry is done over an elliptical aperture with a size and shape estimated from a 2D Gaussian fit using the mpfit2dpeak IDL routine. The 2D Gaussian fitting allows us to identify the region over which integrate the flux for both point sources and slightly extended sources, minimising the flux contamination from the region surrounding the sources. The background is modelled with different polynomial orders and in squared regions of different sizes. The model which minimises the rms of the residual map is taken as the background estimate and subtracted from the data. Blended sources are fitted simultaneously with a multiple 2d Gaussian models and the fit for companions is subtracted from the original data before evaluating the flux of the reference source. This deblending systematically improves the flux estimates of the sources in crowded fields. The algorithm is designed to allow multi-wavelength flux estimation by fixing the aperture radius at a reference wavelength and integrating simultaneously at all the selected wavelengths across the same volume of gas and dust.

Hyper has been tested on simulated fields in which model sources have been injected on real observed backgrounds. These simulations show that Hyper can typically recover the model source flux with a high degree of accuracy both in the case of random injected sources and in specific tests with sources injected across filamentary structures. The multi-wavelength approach has been tested at the Herschel wavelengths demonstrating high degree of accuracy at each wavelength, with a slight flux overestimation in the extreme case of the flux at 70 μm estimated within the aperture fixed at 500 μm.

Hyper photometry has been tested on real fields showing good agreement with other algorithms and the estimation of the uncertainties both in a very simple case (few isolated sources in the B1 Perseus star-forming regions) and in very complex fields (hundred of sources on top of very variable backgrounds and in crowded regions). Also, the versatility of Hyper has been demonstrated by showing the very good agreement between the Hyper and the publicly available fluxes of 800 sources extracted from the BGPS, the 1.1 mm survey of the Galactic plane carried out with the Caltech Submillimeter Observatory.

Hyper is also very fast. To measure the fluxes of 1600 sources extracted from Hi-GAL counterparts of IRDCs in the 15° ≤ l ≤ 55°, | b | ≤ 1° region of the Galactic plane at four wavelengths simultaneously (70, 160, 250 and 350 μm) with the default settings it requires 30 min on a 2.2 GHz Intel Core i7 machine, and uses less than 200 Mb of RAM (Traficante et al. 2014)

Hyper is an IDL code initially developed to extract compact sources from Herschel surveys, in particular for the Herschel Galactic plane survey, Hi-GAL. However, it is highly modular and highly parameterisable and allows the user to adapt it to the specifications of different surveys and observations.

The code is freely available and it can be downloaded from the page http://www.irdarkclouds.org.


1

Interactive Data Language.

Acknowledgments

This research has made use of data from the Herschel Gould Belt Survey (HGBS) project (http://gouldbelt-herschel.cea.fr). The HGBS is a Herschel Key Programme jointly carried out by SPIRE Specialist Astronomy Group 3 (SAG 3), scientists of several institutes in the PACS Consortium (CEA Saclay, INAF-IFSI Rome and INAF-Arcetri, KU Leuven, MPIA Heidelberg), and scientists of the Herschel Science Center (HSC). The authors also want to thank the HGBS consortium for using the B1 Perseus map at PACS 70 μm and to publish the Cutex and getsources fluxes in Table 8. The authors want to thank Alexander Men’shchikov for having run getsources on a part of the B1 Perseus region. J.E.P. has received funding from the European Communitys Seventh Framework Programme (/FP7/20072013/) under grant agreement No 229517. A.T. wants to thank Maria del Mar Rubio Diez for the help during the test phase of the Hyper algorithm. A.T. is supported by STFC consolidated grant to JBCA.

References

  1. Aguirre, J. E., Ginsburg, A. G., Dunham, M. K., et al. 2011, ApJS, 192, 4 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
  2. André, P., Men’shchikov, A., Bontemps, S., et al. 2010, A&A, 518, L102 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  3. Arzoumanian, D., André, P., Didelon, P., et al. 2011, A&A, 529, L6 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  4. Commerçon, B., Launhardt, R., Dullemond, C., et al. 2012, A&A, 545, A98 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  5. Da Costa, G. S. 1992, in Astronomical CCD Observing and Reduction Techniques, ed. S. B. Howell, ASP Conf. Ser., 23, 90 [Google Scholar]
  6. Diolaiti, E., Bendinelli, O., Bonaccini, D., et al. 2000, A&A, 147, 335 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  7. Elia, D., Schisano, E., Molinari, S., et al. 2010, A&A, 518, L97 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  8. Evans, II, N. J., Allen, L. E., Blake, G. A., et al. 2003, PASP, 115, 965 [NASA ADS] [CrossRef] [Google Scholar]
  9. Giannini, T., Elia, D., Lorenzetti, D., et al. 2012, A&A, 539, A156 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  10. Ginsburg, A., Glenn, J., Rosolowsky, E., et al. 2013, ApJS, 208, 14 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
  11. Griffin, M. J., Abergel, A., Abreu, A., et al. 2010, A&A, 518, L3 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  12. Gutermuth, R. A., Megeath, S. T., Myers, P. C., et al. 2009, ApJS, 184, 18 [NASA ADS] [CrossRef] [Google Scholar]
  13. Lutz, D., Poglitsch, A., Altieri, B., et al. 2011, A&A, 532, A90 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  14. Markwardt, C. B. 2009, in ADASS XVIII, eds. D. A. Bohlender, D. Durand, & P. Dowler, ASP Conf. Rev., 411, 251 [Google Scholar]
  15. Men’shchikov, A., André, P., Didelon, P., et al. 2010, A&A, 518, L103 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  16. Men’shchikov, A., André, P., Didelon, P., et al. 2012, A&A, 542, A81 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  17. Molinari, S., Swinyard, B., Bally, J., et al. 2010, PASP, 122, 314 [NASA ADS] [CrossRef] [Google Scholar]
  18. Molinari, S., Schisano, E., Faustini, F., et al. 2011, A&A, 530, A133 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  19. Motte, F., Zavagno, A., Bontemps, S., et al. 2010, A&A, 518, L77 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  20. Nguyen Luong, Q., Motte, F., Hennemann, M., et al. 2011, A&A, 535, A76 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  21. Ott, S., & Herschel Science Ground Segment Consortium 2010, AAS Meeting, 216, 413.10 [Google Scholar]
  22. Peretto, N., & Fuller, G. A. 2009, A&A, 505, 405 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  23. Pezzuto, S., Elia, D., Schisano, E., et al. 2012, A&A, 547, A54 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  24. Pilbratt, G. L., Riedinger, J. R., Passvogel, T., et al. 2010, A&A, 518, L1 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  25. Pineda, J. E., Arce, H. G., Schnee, S., et al. 2011, ApJ, 743, 201 [NASA ADS] [CrossRef] [Google Scholar]
  26. Poglitsch, A., Waelkens, C., Geis, N., et al. 2010, A&A, 518, L2 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  27. Rigby, E. E., Maddox, S. J., Dunne, L., et al. 2011, MNRAS, 415, 2336 [NASA ADS] [CrossRef] [Google Scholar]
  28. Rosolowsky, E., Dunham, M. K., Ginsburg, A., et al. 2010, ApJS, 188, 123 [NASA ADS] [CrossRef] [Google Scholar]
  29. Sadavoy, S. I., di Francesco, J., André, P., et al. 2012, A&A, 540, A10 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  30. Savage, R. S., & Oliver, S. 2007, ApJ, 661, 1339 [NASA ADS] [CrossRef] [Google Scholar]
  31. Stetson, P. B. 1987, PASP, 99, 191 [NASA ADS] [CrossRef] [Google Scholar]
  32. Traficante, A., Fuller, G. A., Peretto, N., et al. 2014, MNRAS, submitted [Google Scholar]
  33. Ward-Thompson, D., Kirk, J. M., André, P., et al. 2010, A&A, 518, L92 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  34. Williams, J. P., de Geus, E. J., & Blitz, L. 1994, ApJ, 428, 693 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Flux difference between source model and Hyper measurements for T1 and T2

thumbnail Fig. A.1

Difference between Hyper measured fluxes and the model fluxes as function of the source peak fluxes measured as a multiple of the standard deviation of the map (peak SDEV), for T1 at all wavelengths. 70 μm flux distribution (upper left panel), 160 μm (upper right), 250 μm (centre left), 350 μm (centre right) and 500 μm (lower panel). The mean and median values are only few percent and the rms of the distribution is only higher than 30% for the 350 μm map. Mostly for the faint sources the flux estimation is worse than 30%.

Open with DEXTER

thumbnail Fig. A.2

Same as Fig. A.1, but for T2. The distributions are much more regular with a rms ≤ 10%. Only few sources have a flux difference higher than 15%.

Open with DEXTER

Appendix B: Radius difference between source model and Hyper measurements for T1 and T2

thumbnail Fig. B.1

Comparison of the injected and measured source size for run T1, evaluated as the geometrical mean between the minimum and the maximum aperture radius. 70 μm size distribution (upper left panel), 160 μm (upper right), 250 μm (centre left), 350 μm (centre right) and 500 μm (lower panel). There are less than 1% of the points distributed along three vertical lines, corresponding to fixed Hyper source radii. The lowest and the highest radii correspond to the sources that Hyper has forced to be equal to a circular 2D Gaussian with FWHM equal to, respectively, the minimum and the maximum aperture radius chosen in the parameter file. The central points are in correspondence of fits that did not converge. In these cases Hyper has forced the source shape to be circular with a FWHM equal to the geometrical mean of the minimum and maximum FWHM limits set in the parameters file.

Open with DEXTER

thumbnail Fig. B.2

Same distributions as Fig. B.1 but for run T2.

Open with DEXTER

Appendix C: Absolute flux difference distributions between injected sources and Hyper sources in the multi-wavelength approach

thumbnail Fig. C.1

Histogram of the absolute flux difference between the flux of the injected sources and the Hyper fluxes at each wavelengths with flux measured in the 500 μm aperture region as described in Sect. 4.4. The sources are 145 at each wavelength, all the sources with no clustered companions as described in Sect. 4.4. The figures show the 70 μm (upper left panel), 160 μm (upper right), 250 μm (centre left), 350 μm (centre right) and 500 μm (lower panel) distributions.

Open with DEXTER

All Tables

Table 1

Hyper flux estimation for the injected test source (Fig. 2) using different polynomial fits to model the background.

Table 2

Comparison between the integrated flux for the seven blended sources in Fig. 3.

Table 3

Total number of sources injected at various wavelengths, for both T1 and T2 test cases.

Table 4

Number and percentage of recovered sources at each wavelength for T1 and T2 runs using different thresholds.

Table 5

rms of the size distribution for both T1 and T2 runs at each wavelength.

Table 6

Average flux difference between the Hyper fluxes and model fluxes for the 50 sources injected in the Polaris field as showed in Fig. 9 at 160, 250, 250 and 500 μm.

Table 7

Hyper output parameters.

Table 8

Fluxes and FWHM of the seven sources in part of the Perseus B1 region measured with Hyper, Cutex and getsources respectively.

Table 9

Mean flux difference and rms of the flux difference distribution between the flux of 796 BGPS sources extracted from the BGPS v2 catalogue and the flux estimated with Hyper using three different aperture radii.

All Figures

thumbnail Fig. 1

Hi-GAL 70 μm counterpart of the IRDC SDC19.073-0.602 (left panel). The blue crosses are the sources identified in the clean high-pass filtered image (right panel). The high-pass filtering emphasises the peaks while suppressing the diffuse emission.

Open with DEXTER
In the text
thumbnail Fig. 2

Upper and lower left: simulated map of a 2D Gaussian source added on top of a variable background, extracted from a region centred on longitude l = 55° in the Galactic plane from the Hi-GAL survey at 160 μm. The source itself is masked in the centre to improve the polynomial fit to the background. Centre: background estimate adopting a first-order (plane) approximation (upper centre) and a fourth-order approximation (lower centre). Right: residual map assuming a first-order polynomial fit for the background estimation (upper right) or a fourth-order (lower right) polynomial fit. The high-order polynomial takes the complexity of the background into account.

Open with DEXTER
In the text
thumbnail Fig. 3

Left: simulation of seven blended sources with the corresponding Hyper profiles. Each ellipse represents the aperture area of the sources. The semi-axes are equal to the FWHMs obtained from the 2D Gaussian fit. Right: map of source 4 after companions removal done with the simultaneous multi-Gaussian fit and local background subtraction as described in the text. Only three sources contribute to the flux at the position of source 3 (sources 1, 3 and 6) and have been deblended. The rectangular region in the right panel is the local region where the background for the source 3 is evaluated after the deblending as described in Sect. 3.1. The residuals are low and they do not lead to severe flux errors for the reference source (Table 2).

Open with DEXTER
In the text
thumbnail Fig. 4

Results of deblending the seven partially overlapping sources shown in Fig. 3. The flux of the 2D Gaussian source model for each source is shown as a black diamond. The source flux evaluated without any companions removal is shown as a blue triangle, and after companion removal is shown as a red asterisk.

Open with DEXTER
In the text
thumbnail Fig. 5

Flux difference in percentage with varying the number of points describing the ellipse for different bins used to oversample each pixel with respect to the reference model flux. The differences are 1.5% using more than 200 points to describe each ellipse and with at least a bin value per side of 15 (green and blue lines). When using no bins (black line) the percentage is remarkably high and independent of the number of points describing the ellipse (using at least 200 points). In fact it implies integrating the entire pixel values for all the pixels with which the ellipse points is described.

Open with DEXTER
In the text
thumbnail Fig. 6

One realisation of the TS1 run with 80 injected sources on top of true background, extracted from the Hi-GAL 160 μm map centred at on (l,b) = (20.6°,0.7°). The blue ellipses are the source identified by Hyper using a threshold σt of 6.5. The ellipses describe the aperture regions of the sources, evaluated as discussed in Sect. 3.

Open with DEXTER
In the text
thumbnail Fig. 7

Percentage of false positives identified for various thresholds σt for two simulated fields, TS1 with faint sources (red line) and TS2 with bright sources (blue dashed line). The fraction of false positives decreases with increasing threshold values.

Open with DEXTER
In the text
thumbnail Fig. 8

One realisation of T1 run for a field extracted from PACS 70 μm map (left panel) and SPIRE 500 μm map (right panel). There are 500 sources and 100 sources injected per realisation in the 70 μm and 500 μm maps respectively.

Open with DEXTER
In the text
thumbnail Fig. 9

Section of the Polaris Flare region with filamentary structures as seen by Herschel at 160 (top left), 250 (top right), 350 (bottom left) and 500 (bottom right) μm (from HGBS, André et al. 2010). Fifty random 2D Gaussian sources have been injected at each wavelength and positioned on top of the filaments, as discussed in Sect. 4.3. The 250 μm map also shows the 50 sources positions and 2D Gaussian fit as obtained with Hyper.

Open with DEXTER
In the text
thumbnail Fig. 10

Average absolute flux differences between the flux models and the flux recovered by Hyper within the 500 μm fit aperture region at each wavelength. The accordance is on average 10% at all wavelengths except at 70 μm, where within the 500 μm aperture the flux estimation includes residual background contribution not removed with the background removal as discussed in Sect. 4.4.

Open with DEXTER
In the text
thumbnail Fig. 11

Left: Hi-GAL map of the 2° × 2° region chosen centred on (l,b) = (50°,0°) observed at λ = 250μm chosen to test the Hyper multi-wavelength approach described in Sect. 4.4. Right: same region with overlapped the 200 injected sources as described in Sect. 4.4.

Open with DEXTER
In the text
thumbnail Fig. 12

Zoom of a region with 5 injected sources extracted from the map showed in Fig. 11. From left to right: the same region observed at 70, 160, 250, 350 and 500 μm. The black cross in the 70 μm image identify the 5 source centroids. The blue ellipses in the other maps represent the Hyper source fit done in the 500 μm map which determines the aperture region at all wavelengths as described in Sect. 4.4.

Open with DEXTER
In the text
thumbnail Fig. 13

A small area of the Perseus B1 region observed with PACS 70 μm as part of the HGBS (André et al. 2010). The background is flat all across the region and the seven sources (the blue ellipses) are easily identified by the code.

Open with DEXTER
In the text
thumbnail Fig. 14

Flux comparison between Hyper and Cutex (red crosses) and getsources (green triangles) for the seven sources identified in the Perseus 70 μm field. The Hyper flux are in very good agreement with the Cutex and the getsources fluxes, with a discrepancy on average of 7% and 13% with Cutex and getsources fluxes respectively.

Open with DEXTER
In the text
thumbnail Fig. 15

Comparison between Hyper and Cutex 250 μm integrated fluxes for 1000 compact sources extracted from the Traficante et al. (2014) survey of protostars associated with IRDCs in the Galactic range 15° ≤ l ≤ 55°. Blue dashed line: y = x straight line. Red line: linear fit of the distribution. The coefficient of the fit is m = 0.87, and the intercept is 0.13 Jy which indicate no systematics in the algorithms but a slight overestimate of the Hyper fluxes compared with the Cutex fluxes. The scatter is mostly due to the different background subtraction and deblending approaches.

Open with DEXTER
In the text
thumbnail Fig. 16

Flux comparison between the official BGPS flux catalogue and the Hyper flux estimation. The comparison includes 796 sources extracted from the public available BGPS v2 catalogue in the Galactic range 20° ≤ l ≤ 28°. The fluxes in the BGPS v2 catalogue are estimated with Bolocat within three different aperture radii: 20′′, 40′′ and 60′′ (for details of the BGPS v2 catalogue generation see Ginsburg et al. 2013). Hyper fluxes have been evaluated for all the sources fixing the three different apertures. The agreement is very good for both faint and bright sources at each aperture, with less than 1% of the source with a flux difference 50%.

Open with DEXTER
In the text
thumbnail Fig. A.1

Difference between Hyper measured fluxes and the model fluxes as function of the source peak fluxes measured as a multiple of the standard deviation of the map (peak SDEV), for T1 at all wavelengths. 70 μm flux distribution (upper left panel), 160 μm (upper right), 250 μm (centre left), 350 μm (centre right) and 500 μm (lower panel). The mean and median values are only few percent and the rms of the distribution is only higher than 30% for the 350 μm map. Mostly for the faint sources the flux estimation is worse than 30%.

Open with DEXTER
In the text
thumbnail Fig. A.2

Same as Fig. A.1, but for T2. The distributions are much more regular with a rms ≤ 10%. Only few sources have a flux difference higher than 15%.

Open with DEXTER
In the text
thumbnail Fig. B.1

Comparison of the injected and measured source size for run T1, evaluated as the geometrical mean between the minimum and the maximum aperture radius. 70 μm size distribution (upper left panel), 160 μm (upper right), 250 μm (centre left), 350 μm (centre right) and 500 μm (lower panel). There are less than 1% of the points distributed along three vertical lines, corresponding to fixed Hyper source radii. The lowest and the highest radii correspond to the sources that Hyper has forced to be equal to a circular 2D Gaussian with FWHM equal to, respectively, the minimum and the maximum aperture radius chosen in the parameter file. The central points are in correspondence of fits that did not converge. In these cases Hyper has forced the source shape to be circular with a FWHM equal to the geometrical mean of the minimum and maximum FWHM limits set in the parameters file.

Open with DEXTER
In the text
thumbnail Fig. B.2

Same distributions as Fig. B.1 but for run T2.

Open with DEXTER
In the text
thumbnail Fig. C.1

Histogram of the absolute flux difference between the flux of the injected sources and the Hyper fluxes at each wavelengths with flux measured in the 500 μm aperture region as described in Sect. 4.4. The sources are 145 at each wavelength, all the sources with no clustered companions as described in Sect. 4.4. The figures show the 70 μm (upper left panel), 160 μm (upper right), 250 μm (centre left), 350 μm (centre right) and 500 μm (lower panel) distributions.

Open with DEXTER
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.