EDP Sciences
Free Access
Issue
A&A
Volume 611, March 2018
Article Number A23
Number of page(s) 12
Section Astronomical instrumentation
DOI https://doi.org/10.1051/0004-6361/201731428
Published online 16 March 2018

© ESO 2018

1 Introduction

In recent years numerous large-scale direct imaging surveys for exoplanets have shown that massive gas giant planets at wide orbital separations (>50 AU) are rare (e.g. Lafreniére et al. 2007; Chauvin 2010; Heinze et al. 2010; Rameau et al. 2013b; Biller et al. 2013; Nielsen et al. 2013; Wahhaj et al. 2013; Chauvin et al. 2015; Meshkat et al. 2015; Reggiani et al. 2016). However, a few objects that have masses below ≈12 MJ and separations smaller than 100 AU have been detected (cf. Bowler 2016), including the four-planet system around HR 8799 (Marois et al. 2008, 2010b), β Pic b (Lagrange et al. 2009), HD 95086 b (Rameau et al. 2013a), and 51 Eri b (Macintosh et al. 2015). The currently ongoing new surveys with optimized high-contrast imagers installed at 8 m telescopes, such as VLT/SPHERE (Beuzit et al. 2006) and Gemini/GPI (Macintosh et al. 2006), will not only put unprecedented constraints on the overall occurrence rate of wide-separation massive planets, but they are also expected to reveal a number of new detections.

In order to understand the atmospheric properties and composition of directly imaged giant planets, which ultimately one may want to link to predictions from planet models (e.g. Öberg et al. 2010; Thiabaud et al. 2015), it is crucial to measure the SED of the planets over a broad wavelength range (e.g. Skemer et al. 2012). In addition to the 1–2.5 μm range, which is the target wavelength range of the instruments listed above, the 3–5 μm range is of particular importance because L band (~3.8 μm) and M band (~4.8 μm) flux measurements probe CH4 and CO atmospheric absorbtion features (Currie et al. 2014). Determining the CO/CH4 ratio could for example lead to evidence for potential non-equilibrium chemistry in the planets atmosphere (e.g. Galicher et al. 2011; Currie et al. 2014). It has also been shown that L band flux measurements are useful to determine cloud properties through atmospheric retrieval (Lee et al. 2013).

Furthermore, observing at 3–5 μm allows us in principle to search for cooler objects, which either means less massive for a given age or older for a given mass (e.g. Heinze et al. 2010). Eventually, in the era of extremely large telescopes with apertures >30 m, direct imaging in the 3–5 μm range will not only allow the direct detection of numerous exoplanets with an empirically determined mass (which are typically a few Gyr old), but it might even be possible to directly detect the thermal emission from small planets around the nearest stars (e.g. Quanz et al. 2015b).

One of the main challenges, however, to detect an exoplanet in the 3–5 μm range is the high thermal background emission from the sky, the telescope and also instrument optics (Lloyd-Hart & Angel 2000). On the one hand this emission adds a significant amount of photon noise to the data, which can obscure the planet signal (the sky brightness at the VLT observatory is 3.0 mag/arcsec2 in the L band and –0.5 mag/arcsec2 in the M band1). Even worse, however, are temporal and spatial variations of the background emission on short timescales that may ultimately limit the sensitivity of a given dataset.

For some of the directly imaged exoplanets M band data have been published (e.g. Galicher et al. 2011; Bonnefoy et al. 2013; Rajan et al. 2017), but, for instance, the M band detection of the HR 8799 planets was only achieved by applying the LOCI algorithm (Lafrenière et al. 2007), which was initially written to model and subtract the point spread function (PSF) of the central star in high-contrast imaging data, to subtract the thermal background emission (Marois et al. 2010a; Galicher et al. 2011).

In this paper, we follow a similar strategy and apply principal component analysis (PCA) to model and subtract the thermal background emission in high-contrast imaging datasets. PCA-based algorithms (e.g. Amara & Quanz 2012; Soummer et al. 2012; Amara et al. 2015) are today widely used to model and subtract the PSF in high-contrast datasets, and here we show that PCA can also be applied prior to the PSF modelling and subtraction to model and remove the thermal background separately from the PSF (see also Gonzalez et al. 2017). The procedure effectively decouples the stellar light from the background and enables background limited performance. Complex background estimation for astronomical images is also important in fields other than exoplanets (e.g. Popowicz & Smolka 2015)

The datasets that we used to test the performance of the new algorithm are presented in Section 2. In Sect. Section 3 we use one dataset to show explicitly how the PCA-based background subtraction works and what the thermal background is composed of for this particular dataset. In addition, we show how well the algorithm removes the thermal background in the vicinity and directly on top of the stellar PSF. In Section 4we present and discuss the results for complete data reductions with PCA-based background subtraction and conventional background subtraction schemes for all three selected datasets. We further discuss some general results and implications in Section 5 and in Section 6 we summarize our conclusions and discuss future applications.

2 Data

For testing and validating the PCA-background subtraction we used three publicly available datasets from the ESO archive. The observations were made with the Very Large Telescope (VLT) and the Nasmyth Adaptive Optics System (NAOS) coupled to the Near-Infrared Imager and Spectrograph (CONICA). The data was collected in pupil-stabilized (angular differential imaging, ADI) mode with CONICA running in Cube Mode, that is, each individual frame was stored. Cube mode allows us to select the best frames during the data reduction process and is in fact crucial for the PCA-based background subtraction. The L27 camera with a resolution of 27.12 mas/pixel, a field-of-view of 28 × 28 arcsec2 and a size of 1024 × 1024 pixel2 was used, but only subarrays were read out for these observations.

Two datasets are uncoronagraphic observations of β Pic and HD 100546 in M band (λc = 4.8 μm, Δλ = 0.59 μm), respectively. The data for β Pic was acquired during the night of November 26 in 2012 and first published by (2013; Program ID: 090.C-0653). The companion of β Pic is a good target for the test of a data reduction method because the planetary system is at a distance of only 19.44 ± 0.05 pc (van Leeuwen 2007) and β Pic b is bright in the near- and mid-infrared. Therefore, the planet can easily be observed and it is possible to compare the performance of different data reduction processes. The observing conditions during the night were rather poor (seeing ~1.5′′ at 0.5 μm, coherence time ~0.001 s). However, observations at these wavelengths benefit from improved Strehl ratio and PSF stability compared to shorter wavelengths. The dataset of HD 100546 is interesting because it contains a significant amount of extended emission from a circumstellar disk. This particular dataset was acquired on April 19, 2013 (Quanz et al. 2015a, Program ID: 091.C-0818(A)), the conditions were photometric. The background sampling and bad pixel correction for both observations were enabled via a four-quadrant dither pattern across the detector.

The third dataset is an observation of HD 169142 in the L filter (λc = 3.8 μm, Δλ = 0.62 μm) with the annular groove phase mask (AGPM) vector vortex coronagraph (Mawet et al. 2013). It was carried out in June 28, 2013 (Reggiani et al. 2014, Program ID: 291.C-5020(A)). For this observation, the background was sampled by moving the star away from the detector every ~20 min. More detailsof the VLT/NACO datasets used in this paper are summarized in Table 1.

In addition we also applied the PCA-based background subtraction scheme to the M band data of HR 8799 from Galicher et al. (2011). The goal of our reduction of this data is to show that the PCA-based approach is also capable of retrieving the outer three companions of HR 8799 just as well as the LOCI basedbackground subtraction introduced in their paper. We put the result of this particular data reduction into Appendix Section A because, even though is an important proof of concept, the NIRC2 dataset is substantially different from the other datasets in this paper.

Table 1

Summary of datasets.

3 Subtracting background emission with PCA

The PCA-based background subtraction was tested and validated in great detail with the M band dataset of β Pic. Here we have used this example to explain each step of the process. However, in practice, the method can be adapted to any kind of dataset if the background was reasonably sampled during the observation.

3.1 Observing strategy and preparation of raw data

The raw data is stored in cubes and each cube consists of 300 images recorded in quick succession. Each cube has a total integration time of 19.5 s. In every subsequent cube the star is shifted to another quadrant on the detector. With this strategy it is possible to acquire images from the object during the whole observation time while simultaneously sampling the background across the whole detector. This observing strategy allows for a well sampled background without sacrificing observation time for the object. Moving the star to another part on the detector also helps to reduce the effect of detector inhomogeneities (e.g. bad pixels) and flat field variations on the final result of the data reduction.

In order to model the background, the raw images from all cubes are split up into quarters. In each stack this leaves us with one quarter of the data containing the stellar PSF and three quarters of background. The background images from a certain quadrant then serve as a basis for modelling the background of the images where the star is present in this particular quadrant. The background subtraction is performed for each quadrant separately.

3.2 Identifying bad frames

It was important to identify and remove frames where the AO performance was bad. This was done cube by cube, by comparing the PSF peak flux for every image inside the cube. An image was removed from the cube when the PSF peak flux in this particular image was lower than 85% of the maximum PSF peak flux of all images within the cube. This effectively selected the images with bad AO performance, but still kept 95% of the total number of images (52 170 good frames) for further analysis.

3.3 Correcting bad pixels

Bad pixels were corrected by applying a 5σ filter to the frames. The filter calculates for every single pixel the difference between its value and the median value of the surrounding 24 pixels (within a 5 × 5 square pixel mask). Pixels were marked as “bad” if this difference was larger than five times the standard deviation of the surrounding pixels. Bad pixels were subsequently replaced by the median of the surrounding pixels. This method is good for correcting single outliers, however, it is not capable of correcting clusters of bad pixels, this can be seen in Figure 1, for instance.

thumbnail Fig. 1

Top left shows an image of the star that was only corrected for bad pixels. Only the first quadrant of the detector is shown because this is where the star was put in this exposure. The top right figure is the same image with identical colourbar length but after a mean background subtraction. The figure on the bottom is additionally smoothed with a Gaussian filter to reduce the pixel noise and has a narrower colourbar emphasize the inhomogeneous residual background structure.

Open with DEXTER

3.4 Subtracting the mean background

For this dataset, we applied a mean background subtraction before calculating the PCA. In principle, this is not necessary because it would also be done by the PCA background subtraction algorithm, but in this case here it is instructive to do it because we want to analyse the PCA based algorithm relative to the simple subtraction of a mean background.

The mean background for a particular quadrant is the mean of all images from this quadrant that do not contain the star. This mean background image is subtracted from both the star images and the background images of this quadrant. The result of this is a number of mean background subtracted star images and three times this number of mean background subtracted background images. Finally, the mean of every individual image is removed to correct for offsets of the residual background. For this step the star is masked with a 50 × 50 pixels2 square mask. The mean background subtracted background images are later analysed with PCA to find the principal components of the residual background.

We are left with star images and background images that only contain the spatially and temporarily variable residual background. Figure 1 shows one of the star images before and after the mean background subtraction. The frame at the bottom shows the variable residual background structure that is still left after the mean background subtraction. The residual background can change significantly between two exposures.

3.5 Subtracting the PCA residuals

The PCA was used to find an orthogonal set of basis images (principal components) to model the residual background that was left after the mean background subtraction. The advantage of PCA is that it creates the basis set automatically and arranges the basis images according to their contribution to the representation of the background. Therefore, one can easily identify those principal components that should be chosen to model the residual background structure most effectively. The PCA is essentially a singular value decomposition (SVD) of the residual background images. The most important principal components are the orthonormal basis vectors that belong to the highest singular values. Shlens (2014) is good a source for a more detailed description of how PCA works and how it can be performed with a SVD. The PCA implementation for the residual background subtraction code is essentially a clone of the PynPoint package (Amara & Quanz 2012), which was built to model and subtract the stellar PSF in high-contrast imaging data.

Two steps are necessary to model and subtract the residual background in the star images. First, the basis images need to be created and, second, the optimal number of basis images has to be fitted to the residual background in the star images. A discussion about the meaning of an optimal number of principal components can be found in Section 3.6.

The principal components are created with a PCA applied to the prepared mean background subtracted background images from Section 3.4. Figure 2 shows some examples of the principal components from the analysis of the first quadrant data. Those are some of the components which are best suited for modelling the residual background structure. Higher order components are less important for modelling the background.This comes naturally from the PCA and is further shown and discussed in Section 3.6.

The next step, after calculating the components, was fitting a linear combination of the optimal number of principal components to the background of the star images. The stellar PSF in the star images is masked during the fitting so that the components are only fitted to the background and not also to the signal from the star. The masked 50 × 50 pixels area around the PSF is highlighted in Figure 3. Figure 3 also shows the resulting PCA residual background after fitting the 60 first principal components to the residual background of the star image. It also shows the star image after the subtraction of the residual background. The procedure removes most of the structured background. The images in Figure 4are a sequence of residual backgrounds from five star images that were recorded consecutively. This sequence shows that the inhomogeneous residual background changes rapidly from one image to the next on a timescale equal to or faster than the time between two recordings (~0.1 s). It is difficult to pinpoint the exact cause for the short timescale variations. They could be induced, for example by temporal variations in the atmosphere or the deformable mirror (DM) of the instrument.

thumbnail Fig. 2

Some of the first few principal components (component # 1, 2, 3, 4, 5, 10, 20, 40 and 60) from the PCA of the mean subtracted background images from the first quadrant of the β Pic M band dataset. The principal components are normalized and the colour scale is in linear units.

Open with DEXTER
thumbnail Fig. 3

Comparison between a star image with subtracted mean background (upper left) and subtracted mean and PCA residual background with 60 principal components (upper right). The star image also shows the area around the stellar PSF which is masked for the fit of the principal components (black rectangle). The image on the bottom shows the corresponding PCA residual background. All images were smoothed with a Gaussian filter to improve the visibility of the background structure.

Open with DEXTER
thumbnail Fig. 4

Residual background from five consecutively recorded star images. Each image was fitted with 60 principal components. Therefore, the figure essentially depicts how the variable residual background changes on timescales of ~0.1 s. All images were filtered with a Gaussian filter to improve the visibility of the background structure.

Open with DEXTER

3.6 Optimal number of principal components

Tests on a few examples have shown that there is an optimal numberof principal components for fitting the residual background.This means that at some point the fitted background converges and the use of more components does not further change the result significantly. Overfitting did not occur for the number of principal components that we used in our tests. Figure 5 shows how the background changes when different numbers of principal components are used to fit the background in the same image. The residual backgrounds do not change visibly anymore for more than ~60 components.

The convergence of the fitted residual background can also be seen quantitatively in Figure 6. The plot shows the RMS of the difference between residual backgrounds with increasing numbers of fitted components. The convergence is roughly exponential with an e-folding scale of around 18. This plot along with the results in Figure 5 show that around 60 principal components, or 3 e-foldings, are necessary for the background to converge. This number of principal components was finally used for fitting the background in the complete data reduction of the β Pic M band dataset.

It is important to be aware that the number of significant principal components does not have to be the same for each dataset. It depends on the total number of images and how theconditions change during the observation. The PCA finds principal components which are the best description for residual background of all images in the whole dataset. Therefore, when conditions change significantly during the observation, more components are going to be needed to model the residual background in the individual images. It is advisable to always calculate the background with different numbers of principal components and use some convergence criterion, such as the one shown in Figure 6, to decide what number of components to use for the final data reduction.

thumbnail Fig. 5

Fitted PCA residual background for one of the star images from the β Pic M band dataset. The images show how the background changes when more principal components are used to model the residual background (for 5, 10, 15, 20, 30, 40, 60, 80, and 100 components).

Open with DEXTER
thumbnail Fig. 6

Convergence of the computed residual background when moreprincipal components are used for the fit. The data points in the plot are the square root of the mean quadratic sum (RMS) of the difference between two background images with different number of components. The points at X principal components were calculated by taking the background image fitted with X components minus the background image fitted with X + 10 components and taking the RMS (calculated over the whole image) from this difference. There are 10 different results for each # of components because this was done for 10 randomly selected star images. The continuous line goes through the mean values of theresults to show how the background converges. The e-folding scale from the exponential fit to the curve is 18 components.

Open with DEXTER

3.7 Masked background interpolation

In the previous section we explained how our PCA background subtraction works and have shown an example of an image from β Pic before and after the background subtraction. This shows how well the principle components are fitted to the unmasked background outside of the PSF. However, for close-in companions,like exoplanets, it is more important that the fitted background is able to accurately reconstruct and subtract the background structure in the vicinity of the PSF, the region which has to be masked for the fit. This also applies for β Pic b, because the planet has a roughly measured projected separation of around 0.48′′ in this particular dataset. This is well within the area of the 1.35′′× 1.35′′ mask which we used to cancel out the PSF for the fit of the principal components.

To analyse how well the background in the masked area is interpolatedwe used images from the stacks that contain only background.We masked the area where the PSF would usually be located and then applied the PCA background subtraction algorithm. In this way, we obtained a reconstructed background image including an area where we know exactly how the background should look like. This allows us to show how well the residual background is reconstructed in the masked area.

Figure 7 shows the results for one particular background image that was analysed in this way. It was chosen as an example because it shows a strong feature in the residual background that is largely hidden by the mask during the fit of the principal components (see panel a). The image in panel b shows how the “true” residual background looks like, when the fit of the principal components is applied to the full and unmasked background image. The image in panel c shows the fit using only the region outside of the mask. The residual background within the masked region is reconstructed very accurately as shown by the difference image in panel d of Figure 7. The residuals in d are at the 5% level over the whole image and at the 10% level in the vicinity of the mask.

We also analysed how close we are to the background limit of pure photon noise after apply counts from individual pixels through time we founding the residual background subtraction. We did this by calculating the root-mean-square (RMS) of the pixels within the mask before and after the background subtraction. The background limit is reached when the RMS approaches the standard deviation of the photon noise. Through analysing the distribution of that the standard deviation of the photon noise (converted to image counts) is approximately 37 ± 0.5 counts3 in this dataset. We calculated the remaining RMS for an increasing number of fitted principal components in 100 different background images and from this we determined the mean RMS for each number of fitted components. The result of this in shown in Figure 8. The mean RMS curve reaches a value of ~37 counts for our previously defined optimal number of 60 fitted principal components. From this we conclude that after the residual background subtraction, besides the speckle noise around the PSF, we are background limited solely by Poisson noise of the photons.

We also applied a Shaprio–Wilk test to test the normality of the distribution of the remaining RMS values for each number of fitted principal components. The number of background photons is large enough so that their distribution can be approximated by a normal distribution. We found that the distributions are skewed to higher RMS values for ten or fewer principal components but completely consistent with a normal distribution for ≥15 principal components. This is because the RMS has a lower limit at the background limit of ~37 counts but, in principle, no upper limit. The upper limit is given by the additional variability of the background that is not induced by Poisson noise. This result is another indication that we have approached the background limit and that it is possible by fitting around 15 principal components or more.

thumbnail Fig. 7

Panel a: example for a patch of an exposure which only shows background. The rectangle marks the area which is masked during the residual background fit, this is where the star would usually be located in a star image. This image was smoothed with a Gaussian filter to highlight the residual background structure. Panel b: “true” residual background when the principal components are fitted to the whole background image. Panel c: reconstructed background when the principal components are only fitted to the area outside the mask. Panel d: difference between (b) and (c). The colourbar in (a) is valid for (a), (b) and (c).

Open with DEXTER
thumbnail Fig. 8

Mean RMS (solid line) from the noise in the masked area after applying the PCA residual background subtraction to background images. These are the mean values from 100 different images. The background limit for this dataset was estimated to be around 37 counts.

Open with DEXTER

4 Complete data reduction

This sectionpresents and compares the results from the complete end-to-end data reductions of the aforementioned three datasets, including the β Pic M band dataset. Each dataset was background subtracted with the PCA-based method and with a conventional mean background subtraction. The mean background subtraction that was applied to the data from β Pic and HD 100546 works as follows: taking advantage of the dithering sequence, we subtract the mean image computed from both cube i – 1 and i + 1 from the images in cube i and we do this for each cube i. This attempts to correct for linear drifts in the sky over moderate timescales on a pixel by pixel basis. The mean background subtraction that we applied to HD 169142 just used the mean from the sky cube closest in time as described in Reggiani et al. (2014).

Before the PSF subtraction step all images of the non-coronagraphic datasets were aligned and centred. First, we aligned all images by maximizing the subpixel cross-correlation of each image with a reference image from the same dataset. Then we stacked the aligned images to create a master-PSF. We centred the master-PSF by fitting it with a Moffat function. Finally, we aligned and centred all images by maximizing the subpixel cross-correlation of each image with the centred master-PSF. Since the HD 169142 data is already centred due to the AGPM (see Reggiani et al. 2014), applied no further centring of the images for this dataset.

In all cases, after background subtraction, we then used PynPoint to subtract the stellar PSF, derotate the images and collapse the stack of images to a final residual image. We compare the final results by using the definition of signal-to-noise ratio (S/N) and false alarm probability (FAP) from Mawet et al. (2014). The small number statistics effect is not negligible because the separation of the planets in relation to the λD aperture size is small, especially for HD 169142b.

4.1 HD 100546 in the M filter

We applied the PCA-based background subtraction in exactly the same way as described in the previous section for the data from β Pic. We used 60 principal components to model and subtract the background. Due to the large size of the dataset, 101 657 individual images, we ran PynPoint so that it pre-stacked a certain number of consecutive images before applying the PSF subtraction algorithm. The other free parameter for the PSF subtraction is the number of principal components used to model and subtract the PSF. We reduced the data with pre-staking of 10, 100 and 300 consecutive images, resulting in a number of 10166, 1017 and 339 individual coadded images with effective integration times of 0.4, 4 and 12 s, respectively.We varied the number of principal components between 1 and 80.

There are many extended emission features visible around the star after the data reduction (discussed in Quanz et al. 2015a), making it difficult to assess and compare the results of the different data reductions by calculating S/N values for the companion with the established methods for point sources. Because of that, we decided to evaluate the results qualitatively.

We found that we could reproduce the result from Quanz et al. (2015a) very well by pre-stacking ten images and subtracting the PSF with 11 principal components for the dataset which was background subtracted with PCA4. This result is shown in Figure 9d. The identical data reduction with the mean background subtracted dataset can be seen in Figure 9c. The two bright spots at the bottom of the image are artefacts which are caused by detector inhomogeneities which are not sufficiently corrected by the mean background subtraction. They are not present in the PCA background subtracted image because they are also subtracted by the PCA background subtraction algorithm. Also in Figure 9, we show the results with ten pre-stacked images but a PSF subtraction with only three principal components. The result with PCA background subtraction clearly already shows the extended emission features, while the result from the mean background subtracted dataset is still very noisy. If we perform the PSF subtraction with even more principal components, the results from the mean background subtracted dataset do not change very much anymore. However, in the PCA background subtracted results, using more components for the PSF subtraction also subtracts the extended structures and the planet signal and for ≳40 components there is essentially only noise left.

Overall, better results are achieved if we do not pre-stack too many images before the PSF subtraction. The best results were achieved for the dataset, where we subtracted the background with PCA. For the mean background subtracted dataset, we need many more principal components to subtract the PSF and the result still appears more noisy. Without the more advanced background scheme it would not be possible to show the extended emission in this dataset so clearly.

We did not analyse the point-source component, that is, the planet candidate embedded in the extended structure of the disk, or assessed its S/N as this has already been done in Quanz et al. (2015a) for this particular dataset. The shape of theextended structures around the star can be explained by the impact of ADI processing on the scattered light emission from an inclined disk (Garufi et al. 2016).

thumbnail Fig. 9

Results for HD 100546 in M band after the PSF subtraction with PynPoint and pre-stacking of ten images. Panels a andc: results with mean background subtracted frames and a PSF subtraction with three and 11 principal components, respectively. Panels a and c: results with PCA residual background subtracted frames and a PSF subtraction with three and 11 principal components, respectively. The radius of the inner mask is 0.2′′ . North is up and east is to the left of the images. The scale is in arbitrary linear units.

Open with DEXTER

4.2 β Pic in the M filter

We used 60 principal components to model and subtract the background.Then we applied the PSF subtraction without pre-stackingand with pre-stacking 10, 50, 100, 200, 300 and 500 images, resulting in 52 170, 5217, 1044, 522, 261, 174 and 105 individual coadded images with effective integration times of 0.065, 0.65, 3.25, 6.5, 13, 19.5 and 32.5 s, respectively. We varied the number of principal components between 1 and 70.

The best S/N values for the companion β Pic b were achieved for around 200–500 pre-stacked images and with a PSF subtraction with between around ten and 30 principal components. The maximum S/N in this parameter space is in the range between 29–35. The numbers show some scatter because the number of stacked images and number of principal components can usually not be optimized independently from each other and even small changes can change the resulting S/N significantly. The fully reduced images with the highest S/N that we could achieve with both background subtraction schemes can be seen in Figure 10. Figure 11 shows S/N as a function of the number of principal components used for the PSF subtraction and the number of pre–stacked images for all data reductions with this dataset. The maximum S/N is very high (≳32) and not significantly different for the two cases. We also calculated the noise as a function of the separation from the star for the two highest S/N results. We did this by calculating the standard deviation of fluxes within non-overlapping apertures placed in concentric circles around the centre of the star, in accordance with the definition of noise in Mawet et al. (2014). The resulting noise curves are shown in Figure 12. The PCA background subtracted result is much more noisy, however, the signal from the planet is larger as well, resulting in a similar S/N. We suppose that the residuals of the mean background subtracted result are less noisy because we pre-stacked 500 images instead of only 100 as in the PCA background subtracted case, therefore some of the PSF variability is averaged out by the stacking. The less variable PSF is then easier to subtract from the stacked images. A possible reason for the weaker planet signal in the mean background subtracted case is that the signal is more smeared out due to the additional field rotation within each of the stacked images.

The results of the complete data reductions with PynPoint for this dataset can be summarized as follows:

  • Pre-stacking of more than around 200 images yield similar results and there is no significant improvement of the companion S/N due to the application of the PCA residual subtraction, considering all combinations of pre-stacked numbers of images and numbers of principal components for the PSF subtraction which we tested.

  • For a pre-stacking of less than 200 images, the maximum S/N or minimum FAP for the PCA background subtracted images can be achieved with a PSF subtraction using fewer principal components.

  • Highest S/N can only be achieved consistently when more than 200 images are pre-stacked prior to the PSF subtraction with PynPoint.

  • The resulting FAP of β Pic b corresponds to a 5σ detection for almost every data reduction that we tested.

  • Without pre-stacking the maximum achievable S/N is ~11 for both background subtraction techniques.

Even though we have explicitly shown for this dataset that we can remove the spatially and temporally variable background that cannot be removed by a mean background subtraction method (see Fig. 3) and that it also works for the background that is obscured by the PSF (see Sect. 3.7), we were not able to show that this necessarily yields a higher companion S/N. We assume that this is because the companion of β Pic is exceptionally bright and the sampling of the background was done very frequently under stable observing conditions. The frequent background sampling allows for an improved mean background subtraction, so that the S/N is more dominated by PSF residuals than by background residuals. Altogether, this does not leave much room for improvement from a moreadvanced background subtraction scheme.

thumbnail Fig. 10

Results for β Pic in M band after the PSF subtraction with PynPoint. The image on the left shows the result with mean backgroundsubtracted frames and a PSF subtraction with 21 principal components after pre–stacking 500 images. The image on the right shows the result with PCA background subtracted frames and a PSF subtraction with 25 principal components after pre-stacking100 frames. The radius of the inner mask is 0.16′′ . The S/N values are 34.5 and 32.5, respectively. North is up and east is to the left of the images. The scale is in arbitrary linear units.

Open with DEXTER
thumbnail Fig. 11

Summary of the S/N values for β Pic b for the different data reductions. The top panel shows the S/N depending on the number of pre–stacked images and the number of principal components used for the PSF subtraction of the dataset that was mean background subtracted. The bottom panel shows the same for the PCA background subtracted case.

Open with DEXTER
thumbnail Fig. 12

Noise (in arbitrary units) as a function of the separation from the star for our best results of β Pic in M band. Thecompanion is located at a separation of ~0.5′′ and has in both reductions a similar S/N.

Open with DEXTER

4.3 HD 169142 in the L filter

This datasetis different from the two above. First of all, it was a coronagraphic observation with the AGPM meaning that the background also contains additional thermal emission close to the vortex centre of the coronagraph (see Absil et al. 2016). This component actually dominates the thermal background residuals after the mean background subtraction. Secondly, due to the coronagraph, the background is only poorly sampled temporarily because it is not possible to observe background and object quasi-simultaneously, but one has to observe a dedicated sky position every few minutes. The PCA background subtraction for this dataset was also done by performing a PCA on all the sky frames and fitting the principal components to the area around the coronagraphic PSF. We used 60 principal components to model and subtract the background. However, for this dataset, around 40 principal components would have been sufficient as well.

For the PSF subtraction with PynPoint we tested no pre–stacking and pre-stacking of ten and 100 images, resulting in 25 488, 2549 and 255 images with effective integration times of 0.25, 2.5 and 25 s, respectively. We varied the number of principal components between 1 and 30. The best results in terms of S/N were achieved with no pre-stacking and nine principal components for the mean background subtracted dataset and only three principal components for the PCA background subtracted dataset. The maximum S/N in both cases was aproximately seven, this corresponds to a FAP of ~10−5. These best results are shown in Figure 13. A summary of all results can be found in Figure 14. In the cases where we pre-stacked the images, the maximum S/N was achieved with three principal components for both datasets, but the S/N was lower compared to the case where we did not stack (6 and 6.5 for the mean background and PCA background subtracted dataset, respectively).

We found that we could achieve the best results for this dataset if we do not pre-stack images at all. The best results for both, the mean background and PCA background subtracted datasets are very similar when judged by calculating the S/N of the companion. However, we need many more principal components to subtract the PSF in the mean background subtracted case. The best result with mean background subtraction is more noisy than the one with PCA background subtraction. We illustrate this in Figure 15. This plot shows the noise depending on the separation from the star for both best results. This is essentially proportional to the achievable contrast. The noise is higher for the mean background subtracted dataset for all separations ≳0.135′′, which corresponds to the separation of the companion candidate HD 169142b. We determined the position of HD 169142b roughly by maximizing S/N. For smaller separations the noise is difficult to quantify due to the inner mask. We argue that the results for the PCA background subtracted dataset are better because, even though the achievable S/N for HD 169142b is not higher, the result overall is less noisy for all other separations. This would improve the S/N of a potential detection at these separations.

thumbnail Fig. 13

Results for HD 169142 in L band after the PSF subtraction with PynPoint and no pre-stacking. The image on the left shows the result with meanbackground subtracted frames and a PSF subtraction with nine principal components. The image on the right shows the result with PCA residual background subtracted frames and a PSF subtraction with three principal components. The radius of the inner mask is 0.06′′ . North is up and east is to the left of the images. The scale is in arbitrary linear units.

Open with DEXTER
thumbnail Fig. 14

Summary of all S/N values for HD 169142b for the different data reductions. The top panel shows the S/N depending on the number of pre-stacked images and the number of principal components used for the PSF subtraction of the dataset that was mean background subtracted. The bottom panel shows the same for the PCA background subtracted case.

Open with DEXTER
thumbnail Fig. 15

Noise (in arbitrary units) as a function of the separation from the star for our best results of HD 169142 in L band. HD 169142b is roughly located where the two lines meet on the left side, this corresponds to a separation of 0.135′′ .

Open with DEXTER

5 General discussion

5.1 PCA-based background subtraction

We are convinced that it is important to have a systematic, flexible and powerful way for the subtraction of the thermal IR background in high-contrast imaging data in the 3–5 μm range. The PCA-based scheme described above is a possible and easy to implement approach, even if the quantitativeresults of true companions did not always show a significant gain in S/N. However, Galicher et al. (2011) already showed that, in other datasets, the results can be significantly improved with more sophisticated background subtraction schemes. We confirm their findings for a M dataset of the HR 8799 planetay system with our PCA-based approach in Appendix Section A. In addition, because we noticed that even small changes of the mean background subtraction method can lead to very different results after the complete data reduction, a more robust and stable approach is warranted. Finally, while the mean background subtraction approach does significantly depend on the data quality and hence observing conditions, a systematic frame-by-frame procedure considering all possible representations of background emission, like our PCA-based residual background subtraction, should automatically yield the possible results for a given dataset, irrespective of observing conditions and sky sampling frequency.

We have shown that the PCA-based background subtraction procedure can be used to effectively model and remove the residual structures of the thermal IR background in the M band dataset of β Pic. An example for this is shown in Figure 3. A temporally and spatially variable component like this cannot be removed easily by a mean background subtraction method. Removing the residual structures decreases the variability of the background and therefore can lower the noise in the vicinity of the star which is not induced by speckles. In Sect. 3.7 we were able to show that our procedure also manages to accurately reconstruct the thermal background when it is partially obscured by the PSF, therefore removing all sources of noise down to the background limit due to photon noise. This is an ideal way of preparing a dataset for PSF subtraction in a subsequent analysis step.

Another advantage of the PCA background subtraction can be seen in the results in Figure 9. Artefacts from detector inhomogeneities or bad pixels are efficiently removed by the PCA algorithm from each image individually, because they are present in considerable amount of images and they are static.

5.2 Complete data reductions

The datasets that we reduced were very different from each other. Therefore the PCA background subtraction shows different effects and it is not always simple and straightforward to quantify the improvements on the whole data reduction or interpret the results. However, there are some general trends that were consistently seen many times during our tests with the PCA background subtraction. If we pre-stack a high number of images (≳100) before PSF subtraction, we see virtually no difference between results from both background subtraction methods. We have observed this in all of our complete data reductions. This is because variations in the background of the images are reduced by averaging over many images. For a large dataset with a bright companion this can be fine. We have seen this for β Pic in M band. However, for the other datasets this was the other way around. Pre-stacking a large number of images does not always provide a good solution for averaging out the residual background structures and in some cases it can lower the planet flux by smearing out the signal over multiple pixels. However, this depends strongly on the field rotation and the separation of the companion. If we do not pre-stack at all, the PCA background subtraction makes it easier to subtract the PSF afterwards because the background in the images is more stable and hence fewer principal components are needed.

6 Conclusions and future prospects

We have shown that PCA in combination with sufficient background sampling can be used to systematically model and subtract the thermal background from high-contrast imaging datasets on a frame-by-frame basis. It is an improvement over the conventional mean background subtraction approach because it can remove any fast changing inhomogeneous residual background, which adds to the already noisy IR background. However, we have also shown that this does not necessarily improve the S/N of a companion in a complete data reduction because this depends on many other factors as well. Some of those are observing conditions, observing strategy, the position of the companion and the presence of a coronagraph. We nevertheless expect that our new algorithm will help reducing the error-bars on already detected planets and possibly even help finding new planets in archived or new high-contrast imaging data when combined with current PSF subtraction routines.

PCA is particularly good at finding patterns which are present in a large fraction of the images. This is why the background subtraction also effectively catches bad pixels, so that no other bad pixel correction has to be applied. This can be especially helpful if no bad pixel map exists. We noticed this effect during the preparation of the β Pic dataset and it should be further investigated.

Ground-based direct observations in particular in the M band, and at 3–5 μm in general, are challenging but nevertheless important. They reveal crucial information about the spectral energy distribution of extra-solar planets, and therefore information about the effective temperatures and the composition of the atmospheres. Only a fraction of the planets which were directly observed, were also directly observed in M band. Advanced algorithms for the data reduction, such as the PCA-based background subtraction presented in this work, have the potential to make the mid-infrared range more accessible for observations. This will become even more important in the future. With the Enhanced Resolution Imager and Spectrograph (ERIS; Kuntschner et al. 2014), there is a new 1–5 μm instrument in development for the VLT foreseen to be installed behind an adaptive secondary mirror. It is designed to be the replacement for NACO with first light in early 2020. Then later in the mid 2020s the European Extremely Large Telescope (E-ELT) is planned to start operating with Mid-infrared E-ELT Imager and Spectrograph (METIS) as one of its first instruments(Brandl et al. 2016). Both instruments aim to produce high resolution diffraction limited images of extra-solar planets in mid-infrared (e.g. Quanz et al. 2015b) and they require the most advanced data reduction pipelines to push the parameter space of directly detectable exoplanets.

Acknowledgements

We would like to thank the anonymous referee for helping to improve this manuscript with a very constructive report. We would also like to thank O. Absil, Ch. Marois and J. Milli for providing additional helpful comments. Part of this work was supported by the Swiss National Science Foundation through grant 200020_162630/1. SH acknowledges the financial support of the SNSF. Part of this work has been carried out within the framework of the National Centre for Competence in Research PlanetS supported by the Swiss National Science Foundation. SPQ acknowledges the financial support of the SNSF. The work is based on data obtained from the ESO Science Archive Facility originally observed under programs 090.C-0653, 091.C-0818(A) and 291.C-5020(A). This research has made use of the Keck Observatory Archive (KOA), which is operated by the W. M. Keck Observatory and the NASA Exoplanet Science Institute (NExScI), under contract with the National Aeronautics and Space Administration. This research has made use of the SIMBAD database, operated at CDS, Strasbourg, France.

Appendix A HR 8799 in the M filter

The M band (λc = 4.67 μm, Δλ = 0.241 μm) data of HR 8799 from Galicher et al. (2011) were obtained during the nights of November 1 and 2 in 2009 at the Keck II observatory using the adaptive optics system and the NIRC2 near-infrared imager. The data were also taken in ADI mode with ~180 field-of-viewrotation in both nights. Unlike the NACO data, however, the NIRC2 data were only saved as 140 unsaturated images each consisting of 200 coadded raw frames of 0.3 s detector integration time. This is not optimal for our purpose because our goal is to model and subtract the long and short timescale fluctuations from the individual raw images before they are coadded.

We performed a mean background subtraction and our PCA based background subtraction so that we could compare the results, just as we did for the NACO data. For the mean background subtraction we used for each frame the closest frame wherethe star was dithered enough that it could be used as an approximate background measurement. This was done for both days separately. For the PCA-based background subtraction we performed a PCA on all background images for each dither position of the star separately, independent of the day on which it was observed. The resulting 40 first principal components were used to fit and subtract the background.However, we noticed that already approximately ten components would have been enough. We assume that this is because most of the background variability was already averaged out as the input frames were effectively coadds of 200 individual exposures with a total integration time of 60 s. However, the PCA algorithm still managed to remove some detector features and background changes over longer timescales.

The PSF subtraction was in both cases done with the PynPoint PCA algorithm applied to the complete dataset. We did not pre-stack images before the PSF subtraction.

We subtracted the PSF with up to 20 principal components, determined the position of the planets by fitting a 2D Gaussian and determined a S/N ratio for each planet. The results are summarized in Fig. A.1. In the results with PCA background subtraction we were able to detect the planets c and d with a maximum S/N above six. The outermost planet could only be detected with SN ≈ 3, even in the best results. Just like Galicher et al. (2011) we find that the detection of the planets in the results from the mean background subtraction is difficult and the computed S/N of the planets is consistently lower than in the PCA background subtracted results. However, considering all our results, we still get reasonable detections of planets c and d even with the mean background subtraction. This is most likely because we used a more advanced method for the subsequent PSF subtraction. Galicher et al. (2011) only show results with median speckle subtraction. In Fig. A.2 we show representative examples for results with both background subtraction schemes.

thumbnail Fig. A.1

Measured S/N for all planets in the results with PCA background subtraction (solid lines) and mean background subtraction (dashed lines). S/N was measured in unfilteredresults, we applied neither unsharp masks nor noise filters.

Open with DEXTER
thumbnail Fig. A.2

Top: result after subtracting the mean background and using the PynPoint PCA algorithm to subtract the speckle noise with 17 principal components. Bottom: result after using the PCA algorithm to subtract the background with 40 principal components and PynPoint to subtract the speckle noise with 15 principal components. The inner 0.3′′ were masked for the speckle subtraction. We convolved both images with a 0.5λ/D width Gaussian to improve the visibility of the planet signals for this figure. The images shown here are typical results for both background subtraction schemes, where all planets can be identified, but they do not necessarily exhibit the highest achievable S/N for all planets. The scale is in arbitrary linear units.

Open with DEXTER

References


2

This corresponds to 1.35′′× 1.35′′ or 11.2 × 11.2 λD in M band.

3

We also tried to estimate the background limit from the average level of background counts (~19 800) and the gain of the CONICA camera (~9), but this results in an implausibly hight standard deviation of around 47 counts. We are not yet sure about the underlying reason. Either there is a problem with the calibration files for this dataset or with the documented gain of the detector.

4

Quanz et al. (2015a) also used the PCA based background subtraction for reducing the M band data.

All Tables

Table 1

Summary of datasets.

All Figures

thumbnail Fig. 1

Top left shows an image of the star that was only corrected for bad pixels. Only the first quadrant of the detector is shown because this is where the star was put in this exposure. The top right figure is the same image with identical colourbar length but after a mean background subtraction. The figure on the bottom is additionally smoothed with a Gaussian filter to reduce the pixel noise and has a narrower colourbar emphasize the inhomogeneous residual background structure.

Open with DEXTER
In the text
thumbnail Fig. 2

Some of the first few principal components (component # 1, 2, 3, 4, 5, 10, 20, 40 and 60) from the PCA of the mean subtracted background images from the first quadrant of the β Pic M band dataset. The principal components are normalized and the colour scale is in linear units.

Open with DEXTER
In the text
thumbnail Fig. 3

Comparison between a star image with subtracted mean background (upper left) and subtracted mean and PCA residual background with 60 principal components (upper right). The star image also shows the area around the stellar PSF which is masked for the fit of the principal components (black rectangle). The image on the bottom shows the corresponding PCA residual background. All images were smoothed with a Gaussian filter to improve the visibility of the background structure.

Open with DEXTER
In the text
thumbnail Fig. 4

Residual background from five consecutively recorded star images. Each image was fitted with 60 principal components. Therefore, the figure essentially depicts how the variable residual background changes on timescales of ~0.1 s. All images were filtered with a Gaussian filter to improve the visibility of the background structure.

Open with DEXTER
In the text
thumbnail Fig. 5

Fitted PCA residual background for one of the star images from the β Pic M band dataset. The images show how the background changes when more principal components are used to model the residual background (for 5, 10, 15, 20, 30, 40, 60, 80, and 100 components).

Open with DEXTER
In the text
thumbnail Fig. 6

Convergence of the computed residual background when moreprincipal components are used for the fit. The data points in the plot are the square root of the mean quadratic sum (RMS) of the difference between two background images with different number of components. The points at X principal components were calculated by taking the background image fitted with X components minus the background image fitted with X + 10 components and taking the RMS (calculated over the whole image) from this difference. There are 10 different results for each # of components because this was done for 10 randomly selected star images. The continuous line goes through the mean values of theresults to show how the background converges. The e-folding scale from the exponential fit to the curve is 18 components.

Open with DEXTER
In the text
thumbnail Fig. 7

Panel a: example for a patch of an exposure which only shows background. The rectangle marks the area which is masked during the residual background fit, this is where the star would usually be located in a star image. This image was smoothed with a Gaussian filter to highlight the residual background structure. Panel b: “true” residual background when the principal components are fitted to the whole background image. Panel c: reconstructed background when the principal components are only fitted to the area outside the mask. Panel d: difference between (b) and (c). The colourbar in (a) is valid for (a), (b) and (c).

Open with DEXTER
In the text
thumbnail Fig. 8

Mean RMS (solid line) from the noise in the masked area after applying the PCA residual background subtraction to background images. These are the mean values from 100 different images. The background limit for this dataset was estimated to be around 37 counts.

Open with DEXTER
In the text
thumbnail Fig. 9

Results for HD 100546 in M band after the PSF subtraction with PynPoint and pre-stacking of ten images. Panels a andc: results with mean background subtracted frames and a PSF subtraction with three and 11 principal components, respectively. Panels a and c: results with PCA residual background subtracted frames and a PSF subtraction with three and 11 principal components, respectively. The radius of the inner mask is 0.2′′ . North is up and east is to the left of the images. The scale is in arbitrary linear units.

Open with DEXTER
In the text
thumbnail Fig. 10

Results for β Pic in M band after the PSF subtraction with PynPoint. The image on the left shows the result with mean backgroundsubtracted frames and a PSF subtraction with 21 principal components after pre–stacking 500 images. The image on the right shows the result with PCA background subtracted frames and a PSF subtraction with 25 principal components after pre-stacking100 frames. The radius of the inner mask is 0.16′′ . The S/N values are 34.5 and 32.5, respectively. North is up and east is to the left of the images. The scale is in arbitrary linear units.

Open with DEXTER
In the text
thumbnail Fig. 11

Summary of the S/N values for β Pic b for the different data reductions. The top panel shows the S/N depending on the number of pre–stacked images and the number of principal components used for the PSF subtraction of the dataset that was mean background subtracted. The bottom panel shows the same for the PCA background subtracted case.

Open with DEXTER
In the text
thumbnail Fig. 12

Noise (in arbitrary units) as a function of the separation from the star for our best results of β Pic in M band. Thecompanion is located at a separation of ~0.5′′ and has in both reductions a similar S/N.

Open with DEXTER
In the text
thumbnail Fig. 13

Results for HD 169142 in L band after the PSF subtraction with PynPoint and no pre-stacking. The image on the left shows the result with meanbackground subtracted frames and a PSF subtraction with nine principal components. The image on the right shows the result with PCA residual background subtracted frames and a PSF subtraction with three principal components. The radius of the inner mask is 0.06′′ . North is up and east is to the left of the images. The scale is in arbitrary linear units.

Open with DEXTER
In the text
thumbnail Fig. 14

Summary of all S/N values for HD 169142b for the different data reductions. The top panel shows the S/N depending on the number of pre-stacked images and the number of principal components used for the PSF subtraction of the dataset that was mean background subtracted. The bottom panel shows the same for the PCA background subtracted case.

Open with DEXTER
In the text
thumbnail Fig. 15

Noise (in arbitrary units) as a function of the separation from the star for our best results of HD 169142 in L band. HD 169142b is roughly located where the two lines meet on the left side, this corresponds to a separation of 0.135′′ .

Open with DEXTER
In the text
thumbnail Fig. A.1

Measured S/N for all planets in the results with PCA background subtraction (solid lines) and mean background subtraction (dashed lines). S/N was measured in unfilteredresults, we applied neither unsharp masks nor noise filters.

Open with DEXTER
In the text
thumbnail Fig. A.2

Top: result after subtracting the mean background and using the PynPoint PCA algorithm to subtract the speckle noise with 17 principal components. Bottom: result after using the PCA algorithm to subtract the background with 40 principal components and PynPoint to subtract the speckle noise with 15 principal components. The inner 0.3′′ were masked for the speckle subtraction. We convolved both images with a 0.5λ/D width Gaussian to improve the visibility of the planet signals for this figure. The images shown here are typical results for both background subtraction schemes, where all planets can be identified, but they do not necessarily exhibit the highest achievable S/N for all planets. The scale is in arbitrary linear units.

Open with DEXTER
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.