The search for radio emission from exoplanets using LOFAR beam-formed observations: Jupiter as an exoplanet

$\textit{Context.}$ The magnetized Solar System planets are strong radio emitters and theoretical studies suggest that the radio emission from nearby exoplanets in close-in orbits could reach intensity levels $10^{3}-10^{7}$ times higher than Jupiter's decametric emission. Detection of exoplanets in the radio domain would open up a brand new field of research, however, currently there are no confirmed detections at radio frequencies. $\textit{Aims.}$ We investigate the radio emission from Jupiter, scaled such that it mimics emission coming from an exoplanet, with low-frequency beam-formed observations using LOFAR. The goals are to define a set of observables that can be used as a guideline in the search for exoplanetary radio emission and to measure effectively the sensitivity limit for LOFAR beam-formed observations. $\textit{Methods.}$ We observe"Jupiter as an exoplanet"by dividing a LOFAR observation of Jupiter by a down-scaling factor and adding this observation to beam-formed data of the"sky background". Then we run this artificial dataset through our total intensity (Stokes-I) and circular polarization (Stokes-V) processing and post-processing pipelines and determine up to which down-scaling factor Jupiter is still detected in the dataset. $\textit{Results.}$ We find that exoplanetary radio bursts can be detected at 5 pc if the circularly polarized flux is $10^5$ times stronger than the typical level of Jupiter's radio bursts during active emission events ($\sim4\times10^{5}$ Jy). Equivalently, circularly polarized radio bursts can be detected up to a distance of 20 pc (encompassing the known exoplanets 55 Cnc, Tau Bo\"{o}tis, and Upsilon Andromedae) assuming the level of emission is $10^{5}$ times stronger than the peak flux of Jupiter's decametric burst emission ($\sim6\times10^{6}$ Jy).


Introduction
The detection and characterization of exoplanetary radio emission would constitute a new and important field of exoplanet science. For example, the detection of planetary auoral radio emission is probably the only method to unambiguously detect exoplanetary magnetic fields (Grießmeier 2015). To date, no confirmed radio detection has been achieved, even though a certain number of observations have been conducted over the past few decades (e.g. Winglee et al. 1986;Bastian et al. 2000;Ryabov et al. 2004;George & Stevens 2007;Lazio & Farrell 2007;Smith et al. 2009;Lecavelier Des Etangs et al. 2009Lazio et al. 2010a,b;Stroe et al. 2012;Hallinan et al. 2013;Lecavelier des Etangs et al. 2013;Sirothia et al. 2014;Murphy et al. 2015;Lynch et al. 2017;Turner et al. 2017;Lynch et al. 2018;O'Gorman et al. 2018). A summary of all the observational campaigns can be found in Grießmeier (2017, Table 2). In parallel to observational studies, a number of theoretical studies has been published (e.g. Zarka et al. 1997;Farrell et al. 1999Farrell et al. , 2004Zarka et al. 2001;Lazio et al. 2004;Stevens 2005;Grießmeier et al. 2005Grießmeier et al. , 2007bJardine & Collier Cameron 2008;Vidotto et al. 2010Vidotto et al. , 2015Hess & Zarka 2011;Nichols 2011Nichols , 2012See et al. 2015;Nichols & Milan 2016); an overview is given, e.g., in re-cent review articles such as Zarka (2011); Zarka et al. (2015); Grießmeier (2015Grießmeier ( , 2017. Starting with Zarka et al. (1997) and Farrell et al. (1999), a number of articles have attempted to estimate the radio flux density that can be expected for different types of exoplanets. Of course, such estimates have to be taken carefully. For example, Grießmeier et al. (2007b) give uncertainties of approximately one order of magnitude for the flux density and an uncertainty of a factor of 2-3 for the maximum emission frequency for the planet Tau Boötis b. The uncertainties are even larger when different models are compared. Still, such numbers can be used to determine whether the detection of exoplanetary auroral radio emission seems realistic with a given radio telescope and observational setup. Indeed, according to most recent estimates, emission frequencies are compatible with the frequencies at which some radio telescopes of latest generation operate, and estimated flux densities are close to the sensitivity of these instruments. In particular, Grießmeier (2017) find that the flux densities of 15 exoplanets are above the theoretical detection limit of LOFAR as given by Turner et al. (2017).
With such encouraging radio predictions, radio observations of exoplanets are undertaken by most low-frequency radio tele-Article number, page 1 of 18 arXiv:1802.07316v2 [astro-ph.EP] 29 Jan 2019 scopes. For these observations, different observing modes and strategies can be used. In the following, we will differentiate between (a) imaging observations and (b) beam-formed observations. Many recent observations (e.g. Hallinan et al. 2013;Sirothia et al. 2014;Lynch et al. 2017) have been recorded in the form of interferometric images using an array of distributed antennas or dishes (e.g. GMRT, LOFAR). Interferometric observations have the advantage of a higher robustness against localized (i.e. site-specific) Radio Frequency Interference (RFI), and are equally sensitive to continuous and moderately bursty signals (i.e. longer than the shortest time constant in imaging pipelines, typically a few seconds; e.g. Offringa et al. 2014). They are computationally expensive, but offer the possibility to exclude a bad antenna or dish from the analysis even during offline processing. Beam-formed observations have the advantage of a higher time resolution, which can be used to localize and excise short and sporadic RFI more precisely. They cannot reliably detect continuous or slowly varying emission, but excel at the detection of short bursty signals. Compared to imaging observations, only a handful of pixels have to be analyzed, which reduces the computational cost: Typical observations use 1 ON-beam and 1 to 3 simultaneous OFF-beams, see e.g. Zarka et al. (1997) or Turner et al. (2017).
For both imaging and beam-formed observations, the determination of a minimum detectable flux density is not straightforward in the case of a bursty signal. The reason for this is that the upper limit depends on the detection method. In this work, we present a detection tool that integrates the processing steps described in Turner et al. 2017 (RFI-mitigation, normalization by the time-frequency (t-f) response function, t-f integration) and a series of sensitive observables based on the work of Vasylieva (2015). In order to test, validate, and quantify the sensitivity reached with this tool, we apply it to a LOFAR observation of Jupiter's magnetospheric radio emission in which the signal from Jupiter is attenuated. The idea is simple: we observe Jupiter, divide its signal by a fixed factor before adding it to an observation of "sky background", thereby creating an artificial dataset best described as "Jupiter as an exoplanet". We then run our pipeline and check whether the (attenuated) radio signal from Jupiter is detected. The maximum factor by which we can divide Jupiter's signal and still achieve a detection gives the sensitivity of our setup (i.e. the combination of the telescope and the processing chain). This method is mainly designed for use with beam-formed data, but an extension to radio imaging observations is under preparation and will be described elsewhere (Loh et al. in prep).
Finally, the instantaneous flux density of Jupiter was obtained from a well-calibrated observation using the Nancay Decameter Array (NDA; Boischot et al. 1980;Lamy et al. 2017) simultaneous to our LOFAR observation of Jupiter. The NDA observation is used to convert the sensitivity of our setup into physical units.

Observations
For this study, we use four different sets of Low-Frequency Array (LOFAR; van Haarlem et al. 2013) Low Band Antenna (LBA) beam-formed observations in the frequency range 15-62 MHz. The detailed setup and the summary of all observations (date, time, and beam directions) can be found in Table 2 and 1, respectively. In this paper, we focus on the total intensity (Stokes-I) and circular polarization (Stokes-V) components of the emission. All observations were intentionally scheduled during night time hours to mitigate strong contamination by RFI.
The first observation (hereafter Obs #1) was taken on February 11, 2017 from 02:30 to 5:30 UT and the ON-beam was pointed at Jupiter. The dynamic spectrum of this beam can be found in Figs. 1a and 1b. The structure of the Jupiter emission is very complex and the analysis of this structure (e.g. Burke & Franklin 1955;Carr et al. 1983;Zarka 1998;Kaiser 1993;Lecacheux et al. 2004;Imai et al. 2015;) is beyond the scope of this study. As expected, Jupiter's emission is only seen below 40 MHz in the observation .
Due to its anisotropic beaming, Jupiter's emission is visible from Earth only ∼10% of the time. It does not, however, occur randomly, but depends on the geometrical position of the Earth, Jupiter, and Jupiter's satellite Io, as expressed by Io's orbital phase and the CML (Central Meridian Longitude = the observer's Jovicentric longitude). Statistical studies have identified times when the probability of detecting Jupiter's decametric emission from Earth is > 50%, , and for a specific geometry (so-called Io-B emission), the occurrence rate reaches 94% (i.e. nearly permanent emission) ). To determine a good time window for Obs #1, we made use of the Io-phase/CML diagrams provided by Nançay Radio observatory 1 .
Two OFF-beams were obtained simultaneously with the ONbeam, however the OFF-beams show strong contamination by emission from Jupiter despite being located ∼ 2 degrees away from Jupiter. Therefore, a second observation to obtain "clean" OFF-beams was taken on February 18, 2017 from 01:12 to 4:12 UT (hereafter Obs #2). Obs #2 will be used as the "sky background" to which we will add the attenuated Jupiter signal. Two OFF-beams were obtained at beam positions chosen such that no significant low-frequency point sources were located within the beam. For this we used the TGSS survey (Intema et al. 2017) at 150 MHz. The dynamic spectrum of one of the OFF-beams can be found in Figs. 1c and 1d.
While most of the analysis was done using Obs #2 for the "sky background", we also used two other dates of observations with two OFF-beams to verify our results. The third dataset was taken on February 26, 2017 from 01:16 to 04:16 UT (hereafter Obs #3) and was pointed at the same OFF-beam positions as Obs #2. This date had far worse RFI conditions than Obs #2 and also had noticeable large-scale scintillation due to a disturbed ionosphere. The fourth dataset was taken on September 28, 2016 from 23:00 to 04:00 UT (hereafter Obs #4; Table 1). Obs #4 was comparable in quality to Obs #2 (no large scale scintillation patterns) and RFI conditions but was pointed at a different part of the sky.

Scaling Jupiter's signal
We add the Jupiter signal, multiplied by a factor α (<< 1), to the sky plus instrumental background of a typical exoplanet observation, and then try to detect it with our two-step processing pipeline (Sect. 4). As we will test below the post-processing in 10 MHz bands, we use the Jupiter signal of Figs. 1a and 1b detected in the band 15-25 MHz. In order to test our pipeline across the entire LOFAR-LBA range, we need to be able to add the attenuated Jupiter signal to any 10 MHz band in the range 10-90 MHz. Having no absolute calibration available in the LOFAR-LBA range, we proceed in two steps: (i) the Jupiter signal detected by LOFAR in Obs #1 (I J1 ) is expressed in terms of the     sky background in the band of observation 15-25 MHz (I S 1 ), i.e. the ratio (I J1 /I S 1 ) is computed as in the following section (Sect. 3.2), and it is then transferred to an arbitrary 10 MHz band in the sky background in Obs #2 (I S 2 ); (ii) the flux density of the Jupiter emission is computed from simultaneous calibrated observations performed at the NDA. These two steps are detailed below. For step (i), we add the dynamic spectrum of the Jupiter observation in the range 15-25 MHz to the dynamic spectrum of the sky background in an arbitrary 10 MHz band of an exoplanet observation (with the same observational setup; Table 2) to get a test Stokes-I dynamic spectrum I sim following where I J2 is the Jupiter signal as it would have been observed in the test frequency band, I J1 /I S 1 and I S 2 are derived from the Stokes-I observational data, α (<<1) is the variable down-scaling parameter, and the ratio S S 1 /S S 2 can be computed as the ratio of the SEFD in the band 15-25 MHz and in the test frequency band. Similarly, the test dynamic spectrum for Stokes-V V sim is where V S 2 is the Stokes-V sky background in Obs #2 and V J1 is the Stokes-V Jupiter signal from 15-25 MHz. The full derivation of equations (2) and (3) (2) and (3) can be simply rewritten with N 2 and N 1 the number of LBA stations involved in each observation. Note that equations (2) and (3) can be used to add the signal (of Jupiter or other) observed with one telescope in a given frequency range to the background recorded with another telescope in another frequency range, as long as the SEFD of the two telescopes in their respective spectral ranges are known. Equations (4) and (5) are the application for the considered LOFAR-LBA observations. The Jupiter signal thus transferred retains its absolute intensity (e.g. in Jy).
For step (ii) we use an observation of Jupiter simultaneously taken to the LOFAR one, carried out at the NDA. For this observation, the NDA observes simultaneously in right-hand (RH) and left-hand (LH) circular polarizations from 10 to 40 MHz in 400 spectral channels at a time resolution of 1 second. Hourly calibration sequences on noise sources of known flux density are embedded in the data and allow us to calibrate the observations in absolute flux density (Jy), with an accuracy ∼ 20%. From NDA data, we know that the first Jupiter burst at about 02:45 UT is RH elliptically polarized, whereas the drifting emission bands starting around 04:00 UT are LH elliptically polarized. For Stokes-I, we summed the RH and LH signals to obtain the total intensity. We removed the main fixed-frequency RFI and the main broadband spikes (recognized as non-Jupiter signal by integration over the 26-40 MHz range). After subtraction of a background (computed in each frequency channel) the Article number, page 4 of 18 cleaned calibrated dynamic spectrum was averaged over the 15-25 MHz range to obtain the time series displayed in Fig. 2a (black '+' symbols) together with a running average over 10 seconds (red line). Fig. 2b displays the high-pass filtered flux densities obtained by subtracting the 10 second average from 1 second measurements. The bursty spikes in this high-pass filtered timeseries will be used for comparison to the results of our processing below (Sect. 6). The cumulative distribution function of the values of Fig. 2b is displayed in Fig. 2c. We obtain similar results within a factor ≤2 performing the same analysis on Stokes-V. From that figure, we see for example that ∼100 high-pass filtered flux density measurements exceed 3 × 10 4 Jy. By comparing this curve to the actual number of data points of emission detected in the LOFAR data we can determine the sensitivity of our observations and processing (Sect. 6).

Extraction of Jupiter's signal
To obtain both the Jupiter signal (I J1 ) and the sky background in the Jupiter observation I S 1 we first need a RFI mask. Since Jupiter is as bright as the RFI, we used a modified version of the RFI mitigation pipeline presented in Turner et al. (2017). The following steps are performed: (1) find RFI on the ON-beam above 30 MHz (where no Jupiter emission is present) using the algorithm PATROL (Zakharenko et al. 2013, Vasylieva 2015 to flag entire time steps, (2) find RFI in the OFF-beam (beam 2) using only PATROL to flag entire time steps and frequency channels, and (3) combine the RFI masks from step (1) and (2) together to obtain a final RFI mask. This mask is then applied to Obs #1 and this dataset is used as the Jupiter signal (I J1 ).
Next, we find I S 1 for Obs #1 using the least Jupitercontaminated OFF-beam (beam 2) and during a time interval (3740 -3830 seconds after the start of the observation) where Jupiter's emission was minimal. To find the SEFD we apply the RFI mask from step (3) to the raw data. Then at each frequency we compute the 10% quantile of the distribution of intensities (using this quantile allows for minimal influence from any Jupiter emission or remaining RFI). The level of the 10% quantile is lower than the mean, therefore, I S 1 has to be corrected. Quantitatively, the 10% quantile (µ 10 ) for a Gaussian distribution with moments (µ, σ g ) is where n pol is the number of polarizations (2), b is the frequency resolution (b = 3.05 kHz), and τ r is the time resolution (τ r = 10.5 msec). The factor of 1.3 in equation (6) and (7) was determined using a standard Gaussian distribution. Therefore, the term I S 1 used in the analysis is obtained from the measured value (µ 10 ) using

Signal processing and observables
At low radio frequencies, any observed field not containing a A-team source (especially Cas-A, Cyg-A, Vir-A and Tau-A) is dominated by the Galactic background, that is an intense diffuse radio emission. We use this emission as a calibrator. Before doing so, however, the data have to be cleaned of RFI.

Processing pipeline for Stokes-I
The data of Observation #2 with (I sim ) and without the added Jupiter signal were run through the Stokes-I data reduction pipeline described in Turner et al. (2017). This pipeline performs RFI mitigation, finds the time-frequency (t-f) response function of the telescope and normalizes the data by this function, and rebins the data in broader t-f bins. For RFI mitigation we use four different techniques (Offringa et al. 2010;Offringa 2012;Offringa et al. 2012;Zakharenko et al. 2013;Vasylieva 2015, and references therein) that are combined together for optimal efficiency and processing time. The result of the RFI mititation is an array (mask, m I ) of the same resolution as the data with a value of either 0 (polluted pixels) or 1 (clean pixels). Subsequently, the data is rebinned to a time and frequency resolution of 1 second and 45 kHz. This rebinned data is the input into the Article number, page 5 of 18 A&A proofs: manuscript no. main post-processing pipeline (Sect. 4.3). The original method used in Turner et al. (2017) to find the time-frequency response function of the telescope (hereafter, method 1) is biased if some astrophysical emission or left-over RFI is present in the raw dynamic spectrum since the mean of the data is used to create the function. The nominal raw sensitivity of the LOFAR observations is 208 Jy (Table 2) where the expected flux from most exoplanets is less than 100 mJy (Grießmeier et al. 2007b;Grießmeier 2017). Therefore, for exoplanets we do not expect that the emission will be bright enough to be seen in the raw dynamic spectrum. However, when we test large Jupiter scaling factors (e.g. α = 10 −2 ) this is no longer the case. Therefore, we introduce a new method (hereafter, method 2) to find the time-frequency response function that is less biased towards bright emission in the raw dynamic spectrum. In the pipeline (1) we divide the raw data into sections of 4000 spectra (42 seconds), (2) we apply the RFI mask to the raw data, (3) we create an integrated spectrum from the 10% quantile of the distribution of intensities at each frequency, (4) we correct the average of the 10% quantile such that it is equivalent to the mean using equation (7), then (5) we find a second order polynomial fit at each frequency over all time sections, and (6) we create and save the 2-d time-frequency response surface made from the polynomial fits. As expected, method 1 and method 2 obtain the same results when α is very small (e.g. below α = 10 −5 ). When α is large, method 2 is more robust. In addition, method 2 is computationally faster than method 1; therefore method 2 is the preferred method for finding the time-frequency function and will be used in the analysis of this paper.

Processing pipeline for Stokes-V
For Stokes-V, the processing pipeline has to be partially adapted. For example, the raw Stokes-V data includes both negative and positive values, so we cannot use the same exact approach as done in the Stokes-I pipeline. Just like in the Stokes-I pipeline, we want to perform RFI mitigation, normalize the data by a timefrequency (t-f) response function of the telescope, and rebin the data into larger t-f bins. Additionally, the output of the Stokes-I pipeline is used in the Stokes-V processing pipeline. Therefore, Stokes-V is always ran after processing Stokes-I.

Pipeline for normal operation
In this section, we will describe the normal operation (i.e. no Jupiter signal added) of the Stokes-V pipeline. The raw Stokes-V data does not have an average of 0 as a function of frequency as we have for calibrated Stokes-I data. We perform 3 steps to center the data around 0.
(1) We divide the raw Stokes-V (V) by the raw Stokes-I (I) data to get rid of any large-scale instrumental systematics (2) We subtract v at each frequency by its time average (< v( f ) > t ) to center the dynamic spectrum around 0 The time average < v( f ) > t contains instrumental systematics that are not common between the Stokes-I and Stokes-V polarizations. Additionally, < v( f ) > t is taken over 42 seconds in the default pipeline.
(3) We multiply v by the RFI-mitigated and normalized Stokes-I data (I cor ) I cor is calculated in the Stokes-I pipeline as where < I( f ) > is the frequency response function of the telescope (i.e. obtained with method 1 and method 2 described in Turner et al. 2017 or in Section 4.1, respectively) and m I is the Stokes-I RFI mask.
Step (3) ensures that the units of V are the same as in I cor . As described below all or a subset of these steps are used in each part of the Stokes-V pipeline.
To find the RFI mask for Stokes-V (m V ), we use the following procedure. First, we divide the raw Stokes-V data into slices of 42 seconds (4000 spectra). Then we perform steps (1), (2), and (3) to find V . The time-average in step (2) is done separately on each slice of data. In step (3), we use the frequency response function of the Stokes-I data derived using the 10% quantile of the distribution of intensities at each frequency (Turner et al. 2017). The resulting dynamic spectrum V is then ran through the RFI mitigation code described in Turner et al. (2017) which produces m V . The final mask (m V ) used in the analysis is the V mask multiplied by the I mask (m V = m V m I ).
In the construction of the pipeline, we discovered that the systematic difference between the frequency responses of the Stokes-I and Stokes-V signals changed throughout the observation. Therefore, a Stokes-V t-f response function is also needed for the analysis. To find the Stokes-V t-f response function of the telescope we first divide the raw dynamic spectrum V into sections of 4000 spectra. We then perform step (1) and apply the RFI mask m V to v. Next, we perform step (2) but only save the average of each frequency in v over each time section. Then we fit the averages with a second order polynomial at each frequency over all time sections. Finally, we create and save the 2-d t-f response surface made from the polynomial fits.
To obtain the final RFI-mitigated and normalized V dynamic spectrum we again perform steps (1), (2), and (3) with some variations. Instead of using the time-average in step (2) we subtract by the Stokes-V t-f response surface. In Step (3), to find I cor we normalize the raw data by the Stokes-I t-f response surface using method 2 (Section 4.1). Just like the Stokes-I normalized data, V is in units of the SEFD. After step (3), we multiple the normalized data by the Stokes-V mask m V . After normalization and RFI-mitigation, V is then rebinned to the same resolution as the Stokes-I data (1 second and 45 kHz) and this is dynamic spectrum that will be input into the post-processing code.

Pipeline for Jupiter analysis
For the Jupiter analysis, the data of Obs #2 with (V sim ) and without the added Jupiter signal were processed through the new Stokes-V pipeline. The V sim used in the pipeline follows Equa- The normalized Stokes-V Jupiter data V J1 is found using a slight variation of the 3 steps in the pipeline We have to subtract v J1 by the average of the OFF-beam < v S 1 > t because the average of the Jupiter beam is skewed due to the immense Jupiter emission. We determined < v S 1 > t with the same beam (Beam 2) and time interval (3740-3830 seconds after the start of the observation) as in the calculation of I S 1 .

Post-processing pipeline: Observables of the exoplanet signal
In the following section, we present the post-processing pipeline.
After processing the data we compute several observable quantities that we named Q1 to Q4 for the ON-and OFF-beam and examine their behavior over time or frequency. The input dynamic spectrum for the observables is the RFI-mitigated, normalized, and rebinned data (Sects. 4.1 and 4.2; Fig. 3a). For the Stokes-V data, the analysis is performed on 3 different variants of the data: This will allow us to determine whether right-hand or left-hand polarization can be seen in the analysis. The observable quantities fall into two general categories: extended emission (Q1) or burst emission (Q2 -Q4). Below is the list of observables we defined (inspired by the methods used by Zarka et al. 1997 where x is the time-series of the dynamic spectrum integrated over all frequencies but not rebinned in time and x s is the low-pass filtered data (low-frequency component) created by running a sliding window of w seconds over x (in the default pipeline w = 10 time bins). We subtract by the mean < (x − x s ) > to center y around 0. Finally, the time-series is normalized by its standard deviation in order to unify the thresholds. An example of y can be found in Fig. 4a. We further examine Q2 by creating a scatter plot of the ON-beam values versus the corresponding OFF-beam values (Fig. 4b). In this plot, peaks that appear only in the ON-beam would be visible on the right edge of the cloud of points. An example for Q2 of simulated data is given in Fig. 5a. Due to residual low-level RFI or ionospheric fluctuations, high values of Q2 frequently occur simultaneously in the ON-and OFF-beam (points close to the main diagonal in Fig. 5a). For this reason, we implemented an elliptical correction, as described in Appendix B. After the elliptical correction is applied, the Q2 distribution of the sky noise data points is much closer to circular, which makes the signal data points more easily detectable. This is demonstrated in Fig. 5b. The analysis of real data (Sect. 5) shows that this elliptical correction does indeed facilitate the detection of astrophysical signals Article number, page 7 of 18 A&A proofs: manuscript no. main in the target beam and gives a better sensitivity (i.e. allows the detection of fainter signals). The sensitivity is increased by half an order of magnitude using the elliptical correction. This is demonstrated in Figure 6, where the comparison of panels a (without) and e (with elliptical correction), b (without) and f (with elliptical correction), c (without) and g (with elliptical correction), or panels d (without) and h (with elliptical correction) shows a marked difference. Next, the observables Q3 and Q4 are defined to systematically and statistically explore the parameter space of Q2 (e.g. y). When examining Q3 and Q4, the ON-and OFF-beam are always compared to each other and plotted against a reference curve with the same number of elements. This reference curve is created by taking the mean of the derived Q values from 10000 different Gaussian distributions of random values. When we subtract the ON-and OFF-beam Q value, then the reference curve is the standard deviation of the difference between all the Q values derived from two different Gaussian distributions (each run 10000 times). By default, Q4 is calculated from τ = 1...6σ with a step size of 0.1σ. Q4 is more effective at finding excess faint emission than Q3 since it is summed over all times. Once a detection is found in Q4, then Q3 can be used to localize the emission in time (e.g. Fig 7a). The reason for evaluating Q3a and Q4a are to determine if the ON-beam has more positive peaks than the OFF-beam thus indicative of burst emission. The power of the peaks (Q3b and Q4b) highlights more clearly any potential excess. The peak (Q3c and Q4c) and power asymmetry (Q3d and Q3d) are useful at determining whether there is an asymmetry in the signal distribution. These observables are similar to the skewness but are more adapted to a small numbers of outliers. An excess of positive peaks over negative ones could be evidence of bursts. Finally, the peak (Q3e and Q4e) and power offset (Q3f and Q4f) are the best discrimination of real burst emission because they directly correlate any detection against the other beam. Additionally, ionospheric effects and any remaining lowlevel RFI will be concentrated on the diagonal; the peak and power offset mitigate these effects especially after elliptical correction. See Fig. 5 for an illustration of where these observables lie in the parameter space of the scatter plot of Q2.
A summary of the parameters used in the post-processing can be found in Table 3. The rebin time of the processed data (δt) is a very important parameter because this defines the timescale over which we search for excess peaks in Q2. The frequency (∆ν) and time range (∆T ) over which we calculate these observables is 10 MHz and 3 hours, respectively. Additionally, we include a threshold cut on the rebinned RFI mask. The rebinned mask no longer consists only of values of 0 (polluted pixels) and 1 (clean pixels) since it was rebinned and clean pixels were mixed with polluted pixels. The mask threshold we use in our analysis is 90%, meaning a pixel will not be used in the analysis if ≥ 10% of the original pixels were contaminated. We use a time interval (T I) of 2 minutes and a frequency interval (FI) of 0.5 MHz for Q1b.  (Figs. 3, 4, 7). For Q3a, it can be seen that Jupiter's emission is mainly localized between 1.2-1.4 UT and 3.2-3.9 UT (Fig. 7a) where emission around 1.7 UT can be seen in both OFF beams (Fig. 7b). This is a good example demonstrating that two OFF beams are required to confirm a detection.
The extended emission observables Q1a and Q1b are very useful for Stokes-V. For Stokes-I they are useful only when the simulated exoplanet emission is very bright (α = 10 −2 − 10 −3 ) and can be seen by eye in the processed dynamic spectrum. Additionally, when we run the analysis of V + and V − the righthand and left-hand polarizations are easily separated. The dominant source of variations in Q1a and Q1b are changes in the ionosphere. Ionospheric variations are the limiting factor in distinguishing real emission from background variations.
The observables Q2 -Q4 are more effective at detecting fainter burst emission. The best observables to detect the faintest emission are Q4e and Q4f (Peak/Power Offset). We can reliably detect emission from Jupiter down to a value of α = 10 −3.5 for Stokes-I and α = 10 −4.0 for Stokes-V with the elliptical correction when adding Jupiter to the range 50 -60 MHz. For the Stokes-V detection limit (α = 10 −4.0 ), additional flux in the ONbeam can be seen in all Q values (Figs. 3, 4, 7). Fig. 6 shows the Stokes-I analysis for α = 10 −3.5 for both the ON-vs. OFF-beam and OFF-beam 1 vs OFF-beam 2. The main criteria we use to confirm a detection are (1) Q4f is distinctly different than the OFF-beam 1 vs. OFF-beam 2 comparison plot (Fig. 6g), (2) Q4f shows an excess ≥ 2σ statistical significance (dashed lines in Fig. 6), and (3) the detection curve is always positive for thresholds between 1.5 − 4.5σ.
The detection limits for each frequency range are summarized in Table 4. We get the most constraining detection limit for the frequency range 50 -60 MHz. Our detection limits for 40 -50 MHz and 30 -40 MHz are half an order of magnitude less sensitive, where the detection limit for 20 -30 MHz is an order of magnitude less sensitive. This is expected since the frequencyresponse curve of LOFAR sharply peaks at 58 MHz and has only limited sensitivity at the lowest frequency range (e.g. Figure 1 in Turner et al. 2017).
Next, we test the robustness of the detection limits by varying the parameters of the post-processing from those in Table 3. We vary the rebin time of processed data (δτ), the smoothing window (w), the value of the slope for the Peak/Power Offset (Q4e and Q4f), the frequency range, and the time range. Our detection limit for Stokes-I did not significantly improve when we varied these parameters. For Stokes-V, we improved our detection limit to α = 10 −4.5 (half an order of magnitude) when examining the V + data from 3-4 UT and over the frequency range 53.5-56.5 MHz (Figure 8). Therefore, our detection limit is fairly robust against the exact parameters used in the analysis but can be improved with a thorough frequency and time search. The signal from Jupiter is detected with Q4f until the data is binned to a δτ=30 seconds for Stokes-I and δτ=4 minutes for Stokes-V. For Article number, page 9 of 18 A&A proofs: manuscript no. main Q1a (the time-series), we can detect signal for Stokes-V up to a δτ=30 minutes but with lower significance (∼ 2σ). Therefore, assuming that radio emission from an exoplanet is similar to Jupiter's, searching over a variety of timescales with Q1a and Q4f will be helpful for a detection. This result also indicates that our method of analysis for beam-formed data can be applied to various setups of beam-formed observations and dynamic spectra extracted from rephased calibrated visibilities of imaging pipelines (Loh et al. in prep). Furthermore, we tested whether the date of observation or the position on the sky has a noticeable effect in our detection limits. For Obs #3, we find detection limits for Stokes-I that are half an order of magnitude less sensitive from those found using Obs #2. However, for Stokes-V we obtain similar detection limits to Obs #2. Next, performing the analysis on Obs #4 we find detection limits that are similar to Obs #2. Therefore, our detection limits (Table 4) are also insensitive to where in the sky we are pointed at, provided that the observations were taken under good conditions. Finally, we quantify the statistical significance of our detection limits. First, we normalize the observable Q4f (ON-OFF, i.e. the solid line in Fig. 6h) by the statistical limit (1 σ, i.e. the first dashed line in Fig. 6h) and calculate the average value  Jupiter's emission is mainly localized between 1.2-1.4 UT and 3.2-3.9 UT, whereas the bright emission at 1.7 UT can be seen in both OFF beams.
for Stokes-V. Next, we compare these values to those obtained in the case when both the ON-and the OFF-beam only contain random noise. We generate random distribution of points for the ON-and OFF-beam (generating an artificial equivalent of Q2), and calculate Q4f and <Q4f>. We find that the probability of a false positive for obtaining a signal like Jupiter is 1.4×10 −5 for Stokes-V for α = 10 −4.5 at 53.5-56.5 MHz and 3.2×10 −4 for Stokes-I for α = 10 −3.5 at 50-60 MHz. This corresponds to a statistically significant detection of 3.6σ for Stokes-I and 4.3σ for Stokes-V. As a final step, we compare these values to those obtained in observations without any astrophysical signal (i.e. when comparing the two OFF beams, Figs. 6g and 8e). In that case, we find that the false positive rate is 99% for Stokes-I and 20% for Stokes-V. Therefore, these observations are thus classified as non-detections.

Discussion
In the following, we will compare the detection limits for two of our observables: Q1a and Q4f. We can estimate an upper limit from 50-60 MHz using the Stokes-V Q1a (time-series) observable (e.g. Fig. 3b). The standard deviation of the difference of the two sky beams is 3.7 × 10 −5 of the theoretical SEFD. Therefore, the 1-sigma sensitivity from these observations would be 62 mJy using a SEFD for 24 stations of 1.7 kJy (van Haarlem et al. 2013). This flux density is ∼1.3 times higher than the sensitivity expected for LOFAR beam-formed observations: where S S is the SEFD with a value of 40 kJy (van Haarlem et al. 2013). Turner et al. (2017) found that the Stokes-I sensitivity using this same method was 850 mJy due to fluctuations in the ionosphere (∼50 times the theoretical sensitivity). If we rebin Q1a to longer timescales the standard deviation decreases with ∼ t −1/2 white noise dependence. Additionally, we find no evidence of red noise in the Stokes-V time-series using the timeaveraging method (Pont et al. 2006;Turner et al. 2016). Performing the same procedure in Stokes-I, Turner et al. (2017, Fig. 4) found that there was a great deal of red noise in the time series (RMS of red noise ∼ 0.5 RMS of white noise) due to non-Gaussian ionospheric variations between the two beams.
We demonstrated that we can detect the Jupiter signal downscaled by a factor α = 10 −4.5 for Stokes-V in all Q values but with the largest significance in Q4f (Fig 8f). Our detection in Q4f consists of ∼30 data-points in the NDA calibration data exceeding 3 × 10 4 Jy with a threshold ≥ 2σ (Fig 8d). Therefore, this limit corresponds to a flux density of ∼ α × 4 × 10 4 Jy = 1.3 Jy using the value of Jupiter's absolute flux density corresponding to 30 data-points from Fig. 2c. We also find that this flux density is ∼1.3 times the theoretical sensitivity (σ LOFAR = 1 Jy) expected for LOFAR beam-formed observations using equation (19) when τ = 1 sec and b = 3 MHz. This factor of 1.3 is mostly due to ionospheric variations that were not mitigated during the post-processing and partly due to the fact that our criteria for a burst detection is a statistical significance ≥ 2σ (Section 5; Fig.  6h).
Our comparison shows that for both Q1a and Q4f we are at 1.3× the theoretical sensitivity of LOFAR. Both observables are complementary to each other and should be used in parallel since they probe different timescales and emission structures.
One may wonder why bothering with the complex observables to achieve the sensitivity expected for beam-formed observations. The answer is that they allow us to detect confidently a signal and distinguish it from false positives at a 1.5-2σ level, whereas simple detection of a spike in beam-formed data requires generally a ∼ 10σ level to be considered as reliable. Thus we actually gain a factor > 5 in effective sensitivity (detection capability) with our method. Also, our method allows for the detection of relatively sparse and short bursts that would be washed out by averaging over long integrations. (e) and (f) Q4f difference of the two beams. The comparison of the two OFF-beams from Obs #2 can be found in the left column (panels a, c, e) and the comparison of ON-beam (Jupiter) vs OFF-beam 2 can be found in the right column (panels b, d, f). The dashed lines for panels (c-f) are the 1, 2, 3 σ statistical limits derived from two different Gaussian distributions (each run 10000 times). Panels (d) and (f) show an excess of ON vs OFF points at ≥ 3 σ statistical significance for signals up to a threshold of 6σ. For comparison, in panels (c) and (e) most the excess points are below the 2σ statistical significance level. We find by performing Gaussian simulations that the probability to get by chance a curve like panel (e) is ∼20%, whereas the probability to obtain a curve like panel (f) is ∼10 −5 .
Article number, page 13 of 18 A&A proofs: manuscript no. main Notes. All calculations were done with equation (21) where the scaling factor α = 10 −3.5 for Stokes-I and α = 10 −4.5 for Stokes-V and S J (obs) = 3 × 10 4 Jy (Sect. 3.1, Figure 2). to Jupiter, relative Jupiter flux levels, and distance): where α J is the scaling factor of the emission compared to Jupiter, d is the distance, S J (obs) is the flux density of the observed Jupiter signal in Obs #1 calibrated using NDA, and S J (ref) is a reference flux density value of Jupiter to which the putative exoplanet signal is compared. The Jupiter signal (S J [obs]) of ∼ 3×10 4 Jy is more than a factor 100 below the peak value reached by Jupiter's decametric emission (up to 6 × 10 6 Jy; Queinnec & Zarka 2001) observed from the Earth, at 5 AU range. For S J (ref), we use Jupiter's radio emission levels and occurrence rates given in Zarka et al. (2004, Figure 7). Jupiter does emit decameter emission quasi-continuously but the most energetic emission can be found in bursts. During a fairly active emission event, the median flux density of Jupiter's decametric bursts at 5 AU is ∼ 4 × 10 5 Jy. This flux density is exceeded by ∼ 1% of all detected Jupiter bursts, whereas the level ∼ 4 × 10 4 Jy is exceeded by ≥ 50% of Jupiter bursts. We find that we can detect an exoplanetary polarized signal intrinsically 10 5 times stronger than Jupiter's emission strength from a distance of 5 pc using equation (21) and taking the mean level of Jupiter's decametric bursts as the reference flux (S J [ref] = 4 × 10 5 Jy) that would occur for a few minutes within an observation of a few hours. A stronger signal may be detected more often, a weaker one more rarely. In Table 5, we show the values of α J that would be required for the emission to be detectable in Stokes-I and Stokes-V (circular polarization), respectively.
Such signals are indeed expected to exist. According to most models, the strongest emission up to 10 6−7 times Jupiter's radio emission is expected for close-in planets, especially massive hot Jupiters (Zarka et al. 2001;Zarka 2007;Grießmeier et al. 2007bGrießmeier et al. , 2011. However, rapidly rotating planets with strong internal plasma sources have also been suggested to produce radio emission at detectable levels at orbital distances of several AU from their host star (Nichols 2011(Nichols , 2012. Furthermore, the expected radio flux is a function of the age of the exoplanetary host star, with stronger radio signals being expected for planets around young stars (Stevens 2005;Grießmeier et al. 2005Grießmeier et al. , 2007a, and for planets around stars with frequent and powerful coronal mass ejections (Grießmeier et al. 2006(Grießmeier et al. , 2007a. Sources beyond 20 pc would need to be extremely intense (≥ 10 7 × Jupiter's mean level of burst emission; Table 5), and may be beyond the reach of LOFAR. If the structure of the emission is different from that of Jupiter bursts (e.g. longer bursts of several minutes), the above sensitivity may be improved by an order of magnitude or more.
Finally, let us mention that detection of a radio signal from an exoplanetary system will only constitute the first step. Even though the planetary emission is expected to be much stronger than the stellar emission (see e.g. Grießmeier et al. 2005), one would have to confirm that the signal is indeed produced by the exoplanet rather than its host star. The most direct indication would be the detection of radio emission from a transiting planet, with the planetary emission disappearing during secondary eclipses. Secondly, stellar and planetary radio emission have different polarization properties (Zarka 1998), making polarization a very powerful tool even beyond signal detection. Thirdly, one would have to search for a periodicity in the detected signal, and compare its period to the stellar rotation period (or, more precisely, the beat period between the stellar rotation and the planetary orbit, see e.g. Fares et al. 2010), and (if known) the planetary rotation period.
Ancillary data which would help with the interpretation of a radio signal include: stellar lightcurves (correlation with stellar flares), stellar magnetic field maps (e.g. obtained by Zeeman-Doppler-Imaging), the stellar rotation rate, data on the stellar wind (e.g. obtained by astrospheric absorption) or at least a good estimation of the stellar age, the exoplanet's orbital inclination (see Hess & Zarka 2011) and the planetary rotation rate.

Conclusions and perspectives
Our analysis shows that our pipeline for beam-formed LOFAR Stokes-V data can detect signals of 10 −4.5 times the intensity of Jupiter's polarized emission. This corresponds to either a Jupiterlike planet at a distance of 13000 AU, or an exoplanet with 10 5 times Jupiter's mean radio flux for strong burst emission (4 × 10 5 Jy; Zarka et al. 2004) at a distance of 5 pc (Table 5). According to frequently employed scaling laws (e.g. Zarka et al. 2001;Zarka 2007;Grießmeier et al. 2007b;Zarka et al. 2018, submitted), one can expect exoplanetary radio emission up to 10 6−7 times Jupiter's flux. This also means our pipeline could potentially detect radio emission from the exoplanets 55 Cnc (12 pc), Article number, page 14 of 18 This paper is based (mostly) on data obtained with the International LOFAR Telescope (ILT) under project codes LC2_018, LC5_DDT_002, LC6_010, and LC7_013. LOFAR (van Haarlem et al. 2013) is the Low Frequency Array designed and constructed by ASTRON. It has observing, data processing, and data storage facilities in several countries, that are owned by various parties (each with their own funding sources), and that are collectively operated by the ILT foundation under a joint scientific policy. The ILT resources have benefitted from the following recent major funding sources: CNRS-INSU, Observatoire de Paris and Université d'Orléans, France; BMBF, MIWF-NRW, MPG, Germany; Science Foundation Ireland (SFI), Department of Business, Enterprise and Innovation (DBEI), Ireland; NWO, The Netherlands; The Science and Technology Facilities Council, UK. We use the TGSS survey (Intema et al. 2017) in our study when determining the locations of the OFF-beams and we thank the staff of the GMRT that made these this survey possible. GMRT is run by the National Centre for Radio Astrophysics of the Tata Institute of Fundamental Research. NDA Jupiter observations were used for the flux calibration in our analysis. The NDA is hosted by the Nançay Radio Observatory/ Unité Scientique de Nançay of the Observatoire de Paris (USR 704-CNRS, supported by Université d'Orléans, OSUC, and Region Centre in France).
We also thank the anonymous referee for their helpful comments during the publication process.