Characterizing the spatial pattern of solar supergranulation using the bispectrum

Context. The spatial power spectrum of supergranulation does not fully characterize the underlying physics of turbulent convection. For example, it does not describe the non-Gaussianity in the horizontal flow divergence. Aims. Our aim is to statistically characterize the spatial pattern of solar supergranulation beyond the power spectrum. The next-order statistic is the bispectrum. It measures correlations of three Fourier components and is related to the nonlinearities in the underlying physics. It also characterizes how a skewness in the dataset is generated by the coupling of three Fourier components. Methods. We estimated the bispectrum of supergranular horizontal surface divergence maps that were obtained using local correlation tracking (LCT) and time-distance helioseismology (TD) from one year of data from the Helioseismic and Magnetic Imager on-board the Solar Dynamics Observatory starting in May 2010. Results. We find significantly nonzero and consistent estimates for the bispectrum using LCT and TD. The strongest nonlinearity is present when the three coupling wave vectors are at the supergranular scale. These are the same wave vectors that are present in regular hexagons, which have been used in analytical studies of solar convection. At these Fourier components, the bispectrum is positive, consistent with the positive skewness in the data and consistent with supergranules preferentially consisting of outflows surrounded by a network of inflows. We use the bispectral estimates to generate synthetic divergence maps that are very similar to the data. This is done by a model that consists of a Gaussian term and a weaker quadratic nonlinear component. Using this method, we estimate the fraction of the variance in the divergence maps from the nonlinear component to be of the order of 4-6%. Conclusions. We propose that bispectral analysis is useful for understanding the dynamics of solar turbulent convection, for example for comparing observations and numerical models of supergranular flows. This analysis may also be useful to generate synthetic flow fields.


Introduction
Supergranular motions are characterized by horizontal velocities of about 300 m s −1 and are coherent features with a length scale of about 30 Mm or harmonic degrees of about l ∼ 120 in the power spectrum (e.g., Rincon & Rieutord 2018, and references therein). Close to the surface, supergranules are observed as regions of flow with horizontal divergence (associated with upflows) surrounded by a network of convergent flows (associated with downflows, e.g., Duvall & Gizon 2000).
The origin of supergranulation as a dominant scale of convection remains unclear (see Rincon & Rieutord 2018, for a review). Supergranulation as a particular length-scale might emerge because of thermal instabilities in the ionization zones (Simon & Leighton 1964), as the largest buoyantly driven scale of convection in the Sun (Cossette & Rast 2016), or the largest convective scale not significantly constrained by rotation (Featherstone & Hindman 2016). Also unexplained is the evolution of the supergranulation pattern, which displays a wave-like behavior (Gizon et al. 2003;Schou 2003) -although this is not the topic of this paper.
Comparisons between theory and observations of supergranular-scale flow fields are usually based on the power spectrum of horizontal surface flows (e.g., Rincon et al. 2017). However, it is clear that the power spectrum does not contain much information about the spatial pattern of the flow field, apart from the scale-dependence of the amplitude. For example, the power spectrum does not contain any information about the skewness of the underlying field (see e.g., Figure 5 in Nordlund et al. 1997). Divergence maps of supergranulation have a positive skewness (Duvall & Gizon 2000) which is due to the asymmetric distribution of upflows and downflows.
Such an asymmetry between up-and downflows is present in idealized hexagonal convection cells that have been used in analytical studies of convection and supergranulation in particular (e.g., Chandrasekhar 1961;Gough et al. 1975;Busse 2007). These hexagonal cells are characterized by a central upflow region surrounded by a network of downflows (Chandrasekhar 1961;Toomre et al. 1977;Busse 2003), similar to supergranulation. The opposite case of strong downflows surrounded by a weaker network of upflows was also considered (Busse 2004(Busse , 2007. For example, theoretical attempts to explain the phase velocity of supergranulation and its contribution to maintaining the near-surface shear layer (Busse 2004(Busse , 2007 are based on decomposing the solution to the Navier-Stokes equations in terms of planforms that are constructed from such patterns in the horizontal direction. The main aim of our work is to provide a statistical characterization of the spatial pattern of supergranulation beyond the power spectrum. In other words, we want to find out whether convective patterns, such as hexagons, can be directly related to the supergranulation data. Additionally, our longer-term aim is to construct synthetic supergranular flow fields. The availability of synthetic flow fields could be useful in helioseismic methods Article number, page 1 of 16 arXiv:2002.08262v1 [astro-ph.SR] 19 Feb 2020 A&A proofs: manuscript no. article for studying subsurface convection (e.g., Hanasoge et al. 2012;Duvall et al. 2014).
Information on the skewness is contained in the bispectrum, which is a next-order generalization of the power spectrum (e.g., Priestley 1981, Section 11.5.1). Just as the power spectrum describes the variance as a function of scale, the bispectrum describes the skewness as a function of two coupling scales (e.g., Watkinson et al. 2017). Measuring the bispectrum of supergranular divergence maps would therefore be useful to characterize the skewness as a function of scale. Furthermore, the bispectrum is directly related to hexagonal patterns because a hexagon has a specific signature in the bispectrum. The bispectrum characterizes the coupling of two wave vectors to a third wave vector, which appears similarly in the nonlinearities in the Navier-Stokes equations. Such nonlinearities are present in the modal equations of convection studied using horizontal patterns (e.g., Chandrasekhar 1961;Gough et al. 1975;Busse 2007).
To clarify the results obtained in this paper, it is necessary to clarify what we mean by linear and nonlinear. By linear, we mean the hypothesis that the time-evolution of supergranular divergence maps was given by a linear partial differential equation (PDE) and a stochastic forcing term with independent Gaussian Fourier components. In this case, the dynamics at the Fourier components of the solution would also be independent. By nonlinear, we mean the hypothesis that the dynamics of supergranular divergence maps was described by a PDE that in addition includes a nonlinear term. In this case, the Fourier components of the solution would not be independent but would show some dependence that might be measurable. Supergranular dynamics are strongly nonlinear according to Rincon et al. (2017) and divergence maps have a large nonzero skewness (Duvall & Gizon 2000). This should be reflected in the bispectrum.
In this work, we apply bispectral analysis to near-surface horizontal divergence maps of supergranulation obtained by Langfellner et al. (2018) using local correlation tracking (LCT, November & Simon 1988) and time-distance helioseismology (TD, Duvall et al. 1993) and data from the Helioseismic and Magnatic Imager (HMI, Scherrer et al. 2012;Schou et al. 2012) onboard the Solar Dynamics Observatory (SDO, Pesnell et al. 2012).

Divergence maps of supergranular flows
We use maps of the horizontal near-surface divergence that were obtained by Langfellner et al. (2018) using Dopplergrams and intensity images from HMI/SDO. The data cover 365 days from May 2010 through April 2011.
Regions of size ∼ 180 × 180 Mm 2 , which are centered at the equator, were tracked for eleven days from about 70 • east to 70 • west at the surface rotation rate (Snodgrass 1984). Dopplergrams and intensity maps were remapped and each eleven-day timeseries was subdivided into 33 eight-hour segments. The seventeenth segment crosses the central meridian.
Using the intensity images and LCT of granules (November & Simon 1988), Langfellner et al. (2018) obtained estimates of the two horizontal flow components for the same spatial and temporal coverage. This results in maps with 97×97 pixels, each of size (5 × 0.348 Mm) 2 . These maps span a slightly smaller region of size ∼ 170 × 170 Mm 2 . From these maps, we computed the horizontal divergence by taking derivatives using Savitzky-Golay filters (Savitzky & Golay 1964). In the Savitzky-Golay filters, we used a different filter width (3 instead of 15 pixels) and polynomial order (2 instead of 3) compared to Langfellner et al. (2018) to keep the smoothing as small as possible.
For each segment, Dopplergrams were filtered in the Fourier domain using an f -mode ridge filter (Langfellner et al. 2015) which includes power from wave numbers 300 < kR < 2600 with a peak at kR ≈ 1100. Here, R is the solar radius and k is the horizontal wave number. Next, f -mode point-to-annulus (Duvall et al. 1996) travel-time maps (Gizon & Birch 2004) were obtained from the filtered Dopplergrams in the inward minus outward sense. This resulted in maps with 512 × 512 pixels, with each pixel being (0.348 Mm) 2 in size. These travel-time maps are known to be a proxy of near-surface horizontal divergence of the fluid motion (e.g., Langfellner et al. 2015). To remove the smallest-scale noise from the data, we low-pass filter this dataset in the Fourier domain and keep Fourier components with kR < 1200.
In addition, we obtained a smoothed version of the (not lowpass filtered) TD maps in order to better compare to earlier results. We used a two-dimensional Gaussian window with full width at half maximum of 5.22 Mm (15 pixels), which is similar to the smoothing done by Duvall & Gizon (2000).
For all data sets discussed here, the horizontal mean of each map has been removed. Figure 1 shows example LCT and TD maps. The most prominent features in these maps are outflows due to supergranules, surrounded by inflows. Figure 1 also shows histograms of the maps. Both LCT and TD histograms are asymmetric about zero and have a nonzero skewness, which is well known (e.g., Duvall & Gizon 2000). The skewness of the ith map X (i) (x) is where the variance is Here, x = (x, y) denotes the horizontal location, · is a horizontal average, X (i) (x) = 0 by construction, X (i) (k) denotes the horizontal Fourier transform of X (i) (x) -see Appendix A -, and h k is the resolution in Fourier space. We use the same symbol for a function and its Fourier transform. We useγ for the (mean) skewness computed from the entire dataset. The skewness depends slightly on disk position. In the following, we therefore only use the 11 locations that are closest to disk center for each segment, ranging from about 23 • east to 23 • west, where the skewness is approximately constant. For the entire datasets, this results in a skewness ofγ = 0.277 ± 0.001 for the LCT maps,γ = 0.421 ± 0.002 for the TD maps, and γ = 0.512 ± 0.003 for the smoothed TD maps. Duvall & Gizon (2000) reported a skewness of 0.42 for TD maps. As we see in Section 5, different filters applied in the spatial domain may result in different values for the skewness. The fact that the divergence maps have a nonzero skewness suggests that different Fourier components of the divergence field cannot be independent. If they were independent, the data would have zero skewness, just like a random wave field. As the statistical properties of the supergranular divergence field are translation invariant, it is clear that two Fourier components are uncorrelated. As we see in Section 4, the nonzero skewness of the data can be related to nonzero correlations between three Fourier components, which leads to the definition of the bispectrum. As such correlations between three Fourier components are also present in simple patterns of convection, we now look at these patterns before introducing the bispectrum.

A brief account of convection patterns
We briefly discuss a few patterns used in analytical studies of convection. Related to the specific case of solar supergranulation, Busse (2003Busse ( , 2004Busse ( , 2007 studied hexagonal convection cells. This work is based on earlier and more general studies of Boussinesq convection that used convection patterns such as rolls, squares, and hexagons (e.g., Palm 1960;Chandrasekhar 1961;Toomre et al. 1977) and the decomposition of convective motions into planforms, which are made up from such patterns in the horizontal direction (e.g., Chandrasekhar 1961;Gough et al. 1975). Due to the continuity equation and under the assumption of a steady flow, the vertical flows and the horizontal divergence show the same pattern close to the surface.
We now look at a few possible patterns in detail. The top row in Figure 2 shows simple convective rolls. Such a roll is given by a pattern X(x) for which where k 0 = (k x,0 , k y,0 ) is a chosen wave vector and we use the notation +c.c. to indicate that the previous terms are repeated with a complex conjugate. The top row in Figure 2 shows three examples of rolls for three different wave vectors k 0 , l 0 , m 0 , which are kept fixed for the entire section and are shown in the small subpanels of Figure 2 as colored arrows. These wave vectors form a triad, that is k 0 + l 0 = m 0 . The wave vectors chosen here have the same length and are at an angle of 60 degrees (up to a small deviation caused by the discretization). We now consider different patterns of the form where A is the (possibly complex) amplitude of the third Fourier component. This pattern can be seen as two independent rolls (k 0 and l 0 ) plus a nonlinear component (m 0 ). This nonlinear component is given by a quadratic coupling (Kim & Powers 1979), where A is a coupling coefficient. If one thinks of X(m 0 ) as being due to such a coupling, the nonlinear fraction of the variance is which is given in the panels in the bottom row in Figure 2. A hexagonal pattern is given by A = 1; see bottom left panel in Figure 2. Equation (4) with A = 1 is equivalent to the horizontal dependence in Equation (249) in Section 16c of Chandrasekhar (1961). Hexagonal patterns have the advantage that they can be used to model a convective pattern that is characterized by a central upflow region surrounded by a network of downflows (Chandrasekhar 1961;Toomre et al. 1977;Busse 2003), similar to supergranulation. In addition, hexagonal planforms are not symmetric with respect to a change in sign and they produce a nonzero skewness in the horizontal divergence, similar to the supergranular divergence maps (see Section 2); see the skewness of each pattern given in Figure 2.
The case A = −1 corresponds to a pattern of inverted hexagons (see right panel in the bottom row of Figure 2). This corresponds to downflows surrounded by a network of upflows as considered by Busse (2004Busse ( , 2007. The case A = 0 corresponds to rectangles, and we can vary A continuously between zero and one to obtain patterns with different coupling amplitudes, intermediate between rectangles and hexagons; see the example labeled "small coupling" (A = 0.3) in Figure 2.
In addition to these patterns, it is possible to tessellate space by a pattern generated from any combination of three wave vectors with k 0 + l 0 = m 0 , each with a separate, possibly complex Fourier amplitude. For a given pattern, the phases of X(k 0 ) and X(l 0 ) may be jointly put to zero by a translation in real space. Statistically, we do not expect any preferred direction for supergranular divergence maps near the equator, and it is therefore reasonable to assume |X(k 0 )| = |X(l 0 )| for illustrative purposes.
The Fourier amplitudes of X may then be normalized to one as done above.
For the patterns considered here, one pattern fills the whole domain. On the Sun, however, every supergranule has its individual shape. It is therefore possible that a coupling is present not only at the Fourier components with equal length (|k 0 | = |l 0 | = |m 0 |, as for regular hexagons), but potentially for any coupling triad of Fourier components. As a consequence, the signature in Fourier space is a more complex combination of the previous characteristics at several Fourier components. This motivates the definition of the bispectrum.

Bispectral analysis in two dimensions
Here we introduce and summarize the definition of the bispectrum and related quantities for two-dimensional random processes. This part is mainly based on Section 11.5.1 in Priestley (1981), Kim & Powers (1979), and Chandran & Elgar (1990). Later in this section, we give a few examples and illustrate the definitions with the help of the patterns introduced in the previous section. In addition, we introduce a method for generating data with a given bispectrum.

Definition of the bispectrum
Here we consider a random field X(x) that depends on a twodimensional discretely sampled spacial variable x. We assume that all statistical properties of X(x) are translation invariant. This is the generalization of a stationary process to two (spatial) dimensions. Since X(x) is a random variable, it induces the definition of an expectation value, which we denote by E[·]. The spatial Fourier transform of X(x) is again denoted by X(k). By X (i) (x), we denote an individual independent realization of X(x), that is, an individual map.
The bispectrum is the next higher-order statistic beyond the power spectrum, and is defined as The three-point correlation function is the Fourier transform of the bispectrum, just as the two-point (auto-)correlation function is the Fourier transform of the power spectrum: where the expectation value on the left-hand side of Equation (8) does not depend on x 0 due to the assumed translation invariance of X(x).
Similarly, the skewness of the random field X(x), γ 1 , which is the third-order moment normalized by the variance, is the sum of the variance-normalized bispectrum, where the power spectral density P is defined by Table B.1 provides a summary of key terms defined in this paper for later reference.
In the following, we assume that there is no preferred horizontal direction in the data. In this case, the bispectrum of the divergence field is real. The expectation value E[X(k)X(l)X * (m)] is zero if m k+l because of the assumed translation invariance. Therefore, the triadic relation m = k + l is implicitly understood in the following and the third variable is omitted in bispectrumrelated quantities.

Estimating the bispectrum
In principle, the bispectrum can be estimated using Equation (7) by simply averaging over independent realizations X (i) (k) of the data (e.g., Kim & Powers 1979). See Appendix B for a formal definition of the estimatorB, which is obtained this way. We take this approach although it may not be optimal; Nikias & Raghuveer (1987) discuss other possible approaches.
In order to assess the significance of the estimated bispectrum, it is necessary to consider the bispectral coherence (also known as the bicoherence) b, which is a normalized bispectrum (e.g., Kim & Powers 1979 The definition of b is independent of the convention used to define the Fourier transform because it is normalized and because the bispectrum is real in our case. We see later that the squared bicoherence b 2 quantifies the fraction of power which is due to quadratic nonlinear coupling at m = k + l. For such a coupling to be present in the data and a bispectral measurement to be significant, it is therefore necessary that the bicoherence be significantly different from zero. The variance of an estimatorb of the bicoherence that is built by simply averaging over independent realizations for estimating expectation values in Equation (12) see Appendix B -, is bounded by (Kim & Powers 1979) Var where N B is the number of independent data realizations used in the estimate. The statistical 1σ error in the bicoherence estimate is therefore σ b = 1/ √ N B . Additionally, we average the results azimuthally -see Appendix B.1 -which results in a typical σ b = 0.0035 at |m|R = 155.5.

A model with a weak quadratic nonlinearity
We here present a model that helps us to interpret the measured bispectrum signal and to generate synthetic data that have approximatively a given bispectrum. There are several ways to generate data with a given power spectrum and bispectrum; see for example Contaldi & Magueijo (2001) and Wagner et al. (2010). Such a method is not unique (e.g., Wagner et al. 2010). Following Wagner et al. (2010), we assume that the data can be represented as where X G is a Gaussian field and X NL a small nonlinear component. In the Fourier domain, the Gaussian field follows a complex Gaussian distribution, where N k are Gaussian random variables with independent real and imaginary parts, zero mean and unit variance. We set N −k = N * k to ensure that X G is real, but otherwise the N k are independent for different k.
We assume that the nonlinear component is obtained from the Gaussian one by an arbitrary quadratic nonlinearity, where A k,l denote coupling coefficients of modes k and l = m−k.
To guarantee that X NL is real, the coupling coefficients have to fulfill A −k,−l = A * k,l . Synthetic data with a given bispectrum B 0 (k, l) can then be generated by setting Appendix C outlines further details of this model.

The bispectrum of hexagons
For the patterns introduced in Section 3, the bispectrum of the patterns is given by B(k 0 , l 0 ) = h 2 k A and the coupling coefficient as defined above is given by A k 0 ,l 0 = A/6. The factor of onesixth appears because in the model introduced in this section it is assumed that a coupling occurs at several Fourier components. In the case of only one triad of coupling Fourier components as for the patterns from Section 3, one has A k 0 ,l 0 = A.
The bicoherence of the patterns with nonzero coupling coefficient A is |b(k 0 , l 0 )| = 1. This is because the signal at Fourier component m 0 is entirely due to the nonlinear coupling, and there is no noise or any other independent signal at this component.

The impact of noise on bispectral estimates
If noise is present in the data, this reduces the bispectral estimates. If we assume for example that where (x) is the noise, which we assume to have independent Gaussian Fourier components and to be independent of X G and X NL . If the noise is simply added to the data, the bispectrum (as an expectation value) does not change; see Eqs. (7) and (C.6). However, the estimated coupling coefficient A k,l and the bicoherence b are reduced in amplitude because of the increase in power from the noise; see Eqs. (12) and (17). It is therefore not possible to discriminate between a dataset with small noise and a small coupling coefficient, and another one with largeramplitude noise and a larger coupling coefficient. In other words, for a given power spectrum, which is due to the signal plus noise, stronger noise results in a smaller bispectrum. For the case of possible hexagonal convection cells, it might therefore be that their signal in the bispectrum is reduced by noise.

Results
In this section, we show results for data averaged over the azimuthal direction of m. We do not find any direction-dependence of the bispectrum. Figure 3 shows the estimated bicoherenceb from LCT. The bicoherence is significantly nonzero for a large number of combinations of wave vectors. The bicoherence is largest at wave vectors of similar amplitude, |k| ≈ |l|. In general, it is strongest if the coupling vectors are close to the supergranular peak in the power spectrum, in our case at mR = 155.5. At these wave vectors, the amount of power due to a nonlinear component in the data is largest. Figure 3 displays results as a function of k = (k x , k y ) and |m| for selected values of |m|. The results are averaged in such a way that m (orange arrow) is rotated to be in the x direction and thereby is fixed; see Appendix B. Each data point is then plotted at the location corresponding to k relative to the direction of m. For each k, the third coupling Fourier component is then given by l = m − k. The strongest coupling is present at a location where the triad k + l = m corresponds to a hexagonal pattern at the supergranular scale; see middle panel in Figure 3. Figures F.2 and F.4 illustrate these results further.

The bispectrum of supergranulation
The results for TD are very similar for |m|R 300 (Figure F.1 and second column in Table 1). For |m|R > 300, the signal considerably weakens compared to LCT and is dominated by noise and leakage due to the spatial smoothing. This is the case for both TD datasets.
Estimates for the bispectrum are qualitatively very similar to the bicoherence; see Figure F.3. As the sign of the bispectrum is predominantly positive, the dominant part of the quadratic interaction results in a positive contribution to the skewness; see Section 4.1. This is consistent with the positive skewness of the data; see Figure 1. The peak in the bispectrum coincides with the peak in the bicoherence. The coupling modes that produce the strongest nonlinearity therefore also result in the largest contribution to the positive skewness in the data.
A positive sign in the bispectral coherence at wave vectors similar to hexagons indicates that the divergence field is preferentially given by outflows surrounded by a network of inflows (see the patterns discussed in Section 3). The outflows are on average slightly stronger while the inflows occupy a larger area. However, in individual realizations of the data, the phase of X ( j) (k)X ( j) (l)X ( j) * (m) varies substantially. As a consequence, the opposite case of inflows surrounded by outflows is also present in the data, which can be seen in Figure 1.
In addition, there are regions with negative bispectrum and bicoherence. In those regions, the three vectors k, l, m are approximatively parallel, with one component being nearly twice as long as the other two. Figure F.2 shows one example for this in the central feature in the central panel (|mR | = 310.9, at k = l = m/2). Due to the symmetry relations B(k, l, m) = B(−k, m, l) * = B(m, −k, l) * and as the bispectrum is real in our case, this location has the same bispectral value as a location near the negative peaks in the middle panel in Figure 3. The pattern associated with these negative values and the associated Fourier components is a pattern made of asymmetric rolls, where the regions of outflow are slightly larger than the regions of inflow. This is consistent with the observation that hexagons with a positive bicoherence also have regions of outflow that are larger in diameter than the inflow regions.

Generating synthetic data
We generate synthetic data with approximately the given measured power spectrum and bispectrum using the method described in Section 4.3. As there is not a unique way to do so, this is only one possible model for the data. Figure 4 shows an example realization of the synthetic LCT data (second row, left column). For comparison, the panel directly below shows an example realization of the LCT data. The synthetic and the real data are very similar and a difference can hardly be seen by eye, apart from the fact that the synthetic data are periodic. Figure F.5 shows the corresponding results for TD, which are similar.
The bicoherence and bispectrum estimated from the synthetic data agree well with the estimates from the real data (Figure 5) with excellent agreement for the bispectrum. The bicoherence of the synthetic data is slightly lower due to the power in the synthetic data being slightly larger than the real data; see Equation (C.4), and below for further discussion of this topic.
As we know the nonlinear and Gaussian components of the synthetic data for every realization (e.g., middle and right panels in the top two rows in Figure 4), we determine the average fraction of the total power (or equivalently of the total variance) stemming from the nonlinear component; see Section D. We find a value of f NL = 4.0%, see right-hand column of Table 1.
In addition, we estimate the nonlinear fraction of power from the bicoherence using Equation (D.6). For the synthetic LCT data, we estimate a value off b NL = 3.6%; see third to last column in Table 1. This value is roughly consistent with the true value given the bias introduced by the higher-order term in Equation (D.5). This bias is due to the actual power spectrum of the synthetic data being larger than the real data by a factor of 1 + f NL (k) (see Eq. (C.4)). This reduces the value of the bicoherence by the same amount and results in a small negative bias; see also   Fig. 3. Bispectral coherence b(k, |m|) from LCT for selected values of |m|R = 77.7, 155.5, 233.2 ± 12.3. The vector mR is fixed for each plot and given by the orange arrow. The bispectral estimates were averaged over the direction of m, which is equivalent to assuming that there is no preferred horizontal direction and which results in a real bispectrum. The direction of m in the plot was chosen simply for convenience. The Fourier vectors for a triad k + l = m that corresponds to hexagons is given by the other two wave vectors (red: k, blue: l). The angles between the displayed Fourier vectors k, m, and l are not exactly 60 degrees because of the discretization. White color denotes locations where no estimate was obtained because one of the three vectors has zero length (as the mean was removed from each map, the bispectral coherence is not defined in this case). See Figure 5 ), estimates of the skewness from the data (γ) and from the bispectrum (γ B ), as well as the maximal value obtainable for low-pass filtered data (γ filt,max ). By modeling the data with a Gaussian field plus a weak quadratic nonlinear component, we obtain estimates of the fraction of variance due to this nonlinear component (f b NL ), the maximal value obtainable when low-pass filtering the data (f filt,max NL ), and the true value of f NL , which is known for the synthetic datasets only. See Figures 6 and 7 for howf b NL andγ B depend on low-pass filters applied to the data. The estimatef b NL is biased and differs from the true f NL due to second-order effects, see Eqs. (D.5) and (D.6). For the smoothed TD data,γ was computed here by taking the apodized region in Figure 1 into account to show the consistence withγ B .
Under the assumption that this model is a good model for our data, we similarly estimate the fraction of quadratic nonlinear power in the data. We obtain values of 4.2% for LCT, 3.8% for TD, and 5.6% for the smoothed TD data (see also Table 1). However, this result depends on whether the data are filtered in wave-number space. This is important because both measurements effectively smooth the flow field. To test this, we estimate f NL for different low-pass filters by limiting the sums over m and k in Equation (D.6) to |k|, |m|, |m − k| ≤ k max . Figure 6 shows the resulting dependence off b NL on (kR ) max = k max R . For LCT and the smoothed TD data, the fractional power of the nonlinear component increases until it levels off, while for TD,f b NL decreases after attaining its maximum. This is because at large wave numbers, TD is dominated by noise, which was filtered out in the smoothed data.
In addition to this, we estimate the quadratic nonlinear component directly from the divergence field using Equation (E.1). This estimate is biased and may be influenced by higher-order correlations, for example of six Fourier components, which we do not take into account in our study. It therefore serves as an orientation. Figure 4 (right column) shows how these estimates of the nonlinear component relate to the corresponding realizations of the data. The estimates of the nonlinear component for the real data (right column, bottom panel) and for the synthetic data (right column, top panel) are qualitatively similar. Their relationship to the complete synthetic or real data is similar as well. In both cases, outflows (red contours) are seen in the nonlinear component at locations where an outflow is surrounded by a ring of inflows, or when an inflow is surrounded by a ring of outflows. Inflows (blue contours) are seen mostly in the network between supergranular outflows, either amplifying the network inflow, or suppressing outflows that appear in a network-like location (especially visible in the synthetic case).
Taken together, we conclude that at least 4-6% of the variance in the data is due to quadratic nonlinear interactions, under the assumption that the data are well explained by a random Gaussian field plus a weak quadratic nonlinear component.

Estimating the skewness from the bispectrum
Finally, we obtain estimates of the skewness from the estimated bispectrum using Equation (9); see the fourth column in Table 1 (γ B ) and Appendix B for a definition of the estimator. The results are consistent with the estimates from the data,γ, shown in the third column of Table 1 Fig. 4. Example synthetic data (top two rows) generated from the measured bispectrum and example LCT data (bottom two rows). The different columns are the (synthetic or real) data (left), contours of the estimated quadratic nonlinear component overplotted over the data (right), the residual after subtracting the estimated nonlinear component from the data (middle column, two bottom rows), and the Gaussian field from which the synthetic data were generated (middle column, top two rows). Contour lines in the right column indicate levels of ±1σ X NL (dashed) and ±2σ X NL (solid) of the nonlinear component (positive: red, negative: blue). In addition, the skewness is given in the corresponding histogram; both were computed from the entire dataset. See Figure F.5 for the corresponding TD results. low-pass filter applied to the data; see Figure 7. The estimated skewness of the filtered dataset is obtained using the estimated bispectrum and by limiting the summation in Equation (9) to |k|, |l|, |k + l| ≤ k max = (kR ) max /R and by estimating σ using the similarly filtered power spectrum.
For the synthetic data, the two estimates of the skewness from the data and from the bispectrum agree similarly well; see Table 1. However, the skewness in the synthetic data is slightly reduced compared to the real data. This again is due to the total power in the data being larger than the original power by a factor of 1 + f NL (k) due to second-order effects in the model. This difference could be compensated by an iterative approach in which the power spectrum of the Gaussian component would be adjusted by 1 − f NL (k). The coupling coefficient A k,l would have to be adjusted in this case as well to guarantee that the resulting bispectrum is unaltered.

Conclusions
We measured the bispectrum and bispectral coherence of the horizontal divergence of near-surface supergranular motions from LCT and TD. The measurements for both datasets agree well and yield significantly nonzero values.
The bispectrum is the next higher-order statistic beyond the power spectrum (e.g., Priestley 1981, Section 11.5.1); it characterizes how the skewness in a dataset is generated by the quadratic interaction of wave vectors that form a triad k + l = m. The bispectral coherence, which is a power-normalized bispectrum, characterizes the strength of a quadratic nonlinear component in the underlying field as a function of the coupling Fourier components, relative to the total power at this Fourier component.
The bispectral measurements are most significant at supergranular scales, when the three coupling Fourier vectors are of the same magnitude and thus at an angle of about 60 • (middle panel in Figure 3). These Fourier vectors are the same as those of regular hexagons at the supergranular scale. In other words, supergranular scales are the scales where the horizontal divergence pattern looks closest to hexagonal among the scales probed by LCT and TD. At the same Fourier components, the bispectrum is of maximal amplitude and positive, which means that these Fourier components predominantly generate the positive skewness in the dataset (middle panel in Figure F.3). The positive skewness is due to a preference for broad outflows (positive divergence) surrounded by a weaker network of inflows (negative divergence), which is consistent with the typical appearance of supergranulation (see Figure 1). The bispectral measurement is consistent with the skewness of the supergranular divergence maps (see Table 1).
We use the measured bispectrum to generate synthetic data which look very much like supergranulation (Figures 4 and F.5). The synthetic data have the same bispectrum within error bars and a similar skewness ( Figure 5 and Table 1). To do this, we use a model that consists of a Gaussian field with independent Fourier components plus a weakly nonlinear component, which is generated from the Gaussian one by a quadratic coupling equivalent to the measured bispectrum. This method may be extended in the future to generate synthetic supergranular flow fields. These may be used, for example, for testing helioseismic methods.
Based on this weakly nonlinear model, we find that the nonlinear component is strongest at the supergranular scale. We find that at least 4-6% of the variance is due to this component. This estimate depends on how the small-scale noise is filtered out and it is biased by possibly a few percent ( Figure 6 and Table 1).
However, for all low-pass filters tested, the estimates are below 6%.
We find that it is at the supergranular scale that the nonlinear component contributes the largest fraction to the variance. In this sense, the nonlinearities have the strongest effect at supergranular scales. However, the coupling coefficient A k,l (see Eq. (17)), which is the bispectrum weighted with the inverse of the squared power, increases towards smaller scales.
The bispectral measurements presented in this work may be understood in terms of the example patterns introduced in Section 3. The skewness and the strength of the nonlinear component in the supergranular divergence maps is of similar order to the pattern termed "small coupling" (see central panel in the bottom row in Figure 2). This pattern is essentially the same as regular hexagons, but one Fourier component is lowered by a factor of about three.
The relationship between our finding that 4-6% of the variance is explained by the nonlinear component and the conclusion of Rincon et al. (2017) that supergranular dynamics are strongly nonlinear has yet to be determined. Here we present observations of near-surface flows. In future work, the depth-dependence of the skewness and the nonlinear component may be studied and the evolution in the time domain may be included in the analysis.
By measuring the bispectrum, we provide a statistical characterization both of the spatial pattern of supergranulation and of the nonlinear interaction in supergranular flows. Together with future studies, this may result in important insight into how supergranulation emerges as a particular length scale on the Sun because it lends itself as a test that simulations and theoretical models of supergranulation have to pass. Such a comparison can realistically be made because one needs about 17 days of simulated solar time to measure a bispectral peak as found in this study at a signal-to-noise ratio of about three.
Quadratic nonlinear interactions play an important part in turbulent convective dynamics. This is because the underlying triadic interactions, which couple two Fourier modes to a third one, are responsible for kinetic energy transport in turbulent flows (e.g., Alexakis & Biferale 2018). This may be used in future studies to estimate the direction of the kinetic energy cascade in supergranular flows and may shed light on the still mysterious origin of supergranulation.
To shorten notation, we write = m∈M |m| N j=1 .
(B.2) Following Kim & Powers (1979), our estimate of the bispectrum, B, can then be written where the vectorsk,l,m are obtained from k, l = k − m, by a rotation of − arcsin m x |m| , see Appendix B.1. Our estimate for the bicoherence then becomeŝ (B.4) and similarly for the power-normalized bicoherence, As a consequence, one haŝ

Appendix C: Details on the weakly nonlinear model
We here show the basic steps for demonstrating that the bispectrum of the synthetic data generated with the weakly nonlinear model of Section 4.3 is the same as the input bispectrum to leading order.
To do that, one first observes that different Fourier components of the nonlinear component are uncorrelated, where δ m,m is the Kronecker symbol and the nonlinear component is uncorrelated with the Gaussian component, Consequently, the total power is proportional to where f NL (k) is the ratio of then nonlinear power with regard to the background power of the Gaussian component at k,  Symbol name definition / formula see Eq. # major properties X(x) random field random variable Eq. (10) Eq. (D.5) Notes. Estimators for these quantities are given by averaging realizations for estimating the expectation values in the above formulas; see Appendix B. f (x) denotes a spatial average of f . The data have zero spatial mean by construction. (a) The bispectrum and power spectrum as defined here are actually bispectral and power spectral densities. (b) Also known as bispectral coherence. (c) NL is shorthand for nonlinear.
Using Eq. (17) and using the symmetry relation B 0 (k, l, m) = B 0 (k, −m, −l) = B 0 (l, −m, −k), which is fulfilled by every bispectrum, we then obtain where the correction term is given by If the nonlinearity is weak, the term O(|X NL | 3 ) can be neglected and this model is suited to generate data with a given bispectrum. If we generate synthetic data for a bispectrum B 0 =B, which was estimated from the data, we set A k,l to zero if the estimated bicoherenceb is not significantly different from zero. Using too many nonsignificant coupling terms results in additional random nonlinear terms in the data. In this work, we use a threshold for |b| of about 2σ b (1.7σ b for LCT and 2σ b for TD). This threshold was chosen to balance the removal of noise with keeping the signal. This procedure is in general agreement with the discussion in Kim & Powers (1979).

Appendix D: Estimating the power of the nonlinear component
Given the above model, one can estimate the fraction of power due to the nonlinear interaction based on the bispectral estimates.
To do that, one considers the power at a Fourier component m, and finds where b P is similar to the bicoherence, but normalized by the power spectra (compare to Eq. (12)), = 1 + δ k,l b(k, l) + higher order terms. (D.4) To first order, the fraction of power of the nonlinear component at m is thus proportional to the squared bicoherence, b 2 , summed over all triads that add up to m (except at k = l). The factor of 1/18 stems from the multiple permutations that are present when computing expectation values of four multiplied Gaussians in Equations (C.6) and (D.2).
The fractional variance from the nonlinear component among the total field can similarly be computed, The fraction of the total variance that is due to the nonlinear component, f NL , is thus a power-weighted average of the fractional nonlinear power at m, plus a small higher-order correction. We thus estimate the fraction of power due to the nonlinear component bŷ whereb P is given in Appendix B. From Equations (D.5) and (D.6), it is clear thatf b NL is biased due to the higher-order correction in Equation (D.5). When computing the sum in Equation (D.6), we again only keep terms where the bicoherenceb is significantly nonzero by at least 2σ b . This is consistent with using the same threshold for generating synthetic data as discussed at the end of Appendix C.
The bispectrum B and its variance-normalized version B/σ 3 are effectively bispectral densities (see Eq. (9), which includes an h 4 k term). As such, they are in principle invariant under changes in the box size, once the resolution of the Fourier grid is much smaller than the scale on which B varies. The same is not true for the bicoherence (Eqs. (D.2) and (D.5) do not include an h 4 k term).

Appendix E: Estimating the nonlinear component
In addition, it is possible to estimate the nonlinear component of the divergence field at each realization. One starts from Equation (16) and simply inserts the data X instead of the Gaussian component, where A k,m−k is obtained from the estimated bispectrumB and Equation (17) with B 0 =B. Again, we keep only terms where the estimated bicoherenceb is significantly nonzero. The estimator X NL fulfillŝ We note thatX NL is a biased estimator.

Appendix F: Additional figures
To further illustrate our results from Section 5, this appendix includes a few additional figures.
Appendix F.1: Results for TD Figure F.1 is essentially the same as Figure 3 and shows results for the bicoherence b for the (unsmoothed) TD data for selected values of |m|R that are closest to those in Figure 3. Results for the smoothed TD dataset are not shown but very similar with slightly less noise and slightly larger values, which is due to the apodization. Both TD results compare well with LCT. Figure F.5 shows the same comparison of synthetic and real data for the unsmoothed TD dataset that Figure 4 shows for LCT.

Appendix F.2: Further details on LCT results
In Figure F.2, we show results for the bicoherence for more values of |m|R as shown in Figure 3. Figure F.3 shows results for the bispectrum. Finally, in Figure F.4, we show results for the bicoherence as a function of the amplitude of all coupling Fourier vectors, that is of (|k|, |l|, |m|), for the same selection of |m| as in Figure F Estimates of the bicoherence b from LCT as a function of (|k|, |l|) for the same values of |m| as in Figure F.2, also noted in the top of each panel. White color denotes locations where either the triadic relation k + l = m cannot be met or where one of the three vectors has zero length (as the mean was removed from each map, the bicoherence is not defined in this case).