Evolution of solar surface inflows around emerging active regions

Solar active regions are associated with Evershed outflows in sunspot penumbrae, moat outflows surrounding sunspots, and extended inflows surrounding active regions. The latter have been identified on established active regions by various methods. The evolution of these inflows and their dependence on active region properties as well as their impact on the global magnetic field are not yet understood. We aim to understand the evolution of the average inflows around emerging active regions and to derive an empirical model for these inflows. We analyze horizontal flows at the surface of the Sun using local correlation tracking of solar granules observed in continuum images of SDO/HMI. We measure average flows of a sample of 182 isolated active regions up to seven days before and after their emergence onto the solar surface with a cadence of 12 hours. We investigate the average inflow properties with respect to active region characteristics of total flux and latitude. We fit a model to these observed inflows for a quantitative analysis. We find that converging flows of around $20$ to $30$ m/s are first visible one day prior to emergence, in agreement with recent results. These converging flows are present independently of active region properties of latitude or flux. We confirm a recently found prograde flow of about $40$ m/s at the leading polarity during emergence. We find that the time after emergence when the latitudinal inflows increase in amplitude depends on the flux of the active region, ranging from one to four days after emergence and increasing with flux. The largest extent of the inflows is up to about $7 \pm 1^\circ$ away from the center of the active region within the first six days after emergence. The inflow velocities have amplitudes of about $50$ m/s.


Introduction
Active regions (hereafter ARs) are patches of magnetic field at the surface of the Sun. They are the host to sunspots visible in white light images (van Driel-Gesztelyi & Green 2015) and the site of a large variety of phenomena, such as flares and coronal mass ejections. Theories of where and how active regions form include a buoyant rise of magnetic flux tubes from the bottom of the convection zone (see review by Fan 2009), formation in the bulk of the convection zone (Nelson et al. 2013), or in the near-surface layers (Brandenburg 2005).
At the surface, active regions are advected on large spatial scales by differential rotation (Snodgrass 1983) and the meridional flow (Duvall 1979). On smaller scales, magnetic field is diffused by the turbulent convective motions of granulation and supergranulation (Leighton 1964). Methods to infer the largescale flows include tracking the motions of small-scale convective or magnetic features, which are displaced by the larger-scale motions, as well as a variety of tools for the analysis of solar oscillations, which are known as local helioseismology (see review by Gizon & Birch 2005).
In addition, Gizon et al. (2001) found that evolved active regions are surrounded by horizontal converging flows, using timedistance helioseismology (Duvall et al. 1993) to study flows near the surface of the Sun. These surface inflows typically extend to about 10°from the active region and have horizontal velocities of up to 50 m s −1 (Gizon et al. 2001;Haber et al. 2004;Zhao & Kosovichev 2004;Hindman et al. 2009;Löptien et al. 2018;Braun 2019). What drives these inflows is an interesting question. One possible mechanism is that enhanced emission of radiation in the magnetic field structures leads to a horizontal temperature gradient which could drive the flows (Spruit 2003).
The inflows towards active regions are important as they affect the advection and flux cancellation of ARs, and may counteract diffusion (De Rosa & Schrijver 2006). De Rosa & Schrijver (2006) and Martin-Belda & Cameron (2017a,b) included models of the inflows into surface flux transport simulations, showing that they can provide a feedback mechanism which saturates the amplitude of the global magnetic field. This could explain observed variations in solar cycle strengths.
Several studies applied local helioseismic techniques to investigate different aspects of the properties of these flows. Haber et al. (2004) studied their depth dependence, identifying a transition to outflows in deeper layers for some AR. Komm et al. (2011Komm et al. ( , 2012 studied vertical flows and vorticity, finding that emerging flux is associated with weak upflows of less than 1 m s −1 and increasing strength of vorticity, and that decaying flux is associated with downflows and decreasing strength of vorticity. Recently, Braun (2019), using helioseismic holography (Lindsey & Braun 2000), investigated a sample of 336 ARs and derived flow patterns for subgroups of increasing magnetic flux. This work showed that the amplitude of the flows increases with magnetic flux, and reported a retrograde flow component mainly at the poleward side and, less distinct, the equatorward side of the AR. Löptien et al. (2017) used Local Correlation Tracking (LCT, November & Simon 1988) of the convective granulation pattern on the solar surface to measure inflows around an average high-flux AR. The regions in their sample have fluxes larger than 5.9 × 10 21 Mx. Their work confirms inflows of about 20 to 30 m s −1 spanning about 10°from the AR. Birch et al. (2019) found a compact (less than 2°) converging flow of about 40 m s −1 in the day before active region emergence, using both LCT and helioseismic holography as independent methods for flow inferences on the surface. Comparisons of flows derived with LCT and local helioseismology showed that they are in good agreement (Švanda et al. 2013;Birch et al. 2016).
In this study, we aim to measure the evolution and structure of the inflows around active regions at the solar surface during the early stages of the active-region evolution, that is before, throughout, and in the days after emergence. We also investigate potential differences in the inflows related to the total unsigned flux and the latitude of the active regions.
We perform a statistical study by averaging over a large sample of ARs. For this, we make use of the SDO Helioseismic emerging active regions (SDO/HEAR) survey . By using this sample rather than performing a case study, we decrease background noise as well as peculiarities of individual emerging active regions (EARs), which can have a dominating effect on the flow structure.
The paper is structured as follows: In Sect. 2 we describe the sample of emerging active regions as well as the flow measurements. Sect. 3 illustrates the data processing that we applied to derive the evolution of flows around ensemble averages of ARs. Sect. 4 presents our results, followed by a discussion (Sect. 5).

Data
2.1. The sample of emerging active regions Schunker et al. (2016) describe the HEAR survey in detail. Here, we summarize the main aspects.
The HEAR survey comprises 182 emerging active regions during the time from 2010 to 2014. They were selected from the National Oceanic and Atmospheric Administration (NOAA) solar region reports on the basis of several criteria, requiring that the AR a) reaches a total sunspot area of at least 10 micro hemispheres (µH; 1 µH ≈ 3 Mm 2 ), b) has its first NOAA record within 50°of the central meridian, and c) emerges into the quiet Sun, without pre-existing large-scale flux within a radius of 18°. These criteria were chosen to minimize contamination by pre-existing magnetic field, and to reduce projection effects. The emergence time of each active region is defined as the time at which the region has reached 10 % of the maximum unsigned flux within the first 36 h after the active region is first recorded by NOAA. The survey also provides the Carrington longitude and latitude as well as the start and end times of the observations for each region.
In addition to the emerging active regions, the survey includes one quiet Sun control region for each EAR. These control regions match their corresponding EARs in latitude and distance from the central meridian at a mock-emergence time, at which there are no numbered SHARP regions within a radius of 18°o f the control region. These control regions are important as a comparison against which any measurement on the EARs can be tested.

Flows inferred from local correlation tracking
We use an updated version of the flow maps generated by Löptien et al. (2017), who describe their processing procedure in detail. Here, we summarize the important aspects. Löptien et al. (2017) applied Local Correlation Tracking on full-disk continuum intensity images from SDO/HMI (Schou et al. 2012). Specifically, they ran the FLCT code (Welsch et al. 2004;Fisher & Welsch 2008) on hmi.ic_45s series data, starting on 24 April 2010 and ending on 27 April 2016. In order to reduce computation time, only one flow measurement every 30 minutes was calculated, by cross-correlating pairs of images that are 45 seconds apart. Each measurement consists of a map of velocities v x in x-direction and a map of velocities v y in ydirection (with x and y in CCD coordinates).
LCT is known to underestimate flow velocities (see e.g. Löptien et al. 2016a;Verma et al. 2013;Švanda et al. 2007 and references therein). This was addressed by generating calibration data with which the flow velocities were then corrected (see Sect. 2.1 of Löptien et al. 2017).
The noise in LCT calculations increases to the limb, due to projection effects. Therefore, the flow maps are cropped at 60°f rom disk center. The maps suffer from systematic effects, such as the orbital motion of HMI and the shrinking-Sun effect (Lisle & Toomre 2004;Löptien et al. 2016b), a systematic converging flow towards disk center. This was addressed by representing the background signal of the flow maps as a sum of loworder Zernike polynomials (up to radial degree n = 7). The time series of the Zernike polynomial coefficients were then transformed into Fourier space, where frequencies corresponding to known periods of systematic effects were isolated. These are the 24 hour period of the satellite orbit and its 23 observed harmonics, and the 365 day period of the earth orbit and its first 16 harmonics (down to a period of 22.8 days), which show significant power. The filtered, backtransformed background estimate was then subtracted from the observations. An exception was made for the time series of Zernike polynomials Z −1 1 = y for the v x maps and Z 1 1 = x for the v y maps, which are correlated to each other and were therefore not treated in the way outlined above. For these, the first element of a principle component analysis of the two components was subtracted (as described in Appendix B of Löptien et al. 2017). However, this retained a systematic variation for these terms, on the order of several m s −1 with a period of 24 hours and its harmonics, which corresponds to large-scale gradients in the flow maps. To correct for this, for the present study we repeated the data processing of Löptien et al. (2017), subtracting for the Zernike polynomials Z −1 1 = y for v x and Z 1 1 = x for v y the same harmonics as for the other components. In this rerun, the data coverage was increased to 6 March 2020.
As a last step, we remapped the velocity maps from the CCD coordinate system to Plate Carree projection (Carrington longitude, latitude), with a total size of 3600 x 1200 grid points and a grid spacing of 0.1°. In this projection, averaging over regions at different latitudes and longitudes can be done in a straightforward manner. In the same step, the velocities were transformed from v x , v y to the longitudinal and latitudinal velocities v lon , v lat . We then applied a Gaussian filter with a width of 0.4°and subsampled to the grid spacing of 0.4°used by Löptien et al. (2017). This processing step differs from Löptien et al. (2017), in order to remove aliasing from the data.

Processing and validation of the flow data
In the flow maps, individual pixels occasionally show unrealistically high values. This affects about 10 pixels per map, mostly close to the limb. We speculate that these arise from the decreased contrast at the limb in combination with the evolution of the granules. To mitigate their effect, we replace outlier pixels that are outside five standard deviations of the mean of the velocity distribution by the mean value of pixels surrounding them in a box of 5 pixel × 5 pixel.
In addition to the systematic effects described above (the orbital motion of SDO and the shrinking-Sun effect), the LCT data exhibits another long-term, large-scale modulation: At the beginning of the time series in 2010, the longitudinal flow component v lon has a bias towards prograde (retrograde) velocities in the eastern (western) half of the disk, and the latitudinal flow component v lat has a bias towards southward (northward) velocities in the northern (southern) half of the disk, mostly at latitudes above 30°. Over the years, this modulation changes, such that the hemispheric biases of the two flow components have reversed sign around the year 2015. The amplitude of this modulation is about 10 m s −1 in a half-year average. The reason for this modulation is not fully understood. Its time scale does not correspond to any known periodic effects, such as the B angle or the orbital motion of HMI.
To correct for this large-scale modulation, we translate all velocity maps from Carrington longitude to Stonyhurst longitude, create running time averages over half a year of data each, centered ten days apart, and subtract these from every individual map from five days before until five days after the central time of the average. The corrected maps are then mapped back to Carrington longitude. We choose an averaging time of half a year because a shorter time scale would potentially subtract real flows, and a spacing of ten days to avoid drastic changes between two blocks of subtraction. The average maps are smoothed with a Gaussian of σ = 0.8°to reduce small-scale effects.
With the above processing, which was necessary to address the various systematic errors (the orbital motion of HMI, the shrinking-Sun effect, the long-term modulation), the velocities that we measure throughout our analysis are therefore relative to velocities v lon = 0, v lat = 0, which are the half-year time averages at each location in Stonyhurst longitude and latitude.
As a test of the above processing, we want to make sure that the Zernike subtraction outlined by Löptien et al. (2017) does not subract the kind of flows we want to study together with the systematic effects. To evaluate how much this procedure alters the inflows, we create one Carrington rotation of synthetic flow data with inflow features and apply the same filtering on them as was done on the actual data. Appendix A describes this test. We find that the extent and amplitude of the synthetic flows is well preserved, with changes of less than 5 m s −1 for model inflows with an amplitude of 50 m s −1 . Small deviations are introduced to the data, with amplitudes of less than 5 m s −1 .
As another test of the LCT flows used in this study, we carry out a control against an independent data set. In this test, we project the LCT flows to the line-of-sight component and compare them to flow measurements obtained from direct Doppler images from SDO/HMI. Appendix B describes this test. The Dopplergrams only provide one velocity component, and therefore are not suited as the main flow measurement of this study. We find that in general, the two methods agree well, with correlations exceeding 0.8, except for in close proximity (2°) to sunspots. This might be due to the different depths at which the two measurements are taken, as well as different sensitivities in the presence of strong magnetic field.
To use the LCT flow data on the sample of emerging active regions and control regions, for each region we extract data cubes of the two flow components v lon , v lat , which are centered at the position of the region in Carrington longitude and latitude given in the HEAR survey, and span ±30°in both longitude and latitude. The cubes start (end) at the full or half hour before (after) the first (last) observation time given in the HEAR survey, respectively. This is because the HEAR survey defined the emergence time from the HARP vector magnetogram observations that have a cadence of 12 minutes, whereas the LCT data has a cadence of 30 minutes, with data at full and half hours. For the same reason, in this work we define the time of emergence as the first full or half hour after the time of emergence t 0 in the HEAR survey. We chose this to make sure that the emergence has occurred for all regions consistently.
The typical observation period of an individual region is about nine days (450 frames). The maximum difference between the start (end) of the observations and the time of emergence is seven days (343 frames) before (after) emergence (leaving about two days of observations after (before) emergence). The averaging over ensembles of active regions is done relative to the time of emergence (see Sect. 4). The total duration of observations therefore is 14 days, with a varying number of active regions at each time relative to the time of emergence. The parts of each frame that are off the visible disk are not included in any averaging.
For the further analysis of the characteristics of the EARs, for each region we generate magnetogram and continuum intensity data cubes that match the setup of the flow data described above. We use the same observation start and end times, with the same 30 minute cadence as for the flow maps (described above) on the continuum intensity series hmi.ic_45s to create the intensity data cubes. The data is remapped to Plate Carree projection, with a grid spacing of 0.1°. This is four times higher than that of the flow maps, and chosen for both continuum and magnetogram data to preserve small-scale information. We correct for limb darkening and normalize the continuum data cubes by convolving each frame with a Gaussian of σ = 5°, to retrieve the large-scale variation in the frame (the limb darkening profile), and then dividing the frame by this background. For the magnetograms, we extract the full-vector magnetic field from hmi.b_720s. From the vector field components, we calculate the radial field B z , using the transformation given by Gary & Hagyard (1990). For pixels where the field azimuth is not calculated with the minimum-energy algorithm, we use the random field disambiguation . We then average together the 5 frames of the 720 s radial field B z data within one hour.
The data of each EAR thus consists of three-dimensional data cubes (time, longitude, latitude) for the flow components in longitudinal and latitudinal direction, the radial magnetic field, and the continuum intensity. Each cube has frame sizes of 60°× 60°in longitude and latitude, centered at the Carrington longitude and latitude of the AR given in the HEAR survey, and covers about 14 days, centered on the time of emergence, with data coverage of the visible disk for around nine days. The cubes Article number, page 3 of 16 A&A proofs: manuscript no. Evolution_of_solar_surface_inflows_around_emerging_active_regions of the velocity maps and the continuum intensity have a cadence of 30 minutes, the cubes of the radial magnetic field have a cadence of 1 hour.

Measuring the location of the magnetic polarities
Over time, the regions move relative to the position given in the HEAR survey. To perform a sensible ensemble average of the active regions relative to their center or their leading or trailing polarity, we measure the positions of the leading and trailing polarities in longitude and latitude in each one-hour radial magnetic field average. To reduce sensitivity to small-scale structure within the AR, for this measurement we smooth the magnetograms with a Gaussian of σ = 0.4°.
As an initial guess, for each of the 182 AR we estimate the position of the polarities at one day after emergence by eye, at which time all regions show a clear bipolar structure. The procedure for each time frame in each of the 182 data cubes is then as follows: -Estimate the fluctuation in the radial magnetic field as the standard deviation in a box of 12°× 12°around the position in Carrington longitude and latitude from the HEAR survey. The magnetic field of the AR is part of the estimate. The size of the box is chosen to exclude other AR from the field of view which would bias the estimate. -Consider the negative and positive polarities separately. Depending on the AR, the positive (negative) field corresponds to the leading (trailing) polarity or vice versa. -Identify regions as neighboring positive (negative) valued grid points (for brevity called pixels in this context) above (below) a threshold value, taken as 1.5 times the standard deviation. We use this adaptive threshold because the levels of magnetic activity in the vicinity of the individual ARs differ. The factor of 1.5 is applied to further reduce detection noise and is a result of trial and error. -Exclude regions which are only one pixel in size. These are considered to be too small for a reliable polarity position estimate. -Exclude regions whose flux-weighted center is further away from the initial (or first-guess, see below) position than ±2°, to avoid selecting unrelated surrounding magnetic field. This is important especially during emergence, where the field of the AR is still weak. -Of the remaining regions: Identify the one which has the largest absolute sum in B z . Define the position of the active region polarity as the longitude and latitude of its fluxweighted center.
This method is iterated forward and backward in time, starting from the frame one day after emergence, where we identified a first-guess position by eye. For each frame the position from the previous frame is used as first guess.
With this procedure, we determine the positions in longitude and latitude of each polarity for each EAR at each time step. Before emergence, by definition there is only very little magnetic flux of the active region present, which makes a position measurement unreliable, with the risk of identifying surrounding small-scale field. For consistency, we therefore set the positions in all frames before emergence to the values at emergence. We point out that other choices could be made, for example extrapolating the post-emergence motion to the times before emergence. This would however require additional assumptions on the proportionality of the motion over a certain (flux dependent) time interval, and is therefore not attempted here. In some cases, the inferred positions in a single frame differ largely from those before and after, for example when a connection between two regions is established by pixels which have magnetic field strengths just above the threshold in the particular frame, and below in the previous and the following frame. Such artifacts will be present for any choice of threshold. To mitigate this, we smooth the time series of positions with a Gaussian of width σ = 3 hours. We define the position of the center as half the distance between the leading and the trailing polarity positions. The total motion of an AR over the course of the observations is typically a few degrees. The largest displacements are on the order of 6°for ARs at high latitudes, at which the difference between Carrington rotation rate and solar rotation rate at the surface is largest.
Other methods to infer the position of AR polarities are possible. To estimate the difference that this might make for our analysis, we apply the method of Schunker et al. (2019). Their algorithm iteratively calculates the field strengths in the surroundings of all grid points, and selects that patch as the polarity which lies closest to the one in the previous frame. Since in our method we select the strongest patch instead, this can lead to differences between the two methods, because in some ARs, the dominant patch of magnetic field of one polarity is replaced later on by another patch. Appendix C compares the two methods. We applied the positions of both methods to ensemble averages of the flow data (see following sections) and found a standard deviation of the difference of the methods of below 2 m s −1 close to emergence, and no systematic difference with respect to the inflows we want to investigate.

Weight maps for ensemble averages
When performing ensemble averaging, we have to take into account that at a given time relative to the time of emergence, the active regions have different disk positions and therefore different background noise levels, as the noise increases towards the limb. We account for this with weighted ensemble averages. For this, we create data cubes of weight maps for all EARs and control regions in the following way: Article number, page 4 of 16 N. Gottschling et al.: Evolution of solar surface inflows around emerging active regions First, we map the full disk velocity field v lon , v lat from the Plate Carree grid with 900 x 300 grid points and a grid spacing of 0.4°(see Sect. 2.2), from Carrington coordinates into heliocentric Cartesian coordinates (Thompson 2006), centered on the Carrington longitude and latitude at disk center. Transforming into this system accounts for the B angle and the P angle. We map with a linear interpolation onto a fixed grid with 300 x 300 grid points and a grid spacing of 0.006 solar radii (R )(corresponding to about 4.2 Mm at disk center). The frame size is 1.8 R , which is slightly larger than the LCT data coverage (the 60°from disk center of the LCT data corresponds to 2 sin π /3 = 1.73 R ).
We did this for the whole ten years of LCT data. We calculate the average flows v lon and v lat , and a map of the standard deviation of each flow component (see Fig. 1) on the fixed grid in heliocentric Cartesian coordinates. The velocities in the average maps are on the order of 10 m s −1 , which is much smaller than the velocities in an individual map (on the order of 1000 m s −1 ). This is because the velocities are equally distributed around zero, with the dominant noise from granulation and supergranulation.
For the longitudinal component, the standard deviation increases mainly towards the east and west limbs. For the latitudinal component, the northern and southern polar regions show the largest standard deviation.
We generate full disk weight maps by mapping the standard deviation σ of v lon and v lat back onto the Plate Carree projection in Carrington coordinates. From these, we create data cubes of the weight maps as 1 /σ 2 for each of the 182 EARs and control regions, in the same way as the data cubes of the velocity maps (see Sect. 3).

Exclusion of magnetic pixels
Local Correlation Tracking of solar granules is known to be unreliable within regions of strong magnetic field, where the granulation is attenuated. This is the case for sunspots or plage, which appear darker or brighter than the quiet Sun in continuum images. The motion that LCT tracks in these regions is therefore a combination of the motion of the granules and the motion of these features. In Appendix B, we find that the LCT flows are weaker in regions within 2°of sunspots, compared to direct SDO/HMI Doppler velocity measurements. This difference can however be attributed to physical causes as well, that is, the difference in the height of the two measurements.
To ensure that our measurements are not susceptible to this unreliability, we create additional data cubes of normalized continuum intensity for each AR, by binning to the same grid spacing as the flow maps (0.4°). We identify pixels which have values lower than 0.95 (pores, sunspots) and higher than 1.05 (plage), where the mean value of the quiet Sun is 1. In the flow maps, we exclude these pixels if they are connected to at least one other, to avoid excluding random high-valued pixels. The numbers are the result of trial and error, and represent a compromise between excluding pixels in the quiet Sun which are above/below the thresholds, and not excluding pixels which are systematically lower/higher, for example near the edge of a sunspot.

Ensemble averages
To carry out ensemble averages, we first investigate the dependence of the noise in the flow maps on spatial smoothing, temporal averaging, and the number of ARs in the ensemble average, to identify parameters that yield appropriate noise levels for our analysis. Previous studies reported active region inflows with velocities on the order of several tens of m s −1 . We therefore want the noise to be well below 10 m s −1 , while still being able to investigate flows on small spatial scales and without losing much temporal resolution.
For this, we use the sample of control regions of the HEAR survey, where no systematic flows are present. The data cubes of these are prepared in the same way as those of the EARs, with the center of the 60°× 60°maps on the coordinate given in the HEAR survey throughout the observations. For this test, we crop the frames to the inner 40°× 40°, which yields a more representative value, since the errors increase towards the limb and fewer control regions contribute to the average. Like the active regions, the quiet Sun regions have different disk positions at a given time relative to the assigned 'time of emergence'. For a representative estimate, we start at one day before the time of emergence, where the frames from the control regions will on average be close to disk center, and compute the average of N subsequent frames, in the range of N = 1 (no time average, 30 minute resolution) to N = 96 (2 days average). These are additionally smoothed by Gaussians of σ = 0.4, 0.8, 1.2, 1.6, 2.0 and 4.0°. We calculate the median of the standard error over all individual control regions. Secondly, we carry out the same calculation, at a fixed Gaussian N to the case of Gaussian smoothing of σ = 1.6°. The noise decreases slower than the fit, because subsequent maps are correlated due to the lifetime of supergranules. This holds for all cases. The bottom panel of Fig. 2 shows that for an ensemble average of fewer than 45 regions, temporal averaging of more than 2 days is necessary at this level of spatial smoothing. For an ensemble average of 45 regions, a time average of 12 hours is acceptable.
We use these values for spatial smoothing and temporal averaging for the ensemble averages in our analysis. The temporal averaging is aligned relative to the time of emergence, such that the frame of emergence is the first frame in the first time step after emergence. For the ensemble averaging, Hale's law and Joy's law are taken care of by reversing the latitudinal direction of all data cubes (magnetic field, continuum intensity and both v lon and v lat flow components), and the sign of the magnetogram and the v lat flow component, for regions in the southern hemisphere.
In addition to the flows around active regions, the flow maps exhibit signatures from the moat flow, due to the presence of sunspots in the sample (Sheeley 1972). To better understand their impact on the flow maps, we perform ensemble averages centered on the leading polarity (where we expect the strongest moat flow), in subsamples of the active regions, using a classification scheme to identify whether there is a sunspot with a clear penumbra present in an AR at that time or not. Appendix D describes this test. We find that a moat flow is present in our flow data for regions which have a clear sunspot, with typical velocities of about 150 m s −1 . The moat flow is not symmetric in this early phase of the active region, as the sunspots still grow, in agreement with Vargas Domínguez et al. (2008).

Flows as a function of time and magnetic flux
We investigate the evolution of the flows around active regions of different maximum values of magnetic flux. For each of the 182 ARs, we retrieve the maximum total unsigned flux from the hmi.sharp_720s series keyword USFLUX, in the period covered by the HEAR survey. USFLUX records Σ|B z |dA, with B z the radial magnetic field component and A the area of the SHARP region (Bobra et al. 2014). From this, we divide the sample into four subsamples, sorted by the maximum unsigned magnetix flux, with two subsamples containing 45 and two subsamples containing 46 EARs.
The left panel of Fig. 3 shows the distribution of maximum USFLUX values of all ARs, together with red-dotted lines indicating the boundaries between the four subsamples. The subsamples have mean total unsigned magnetic flux values of 1.5 × 10 21 Mx, 3.1 × 10 21 Mx, 7.0 × 10 21 Mx and 1.5 × 10 22 Mx, respectively. Fig. 4 shows several twelve-hour time steps for all four subsamples. The first row shows the time step of the flows at −18 hours (between 24 and 12 hours before emergence). At this time, all subsamples exhibit a converging flow towards the center of the AR. The velocities are around 20 to 30 m s −1 . The noise level is about 10 m s −1 , as discussed in Sect. 3.5. This converging flow is consistent with Birch et al. (2019). For the two higher-flux subsamples, the converging flow is spatially more extended than for the lower-flux ones. Shortly after emergence, the lowest-flux subsample still shows small-scale inflows, whereas they have ceased for the other subsamples. The three subsamples with higher flux show a prograde flow with velocities of about 40 m s −1 at the position of the leading polarity, similar to the findings by Birch et al. (2019). This feature is not present in the lowest-flux subsample. About two days after emergence (third row in Fig. 4), inflows towards the second-lowest flux subsample set in. For the third subsample, they are clearly visible after about three days, while the highest-flux subsample still shows mostly a strong moat flow at this time. For the highest-flux subsample, the inflows only set in at around four days after emergence. From this, we conclude that for large active regions, the inflows take longer to form.
After six days, the EARs of the lowest-flux subsample have already mostly decayed. For these small AR, inflows are seen throughout their lifetime, and decay along with the magnetic field. Over its lifetime, the lowest-flux subsample never exhibits diverging flows, neither moat flows (due to the lack of sunspots with clear penumbra), nor a prograde flow at the leading polarity of the kind observed by Birch et al. (2019).

Flows as a function of time and latitude
We examine whether the flows associated with the EARs depend on the latitude at which the EARs emerges. For that, we divide the sample of 182 ARs into four subsamples of 45 or 46 EARs each, sorted by their unsigned latitude |λ| (as provided by the HEAR survey). Each subsample in |λ| therefore contains regions with a variety of maximum total unsigned flux values.
The right panel of Fig. 3 shows the distribution of unsigned latitudes of all EARs, together with lines indicating the boundaries between the four subsamples.  Fig. 3). The labels, color scales and arrow scales are the same as in Fig. 4. The diverging flow in the bottom row of the third subsample is due to a strong moat flow in AR 11158, which is the largest AR in this subsample at this time.
wards the higher-latitude subsamples and in time. There appears to be no striking systematic change in the inflows as a function of latitude.

Flows averaged over the full sample as a function of time
While the sections above outline the evolution of the flows in subsamples of active regions with respect to flux and latitude, we want to put this into the context of the average EAR over the whole sample. Appendix E provides additional plots of the average flows around the time of emergence as well as averages in longitude covering the leading and the trailing polarity separately. Overall, the full ensemble average again exhibits converging flows starting one day prior to emergence. The inflows have a maximum extent of about 7°from the center of the AR, in the time between emergence and seven days after emergence, and amplitudes of about 50 m s −1 , at a noise level of about 5 m s −1 .

Quantitative model of the inflows as a function of time and magnetic flux
We want to make a quantitative description of the inflows around active regions and to evaluate the differences found in the fluxbinned subsamples (Sect. 4.1). For this, we fit a model to the flows. The main contribution of the inflows is along the latitudinal axis. We therefore average the latitudinal flows over the central 5.6°in longitude for each time step and each subsample. This range is chosen to cover a large part of the AR, while excluding most of the moat flow signature that is present in the high-flux subsamples. We do not attempt to fit a separate model to the moat flow, as the sunspots and with them the moat flow in the individual active regions form at different times relative to the time of emergence, and in some cases break up before the end  Fig. 4. The solid blue line shows the model fit of two Gaussians with opposite polarity (see text for details). The dashed blue lines show the 1σ error estimates of the fit. The red markings indicate the fit parameters of the Amplitude A (in m s −1 ), the peak position µ (relative to the center of the active region along latitude, ∆ Lat), and the standard deviation σ (in degree) of both Gaussians. The gray shaded region indicates the rms of the flow data. The fit parameters of the two Gaussians slightly deviate from the fitted curve due to superposition; this is however small compared to the error.
of the observations (cf. the lower panels for the second-highest flux subsample in Fig. 4).  The model we apply consists of the sum of up to two 1D Gaussians with positive/negative amplitude A, position µ and width σ. As an estimate of the background noise, we compute the standard deviation over the same 5.6°in longitude, for subsamples of the control regions corresponding to the AR subsamples. If the amplitude of a fitted Gaussian is smaller than 1.3 times the noise, it is discarded. The converging flows before emergence tend to be less spatially extended than the inflows after emergence. We therefore confine the region of the fits before emergence to about 2°in latitude, in order to avoid fitting of large supergranules.
To estimate realistic errors on the fit parameters, we add the model fit onto the control region frames and rerun the model fitting on these. The control region flow maps have the same properties (spatial scale, amplitude) as the active region flow maps, and provide a background noise that is uncorrelated with the background in the active region maps. Averaging over all realizations of this procedure and taking the standard deviation of each parameter from these realizations yields an error estimate for the model parameters. Of the 30 time steps of the control re-gion subsamples, the first and last five are not used because of their proximity to the limb, which would artificially increase the noise. Thus, for each model fit, there are 4 × 20 = 80 realizations that contribute to the error estimate. Fig. 6 shows an example of the model fit. The calculated errors (blue dashed lines) are of the same order as the noise in the data (gray shaded region). Fig. 7 shows the fit parameters A, µ and σ of v lat averaged over longitudes, for the four flux-binned subsamples, in the time range between one day before and six days after emergence (the last time step is left out because of low signal to noise). For the three higher-flux subsamples, the converging flows of the preemergence phase vanish shortly after emergence, such that there is no successful fit. After some days, the inflows set in. We define the onset time of the inflows as the time when the model fits start to be continuously successful. The amplitudes increase from this time on, starting from less than 20 m s −1 and reaching maximum velocities of 50 to 60 m s −1 in all cases. In the case of the lowestflux subsample, where there is no gap in the model fits, we define the onset time as the time when the amplitude of the fits starts to increase, similar to the other subsamples.  To verify that the moat flow does not affect these measurements, we performed the same analysis again, excluding those pixels from the fit which lie within the area of the moat flow for more than half of the ARs in the respective subsample. We approximate the moat flow area as regions of 2°surrounding positions with reduced continuum intensity (see Sect. 3.4 and Appendix B). This procedure leaves the model fits and onset times unaffected, except for the positive-valued Gaussian in the highest-flux subsample, where only that part of the inflow that lies outside of the central region is fitted, resulting in reduced amplitude and width of the fit.
The onset times illustrate what was discussed qualitatively in Sect. 4.1: The inflows towards the center of the active region set in at increasingly later times for increasing amount of flux of the AR, in the range between one and four days after emergence. Fig. 8 shows the onset time against the mean flux of each subsample. We point out that the subsamples differ substantially in flux range (cf. left panel of Fig. 4).
We measure the extent of the inflows from the models as the peak position µ added to the Half Width at Half Maximum (HWHM) of the fit. With this, we find inflow extents of about 5 to 7°with an error of about 1°. After five days, the inflows around the lowest-flux subsample decrease in width and amplitude. For the other subsamples, the fits illustrate that at the end of the observed time period, the flows have not reached a steady state, especially for the highest-flux subsample, where the inflows after emergence only formed two days before the end of the observational period.

Discussion
We investigated the evolution of surface flows associated with active regions before, during and up to six days after their emergence to the surface, using Local Correlation Tracking of granulation on the HEAR survey of 182 emerging active regions. We tested the processing of the flow measurements with synthetic input data and by cross-correlation with independent measurements from direct Doppler images.
About half of the active regions in the sample develop a sunspot with a clear penumbra in the observed period, which show moat flow structures with velocities on the order of 150 m s −1 in the prograde and equator-and poleward direction.
To study the evolution of the inflows around active regions, we carried out ensemble averages in subsamples ordered by total unsigned magnetic flux and unsigned latitude of the active regions as well as an average over all regions. We then fitted a model to the latitudinal flow component in the subsamples ordered by flux.
We find that AR emergence is preceded by converging flows of around 20 to 30 m s −1 , beginning one day before emergence. This agrees with results reported by Birch et al. (2019), and holds for all ARs in our sample, that is, it is independent of the maximum flux or the latitude of the AR. However, the extent of these early inflows is larger for stronger ARs.
We find that these pre-emergence converging flows cease shortly after emergence. Larger-scale latitudinal inflows form in the days following emergence. The time at which they form depends on the flux of the AR, between about one and four days, and increases with flux. There are several possibilities for the cause of this effect: It could be related to the separation speed of the polarities. Schunker et al. (2019) identified different phases of emergence with respect to the separation speed, finding that on average over all AR in the HEAR sample, the separation stops increasing at 2.5 -3 days after emergence. However, the distinction they make between regions with flux higher or lower than the median value indicates that this time also depends on the flux. It is therefore possible that the onset of the inflow coincides with the end of the separation increase. Another possibility is that it could depend on the relative amount of emerged flux, such that the inflows set in at a certain time relative to the peak flux. It could also depend on the ratio of flux within sunspots vs. flux within plage, since the proposed mechanism for driving the inflows depends on the plage (Spruit 2003). Further investigation will be necessary to clarify this.
Our ensemble averages show a maximum extent of the inflows of around 7°, which is somewhat smaller than the 10°reported recently by Braun (2019). Typical velocities during the later stages are around 50 m s −1 , which is at the high end of previously reported velocities (Gizon et al. 2001;Haber et al. 2004;Löptien et al. 2017;Braun 2019). This is in part attributable to differences in spatial resolution; with lower resolution (by e.g. spatial smoothing with a broader Gaussian), the inflows would have lower amplitudes and larger extents. Also, our sample consists of emerging active regions, whereas previous studies analyzed well-established, long-lived active regions.
Our measurements of the flows show that they are still evolving at the end of the observed time frame. Observations covering the transition to a steady state are needed to shed more light on the physical context of the flows. Since the observation of the emergence is necessary for an accurate age determination of an active region, the limitation to seven days before and after emergence is a constraint which is at present difficult to circumvent. One possibility is to add observations from different heliographic longitudes, from for example the recently launched Solar Orbiter, which increases the time that an active region can be tracked continuously. Another option is to track the ARs from the HEAR sample to the next rotation. This is however limited to the very long lived regions, and leaves an observational gap of more than two weeks.

Appendix A: Test of the effect of Zernike subtraction on the flow features
We want to test the effect that the processing applied to the data (Sect. 3.1), specifically the subtraction of the fitted Zernike polynomials Z m n , has on flows resembling the inflows towards active regions that we investigate. For this, we construct synthetic flow maps, fit the Zernike polynomials to these maps, and subtract the Fourier-filtered components of the fitted Zernike time series. This is done in the same way as for the actual LCT data. We create one Carrington rotation of synthetic data, with frames set 30 minutes apart, by rotating the grid 0.272 degrees from one time step to the next. This corresponds to a rotation period of 27.5735 days.
To create the data, we impose synthetic flows on a Plate Carree coordinate grid (longitude, latitude) of size 450 x 450 with a grid spacing of 0.4°. The B angle B 0 and the P angle Φ 0 are assumed to be zero, for the sake of simplicity. As a model for the flows, we use 2D Gaussian derivatives in the longitudinal direction (for the longitudinal flow component) and in the latitudinal direction (for the latitudinal flow component). We create two different setups, a periodic case and a random case. In the periodic case, the flows are equally distributed on a grid along the equator and the central meridian. In the random case, the flows are distributed randomly within ±45°latitude. The peak velocity of the flow features is 50 m s −1 in all cases. We ran two different cases for both the periodic and the random setup, with widths of the flow features of approximately 8 and 12°(which is where the velocities drop below 20 m s −1 ). In addition, we ran the same tests with additional moat flows imposed on the synthetic inflows, which are realized as Gaussians with widths of 1.2°and peak velocities of 500 m s −1 . This is done to test whether the small-scale moat flow introduces additional deviations or leaking into the inflows.
First, we transform the velocities of the flow maps from m/s to pixel/s. This is done by inverting the equations given in the Appendix C of Löptien et al. (2017): where we follow the definitions and nomenclature of Löptien et al. (2017). We then map the Plate Carree coordinate grid (longitude, latitude) to a CCD coordinate grid (x, y) of size 1024 x 1024 grid points, which is the effective size that the processing pipeline uses for the HMI data. The projection is done using the transformation specified by sphere2img() from the Stanford ring-diagram pipeline (Bogart et al. 2011). We fit the Zernike polynomials, Fourier-filter them and subtract them from the data. Then, we re-map from the CCD back to the Plate Carree projection. This is done using the transformation specified by img2sphere() from the Stanford ring-diagram pipeline (Bogart et al. 2011).
The last step is to retransform the velocities from pixel/s to m/s, by applying the equations from Löptien et al. (2017). Fig. A.1 shows the first time step of the random test case with inflow extents of about 12°, before and after subtraction. The lower panels show cuts through the maps for v lon and v lat . Towards the observation limb at 60°, deviations from the synthetic data are noticeable, with velocities below 5 m s −1 . Both the shape and extent of the flows are unaltered by the processing. The peak velocities are changed by less than 10 %. The tests including moat flows showed no artifacts in addition to this.

Appendix B: Comparison to flows from direct Doppler images
We compare the velocities obtained by LCT with those of direct Doppler images on an example active region. Comparisons of flows from helioseismic measurements to Doppler data were carried out by for example Gizon et al. (2000), Jackiewicz et al. (2008) and Švanda et al. (2013). For the comparison, we transform the LCT flow maps from (v lon , v lat ) to the line-of-sight (LOS) component.  applied similar transformations to calculate the spherical components from x-, y-and Doppler components. We point out that the conventions used in our derivation below differ from the ones used by .

Appendix B.1: Projection of LCT to the line of sight
We use a right-handed system in which x points towards the observer and z towards solar north. In this system, latitude λ is defined between [−90 : 90] and longitude ϕ is defined between [0 : 360). Disk center is thus at λ = 0°, ϕ = 0°(when the B angle B 0 = 0°). The basis vectors of this system are: The transformation between Cartesian and spherical coordinates follows as In addition, the rotation due to the changing B angle B 0 has to be corrected for. This is a rotation around the y-axis. Thus To compare the Doppler images to the line-of-sight projected LCT velocity images, large-scale systematics depending on disk position need to be removed. In addition, the data sets need to share a common frame of reference as well as temporal and spatial resolution. First, we apply the same mapping procedure as was used for the magnetograms and the intensity maps to the full-disk Dopplergrams from hmi.v_720s. This creates a Plate-Carree mapped cube for each EAR, with frames of 60°× 60°and a grid spacing of 0.1°. The cubes are centered on the coordinate specified in Schunker et al. (2016), and cover the time period between the start and end time provided in the HEAR survey. In this step, we also subtract the background signal that stems from solar rotation, which is taken from averages over a third of a Carrington rotation (hmi.v_avg120). Each image is subtracted by the first average frame whose central time as specified by the MidTime keyword lies after the observation time as specified by the T_REC keyword.
In addition to solar rotation, the observer motion in radial direction has to be corrected for. This is done by subtracting for each frame the velocity specified by the frame's OBS_VR keyword.
Next, we average in time over one day, relative to the time of emergence, and apply Gaussian smoothing of σ = 0.4°. We then resample the grid spacing of the map from 0.1°to the 0.4°of the LCT velocity maps.
The convention for the sign of the HMI Dopplergrams is positive for movement away from the observer (redshift), and negative for movement towards the observer (blueshift). This is opposite to the convention used in the transformation in Sect. B.1, where the x axis points towards the observer. The sign of the Dopplergrams is therefore flipped for the further analysis. maps are averages over 24 hours, starting four days after the time of emergence. Fig. B.2 shows a cut along the red line in Fig. B.1. Fig. B.3 shows a density plot of the two velocities. We fit a line through the data for the relation between the Doppler and the LOS data. Because both are subject to error, we perform the fit with Orthogonal Distance Regression (ODR), with the standard error of the 24 hour time averages as errors. The errors close to disk center are on the order of 15 m s −1 for the Doppler frames and 5 m s −1 for the LOS frames, and increase outward to 40 m s −1 and 60 m s −1 , respectively. These are rough estimates, as the individual frames are not uncorrelated. We also compute the correlation between the two flow measurements.

Appendix B.3: Results
The comparison (Figures B.1, B.3 and B.2) illustrates that the Doppler velocity and the LOS velocity inferred from LCT agree well, except in the presence of strong magnetic field. Here, velocities from LCT are lower than the Doppler velocities. This has several reasons: LCT is known to underestimate velocities at strong magnetic field (Fisher & Welsch 2008;Löptien et al. 2017). Strong magnetic field suppresses convective blueshift, which results in a redshift-signature in the Doppler maps (Stix 2002). Furthermore, the two measurements are taken at different depths (Jackiewicz et al. 2008).
When excluding regions of 2°around positions where the LCT flows are excluded due to reduced or increased continuum intensity (see Sect. 3.4), the systematic outliers are ruled out and the slope of the fit between the Doppler-and the inferred lineof-sight velocities is closer to 1 (cf. Fig. B.3). The correlations between the Doppler-and the LOS velocities are between 0.7 and 0.9 for all observed one-day averages.
The intercept of the fit is in the range of −11 to −16 m s −1 for all one-day averages, meaning that the line-of-sight velocities are systematically lower than the Doppler velocities (i.e. the Doppler velocities have a preference towards upflows at disk center). Since the projection to the line of sight only takes into account v lon and v lat , the line-of-sight velocity is necessarily zero at disk center, whereas the Doppler velocity is sensitive to any motion along the line of sight, including radial motions. Several studies investigated the rms velocity of Doppler images at disk center and the radial velocity of supergranulation, and reported velocities between 5 m s −1 and 15 m s −1 (Giovanelli 1980;Hathaway et al. 2002;Duvall & Birch 2010). Our findings are in accordance with this.  Same as left, with colored overlay of pixels for which |B z | is higher than 10 Gauss. Colors indicate no intensity darkening (green), penumbra (yellow) and umbra (red) (see text for definition). The blue cross indicates the center of the largest umbral region. The time step shown here has a sunspot classification SSQ=0.
ries of six-hour averages of each active region in the sample according to the presence of a sunspot with clear penumbra, which would potentially host a moat flow. We then ensemble average the resulting subsamples to identify the moat flow signature. The classification is done with the following procedure: -Crop to the inner 12°× 12°of the normalized continuum intensity frames and the magnetograms around the position of the leading polarity. -Create a mask of all pixels with the absolute radial field |B z | below 10 Gauss and apply that mask to the continuum frame. This is used as the total number of pixels corresponding to the active region. -Count the number of pixels in the normalized continuum intensity frame that fall in bins of below 0.6 (corresponding to umbra), between 0.6 and 0.95 (penumbra), and above 0.95 (no intensity darkening). The threshold at 0.95 is the same as was used for the exclusion of pixels in Sect. 3.4, the one at 0.6 is a result of testing.
Connected regions of pixels with normalized intensity below 0.6 are identified as umbrae. For the largest one, the total pixel count of umbra and the percentage of penumbra in a box of 2.2°a round the center of the umbra are calculated. Each time step is then classified into one of five categories with the following scheme: -4 (No intensity darkening) if the total number of pixels in the umbra is zero and the total percentage of pixels in the penumbra, relative to the total number of active region pixels, is less than 1. -3 (Small pore) if the total number of pixels in the umbra is zero and the total percentage of pixels in the penumbra is larger than 1. -2 (Large pore, no penumbra) if the number of pixels in the umbra of the largest spot is less than twelve and the percentage of penumbra around this umbra is less than 20. -1 (Large pore, with some penumbra) if either the number of pixels in the umbra of the largest spot is less than twelve and the percentage of penumbra around this umbra is larger than 20, or if the number of pixels in the umbra of the largest spot is larger than twelve and the percentage of penumbra around this umbra is less than 20. -0 (Clear penumbra) if the number of pixels in the umbra of the largest spot is larger than twelve and the percentage of penumbra around this umbra is larger than 20.
This classification scheme was derived and checked by inspecting several ARs. A table of all time steps between -2 and +7 days for all 182 active regions is available online. Fig. D.1 shows an example of the classification. The colored regions in the right panel indicate pixels with an absolute radial field above 10 Gauss, which we in this context associate with the AR. The small features at large distances from the sunspots do not contribute significantly.
Fig. D.2 shows an example time step for averages over three different sets of data: All 161 active regions for which data is available at this time relative to emergence, only those regions without a sunspot with clear penumbra (classifications of 1,2,3 and 4), and only those regions with a sunspot with clear penumbra (classification of 0). It is apparent that the diverging flow at the leading polarity is connected to the presence of developed sunspots with a clear penumbra, which is consistent with the moat flow. The velocities are around 150 m s −1 , which is lower than typical moat flow velocities observed in evolved sunspots. A reason for this could be that the sunspots in our sample are just forming, together with the (in this context) comparatively large grid spacing of 0.4°and additional Gaussian smoothing of σ = 0.8°of our flow data, which reduces localized peak velocities.
Also, the moat flow seen here is not symmetric: The westward (i.e. prograde) component is stronger than the eastward (retrograde) component. The components in the north-and southward directions have approximately the same strength as the westward component. Švanda et al. (2014) also found an asymmetry between the east-and westward moat flows.
Our findings are in agreement with Vargas Domínguez et al. (2008), who showed that moat flows are predominantly present in the directions where there is penumbra. The sunspots in our sample are only just forming, with additional flux joining the sunspot from the emergence site, that is, the retrograde direction. Once the accumulation of flux is ended and the penumbra surrounds the umbra in all directions, the moat flows become more symmetric. At later times however, the number of ARs that host a spot decreases, leading to high background noise.