Statistical properties and correlation length in star-forming molecular clouds

E. Jaupart; G. Chabrier

doi:10.1051/0004-6361/202141084

A&A, 663 (2022) A113

Issue		A&A Volume 663, July 2022


Article Number		A113
Number of page(s)		20
Section		Interstellar and circumstellar matter
DOI		https://doi.org/10.1051/0004-6361/202141084
Published online		22 July 2022

I. Formalism and application to observations

E. Jaupart¹ and G. Chabrier¹,²

¹ École normale supérieure de Lyon, CRAL, Université de Lyon, UMR CNRS 5574, 69364 Lyon Cedex 07, France
e-mail: etienne.jaupart@ens-lyon.fr
² School of Physics, University of Exeter, Exeter, EX4 4QL, UK
e-mail: chabrier@ens-lyon.fr

Received: 14 April 2021
Accepted: 12 May 2022

Abstract

Observations of molecular clouds (MCs) show that their properties exhibit large fluctuations. The proper characterization of the general statistical behavior of these fluctuations, from a limited sample of observations or simulations, is of prime importance to understand the process of star formation. In this article, we use the ergodic theory for any random field of fluctuations, as commonly used in statistical physics, to derive rigorous statistical results. We outline how to evaluate the autocovariance function (ACF) and the characteristic correlation length of these fluctuations. We then apply this statistical approach to astrophysical systems characterized by a field of density fluctuations, notably star-forming clouds. When it is difficult to determine the correlation length from the empirical ACF, we show alternative ways to estimate the correlation length. Notably, we give a way to determine the correlation length of density fluctuations from the estimation of the variance of the volume and column-density fields. We show that the statistics of the column-density field is hampered by biases introduced by integration effects along the line of sight and we explain how to reduce these biases. The statistics of the probability density function (PDF) ergodic estimator also yields the derivation of the proper statistical error bars. We provide a method that can be used by observers and numerical simulation specialists to determine the latter. We show that they (i) cannot be derived from simple Poisson statistics and (ii) become increasingly large for increasing density contrasts, severely hampering the accuracy of the high end part of the PDF because of a sample size that is too small. As templates of various stages of star formation in MCs, we then examine the case of the Polaris and Orion B clouds in detail. 
We calculate, from the observations, the ACF and the correlation length in these clouds and show that the latter is on the order of 1% of the size of the cloud. This justifies the assumption of statistical homogeneity when studying the PDF of star-forming clouds. These calculations provide a rigorous framework for the analysis of the global properties of star-forming clouds from limited statistical observations of their density and surface properties.

Key words: methods: statistical / ISM: clouds / Oort Cloud / ISM: structure / ISM: kinematics and dynamics

© E. Jaupart and G. Chabrier 2022

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe-to-Open model. Subscribe to A&A to support open access publication.

1 Introduction

Observations of molecular clouds (MCs) show that their main properties (velocity and column-density) exhibit large fluctuations. These fluctuations are at the heart of the star formation process (Padoan & Nordlund 2002; Mac Low & Klessen 2004; Hennebelle & Chabrier 2008; Hopkins 2012), implying that knowledge of their statistical characteristics is of prime importance. The accurate determination of the statistics of any quantity must rely on either a large enough number of samples or a large enough sample, so that a natural question arises: can we derive accurate statistical properties of MCs from observations, and if so, how can we evaluate the level of accuracy? The relevance of a general statistical analysis of the global properties of MCs (e.g., mass, density PDF, temperature and velocity dispersion) deduced from observations and numerical simulations for studies of star formation processes must be assessed properly. For example, all of the theories that aim at determining the mass spectrum, that is, the initial mass function (IMF), or the star formation rate (SFR) in a molecular cloud rely on the assumption that a restricted number of observations or numerical simulations are representative of any MC with similar properties (see e.g., Hennebelle & Chabrier 2008; Hopkins 2012; Vázquez-Semadeni et al. 2019 and references therein). This key assumption must be tested.

Indeed, in studies of star formation based on observations or numerical simulations, one only has access to a small number of samples (and in reality only one most of the time). Therefore, in order to evaluate the statistics of the various stochastic fields of interest, one makes the basic assumption, sometimes called the “fair-sample hypothesis”, that the available sample is large enough for volumetric (or time) averages to be meaningful (see e.g., Peebles 1973 for a discussion in the context of cosmology). This assumption is only valid for stochastic fields that are statistically homogeneous and ergodic (Papoulis & Pillai 1965). Here, one should note that statistical homogeneity must not be confused with spatial homogeneity (we come back to this point below). The assumption of statistical homogeneity has been adopted by many authors, for example in studies of turbulent flows with or without self gravity (Chandrasekhar 1951a,b; Batchelor 1953; Pope 1985; Frisch 1995; Pan et al. 2018, 2019a,b; Jaupart & Chabrier 2020, 2021) and in cosmology for studies of the dynamics of structures in the Universe (Peebles 1973; Heinesen 2020). This assumption, however, provides no information on the magnitude of fluctuations around the average.

Quantifying (i) whether the “fair-sample hypothesis” is correct and, if so, (ii) what statistical error bars derive from it, is an important issue when addressing PDF determinations in star-forming clouds. This is related to the completeness of the observations, which we reformulate in this study in terms of statistical accuracy. In the context of the PDF of column-densities, Alves et al. (2017), for instance, made an attempt to illustrate this problem by introducing the concept of open and closed density contours. These authors suggest that complete observations, that is, observations considered as statistically significant and worth studying, correspond to closed contours. Although interesting, such an approach cannot be considered as a robust statistical determination of the bias and the statistical errors corresponding to incomplete observations of a cloud PDF. It is one of the main aims of this article to provide such a robust statistical analysis, using standard tools of random field theory and signal processing. Notably, one of the goals of the paper is to identify the statistical properties of the density field in a cloud inferred from column density data, and to derive procedures, based on the aforementioned tools, to accurately estimate these properties, whatever the PDF (lognormal or not). As such, our approach does not make any assumption about the initial driving mechanism of the random motions responsible for the PDF of a cloud, be it turbulence or gravity.

The proper and standard way of addressing the previous issue relies on ergodic theory. Ergodic theory allows one to circumvent the problem of dealing with a single sample and to derive a robust measure of the accuracy of field statistics derived from the available data. In the present context, it also enables us to assess and quantify the relevance of a statistical approach to the evolution of star-forming MCs. The key quantity is the correlation length, which is defined in terms of the integral of the auto-covariance function (see e.g., Papoulis & Pillai 1965). The fundamental result is that ergodic estimates are accurate if the dimensions of the sample, i.e. a whole cloud or part of it, are large enough compared to the correlation length. A proper determination of the correlation length in MCs is therefore of prime importance.

In pioneering works, Scalo (1984) and Kleiner & Dickman (1985) studied the correlations of centroid velocities of the ρ-Oph and Taurus complexes, respectively, and only found evidence of weak correlations at short scales, on the order of their resolution. Kleiner & Dickman (1984) studied the correlation of the column density field of the Taurus complex in search of a statistically significant length scale characterizing the separation of condensations within the complex, but did not perform an evaluation of the correlation length as defined above.

The objectives of this article are twofold. First, we examine the relevance and validity of a statistical approach based on ergodic theory to study the stochastic fields of star-forming MCs. Second, we seek to identify which statistical properties of the density field can be inferred from column density data and we derive procedures to obtain accurate estimates of these properties. The article is organized as follows. In Sect. 2, we outline the mathematical framework that yields the definition of the auto-covariance function (ACF) and correlation length of any statistical sample. In Sect. 3, we derive ways to determine the correlation length of any stochastic field without having to compute the ACF. In Sect. 4, we examine the case of astrophysical stochastic fields induced by compressible turbulent motions. In Sect. 5, we focus on the case of star-forming clouds and on the ways to infer the statistics of these fields from observations of column-densities. In Sect. 6, we apply our calculations to the typical star-forming cloud Polaris. We identify artifacts that are generated when one uses the statistical properties of the column-density field to infer those of the real density field, and we show how to reduce these biases. In particular, we derive a procedure to obtain proper error bars for the column density PDF. In Sect. 7, we examine the case of Orion B. Section 8 is devoted to the conclusion.

2 Methods: Mathematical Framework for a Statistical Approach

As mentioned in the introduction, a statistical approach to the properties of a cloud (or part of it) is valid if the cloud is large enough, compared to the correlation length of the quantity of interest, for the measured statistical quantities to be representative, with high confidence, of the genuine quantities. Ergodic theory, as commonly used in statistical physics or in the study of dynamical systems, tells us how to measure this confidence level and thus the relevance of a statistical approach. Indeed, ergodicity implies by definition that different observations and realizations of a given statistical quantity yield results comparable enough for each of them to be representative of the average real quantity. It is described in the next section.

2.1 Ergodic Theory

We rederive here some ergodic theorems that lead to the definition of the correlation length. Let us consider a (scalar) stochastic field X(y), which depends on a D-dimensional position vector y (D = 1, 2 or 3). For a specific and fixed y, X(y) is a random variable of which we want to accurately determine the statistics.

2.1.1 Frequency Interpretation and Repeated Trials

The usual way to estimate the statistical average or expectation $\mathbb{E}(X(y))$ of the random variable X(y) is to observe N samples X(y, ω_i), 1 ≤ i ≤ N, of X(y) and to build the unbiased estimator $\hat{X}_{y,N} = \frac{1}{N} \sum_{i=1}^{N} X(y, \omega_i),$ (1)

with variance $\mathrm{var}(\hat{X}_{y,N}) = \sigma(\hat{X}_{y,N})^2 = \frac{\mathrm{var}(X(y))}{N} = \frac{\sigma(X(y))^2}{N},$ (2)

where σ is the standard deviation (std). From the Bienaymé-Tchebychev inequality (Papoulis & Pillai 1965), we know that, for any real number m > 0, $\mathbb{P}\left(\left|\hat{X}_{y,N} - \mathbb{E}(X(y))\right| \le m\,\sigma(\hat{X}_{y,N})\right) \ge 1 - \frac{1}{m^2},$ (3)

where ℙ denotes the probability of a given event. We note that this inequality is valid for any random field, whether it is Gaussian or not. Although the Tchebychev inequality only gives a lower limit for the probability, it provides a confidence interval that measures the accuracy of the estimator given by Eq. (1). The larger the number of samples N, the smaller the std $\sigma(\hat{X}_{y,N})$ and the more accurate the estimate in Eq. (1).
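As an illustration of Eqs. (1) to (3), the following minimal sketch builds the estimator and its Tchebychev interval for a deliberately non-Gaussian sample. The function name `chebyshev_interval` and the lognormal test data are assumptions made for this example, and the unknown σ(X) is replaced by the sample standard deviation, which is an approximation.

```python
import numpy as np

def chebyshev_interval(samples, m=2.0):
    """Return (mean, half_width, min_coverage) for the estimator of Eq. (1)."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    mean = samples.mean()                          # Eq. (1)
    # Std of the estimator, Eq. (2); sigma(X) is approximated by the sample std.
    sigma_hat = samples.std(ddof=1) / np.sqrt(n)
    # Eq. (3): the interval mean +/- m * sigma_hat holds with probability >= 1 - 1/m^2.
    return mean, m * sigma_hat, 1.0 - 1.0 / m**2

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)  # non-Gaussian on purpose
mean, half_width, coverage = chebyshev_interval(x, m=2.0)
print(f"{mean:.3f} +/- {half_width:.3f} with probability >= {coverage:.2f}")
```

The bound is conservative: for a Gaussian estimator the same interval would hold with a probability close to 0.95 rather than 0.75.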

In case of statistical homogeneity for field X, the expectation $\mathbb{E}(X(y))$ and std σ(X(y)) are no longer functions of the position and one can drop the reference to y in Eqs. (2) and (3).

2.1.2 Ergodic Theorems, Autocovariance Function, Correlation Length

In the context of the study of MCs, one usually has only a single sample of X(y). As mentioned in the introduction, if one wants to be able to describe the stochastic fields at play, one assumes statistical homogeneity and builds the unbiased estimator $\hat{X}_L = \frac{1}{L^D} \int_{\Omega} X(y)\, \mathrm{d}y,$ (4)

where $\Omega = \left[-\frac{L}{2}, \frac{L}{2}\right]^D$ is a control volume of linear size L and volume L^D, which we want to be as large as possible ¹. The ergodic estimator $\hat{X}_L$ has a variance $\mathrm{var}(\hat{X}_L) = \frac{1}{L^D} \int_{[-L, L]^D} C_X(y) \prod_{k=1}^{D} \left(1 - \frac{|y_k|}{L}\right) \mathrm{d}y,$ (5)

where $C_X(y) = \mathbb{E}\left(X(y' + y) X(y')\right) - \mathbb{E}(X)^2$ is the autocovariance function (ACF) of X at a lag y. The stochastic field X is said to be mean ergodic if the estimator $\hat{X}_L$ converges toward $\mathbb{E}(X)$ as L → ∞ either in the mean square (MS) sense, meaning: $\mathbb{E}\left(\left|\hat{X}_L - \mathbb{E}(X)\right|^2\right) = \mathrm{var}(\hat{X}_L) \underset{L \to \infty}{\longrightarrow} 0,$ (6)

or in probability, meaning that, for every ϵ > 0, $\mathbb{P}\left(\left|\hat{X}_L - \mathbb{E}(X)\right| > \epsilon\right) \underset{L \to \infty}{\longrightarrow} 0.$ (7)

The Bienaymé-Tchebychev inequality (Eq. (3)) not only shows that if X is MS mean ergodic then $\hat{X}_L$ also converges in probability, but also provides a confidence interval for the estimate $\hat{X}_L$. Slutsky’s theorem allows one to write an equivalence for the ergodicity of X in a more convenient form: indeed, X is MS mean ergodic if and only if $\frac{1}{L^D} \int_{[-L, L]^D} C_X(y)\, \mathrm{d}y \underset{L \to \infty}{\longrightarrow} 0.$ (8)

From there we obtain two sufficient (physical) conditions for X to be mean ergodic. Either: $\int_{\mathbb{R}^D} C_X(y)\, \mathrm{d}y < \infty,$ (9)

or $C_X(y) \underset{|y| \to \infty}{\longrightarrow} 0.$ (10)

We assume both, and use the common definition of the correlation length l_c(X) of the field X as a function of the ACF (Papoulis & Pillai 1965): $(l_c(X))^D = \frac{1}{2^D C_X(0)} \int_{\mathbb{R}^D} C_X(y)\, \mathrm{d}y.$ (11)

This definition generalizes the usual definition for 1D fields: $l_c(X) = \frac{1}{C_X(0)} \int_{[0, +\infty)} C_X(y)\, \mathrm{d}y = \frac{1}{2 C_X(0)} \int_{\mathbb{R}} C_X(y)\, \mathrm{d}y.$ (12)

For l_c(X) ≪ L we then have from Eq. (5) $\mathrm{Var}(\hat{X}_L) \simeq \mathrm{Var}(X) \left(\frac{2\, l_c(X)}{L}\right)^D = \mathrm{Var}(X) \left(\frac{l_c(X)}{R}\right)^D,$ (13)

where R = L/2. If we compare Eq. (13) with Eq. (2), we see that instead of having the number of samples, N, we now have the ratio (R/l_c)^D, where R (or L) is usually an observationally accessible quantity. We can thus interpret the ratio (R/l_c)^D as an effective number of “independent” samples. This result is of prime importance in the analysis of fluctuations within any stochastic field.
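The interpretation of (R/l_c)^D as an effective number of samples can be turned into a two-line calculation. The sketch below uses hypothetical helper names and purely illustrative numbers (a 2D map with l_c equal to 2% of R and σ(X) = 1); none of these values come from the observations discussed in this article.

```python
def effective_sample_count(R, l_c, D):
    """Effective number of independent samples, (R / l_c)**D."""
    return (R / l_c) ** D

def std_of_volume_average(sigma_X, R, l_c, D):
    """Std of the ergodic estimator from Eq. (13), valid for l_c << L = 2R."""
    return sigma_X * (l_c / R) ** (D / 2)

# Illustrative numbers only.
n_eff = effective_sample_count(R=1.0, l_c=0.02, D=2)
err = std_of_volume_average(sigma_X=1.0, R=1.0, l_c=0.02, D=2)
print(n_eff, err)
```

With these numbers the map behaves like roughly 2500 independent draws, so the volume average has a standard deviation about 50 times smaller than σ(X).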

Furthermore, the correlation length is linked to the value of the power spectrum Ƥ_X(k) of X at k = 0. Indeed $(l_c(X))^D = \frac{1}{2^D C_X(0)} \int_{\mathbb{R}^D} C_X(y)\, \mathrm{d}y = \frac{1}{2^D C_X(0)} \mathcal{P}_X(0).$ (14)
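As a worked illustration of Eqs. (12) and (14), consider a 1D field with an exponential ACF of e-folding scale ℓ. This model is an assumption chosen for its simplicity, not a result of this article, and the power spectrum is taken with the convention $\mathcal{P}_X(k) = \int C_X(y)\, e^{-iky}\, \mathrm{d}y$, which is the normalization implied by Eq. (14).

```latex
C_X(y) = \sigma^2 e^{-|y|/\ell}
\quad\Longrightarrow\quad
\mathcal{P}_X(k) = \int_{\mathbb{R}} C_X(y)\, e^{-iky}\, \mathrm{d}y
                 = \frac{2 \sigma^2 \ell}{1 + k^2 \ell^2},
\qquad
l_c(X) = \frac{\mathcal{P}_X(0)}{2\, C_X(0)}
       = \frac{2 \sigma^2 \ell}{2 \sigma^2} = \ell .
```

In this case the correlation length coincides with the e-folding scale of the ACF.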

2.1.3 Ergodic Hypothesis and Ergodic Theory

The results of ergodic theory derived above enable us to define under which conditions volumetric averages correspond to statistical averages and to provide a confidence interval for the estimate $\hat{X}_L$ of the expectation of X. However, these results rely on the knowledge of the statistical properties of X and more precisely of its ACF, which is in general not known. To apply this theory to the study of a real field, such as the density field for example, one must use ergodicity as an assumption. The above results can then be used to test the validity of this assumption and the accuracy of the estimates that are derived from it in a self-consistent way.

2.2 Estimates of the Autocovariance Function and Correlation Length

As shown in the previous section, the knowledge of the ACF of X (or of the value of the power spectrum of X at k = 0) is of crucial importance to measure the relevance of a statistical approach in studies of the properties of large (astrophysical) systems. In practice, however, the ACF of X must be evaluated from data.

2.2.1 Reliability of the Estimators of the Auto-Covariance and the Power Spectrum

In most cases, data are drawn from a finite size sample so that the ACF is not reliable at large lag (large scales). To simplify the notation, we now introduce the variable $X_\mu = X - \mathbb{E}(X) = X - \mu$ and define the estimate, for a sample of size L, $\hat{C}_X^L(y) = \frac{1}{\prod_i (L - |y_i|)} \iiint_{-R + \frac{|y_i|}{2}}^{R - \frac{|y_i|}{2}} X_\mu\left(u - \frac{y}{2}\right) X_\mu\left(u + \frac{y}{2}\right) \mathrm{d}u$ (15) $= \frac{1}{\prod_i (L - |y_i|)} \iiint_{-L + |y_i|}^{L - |y_i|} X_\mu\left(\frac{u - y}{2}\right) X_\mu\left(\frac{u + y}{2}\right) \frac{\mathrm{d}u}{2^D}.$ (16)

This is an unbiased estimate of C_X(y) but its variance increases as $|y_i| \to L$ and eventually becomes very large due to poor sampling. We thus introduce the biased estimate $\hat{C}_{X,L}(y) = \frac{\prod_i (L - |y_i|)}{L^D}\, \hat{C}_X^L(y),$ (17)

which is still a good estimate at small scales compared to L and has a reduced variance. We note however that it is an unbiased estimator of the quantity entering the integral in Eq. (5). Finally, it is also the Fourier transform of the periodogram S_L, which is defined as: $S_L(k) = \frac{1}{L^D} \left| \int_{\left[-\frac{L}{2}, \frac{L}{2}\right]^D} X(y)\, e^{i k \cdot y}\, \mathrm{d}y \right|^2.$ (18)

It is the usual estimate of the power spectrum of X, Ƥ_X. It is, however, a biased estimator of the power spectrum Ƥ_X and is only unbiased asymptotically, in the limit L → ∞. Moreover, the variance of the estimator S_L does not vanish as L → ∞ (Papoulis & Pillai 1965), which makes it quite unreliable.
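The statement that the variance of the periodogram does not vanish with sample size can be checked numerically. The sketch below is an illustration under stated assumptions: Gaussian white noise in 1D, a discrete analogue of the periodogram, and a hypothetical helper `periodogram_scatter`. It measures the relative scatter std/mean of the periodogram at one fixed frequency index over many realizations.

```python
import numpy as np

def periodogram_scatter(n, n_real=1000, k_index=5, seed=0):
    """Relative scatter (std / mean) of the periodogram at one frequency, white noise."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_real, n))          # n_real independent samples of size n
    s = np.abs(np.fft.rfft(x, axis=1)) ** 2 / n   # discrete analogue of the periodogram
    s_k = s[:, k_index]
    return s_k.std() / s_k.mean()

# The relative scatter stays of order unity whether the sample is short or long.
r_small = periodogram_scatter(128)
r_large = periodogram_scatter(2048)
print(r_small, r_large)
```

Increasing the sample size refines the frequency resolution but does not reduce the scatter at a given frequency, which is why raw periodograms are usually smoothed or averaged before being interpreted.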

We thus see that, because of the finite size of the sample, one cannot obtain a reliable estimate of the ACF (or of the power spectrum) for all lag values. Furthermore, in many cases, the mean value of X is not known and is replaced in Eq. (16) by its estimate $\hat{X}_L = \hat{\mu}$, which introduces further, but reasonable, bias (see Papoulis & Pillai 1965 for a more complete discussion).
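For regularly sampled 1D data, the two ACF estimates discussed in this section can be sketched as follows. This is a discrete analogue written for illustration: the function name `acf_estimates` and the smoothed-noise test signal are assumptions, and the unknown mean is replaced by the sample mean as described above.

```python
import numpy as np

def acf_estimates(x):
    """Return (lags, unbiased, biased) ACF estimates of a 1D series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()                                  # mean replaced by its estimate
    # Lagged sums for lags 0..n-1 (index n-1 of the "full" output is lag 0).
    sums = np.correlate(xc, xc, mode="full")[n - 1:]
    lags = np.arange(n)
    unbiased = sums / (n - lags)   # divide by the number of overlapping pairs
    biased = sums / n              # equals (n - lag) / n times the unbiased estimate
    return lags, unbiased, biased

rng = np.random.default_rng(1)
# Correlated test signal: white noise smoothed by a 16-point moving average.
x = np.convolve(rng.standard_normal(512 + 15), np.ones(16) / 16, mode="valid")
lags, unbiased, biased = acf_estimates(x)
print(unbiased[:3], biased[:3])
```

The two estimates agree at small lags; at lags close to the sample size the unbiased version is built from only a few pairs and becomes noisy, while the biased version is tapered toward zero.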

2.2.2 Periodic Estimators

To get rid of the effect of finite sampling, one may perform simulations in periodic calculation boxes or may artificially add some periodicity to the available data to obtain the following estimate: $\hat{C}_{X,\mathrm{per}}(y) = \frac{1}{L^D} \int_{\left[-\frac{L}{2}, \frac{L}{2}\right]^D} X_{\hat{\mu}}(y + u)\, X_{\hat{\mu}}(u)\, \mathrm{d}u,$ (19)

where $X_{\hat{\mu}} = X - \hat{\mu} = X - \hat{X}_L$ and where one makes the identification $X_{\hat{\mu}}(y + nL) = X_{\hat{\mu}}(y)$. However, in such cases, the spatial average of the estimated ACF is necessarily 0. Indeed $\begin{array}{rcl} \int_{\left[-\frac{L}{2}, \frac{L}{2}\right]^D} \frac{\hat{C}_{X,\mathrm{per}}(y)}{L^D}\, \mathrm{d}y & = & \int_{\left(\left[-\frac{L}{2}, \frac{L}{2}\right]^D\right)^2} \frac{X_{\hat{\mu}}(y + u)\, X_{\hat{\mu}}(u)}{L^{2D}}\, \mathrm{d}u\, \mathrm{d}y \\ & = & \left(\int_{\left[-\frac{L}{2}, \frac{L}{2}\right]^D} \frac{X_{\hat{\mu}}(u)}{L^D}\, \mathrm{d}u\right)^2 = 0, \end{array}$ (20)

due to the assumption that $X_{\hat{\mu}}$ is periodic.

Therein lies a significant problem: as the correlation length is defined as an integral over all possible lags, it is not easy to evaluate the reliability of estimates that are obtained in this manner.
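The vanishing spatial average of the periodic estimate is easy to verify numerically. The sketch below computes a discrete analogue of Eq. (19) with FFTs on a 2D array; the function name `periodic_acf` and the lognormal test field are assumptions made for this illustration.

```python
import numpy as np

def periodic_acf(field):
    """Periodic ACF estimate of an N-dimensional array, computed with FFTs."""
    f = np.asarray(field, dtype=float)
    fc = f - f.mean()                        # subtract the estimated mean
    power = np.abs(np.fft.fftn(fc)) ** 2     # unnormalized periodogram
    # The inverse FFT of the power gives the circular lagged sums; divide by the
    # number of cells to obtain the average over positions.
    return np.fft.ifftn(power).real / fc.size

rng = np.random.default_rng(2)
field = rng.lognormal(size=(64, 64))
acf = periodic_acf(field)
print(acf[0, 0], field.var(), acf.mean())
```

The zero-lag value reproduces the sample variance, while the mean over all lags is zero to rounding error, as stated by Eq. (20). Integrating this estimate over all lags therefore cannot be used directly to obtain the correlation length.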

Therefore, one traditionally produces an estimate for l_c (or the integral scale l_i) in either of the following two ways. Either one searches for the lag at which the reduced ACF $\hat{C}_X / \mathrm{Var}(X)$ falls to e^−1 to obtain an estimate of the correlation length (see e.g. Kleiner & Dickman 1984, 1985), assuming some exponential envelope for the ACF. Or, if the ACF decays fast enough at scales larger than l_c(X), as is the case in turbulence (see previous section), the ACF is generally extrapolated with a decaying exponential in regions where it becomes non-monotonic (see e.g. Batchelor 1953; Reinke et al. 2016, 2018) so that one can perform the integral and give a reliable estimate of l_c(X) if l_c(X) ≪ L.
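The first of these two approaches can be sketched in a few lines. The helper below is hypothetical and assumes a 1D ACF sampled on increasing lags starting at zero; it locates the first crossing of the reduced ACF with e^−1 by linear interpolation. It is tested here on an analytic exponential ACF, for which the e-folding lag and the correlation length coincide.

```python
import numpy as np

def e_folding_length(lags, acf):
    """Lag at which the reduced ACF, acf / acf[0], first drops below 1/e."""
    lags = np.asarray(lags, dtype=float)
    acf = np.asarray(acf, dtype=float)
    reduced = acf / acf[0]
    target = np.exp(-1.0)
    below = np.nonzero(reduced < target)[0]
    if below.size == 0:
        raise ValueError("reduced ACF never drops below 1/e on the sampled lags")
    j = below[0]                   # j >= 1 because reduced[0] == 1
    y0, y1 = reduced[j - 1], reduced[j]
    t = (y0 - target) / (y0 - y1)  # linear interpolation between the two lags
    return lags[j - 1] + t * (lags[j] - lags[j - 1])

lags = np.arange(200)
acf = 3.0 * np.exp(-lags / 12.5)   # analytic exponential ACF, e-folding scale 12.5
l_est = e_folding_length(lags, acf)
print(l_est)
```

For an ACF that is not close to exponential, the e-folding lag and the integral definition of Eq. (11) can differ, so the value returned here should be read as an estimate tied to the assumed envelope.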

3 Fluctuations and Estimation of the Correlation Length

3.1 Expected Fluctuations in Repeated Trials

Be it for (numerical) experiments that can be repeated several times or for a statistically homogeneous and stationary field, volume averaged quantities fluctuate around their true expectations. In the former case, the volume averaged quantities fluctuate between the different samples while in the latter case they fluctuate in time. As we show in the following, these fluctuations depend on the ratio (l_c/R). By studying these, we thus aim to obtain an accurate estimate of (l_c/R), without having to calculate the ACF.

We consider here the case where one can reproduce several times the same experiment, as can be done for instance with numerical experiments or as can be approximated for clouds with similar conditions. We wish to determine the expected amplitudes of fluctuations of volume averaged quantities between samples.

To each experiment i of the N trials corresponds a value of the estimate $\hat{X}_{L,i}$ defined by Eq. (4). From the Bienaymé-Tchebychev inequality, we know that $\hat{X}_{L,i}$ lies around the true expectation $\mathbb{E}(X)$ within a distance such that, in probability, $\mathbb{P}\left(\left|\hat{X}_{L,i} - \mathbb{E}(X)\right| \le m\, \sigma(X) \left(\frac{l_c(X)}{R}\right)^{D/2}\right) \ge 1 - \frac{1}{m^2}.$ (21)

The average over the N trials (or sample average) $\hat{X}_L^{(N)} = \frac{1}{N} \sum_{i=1}^{N} \hat{X}_{L,i}$ (22)

is obviously a better estimate of $\mathbb{E}(X)$ as $\mathbb{P}\left(\left|\hat{X}_L^{(N)} - \mathbb{E}(X)\right| \le m\, \frac{\sigma(X)}{\sqrt{N}} \left(\frac{l_c(X)}{R}\right)^{D/2}\right) \ge 1 - \frac{1}{m^2}.$ (23)

We then expect the $\hat{X}_{L,i}$ to fluctuate around the sample average $\hat{X}_L^{(N)}$ with variance $\mathrm{Var}\left(\hat{X}_{L,i} - \hat{X}_L^{(N)}\right) = \sigma(X)^2 \left(\frac{l_c(X)}{R}\right)^D \left(1 - \frac{1}{N}\right),$ (24)

such that $\mathbb{P}\left(\left|\hat{X}_{L,i} - \hat{X}_L^{(N)}\right| \le m\, \sigma(X) \left(\frac{l_c(X)}{R}\right)^{D/2} \left(1 - \frac{1}{N}\right)^{1/2}\right) \ge 1 - \frac{1}{m^2}.$ (25)

If l_c(X) is known, Eqs. (21), (24) and (25) allow one to give statistical error bars. Conversely, if l_c(X) is not known, these equations give information on the product σ(X)(l_c(X)/R)^(D/2), which can be obtained by performing the same statistical experiment several times and studying the dispersion of the $\hat{X}_{L,i}$ around $\hat{X}_L^{(N)}$.
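The factor (1 − 1/N) in Eq. (24) follows from a short calculation, sketched here under the assumption that the N trials are mutually independent and share the same variance $s^2 = \sigma(X)^2 (l_c(X)/R)^D$ (the shorthand $s^2$ is introduced only for this derivation).

```latex
\hat{X}_{L,i} - \hat{X}_L^{(N)}
  = \left(1 - \frac{1}{N}\right) \hat{X}_{L,i} - \frac{1}{N} \sum_{j \neq i} \hat{X}_{L,j},
\qquad
\mathrm{Var}\left(\hat{X}_{L,i} - \hat{X}_L^{(N)}\right)
  = \left(1 - \frac{1}{N}\right)^2 s^2 + \frac{N - 1}{N^2}\, s^2
  = s^2\, \frac{N - 1}{N} \left(\frac{N - 1}{N} + \frac{1}{N}\right)
  = s^2 \left(1 - \frac{1}{N}\right).
```

The reduction with respect to $s^2$ reflects the fact that each trial contributes to the sample average it is compared with.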

Indeed, the half length l_50% of the segment centered on $\hat{X}_L^{(N)}$ within which 50% of the estimates $\hat{X}_{L,i}$ fall verifies² $\sqrt{2}\, \sigma(X) \left(\frac{l_c(X)}{R}\right)^{D/2} \left(1 - \frac{1}{N}\right)^{1/2} \lesssim l_{50\%}.$ (26)

This gives a quick and easy qualitative determination of σ(X)(l_c(X)/R)^(D/2). More quantitatively, the empirical variance of the sample of the N trials $\mathrm{Var}^{(N)} = \frac{1}{N - 1} \sum_{i=1}^{N} \left(\hat{X}_{L,i} - \hat{X}_L^{(N)}\right)^2,$ (27)

is an unbiased estimator of $\mathrm{Var}(\hat{X}_{L,i}) = \sigma(X)^2 \left(l_c(X)/R\right)^D.$ (28)

Computing the variance Var^(N) thus yields an easy and rigorous method to determine (l_c(X)/R)^D and the correct error bars for statistical experiments.
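The procedure of this section can be tested on a synthetic 1D field whose correlation length is known. The sketch below is an illustration under stated assumptions: the field is a stationary AR(1) series with coefficient `phi`, for which the discrete analogue of Eq. (12) gives l_c = (1 + phi) / (2 (1 − phi)) in units of the sampling step, and σ(X)² is estimated from the pooled data. The helper name `ar1_trials` and all numerical values are choices made for this example.

```python
import numpy as np

def ar1_trials(n_trials, length, phi, rng):
    """n_trials independent stationary AR(1) series with unit-variance innovations."""
    x = np.empty((n_trials, length))
    x[:, 0] = rng.standard_normal(n_trials) / np.sqrt(1.0 - phi**2)  # stationary start
    eps = rng.standard_normal((n_trials, length))
    for t in range(1, length):
        x[:, t] = phi * x[:, t - 1] + eps[:, t]
    return x

phi, L, N = 0.9, 2000, 1000
rng = np.random.default_rng(3)
x = ar1_trials(N, L, phi, rng)

x_hat = x.mean(axis=1)        # one volume average per trial, Eq. (4)
var_N = x_hat.var(ddof=1)     # empirical variance over the trials, Eq. (27)
sigma2 = x.var()              # estimate of sigma(X)^2 from the pooled data

lc_over_R_est = var_N / sigma2             # Eq. (28) with D = 1
lc_true = 0.5 * (1 + phi) / (1 - phi)      # discrete analogue of Eq. (12) for AR(1)
lc_over_R_true = lc_true / (L / 2)
print(lc_over_R_est, lc_over_R_true)
```

The recovered ratio agrees with the known value to within the sampling error of the variance estimate, which shrinks as the number of trials N grows.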

3.2 Expected Temporal Fluctuations for a Statistically Homogeneous and Stationary Field

We consider here the case of a statistically homogeneous and stationary field (i.e., whose statistical properties are invariant under space and time translations). This can for example describe a steady turbulent flow (e.g. as simulated in a periodic box) without gravity.

As before, to each time t corresponds a value of the estimate ${\hat{X}}_{L} (t)$ ${\hat X_L}\left( t \right)$ defined by Eq. (4) which lies around the true expectation $E (X)$ ${\mathbb{E}}\left( X \right)$ within a distance such that, in probability, $ℙ (| {\hat{X}}_{L} (t) - E (X) | \leq m σ (X) {(\frac{l_{c} (X)}{R})}^{D / 2}) \geq 1 - \frac{1}{m^{2}} .$ $\left( {\left| {{{\hat X}_L}\left( t \right) - \left( X \right)} \right| \le m\sigma \left( X \right){{\left( {{{{l_c}\left( X \right)} \over R}} \right)}^{D/2}}} \right) \ge 1 - {1 \over {{m^2}}}.$ (29)

The time average over a large timescale T ${\hat{X}}_{L, T} = \frac{1}{T} \int_{t o}^{t o + T} {\hat{X}}_{L} (t) d t,$ ${\hat X_{L,T}}\, = \,{1 \over T}\int_{to}^{to + T} {{{\hat X}_L}\left( t \right){\rm{d}}t,}$ (30)

where t₀ is a time at which the steady state is reached, is a better estimate of $E (X)$ ${\mathbb{E}}\left( X \right)$ as it has a variance $var ({\hat{X}}_{L, T}) ≃ σ {(\hat{X})}^{2} {(\frac{l_{c} (X)}{R})}^{D} \frac{τ_{c (X_{L})}}{2 T},$ ${\mathop{\rm var}} \left( {{{\hat X}_{L,T}}} \right) \simeq \sigma {\left( {\hat X} \right)^2}{\left( {{{{l_c}\left( X \right)} \over R}} \right)^D}{{{\tau _{c\left( {{X_L}} \right)}}} \over {2T}},$ (31)

where τ_c(X_L) ≪ T is the correlation time. Then, the signal $\hat{X}_{L}(t)$ will fluctuate around $\hat{X}_{L,T}$ with an empirical (temporal) variance ${\rm var}_{T} = \frac{1}{T} \int_{t_{0}}^{t_{0}+T} \left( \hat{X}_{L}(t) - \hat{X}_{L,T} \right)^{2} {\rm d}t$ (32)

which, provided that τ_c(X_L) ≪ T, is an accurate estimate of ${\rm var}\left( \hat{X}_{L}(t) \right) = \sigma(X)^{2} \left( l_{c}(X)/R \right)^{D}.$ (33)

Hence, computing the variance Var_T yields an easy and robust estimate of (l_c(X)/R)^D and the correct error bars for a statistically stationary field.
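The time average of Eq. (30) and the temporal variance of Eq. (32) can be evaluated from a sampled signal as in the sketch below. It assumes a trapezoidal quadrature over possibly non-uniform output times; the function names and the toy signal are hypothetical.

```python
def time_average(times, values):
    """Trapezoidal estimate of (1/T) * integral of values dt over [t0, t0 + T] (Eq. 30)."""
    total = 0.0
    for i in range(len(times) - 1):
        total += 0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
    return total / (times[-1] - times[0])

def temporal_variance(times, values):
    """Trapezoidal estimate of var_T (Eq. 32), taken about the time average."""
    mean_t = time_average(times, values)
    squared = [(v - mean_t) ** 2 for v in values]
    return time_average(times, squared)

def correlation_ratio_pow_d(times, values, sigma_x):
    """(l_c/R)^D from Eq. (33), with sigma(X) assumed known."""
    return temporal_variance(times, values) / sigma_x ** 2

# Toy signal sampled after the steady state has been reached.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [1.0, 3.0, 1.0, 3.0, 1.0]
print(time_average(times, signal), temporal_variance(times, signal))
```

The estimate is only meaningful when the record length T covers many correlation times, as required by the condition τ_c(X_L) ≪ T.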

3.3 Fluctuations of Integrated Fields Over a Column

The previous cases work for experiments that can be repeated or for statistically stationary fields. However, in some situations, the two conditions cannot be fulfilled either because it is impossible to reproduce the experiment a large number of times or because the fields are not stationary (for instance in the presence of gravity).

In that case an estimate of the ratio l_c/R can be obtained if one has access to the integral of the field X over a column of fixed length L = 2R: $\Sigma_{X}(\mathbf{r}) = \int_{[0,L]} X(\mathbf{r}, z)\,{\rm d}z,$ (34)

where r is a vector of dimension D − 1 (typically 2). The column must have a constant length to avoid creating spurious biases (see Sect. 5).

As ∑_X/L corresponds to averaging X along one direction, we expect its fluctuations to be reduced compared with those of X. The longer the length L of the column, the smaller the fluctuations of ∑_X/L are expected to be. More quantitatively, the variance of ∑_X/L is ${\rm var}\left( \frac{\Sigma_{X}}{L} \right) = \frac{{\rm var}(\Sigma_{X})}{L^{2}} = \frac{\mathbb{E}\left( \Sigma_{X}(\mathbf{r})\, \Sigma_{X}(\mathbf{r}) \right)}{L^{2}}$ (35) $= \frac{1}{L^{2}} \int_{[0,L]^{2}} C_{X}(\mathbf{0}, z - z')\,{\rm d}z\,{\rm d}z'$ (36) $= \frac{1}{L} \int_{[-L,L]} C_{X}(\mathbf{0}, u) \left( 1 - \frac{|u|}{L} \right) {\rm d}u.$ (37)

A similar equation for the centroid velocities was given by Scalo (1984). Now, if the ACF of X is isotropic at short lags and l_c(X) ≪ L, one can make the approximation ${\rm var}\left( \frac{\Sigma_{X}}{L} \right) \simeq \frac{1}{L} \int_{[-L,L]} C_{X}(|u|)\,{\rm d}u,$ (38) $\simeq {\rm var}(X)\, \frac{2\, l_{i}(X)}{L},$ (39)

where l_i(X) is the integral scale (Batchelor 1953) which, in most cases, satisfies l_i(X) ≃ l_c(X) (see Jaupart & Chabrier 2021). Thus, since L = 2R, a quick and easy estimate of the ratio l_c(X)/R is given by $\frac{{\rm Var}(\Sigma_{X})}{L^{2}\, {\rm var}(X)} \simeq \frac{l_{c}(X)}{R}.$ (40)

This method was applied to the density field (X = ρ) by Jaupart & Chabrier (2021). The above estimate (Eq. (40)) was shown to produce the trends predicted analytically and thus to be a good approximation of the actual ratio l_c(ρ)/R.
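To get a feeling for the accuracy of the short-correlation approximation, the sketch below compares the exact reduced variance of Eq. (37) with the estimate 2 l_i/L of Eq. (39). It assumes, purely for illustration, an exponential ACF C_X(u) = var(X) exp(−|u|/λ), for which l_i = λ; the function names and parameter values are hypothetical.

```python
import math

def reduced_column_variance(lam, L, n=20000):
    """Midpoint-rule evaluation of Eq. (37) divided by var(X),
    assuming C_X(u) = var(X) * exp(-|u| / lam)."""
    du = 2.0 * L / n
    total = 0.0
    for k in range(n):
        u = -L + (k + 0.5) * du
        total += math.exp(-abs(u) / lam) * (1.0 - abs(u) / L) * du
    return total / L

def short_correlation_approx(lam, L):
    """Eq. (39): 2 * l_i / L, with l_i = lam for an exponential ACF."""
    return 2.0 * lam / L

L = 1.0
for lam in (0.01, 0.05, 0.2):
    exact = reduced_column_variance(lam, L)
    approx = short_correlation_approx(lam, L)
    print(f"lam/L = {lam:.2f}: Eq. (37) -> {exact:.4f}, Eq. (39) -> {approx:.4f}")
```

For this particular ACF the integral has the closed form (2λ/L)[1 − (λ/L)(1 − e^(−L/λ))], so the relative error of Eq. (39) is of order λ/L and the approximation slightly overestimates the variance.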

4 Application to Astrophysical Fields

The general results derived in Sect. 2 can be applied to many physical and astrophysical systems. They have been used extensively in cosmology but have been somewhat overlooked in studies of star formation. Today, it is generally accepted that star formation is triggered by density fluctuations generated by compressible turbulence injected at a large scale in MCs (see e.g. McKee & Ostriker 2007 and references therein). In this context, the density field ρ (or the logarithmic density field $s = \ln\left( \rho / \mathbb{E}(\rho) \right)$) is of prime interest and its cumulative distribution function (CMF) and probability density function (PDF) must be determined accurately.

Each of these statistical quantities is associated with a stochastic field X to which the results of Sect. 2 can be applied. For instance, the CMF F_ρ(ρ₀) at ρ₀ is linked to the stochastic field h_ρ0(y) = Θ(ρ₀ − ρ(y)) (where Θ is the Heaviside function), as $F_{\rho}(\rho_{0}) = \mathbb{E}\left( h_{\rho_{0}}(\mathbf{y}) \right) = \mathbb{E}\left( h_{\rho_{0}} \right),$ (41)

while the PDF f_ρ(ρ₀) is given by $f_{\rho}(\rho_{0}) = \mathbb{E}\left( \delta_{\rho_{0}}(\mathbf{y}) \right) = \mathbb{E}\left( \delta_{\rho_{0}} \right)$, where δ_ρ0(y) = δ(ρ₀ − ρ(y)) is Dirac's distribution. In practice, the PDF is usually deduced from histograms with some bin size ∆ρ such that $f_{\rho}(\rho_{0})\, \Delta\rho \simeq F_{\rho}(\rho_{0} + \Delta\rho) - F_{\rho}(\rho_{0}) = \mathbb{E}\left( h_{\rho_{0} + \Delta\rho} - h_{\rho_{0}} \right).$ (42)
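The link between the CMF, the indicator field h and the histogram estimate of the PDF in Eq. (42) can be sketched as follows. The spatial average over a list of sampled density values stands in for the expectation, the convention Θ(0) = 1 is an assumption made here, and the function names and sample values are hypothetical.

```python
def empirical_cmf(samples, rho0):
    """Spatial-average estimate of F(rho0) = E[Theta(rho0 - rho)], with Theta(0) = 1."""
    return sum(1 for rho in samples if rho <= rho0) / len(samples)

def histogram_pdf(samples, rho0, delta_rho):
    """Eq. (42): f(rho0) ~ [F(rho0 + delta_rho) - F(rho0)] / delta_rho."""
    upper = empirical_cmf(samples, rho0 + delta_rho)
    lower = empirical_cmf(samples, rho0)
    return (upper - lower) / delta_rho

# Hypothetical density values sampled on a grid.
samples = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
print(empirical_cmf(samples, 2.0))
print(histogram_pdf(samples, 2.0, 1.0))
```

Each bin value is itself the average of a stochastic field (the difference of two indicator fields), which is why its error bar depends on the ACF of that field and not on simple counting statistics.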

In principle, knowledge of the ACF of all these fields is required to establish the accuracy of the estimations. Fortunately, it can be shown that in some situations, with a few simplifying assumptions, one can proceed with the ACF of ρ only. This is explained in detail in Appendix B.

4.1 Exact Results Regarding the Properties of the Auto-Covariance Function (ACF) of ρ

For a statistically homogeneous field, the ACF of ρ, the density field, can be expressed in terms of the second order structure function: $S_{\rho}^{(2)}(\mathbf{y}) = \mathbb{E}\left( \left\{ \rho(\mathbf{u} + \mathbf{y}) - \rho(\mathbf{u}) \right\}^{2} \right),$ (43)

as $S_{\rho}^{(2)}(\mathbf{y}) = 2\left( C_{\rho}(\mathbf{0}) - C_{\rho}(\mathbf{y}) \right)$. A similar statement can be made for the logarithmic density field $s = \ln\left( \rho / \mathbb{E}(\rho) \right)$. This helps us to grasp some key features of the ACF. At very small scales (below the viscous scale), the density field is assumed to be differentiable and hence C_ρ must possess second derivatives at y = 0. Then, due to the parity of the ACF and because it is maximal at y = 0, its gradient must exist and be equal to 0 at y = 0.
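The relation between the second order structure function and the ACF can be checked numerically. The sketch below assumes a one-dimensional periodic field, for which spatial averages over all points reproduce the identity exactly; the function names and the toy field are hypothetical.

```python
def acf_periodic(field, lag):
    """Empirical ACF C(lag) of a 1D periodic field, using spatial averages."""
    n = len(field)
    mean = sum(field) / n
    return sum((field[i] - mean) * (field[(i + lag) % n] - mean) for i in range(n)) / n

def structure_function_periodic(field, lag):
    """Empirical second order structure function S2(lag) of a 1D periodic field (Eq. 43)."""
    n = len(field)
    return sum((field[(i + lag) % n] - field[i]) ** 2 for i in range(n)) / n

field = [1.0, 3.0, 2.0, 5.0, 4.0, 3.0]
for lag in range(len(field)):
    s2 = structure_function_periodic(field, lag)
    c0, c = acf_periodic(field, 0), acf_periodic(field, lag)
    print(lag, s2, 2.0 * (c0 - c))
```

On a non-periodic map the two estimators use different numbers of pairs at each lag, so the identity then holds only approximately.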

Furthermore, Jaupart & Chabrier (2021), generalizing the work of Chandrasekhar (1951a), show that for a statistically homogeneous density field, the quantity $\mathbb{E}(\rho)\, {\rm var}(e^{s})\, l_{c}(\rho)^{3} = \frac{{\rm var}(\rho)}{\mathbb{E}(\rho)}\, l_{c}(\rho)^{3}$ (44)

is an invariant of the dynamics.

4.2 Phenomenology of (Compressible) Turbulence

The phenomenology of compressible turbulence (Kritsuk et al. 2007) can be derived, with some adjustments, from that of incompressible turbulence (Frisch 1995). Thus, we use the latter to derive some expected features of the density ACF in star-forming MCs that can be described by such phenomenology.

In isotropic turbulence, the second order structure function is observed to be a monotonically increasing function of separation distance, at least in the inertial range, and to converge rapidly toward 2Var(ρ) at scales that are larger than the integral scale l_i. This integral scale (not to be confused with the injection scale of turbulence, see Sect. 4.3) is defined in the same manner as the correlation length (Batchelor 1953) $l_{i} = \frac{1}{C(0)} \int_{0}^{\infty} C(r)\,{\rm d}r.$ (45)

In many situations, l_c ~ l_i, as shown by Jaupart & Chabrier (2021). Thus, at small scales (short lags) and in the inertial range, the ACF must be a monotonically decreasing function. Above the inertial range, it is often assumed that the structure function and the ACF are still monotonic and the ACF is usually approximated by a decaying exponential, even though density fluctuations are likely to generate oscillations of the observed and estimated ACF as it tends to zero (Batchelor 1953; Reinke et al. 2016, 2018).
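As a small worked example of the definition in Eq. (45), the sketch below integrates a tabulated ACF with the trapezoidal rule, truncating the integral at the last tabulated lag. The exponential test ACF with decay length `lam`, for which l_i = λ, is an assumption chosen to match the usual approximation mentioned above; all names are hypothetical.

```python
import math

def integral_scale(acf, dr):
    """Trapezoidal estimate of l_i = (1/C(0)) * integral of C(r) dr (Eq. 45),
    truncated at the last tabulated lag."""
    area = sum(0.5 * (acf[i] + acf[i + 1]) * dr for i in range(len(acf) - 1))
    return area / acf[0]

lam, dr = 0.05, 0.001
# Exponential ACF with variance 2.0, tabulated up to r = 10 * lam.
acf = [2.0 * math.exp(-i * dr / lam) for i in range(501)]
print(integral_scale(acf, dr))
```

Truncating at a few decay lengths changes the result very little when the ACF decays this fast, whereas an oscillating or noisy tail would make the estimate sensitive to the cutoff.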

In compressible isothermal and stationary turbulence, the density field ρ is found to be approximately lognormal (Kritsuk et al. 2007; Federrath et al. 2010), implying that the logarithmic density field $s = \ln\left( \rho / \mathbb{E}(\rho) \right)$ is Gaussian with variance $\sigma(s)^{2} = \ln\left( 1 + (b\mathcal{M})^{2} \right)$. In such Gaussian conditions, the ACFs of ρ and s are linked by the following equation: $C_{\rho}(\mathbf{y}) = \mathbb{E}(\rho)^{2} \left( e^{C_{s}(\mathbf{y})} - 1 \right).$ (46)

As a consequence, if C_ρ (or C_s) is monotonically decaying toward 0, we deduce that: $\left( \frac{\sigma(s)^{2}}{e^{\sigma(s)^{2}} - 1} \right)^{1/3} l_{c}(s) \le l_{c}(\rho) \le l_{c}(s),$ (47)

where we have used the following inequalities: e^(ax) − 1 ≤ x(e^a − 1) for 0 ≤ x ≤ 1 and ax ≤ e^(ax) − 1 ∀x. For typical star-forming conditions, σ(s)² ≲ 4, implying that: $0.4\, l_{c}(s) \lesssim l_{c}(\rho) \le l_{c}(s),$ (48)

or, $l_{c}(s) \sim l_{c}(\rho).$ (49)

This shows that under Gaussian conditions, for the two lengths l_c(s) and l_c(ρ), knowledge of one of them is sufficient to characterize the other one within an order of magnitude.
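The bounds of Eq. (47) can be illustrated numerically. The sketch below assumes an isotropic exponential ACF for s, C_s(r) = σ(s)² exp(−r/λ), maps it to C_ρ through Eq. (46), and takes l_c³ to be proportional to the volume integral of the reduced ACF (so that common prefactors cancel in the ratio). These choices and all names are assumptions made for the example.

```python
import math

def lower_bound_factor(sigma2):
    """Prefactor (sigma^2 / (exp(sigma^2) - 1))^(1/3) appearing in Eq. (47)."""
    return (sigma2 / math.expm1(sigma2)) ** (1.0 / 3.0)

def length_ratio_exponential(sigma2, lam=1.0, r_max=40.0, n=40000):
    """l_c(rho) / l_c(s) for C_s(r) = sigma^2 * exp(-r / lam), using Eq. (46)."""
    dr = r_max / n
    num = 0.0  # integral of [C_rho(r) / C_rho(0)] * r^2 dr
    den = 0.0  # integral of [C_s(r) / C_s(0)] * r^2 dr
    for k in range(n):
        r = (k + 0.5) * dr
        x = math.exp(-r / lam)  # C_s(r) / C_s(0)
        num += math.expm1(sigma2 * x) / math.expm1(sigma2) * r * r * dr
        den += x * r * r * dr
    return (num / den) ** (1.0 / 3.0)

for sigma2 in (0.5, 2.0, 4.0):
    print(sigma2, lower_bound_factor(sigma2), length_ratio_exponential(sigma2))
```

For σ(s)² = 4 the prefactor is about 0.42, which is the origin of the factor 0.4 in Eq. (48); the ratio computed for the exponential ACF always falls between this prefactor and 1.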

4.3 Large Injection Scale but Small Correlation Length

Density and velocity fluctuations in MCs are thought to originate from turbulent motions driven at large scale (McKee & Ostriker 2007; Brunt et al. 2009), i.e. at scale comparable to the cloud scale L. This means that the energy of these turbulent motions is injected at an injection scale l_inj ~ L, below which the turbulent cascade eventually occurs.

The injection scale l_inj, however, is not the correlation length l_c of either the velocity, kinetic energy or density fields. Otherwise, if l_inj = l_c, every estimate produced from volumetric averages of these fields would be inaccurate and far from the actual statistical values. This would result in large fluctuations of these averaged quantities either between different simulations (samples) or at different times for steady turbulent flows (see Sect. 3). What is instead observed in numerical simulations of compressible and steady turbulent flows is that volume-averaged quantities of these fields display rather small fluctuations around their mean values. This is the case, for example, for the rms Mach number $\mathcal{M} = \upsilon_{\rm rms}/c_{s}$, where υ_rms is the square root of the volume-averaged squared velocity υ² and c_s the sound speed. To be more explicit, $\mathcal{M}^{2} = \frac{1}{L^{3}} \int_{[-L/2,\,L/2]^{3}} \frac{\mathbf{v}^{2}(\mathbf{y})}{c_{s}^{2}}\,{\rm d}\mathbf{y},$ (50)

so the result of Sect. 3 can be applied with $X = \upsilon^{2}/c_{s}^{2}$ and $\mathcal{M}^{2} = \hat{X}_{L}$.

Federrath (2013) presents a series of numerical simulations of isothermal compressible turbulence driven to $\mathcal{M} \simeq 17$ with an injection scale l_inj = L/2 = R. Once a statistical steady state is reached, the volume-averaged Mach number $\mathcal{M}$ (or $\mathcal{M}^{2}$, which is a measure of the volume-averaged specific kinetic energy) displays fluctuations that are rather small compared to its average value (their Fig. 1). If we had l_c(υ²) = l_inj = R, fluctuations of the signal $\mathcal{M}^{2}(t)$ would yield a temporal variance (Eq. (32)): ${\rm Var}_{T} \simeq \sigma^{2}(\mathbf{v}^{2}) \left( \frac{l_{c}(\mathbf{v}^{2})}{R} \right)^{3} = \sigma^{2}(\mathbf{v}^{2}),$ (51)

from Eq. (33). Since the statistics of υ are close to being Gaussian (their Fig. A1), this would imply ${\rm Var}_{T} \simeq 2 \mathcal{M}_{T}^{4},$ (52)

where $\mathcal{M}_{T}$ is the average of the signal $\mathcal{M}(t)$ over a time T. This would thus yield large fluctuations incompatible with their Fig. 1. The actual temporal variance Var_T of the signal $\mathcal{M}(t)^{2}$ in Federrath (2013) rather yields a ratio $\frac{l_{c}(\mathbf{v}^{2})}{R} = \frac{l_{c}(\mathbf{v}^{2})}{l_{\rm inj}} \simeq 0.1,$ (53)

which shows that l_c(υ²) ≪ l_inj = R.
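The reasoning above can be turned into a small estimator. The sketch below reads Eq. (51) as Var_T ≃ σ²(v²)(l_c(v²)/R)³ and uses the Gaussian closure of Eq. (52), σ²(v²) ≃ 2ℳ_T⁴, so that (l_c/R)³ ≃ Var_T/(2ℳ_T⁴). It assumes uniformly sampled values of ℳ²(t) and approximates ℳ_T⁴ by the square of the time-averaged ℳ²; the function name and the synthetic series are hypothetical and are not data from Federrath (2013).

```python
def correlation_ratio_from_mach_series(mach_squared):
    """Estimate l_c(v^2)/R from a uniformly sampled series of M^2(t),
    using Var_T ~ 2 * M_T^4 * (l_c/R)^3 (Eqs. 51-52 combined)."""
    n = len(mach_squared)
    mean_m2 = sum(mach_squared) / n
    var_t = sum((m2 - mean_m2) ** 2 for m2 in mach_squared) / n
    return (var_t / (2.0 * mean_m2 ** 2)) ** (1.0 / 3.0)

# Synthetic series: M ~ 17 with +/-5% excursions of M^2 (illustrative only).
series = [289.0 * 1.05, 289.0 * 0.95] * 50
print(correlation_ratio_from_mach_series(series))
```

With relative fluctuations of ℳ² of a few percent the estimator returns a ratio of order 0.1, which shows how small temporal fluctuations translate into a correlation length well below the injection scale.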

Furthermore, Jaupart & Chabrier (2021) used the estimate produced by Eq. (40) to compute the correlation length of the density field ρ. They found that, within a factor of order unity, $l_{c}(\rho) \simeq \lambda_{s} = L/\mathcal{M}^{2} \ll L = 2\, l_{\rm inj},$ (54)

where λ_s is the sonic length which is found to be close to the average width of filamentary structures in isothermal turbulence (Federrath 2016).

The above results show that a large injection scale does not imply a large correlation length and that, on the contrary, correlation lengths in star-forming MCs are small compared with the injection scale (see above and Jaupart & Chabrier 2021).

4.4 Practical Assumptions Regarding the ACF

From the above results, we thus assume that the ACF decays rapidly at scales larger than the correlation length l_c (l_c ~ l_i, the integral scale) and then that the defining integral Eq. (11) can be calculated only up to a few l_c. Moreover, we assume that the ACF can be bounded by a decaying exponential exp(−|y|/λ), where λ ~ l_c above and in the inertial range to allow the computation of the correlation length (we note that such an exponential behavior is prohibited at very small scales due to the differentiability of ρ).

5 Star-Forming Clouds. Column Densities as Tracers of the Underlying Density Field

We now turn to observations of star-forming molecular clouds. Measurements provide values of the column-density ∑(x, y), which is the integral of density along the line of sight (l.o.s.(x, y)): $\begin{aligned} \Sigma(x, y) &= \int_{{\rm l.o.s.}(x,y)} \rho(x, y, z)\,{\rm d}z \\ &= \mathbb{E}(\rho)\, l(x, y) + \int_{{\rm l.o.s.}(x,y)} \delta\rho(x, y, z)\,{\rm d}z, \end{aligned}$ (55)

where l(x, y) is the thickness of the cloud along the l.o.s. at (x, y) and $\delta\rho = \rho - \mathbb{E}(\rho)$ is the density fluctuation. Column densities are the only data that depend directly on the density field and one must determine how to retrieve reliable information from them.

5.1 Inhomogeneity and Anisotropy due to Integration over the Line of Sight

Star-forming clouds are shaped by turbulent motions conferring statistical properties to their geometrical characteristics, and hence to the area projected in a plane perpendicular to the line of sight and to the thickness projected along the line of sight. This is responsible for difficulties in evaluating exactly the statistical average of ∑(x, y). However, provided that the cloud thickness is much larger than the correlation length, that is if l(x, y) ≫ l_c(ρ), we can reasonably assume that (see Eq. (55)): $\mathbb{E}\left( \Sigma(x, y) \right) \simeq \mathbb{E}(\rho)\, \mathbb{E}\left( l(x, y) \right).$ (56)

One must note here that we are dealing with the statistical average and not with the spatial average. This equation shows that ∑(x, y) may not be statistically homogeneous even if the density field ρ is, just because of integration effects. To illustrate this important point, let us imagine two idealized situations. In one of them, the cloud is a sphere of radius R. In the other one, the cloud is a "cube" of side L misaligned with the line of sight and seen from one of its edges such that the projected surface is of size $\sqrt{2} L \times L$ (see Fig. 1). For the sphere, the thickness along the line of sight is the chord length: $\mathbb{E}\left( l_{s(R)} \right)(x, y) = 2R \left( 1 - \frac{x^{2} + y^{2}}{R^{2}} \right)^{1/2}, \quad x^{2} + y^{2} < R^{2},$ (57)

whereas for the cubic cloud it is: $\mathbb{E}\left( l_{c(L)} \right)(x, y) = \sqrt{2} L \left( 1 - \frac{\sqrt{2}\, |x|}{L} \right), \quad |x| \le \frac{L}{\sqrt{2}}, \quad |y| \le \frac{L}{2}.$ (58)

Even though they are very simple, these two examples demonstrate that the column-density field may exhibit large-scale gradients and hence may not be statistically homogeneous, even if the density field is. Furthermore, as seen with the example of the cube, integration effects can also generate some anisotropy in the column-density field.
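The two idealized geometries can be written down explicitly, as in the sketch below. The sphere thickness is taken to be the chord length 2(R² − x² − y²)^(1/2) through a sphere of radius R, and the cube thickness follows Eq. (58); the function names are hypothetical.

```python
import math

def thickness_sphere(x, y, R):
    """Line-of-sight thickness (chord length) through a sphere of radius R."""
    r2 = x * x + y * y
    return 2.0 * math.sqrt(R * R - r2) if r2 < R * R else 0.0

def thickness_cube_edge_on(x, y, L):
    """Line-of-sight thickness through a cube of side L seen along one of its edges (Eq. 58)."""
    inside = abs(x) <= L / math.sqrt(2.0) and abs(y) <= L / 2.0
    if not inside:
        return 0.0
    return max(0.0, math.sqrt(2.0) * L * (1.0 - math.sqrt(2.0) * abs(x) / L))

# The mean column density E(rho) * l(x, y) varies across the map in both cases.
for x in (0.0, 0.3, 0.6):
    print(x, thickness_sphere(x, 0.0, 1.0), thickness_cube_edge_on(x, 0.0, 1.0))
```

The sphere thickness depends only on x² + y², so the induced gradient is isotropic, whereas the cube thickness varies with x but not with y, which is the source of the anisotropy mentioned above.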

To reduce these effects, one can use a low pass filter to filter out large-scale gradients (Kleiner & Dickman 1984) and then treat the column density field as if it were homogeneous. Furthermore, most of the integration effects are expected to be produced by the first term of the r.h.s. of Eq. (55). Thus they are expected to affect column densities that are around the (surface) average of the column density map 〈∑〉. In contrast, high column densities (∑(x, y) > 〈∑〉, the regions of interest for star formation) are most likely to originate from the second term of the r.h.s. of Eq. (55) and to be produced by dense pockets along the line of sight. They are thus expected to be less affected by the integration effects and by the low pass filter. Studying the statistics of these high column densities is thus expected to provide insight into the bias introduced by integration effects.

In Sect. 6, we will apply the above considerations to the observations of the Polaris cloud.

5.2 Column-Density Field in a Simulation Box

For a cubic simulation domain of size L, projecting the density field along one of the three principal directions of the cube leads to a statistically homogeneous column density field such that: $\mathbb{E}\left( \Sigma(x, y) \right) = \mathbb{E}(\rho) \times L.$ (59)

The results of Sects. 2 and 3 can thus be applied with X = ∑ and the ACF of ∑ in this case is $\begin{aligned} C_{\Sigma}(\mathbf{r}) &= \mathbb{E}\left( \left( \Sigma(\mathbf{u} + \mathbf{r}) - \mathbb{E}(\rho) L \right) \left( \Sigma(\mathbf{u}) - \mathbb{E}(\rho) L \right) \right) \\ &= \int_{[-L/2,\,L/2]^{2}} C_{\rho}(\mathbf{r}, z - z')\,{\rm d}z\,{\rm d}z' \\ &= \int_{[-L,L]} C_{\rho}(\mathbf{r}, u)\,{\rm d}u \int_{-L+|u|}^{L-|u|} \frac{{\rm d}v}{2} \\ &= L \int_{[-L,L]} C_{\rho}(\mathbf{r}, u) \left( 1 - \frac{|u|}{L} \right) {\rm d}u. \end{aligned}$ (60)

Its variance is ${\rm Var}(\Sigma) = C_{\Sigma}(\mathbf{0}) = L \int_{[-L,L]} C_{\rho}(\mathbf{0}, u) \left( 1 - \frac{|u|}{L} \right) {\rm d}u.$ (61)

Then, if the density field is statistically isotropic at small scales (i.e. the ACF is isotropic at short lags) and l_c(ρ) ≪ L, ${\rm Var}(\Sigma) \simeq 2 L\, l_{c}(\rho)\, {\rm Var}(\rho).$ (62)

Dividing by $\mathbb{E}(\Sigma)^{2} = \mathbb{E}(\rho)^{2} L^{2}$ (Eq. (59)) and using L = 2R, Eq. (62) yields: ${\rm Var}\left( \frac{\Sigma}{\mathbb{E}(\Sigma)} \right) \simeq {\rm Var}\left( \frac{\rho}{\mathbb{E}(\rho)} \right) \frac{2\, l_{c}(\rho)}{L} = {\rm Var}\left( \frac{\rho}{\mathbb{E}(\rho)} \right) \frac{l_{c}(\rho)}{R},$ (63)

which is a reformulation of Eq. (40). This is an important result because it gives a measure of l_c(ρ)/R without having to compute the ACF. Brunt et al. (2010) and Federrath et al. (2010), for example, found a ratio ${\rm Var}\left( \Sigma/\mathbb{E}(\Sigma) \right) / {\rm Var}\left( \rho/\mathbb{E}(\rho) \right)$ between 0.03 and 0.15.
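For a simulation box, Eq. (62) can be solved for the correlation length, l_c(ρ) ≃ Var(∑)/(2L Var(ρ)), and evaluated directly from the data cube. The sketch below assumes a density grid stored as nested lists `rho[ix][iy][iz]` projected along z with a simple Riemann sum; the function names and the two toy grids are hypothetical.

```python
def column_density(rho, dz):
    """Sigma(x, y) = sum over z of rho[ix][iy][iz] * dz for a nested-list grid."""
    return [[sum(col) * dz for col in plane] for plane in rho]

def variance(values):
    """Population variance (spatial average of squared deviations)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def correlation_length_from_projection(rho, L):
    """l_c(rho) ~ Var(Sigma) / (2 L Var(rho)), i.e. Eq. (62) solved for l_c."""
    nz = len(rho[0][0])
    sigma = column_density(rho, L / nz)
    flat_sigma = [s for row in sigma for s in row]
    flat_rho = [v for plane in rho for col in plane for v in col]
    return variance(flat_sigma) / (2.0 * L * variance(flat_rho))

# Limiting case 1: density constant along each column (fully correlated along z).
rho_columns = [[[f] * 4 for f in row] for row in [[1.0, 2.0], [3.0, 4.0]]]
# Limiting case 2: identical columns whose fluctuations cancel along z.
rho_alternating = [[[1.0, 3.0, 1.0, 3.0] for _ in range(2)] for _ in range(2)]
print(correlation_length_from_projection(rho_columns, 1.0))
print(correlation_length_from_projection(rho_alternating, 1.0))
```

The first grid saturates the estimate at L/2 = R and the second one returns zero. Both cases lie outside the regime l_c(ρ) ≪ L in which Eq. (62) is derived, so they only serve as sanity checks of the implementation.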

Vázquez-Semadeni & García (2001) were the first to study the impact of the l_c(ρ)/R ratio on the statistics of column-density fields. Based on a crude interpretation of the central limit theorem (CLT), they proposed that, for l_c(ρ)/R → 0, the column-density PDF should appear to be Gaussian instead of lognormal. This is not consistent with the apparent lognormality of the observed column-density PDFs, which led these authors to conclude that l_c(ρ)/R cannot be vanishingly small and that it must be on the order of 10^−1. However, the CLT only applies to independent variables and can hardly be valid for the sum of correlated variables, even if correlations decay. This casts doubt on the conclusions of Vázquez-Semadeni & García (2001). More recently, Szyszkowicz & Yanikomeroglu (2009) and Beaulieu (2011) have shown that, for some special types of correlations, the sum of a large number N of lognormal variables tends to a lognormal distribution as N → ∞. We conclude that knowledge of the l_c(ρ)/R value does not allow robust conclusions on the shape of the column-density PDF. However, as shown by Eq. (63), the variance ${\rm Var}\left( \Sigma/\mathbb{E}(\Sigma) \right)$ does become vanishingly small as l_c(ρ)/R tends to zero.
In that case, one can show with high probability that: $\ln\left( \frac{\Sigma}{\mathbb{E}(\Sigma)} \right) \simeq \frac{\Sigma - \mathbb{E}(\Sigma)}{\mathbb{E}(\Sigma)}.$ (64)

Thus, in the limit of vanishing values of l_c(ρ)/R, the distributions of $\Sigma/\mathbb{E}(\Sigma)$ and its logarithm are both Gaussian if one of them is.

Fig. 1

Projection of the two idealized situations. Left panel: case of a sphere. Right panel: case of a cuboid misaligned with the line of sight.

5.3 Decay Length of Correlations

We now examine how the decay of correlations of ρ impacts the decay of correlations of ∑. For the sake of simplicity, we again consider the case of a cubic box in order to avoid unnecessary complications. For the 2D field ∑, the correlation length is given by: $\begin{aligned} l_{c}(\Sigma)^{2} &= \frac{1}{4} \frac{1}{{\rm Var}(\Sigma)} \iint C_{\Sigma}(\mathbf{r})\,{\rm d}\mathbf{r}, \\ &= \frac{1}{4} \frac{1}{{\rm Var}(\Sigma)} \iint L \int_{[-L,L]} C_{\rho}(\mathbf{r}, u) \left( 1 - \frac{|u|}{L} \right) {\rm d}u\,{\rm d}\mathbf{r}, \\ &\simeq 2\, \frac{L\, {\rm Var}(\rho)}{{\rm Var}(\Sigma)}\, l_{c}(\rho)^{3}. \end{aligned}$ (65)

Using Eq. (62), this implies that: $l_{c}(\Sigma)^{2} \simeq l_{c}(\rho)^{2}.$ (66)

This shows that correlations of the column-density fields are decaying over a characteristic length close to l_c(ρ), the correlation length of the underlying density field. In general, we can thus assume that l_c(∑) ~ l_c(ρ), so that information gathered from the column-density yields an estimate of the characteristic decay length of correlations of the underlying density field ρ.

6 Application to the Observations of the Polaris Cloud

As mentioned in Sect. 5, observations trace the column-density (Kleiner & Dickman 1984; Schneider et al. 2015; Ossenkopf-Okada et al. 2016). These observations of the column-density PDFs in MCs show that regions where star formation has not occurred yet exhibit lognormal PDFs while regions with numerous prestellar cores develop power-law tails (PLTs) at high column densities (Kainulainen et al. 2009; Schneider et al. 2013). In addition to the integration effects that can make the observed column-density anisotropic and inhomogeneous, observational data suffer further biases due to line of sight (l.o.s.) contamination and noise (Schneider et al. 2015; Ossenkopf-Okada et al. 2016). L.o.s. contamination causes two important biases. The observed power-law tail appears to be steeper than its corrected and uncontaminated counterpart while the observed variance in the lognormal part appears to be smaller than its corrected counterpart (Schneider et al. 2013). The overall effect of l.o.s. contamination is to produce an underestimation of the total variance of the column-density.

6.1 Polaris

As a typical example of initial conditions of star formation in MCs, we focus on the Polaris flare, where line of sight contamination appears to be negligible (André et al. 2010; Miville-Deschênes et al. 2010; Schneider et al. 2013). Furthermore, most of the stellar cores in this cloud are still unbound (André et al. 2010), showing that star formation activity is very recent. Polaris is therefore a good candidate to probe the statistics of initial phases of star formation in MCs.

Data from the Herschel Gould Belt survey extend across part of this cloud over a region of approximately 10 square degrees with a linear size L ~ 10 parsecs (pc) (André et al. 2010). The cloud total mass and area above an extinction A_v ≥ 1 are $M_{c, A_{\rm v} \ge 1} = 1.21 \times 10^{3}\, M_{\odot}$ and $A_{c, A_{\rm v} \ge 1} = 3.9\,{\rm pc}^{2}$, respectively. Dust temperatures are in a narrow T_dust = 13 ± 1 K interval, indicating fairly isothermal conditions with an average Mach number $\mathcal{M} \simeq 3$ (Schneider et al. 2013).

The Polaris logarithmic column-density field η, where η = ln(∑/〈∑〉), has a PDF with an extended Gaussian part and two emerging power-law tails, a first one with exponent α_η,1 ≃ −4 followed by a shallower one with exponent α_η,2 ≃ −2 (Fig. 2). Jaupart & Chabrier (2020) have shown that the first steep PLT is due to gravity beginning to affect turbulence in parts of the cloud and hence records an early stage of (local) collapse. Furthermore, the authors outlined a procedure to reconstruct the underlying logarithmic volume density PDF, denoted s-PDF, where $s = \ln\left( \rho / \mathbb{E}(\rho) \right)$, from data on the η-PDF. The underlying s-PDF displays a Gaussian part and two PLTs with exponents α_s,1 = −2 and α_s,2 = −3/2, respectively (see Fig. 2).

Fig. 2

Column and volume density PDF of the Polaris cloud. Left: Observed logarithmic column-density (η = ln(∑/〈∑〉)) PDF (Schneider et al. 2013; Jaupart & Chabrier 2020). Right: Estimated and reconstructed underlying logarithmic density ($s = \ln\left( \rho / \mathbb{E}(\rho) \right)$) PDF with the procedure from Jaupart & Chabrier (2020).

6.2 Filtering Large-Scale Gradients

As seen in Sect. 5, integration effects can produce large-scale gradients and break statistical homogeneity as well as isotropy in the column density field.

Filtered and unfiltered column-density maps of the Polaris flare are displayed in Figs. 3 and 4. The low pass filter does not qualitatively alter the intricate structures that exist, while the high-pass filter reveals a large-scale gradient likely due to integration effects. In order to partially reduce measurement artifacts, we use a low pass filter that screens out structures larger than L/2 in the column-density contrast (∑ − 〈∑〉), where L is the linear size of the observed region and we recall that 〈∑〉 is the (surface) average of the column density map. We can then treat the column-density field as if it were homogeneous.

The low pass filter slightly diminishes the variance Var(Σ/〈Σ〉), which is ≃0.20 and ≃0.17 for the unfiltered and low pass filtered data, respectively. It barely affects structures with positive column-density contrasts but increases the occurrence of highly negative column-density contrasts. This is seen in Fig. 5, which portrays the η-PDFs of the unfiltered and low pass filtered column-density maps. As mentioned in Sect. 5, high column densities are not expected to be sensitive to integration effects.

6.3 Estimated ACF and Correlation Length

6.3.1 Correlation Length of η from the ACF

We now estimate the ACFs of the logarithmic column-density field η = ln(Σ/〈Σ〉) for the three data sets (unfiltered, low and high pass filtered), using Eq. (17). The 2D heat-maps of the reduced ACFs $\hat{C}_\eta / \mathrm{Var}(\eta)$ are given in Fig. 6. We only display the top-right quadrant of possible lags (x > 0, y > 0), which amounts to half of the space useful to study the ACF due to its symmetry. The high pass filtered ACF illustrates the bias that can be introduced by integration effects.
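As an illustration of how an empirical ACF can be evaluated on a pixelised map, the following minimal sketch computes a standard autocovariance estimate at a given lag. Equation (17) is not reproduced in this section, so the normalisation used here (division by the number of overlapping pixel pairs) and the names `mean2d` and `empirical_acf` are assumptions made for the example and not the authors' implementation. The toy map is made up.

```python
from typing import List

Field = List[List[float]]  # field[j][i]: value at pixel (x = i, y = j)


def mean2d(field: Field) -> float:
    """Surface average of a 2D map."""
    n = sum(len(row) for row in field)
    return sum(sum(row) for row in field) / n


def empirical_acf(field: Field, dx: int, dy: int) -> float:
    """Autocovariance estimate at a non-negative lag (dx, dy), in pixels.

    Averages (X(p) - m) * (X(p + lag) - m) over all pixel pairs that fit
    in the map, with m the surface average (assumed normalisation).
    """
    ny, nx = len(field), len(field[0])
    m = mean2d(field)
    total, count = 0.0, 0
    for j in range(ny - dy):
        for i in range(nx - dx):
            total += (field[j][i] - m) * (field[j + dy][i + dx] - m)
            count += 1
    return total / count


# Toy map: stripes that vary along x only, so the ACF is anisotropic.
toy = [[1.0, -1.0, 1.0, -1.0] for _ in range(4)]
var = empirical_acf(toy, 0, 0)              # zero lag gives the variance
reduced_x = empirical_acf(toy, 1, 0) / var  # anti-correlated along x
reduced_y = empirical_acf(toy, 0, 1) / var  # fully correlated along y
```

Dividing by the zero-lag value gives the reduced ACF, which is the quantity shown in the heat-maps. Only the quadrant of non-negative lags is handled, mirroring the quadrant displayed in Fig. 6.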

The three ACFs all seem to be fairly isotropic at very short lags (scales) but are anisotropic at large ones. The low pass filtered ACF seems to decay more rapidly at short lags, with a reduced anisotropy, than the unfiltered one. Figure 7 displays the reduced ACF of the low pass filtered map in three different directions, x (θ = 0), x = y (θ = π/4) and y (θ = π/2). As can be seen from the heat maps but also from Fig. 7, a strong anisotropy is detected at large scales in the x direction (x/L ≥ 2 × 10^–2), while the ACF is fairly isotropic at shorter lags. From the y-direction to the π/4-direction, the data seem to be fairly isotropic and bounded by an exponential with λ/L ≃ 5 × 10^–2. Anisotropy is most pronounced along the x-direction and the resulting estimated correlation length $\hat{l}_c(\eta)$ is: $\hat{l}_c(\eta) \simeq 6 \times 10^{-2} L \simeq \frac{1}{2} (2\pi)^{1/2} \lambda,$ (67)

or $\hat{l}_c(\eta)/R \simeq 1.2 \times 10^{-1}$, thus $\hat{l}_c(\eta)/R \sim 10^{-1}$. We then use $\hat{l}_c(\eta)$ as an estimate of l_c(ρ) to within an order of magnitude, such that $l_c(\rho) \sim 10^{-1} R.$ (68)

In fact, we expect Eq. (67) to provide upper bounds for the ratios l_c(η)/R and l_c(ρ)/R, because integration artifacts are only partially cancelled by the low pass filter.
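The arithmetic behind Eq. (67) can be checked with a short sketch. It uses the relation between the correlation length and the decay scale λ quoted in Eq. (67), the value λ/L ≃ 5 × 10^–2 read from Fig. 7, and L = 2R. The function name `correlation_length_from_decay` is hypothetical.

```python
import math


def correlation_length_from_decay(lam_over_L: float) -> float:
    """l_c / L from the relation l_c = (1/2) * sqrt(2*pi) * lambda of Eq. (67)."""
    return 0.5 * math.sqrt(2.0 * math.pi) * lam_over_L


lc_over_L = correlation_length_from_decay(5e-2)  # lambda / L from Fig. 7
lc_over_R = 2.0 * lc_over_L                      # since L = 2R
```

The two values come out close to 6 × 10^–2 and 1.2 × 10^–1, in line with the figures quoted in the text.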

6.3.2 Correlation Length of η from Eq. (40)

To obtain an estimate of l_c(η) we could also apply the results of Sect. 3 to the low pass filtered map. By integrating the column-density map along the x (θ = 0) or y (θ = π/2) direction and computing the variance of the resulting integrated field, we can obtain with Eq. (40) two estimates $\hat{l}_{c,x}(\eta)$ and $\hat{l}_{c,y}(\eta)$ of l_c(η) within a factor of order unity. However, the estimated ACF displays a strong anisotropy in the x direction at large scales (x/L ≥ 2 × 10^–2), so we expect the estimates to give rather different results. Computing the estimates yields $\hat{l}_{c,x}(\eta) \simeq 2.6 \times 10^{-1} R$ (69) $\hat{l}_{c,y}(\eta) \simeq 3.5 \times 10^{-2} R,$ (70)

with $\hat{l}_{c,x}(\eta) > \hat{l}_{c,y}(\eta)$, as can be expected from the anisotropy of the observed ACF. Since the anisotropy sets in at lags where the ACF is not yet significantly smaller than the variance, it is not clear which of the two estimates gives the better approximation of the actual l_c(η). However, they are both within a factor of about 3 of the estimate produced by Eq. (67) from the ACF, which yielded $\hat{l}_c(\eta) \sim 10^{-1} R$.

Using l_c(η) as an estimate of l_c(ρ) yields again, to within an order of magnitude, l_c(ρ) ≲ 10^–1 R.

6.3.3 Correlation Length of ρ from Eqs. (40) or (63) and the Variance of Σ

As discussed in Sect. 5.2 and Eq. (63), one can also estimate the ratio l_c(ρ)/R by (1) computing the variance $\mathrm{Var}(\Sigma / \mathbb{E}(\Sigma))$, (2) giving an estimate of $\mathrm{Var}(\rho / \mathbb{E}(\rho))$ and (3) giving an estimate of the average thickness of the cloud (the length of the line of sight), for example by assuming that the cloud has roughly the same dimension in the three directions.

In pure isothermal turbulence, $\mathrm{Var}(\rho / \mathbb{E}(\rho)) \approx (b\mathcal{M})^{2}$, which is ≃ 1 for the Polaris case $(b \simeq 0.3 - 0.4,\ \mathcal{M} \simeq 3)$. However, when gravity starts generating power-law tails in the density PDF, the variance becomes larger than $(b\mathcal{M})^{2}$ (Jaupart & Chabrier 2020). For Polaris, the column-density PDF displays a power-law tail with exponent α_η ≃ −4, which is linked to an underlying density PDF with a power-law tail exponent α_s ≃ −2 (Federrath & Klessen 2013; Jaupart & Chabrier 2020). Using the reconstructed s-PDF of Fig. 2 from the procedure described in Jaupart & Chabrier (2020), we can derive an estimate of $\mathrm{Var}(\rho / \mathbb{E}(\rho))$. In principle, for such a model PDF, the variance is infinitely large due to the power-law tail exponents α_s,1 = −2 and α_s,2 = −3/2. However, we expect a cut-off at high (column) density, which is indeed visible in the data. This cut-off may be due to a change of thermodynamic conditions of the cloud, e.g. from isothermal to adiabatic conditions. For a typical cut-off number density n_ad = 10¹⁰ cm^–3 (Masunaga & Inutsuka 2000; Machida et al. 2006; Vaytet et al. 2013, 2018) and for a cloud of average density $\bar{n} = 10^{3}\,\mathrm{cm}^{-3}$, the cut-off occurs at s_ad ≃ 16. However, there may be other causes for a high-density cut-off. In order to assess this possibility, we thus determine three different estimates of the variance $\mathrm{Var}(\rho / \mathbb{E}(\rho))$ from the reconstructed s-PDF of Fig. 2: one for densities up to s = 6.3 (s ≤ 6.3), which corresponds to the onset of the 2nd PLT, a second one for s ≤ 8 in order to include contributions from the 2nd PLT, and a third one for s ≤ 16 ≃ s_ad in order to include all the data up to the adiabatic limit. We obtain $\mathrm{Var}(\rho / \mathbb{E}(\rho)) \simeq 5,\, 7,\, 227$, respectively, such that: $l_c(\rho)/R \simeq 0.04,\, 0.03,\, 0.001.$ (71)

This provides us with the conservative estimate l_c(ρ)/R ~ 10^–2, which is an order of magnitude smaller than the value estimated from the ACF (Eq. (67)) but closer to the estimate Eq. (70).
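The values of Eq. (71) appear consistent with a simple ratio of variances, which the sketch below evaluates. Equation (63) is not reproduced in this section, so the simplified form l_c(ρ)/R ≈ Var(Σ/〈Σ〉)/Var(ρ/E(ρ)), taken here for a cloud whose thickness is comparable to L = 2R, is an assumption. The input Var(Σ/〈Σ〉) ≃ 0.20 is the unfiltered value of Sect. 6.2; using the filtered value 0.17 gives the same orders of magnitude. The last line checks the adiabatic cut-off s_ad quoted in the text.

```python
import math

var_sigma = 0.20             # Var(Sigma/<Sigma>), unfiltered map (Sect. 6.2)
var_rho = [5.0, 7.0, 227.0]  # Var(rho/E(rho)) for s <= 6.3, 8 and 16

# Assumed simplified form of Eq. (63): l_c(rho)/R ~ Var(Sigma) / Var(rho)
lc_over_R = [var_sigma / v for v in var_rho]

# Adiabatic cut-off in s = ln(n / n_mean) for n_ad = 1e10 and n_mean = 1e3 cm^-3
s_ad = math.log(1e10 / 1e3)
```

The larger the adopted density variance, the smaller the inferred correlation length, which is why the choice of the high-density cut-off matters for this estimate.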

It is thus important to understand whether most of the anisotropy in the ACF originates from integration artifacts and whether or not it causes an overestimation of the correlation lengths of η or ρ.

Fig. 3

Column-density maps of the Polaris cloud. Left panel: without filter. Middle panel: with a high-pass filter filtering scales smaller than L/2. Right panel: with a low pass filter filtering scales larger than L/2. The low pass filter does not alter qualitatively the richness of structures found in the Polaris flare, while the high-pass filter shows a large-scale gradient that can be produced by an integration effect.

Fig. 4

Same as Fig. 3, but for the binary map Θ(log(Σ/〈Σ〉)), where Θ is Heaviside’s step function. Regions where Σ > 〈Σ〉 appear darker than regions where Σ < 〈Σ〉.

Fig. 5

η-PDFs. Blue round and purple triangular symbols represent the PDFs of the unfiltered and low pass filtered maps, respectively. The filter does not alter regions with η > 0 but increases the occurrence of regions with η < −1. Horizontal error bars represent bin spacing.

6.4 Ergodic Estimate of the Observed PDF, Real Error Bars, and Reduced Integration Artifacts

As mentioned earlier, column-density PDFs serve as tracers of the statistics of the underlying density field. The various forms of these PDFs can be attributed to the various processes that are operating in MCs, from a fully lognormal distribution when purely turbulent motions dominate to a lognormal distribution with high density PLTs when gravitational effects become significant (Vazquez-Semadeni 1994; Passot & Vázquez-Semadeni 1998; Kainulainen et al. 2009; Schneider et al. 2013). This calls for a precise determination of the statistical uncertainty on the observed PDF, especially at high-density values.

The empirical PDF $\hat{f}_X(\xi_0)$ of a stochastic field X (here X will be the logarithmic column density η) is deduced from histograms with some bin size Δξ. Error bars are usually estimated from Poisson statistics (using the number of points per bin) and can therefore be very small (Schneider et al. 2013). It is worth delving deeper into this issue. A histogram yields the following estimate: $\hat{f}_{X,L}(\xi_0)\,\Delta\xi \simeq \hat{F}_{X;L}(\xi_0 + \Delta\xi) - \hat{F}_{X;L}(\xi_0),$ (72)

where $\hat{F}_{X;L}$ is the empirical cumulative distribution function. Formally, this amounts to the ergodic estimate of the average of the following field, noted g_ξ0(y): $g_{\xi_0}(y) = h_{\xi_0 + \Delta\xi}(X(y)) - h_{\xi_0}(X(y)),$ (73)

where $h_{\xi_0}(X(y)) = \Theta(\xi_0 - X(y)),$ (74)

(see Sect. 2.1.2 and 4). Thus, proper statistical error levels must be calculated using the results of Sect. 2 and in general are not given by Poisson statistics.
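To make the link between the histogram and the field of Eq. (73) concrete, the following minimal sketch builds the indicator field for one bin and checks that its average is the bin fraction, i.e. the histogram estimate of Eq. (72). The sample values are made up for illustration, the names `step` and `g_field` are hypothetical, and the convention Θ(0) = 1 is an assumption.

```python
from typing import List


def step(u: float) -> float:
    """Heaviside step function, with the convention Theta(0) = 1 (assumed)."""
    return 1.0 if u >= 0.0 else 0.0


def g_field(x: List[float], xi0: float, dxi: float) -> List[float]:
    """Indicator field of Eq. (73): 1 where xi0 < X <= xi0 + dxi, else 0."""
    return [step(xi0 + dxi - v) - step(xi0 - v) for v in x]


# Made-up sample values of the field X, for illustration only.
sample = [-0.3, 0.05, 0.12, 0.4, 0.08, -1.0, 0.19, 0.75]
xi0, dxi = 0.0, 0.2
g = g_field(sample, xi0, dxi)

# The average of g is the fraction of points in the bin, i.e. f_hat * dxi.
bin_fraction = sum(g) / len(g)
pdf_estimate = bin_fraction / dxi
```

Because the histogram value is a spatial average of a correlated field, its uncertainty depends on the correlation length of that field and not only on the number of points in the bin.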

In Appendix B.2, we study in detail ergodic estimates of average quantities. In general, the correlation length of g_ξ0 is a function of ξ₀ itself. For Gaussian (or lognormal) distributions, an important result is that the confidence interval becomes quite large for values $|\xi_0 - \mathbb{E}(X)| \geq \sigma(X)$, resulting in large errors if the sample size is too small. Thus, a reliable evaluation of the statistics of rare events (away from the average) requires very large sample sizes.

6.4.1 Reduced Integration Effects at High Density Contrasts

In this study, we focus on the column-density field X = η and its PDF, noted p(η). Using g_η0(y) and its ACF for various values η₀, we are able to determine the appropriate statistical error bars and to get rid of some of the artifacts that are due to integration along the line of sight. In practice, we expect that such artifacts are not significant in high column-density regions (see Sect. 6.2). For example, anisotropy of the Polaris column-density ACF in the x-direction is likely due to integration effects (see Sect. 6.3).

We expect, however, the ACF of the field g_η0 for η₀ > 0 to show a reduced anisotropy at short scales. We thus obtain an empirical ACF of g_η0 using Eq. (17). Figure 8 displays the estimated ACF of g_η0 for the low pass filtered column-density map. At low column density (η₀ = −1.06), a strong anisotropy is observed in the x-direction starting at x/L ≥ 2 × 10^–2, as for the ACF of η (see Fig. 7). For positive column-density contrasts (η₀ > 0), this anisotropy is reduced and the ACFs are fairly isotropic at small scales in both the x and θ = π/4 directions, up to x/L ~ 10^–1 and r/L ~ 10^–1, respectively, where r denotes separation distance in the θ = π/4 direction. At larger separation distances, the data become quite noisy. This is consistent with the fact that the low pass filtering procedure does not modify the PDF significantly in regions where η > 0 (see Fig. 5).

This suggests that most of the η ACF anisotropy in the x-direction at scales in the 10^–2−10^–1 range is due to integration effects. The peak of the correlation in the π/4-direction at high column densities (η₀ = 1.58) is probably due to the presence of the “Saxophone”-shaped filamentary structure that may be seen at the top of Fig. 3, which hosts most of the Polaris high-density regions (Schneider et al. 2013).

6.4.2 Statistical Errorbars

Using the statistics of g_η0(y) has several advantages. One is that it reduces the impact of line-of-sight (l.o.s.) integration artifacts. In addition, it leads to proper error estimates for the empirical PDF.

Introducing some function of η₀ noted ϕ(η₀), which is expected to increase for increasing values of |η₀|, the confidence interval at a level above 1 − 1/m² can be written as follows (see the Bienaymé-Tchebychev inequality, Eq. (21)): $p(\eta_0) \equiv f_\eta(\eta_0) = \hat{f}_L(\eta_0) \left(1 \pm m\,(\varphi(\eta_0))^{1/2} \left(\frac{l_c(\eta)}{R}\right)^{D/2}\right),$ (75)

with D = 2 and where $\varphi(\eta_0) = \frac{\mathrm{Var}(g_{\eta_0})}{\hat{f}_L(\eta_0)^{2}\,(\Delta\eta)^{2}} \times \left(\frac{l_c(g_{\eta_0})}{l_c(\eta)}\right)^{2},$ (76) $= \frac{1}{\hat{f}_L(\eta_0)\,(\Delta\eta)} \times \left(\frac{l_c(g_{\eta_0})}{l_c(\eta)}\right)^{2},$ (77)

because $\mathrm{Var}(g_{\eta_0}) \simeq \hat{f}_L(\eta_0)\,(\Delta\eta)$ and where l_c(g_η0)² ∝ Δη, so that the error bars on the PDF do not depend on the choice of bin size (for small Δη, see Appendix B.2).
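The step from Eq. (76) to Eq. (77) can be spelled out as follows. Since the field of Eq. (73) only takes the values 0 and 1, its variance is that of an indicator variable. Writing p for the probability that η falls in the bin (a notation introduced here for convenience), and assuming a bin narrow enough that p ≪ 1:

```latex
\mathrm{Var}(g_{\eta_0})
  = \mathbb{E}\!\left(g_{\eta_0}^{2}\right) - \mathbb{E}\!\left(g_{\eta_0}\right)^{2}
  = p\,(1 - p)
  \simeq p
  \simeq \hat{f}_L(\eta_0)\,\Delta\eta .
```

Substituting this into Eq. (76) cancels one factor of the bin fraction in the denominator, which gives Eq. (77).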

From the empirical ACF $C_{g_{\eta_0}}$, one can then estimate the correlation length of g_η0 and thus ϕ(η₀) for every η₀. Unfortunately, this procedure is hampered by the fact that the ACF becomes increasingly noisy at high contrasts |η₀| > 1, due to sample sizes that are too small.

In principle, to determine ϕ(η₀) and its variation one must calculate the complete integral that defines l_c(g_η0). This may be avoided as follows. The growth of ϕ(η₀) may be obtained by looking at the short-scale behavior of the ACFs of g_η0. In Fig. 8, it appears that the values of $\hat{C}_{g_{\eta_0}}$ for positive column-density contrasts (η > 0) are isotropic and close to being ∝ |y|^–1/2 at short scales. We thus write that: $\hat{C}_{g_{\eta_0}}(y) \simeq (f_\eta(\eta_0)\,\Delta\eta)^{2} \times c_{\surd}\,|y/L|^{-1/2}$ (78)

where $c_{\surd}$ is a constant of proportionality that depends on η₀. Values of $c_{\surd}$ as a function of η₀ are given in Fig. 9. We have only studied g_η0 for −0.7 ≤ η₀ ≤ 1.58, because the ACFs are extremely noisy at high positive density contrasts (η ≥ 1.58) due to poor sampling. At negative density contrasts (η ≤ −0.7), where integration artifacts are the largest (see Sect. 6.4.1), the ACFs are no longer sufficiently isotropic and do not conform to a scaling in |y|^–1/2. Figure 9 shows that $c_{\surd}(\eta_0)$ is an increasing function of |η₀| for large |η₀|, illustrating the fact that ϕ(η₀) is expected to be large compared to f_η(η₀) for large contrasts |η₀| > 1.

Fig. 6

Reduced ACF of η ($\hat{C}_\eta / \mathrm{Var}(\eta)$) for the Polaris flare. Left panel: without filter. Middle panel: with a high pass filter filtering scales smaller than L/2. Right panel: with a low pass filter filtering scales larger than L/2. Contours from black to purple to blue to light blue give the value of the reduced ACF at 0.5, e^–1 ≃ 0.37, 0.1, −0.1.

Fig. 7

Reduced ACF of the low pass filtered map in three different directions. Blue line: x-direction (y = 0). Purple line: y-direction (x = 0). Red line: π/4 or x = y-direction. Green line: exponential profile giving an estimate of the rate of decay (here λ/L ≃ 5 × 10^–2). A strong anisotropy is present in the x direction at large scales (x/L ≥ 2 × 10^–2).

6.4.3 Correlation Length of ɡ_η0 from Eq. (40)

To obtain an estimate of l_c(g_η0) and thus of ϕ(η₀), we can use the results of Sect. 3. For each η₀ we compute the field g_η0(x, y) from the column-density map η(x, y). We then produce the two fields $\Sigma_{g_{\eta_0},x}(y)$ and $\Sigma_{g_{\eta_0},y}(x)$ obtained from the integration of g_η0(x, y) along the x and y directions, respectively: $\Sigma_{g_{\eta_0},x}(y) = \int g_{\eta_0}(x, y)\,\mathrm{d}x,$ (79) $\Sigma_{g_{\eta_0},y}(x) = \int g_{\eta_0}(x, y)\,\mathrm{d}y.$ (80)

Computing the variance of these integrated fields, we can obtain with Eq. (40) two estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ of l_c(g_η0) within a factor of order unity: $\frac{\hat{l}_{c,x}(g_{\eta_0})}{R_x} = \frac{\mathrm{Var}(\Sigma_{g_{\eta_0},x})}{\mathrm{Var}(g_{\eta_0})}\,\frac{1}{L_x^{2}},$ (81) $\frac{\hat{l}_{c,y}(g_{\eta_0})}{R_y} = \frac{\mathrm{Var}(\Sigma_{g_{\eta_0},y})}{\mathrm{Var}(g_{\eta_0})}\,\frac{1}{L_y^{2}},$ (82)

where L_x,y = 2R_x,y are the lengths of the column-density map in the x and y directions. For the present map of Polaris these two lengths are approximately equal, L_x ≃ L_y = L.
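On a pixelised map, Eqs. (81) and (82) reduce to a ratio of two variances, as the following minimal sketch shows. With pixel size d and L = N d, the integral along a line divided by L is simply the line average, so the pixel size drops out. The function names are hypothetical, the toy field is made up, and no claim is made that this matches the authors' numerical implementation.

```python
from typing import List, Tuple

Field = List[List[float]]  # g[j][i]: value at pixel (x = i, y = j)


def variance(values: List[float]) -> float:
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def lc_over_R_estimates(g: Field) -> Tuple[float, float]:
    """Discrete analogue of Eqs. (81)-(82), returned as (x, y) estimates.

    With pixel size d and L = N * d, Var(integral of g) / L**2 equals the
    variance of the line averages of g, so the pixel size cancels out.
    """
    ny, nx = len(g), len(g[0])
    var_g = variance([v for row in g for v in row])
    row_means = [sum(row) / nx for row in g]  # integration along x
    col_means = [sum(g[j][i] for j in range(ny)) / ny for i in range(nx)]
    return variance(row_means) / var_g, variance(col_means) / var_g


# Toy indicator field, constant along x and alternating along y.
toy = [[1.0] * 4, [0.0] * 4, [1.0] * 4, [0.0] * 4]
est_x, est_y = lc_over_R_estimates(toy)
```

In this extreme toy case the field is fully correlated along x, so the x estimate saturates at 1, while averaging along y removes all fluctuations and the y estimate vanishes. A real map lies between these limits, and a strong difference between the two numbers signals anisotropy.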

Since the experimental ACFs of g_η0 are fairly isotropic for η₀ > 0, we expect the above estimates of l_c(g_η0)/R (Eqs. (81) and (82)) to give similar and accurate results. They are given in Fig. 10 for Δη = 0.1 and always yield the same order of magnitude.

To test the accuracy of the estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ of l_c(g_η0), we have examined how they scale with bin size Δη. If they were accurate estimates, they would have to conform to a scaling $\hat{l}_{c,x,y}(g_{\eta_0}) \propto (\Delta\eta)^{1/2}$ so that the error bars on the PDF (Eq. (75)) do not depend on the choice of bin size (see Appendix B.2). At positive density contrasts, η > 0, the estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ follow a scaling close to the predicted $\hat{l}_{c,x,y}(g_{\eta_0}) \propto (\Delta\eta)^{1/2}$. They do not, however, for negative contrasts, η < 0, where the ACF is no longer isotropic and where there are strong integration artifacts. This can be seen in Fig. 11, where we display the shaded regions bounded by the two estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ for different bin sizes Δη at four values of η₀ = −1.1, 0.7, 1.1, 1.6.
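One simple way to quantify such a scaling test is a least-squares fit of the logarithmic slope of the estimates against bin size, sketched below. The fit itself is a standard technique chosen here for illustration and is not stated to be the authors' procedure. The input values are synthetic, constructed to follow the predicted scaling exactly, and the name `loglog_slope` is hypothetical.

```python
import math
from typing import List


def loglog_slope(x: List[float], y: List[float]) -> float:
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


# Synthetic values following the predicted scaling, for illustration only.
bin_sizes = [0.05, 0.1, 0.2, 0.4]
lc_values = [0.03 * math.sqrt(d) for d in bin_sizes]
slope = loglog_slope(bin_sizes, lc_values)
```

A fitted slope close to 1/2 would indicate that the estimate behaves as required for bin-size-independent error bars, whereas a markedly different slope would flag the estimate as unreliable at that η₀.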

We then conclude that $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ can be used to accurately estimate l_c(g_η0) for η > 0, i.e. in the regions of interest for star formation. In practice we should assume that l_c(g_η0) lies somewhere between $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ and compute error bars with both values. This is done in the next section.

Fig. 8

Estimated ACF of the field g_η0(y) for different values of η₀ = η in 3 different directions. Blue, purple and red lines represent respectively the x, y and π/4 (x = y) directions. Two top panels are for η₀ = −1.06 and 0.69, whereas the two bottom panels are for η₀ = 0.94 and 1.58. At low column densities (η₀ = −1.06), a strong anisotropy is detected in the x-direction and becomes noticeable at x/L ≥ 2 × 10^–2, as was the case for the column-density ACF (see Fig. 7). For high column densities (η₀ > 0), however, the anisotropy is subdued and the ACFs are fairly isotropic at small scales up to x/L, r/L ~ 10^–1, where the data become quite noisy. Green dashed lines show the profile of an isotropic ACF proportional to r^–1/2 that matches the data at short scales fairly well, at least over a decade.

Fig. 9

Constant of proportionality $c_{\surd}$ such that $C_{g_{\eta_0}}(\mathbf{y}) = (f_\eta(\eta_0)\,\Delta\eta)^{2} \times c_{\surd} / \sqrt{|\mathbf{y}| / L}$ at short scales.

Fig. 10

Estimate of the ratio l_c(g_η0)/R from Eqs. (81) and (82) at different η₀ for Δη = 0.1. The blue line gives the values of the estimate $\hat{l}_{c,y}(g_{\eta_0})$ while the red dash-dotted line gives the values of $\hat{l}_{c,x}(g_{\eta_0})$. The two estimates always yield the same order of magnitude.

Fig. 11

Shaded regions bounded by the two estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ for different bin sizes Δη at four values of η₀ = −1.1, 0.7, 1.1, 1.6, from light to dark blue respectively. For clarity the values of the estimates for η₀ = −1.1 and η₀ = 0.7 were multiplied by 15 and 1.5 to shift their curves upward. Red dash-dotted lines indicate a scaling ∝ (Δη)^1/2. At positive density contrasts η > 0 the estimates follow a scaling close to the predicted $\hat{l}_{c,x,y}(g_{\eta_0}) \propto (\Delta\eta)^{1/2}$.

6.4.4 Effective Error Bars on the Observed PDF

Once we have obtained these estimates of l_c(g_η0)/R and tested their accuracy, we can compute effective error bars at a given confidence level for the PDF p(η₀) = f_η(η₀) with the Bienaymé-Tchebychev inequality, Eq. (21): $p(\eta_0) \equiv f_\eta(\eta_0) = \hat{f}_L(\eta_0) \left(1 \pm m\,\frac{1}{(\hat{f}_L(\eta_0)\,\Delta\eta)^{1/2}}\,\frac{l_c(g_{\eta_0})}{R}\right),$ (83)

with $\hat{f}_L(\eta_0)$ the estimate of the PDF produced by histograms of bin size Δη and m giving a confidence level of over 1 − 1/m².

Figure 12 displays the empirical Polaris PDF, with error bars computed from Eq. (83) for the two estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ with Δη = 0.1. We have taken m = 2 to obtain a confidence level of over 75%. As expected, the amplitudes of the error bars and thus of ϕ(η₀) grow with increasing values of |η₀|. These error bars may be inaccurate for η₀ < 0 because the estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$ show a dependence on bin size Δη that is too strong at these low column densities (see Sect. 6.4.3). However, they are accurate at high column densities η₀ > 0 and serve to emphasize that error bars should not be derived from Poisson statistics and that the accuracy of the low and high end parts of the PDF is severely degraded by sample sizes that are too small.
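The relative half-width of the error bar in Eq. (83) is straightforward to evaluate once an estimate of l_c(g_η0)/R is available, as in the minimal sketch below. The function names are hypothetical and the input values are illustrative, not measured values from the Polaris map.

```python
def relative_error(f_hat: float, d_eta: float, lc_g_over_R: float, m: float = 2.0) -> float:
    """Relative half-width of the error bar in Eq. (83)."""
    return m * lc_g_over_R / (f_hat * d_eta) ** 0.5


def confidence_lower_bound(m: float) -> float:
    """Bienayme-Tchebychev lower bound on the confidence level, 1 - 1/m**2."""
    return 1.0 - 1.0 / m ** 2


# Illustrative inputs (not measured values): a bin holding 1% of the map
# area, with l_c(g)/R = 0.02.
rel = relative_error(f_hat=0.1, d_eta=0.1, lc_g_over_R=0.02, m=2.0)
conf = confidence_lower_bound(2.0)
```

With these inputs the relative error is 40% at a confidence level of at least 75%. The error grows as the bin fraction decreases, which is the behaviour described for the low and high end parts of the PDF.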

7 Applications to the Orion B Cloud

In this section, we apply the results of Sect. 2 to the Orion B cloud (Schneider et al. 2013; Orkisz et al. 2017), another well studied star-forming MC. In this case, one encounters additional difficulties because the observed field is markedly elongated in the “vertical” direction (y) with data over a region whose geometrical shape is not suited to a straightforward data analysis (see Fig. 13). For this reason, we have extracted 2 parts of the cloud with rectangular shapes. One is elongated with a length that is close to the vertical dimension (L_y) of the total field of observation, which we shall refer to as a “filament”. A second part is a rectangular one with an aspect ratio close to 1, with a length close to the maximum horizontal length of the full cloud (L_x), which we shall refer to as a “square” region (see Fig. 13). We determine the ACF of these two subregions and the associated correlation lengths in Appendix C.

Using the ACF, we find that l_c(η)/L_x ~ 10^–1. When using the variances Var(ρ) and Var(Σ), we obtain a lower value: l_c(η)/L_z ~ 10^–2, where L_z is the characteristic thickness of the cloud (along the line of sight).

Fig. 12

PDF of the logarithmic column-density η with statistical error bars for m = 2 and the two estimates $\hat{l}_{c,x}(g_{\eta_0})$ and $\hat{l}_{c,y}(g_{\eta_0})$, in blue and red respectively. This emphasizes that error bars should not be derived from Poisson statistics and that the accuracy of the low and high end parts of the PDF is degraded by the small sample size.

8 Conclusion

In this article, we have examined the validity of statistical homogeneity and ergodicity when deriving general properties of star-forming molecular clouds from observations or numerical results of some of their properties. Notably, we have focused on the field of density fluctuations and its PDF. This is a fundamental quantity since these fluctuations are believed to be at the root of the star formation process. It is thus essential to examine the validity of a statistical approach in order to assess the accuracy of the determination of the statistical properties of the cloud from the observations or simulations of a limited number of samples. To fulfill this goal, we first use the ergodic theory for any random field X to derive some rigorous statistical results. We explain how to calculate the correlation length of fluctuations in this field, l_c(X), from the autocovariance function (ACF) (Eq. (11)). We show that the estimation of the correlation length allows one to define an effective number of samples, N, such that a space (or time) average of a single realization is formally equivalent to averaging over N independent samples (see e.g. Papoulis & Pillai 1965). When it is difficult to determine the correlation length from the empirical ACF, we have shown alternative ways to estimate it from fluctuations in Sect. 3.

We then apply this statistical approach and the results of ergodic theory to astrophysical systems in Sects. 4–6. In Sect. 4, we examine in particular the stochastic fields induced by compressible turbulent motions driven at large scale. We show that while the energy of turbulent motion is injected at large scale, comparable to the whole system L, the correlation length of either the specific kinetic energy or the density is actually small compared to L (see Sect. 4.3). We stress that there is no contradiction in these results since the injection scale and correlation length are two completely different quantities.

In Sect. 5 we apply our results to the observed column-density field, which is related to the (volume) density field in the cloud. Applying the results of Sect. 3, we have devised a method to determine the correlation length, or more exactly the ratio of the correlation length over the size of the cloud (or the box of numerical simulations), from the variances of both the volume-density and column-density fields. We have also shown that the statistics of the column-density field are affected by artifacts due to integration along the line of sight. These artifacts tend to generate an artificial anisotropy in the column-density field and thus in the empirical ACF. Using the previous results, we have then examined in detail the Polaris cloud, which serves as a template for initial stages of star formation in MCs, in Sect. 6. We showed that the artificial anisotropy in the empirical ACF results in an overestimation of the correlation length of density fluctuations within the cloud. Estimating the variance of the underlying density field, Var(ρ/〈ρ〉), and computing the variance of the column-density field, Var(Σ/〈Σ〉), we are able to derive a more accurate estimate of the correlation length l_c (Eq. (63)), which can be an order of magnitude smaller than the one obtained from the empirical ACF (Sect. 6.3).

Moreover, we have shown that studying the statistics of the PDF ergodic estimator for positive column-density contrasts enables us to remove most of the integration anisotropy bias (Sect. 6.4.1). It also allows a proper evaluation of statistical error bars and shows that these (i) cannot be derived from simple Poisson statistics and (ii) become increasingly large for increasing density contrasts (|η| ≥ 1), severely reducing, in particular, the accuracy of the high-end part of the PDF because of the small sample size (see Sect. 6.4.2). Furthermore, we provide a method that can be used by observers and numerical simulation specialists to determine robust error bars in Sect. 6.4.4.

Finally, we found that the correlation length of the density field in the Polaris cloud is about 1% of the size of the cloud (l_c(ρ)/R ~ 10^–2). We have also examined the more complex Orion B cloud in Sect. 7 to confirm the results obtained for Polaris.

These calculations provide a rigorous framework for the analysis of the global properties of star-forming clouds from limited statistical observations of their density and surface fluctuating properties. They show in particular that for typical star-forming clouds at the onset of the star formation process, the correlation length of density fluctuations is much smaller than the size of the cloud. This justifies the assumption and the relevance of a statistical approach based on statistical homogeneity when studying the PDF of the cloud (Jaupart & Chabrier 2020, 2021), as done e.g. in cosmology or in the study of turbulence.

Fig. 13

Column-density maps of the Polaris cloud. Left panel: full observed field. Middle panel: extracted filamentary region. Right panel: extracted “square” region, unfiltered and low pass filtered (see Sect. 6).

Acknowledgements

The authors are grateful to Christoph Federrath for always providing data from his numerical simulations upon request. This research has made use of data from the Herschel Gould Belt survey (HGBS) project (http://gouldbelt-herschel.cea.fr). The HGBS is a Herschel Key Programme jointly carried out by SPIRE Specialist Astronomy Group 3 (SAG 3), scientists of several institutes in the PACS Consortium (CEA Saclay, INAF-IFSI Rome and INAF-Arcetri, KU Leuven, MPIA Heidelberg), and scientists of the Herschel Science Center (HSC).

Appendix A Ergodic Estimate for a General Control Volume Ω

We described in Sect. 2 some known ergodic results, but they are derived for a cubic control volume $\Omega = \left[-\frac{L}{2}, \frac{L}{2}\right]^{D}$. These results obviously do not depend on the shape of the control volume. We give here the general formulation for any control volume Ω possessing a center of symmetry (meaning that ∀y ∈ Ω, −y ∈ Ω). We again denote by |Ω| the volume of Ω and define the linear size L of Ω through L^D = |Ω|. The ergodic estimate Eq. (4) is then $\hat{X}_{\Omega} = \frac{1}{|\Omega|} \int_{\Omega} X(\mathbf{y})\, \mathrm{d}\mathbf{y}.$ (A.1)

To obtain its variance, one has to compute the double integral $\begin{aligned} \mathrm{Var}(\hat{X}_{\Omega}) &= \frac{1}{|\Omega|^{2}} \iint_{\Omega^{2}} \mathbb{E}\left(X(\mathbf{y}) X(\mathbf{z}) - \mathbb{E}(X)^{2}\right) \mathrm{d}\mathbf{y}\, \mathrm{d}\mathbf{z} \\ &= \frac{1}{|\Omega|^{2}} \iint_{\Omega^{2}} C_{X}(\mathbf{y} - \mathbf{z})\, \mathrm{d}\mathbf{y}\, \mathrm{d}\mathbf{z}. \end{aligned}$ (A.2)

Using the change of variables (u, v) = ϕ(y, z) = (y − z, y + z), one obtains $\mathrm{Var}(\hat{X}_{\Omega}) = \frac{1}{|\Omega|^{2}} \int_{2\Omega} C_{X}(\mathbf{u}) \int_{\varphi_{2}^{\mathbf{u}}(\Omega)} \frac{\mathrm{d}\mathbf{u}\, \mathrm{d}\mathbf{v}}{2^{D}}$ (A.3)

where $\varphi_{2}^{\mathbf{u}}(\Omega) = 2\left((\Omega - \mathbf{u}) \cap \Omega\right) + \mathbf{u}$ (A.4)

which gives $\mathrm{Var}(\hat{X}_{\Omega}) = \frac{1}{|\Omega|} \int_{2\Omega} C_{X}(\mathbf{u})\, \frac{|(\Omega - \mathbf{u}) \cap \Omega|}{|\Omega|}\, \mathrm{d}\mathbf{u}.$ (A.5)

We then obtain the general form of Slutsky’s theorem: X is mean ergodic if and only if $\frac{1}{L^{D}} \int_{2\Omega} C_{X}(\mathbf{u})\, \mathrm{d}\mathbf{u} \xrightarrow[L \to \infty]{} 0.$ (A.6)
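As an illustration of the reduction from Eq. (A.2) to Eq. (A.5), the following sketch compares the two expressions numerically in D = 1, where Ω = [−L/2, L/2], 2Ω = [−L, L] and the overlap |(Ω − u) ∩ Ω| equals L − |u|. The exponential ACF, the values of L and λ, and the function names are illustrative assumptions, not quantities taken from the article.

```python
import math

def acf(u, lam=1.0):
    """Exponential ACF with unit variance (an illustrative choice)."""
    return math.exp(-abs(u) / lam)

def var_double_integral(L, n=400):
    """Eq. (A.2) in D = 1: midpoint rule over the square Omega x Omega."""
    h = L / n
    pts = [-L / 2 + (i + 0.5) * h for i in range(n)]
    total = sum(acf(y - z) for y in pts for z in pts)
    return total * h * h / L**2

def var_single_integral(L, n=4000):
    """Eq. (A.5) in D = 1: the overlap |(Omega - u) ∩ Omega| equals L - |u|."""
    h = 2 * L / n
    total = 0.0
    for i in range(n):
        u = -L + (i + 0.5) * h
        total += acf(u) * (1 - abs(u) / L)
    return total * h / L

L = 10.0
v2 = var_double_integral(L)
v1 = var_single_integral(L)
print(v2, v1)
```

Both numbers should agree to within the quadrature error, and for L ≫ λ they approach 2λ/L, that is, the variance of the field divided by an effective number of independent samples.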

Appendix B Ergodic Estimators of the CMF and PDF

Appendix B.1 Cumulative Distribution Function (CMF)

The CMF of the stochastic field X can be constructed as the average of a particular function of the field X. Indeed, by definition, F_X(x₀) = ℙ(X(y) ≤ x₀) and a simple calculation shows that $\mathbb{P}\left(X(\mathbf{y}) \le x_{0}\right) = \mathbb{E}\left(h_{x_{0}}(X(\mathbf{y}))\right)$ with $h_{x_{0}}(z) = \Theta(x_{0} - z)$, where Θ is the Heaviside step function. We are then ready to determine the confidence interval for the estimated CMF F_X of X. To do so, we need to apply the results of Sect. 2 to the field $h_{x_{0}}(X(\mathbf{y}))$. The “natural” ergodic estimator of F_X(x₀) is thus: $\hat{F}_{L}(x_{0}) = \frac{1}{L^{D}} \int_{\left[-\frac{L}{2}, \frac{L}{2}\right]^{D}} h_{x_{0}}(X(\mathbf{y}))\, \mathrm{d}\mathbf{y}.$ (B.1)

Then, to obtain the variance of $\hat{F}_{L}(x_{0})$ we need to express the ACF of $h_{x_{0}}(X(\mathbf{y}))$. We have $C_{h_{x_{0}}}(\mathbf{y}) = F_{X}^{(2)}(x_{0}, x_{0}, \mathbf{y}) - F_{X}(x_{0})^{2}$ (B.2)

where $F_{X}^{(2)}(x_{0}, x_{0}, \mathbf{y}) = \mathbb{P}\left(X(\mathbf{u} + \mathbf{y}) \le x_{0} \text{ and } X(\mathbf{u}) \le x_{0}\right)$ is the second-order distribution function, that is, the probability to have both X(u + y) ≤ x₀ and X(u) ≤ x₀. The variance of $\hat{F}_{L}(x_{0})$ is then $\mathrm{Var}\left(\hat{F}_{L}(x_{0})\right) = \frac{1}{L^{D}} \int_{[-L, L]^{D}} C_{h_{x_{0}}}(\mathbf{y}) \prod_{k=1}^{D} \left(1 - \frac{|y_{k}|}{L}\right) \mathrm{d}\mathbf{y}$ (B.3) $\begin{aligned} &\simeq C_{h_{x_{0}}}(\mathbf{0}) \left(\frac{l_{c}(h_{x_{0}})}{R}\right)^{D} \\ &= F_{X}(x_{0}) \left(1 - F_{X}(x_{0})\right) \left(\frac{l_{c}(h_{x_{0}})}{R}\right)^{D}, \end{aligned}$ (B.4)

provided $C_{h_{x_{0}}}$ is integrable, so that one can define $l_{c}(h_{x_{0}})$. Again, comparing with the result for a repeated trial experiment where N samples of X(y) are drawn (for the same point y) shows that the ratio $\left(R / l_{c}(h_{x_{0}})\right)^{D}$ serves as an effective number N of trials (see e.g. Papoulis & Pillai 1965).

For practical purposes, and in order to give a confidence interval, when F_X is not known one can use the estimate $\hat{F}_{L}$ in Eq. (B.4) (Papoulis & Pillai 1965). Furthermore, here, $l_{c}(h_{x_{0}})$ is a function of x₀ and cannot in general be simply estimated from l_c(X). The length $l_{c}(h_{x_{0}})$ can, however, be estimated by repeating the experiment several times and using the results of Sect. 3.1.
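The role of the effective number of trials in Eq. (B.4) can be illustrated with a small Monte Carlo experiment. The sketch below is a toy example under stated assumptions: the field is replaced by a one-dimensional Gaussian AR(1) series (a hypothetical stand-in, not the article's data), the CMF is estimated at the median, and the experiment is repeated over many realizations to measure the variance of the estimator. All names and parameter values are illustrative.

```python
import math
import random

def ar1_realization(n, phi, rng):
    """Stationary Gaussian AR(1) series with unit variance and ACF phi**|k|."""
    x = rng.gauss(0.0, 1.0)
    out = [x]
    s = math.sqrt(1.0 - phi * phi)
    for _ in range(n - 1):
        x = phi * x + s * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def cmf_estimate(field, x0):
    """Discrete version of Eq. (B.1): fraction of points with X <= x0."""
    return sum(1 for v in field if v <= x0) / len(field)

def variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(0)
n, phi, x0, n_real = 1000, 0.9, 0.0, 200
estimates = [cmf_estimate(ar1_realization(n, phi, rng), x0) for _ in range(n_real)]
var_measured = variance(estimates)
F = 0.5                              # exact CMF of a centered Gaussian at x0 = 0
var_independent = F * (1 - F) / n    # what n independent trials would give
n_eff = F * (1 - F) / var_measured   # effective number of trials, cf. Eq. (B.4)
print(var_measured / var_independent, n_eff)
```

In this toy setting the measured variance is several times larger than the value expected for n independent draws, so the effective number of trials is well below the number of sampled points, which is the behavior described by Eq. (B.4).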

Appendix B.2 Probability Density Function (PDF)

To build an estimator of the PDF f_X(x₀) of X we do not use the definition $f_{X}(x_{0}) = \mathbb{E}\left(\delta(X(\mathbf{y}) - x_{0})\right)$ but the common approximation, suited for data analysis, $f_{X}(x_{0}) \Delta x \simeq F_{X}(x_{0} + \Delta x) - F_{X}(x_{0}) = \mathbb{E}\left(h_{x_{0} + \Delta x}(X(\mathbf{y})) - h_{x_{0}}(X(\mathbf{y}))\right)$ for a sufficiently small bin spacing Δx. Defining $g_{x_{0}}(X(\mathbf{y})) = h_{x_{0} + \Delta x}(X(\mathbf{y})) - h_{x_{0}}(X(\mathbf{y}))$, we build the estimator $\hat{f}_{L}(x_{0}) \Delta x = \frac{1}{L^{D}} \int_{\left[-\frac{L}{2}, \frac{L}{2}\right]^{D}} g_{x_{0}}(X(\mathbf{y}))\, \mathrm{d}\mathbf{y}.$ (B.5)

The ACF of $g_{x_{0}}(X)$ is $\begin{aligned} C_{g_{x_{0}}}(\mathbf{y}) ={}& F_{X}^{(2)}(x_{0} + \Delta x, x_{0} + \Delta x, \mathbf{y}) + F_{X}^{(2)}(x_{0}, x_{0}, \mathbf{y}) \\ & - F_{X}^{(2)}(x_{0} + \Delta x, x_{0}, \mathbf{y}) - F_{X}^{(2)}(x_{0}, x_{0} + \Delta x, \mathbf{y}) \\ & - \left(F_{X}(x_{0} + \Delta x) - F_{X}(x_{0})\right)^{2} \end{aligned}$ (B.6)

with $\begin{aligned} C_{g_{x_{0}}}(\mathbf{0}) &= F_{X}(x_{0} + \Delta x) - F_{X}(x_{0}) - \left(F_{X}(x_{0} + \Delta x) - F_{X}(x_{0})\right)^{2} \\ &\simeq f_{X}(x_{0}) \Delta x \left(1 - f_{X}(x_{0}) \Delta x\right) + O(\Delta x^{2}) \end{aligned}$ (B.7) $\simeq f_{X}(x_{0}) \Delta x + O(\Delta x^{2}).$ (B.8)

We then know that a sufficient condition for X to be density ergodic is that either $C_{g_{x_{0}}}(\mathbf{y}) \xrightarrow[|\mathbf{y}| \to \infty]{} 0$ or $C_{g_{x_{0}}}(\mathbf{y})$ is integrable.

To find out how rapidly $C_{g_{x_{0}}}(\mathbf{y})$ decays to zero we note that $C_{g_{x_{0}}}(\mathbf{y}) \simeq \left(\frac{\partial^{2} F_{X}^{(2)}}{\partial x_{1} \partial x_{2}}(x_{0}, x_{0}, \mathbf{y}) - f_{X}(x_{0})^{2}\right) \Delta x^{2} + O(\Delta x^{3})$ (B.9) $= \left(f_{X}^{(2)}(x_{0}, x_{0}, \mathbf{y}) - f_{X}(x_{0})^{2}\right) \Delta x^{2} + O(\Delta x^{3}),$ (B.10)

where $f_{X}^{(2)}$ is the second-order density function. Eqs. (B.9) and (B.10) are only valid for y ≠ 0 because $f_{X}^{(2)}$ is degenerate for y = 0, as $F_{X}^{(2)}(x_{1}, x_{2}, \mathbf{0}) = F_{X}(\min(x_{1}, x_{2}))$, where min(x₁, x₂) is not differentiable. The variance of the ergodic estimator $\hat{f}_{L, x_{0}}$ is then $\mathrm{Var}(\hat{f}_{L, x_{0}}) = \left(f_{X}(x_{0}) \Delta x\right) \left(\frac{l_{c}(g_{x_{0}})}{R}\right)^{D},$ (B.11)

where $l_{c}(g_{x_{0}})^{D} \propto \Delta x$ (see Eq. (B.10)).

Appendix B.3 Gaussian Process

If the field X(y) is Gaussian we have $f_{X, G}^{(2)}(x_{1}, x_{2}, \mathbf{y}) = \frac{1}{2 \pi |\underline{\Sigma}(\mathbf{y})|^{1/2}} \exp\left(-\frac{1}{2}\, \mathbf{x}_{\mu}^{T}\, \underline{\Sigma}(\mathbf{y})^{-1}\, \mathbf{x}_{\mu}\right)$ (B.12)

where x_µ = (x₁ − µ, x₂ − µ) with $\mu = \mathbb{E}(X)$, $|\underline{\Sigma}(\mathbf{y})|$ is the determinant of the matrix $\underline{\Sigma}(\mathbf{y})$, and $\underline{\Sigma}(\mathbf{y}) = \begin{pmatrix} \sigma(X)^{2} & C_{X}(\mathbf{y}) \\ C_{X}(\mathbf{y}) & \sigma(X)^{2} \end{pmatrix}.$ (B.13)

We see that, as $|\underline{\Sigma}(\mathbf{y})| = \sigma(X)^{4} - C_{X}(\mathbf{y})^{2} = \left(\sigma(X)^{2} - C_{X}(\mathbf{y})\right)\left(\sigma(X)^{2} + C_{X}(\mathbf{y})\right)$, $f_{X, G}^{(2)}$ is degenerate for y = 0. However, for y ≠ 0, and writing x_{0,µ} = x₀ − µ, we have $\begin{aligned} f_{X, G}^{(2)}(x_{0}, x_{0}, \mathbf{y}) &= \frac{1}{2 \pi |\underline{\Sigma}(\mathbf{y})|^{1/2}} \exp\left(-x_{0, \mu}^{2}\, \frac{\sigma(X)^{2} - C_{X}(\mathbf{y})}{\sigma(X)^{4} - C_{X}(\mathbf{y})^{2}}\right) \\ &= \frac{1}{2 \pi |\underline{\Sigma}(\mathbf{y})|^{1/2}} \exp\left(-\frac{(x_{0} - \mu)^{2}}{\sigma(X)^{2} + C_{X}(\mathbf{y})}\right) \end{aligned}$ (B.14)

Hence $\begin{aligned} C_{g_{x_{0}}}(\mathbf{y}) \simeq{}& \left(\frac{1}{\left(\left(1 + \frac{C_{X}(\mathbf{y})}{\sigma(X)^{2}}\right)\left(1 - \frac{C_{X}(\mathbf{y})}{\sigma(X)^{2}}\right)\right)^{1/2}} \exp\left(\frac{C_{X}(\mathbf{y}) (x_{0} - \mu)^{2}}{\sigma(X)^{4} \left(1 + \frac{C_{X}(\mathbf{y})}{\sigma(X)^{2}}\right)}\right) - 1\right) \\ & \times \frac{\Delta x^{2}}{2 \pi \sigma(X)^{2}} \exp\left(-\frac{(x_{0} - \mu)^{2}}{\sigma(X)^{2}}\right) + O(\Delta x^{3}) \end{aligned}$ (B.15)

Introducing the normalized ACF $\tilde{C}_{X} = C_{X} / C_{X}(\mathbf{0}) = C_{X} / \sigma(X)^{2}$ and the reduced variable $x_{0}^{r} = (x_{0} - \mu) / \sigma(X)$ we have $\begin{aligned} C_{g_{x_{0}}}(\mathbf{y}) \simeq{}& \left(\frac{1}{\left(1 - \tilde{C}_{X}(\mathbf{y})^{2}\right)^{1/2}} \exp\left(\frac{\tilde{C}_{X}(\mathbf{y}) (x_{0}^{r})^{2}}{1 + \tilde{C}_{X}(\mathbf{y})}\right) - 1\right) \\ & \times \frac{\Delta x^{2}}{2 \pi \sigma(X)^{2}} \exp\left(-(x_{0}^{r})^{2}\right) + O(\Delta x^{3}) \end{aligned}$ (B.16) $\begin{aligned} \simeq{}& \left(\frac{1}{\left(1 - \tilde{C}_{X}(\mathbf{y})^{2}\right)^{1/2}} \exp\left(\frac{\tilde{C}_{X}(\mathbf{y}) (x_{0}^{r})^{2}}{1 + \tilde{C}_{X}(\mathbf{y})}\right) - 1\right) \\ & \times f_{X}(x_{0})^{2} (\Delta x)^{2} + O(\Delta x^{3}). \end{aligned}$ (B.17)

Appendix B.3.1 Integrability of the ACF and Short Scale Analysis

If C_X decays to zero (as assumed) then for |y| → ∞, we have $C_{g_{x_{0}}}(\mathbf{y}) \sim \tilde{C}_{X}(\mathbf{y}) (x_{0}^{r})^{2} f_{X}(x_{0})^{2} (\Delta x)^{2}$. Thus, if C_X is integrable, then so is $C_{g_{x_{0}}}$ at |y| → ∞.
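The large-scale behavior quoted above can be checked numerically on the bracketed factor of Eq. (B.17). The following sketch evaluates that factor for decreasing values of the normalized ACF and compares it with the leading term; the function name and the chosen values are illustrative assumptions.

```python
import math

def acf_prefactor(c_tilde, x0r):
    """Bracketed factor of Eq. (B.17) for a normalized ACF value 0 <= c_tilde < 1."""
    return math.exp(c_tilde * x0r**2 / (1.0 + c_tilde)) / math.sqrt(1.0 - c_tilde**2) - 1.0

x0r = 2.0
for c in (1e-1, 1e-2, 1e-4):
    exact = acf_prefactor(c, x0r)
    leading = c * x0r**2          # leading term as c_tilde -> 0, valid for x0r != 0
    print(c, exact, exact / leading)
```

The printed ratio tends to 1 as the normalized ACF decreases. For x₀ʳ = 0 the linear term vanishes and the factor is instead of second order in the normalized ACF.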

As mentioned above, Eq. (B.17) is only valid for |y| > 0, so the divergence at y = 0 is artificial, since $C_{g_{x_{0}}}(\mathbf{0}) \simeq f_{X}(x_{0}) \Delta x$ is finite. However, if Eq. (B.17) is integrable at y = 0, this ensures that the errors of approximation of $C_{g_{x_{0}}}$ near y = 0 have a small effect on the estimation of $l_{c}(g_{x_{0}})$ (which is an integral). The divergence of Eq. (B.17) at y = 0 is given by $\frac{1}{\left(1 - \tilde{C}_{X}(\mathbf{y})^{2}\right)^{1/2}} \exp\left(-\frac{1}{2} (x_{0}^{r})^{2}\right) \times f_{X}(x_{0})^{2} (\Delta x)^{2}.$ (B.18)

For an exponential isotropic ACF this yields a divergence ∝ r^–1/2, while for a differentiable field X with an ACF being isotropic at short scales this yields a divergence ∝ r^–1. Thus, in most cases for D ≥ 2 Eq. (B.17) is integrable at |y| → 0.

Computing the integral of $C_{g_{x_{0}}}(\mathbf{y})$ is not straightforward for an arbitrary decaying and integrable ACF C_X(y). Expanding the exponential in Eq. (B.17), we have $\exp\left(\frac{\tilde{C}_{X}(\mathbf{y}) (x_{0}^{r})^{2}}{1 + \tilde{C}_{X}(\mathbf{y})}\right) = 1 + \sum_{n \ge 1} \frac{(x_{0}^{r})^{2n}}{n!} \left(\frac{\tilde{C}_{X}(\mathbf{y})}{1 + \tilde{C}_{X}(\mathbf{y})}\right)^{n}.$ (B.19)

We then have to specify or bound the integrals $\frac{1}{2^{D}} \int_{\mathbb{R}^{D}} \left(\frac{\tilde{C}_{X}(\mathbf{y})}{1 + \tilde{C}_{X}(\mathbf{y})}\right)^{n} \frac{\mathrm{d}\mathbf{y}}{\left(1 - \tilde{C}_{X}(\mathbf{y})^{2}\right)^{1/2}} = l_{c}(X)^{D} c_{n},$ (B.20) $\frac{1}{2^{D}} \int_{\mathbb{R}^{D}} \left(\frac{1}{\left(1 - \tilde{C}_{X}(\mathbf{y})^{2}\right)^{1/2}} - 1\right) \mathrm{d}\mathbf{y} = l_{c}(X)^{D} c_{0},$ (B.21)

to obtain $\frac{1}{2^{D}} \int_{\mathbb{R}^{D}} C_{g_{x_{0}}}(\mathbf{y})\, \mathrm{d}\mathbf{y} = l_{c}(X)^{D}\, \varphi(x_{0}^{r}) \times f_{X}(x_{0})^{2} (\Delta x)^{2} + O(\Delta x^{3}),$ (B.22)

where $\varphi(x_{0}^{r})$ is a function of x₀ which we need to bound to obtain a confidence interval. A lower bound of $\varphi(x_{0}^{r})$ can be obtained from the convexity of the exponential: $\varphi(x_{0}^{r}) \ge c_{0} + c_{1} (x_{0}^{r})^{2}$ (B.23)
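The step leading to Eq. (B.23) can be spelled out as follows. This is a sketch reconstructed from Eqs. (B.17) and (B.20)–(B.22), not a derivation quoted from the text. Convexity of the exponential gives e^z ≥ 1 + z for every real z, so that, pointwise in y,

$\frac{1}{\left(1 - \tilde{C}_{X}^{2}\right)^{1/2}} \exp\left(\frac{\tilde{C}_{X} (x_{0}^{r})^{2}}{1 + \tilde{C}_{X}}\right) - 1 \ge \left(\frac{1}{\left(1 - \tilde{C}_{X}^{2}\right)^{1/2}} - 1\right) + (x_{0}^{r})^{2}\, \frac{\tilde{C}_{X}}{1 + \tilde{C}_{X}}\, \frac{1}{\left(1 - \tilde{C}_{X}^{2}\right)^{1/2}}.$

Integrating both sides over ℝ^D with the factor 2^–D, the first term on the right-hand side yields l_c(X)^D c₀ by Eq. (B.21) and the second yields l_c(X)^D c₁ (x₀ʳ)² by Eq. (B.20) with n = 1. Comparing with the definition of φ in Eq. (B.22) gives φ(x₀ʳ) ≥ c₀ + c₁ (x₀ʳ)².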

For general monotonically decreasing (hence positive) ACFs, the study of the functions $\frac{x}{1 + x} \frac{1}{(1 - x^{2})^{1/2}}$ and $\frac{1}{(1 - x^{2})^{1/2}} - 1$ shows that c₀ ≳ 0.1 and c₁ ≥ 0.77.

Appendix B.3.2 Exponential ACF

To go a little further and obtain a formula that will help to grasp some expected features of the ergodic estimate of $g_{x_{0}}$ we study the special case of an exponential ACF. For the present study we limit ourselves to the case D ≤ 2. Then if $\tilde{C}_{X}$ is an (isotropic) exponential, $\tilde{C}_{X}(\mathbf{y}) = \exp(-|\mathbf{y}| / \lambda)$, we can bound the integral of Eq. (B.17). Indeed, for n ≥ 1, $\frac{l_{c}(X)^{D}}{2^{n - 1}} \ge \frac{1}{2^{D}} \int_{\mathbb{R}^{D}} \left(\frac{\tilde{C}_{X}(\mathbf{y})}{1 + \tilde{C}_{X}(\mathbf{y})}\right)^{n} \times \frac{\mathrm{d}\mathbf{y}}{\left(1 - \tilde{C}_{X}(\mathbf{y})^{2}\right)^{1/2}}.$ (B.24)

We then have $\begin{aligned} \frac{1}{2^{D}} \int_{\mathbb{R}^{D}} C_{g_{x_{0}}}(\mathbf{y})\, \mathrm{d}\mathbf{y} \le{}& l_{c}(X)^{D} \left(2 \exp\left(\frac{(x_{0}^{r})^{2}}{2}\right) + \tilde{c}_{0}^{D} - 2\right) \\ & \times \left(f_{X}(x_{0}) \Delta x\right)^{2} + O(\Delta x^{3}) \end{aligned}$ (B.25)

where $\tilde{c}_{0}^{D} = \ln 2 \simeq 0.69$ and 0.17 for D = 1 and D = 2, respectively. This gives an upper bound to the correlation length of $g_{x_{0}}$ but overestimates its value for $|x_{0}^{r}| \gg 1$. However, near the average ($|x_{0}^{r}| \ll 1$) we can approximate $\begin{aligned} \frac{1}{2^{D}} \int_{\mathbb{R}^{D}} C_{g_{x_{0}}}(\mathbf{y})\, \mathrm{d}\mathbf{y} \simeq{}& l_{c}(X)^{D} \left(\tilde{c}_{1}^{D} (x_{0}^{r})^{2} + \tilde{c}_{0}^{D}\right) \\ & \times \left(f_{X}(x_{0}) \Delta x\right)^{2} + O(\Delta x^{3}), \end{aligned}$ (B.26)

where $\tilde{c}_{1}^{D} = 1$ and 0.88 for D = 1 and D = 2, respectively. We note that, due to the convexity of the exponential, the right-hand side of Eq. (B.26) is actually a lower bound of the integral for all x₀.
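The constants entering Eqs. (B.25) and (B.26) can be evaluated numerically for an isotropic exponential ACF. The sketch below rests on one assumption that is not restated in this appendix: that the correlation length is normalized as l_c(X)^D = 2^–D ∫ C̃_X(y) dy, so that the integrals of Eq. (B.21) and of Eq. (B.20) with n = 1 reduce to radial integrals in t = |y|/λ divided by Γ(D). The function name and the quadrature settings are illustrative.

```python
import math

def exp_acf_constants(D, n=200000):
    """Midpoint-rule estimates of c0 (Eq. B.21) and c1 (Eq. B.20 with n = 1)
    for an isotropic exponential ACF in D = 1 or 2.

    Assumption: l_c(X)**D = 2**(-D) * integral of the normalized ACF.
    The substitution exp(-t) = sin(theta) removes the inverse-square-root
    singularity of the integrands at t = 0.
    """
    h = (math.pi / 2) / n
    c0 = c1 = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        s = math.sin(theta)
        radial = (-math.log(s)) ** (D - 1)   # t**(D-1) with t = -ln(sin(theta))
        c0 += math.tan(theta / 2) * radial * h
        c1 += radial / (1.0 + s) * h
    g = math.gamma(D)                        # radial integral of t**(D-1) * exp(-t)
    return c0 / g, c1 / g

for D in (1, 2):
    print(D, exp_acf_constants(D))
```

For D = 1 the two integrals have the closed forms ln 2 and 1. For D = 2 the numerical values can be compared with the constants 0.17 and 0.88 quoted in the text.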

We can then construct a confidence interval, with a confidence level greater than 1 − 1/m², such that the true f_X(x₀) lies in $f_{X}(x_{0}) = \hat{f}_{L}(x_{0}) \left(1 \pm m\, \varphi(x_{0}^{r})^{1/2} \left(\frac{l_{c}(X)}{R}\right)^{D/2}\right),$

where $\tilde{c}_{1}^{D} (x_{0}^{r})^{2} + \tilde{c}_{0}^{D} \le \varphi(x_{0}^{r}) \le 2 \exp\left(\frac{(x_{0}^{r})^{2}}{2}\right) + \tilde{c}_{0}^{D} - 2.$ (B.27)

Using the lower bound to approximate $\varphi(x_{0}^{r})$, that is, $\varphi(x_{0}^{r}) \simeq \tilde{c}_{1}^{D} (x_{0}^{r})^{2} + \tilde{c}_{0}^{D}$, while accurate for $|x_{0}^{r}| \ll 1$, most probably underestimates it for $|x_{0}^{r}| \gg 1$. However, it shows that the statistics of events that deviate largely from the mean require an increasingly large sample size to reach a high degree of confidence.
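To make the growth of the error bars with |x₀ʳ| concrete, the following sketch evaluates the relative half-width of the confidence interval for both bounds of Eq. (B.27). It is an illustration under stated assumptions: the values 0.17 and 0.88 quoted in the text for D = 2 are taken here as c̃₀ and c̃₁ (an assumption about which constant each value belongs to), and the ratio l_c/R = 0.01 and m = 2 are illustrative choices. The function names are hypothetical.

```python
import math

def phi_bounds(x0r, c0, c1):
    """Lower and upper bounds of phi(x0r) from Eq. (B.27)."""
    lower = c1 * x0r**2 + c0
    upper = 2.0 * math.exp(x0r**2 / 2.0) + c0 - 2.0
    return lower, upper

def relative_error_bar(x0r, lc_over_R, D, m, c0, c1):
    """Relative half-widths m * sqrt(phi) * (l_c/R)**(D/2) for both bounds of phi."""
    lower, upper = phi_bounds(x0r, c0, c1)
    scale = m * lc_over_R ** (D / 2.0)
    return scale * math.sqrt(lower), scale * math.sqrt(upper)

# Assumed constants for D = 2 and an exponential ACF (see lead-in)
c0, c1 = 0.17, 0.88
for x0r in (0.0, 1.0, 3.0):
    print(x0r, relative_error_bar(x0r, lc_over_R=0.01, D=2, m=2, c0=c0, c1=c1))
```

At x₀ʳ = 0 the two bounds coincide, while they separate quickly in the tails, which reflects the loss of accuracy for values far from the mean.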

Appendix B.4 Deterministic Function of a Gaussian Field

The results derived in Sect. B.3 can be extended to the case where X(y) = ψ(S(y)) with ψ a diffeomorphism and S a Gaussian field. A particular example is that of a lognormal field where ψ = exp. We call this function ψ a deterministic function because statistical properties of the field X can be obtained from those of S. Indeed, for such a field X, the first- and second-order density functions read $f_{X}(x_{0}) = f_{S}(s_{0})\, |(\psi^{-1})'(x_{0})|$ (B.28) $f_{X}^{(2)}(x_{1}, x_{2}; \mathbf{y}) = f_{S}^{(2)}(s_{1}, s_{2}; \mathbf{y})\, |(\psi^{-1})'(x_{1})|\, |(\psi^{-1})'(x_{2})|,$ (B.29)

where s_j = ψ^–1(x_j) (Papoulis & Pillai 1965). Without loss of generality we can further assume that the field S is centered with unit variance. We note that the function ψ can be obtained by inverting Eq. (B.28). Indeed, if only f_X and f_S are known we can obtain ψ by noting that ψ^–1 satisfies the differential equation: $|(\psi^{-1})'(x_{0})| = \frac{f_{X}(x_{0})}{f_{S}\left(\psi^{-1}(x_{0})\right)}.$ (B.30)

If one further assumes that ψ is an increasing diffeomorphism, (ψ^–1)′ ≥ 0, one obtains $s_{0} = \psi^{-1}(x_{0}) = \sqrt{2}\, \mathrm{erf}^{-1}\left(-1 + 2 \int_{x_{\min}}^{x_{0}} f_{X}(x)\, \mathrm{d}x\right),$ (B.31)

where x_min is the minimum value that can be taken by the field X and erf^–1 is the inverse of the error function. The use of this equation requires a high precision on f_X due to the large variation of erf^–1 , which is complicated in general.
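Eq. (B.31) states that s₀ is the standard normal quantile of the CMF of X at x₀. A minimal sketch of this mapping is given below, using the identity √2 erf⁻¹(2F − 1) = Φ⁻¹(F) and Python's `statistics.NormalDist`. The lognormal test case and the function names are illustrative assumptions: for X = exp(S) with S a standard Gaussian, the mapping should return ln x₀.

```python
import math
from statistics import NormalDist

def s0_from_cmf(F_x0):
    """Eq. (B.31): sqrt(2) * erfinv(2F - 1) is the standard normal quantile of F."""
    return NormalDist().inv_cdf(F_x0)

def lognormal_cmf(x0, mu=0.0, sigma=1.0):
    """CMF of X = exp(S) with S Gaussian of mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((math.log(x0) - mu) / (sigma * math.sqrt(2.0))))

for x0 in (0.5, 1.0, math.e):
    print(x0, s0_from_cmf(lognormal_cmf(x0)), math.log(x0))
```

`inv_cdf` is only defined for 0 < F < 1 and varies steeply near both ends, which is the practical counterpart of the precision requirement on f_X mentioned above.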

Then the ACF of X can be obtained by performing the integral: $C_{X} (y) = \int ψ (s_{1}) ψ (s_{2}) (f_{S}^{(2)} (s_{1}, s_{2}; y) - f_{S} (s_{1}) f_{S} (s_{2})) d s_{1} d s_{2} .$ ${C_X}\left( {\bf{y}} \right) = \int \psi \left( {{s_1}} \right)\psi \left( {{s_2}} \right)\left( {f_S^{\left( 2 \right)}\left( {{s_1},\;{s_2};{\bf{y}}} \right) - {f_S}\left( {{s_1}} \right){f_S}\left( {{s_2}} \right)} \right){\rm{d}}{s_1}{\rm{d}}{s_2}.$

Then Eq. (B.17) becomes

$C_{g_{x_{0}}}(\mathbf{y}) \simeq \left( \frac{1}{\left( 1 - \tilde{C}_{S}(\mathbf{y})^{2} \right)^{1/2}} \exp\left( \frac{\tilde{C}_{S}(\mathbf{y})\, s_{0}^{2}}{1 + \tilde{C}_{S}(\mathbf{y})} \right) - 1 \right) f_{X}(x_{0})^{2}\, (\Delta x)^{2} + \mathrm{O}(\Delta x^{3}).$ (B.32)
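For a centered, unit-variance Gaussian pair with correlation coefficient C̃_S(y), the bracket in Eq. (B.32) is the ratio f_S^(2)(s₀, s₀; y)/f_S(s₀)² minus one. The short sketch below evaluates this factor and compares it with the explicit Gaussian densities. The function names are hypothetical, and the comparison is only meant as a sanity check of the formula as written.

```python
import math


def bin_acf_factor(c_tilde, s0):
    """Bracketed factor of Eq. (B.32) for a reduced ACF value c_tilde, |c_tilde| < 1."""
    return (math.exp(c_tilde * s0 ** 2 / (1.0 + c_tilde))
            / math.sqrt(1.0 - c_tilde ** 2) - 1.0)


def gaussian_density_ratio(c_tilde, s0):
    """f_S^(2)(s0, s0) / f_S(s0)^2 - 1 computed from the explicit Gaussian densities."""
    one_point = math.exp(-0.5 * s0 ** 2) / math.sqrt(2.0 * math.pi)
    # Quadratic form of the bivariate normal evaluated at s1 = s2 = s0.
    quad = (2.0 * s0 ** 2 - 2.0 * c_tilde * s0 ** 2) / (2.0 * (1.0 - c_tilde ** 2))
    two_point = math.exp(-quad) / (2.0 * math.pi * math.sqrt(1.0 - c_tilde ** 2))
    return two_point / one_point ** 2 - 1.0


def bin_acf(c_tilde, s0, f_x0, delta_x):
    """Leading-order C_{g_{x0}} of Eq. (B.32), dropping the O(delta_x^3) term."""
    return bin_acf_factor(c_tilde, s0) * (f_x0 * delta_x) ** 2
```

The factor vanishes for uncorrelated points (C̃_S = 0) and, at fixed C̃_S > 0, grows exponentially with s₀².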

Appendix B.4.1 Log-normal Fields

For a log-normal field ρ = exp(s), ψ = exp, ψ^–1 = ln, and s is in general neither centered ($\mathbb{E}(s) \neq 0$) nor of unit variance (σ(s) ≠ 1). The calculation of the ACF then yields

$C_{\rho}(\mathbf{y}) = \mathbb{E}(\rho)^{2} \left( e^{C_{s}(\mathbf{y})} - 1 \right).$ (B.33)
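Eq. (B.33) can be cross-checked by evaluating the integral for C_X numerically. The sketch below does so with a plain Riemann sum over two independent unit Gaussians z₁ and z₂, building the correlated pair as S₁ = z₁ and S₂ = r z₁ + √(1 − r²) z₂, and compares the result with E(ρ)²(e^{C_s} − 1) for ψ(S) = exp(μ + σS), for which C_s = σ² r. The function names, grid extent, resolution, and parameter values are arbitrary illustrative choices.

```python
import math


def acf_of_transformed_field(psi, r, z_max=8.0, n=400):
    """Covariance of psi(S1) and psi(S2) for unit Gaussians with correlation r."""
    h = 2.0 * z_max / n
    nodes = [-z_max + i * h for i in range(n + 1)]
    # Riemann-sum weights for the standard normal measure (tails beyond z_max neglected).
    weights = [h * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) for z in nodes]
    mean = sum(w * psi(z) for z, w in zip(nodes, weights))
    c = math.sqrt(1.0 - r * r)
    second_moment = 0.0
    for z1, w1 in zip(nodes, weights):
        p1 = psi(z1)
        for z2, w2 in zip(nodes, weights):
            second_moment += w1 * w2 * p1 * psi(r * z1 + c * z2)
    return second_moment - mean * mean


def lognormal_acf_closed_form(mu, sigma, r):
    """Right-hand side of Eq. (B.33) with C_s = sigma^2 * r."""
    mean_rho = math.exp(mu + 0.5 * sigma ** 2)
    return mean_rho ** 2 * math.expm1(sigma ** 2 * r)


if __name__ == "__main__":
    mu, sigma, r = 0.3, 1.0, 0.5
    numeric = acf_of_transformed_field(lambda z: math.exp(mu + sigma * z), r)
    print(numeric, lognormal_acf_closed_form(mu, sigma, r))
```

The same routine accepts any other transformation ψ, for which no closed form may be available.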

As a consequence, because $e^{ax} - 1 \le x\,(e^{a} - 1)$ for 0 ≤ x ≤ 1 and $ax \le e^{ax} - 1$ ∀x, if C_ρ (or C_s) is monotonically decaying to 0 then

$\left( \frac{\sigma(s)^{2}}{e^{\sigma(s)^{2}} - 1} \right)^{1/3} l_{c}(s) \le l_{c}(\rho) \le l_{c}(s).$ (B.34)

In typical star-forming conditions σ(s)² ≲ 4, giving

$0.4\, l_{c}(s) \lesssim l_{c}(\rho) \le l_{c}(s),$ (B.35)

or

$l_{c}(s) \sim l_{c}(\rho).$ (B.36)

This suggests that, as long as Var(X) = Var(ψ(S)) is not too large, one can expect l_c(X) ~ l_c(S).
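The prefactor of the lower bound in Eq. (B.34) is easy to tabulate. The sketch below evaluates it for a few values of σ(s)²; the function name and the chosen values are illustrative only. For σ(s)² = 4 it returns about 0.42, consistent with the rounded value 0.4 quoted in Eq. (B.35).

```python
import math


def lc_lower_bound_factor(sigma_s_sq):
    """Prefactor (sigma^2 / (exp(sigma^2) - 1))^(1/3) of Eq. (B.34)."""
    if sigma_s_sq == 0.0:
        return 1.0  # limit of x / (e^x - 1) as x -> 0
    # expm1 keeps the ratio accurate for small variances.
    return (sigma_s_sq / math.expm1(sigma_s_sq)) ** (1.0 / 3.0)


if __name__ == "__main__":
    for var_s in (0.5, 1.0, 2.0, 4.0):
        factor = lc_lower_bound_factor(var_s)
        print(f"sigma(s)^2 = {var_s}: l_c(rho) >= {factor:.2f} l_c(s)")
```

The factor decreases monotonically with σ(s)², so the two bounds only separate appreciably for large variances.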

Appendix C Orion B Cloud

Appendix C.1 ACF of the Square and Filament Region

We computed the ACF of the unfiltered and low pass filtered square region (up to scale L/2, see §6), as well as that of the (unfiltered) “filament” region. The results are presented in Figs. (C.1) and (C.2). Here again, filtering large-scale gradients reduces the anisotropy at short scales and reduces the estimated correlation length.

To have a closer look at the behavior of the ACF of Orion B, we display in Fig. (C.3) the reduced ACF of the low pass filtered map in three different directions: x (θ = 0), x = y (θ = π/4), and y (θ = π/2). As seen from the heat maps and from Fig. (C.3), a strong anisotropy is present in the y direction at large scales (y/L ≥ 10^–1). The resulting estimated correlation length $\hat{l}_{c}(\eta)$ is of the order l_c(η)/L_x ≃ 10^–1.
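A directional reduced ACF of the kind plotted in Fig. (C.3) can be estimated from a gridded map as sketched below. This is a minimal direct-sum estimator along the x direction for a map stored as a list of rows, with no periodic wrapping. It is an assumed, commonly used form and not necessarily the estimator defined in the main text; the e^–1 crossing is shown only as one simple proxy for a decay scale.

```python
import math


def reduced_acf_along_x(field, max_lag):
    """Reduced ACF C(k)/C(0) of a 2D map for lags k = 0..max_lag along x (columns)."""
    n_rows, n_cols = len(field), len(field[0])
    mean = sum(map(sum, field)) / (n_rows * n_cols)
    acf = []
    for k in range(max_lag + 1):
        total = 0.0
        for row in field:
            # Only pairs that both lie inside the map contribute at lag k.
            for j in range(n_cols - k):
                total += (row[j] - mean) * (row[j + k] - mean)
        acf.append(total / (n_rows * (n_cols - k)))
    # Normalize by the variance C(0); a constant map would give a division by zero.
    return [c / acf[0] for c in acf]


def first_crossing(acf, level):
    """Smallest lag at which the reduced ACF drops below `level`, or None if it never does."""
    for k, c in enumerate(acf):
        if c < level:
            return k
    return None


if __name__ == "__main__":
    # Toy map alternating between +1 and -1 along x.
    toy = [[1.0, -1.0] * 4 for _ in range(3)]
    acf = reduced_acf_along_x(toy, 2)
    print(acf, first_crossing(acf, math.exp(-1.0)))
```

The y direction follows by transposing the map, and the diagonal direction by pairing pixels (i, j) with (i + k, j + k).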

Fig. C.1

ACF of the “square region”. Left panel: unfiltered. Right panel: low pass filtered up to scale L/2. Again, filtering large-scale gradients reduces the anisotropy at short scales and reduces the estimated correlation length.

Appendix C.2 Correlation Length from the Variance of the Column Densities

As seen from Sec. (5.2) and Eq. (63), one can also estimate the ratio l_c(ρ)/R by (1) computing the variance $\mathrm{Var}(\Sigma / \mathbb{E}(\Sigma))$, (2) estimating $\mathrm{Var}(\rho / \mathbb{E}(\rho))$, and (3) estimating the average thickness of the cloud along the line of sight, L_z. Here Orion B appears as a very elongated structure, and we therefore only assume that L_y ≥ L_z ≳ L_x (with L_y ≃ 3–4 L_x).

From observations of column densities we obtain Var(Σ/⟨Σ⟩) ≃ 1.1, while the PDF of column densities, exhibiting a power-law tail of exponent α_η = −2 (Schneider et al. 2013; Jaupart & Chabrier 2020), indicates an underlying density PDF with a power-law tail of exponent −3/2, implying a large variance. As for Polaris, running the power-law tail up to s = 8 or up to s = s_ad ≃ 16 yields a variance $\mathrm{Var}(\rho / \mathbb{E}(\rho)) = 40$ or $\mathrm{Var}(\rho / \mathbb{E}(\rho)) \simeq 2300$, respectively (see Jaupart & Chabrier 2020). This yields a ratio l_c(ρ)/L_z ≲ 10^–2.

Fig. C.2

ACF of the unfiltered “filament region”. A strong anisotropy is present in the y-direction.

Fig. C.3

Reduced ACF of the low pass filtered map in three different directions. Red line: x-direction (y = 0). Blue line: y-direction (x = 0). Green line: π/4 or x = y-direction. Dash dotted lines represent the values of the ACF when it is negative. A strong anisotropy is present in the y direction at large scales (y/L ≥ 10^–1).

References

Alves, J., Lombardi, M., & Lada, C. J. 2017, A&A, 606, L2
André, P., Men’shchikov, A., Bontemps, S., et al. 2010, A&A, 518, L102
Batchelor, G. K. 1953, The Theory of Homogeneous Turbulence (Cambridge: Cambridge University Press)
Beaulieu, N. C. 2011, IEEE Trans. Commun., 60, 23
Brunt, C., Heyer, M., & Mac Low, M.-M. 2009, A&A, 504, 883
Brunt, C. M., Federrath, C., & Price, D. J. 2010, MNRAS, 403, 1507
Chandrasekhar, S. 1951a, Proc. R. Soc. London Ser. A Math. Phys. Sci., 210, 18
Chandrasekhar, S. 1951b, Proc. R. Soc. London Ser. A Math. Phys. Sci., 210, 26
Federrath, C. 2013, MNRAS, 436, 1245
Federrath, C. 2016, MNRAS, 457, 375
Federrath, C., & Klessen, R. S. 2013, ApJ, 763, 51
Federrath, C., Roman-Duval, J., Klessen, R., Schmidt, W., & Mac Low, M.-M. 2010, A&A, 512, A81
Frisch, U. 1995, Turbulence: the Legacy of A. N. Kolmogorov (Cambridge: Cambridge University Press)
Heinesen, A. 2020, J. Cosmol. Astropart. Phys., 2020, 052
Hennebelle, P., & Chabrier, G. 2008, ApJ, 684, 395
Hopkins, P. F. 2012, MNRAS, 423, 2037
Jaupart, E., & Chabrier, G. 2020, ApJ, 903, L2
Jaupart, E., & Chabrier, G. 2021, ApJ, 922, L36
Kainulainen, J., Beuther, H., Henning, T., & Plume, R. 2009, A&A, 508, L35
Kleiner, S., & Dickman, R. 1984, ApJ, 286, 255
Kleiner, S., & Dickman, R. 1985, ApJ, 295, 466
Kritsuk, A. G., Norman, M. L., Padoan, P., & Wagner, R. 2007, ApJ, 665, 416
Mac Low, M.-M., & Klessen, R. S. 2004, Rev. Mod. Phys., 76, 125
Machida, M. N., Inutsuka, S.-I., & Matsumoto, T. 2006, ApJ, 647, L151
Masunaga, H., & Inutsuka, S.-I. 2000, ApJ, 531, 350
McKee, C. F., & Ostriker, E. C. 2007, ARA&A, 45, 565
Miville-Deschênes, M.-A., Martin, P., Abergel, A., et al. 2010, A&A, 518, L104
Orkisz, J. H., Pety, J., Gerin, M., et al. 2017, A&A, 599, A99
Ossenkopf-Okada, V., Csengeri, T., Schneider, N., Federrath, C., & Klessen, R. S. 2016, A&A, 590, A104
Padoan, P., & Nordlund, Å. 2002, ApJ, 576, 870
Pan, L., Padoan, P., & Nordlund, Å. 2018, ApJ, 866, L17
Pan, L., Padoan, P., & Nordlund, Å. 2019a, ApJ, 876, 90
Pan, L., Padoan, P., & Nordlund, Å. 2019b, ApJ, 881, 155
Papoulis, A., & Pillai, S. 1965, Probability, Random Variables and Stochastic Processes (New York: McGraw-Hill)
Passot, T., & Vázquez-Semadeni, E. 1998, Phys. Rev. E, 58, 4501
Peebles, P. J. E. 1973, ApJ, 185, 413
Pope, S. B. 1985, Prog. Energy Combustion Sci., 11, 119
Reinke, N., Fuchs, A., Hölling, M., & Peinke, J. 2016, in Fractal Flow Design: How to Design Bespoke Turbulence and Why (Berlin: Springer), 165
Reinke, N., Fuchs, A., Nickelsen, D., & Peinke, J. 2018, J. Fluid Mech., 848, 117
Scalo, J. 1984, ApJ, 277, 556
Schneider, N., André, P., Könyves, V., et al. 2013, ApJ, 766, L17
Schneider, N., Ossenkopf, V., Csengeri, T., et al. 2015, A&A, 575, A79
Szyszkowicz, S. S., & Yanikomeroglu, H. 2009, IEEE Trans. Commun., 57, 3538
Vaytet, N., Chabrier, G., Audit, E., et al. 2013, A&A, 557, A90
Vaytet, N., Commerçon, B., Masson, J., González, M., & Chabrier, G. 2018, A&A, 615, A5
Vazquez-Semadeni, E. 1994, ApJ, 423, 681
Vázquez-Semadeni, E., & García, N. 2001, ApJ, 557, 727
Vázquez-Semadeni, E., Palau, A., Ballesteros-Paredes, J., Gómez, G. C., & Zamora-Avilés, M. 2019, MNRAS, 490, 3061

¹

The following calculations are made with this particular type of cubic control volume, as the calculations are easier to follow. We give in Appendix A the calculations for a control volume of any shape.

²

If the statistics are Gaussian, we obtain an equality where we have 0.67 instead of $\sqrt{2}$ in Eq. (26).



