Into the third dimension: stochastic measurements of Stokes parameters within the Poincaré sphere
Ul. Sępia 14,
04512
Warszawa,
Poland
email:
jason.lee.quinn@gmail.com
Received: 1 April 2014
Accepted: 29 September 2014
Inspired by recent use of polarimetry to study the Cosmic Microwave Background and extragalatic supernovae, a foray into the statistical properties of Stokes parameters expressed in spherical coordinates is began, allowing circular polarization and linear polarization to be treated in a unified manner. The use of spherical coordinates is quite necessary as it permits a Stokes polarization state to be expressed in terms of the customary polarization angles and degree of polarization usually needed for human interpretation. As shall be demonstrated, circular and linear polarization are not statistically independent quantities but intertwined in a way that is especially important, for instance, at low signaltonoise. New distributions, classical estimators, and marginalizations are presented for this “threedimensional” polarization problem including a generalization of the Rice distribution. The paper concludes with discussion regarding the potential pitfalls of a lower dimensional analysis.
Key words: polarization / methods: data analysis / methods: statistical
© ESO, 2014
1. Introduction
Astronomical polarimetry and spectropolarimetry, which have historically been underutilized, are now routinely being used in many experiments to search for and test new physics. Besides the diverse application of polarimetric techniques in solar and stellar astronomy, two areas of particular note are polarimetric observations of the cosmic microwave background (CMB) and spectropolarimetric observations of supernovae (SNe).
Detection of inflation signatures on the CMB using polarimetric imaging were recently announced by the BICEP2 team (Ade et al. 2014b,a). This result would confirm one of astrophysics most cherished but experimentally elusive theories. After a review of the public BICEP2 data, some have questioned the interpretation of the BICEP2 data and suggested that the foreground dust contamination had been underestimated or that the signal could have been produced by other mechanisms such as “radio loops” (Liu et al. 2014). The BICEP2 team have stood by their results although they have moderated their language in the peerreviewed version of their paper (Ade et al. 2014b). If valid, the BICEP2 discovery continues the trend set by allsky mapping satellites like COBE, WMAP, and Planck and various ground and balloon surveys of the CMB being perhaps the single most fruitful source to test cosmological theory and sometimes even fundamental physics. Further results related to CMB polarization are expected in the near future from BICEP3 (Kuo & BICEP3 and POLAR1 Collaborations 2013), the Keck Array (Staniszewski et al. 2012), Polarbear (The POLARBEAR Collaboration 2014), and the Planck team (Planck Collaboration XIX 2014). It is clear that the CMB will continue to be a vital source of cutting edge science; but, as the controversy around the BICEP2 announcement indicates, polarization work is tricky and extreme care must be made to handle properly statistical and systematic error in the data.
Other research groups have been using spectropolarimetric observations to investigate the systematic behavior and statistical variation of thermonuclear (Type Ia) and corecollapse SNe (all other types) arising from asymmetries. (The Wang & Wheeler 2008 Annual Review is a good introduction.) Spectropolarimetry is the only known way to probe asymmetries of unresolved sources such as extragalactic supernovae so it will continue to be an effective research tool. The detection of asymmetries has helped place much needed constraints on SNe explosion physics/progenitor scenarios (recent examples include Maund et al. 2010b; Tanaka et al. 2012; Patat et al. 2012; Zelaya et al. 2013; Maund et al. 2013) and environments (Mauerhan et al. 2014). Asymmetries of Type Ia SNe are of particular interest as highredshift events were famously used to discover cosmic acceleration (Riess et al. 1998; Perlmutter et al. 1999), a result for which leads of two large teams shared the 2011 Nobel Prize. This acceleration implies the existence of socalled “dark energy”, whose physical nature is completely puzzling and is perhaps the biggest unsolved problem in astrophysics. As such, any possible correlations between metrics of asymmetry in Type Ia SNe and other observables (Leonard et al. 2005; Wang et al. 2007; Maund et al. 2010a) are of high importance because correlations may be used to identify contaminating Type Ia subclasses (Wang et al. 2013) or apply statistical corrections to the maximum brightness, both of which could help refine the measurement of the cosmological constant (Maeda et al. 2011). Spectropolarimetry of SNe is a blooming field but, just as with the CMB, handling the statistical and systematic error of polarimetric data for SNe is challenging.
As both polarimetric imaging and spectropolarimetry require dividing incoming light into bins, researchers in these fields often find themselves in photonstarved situations. Indeed, in both subjects discussed above, photonlimited data at low signaltonoise must be confronted. (Low signaltonoise is often the hallmark of forefront science by its very nature.) This problem is compounded in polarimetric imaging when a narrowband filter is needed. Working at low signaltonoise will continue to be an issue for polarimetrists.
Polarization data is often measured in Stokes parameters but expressed in humandigestible quantities like degrees of polarization and angles of polarization. Despite this, there has been relatively little attention given to the surprisingly complicated statistics involved when making transformations to these quantities at low signaltonoise. Most of the published literature of this fundamental, practical topic has focused on the analysis of linear polarization alone that involves marginalizations over a twodimensional gaussian distribution over the QU plane cross section of the Poincaré sphere (the “Poincaré disk” is a useful expression although it is already in use in other mathematical contexts). Marginalization over the linear polarization angle results in the Rice distribution (Rice 1945) while marginalization over the degree of linear polarization results in another distribution given by Vinokur (1965). The authors refer to this marginalized approach as the “onedimensional problem”. A small number of papers such as Wang et al. (1997), Vaillancourt (2006), and Quinn (2012) have used a nonmarginalized distribution over the Poincaré disk (the “twodimensional problem”), with the latter being the first to put the subject on a rigorous mathematical footing. To date, there has been little to no attention paid to the threedimensional problem defined by a threedimensional distribution over the Poincaré sphere. Progress on higher dimensional generalizations of the twodimensional problem have been made, for instance, by members of the Planck Collaboration (Planck Collaboration XIX 2014; Montier et al. 2014a,b) by including intensity in the analysis (see also Viola et al. 2014) but researchers have thus far generally ignored circular polarization when measuring linear polarization in astronomical sources (and vice versa); however, the measurement of one is not actually independent of the other. This effect is generally negligible except when one is working at very small signaltonoise ratios or with objects with polarization near 100% (see Quinn 2012 for details on how large polarization introduces complication). The aim of this paper is to elucidate some of this interdependence and to see how it relates to the more common ways of treating linear polarization measurements at low signaltonoise via the Rice distribution or through Bayesian techniques as outlined in Vaillancourt (2006) and Quinn (2012). The results should be of interest to all polarimetrists and those those studying the CMB or SNe in particular.
2. The sampling distribution
A very practical way of encoding the polarization state of electromagnetic radiation is the Stokes parameter formalism invented by George Stokes in 1852. Stokes parameters consist of four quantities: I (intensity), Q and U (related to linear polarization), and V (related to circular polarization)^{1}. They are convenient quantities to measure but it is easier for people to think about polarization in terms of percentages and angles. This may be done by introducing a spherical coordinate system (p,θ,φ) on points of the Poincaré sphere (a unit ball in Cartesian coordinates centered on the origin of , , and axes), which represents all possible polarization states. Accessible modern introductions to Stokes parameters and their theory are given in del Toro Iniesta (2003) and Landi Degl’Innocenti (2002).
If an astronomical source’s polarization state is measured with gaussian error (Σ_{I},Σ_{Q},Σ_{U},Σ_{V}) on each Stokes parameter (I,Q,U,V), how do the corresponding spherical coordinates (p,θ,φ) relate to the “true” polarization state (p_{0},θ_{0},φ_{0}) determined uniquely from (I_{0},Q_{0},U_{0},V_{0})? (Unsubscripted variables will usually refer to measured values while “0”subscripted values will refer to “true” values. Later, an overbar on some variables and distributions will indicate signaltonoise related quantities.) As it turns out, a spherical coordinate system introduces nontrivial biases into the measured quantities and the measurement itself must be “corrected” to remove them. This paper extends Quinn (2012), which studied these effects in polar coordinates before discussing the Bayesian approach, to spherical coordinates. The reader is referred to that paper for a more extensive introduction on Stokes parameters.
Under the assumptions that the source intensity is large compared to the measurement error, i.e., I_{0} ≫ Σ_{I} which implies I ≈ I_{0} (Quinn 2012)^{2}, and the error on a Stokes measurement is described by a threedimensional uncorrelated ellipsoidal Gaussian distribution (with semimajor axes Σ_{Q}, Σ_{U}, Σ_{V}), the normalized distribution, F_{C}, of the measured Stokes parameters (Q, U, V) given the “true” Stokes parameters (Q_{0}, U_{0}, V_{0}) is (1)These parameters are not usually worked with directly. Instead “normalized Stokes parameters” are defined as q ≡ Q/I_{0}, u ≡ U/I_{0}, v ≡ V/I_{0} (with also q_{0} ≡ Q_{0}/I_{0}, u_{0} ≡ U_{0}/I_{0}, and v_{0} ≡ V_{0}/I_{0}), and σ_{q} ≡ Σ_{Q}/I_{0}, σ_{u} ≡ Σ_{U}/I_{0}, and σ_{v} ≡ Σ_{V}/I_{0}, yielding a new normalized distribution (2)Lastly, one most often works with “signaltonoise” versions of the Stokes parameters defined as , , and which gives another distribution (3)which is normalized, . This distribution no longer explicitly depends on the σ’s and naturally handles ellipsoidal distributions. The σ’s do, however, play a minor role: they limit the possible values of q_{0}, u_{0}, and v_{0} so it is important not to forget their significance (Quinn 2012).
Fig. 1 Illustration of the spherical coordinate system used in the paper. The variable θ is used for the azimuthal angle with a range of (−π,π ], and the variable ϕ is measured from the positive vaxis with a range of [ 0,π ]. 

Open with DEXTER 
We wish to now transform our equations to spherical coordinates as the Poincaré sphere suggests. The transformation equations^{3} are (4)which has inverse (5)where θ is the angle in the qu plane and ϕ is the angle from the positive vaxis (see Fig. 1). The “true values”, p_{0}, θ_{0}, and ϕ_{0}, are similarly related to q_{0}, u_{0}, and v_{0}.
It is crucial not to confuse the degree of polarization, p, with the degree of linear polarization in the following. The degree of linear polarization is a separate quantity equal to or psin(ϕ) and, although it is certainly important, is not discussed in this paper.
We now restrict ourselves to the case where all three standard deviations are equal (Σ_{Q} = Σ_{U} = Σ_{V} ≡ Σ) so that σ_{q} = σ_{u} = σ_{v} ≡ σ. The distribution in spherical coordinates is then (6)The p^{2}sinϕ in the numerator of the factor out front is due to the Jacobian of the transformation, i.e., dq du dv = p^{2}sinϕ dp dθ dϕ^{4}. It is rather promising that the analytic form of Eq. (6) is similar to and not much more complicated than the twodimensional polar version. Already one can see that the functional form of the equation is not independent of the true value of the circular polarization.
The barred version is (7)where and . This is still normalized, .
3. Classical estimators
Now that (the sampling distribution in signaltonoise variables) is at hand, two important classical estimators may be calculated. These are the socalled “Most Likely” and “Most Probable” estimators. (Note that the prime is not indicating a derivative in but serving as a warning that we are using the qu plane angle, θ, instead of the sky angle for linear polarization.)
3.1. The “Most Likely” solution
The “Most Likely” (ML) solution is obtained by maximizing with respect to , θ_{0}, and ϕ_{0}. Physically, this corresponds to finding the value of which makes the measured value of the most statistically likely, hence the name. A general solution may be found by solving the system , , and for , θ_{0}, and ϕ_{0}, which are then treated as estimators. Technically, a second derivative test must also be done to check that the point is actually a maximum. For functions of more than two variables, this test requires checking that all the eigenvalues of the function’s Hessian matrix are positive. (All negative would be a minima, while mixed values would be inconclusive.) The equations involved in the test are very large and cumbersome. From graphical investigation or intuition, it is obvious, however, that the solution to be found is a maximum.
From the azimuthal equation, , it is quickly deduced that θ_{0} = θ everywhere except for values that correspond to an origin (where or equals zero) or a circular polarization axis (where ϕ_{0} or ϕ equals zero or π) of the true and measured Poincaré spheres. Further tedious but straightforward algebra allows the full solution to be found, which is just
This solution involves no “bias corrections” at all. What you measure is your best estimate for the “true” polarization.
This solution is consistent with the polar coordinate case (that is, the twodimensional case corresponding to ϕ → π/ 2, ϕ_{0} → π/ 2, and σ_{v} → 0) presented in Quinn (2012). It is only when the Rice distribution (a onedimensional distribution arising from marginalization) is used that a nontrivial solution is found for the ML method (Simmons & Stewart 1985). As marginalization intentionally removes information present in the original distribution, the logical basis for the use of nontrivial ML estimators found from a marginalized distribution to perform a “bias correction” is questionable. This is underscored by the fact that the p_{0}, θ_{0}, and ϕ_{0} are not stochastic variables but input parameters in and should not be treated as such as is done in the ML approach. The logically correct way to “invert” the distribution is to use Bayes’ Theorem.
3.2. The “Most Probable” solution
The “Most Probable” (MP) solution is found by maximizing with respect to , θ, and ϕ. This corresponds to finding the point that produces a distribution of measured points with a maximum at the actual measured point . A general solution may be found by solving the system , , and for , θ_{0}, and ϕ_{0}. (As before, a second derivative test is technically necessary to determine if the solution is actually a maximum as opposed to a minimum or some point of mixed inflection but this test is similarly impractical to perform because the equations end up being quite large. Graphically it can be shown that the solutions are maximums.)
Again, one quickly finds θ_{0} = θ from the azimuthal equation, . Using this condition, the ϕ and equations yield and , respectively. This twoequation system is somewhat difficult to solve and the solution is best accomplished with a computer algebra system. The full analytic solution is
which is only valid for points satisfying the condition (10)where csc(x) is the cosecant function. Thus there are two coupled “bias corrections” that must be made to the data: one for and one for ϕ. A graphical representation of the preceding solution is given in Fig. 2. The leftmost extension of the region consisting of points satisfying Eq. (10) (shown in blue in the figure) approaches the point . In the special case of ϕ = π/ 2 and , the correction is just ϕ_{0,MP} = π/ 2 and .
Fig. 2 Visualization of the bias correction field of Eq. (9). The tail of each arrow is located at a measured point and the head is attached to the corresponding “Most Probable” estimate of . This correction should only be applied to points lying in the blue region given by Eq. (10). See the text for discussion about points outside the blue region. 

Open with DEXTER 
Fig. 3 Behavior of the bias correction field (Eq. (9)) on the boundary of the blue valid region (Eq. (10)). The leftmost point on the boundary of the blue region is at . 

Open with DEXTER 
The “bias correction field” in Fig. 2 shows that the correction applied to gets larger as gets smaller (the actual magnitude is ϕdependent). This is consistent with the known behavior in the twodimensional case. Figure 2 also shows that ϕ itself must be corrected: it should be made larger when ϕ>π/ 2 and smaller when ϕ<π/ 2. The magnitude of this correction also tends to be larger as gets smaller. The exact magnitude of the ϕcorrection at a given depends on ϕ itself with the magnitude being largest in the “midlatitudes” of the Poincaré sphere and zero on the equator. Generally, points not on the equator are “attracted” towards the poles, an effect most prominent at low signaltonoise (between, say, ). The bias correction vector field diminishes as gets larger but for ϕ values near the poles is seen to remain significant even for , a regime typically thought of as having signaltonoise high enough not to have to worry about such effects.
For points infinitesimally close to the boundary defined by Eq. (10), the bias correction for ϕ estimates ϕ_{0} = 0 or ϕ_{0} = π, whichever pole is nearest. Figure 3 illustrates this behavior. The special case of ϕ = π/ 2 “corrects” to ϕ_{0} = π/ 2. More specifically, points on the boundary with π/ 2 <ϕ<π all bias correct to ϕ_{0} = π while those with 0 <ϕ<π/ 2 all bias correct to ϕ_{0} = 0. The correction on the boundary acts typically and corrects more and more towards the origin as gets smaller.
From Figs. 2 and 3, it can be seen that every possible with and sinϕ_{0} ≠ 0 is associated with a point within the blue region and vice versa. Unfortunately, a closed form solution for the inverse of Eq. (9) was not found and may not exist.
Intuitively^{5}, the correction may be understood by considering a value of zero. The measured value of cannot be negative and, due to scatter from measurement error, you expect that will almost always be positive (that is, there is a negligibly small chance of zero) even though . Thus, the measurement process “biases” to larger values by some amount. Therefore, a correction should be made that subtracts a bit from the measured value to obtain a good estimate of . Even for a nonzero , this biasing occurs although with diminishing effect as gets larger. The same line of reasoning can be used to gain an intuitive understanding of the ϕ bias correction. Consider a source in a state of pure circular polarization. This corresponds to a pole of the Poincaré sphere. Let’s focus on the purely rightcircularly polarized case (ϕ_{0} = 0). Measurements of this state will be clustered around the pole (including outside the Poincaré sphere) but ϕ is never negative and is almost always positive. Similarly for the leftcircularly polarized case (ϕ_{0} = π), measurements of ϕ will be clustered near that pole but cannot have ϕ greater than π and almost always will have ϕ less than π. Just as with , a correction is needed to remove this effect from ϕ. It is important to remark that both the and ϕ bias corrections are purely coordinate effects. It may seem at first as if the ϕcorrection is inconsistent with the symmetry of the system (from Σ_{Q} = Σ_{U} = Σ_{V}) but it is not. While conventionally the vaxis is always chosen to isolate circular polarization, the effect would arise regardless of the orientation of the vaxis within the Poincaré sphere, which preserves the overall symmetry (i.e., the symmetry is spontaneously broken by introduction of the coordinate system).
It must be stressed that the solution of Eq. (9) only applies to measured points satisfying Eq. (10) (illustrated by the blue region in the figures). Presumably, the area outside the blue region divides into three other regions: one characterized by , another by and ϕ_{0,MP} = 0, and still another by and ϕ_{0,MP} = π. In the latter two cases, there could still be a bias correction at work. It is unclear how to determine mathematically a “MP solution” in the nonblue region of Fig. 2. A formal solution may simply not exist there. The authors’ best guess as to the best practice is that measurements within the “triangle” defined by points (0,0), (0,π), and correct to (0,π) if ϕ>π/ 2, (0,0) if ϕ<π/ 2, and (0,π/ 2) if ϕ = π/ 2^{6}. The remaining points could correct to the same value as the blue region boundary point whose “correction arrow” passes through the point. Another possibility for these later points is to translate the correction arrow from the boundary of the blue region with the same ϕ to the point and use the intersection of that arrow with the axis as the corrected value. Formal solutions in these regions where not, however, found.
Equation (9) does indeed provide an estimate of the point that would produce a distribution of measured points with maximum at . It is however just a point estimate. The actual distribution for a small values of tends to be broad, shallow, and nearly flat, especially in the ϕ dimension. In other words, the errors bars associated with each estimate will be large so it is cautioned that the point estimate should not be taken too literally. Using the distribution, one could construct confidence regions. These regions are however not uniquely determined, even in the onedimensional case (Simmons & Stewart 1985). Construction of confidence regions is reserved for future work.
4. Marginalized distributions
Experimenters often work with only a partial description of a physical system. This may be necessary when independent variables of a distribution of interest related to the system are not measurable, or desired when there are socalled nuisance variables of little concern. In a polarization study, for instance, it may be that only the degree of linear polarization is the focus and not the angle of linear polarization. In such situations, one can marginalize over “unwanted” independent variables to find a new distribution which no longer depends on them but at the cost of losing some information present in the original distribution.
The marginalization possibilities for f′ are to marginalize over (I) p only; (II) θ only; (III) ϕ only; (IV) p and θ; (V) p and ϕ; and (VI) θ and ϕ. The first and last cases are perhaps the most interesting and shall now be investigated.
4.1. The pmarginalized angular distribution
The pmarginalized angular distribution is (11)or, in signaltonoise variables, (12)This equation can be simplified. Start with and use the definite integral (13)where erfc(x) is the complementary error function defined as^{7}(14)and erf(x) is the error function to reduce this integral. (15)is obtained where (16)which may be viewed as a higherdimensional analog of the angular distribution presented in Vinokur (1965; see also Clarke & Stewart 1986; NaghizadehKhouei & Clarke 1993; Quinn 2012).
Examining a few density plots of Eq. (15) for various values of , θ_{0}, and ϕ_{0} reveals that the probability distribution is attenuated near the poles of the coordinate system, an effect that becomes broader for small values of . In particular, one should notice that even for , the probability density still varies with ϕ. This is due to the pole bias effect discussed earlier.
One would like to go further and find the two onedimensional marginalized angular distributions (cases IV and V) but the integrations seem unable to be completely performed using elementary or even common special functions.
4.2. The angularmarginalized pdistribution
One is often primarily concerned with the degree of polarization, p, and not the value of the angular variables. Marginalization over the angular variables will therefore produce a new distribution, to be called Q ( in signaltonoise variables) of high practical importance. In the polar coordinate case, marginalization over the (lone) angular variable, θ, produces the Rice distribution. In the spherical coordinate case, the marginalization must occur over both θ and ϕ. Thus, the function may be considered a higherdimensional analog of the Rice distribution. Let (17)or similarly (18)Let’s calculate. (19)The θintegral can be solved using , where ℐ_{0}(x) is the modified Bessel function of the first kind. ℐ_{0} is an even, real function when its argument is real. (Because the range of integration is over a full period of θ, the value of θ_{0} is immaterial because it appears as a phase shift and does not affect the integral.) Once the θintegral is performed, the distribution no longer explicitly depends on θ_{0}. We shall therefore drop θ_{0} from the conditions. Continuing with the calculation, (20)The ϕintegral in the last expression is nontrivial but it appears it can be solved. Let (21)and (22)where . No computer algebra system tested could solve the integral in Eq. (22) and no applicable integrals were found in tables of integrals. Manual attempts to solve it also failed. From graphs of Ω versus ϕ_{0} created numerically^{8} for several values of a, it is noticed that the function may actually be constant in ϕ_{0} for a given a. Unfortunately, it is not a simple matter to show but if is graphed for 0 ≤ ϕ_{0} ≤ π at several sampled values of a it seems to always show residuals about zero with scatter consistent with numerical noise. The previous numerical observations lead one to suspect very strongly that Ω(a,ϕ_{0}) is independent of the value of ϕ_{0}. This may be surprising in light of the bias corrections previously discussed; however, the same marginalized formula seems like it should be found regardless of the rotational orientation of the coordinate system within the Poincaré sphere. If it is true that Ω(a,ϕ_{0}) is constant in ϕ_{0}, any value of ϕ_{0} may be chosen and it does not alter Ω(a,ϕ_{0})’s value at a given a. In that case, a value of ϕ_{0} may be judiciously chosen that simplifies the integral. Good choices are ϕ_{0} = 0 or π for which (23)is found using the common integral , where sinh(x) is the hyperbolic sine of x. Indeed, simple arguments modifying the factors in the integrand of Eq. (22) can prove this to be a lower limit for Ω(a,ϕ_{0}) at a given a (see Appendix A). Proof that it is also the upper limit was elusive (except at ϕ_{0} = 0, π/ 2, and π). A Taylor series expansion of Ω(a,ϕ_{0}) in ϕ_{0} at ϕ_{0} = 0 was found to be consistent with Eq. (23) out to 26th order using computer algebra systems^{9}. Based on physical suspicion corroborated by numerical results, Eq. (23) shall be assumed to be true henceforth.
Returning to the main distribution, , and using Eq. (23), it is proposed that (24)Since the value of ϕ_{0} is irrelevant, ’s dependence on ϕ_{0} has been dropped from the notation. This is the higherdimensional analog of the Rice distribution. Surprisingly, it is a simpler function than the Rice distribution in the sense that it depends on the hyperbolic sine function rather than a modified Bessel function. The distribution is normalized, .
Fig. 4 Contour plot of the angularmarginalized distribution given by Eq. (24). The contours are at 0.1 intervals with the lighter shades corresponding to larger values. The dotted and dashed curves (Eqs. (25) and (26)) trace maximums along horizontal and vertical slices and end at the red points at and , respectively. Although the “shape” of the distribution is not affected by the value of σ, the plot assumes that σ ≤ 1/4 so that the plotted values for up to 4 (=1 /σ_{max}) exist. 

Open with DEXTER 
Some general properties of can now be investigated. A contour plot of the function is shown in Fig. 4 with contours at 0.1 intervals. The function achieves a global maximum as approaches of . Tracing the maximums of horizontal slices (the dotted line in the figure) yields the equation or (25)This implicit curve intersects the axis at the global maximum at and is the “Most Probable” estimator curve for . Tracing the maximums of vertical slices (dashed line) yields or (26)This curve intersects at the axis at and is the “Most Likely” estimator curve for .
The MP estimator curve of Eq. (25) (dotted curve in Fig. 4) can be used for all . Yet it was stressed that the full MP solution given Eqs. (9) and (10) is only valid for some values with . Marginalization has thus hidden some of the complexity of the problem. The ML estimator given by Eq. (26) (dashed curve) now has nontrivial behavior unlike the full solution in Eq. (8). It is not easy to anticipate the effect that marginalization will have on various estimators.
It is interesting to compare the long form of these estimator curves and the corresponding curves based on the Rice distribution (Wardle & Kronberg 1974; Simmons & Stewart 1985; Quinn 2012). These curves are based on the hyperbolic trigonometric functions whereas the curves in the twodimensional case are based on Bessel functions. Their points of intersection with the axis have also been shifted. The differences arise because the Jacobian factor is proportional to p^{2} in spherical coordinates but only p for polar coordinates.
5. Discussion
How do these new results relate to the old results derived using polar coordinates for the “Poincaré disk”? First, the old results are not special cases of the new results when ϕ → π/ 2 and ϕ_{0} → π/ 2; for instance, in the special case of ϕ = π/ 2, the equation of Eq. (9) reduces to rather than the Wang estimator (Wang et al. 1997; Quinn 2012) and does not reduce to angular distribution given by Vinokur (1965). Similarly, is altogether distinct from the Rice distribution. The new results are therefore not direct extensions of the old functions but they do have somewhat similar forms. The three versus two dimensional nature of the parameter space is responsible for this.
Particularly interesting is a comparison of Fig. 4 with Fig. 2a of Quinn (2012). The intercepts of all the estimator curves have undergone shifts of the form . This again is due to the increased dimensionality of the problem. Thus, when circular polarization has equal measurement error to the linear Stokes axes’ measurement error, it takes a bigger measurement of the magnitude of the polarization vector to be significant than the twodimensional case suggests. The twodimensional results can perhaps be recovered by some limiting procedure where σ_{v} → 0 while σ (=σ_{q} = σ_{u}) is held constant. This suggests that decreasing the measurement error on one of σ_{q}, σ_{u}, or σ_{v} should decrease the threshold for a significant measurement. On the other hand, often an instrument measures only linear or only circular polarization. This may correspond to a scenario where the distribution of Stokes parameters is best modeled by assuming a flat (instead of gaussian) distribution for the variables of the undetected type. This would violate our starting assumptions in Eq. (1) and would produce rather different results. As a flat distribution is in an informal sense a “wide” distribution, it is probable that this would raise the detection threshold higher. It should be clear that interpreting polarization data at low signaltonoise in “human digestible” quantities like degrees of polarization or angle of polarization is difficult.
6. Conclusion
The threedimensional Stokes sampling distribution in spherical coordinates is given. From it the “Most Likely” and “Most Probable” classical (that is, nonBayesian) estimators for the “true” Stokes parameters given measured values have been derived. Additionally, a two useful marginalizations of the sampling distribution have been calculated including a higher dimensional analog of the Rice distribution. These results are necessary stepping stones towards a full and proper treatment of polarization measurements.
It is cautioned that a deep understanding of the measurement of Stokes parameters in spherical coordinates like here or the polar coordinates usually used for linear polarization requires a Bayesian approach. The presented results may nonetheless have tolerable accuracy for many experiments.
While the setup of the problem and the presentation of the results has been framed in terms of the study of polarization, the solutions are in fact far more general. They are applicable to the measurement of any “source point” contained in a threedimensional ball measured with Gaussian error in each Cartesian direction and presented in spherical coordinates. The results could be used to study the location of neutrino production within the Sun or the determination of the epicenter of an earthquake. The results could also influence the interpretation of cosmic microwave background polarization data. Extensions of the work removing the constraint that the error be equal in each direction or that involve correlated variables would be highly desirable.
This somewhat technical assumption does not (especially at low signaltonoise) imply Q ≈ Q_{0}, U ≈ U_{0}, or V ≈ V_{0}. It is a weaker condition used implicitly (but without recognition) by many important past papers on polarization statistics even though it is crucial for defining the reduced Stokes parameters as, for instance, q ≡ Q/I_{0} rather than the problematic q ≡ Q/I. Even more fundamentally, without the condition, studying a distribution over I, Q, U, and V instead of just Q, U, and V would be thrust upon us. That I ≈ I_{0} does not imply the other approximations is easy to understand intuitively with backoftheenvelope calculations by comparing the relative (Poisson) error on the intensity to the relative expected (Poisson) error on the Stokes parameters for some low degree of polarization source and some assumed moderate number of total counts for the measured intensity (like, for instance, a 1% source and 1000 total counts). The cited paper details the proper role of the condition in the theory.
In this set of equations, the arctangent function, arctan(y,x), is the twodimensional version often supported in computer languages. Its value is assumed to vary continuously from −π to + π on the unit circle starting from the negative xaxis counterclockwise again to the negative xaxis. Be extremely careful when implementing this function in computer code. Some languages switch the arguments so that it is arctan(x,y). Make sure your implementation gives valid angles for all quadrants and axes, especially if you use the single argument arctangent.
This argument, while “intuitive” for understanding the bias correction from a classical statistical perspective, can easily be misapplied in a Bayesian scenario. Nevertheless, when properly used, it is still an important tool to understand Bayesian results. The key difference is that in the classical approach, it applies to a single “true” point producing the measured points, whereas in the Bayesian approach it applies individually to all “possibly true” points (weighted by their likelihood).
The zerothorder term of the expansion is 2sinh(a) /a. If Eq. (23) is true, then it is the only nonzero term. It was proven that all the oddorder coefficients in the expansion are identically zero. It appears that the positive evenorder expansion coefficients will also always be identically zero but this remains unproven.
References
 Ade, P. A. R., Aikin, R. W., Amiri, M., et al. 2014a, ApJ, 792, 62 [NASA ADS] [CrossRef] (In the text)
 Ade, P. A. R., Aikin, R. W., Barkats, D., et al. 2014b, Phys. Rev. Lett., 112, 241101 [NASA ADS] [CrossRef] [PubMed] (In the text)
 Clarke, D., & Stewart, B. G. 1986, Vistas Astron., 29, 27 [NASA ADS] [CrossRef] (In the text)
 del Toro Iniesta, J. C. 2003, Introduction to Spectropolarimetry (Cambridge University Press) (In the text)
 Kuo, C.L., & BICEP3 and POLAR1 Collaborations 2013, in IAU Symp. 288, eds. M. G. Burton, X. Cui, & N. F. H. Tothill, 80 (In the text)
 Landi Degl’Innocenti, E. 2002, in Astrophysical Spectropolarimetry, Proc. of the XII Canary Islands Winter Scool of Astrophysics, eds. J. TrujilloBueno, F. MorenoInsertis, & F. Sánchez (Cambridge: Cambridge University Press), 1 (In the text)
 Leonard, D. C., Li, W., Filippenko, A. V., Foley, R. J., & Chornock, R. 2005, ApJ, 632, 450 [NASA ADS] [CrossRef] (In the text)
 Liu, H., Mertsch, P., & Sarkar, S. 2014, ApJ, 789, L29 [NASA ADS] [CrossRef] (In the text)
 Maeda, K., Leloudas, G., Taubenberger, S., et al. 2011, MNRAS, 413, 3075 [NASA ADS] [CrossRef] (In the text)
 Mauerhan, J., Williams, G. G., Smith, N., et al. 2014, MNRAS, 442, 1166 [NASA ADS] [CrossRef] (In the text)
 Maund, J. R., Höflich, P., Patat, F., et al. 2010a, ApJ, 725, L167 [NASA ADS] [CrossRef] (In the text)
 Maund, J. R., Wheeler, J. C., Wang, L., et al. 2010b, ApJ, 722, 1162 [NASA ADS] [CrossRef] (In the text)
 Maund, J. R., Spyromilio, J., Höflich, P. A., et al. 2013, MNRAS, 433, L20 [NASA ADS] [CrossRef] (In the text)
 Montier, L., Plaszczynski, S., Levrier, F., et al. 2014a [arXiv:1406.6536] (In the text)
 Montier, L., Plaszczynski, S., Levrier, F., et al. 2014b [arXiv:1407.0178] (In the text)
 NaghizadehKhouei, J., & Clarke, D. 1993, A&A, 274, 968 [NASA ADS] (In the text)
 Patat, F., Höflich, P., Baade, D., et al. 2012, A&A, 545, A7 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Perlmutter, S., Aldering, G., Goldhaber, G., et al. 1999, ApJ, 517, 565 [NASA ADS] [CrossRef] (In the text)
 Planck Collaboration XIX. 2014, A&A, 571, A19 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Quinn, J. L. 2012, A&A, 538, A65 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Rice, S. O. 1945, Bell Systems Tech. J., 24, 46 [CrossRef] (In the text)
 Riess, A. G., Filippenko, A. V., Challis, P., et al. 1998, AJ, 116, 1009 [NASA ADS] [CrossRef] (In the text)
 Simmons, J. F. L., & Stewart, B. G. 1985, A&A, 142, 100 [NASA ADS] (In the text)
 Staniszewski, Z., Aikin, R. W., Amiri, M., et al. 2012, J. Low Temp. Phys., 167, 827 [NASA ADS] [CrossRef] (In the text)
 Stokes, G. G. 1852, Trans. Cambridge Phil. Soc., 9, 399 [NASA ADS] (In the text)
 Tanaka, M., Kawabata, K. S., Hattori, T., et al. 2012, ApJ, 754, 63 [NASA ADS] [CrossRef] (In the text)
 The POLARBEAR Collaboration, Ade, P. A. R., Akiba, Y., Anthony, A. E., et al. 2014, ApJ, 794, 171 [NASA ADS] [CrossRef] (In the text)
 Vaillancourt, J. E. 2006, PASP, 118, 1340 [NASA ADS] [CrossRef] (In the text)
 Vinokur, M. 1965, Ann. d’Astrophys., 28, 412 [NASA ADS] (In the text)
 Viola, M., Kitching, T. D., & Joachimi, B. 2014, MNRAS, 439, 1909 [NASA ADS] [CrossRef] (In the text)
 Wang, L., & Wheeler, J. C. 2008, ARA&A, 46, 433 [NASA ADS] [CrossRef] (In the text)
 Wang, L., Wheeler, J. C., & Hoeflich, P. 1997, ApJ, 476, L27 [NASA ADS] [CrossRef] (In the text)
 Wang, L., Baade, D., & Patat, F. 2007, Science, 315, 212 [NASA ADS] [CrossRef] (In the text)
 Wang, X., Wang, L., Filippenko, A. V., Zhang, T., & Zhao, X. 2013, Science, 340, 170 [NASA ADS] [CrossRef] (In the text)
 Wardle, J. F. C., & Kronberg, P. P. 1974, ApJ, 194, 249 [NASA ADS] [CrossRef] (In the text)
 Zelaya, P., Quinn, J. R., Baade, D., et al. 2013, AJ, 145, 27 [NASA ADS] [CrossRef] (In the text)
Appendix A: Bounds on Ω(a, ϕ_{0})
It helps to have functional bounds on (A.1)to show, for instance, that it is at least noninfinite over the ϕ_{0}domain of interest. There are three factors (sin(ϕ), e^{acos(ϕ)cos(ϕ0)}, and I_{0}(asin(ϕ)sin(ϕ_{0}))) in the integrand, each of which is simple enough that they allow various bounds to be obtained for the total expression.
Appendix A.1: Lower bound
Under variation of ϕ_{0}, the e^{acos(ϕ)cos(ϕ0)} factor is minimized when cos(ϕ_{0}) is the smallest, which occurs for ϕ_{0} = π, so that the factor becomes e^{−acos(ϕ)}. Furthermore, the I_{0}(asin(ϕ)sin(ϕ_{0})) factor is the smallest when its argument is zero, which also occurs at ϕ_{0} = π (also ϕ_{0} = 0) so that the factor become ℐ_{0}(0) = 1. After these substitutions, the integral for Ω can be performed explicitly: (A.2)For any given value of a, this sets a (constant) global lower bound on the integral: the value of integral must be greater than or equal to for all values of ϕ_{0}. It is probably also the greatest lower bound for each ϕ_{0}.
Appendix A.2: Upper bounds
The e^{acos(ϕ)cos(ϕ0)} factor is maximized when its exponent is the largest while the I_{0}(asin(ϕ)sin(ϕ_{0})) factor is the largest when the magnitude of its argument is as big as possible.
A constant global upper bound is found by modifying the factors in the obvious way: (A.3)A stricter (but nonconstant) global upper bound can be found merely by allowing sin(ϕ_{0}) → 1 in the Bessel function factor, (A.4)This last solution has functional dependence on ϕ_{0} and is not valid for ϕ_{0} = π/ 2 because of the sec(ϕ_{0}) component. For ϕ_{0} = 0 this bound is and it decreases smoothly and approaches as ϕ_{0} → π/ 2^{+}. Similarly, for ϕ_{0} = π it is and it decreases smoothly and approaches as ϕ_{0} → π/ 2^{−}. Of course, if ϕ_{0} = π/ 2, Eq. (A.1) can be accomplished directly: (A.5)so the discontinuity in Eq. (A.4) at ϕ_{0} = π/ 2 can be “removed”. Equation (A.1) can also be performed for ϕ_{0} = 0 and ϕ_{0} = π and in both cases results, so we have explicit upper bounds at those values too.
Appendix A.3: Best bounds
Combining the previous information, the best bounds are (A.6)which are finite for all ϕ_{0} at a given value of a.
All Figures
Fig. 1 Illustration of the spherical coordinate system used in the paper. The variable θ is used for the azimuthal angle with a range of (−π,π ], and the variable ϕ is measured from the positive vaxis with a range of [ 0,π ]. 

Open with DEXTER  
In the text 
Fig. 2 Visualization of the bias correction field of Eq. (9). The tail of each arrow is located at a measured point and the head is attached to the corresponding “Most Probable” estimate of . This correction should only be applied to points lying in the blue region given by Eq. (10). See the text for discussion about points outside the blue region. 

Open with DEXTER  
In the text 
Fig. 3 Behavior of the bias correction field (Eq. (9)) on the boundary of the blue valid region (Eq. (10)). The leftmost point on the boundary of the blue region is at . 

Open with DEXTER  
In the text 
Fig. 4 Contour plot of the angularmarginalized distribution given by Eq. (24). The contours are at 0.1 intervals with the lighter shades corresponding to larger values. The dotted and dashed curves (Eqs. (25) and (26)) trace maximums along horizontal and vertical slices and end at the red points at and , respectively. Although the “shape” of the distribution is not affected by the value of σ, the plot assumes that σ ≤ 1/4 so that the plotted values for up to 4 (=1 /σ_{max}) exist. 

Open with DEXTER  
In the text 