Cosmic web-type classification using decision theory
^{1} Institut d’Astrophysique de Paris (IAP), UMR 7095, CNRS – UPMC Université Paris 6, Sorbonne Universités, 98bis boulevard Arago, 75014 Paris, France
^{2} Institut Lagrange de Paris (ILP), Sorbonne Universités, 98bis boulevard Arago, 75014 Paris, France
^{3} École polytechnique ParisTech, Route de Saclay, 91128 Palaiseau, France
email: florent.leclercq@polytechnique.org
^{4} Excellence Cluster Universe, Technische Universität München, Boltzmannstrasse 2, 85748 Garching, Germany
^{5} Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Received: 2 March 2015
Accepted: 29 March 2015
Aims. We propose a decision criterion for segmenting the cosmic web into different structure types (voids, sheets, filaments, and clusters) on the basis of their respective probabilities and the strength of data constraints.
Methods. Our approach is inspired by an analysis of games of chance where the gambler only plays if a positive expected net gain can be achieved based on some degree of privileged information.
Results. The result is a general solution for classification problems in the face of uncertainty, including the option of not committing to a class for a candidate object. As an illustration, we produce high-resolution maps of web-type constituents in the nearby Universe as probed by the Sloan Digital Sky Survey main galaxy sample. Other possible applications include the selection and labelling of objects in catalogues derived from astronomical survey data.
Key words: large-scale structure of Universe / methods: statistical / catalogs
© ESO, 2015
1. Introduction
Building accurate maps of the cosmic web from galaxy surveys is one of the most challenging tasks in modern cosmology. Rapid progress in this field took place in the last few years with the introduction of inference techniques based on Bayesian probability theory (Kitaura et al. 2009; Jasche et al. 2010, 2015; Nuza et al. 2014). These techniques facilitate the connection between the properties of the cosmic web, thoroughly analyzed in simulations (e.g. Hahn et al. 2007; Aragón-Calvo et al. 2010; Cautun et al. 2014), and observations (see Leclercq et al. 2014, for a review on the interface between theory and data in cosmology).
In Leclercq et al. (2015), we conducted a fully probabilistic analysis of structure types in the cosmic web as probed by the Sloan Digital Sky Survey (SDSS) main galaxy sample. This study capitalized on the large-scale structure inference performed by Jasche et al. (2015) using the BORG (Bayesian Origin Reconstruction from Galaxies, Jasche & Wandelt 2013) algorithm. As the full gravitational model of structure formation COLA (COmoving Lagrangian Acceleration, Tassev et al. 2013) was used, our approach resulted in the first probabilistic and time-dependent classification of cosmic environments at non-linear scales in physical realizations of the large-scale structure conducted with real data. Using the Hahn et al. (2007) definition (see also its extensions, Forero-Romero et al. 2009; Hoffman et al. 2012), we obtained three-dimensional, time-dependent maps of the posterior probability for each voxel to belong to a void, sheet, filament, or cluster.
These posterior probabilities represent all the available structure type information in the observational data assuming the framework of Λ cold dark matter cosmology. Since the large-scale structure cannot be uniquely determined from observations, uncertainty remains about how to assign each voxel to a particular structure type. The question we address in this letter is how to proceed from the posterior probabilities to a particular choice of assigning a structure type to each voxel. Decision theory (see, for example, Berger 1985) offers a way forward, since it addresses the general problem of how to choose between different actions under uncertainty. A key ingredient beyond the posterior is the utility function that assigns a quantitative profit to different actions for all possible outcomes of the uncertain quantity. The optimal decision is that which maximizes the expected utility.
After setting up the problem using our example and briefly recalling the relevant notions of Bayesian decision theory, we will discuss different utility functions and explore the results based on a particular choice.
2. Method
The decision problem for structure-type classification can be stated as follows. We have four different web-types that constitute the “space of input features”: {T_{0} = void, T_{1} = sheet, T_{2} = filament, T_{3} = cluster}. We want to either choose one of them, or remain undecided if the data constraints are not sufficient. Therefore our “space of actions” consists of five different elements: {a_{0} = “decide void,” a_{1} = “decide sheet,” a_{2} = “decide filament,” a_{3} = “decide cluster,” and a_{−1} = “do not decide.”} The goal is to write down a decision rule prescribing which action to take based on the posterior information.
Bayesian decision theory states that the action a_{j} that should be taken is the one that maximizes the expected utility function (conditional on the data d), given in this example by

U(a_{j}(x_{k}) | d) = ∑_{i} P(T_{i}(x_{k}) | d) G(a_{j} | T_{i}), (1)

where x_{k} labels one voxel of the considered domain, P(T_{i}(x_{k}) | d) are the posterior probabilities of the different structure types given the data, and G(a_{j} | T_{i}) are the gain functions that state the profitability of each action, given the “true” underlying structure. Formally, G is a mapping from the space of input features to the space of actions. For our particular problem, it can be thought of as a 5 × 4 matrix G such that G_{ji} ≡ G(a_{j} | T_{i}), in which case Eq. (1) can be rewritten as the linear algebra equation U = G·P, where the 5-vector U and the 4-vector P contain the elements U_{j} ≡ U(a_{j}(x_{k}) | d) and P_{i} ≡ P(T_{i}(x_{k}) | d), respectively.
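The matrix form of Eq. (1) translates directly into code. The following sketch (function names and example numbers are ours, purely illustrative) evaluates the 5-vector of expected utilities U = G·P for one voxel and picks the optimal action:

```python
import numpy as np

def expected_utilities(G, P):
    """Expected utility of each action for one voxel, Eq. (1):
    U_j = sum_i G(a_j | T_i) P(T_i | d), i.e. the matrix product U = G @ P.
    G is the (5, 4) gain matrix (rows = actions a_0..a_3, a_{-1};
    columns = web-types T_0..T_3); P is the (4,) posterior for the voxel."""
    return np.asarray(G, dtype=float) @ np.asarray(P, dtype=float)

def optimal_action(G, P):
    """Index of the action maximizing expected utility.
    With the convention used here, index 4 encodes a_{-1} = 'do not decide'."""
    return int(np.argmax(expected_utilities(G, P)))
```

With the 0/1 gain functions discussed below (identity reward, zero for the undecided row), this reduces to picking the web-type with the highest posterior probability.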
Let us consider the choice of gain functions. Several choices are possible. For example, the 0/1 gain functions reward a correct decision with 1 for each voxel, while an incorrect decision yields 0. This leads to choosing the structure type with the highest posterior probability. While this seems like a reasonable choice, we need to consider that a decision is taken in each voxel, whereas we are interested in identifying structures as objects that are made of many voxels. For instance, since clusters are far smaller than voids, the a priori probability for a voxel to belong to a cluster is much smaller than for the same voxel to belong to a void. To treat different structures on an equal footing, it makes sense to reward the correct choice of structure type T_{i} by an amount inversely proportional to the average volume V_{i} of one such structure. In the following, we use the prior probabilities as a proxy for the volume fractions,

V_{i} ∝ P(T_{i}). (2)

We further introduce an overall cost α for choosing a structure with respect to remaining undecided, leading to the following specification of the gain functions:

G(a_{j} | T_{i}) = δ_{ij}/P(T_{i}) − α for j ∈ ⟦0,3⟧, and G(a_{−1} | T_{i}) = 0. (3)

This choice reduces the 20 free gain values to only one free parameter, α. With this set of gain functions, making (or not) a decision between structure types can be thought of as choosing to play or not to play a gambling game costing α. Not playing the game, i.e. remaining undecided (j = −1), is always free (G(a_{−1} | T_{i}) = 0 for all i). If the gambler decides to play the game, i.e. to make a decision (j ∈ ⟦0,3⟧), they pay α but may win a reward, 1/P(T_{j}), by betting on the correct underlying structure (i = j).
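As a minimal sketch of Eq. (3) (the prior values in the test are hypothetical, not the ones measured in the paper), the full gain matrix and the resulting per-voxel decision rule can be written as:

```python
import numpy as np

def gain_matrix(prior, alpha):
    """Gain matrix of Eq. (3). Rows j = 0..3 ('decide T_j') pay the entry
    cost alpha and reward a correct bet on T_j with 1/P(T_j); row 4 is
    a_{-1} = 'do not decide', which always has zero gain.
    prior : (4,) array of prior probabilities P(T_i)."""
    prior = np.asarray(prior, dtype=float)
    G = np.full((5, 4), -alpha)
    G[np.arange(4), np.arange(4)] += 1.0 / prior  # delta_ij / P(T_i)
    G[4, :] = 0.0  # not playing the game is always free
    return G

def classify_voxel(posterior, prior, alpha):
    """Web-type decision for one voxel: argmax_j of U = G @ posterior.
    Returns 0..3 for void/sheet/filament/cluster, 4 for 'do not decide'."""
    U = gain_matrix(prior, alpha) @ np.asarray(posterior, dtype=float)
    return int(np.argmax(U))
```

A voxel with a sharply peaked posterior is classified even at large α, while a voxel whose posterior stays close to the prior falls into the "undecided" class as soon as α exceeds the achievable expected reward.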
In the absence of data, the posterior probabilities in Eq. (1) are the prior probabilities P(T_{i}), which are independent of the position x_{k}, and the utility functions are, for j ∈ ⟦0,3⟧,

U(a_{j}(x_{k})) = ∑_{i} P(T_{i}) [δ_{ij}/P(T_{i}) − α] = 1 − α, (4)

and

U(a_{−1}(x_{k})) = 0. (5)

Equations (4) and (5) mean that, in the absence of data, this reduces to the roulette game utility function, where, if correctly guessed, a priori unlikely outcomes receive a higher reward, inversely proportional to the fraction of the probability space they occupy. Betting on outcomes according to the prior probability while paying α = 1 leads to a “fair game” with zero expected net gain. The gambler will always choose to play if the cost per game is α ≤ 1 and will never play if α > 1.
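The fair-game property of Eqs. (4) and (5) is easy to verify numerically: when the posterior equals the prior, every bet has the same expected net gain 1 − α, independent of the (here hypothetical) prior values:

```python
import numpy as np

# Hypothetical prior volume fractions (not the values measured in the paper).
prior = np.array([0.45, 0.3, 0.18, 0.07])

def expected_net_gain(alpha):
    """Expected utilities of all five actions when the posterior equals the
    prior (no data): U_j = sum_i P(T_i) (delta_ij / P(T_i) - alpha) = 1 - alpha
    for j in 0..3, and U_{-1} = 0 for the last (undecided) row."""
    G = np.full((5, 4), -alpha)
    G[np.arange(4), np.arange(4)] += 1.0 / prior
    G[4, :] = 0.0
    return G @ prior  # posterior == prior in the absence of data
```

At α = 1 every action, including not playing, has zero expected net gain; for α > 1 the only non-negative option is to abstain.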
The posterior probabilities update the prior information in light of the data, providing an advantage to the gambler through privileged information about the outcome. In the presence of informative data, betting on outcomes based on the posterior probabilities will therefore ensure a positive expected net gain and the gambler will choose to play even if α > 1. Increasing the parameter α therefore represents a growing “aversion for risk” and limits the probability of losing. Indeed, for high α, the gambler will only play in cases where the posterior probabilities give sufficient confidence that the game will be won, i.e. that the decision will be correct.
3. Maps of structure types in the SDSS
Fig. 1 Slices through maps of structure types in the late-time large-scale structure, at a = 1. The colour coding is blue for voids, green for sheets, yellow for filaments, and red for clusters. Black corresponds to regions where data constraints are insufficient to make a decision. The parameter α, defined by Eq. (3), quantifies the risk aversion in the map: α = 1.0 corresponds to the most speculative map of the large-scale structure, and maps with α > 1 are increasingly conservative. These maps are based on the posterior probabilities inferred by Leclercq et al. (2015) and on the Bayesian decision rule that is the subject of the present work.
Fig. 2 Same as Fig. 1 for the primordial large-scale structure, at a = 10^{−3}.
We applied the above decision rule to the web-type posterior probabilities presented in Leclercq et al. (2015), for different values of α ≥ 1 as defined by Eq. (3). In doing so, we produced various maps of the volume of interest, consisting of the northern Galactic cap of the SDSS main galaxy sample and its surroundings. Slices through these three-dimensional maps^{1} are shown in Fig. 1 for the late-time large-scale structure (at a = 1) and in Fig. 2 for the primordial large-scale structure (at a = 10^{−3}).
When the game is fair (namely when α = 1), it is always played, i.e. a decision between one of the four structure types is always made. This results in the “speculative map” of structure types (top left panel of Figs. 1 and 2). There, a decision is made even in regions that are not constrained by the data (at high redshift or outside of the survey boundaries), based on prior betting odds.
By increasing the value of α above 1, we demand higher confidence in making the correct decision. This yields increasingly “conservative maps” of the Sloan volume (see Figs. 1 and 2). In particular, at high values of α, the algorithm makes decisions in the regions where data constraints are strong (see Figs. 3 and 6 in Leclercq et al. 2015), but often stays undecided in the unobserved regions. It can be observed that even at very high values, α ≳ 3, a decision for one structure is made in some unconstrained voxels (typically in favour of the structure for which the reward is the highest: clusters in the final conditions, and clusters or voids in the initial conditions). This effect is caused by the limited number of samples used in our analysis. Indeed, because of the finite length of the Markov chain, the sampled representation of the posterior has not yet fully converged to the true posterior. For this reason, the numerical representation of the posterior can be artificially displaced too far from the prior, which results in an incorrect web-type decision. This effect could be mitigated by obtaining more samples in the original BORG analysis (at an increased computational cost), or avoided by further increasing α, at the expense of also degrading the map in the observed regions. We found the value of α = 4 (bottom right panel of Figs. 1 and 2) to be the best compromise between reducing the number of unobserved voxels in which a decision is made to a tiny fraction and keeping information in the volume covered by the data.
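The trade-off governed by α can be illustrated on synthetic voxel posteriors (Dirichlet draws scattered around a hypothetical prior stand in for the BORG-inferred probabilities; none of the numbers here are those of the actual SDSS analysis):

```python
import numpy as np

rng = np.random.default_rng(0)
prior = np.array([0.45, 0.3, 0.18, 0.07])  # hypothetical volume fractions

# Mock posteriors for 10^4 voxels: moderately informative data pull each
# voxel's probabilities away from the prior (purely illustrative).
posteriors = rng.dirichlet(50 * prior, size=10000)

def undecided_fraction(alpha):
    """Fraction of voxels for which a_{-1} ('do not decide') maximizes the
    expected utility under the Eq. (3) gain functions."""
    G = np.full((5, 4), -alpha)
    G[np.arange(4), np.arange(4)] += 1.0 / prior
    G[4, :] = 0.0
    U = posteriors @ G.T              # (N, 5) expected utilities
    return float(np.mean(U.argmax(axis=1) == 4))
```

Since a voxel is classified exactly when max_j P(T_j | d)/P(T_j) exceeds α, the undecided fraction grows monotonically with α: at α = 1 every voxel is classified, while at large α only voxels with strongly informative posteriors survive.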
As expected, structures for which the prior probabilities are the highest disappear first from the map when one increases α: since betting on these structures is poorly rewarded, this choice is avoided in the case of high risk aversion. In the final conditions (Fig. 1), we found that sheets completely disappear for α ≈ 1.68 and filaments for α ≈ 4.01. In the initial conditions (Fig. 2), the critical value is around α ≈ 2.36 for both sheets and filaments. In the most conservative maps displayed in Figs. 1 and 2 (α = 4.0), the SDSS data provide extremely high evidence for the voids and clusters shown. In the constrained parts, extended regions belonging to a given structure type may not have the expected shape. This is true in particular for filamentary regions. Several factors can explain this: first, slicing through filaments makes them appear as dots; second, with the dynamic Hahn et al. (2007) definition, filament regions often extend out into sheets and voids, and their static skeleton geometry is not the most prominent at the voxel scale (3 Mpc/h in this work).
As detailed in Jasche et al. (2015), data constraints are propagated by the structure formation model assumed in the inference process (second-order Lagrangian perturbation theory) and therefore radiate out of the SDSS boundaries. For this reason, for moderate values of α, web-type classification can be extended beyond the survey boundaries to regions influenced by data. This can be observed in Figs. 1 and 2, where one can see, for instance, that the shape of voids that intersect the mask is correctly recovered. Similarly, the classification of high-redshift structures confirms that the treatment of selection effects by BORG is correctly propagated to web-type analysis.
We finally comment on the required computational resources for the complete chain of running BORG, computing the web-type posterior, and making a decision. Inference with BORG is the most expensive part: on average, one sample is generated in 1500 s on 16 cores (Jasche et al. 2015). Then, in each sample, tidal shear analysis (Leclercq et al. 2015) is a matter of a few seconds. Once the web-type posterior is known, making a decision, which is the subject of the present letter, is almost instantaneous. Therefore, once the density field has been inferred, which is useful for a much larger variety of applications, our method is substantially cheaper than several state-of-the-art techniques for cosmic web analysis (e.g. the method of Tempel et al. 2013, 2014, for detecting filaments).
4. Conclusions
In this letter, we proposed a rule for optimal decision making in the context of cosmic web classification. We described the problem setup in Bayesian decision theory and proposed a set of gain functions that permit an interpretation of the problem in the context of game theory. This framework enables the dissection of the cosmic web into different elements (voids, sheets, filaments, and clusters) given their prior and posterior probabilities and naturally accounts for the strength of data constraints.
As an illustration, we produced three-dimensional templates of structure types with various degrees of risk aversion, describing a volume covered by the SDSS main galaxy sample and its surroundings. These maps constitute an efficient statistical summary of the inference results presented in Leclercq et al. (2015) for cross-use with other astrophysical and cosmological data sets.
Beyond this specific application, our approach is more generally relevant to the solution of classification problems in the face of uncertainty. For example, the construction of catalogues from astronomical surveys is directly analogous to the problem we describe here: it simultaneously involves a decision about whether or not to include a candidate object and which class label (e.g. star or galaxy) to assign to it.
^{1} For all slices shown, we kept the coordinate system of Jasche et al. (2015).
Acknowledgments
F.L. acknowledges funding from an AMX grant (École polytechnique ParisTech). J.J. is partially supported by a Feodor Lynen Fellowship by the Alexander von Humboldt foundation. B.W. acknowledges funding from an ANR Chaire d’Excellence (ANR-10-CEXC-004-01) and the UPMC Chaire Internationale in Theoretical Cosmology. This work has been done within the Labex Institut Lagrange de Paris (reference ANR-10-LABX-63), part of the Idex SUPER, and received financial state aid managed by the Agence Nationale de la Recherche, as part of the programme Investissements d’avenir under the reference ANR-11-IDEX-0004-02. This research was supported by the DFG cluster of excellence “Origin and Structure of the Universe”.
References
Aragón-Calvo, M. A., van de Weygaert, R., & Jones, B. J. T. 2010, MNRAS, 408, 2163
Berger, J. O. 1985, Statistical Decision Theory and Bayesian Analysis (New York: Springer)
Cautun, M., van de Weygaert, R., Jones, B. J. T., & Frenk, C. S. 2014, MNRAS, 441, 2923
Forero-Romero, J. E., Hoffman, Y., Gottlöber, S., Klypin, A., & Yepes, G. 2009, MNRAS, 396, 1815
Hahn, O., Porciani, C., Carollo, C. M., & Dekel, A. 2007, MNRAS, 375, 489
Hoffman, Y., Metuki, O., Yepes, G., et al. 2012, MNRAS, 425, 2049
Jasche, J., & Wandelt, B. D. 2013, MNRAS, 432, 894
Jasche, J., Kitaura, F. S., Li, C., & Enßlin, T. A. 2010, MNRAS, 409, 355
Jasche, J., Leclercq, F., & Wandelt, B. D. 2015, J. Cosmol. Astropart. Phys., 1, 36
Kitaura, F. S., Jasche, J., Li, C., et al. 2009, MNRAS, 400, 183
Leclercq, F., Pisani, A., & Wandelt, B. D. 2014 [arXiv:1403.1260]
Leclercq, F., Jasche, J., & Wandelt, B. 2015, J. Cosmol. Astropart. Phys., submitted [arXiv:1502.02690]
Nuza, S. E., Kitaura, F.-S., Heß, S., Libeskind, N. I., & Müller, V. 2014, MNRAS, 445, 988
Tassev, S., Zaldarriaga, M., & Eisenstein, D. J. 2013, J. Cosmol. Astropart. Phys., 6, 36
Tempel, E., Stoica, R. S., & Saar, E. 2013, MNRAS, 428, 1827
Tempel, E., Stoica, R. S., Martínez, V. J., et al. 2014, MNRAS, 438, 3465