A&A 474, 315-338 (2007)
DOI: 10.1051/0004-6361:20077880
M. A. Aragón-Calvo, B. J. T. Jones, R. van de Weygaert, and J. M. van der Hulst
Kapteyn Astronomical Institute, University of Groningen, PO Box 800, 9700 AV Groningen, The Netherlands
Received 14 May 2007 / Accepted 9 August 2007
Abstract
Aims. We present here a new method, MMF, for automatically segmenting cosmic structure into its basic components: clusters, filaments, and walls. Importantly, the segmentation is scale independent, so all structures are identified without prejudice as to their size or shape. The method is ideally suited for extracting catalogues of clusters, walls, and filaments from samples of galaxies in redshift surveys or from particles in cosmological N-body simulations: it makes no prior assumptions about the scale or shape of the structures.
Methods. Our Multiscale Morphology Filter (MMF) method has been developed on the basis of visualization and feature extraction techniques in computer vision and medical research. The density or intensity field of the sample is smoothed over a range of scales. The smoothed signals are processed through a morphology response filter whose form is dictated by the particular morphological feature it seeks to extract, and depends on the local shape and spatial coherence of the intensity field. The morphology signal at each location is then defined to be the one with the maximum response across the full range of smoothing scales.
The success of our method in identifying anisotropic features such as filaments and walls depends critically on the use of an optimally defined intensity field. This is accomplished by applying the DTFE reconstruction methodology to the sample particle or galaxy distribution.
Results. We have tested the MMF against a set of heuristic models of weblike patterns such as are seen in the Megaparsec cosmic matter distribution. To test its effectiveness in the context of more realistic configurations we also present preliminary results from the MMF analysis of an N-body model. Comparison with alternative prescriptions for feature extraction shows that MMF is a remarkably strong structure finder.
Key words: cosmology: theory – large-scale structure of Universe – methods: statistical – surveys
Large computer simulations of the evolution of cosmic structure (Springel et al. 2005) show prominent cellular patterns arising from gravitational instability. Galaxies accumulate in flattened walls, elongated filaments and dense compact clusters. These structures surround large near-empty void regions (Zel'dovich et al. 1982). Their spatial distribution displays a distinctive frothy texture, interconnected in a cosmic weblike pattern.
While it is rather straightforward to find qualitative descriptions of the spatial structure and components of the cosmic web, a useful, and physically meaningful, quantitative analysis has proven to be far from trivial. This would be important, for example, when we wish to study the effect of environment on the formation of galaxies and their halos.
There are two parts to this: firstly, the reconstruction of a continuous density field from a point sample and secondly, the identification of structures within that density field. For the first part we use the Delaunay Tessellation Field Estimator (DTFE) technique of Schaap & van de Weygaert (2000). The second part, which is the main thrust of this paper, consists of a series of morphology filters that identify, in a scale independent manner, particular kinds of structure in data. The method is referred to as the Multiscale Morphology Filter (MMF) and is based on the kind of Scale Space analysis that has in recent years proved so successful in imaging science.
It is worth emphasising at this juncture that we have chosen a specific implementation of this kind of multiscale analysis. Our choice is made on the following grounds: (a) it is simple to understand and program, (b) it works under quite general conditions and (c) the approach is generic and easy to modify. There are many alternative multiscale strategies: we leave those for another day or for other people to follow up. Thus we shall try to keep this presentation as general as possible so that the points at which we make implementation specific choices are clear.
The most prominent property is its hierarchical nature. The gravitational clustering process proceeds such that small structures are the first to materialize and subsequently merge into ever larger entities. As a result each emerging cosmic structure consists of various levels of substructure. Hence, upon seeking to identify structure at one characteristic spatial scale we need to take into account a range of scales.
The second prominent aspect is that of the weblike geometry marked by highly elongated filamentary and flattened planar structures. The existence of the cosmic web can be understood through the tendency of matter concentrations to contract and collapse gravitationally in an anisotropic manner.
A final conspicuous aspect is that of the dominant presence of large roundish underdense regions, the voids. They form in and around density troughs in the primordial density field.
The challenge for any viable analysis tool is to trace, highlight and measure these features of the cosmic web.
The Void Probability Function (White 1979; Lachieze-Rey et al. 1992) provided a characterisation of the "voidness" of the Universe in terms of a function that combined information from many higher moments of the point distribution. But, again, this has not provided any identification of individual voids.
The Minkowski functionals provide global characterisations of structure. An attempt to extend their scope towards providing locally defined topological measures of the density field has been made in the SURFGEN project of Sahni, Shandarin and their coworkers (Sahni et al. 1998; Shandarin et al. 2004). The main problem remains the user-defined, and thus potentially biased, nature of the continuous density field inferred from the sample of discrete objects. The usual filtering techniques suppress substructure on scales smaller than the filter radius, introduce artificial topological features in sparsely sampled regions, and diminish the flattened or elongated morphology of the spatial patterns. Quite possibly the introduction of more advanced geometry-based methods to trace the density field may prove a major advance towards solving this problem.
Importantly, Martínez et al. (2005) and Saar et al. (2007) have generalized the use of Minkowski Functionals by calculating their values in a hierarchy of scales generated from wavelet-smoothed, volume-limited subsamples of the 2dF catalogue. This approach is particularly effective in dealing with non-Gaussian point distributions, since the smoothing is not predicated on the use of Gaussian smoothing kernels.
Finding filaments joining neighbouring clusters has been tackled, using quite different techniques, by Colberg et al. (2005) and by Pimbblet (2005). More general filament finders have been put forward by a number of authors. Stoica et al. (2005) use a generalization of the classical Candy model to locate and catalogue filaments in galaxy surveys. This approach has the advantage that it works directly with the original point process and does not require the creation of a continuous density field. However, it is very computationally intensive.
The mathematically most rigorous program for filament description and analysis is that of the skeleton analysis of density fields by Novikov et al. (2006) (2D) and Sousbie et al. (2007) (3D). Based on Morse theory (see Colombi et al. 2000), the skeleton formalism analyzes continuous density fields and detects morphological features (maxima and saddle points in the density field) by relating density field gradients to the Hessian of the density field (also see Doré et al. 2007). It results in an elegant and effective tool with a particular focus on tracing the filamentary structures in the cosmic web. However, it is computationally intensive and may be sensitive to the specific method of reconstruction of the continuous density field. The Hessian of the density field also forms the basis of the MMF presented in this study, although MMF embeds it within a formalism that explicitly addresses the multiscale character of the cosmic density field and includes the shape-conserving abilities of the tessellation-based density field reconstruction (Schaap & van de Weygaert 2000).
Several factors contribute to making systematic void-finding difficult. The fact that voids are almost empty of galaxies means that the sampling density plays a key role in determining what is or is not a void (Schmidt et al. 2001). Moreover, void finders are often predicated on building void structures out of cubic cells (Kauffmann & Fairall 1991) or out of spheres (e.g. Patiri et al. 2006). Such methods attempt to synthesize voids from the intersection of cubic or spherical elements and do so with varying degrees of success. The Aspen-Amsterdam Void Finder Comparison Project of Colberg et al. (2007) will clarify many of these issues.
The Watershed-based algorithm of Platen et al. (2007) aims to avoid issues of both sampling density and shape.
Scale space analysis looks for structures of a mathematically specified type in a hierarchical, scale-independent, manner. It is presumed that the specific structural characteristic is quantified by some appropriate parameter (e.g. density, eccentricity, direction, curvature components). The data is filtered to produce a hierarchy of maps having different resolutions, and at each point the dominant parameter value is selected from the hierarchy to construct the scale-independent map. We refer to this scale-filtering process as a Multiscale Morphology Filter.
For simplicity, the paper describes one specific implementation, or embodiment, of the process in relation to the problem of cataloguing the structural elements of the cosmic web. Other embodiments are possible, but the present one turns out to be highly effective in structure segregation and feature identification.
While this sounds relatively straightforward, in practice a number of things are required to execute the process. Firstly, there must be an unambiguous definition of the structure-defining characteristic. In the present case we shall use the principal components of the local curvature of the density field at each point as a morphology-type indicator. This requires that the density be defined at all points of a grid, so there must be a method for going from a discrete point set to a grid-sampled continuous density field. We choose to do this using the DTFE methodology, since it does minimal damage to the structural morphology of the density field.
Since we are looking for three distinct structural morphologies, blobs, walls and filaments, we have to apply the segmentation process three times. However, since we shall be using curvature components as structural indicators, we shall have to eliminate the blobs before looking for filaments, and we shall then have to eliminate the filaments before looking for walls.
Figure 1: DTFE image of a slice through the Nbody simulation used in this work. Left: DTFE density field in a central slice. Right: the corresponding particle distribution in a slice of width .  
The input samples for our analysis are mostly samples of galaxy positions obtained by galaxy redshift surveys, or the positions of a large number of particles produced by N-body simulations of cosmic structure formation. In order to define a proper continuous field from a discrete distribution of points (computer particles or galaxies) we translate the spatial point sample into a continuous density field by means of the Delaunay Tessellation Field Estimator (Schaap & van de Weygaert 2000; Schaap 2007).
The DTFE technique recovers fully volumecovering and volumeweighted continuous fields from a discrete set of sample field values. The method has been developed by Schaap & van de Weygaert (2000, also see Schaap 2007) and forms an elaboration of the velocity interpolation scheme introduced by Bernardeau & van de Weygaert (1996). It is based on the use of the Voronoi and Delaunay tessellations of a given spatial point distribution. It provides a basis for a natural, fully selfadaptive filter in which the Delaunay tessellations are used as multidimensional interpolation intervals.
The primary ingredient of the DTFE method is the Delaunay tessellation of the particle distribution. The Delaunay tessellation of a point set is the uniquely defined and volume-covering tessellation of mutually disjunct Delaunay tetrahedra. A Delaunay tetrahedron is defined by the set of four points whose circumscribing sphere does not contain any of the other points in the generating set (Delaunay 1934); in 2D the corresponding elements are triangles. The Delaunay tessellation is intimately related to the Voronoi tessellation of the point set: they are each other's dual. The Voronoi tessellation of a point set is the division of space into mutually disjunct polyhedra, each polyhedron consisting of the part of space closer to the defining point than to any of the other points (Voronoi 1908; Okabe et al. 2000).
DTFE exploits three particular properties of Voronoi and Delaunay tessellations. The tessellations are very sensitive to the local point density. The DTFE method uses this fact to define a local estimate of the density on the basis of the inverse of the volume of the tessellation cells. Equally important is their sensitivity to the local geometry of the point distribution, which allows them to trace anisotropic features such as those encountered in the cosmic web. Finally, it uses the adaptive and minimum-triangulation properties of Delaunay tessellations to use them as adaptive spatial interpolation intervals for irregular point distributions. In this it is the first-order version of the Natural Neighbour method (NN method: Sibson 1980, 1981; Watson 1992; Braun & Sambridge 1995; Sukumar 1998; Okabe et al. 2000).
One of the important  and crucial  properties of a processed DTFE density field is that it is capable of delineating three fundamental characteristics of the spatial structure of the Megaparsec cosmic matter distribution. It outlines the full hierarchy of substructures present in the sampling point distribution, relating to the standard view of structure in the Universe having arisen through the gradual hierarchical buildup of matter concentrations. DTFE also reproduces any anisotropic patterns in the density distribution without diluting their intrinsic geometrical properties. This is a great advantage when seeking to analyze the cosmic matter distribution, characterized by prominent filamentary and walllike components linking up into a cosmic web. A third important aspect of DTFE is that it outlines the presence and shape of voidlike regions. DTFE renders the lowdensity regions as regions of slowly varying, moderately low density values through the interpolation definition of the DTFE field reconstruction.
An outline of the DTFE reconstruction procedure can be found in Appendix A.
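As an illustration of the first-order interpolation idea behind DTFE, the density estimate at each sample point can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: `dtfe_density` is a hypothetical helper, and it ignores the boundary corrections a production implementation needs (points on the convex hull have incomplete adjacent volumes).

```python
import math

import numpy as np
from scipy.spatial import Delaunay

def dtfe_density(points):
    """First-order DTFE-style estimate (sketch): the density at a sample
    point is (D+1)/V_i, with V_i the total volume of the Delaunay
    simplices adjacent to that point."""
    points = np.asarray(points, dtype=float)
    n, dim = points.shape
    tri = Delaunay(points)
    # Simplex volumes from the determinant of the edge matrix.
    verts = points[tri.simplices]            # (n_simplices, dim+1, dim)
    edges = verts[:, 1:] - verts[:, :1]      # (n_simplices, dim, dim)
    vols = np.abs(np.linalg.det(edges)) / math.factorial(dim)
    # Accumulate adjacent-simplex volumes onto each vertex.
    vtot = np.zeros(n)
    for k in range(dim + 1):
        np.add.at(vtot, tri.simplices[:, k], vols)
    return (dim + 1) / vtot

rng = np.random.default_rng(0)
rho = dtfe_density(rng.random((500, 3)))
```

Linearly interpolating these point densities over the Delaunay tetrahedra then yields the volume-covering field described above.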
Of course, substructure and morphological characteristics will be altered during this hierarchical smoothing process. The smearing of features through smoothing is inevitable if we smooth using isotropic filters and there has been some discussion as to whether one might do better by rescaling in such a way as to minimise feature smearing (for example Martínez et al. 2005; Saar et al. 2007). It is possible to use refined (nonlinear) smoothing procedures that minimize the side effects of smoothing but that issue is not addressed here. Here, we simply rescale using isotropic Gaussian filters: this seems to work very well and avoids complications arising from using other filters.
We use a particular feature recognition process based on eigenvalues of the Hessian matrix of the density field. It should be understood that the technique we describe here could well be used with other feature recognition systems, such as, for example, the ShapeFinder process (Sahni et al. 1998). Scale Space is a powerful tool for scale independent data analysis.
More recently, Frangi et al. (1998) and Sato et al. (1998) used Scale Space analysis for detecting the web of blood vessels in a medical image. The vascular system is a notoriously complex pattern of elongated tenuous features whose branching makes it closely resemble a fractal network. We translate, extend and optimize this technology towards the recognition of the major characteristic structural elements in the Megaparsec matter distribution. The resulting methodology yields a unique framework for the combined identification of dense, compact bloblike clusters, of the salient and moderately dense elongated filaments, and of tenuous planar walls.
Name | Description | Eq.

Scale-Space Map | Combination of the filtered density maps over all levels n. | (3)
Morphology Mask E | Region of space obeying the shape constraint. E = 1: locations obeying the shape constraint (Table 1); E = 0: locations not obeying the shape constraint. | (9)
Shape Significance Map | Feature shape fidelity for each point locale; measures conformance to the local shape criteria. | (9)
Morphology Map | Soft-thresholded version of the Shape Significance Map. The threshold selects out the most locally shape-conformant features; requires input of a threshold parameter. | (10)
Morphology Intensity Map | Map of the relevant eigenvalue magnitude for blobs, filaments or walls. Modulates the Morphology Map; meant to avoid enhancing noisy low-intensity structures. | (11)
Morphology Filter | Constructed from the Morphology Map and the Morphology Intensity Map: a morphology-weighted filter for the Morphology Mask. Provides each location which obeys the morphology constraint with a measure of the strength of the morphology signal. | (12)
Feature Map | Product of the Morphology Mask and the corresponding Morphology Filter. There is one Feature Map for each level in the Scale Space, representing local structures as seen on the different scales of the Scale Space. | (13)
Scale-Space Map Stack | Constructed from the Feature Maps for all levels in the Scale Space. Each pixel in this map is the greatest value of the corresponding pixels in the Feature Maps that make up the Scale-Space stack. | (14)
Object Map | Produced by thresholding the Scale-Space Map Stack; astrophysical and cosmological criteria are included to select physically recognizable objects. The threshold criterion is determined by cosmological/astrophysical considerations. | (Sect. 7)
Each pass involves the following components and procedures:
Figure 2: Scale space: a particle distribution (left) is translated by DTFE into a density field (centre), followed by the determination of the field, by means of filtering, at a range of scales (right).
In this paper we generate scaled representations of the data by repeatedly smoothing the DTFE reconstructed density field f_DTFE with a hierarchy of spherically symmetric Gaussian filters W_G having different widths R_S:

    f_S(x) = ∫ dy f(y) W_G(y, x),    (1)

where W_G denotes a Gaussian filter of width R_S,

    W_G(y, x) = (2π R_S²)^(-3/2) exp( -|y - x|² / 2R_S² ).

The scale-space MMF analysis described in this study involves a discrete number of N+1 levels, n = 0, 1, ..., N. Following Sato et al. (1998) we use a nested hierarchy of filters having widths differing by a factor of √2:

    R_n = (√2)^n R_0.    (2)
The largest structure that survives this process is determined by the effective width of the filter used in the final smoothing stage. For our purposes it is sufficient to use N=5.
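On a gridded density field, the nested filter hierarchy of Eq. (2) amounts to a stack of Gaussian smoothings. The sketch below is illustrative only: it assumes a field already sampled on a regular grid, with filter radii expressed in grid units, and `scale_space` is a hypothetical helper name.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_space(field, r0=1.0, n_levels=6):
    """Smoothed copies of `field` with radii R_n = (sqrt 2)^n * R_0 (Eq. 2)."""
    radii = [r0 * np.sqrt(2.0) ** n for n in range(n_levels)]
    return radii, np.stack([gaussian_filter(field, r) for r in radii])

rng = np.random.default_rng(1)
field = rng.random((32, 32, 32))
radii, stack = scale_space(field)
```

Each successive level suppresses structure below its filter radius, so the stack spans the full range of scales probed by the MMF.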
We shall denote the nth-level smoothed version of the DTFE reconstructed field by the symbol f_{n}. The Scale Space itself is constructed by stacking these variously smoothed data sets, yielding the family of smoothed density maps f_{n}:

    { f_{n} : n = 0, 1, ..., N }.    (3)
Figure 3: Maps of the eigenvalues of the Hessian matrix at 3 different scales (levels). From top to bottom: the 3 eigenvalues λ_{1}, λ_{2} and λ_{3} (λ_{1} ≤ λ_{2} ≤ λ_{3}). From left to right: 3 different scales, R_{1}, R_{3} and R_{5} (R_{1} > R_{3} > R_{5}). Positive values are represented as gray shades on a logarithmic scale, while negative values are indicated by contour lines, also on a logarithmic scale.
We can express the local variations of the density field f around a point x_{0} as a Taylor expansion,

    f(x_{0} + s) = f(x_{0}) + sᵀ ∇f(x_{0}) + (1/2) sᵀ H(x_{0}) s + ...,    (4)

where H is the Hessian matrix of the field, with elements

    H_{ij} = ∂²f / ∂x_{i} ∂x_{j}.    (5)

In our case we compute the scale-space Hessian matrices for each level n directly from the DTFE density field, via a convolution with the second derivatives of the Gaussian filter:

    ∂²f_{n} / ∂x_{i} ∂x_{j} = f_DTFE ⊗ ∂²W_G(R_{n}) / ∂x_{i} ∂x_{j}.    (6)

In order to properly compare the values of the Hessian arising from the differently scaled variants of the data that make up the Scale Space we must use a renormalised Hessian, rescaled by the square of the filter width:

    H̃_{ij} = R_{n}² ∂²f_{n} / ∂x_{i} ∂x_{j}.    (7)
We denote these eigenvalues by λ_{a} (a = 1, 2, 3) and arrange them in ascending order:

    λ_{1} + λ_{2} + λ_{3} = ∇²f,    λ_{1} ≤ λ_{2} ≤ λ_{3}.    (8)
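The scale-space Hessian and its ordered eigenvalues can be sketched with Gaussian derivative filters. This is an illustrative implementation under stated assumptions (the R²-normalisation of Eq. (7) and a gridded field); `hessian_eigenvalues` is a hypothetical name, not the authors' routine.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_eigenvalues(field, r):
    """Eigenvalues (ascending: lambda_1 <= lambda_2 <= lambda_3) of the
    R^2-normalised Hessian, obtained by convolving the field with second
    derivatives of a Gaussian of width r (cf. Eqs. 6-7)."""
    dim = field.ndim
    h = np.empty(field.shape + (dim, dim))
    for i in range(dim):
        for j in range(i, dim):
            order = [0] * dim
            order[i] += 1
            order[j] += 1                      # differentiate along axes i and j
            d2 = gaussian_filter(field, r, order=order)
            h[..., i, j] = h[..., j, i] = r * r * d2
    return np.linalg.eigvalsh(h)               # sorted ascending per voxel

rng = np.random.default_rng(2)
lam = hessian_eigenvalues(rng.random((24, 24, 24)), 2.0)
```

`np.linalg.eigvalsh` returns the eigenvalues in ascending order, matching the λ_{1} ≤ λ_{2} ≤ λ_{3} convention of Eq. (8).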
Figure 4: Morphology Mask E: on the basis of the 3 eigenvalues λ_{1}, λ_{2} and λ_{3} at each location we determine whether the morphological criterion (here, whether it corresponds to a filament; Table 1) is valid. If so E = 1, otherwise E = 0. Top row: maps of the three eigenvalues; bottom row: the Morphology Mask E.
Table 1: Eigenvalue relationships defining the characteristic morphologies. The conditions describe objects with intensity higher than their local background as clusters, filaments or walls. For voids we would have to reverse the sign of the eigenvalues.
With the local curvature and shape encapsulated in the three eigenvalues λ_{1}, λ_{2} and λ_{3} of the Hessian, the MMF seeks to identify the regions in space which correspond to a certain morphology, at the scale at which the corresponding morphology signal attains its optimal value. First we select these regions by means of a Morphology Mask. Subsequently we develop a filter-based procedure for assigning at each scale a local weight, which is used to select the scale at which the morphology reaches its strongest signal.
There are many ways of using the eigenvalues of the Hessians in the Scale Space representation of the data to identify and demarcate specific types of structure. Here we start by defining a morphology mask. The Morphology Mask E is a hard filter which identifies all pixels obeying the morphology and shape condition:

    E(x) = 1 if the eigenvalue constraints of Table 1 hold at x;  E(x) = 0 otherwise.    (9)
Figure 5: Morphology Filter. The Morphology Response function (top centre) is the soft-thresholded version of the Shape Significance map (left frame), determined from the values of the eigenvalues λ_{1}, λ_{2} and λ_{3}. The Morphology Intensity function (bottom centre) is also computed from the λ's, using Eq. (11). Finally, the Morphology Filter (right frame) is obtained by combining the two.
The following shape indices reflect the strength of the classification in terms of the local geometry as characterised by the λ's.
Figure 6: The Feature Map (right-hand frame) is computed for each scale and is equal to the Morphology Filter at the locations where the Morphology Mask is unity, and zero elsewhere.
As a cautionary warning it must be stressed that we cannot identify a point as being part of a locally filamentary structure and then assess its significance using an evaluation of the shape significance that applies to blobs or walls. Likewise, the shape significance of one morphology cannot be used to assess the relative significance of different types of structure. This means that the identification of structural elements using this eigenvalue classification scheme must be done cyclically: first find blobs (three roughly equal eigenvalues), then filaments (two roughly equal and dominant eigenvalues) and finally walls (one dominant eigenvalue). There are other schemes that are one-pass classifiers.
Separate values of the shape significance, and of the corresponding soft-thresholded morphology map, are computed for each kind of feature: blob, filament and wall.
Methods of thresholding image data such as Eq. (10) are generally referred to as "soft thresholding'', as opposed to "hard thresholding'' in which all values below a critical value are zeroed. Soft thresholding results in visually more appealing density distributions. See Fig. 5.
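The distinction between hard and soft thresholding can be sketched as follows. The logistic ramp used here is only one possible soft-threshold form, chosen for illustration; the paper's Eq. (10) defines its own functional form.

```python
import numpy as np

def hard_threshold(x, t):
    """Hard threshold: zero everything below t, keep the rest unchanged."""
    return np.where(x >= t, x, 0.0)

def soft_threshold(x, t, width=0.1):
    """Soft threshold (illustrative): attenuate values below t smoothly
    with a logistic ramp instead of cutting them to zero."""
    return x / (1.0 + np.exp(-(x - t) / width))

x = np.linspace(0.0, 1.0, 101)
hard = hard_threshold(x, 0.5)
soft = soft_threshold(x, 0.5)
```

The soft version avoids the sharp edges that a hard cut imprints on the resulting density maps, which is why it yields the visually more appealing distributions mentioned above.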
Qian et al. (2003) noted that |λ_{3}| will be large only for blobs, while |λ_{2}| will be large for blobs and filaments, and |λ_{1}| for blobs, filaments, and walls. Combining these relations with the constraints in Table 1 (3rd column) we can use the following intensity function:

    I = |λ_{3}| for blobs,  |λ_{2}| for filaments,  |λ_{1}| for walls.    (11)
There are other possible measures of feature intensity. Frangi et al. (1998) introduced the Frobenius norm of the Hessian matrix as a measure of second-order structure. However, this measure is biased towards bloblike structures and can produce erroneous signals in the detection of filaments and walls.
Figure 7: The Scale Space Map Stack : the formalism selects at any location the scale with the optimal signal of the feature map. Depicted are the Feature maps for three different scales ( top row), and the resulting Map Stack ( bottom row), determined over the full range of scales.  
For each level of the scale space we can generate a Feature Map. The Feature Map comprises the information contained in the Morphology Filter and allocates it to the locations contained in the Morphology Mask. Formally, the Feature Map at level n is the product of the Morphology Mask and the Morphology Filter at that level:

    Feature Map_{n} = (Morphology Mask)_{n} × (Morphology Filter)_{n}.    (13)
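The mask-times-filter combination and the subsequent per-pixel maximum over all levels (the Scale-Space Map Stack) reduce to two array operations. The sketch below is illustrative, with hypothetical array and function names standing in for the paper's notation.

```python
import numpy as np

def feature_maps(masks, filters):
    """Per-level feature map = morphology mask x morphology filter (Eq. 13)."""
    return masks * filters

def map_stack(fmaps):
    """Per-pixel maximum of the feature maps over all levels (Eq. 14)."""
    return fmaps.max(axis=0)

rng = np.random.default_rng(3)
filters = rng.random((6, 16, 16, 16))        # one morphology filter per level
masks = rng.random((6, 16, 16, 16)) > 0.5    # hard morphology masks E in {0, 1}
fmaps = feature_maps(masks, filters)
psi = map_stack(fmaps)                       # strongest morphology signal per voxel
```

The maximum over levels is what makes the final map scale independent: each voxel reports the scale at which its morphology signal is strongest.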
Given the Scale Space Map Stack for a given feature (blobs, filaments or walls), we can assign each particle of the original dataset to the specific feature identified in the Scale Space Map Stack.
For blob finding the thresholding is quite straightforward. At very low threshold there will be many round objects (the eigenvalue criterion fixes their shape), of which only a small fraction will be the blobs we are seeking. As the threshold is raised from zero, the noise and the less significant blobs are eliminated. There comes a point when the threshold stops annihilating these small, less significant blobs and simply starts eroding the large blobs. This is the point at which we define our optimal threshold. The dotted vertical line indicates the best value of the threshold.
If we plot a graph of the fraction of the sample volume occupied by below-threshold blobs against the threshold, we obviously find a monotonic curve that rises from zero to one. This is shown in Fig. 8a, where we see a two-power-law behaviour with a break marking the transition from texture-noise annihilation to simple blob erosion.
Figure 8: Thresholds for feature isolation based on the feature erosion criterion. The selected value is shown as a dotted vertical line. The object count to the right of the line declines due to erosion.  
We use a threshold parameter to decide when a pixel is considered part of a filament. Figure 8b plots the normalised number of objects detected for each value of this threshold.
The explanation for the shape of this curve is as follows. The low-threshold objects are largely due to texture noise: the number of these declines as the threshold increases. When real filamentary features appear, the number of detections increases with the threshold to reach a maximum. This is because at lower thresholds the features tend to percolate, so that raising the threshold breaks the structure up into a greater number of filamentary objects. As the threshold rises further, the filaments are eroded and become rarer. The point at which filament erosion starts to act is taken as the optimal value of the threshold. This is indicated by the dotted line in the figure.
We use a corresponding threshold to decide when a pixel is considered part of a wall. Figure 8c plots the normalised number of objects detected for each value of this threshold.
The threshold for defining walls is determined in the same way as for filaments. Note, however, that the particles classified as lying in blobs and filaments have been removed in previous cycles of the analysis, so there is no longer a significant texture-noise component. As the threshold is varied there is a peak in the number of walls that are found. At thresholds below this critical value the walls join up and percolate, eventually leaving one vast percolating structure. At higher threshold values walls are eroded and eventually destroyed. The dotted vertical line indicates the best value of the threshold.
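The object-count-versus-threshold curves of Fig. 8 can be sketched with connected-component labelling. Here scipy's `ndimage.label` stands in for the paper's object identification, and a smoothed random field is a stand-in for a real morphology map; the peak-picking is the same recipe described above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, label

def object_counts(field, thresholds):
    """Number of connected above-threshold regions for each threshold value."""
    return np.array([label(field > t)[1] for t in thresholds])

rng = np.random.default_rng(4)
field = gaussian_filter(rng.random((48, 48, 48)), 2.0)
thresholds = np.linspace(field.min(), field.max(), 40)
counts = object_counts(field, thresholds)
# Low thresholds percolate (few large objects); high thresholds erode
# objects away. The optimal threshold sits near the peak of the curve.
t_opt = thresholds[np.argmax(counts)]
```

Scanning for the peak of this curve automates the choice of the dotted vertical line shown in Fig. 8.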
The analysis cycle can be expressed in pseudocode (Code 1). In this form of pseudocode, keywords (which correspond to class methods in object-oriented programming) are in boldface.
The nature of the hierarchy is such that we have first to identify blobs, remove them from the sample, then identify filaments, and after removing them from the sample finally identify the walls. This arises because data points in blobs are defined by having three significant eigenvalues, data points in filaments are defined by having two significant eigenvalues, and data points in walls have only one significant eigenvalue. Identifying a filament before eliminating blobs would not work since the blobs would be more strongly detected.
Code 1: Pseudocode of the MMF procedure.

Map_Feature:
    resample PointSet to Mesh using DTFE
    construct ScaleSpace Hierarchy
    for each Level in ScaleSpace
    {
        build Hessian Eigenvalue Maps
        build, using Eigenvalue Criteria for Feature:
        {
            Morphology Mask
            Feature Shape Fidelity
            Morphology Response Filter
            Feature Intensity Map
        }
        generate:
        {
            Morphology Filter
            Feature Map
        }
    }
    stack ScaleSpace Feature Maps
    threshold Feature Maps using Feature Threshold Method
    in thresholded regions:
    {
        identify Points
        publish Points
        remove Points from PointSet
    }
    if Feature = Blobs
        set Feature = Filaments
    else if Feature = Filaments
        set Feature = Walls
    else
        quit
    goto Map_Feature
Our use of isotropic Gaussian filters is perhaps the most important limiting factor in this analysis. The largest filter radius chosen is substantially smaller than the lengths of the typical filaments. Only the shorter filaments will get isotropised, and they are "lost" since they make no contribution in the scale-space stack. Our algorithm is indeed a long-thin-filament finder. The main side effect of the Gaussian smoothing is to make the profile (perpendicular to the filament) of the sharper (narrow) filaments Gaussian. A narrow filament having high density contrast will, under linear Gaussian smoothing, spill over into the large scales at a variety of thresholds, and it will appear to be fatter than it really is. This latter problem is a consequence of applying simple linear filters: it is generally overcome within the scale space context by using nonlinear filters or by using wavelets (Martínez et al. 2005; Saar et al. 2007).
Another area for improvement is to use the eigenvectors as well as the eigenvalues themselves. Here we have simply relied on the relative magnitudes of the eigenvalues as indicators of curvature morphology. Had the eigenvectors themselves been uncorrelated we might have concluded that there was structure when in fact there was only noise: the eigenvector correlations are good indicators of noise.
A third area for improvement would be to use anisotropic smoothing filters. This leads us into another related approach to this problem: the use of nonlinear diffusion equations to locate structural features. This will be the subject of another article later on.
A more detailed description of the model construction may be found in Appendix B.1. We distinguish two different yet complementary approaches, Voronoi Element Models and kinematic Voronoi models.
Simple Voronoi models confine their galaxy distributions to one of the distinct structural components of a Voronoi tessellation:
Figure 9: Recovered particles in blobs, filaments and walls from a Voronoi particle distribution. Particles inside blobs are detected (left) at 90/15 percent real/false detections. From the new blob-free distribution we detect particles in filaments (centre) at 90/10 percent real/false detections. Finally, the blob- and filament-free distribution is used to find the particles inside walls (right) at 80/10 percent real/false detections.
For our study we generated four different Voronoi clustering models, labelled as A, B, C and D. They are all based upon a Voronoi tessellation generated by M=53 nuclei distributed within a box of size . The models are composite Voronoi Element Models and consist of the superposition of galaxies located in field, walls, edges and vertices of a Voronoi tessellation. Our four test models contain N=32^{3} galaxies. The fraction of galaxies in the various components is a key parameter of the model, and is specified in Table 2. In and around the walls, edges and vertices the galaxy distribution follows a radial Gaussian density profile, with scale factors , and .
Table 2: Voronoi Clustering Models. Percentage of galaxies/points in the various morphological elements of the model.
A considerable virtue of the Voronoi clustering models is that it is known a priori which galaxies reside in the various morphological components of the Voronoi test models. This allows an evaluation of the absolute performance of the MMF and other morphology detection techniques by determining the fraction of the galaxies which are correctly identified as vertex, filament or wall galaxies.
For each Voronoi model we computed the DTFE density field from the particle distribution and applied the MMF. Following our previously described scheme, we first identified the blobs from the complete particle distribution. After removal of the blob particles, the filaments are found. Following the equivalent process for the filaments, the last step of the MMF procedure concerns the identification of the wall particles. The remaining particles are tagged as field particles.
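The hierarchical pass described above (blobs first, then filaments from the blob-free distribution, then walls, with the remainder tagged as field) can be sketched in a few lines. This is an illustrative stand-in rather than the paper's implementation: `responses` is assumed to hold precomputed per-cell morphology response maps, and the threshold values are free parameters.

```python
import numpy as np

def segment_hierarchically(responses, thresholds):
    """Hierarchical MMF-style segmentation (illustrative sketch).

    `responses` maps each morphology name to a per-cell response map;
    cells claimed by an earlier, higher-priority morphology are
    excluded from the later passes, mirroring the particle removal
    between the blob, filament and wall steps.
    """
    order = ("blob", "filament", "wall")          # priority order matters
    remaining = np.ones(responses[order[0]].shape, dtype=bool)
    labels = {}
    for morph in order:
        mask = (responses[morph] > thresholds[morph]) & remaining
        labels[morph] = mask
        remaining &= ~mask                        # remove before next pass
    labels["field"] = remaining                   # whatever is left over
    return labels
```

The essential design choice is that the blob mask is applied before the filament pass, so a cell with both a strong blob and a strong filament response is counted once, as a blob.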
Figure 9 shows the outcome of the MMF applied to Voronoi Model C. Visually, the resemblance between the real and the MMF-identified blob, filament and wall particles is remarkably good. The second row of panels shows the real detections of the MMF: it clearly manages to identify all clusters and filaments, and even the more tenuous walls in the weblike galaxy distribution. The false detections do have a somewhat broader spatial distribution than the corresponding real detections. Most of them reside in the boundary regions of the blobs, filaments and walls: they are mainly an artefact of the effective extent of the MMF morphology masks being slightly larger than the intrinsic extent of the Voronoi components. Fine-tuning of the filter scales (Eq. (1)) is a potential cure for this artefact.
Figure 10: Real versus false detections for the different Voronoi models (see Table 2) (A: solid, B: dotted, C: dashed, D: dot-dashed) for blobs (left), filaments (center) and walls (right). We applied the MMF (black) and simple density thresholding (grey) in order to compare both methods.
The main tunable parameters for optimizing the number of detected galaxies are the blob, filament and wall threshold values. By lowering the blob threshold level, defined through a regular density threshold (see Sect. 7), the number of MMF-detected blob galaxies increases. The same holds for the filament and wall thresholds when their levels are lowered. The galaxies detected by the MMF include both real and false detections; as the threshold levels are lowered the number of both will tend to increase.
The detection rate at a given threshold level is the fraction of genuine blob, filament or wall galaxies which have been detected by the MMF. Ideally one would want to trace them all and reach a 100% detection rate; in practice the rate is set by the applied threshold. Based upon the one-to-one relation between the blob, filament and wall thresholds on the one hand and the corresponding detection rates on the other, we use the detection rate itself as threshold parameter.
The ratio of the number of false blob detections to the total number of genuine blob galaxies is the blob contamination rate. The filament and wall contamination rates are defined in a similar way. Because lowering the threshold levels results in a larger number of detections, both real and false, the contamination rate is an increasing function of the detection rate. Note that the contamination rate may exceed unity when the number of false detections exceeds the total number of genuine (blob, filament or wall) galaxies.
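Since both rates reduce to simple counts once the genuine and detected memberships are known (as they are for the Voronoi models), they can be sketched directly. The function name and boolean-mask interface below are our own illustration:

```python
import numpy as np

def detection_and_contamination(true_mask, detected_mask):
    """Detection rate: fraction of genuine members recovered by MMF.
    Contamination rate: false detections divided by the total number
    of genuine members; note it can exceed 1 when false detections
    outnumber the genuine population."""
    true_mask = np.asarray(true_mask, dtype=bool)
    detected_mask = np.asarray(detected_mask, dtype=bool)
    n_true = true_mask.sum()
    detection = (true_mask & detected_mask).sum() / n_true
    contamination = (~true_mask & detected_mask).sum() / n_true
    return detection, contamination
```

For example, recovering three of four genuine wall galaxies while picking up one interloper gives a 75% detection rate at 25% contamination.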
Each of the morphological elements is identified with a particular (disjoint) range of density values. Blobs, i.e. clusters, are identified with the highest density values. Filaments are associated with moderately high density values. Walls follow with density values hovering around unity to a few times the mean, while the field/voids may include densities down to zero. This approach has frequently been used to discriminate between virialized haloes and the surrounding matter distribution, and has even been used in an attempt to define filamentary or planar features (Dolag et al. 2006). However, it seriously oversimplifies and distorts the perceived structure of the cosmic web. This is presumably because filaments and walls differ in density and have significant internal density structure. The simplistic density threshold approach does not reflect the reality of the structure: the range of densities in filaments overlaps with densities in walls and even with those of the outskirts of clusters. Hahn et al. (2007) reach similar conclusions.
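The density-threshold classification criticised here amounts to nothing more than binning each cell by its density contrast, which makes the oversimplification explicit. In the sketch below the bin boundaries are purely hypothetical illustration values, not taken from the text:

```python
import numpy as np

# Disjoint density ranges in units of the mean density; the boundary
# values here are invented for illustration only. Real filaments and
# walls overlap in density, which is exactly why this scheme fails.
edges = [0.0, 1.0, 5.0, 50.0, np.inf]      # field | wall | filament | blob
names = ["field", "wall", "filament", "blob"]

def classify_by_density(density):
    """Assign each cell a morphology purely by its density bin."""
    idx = np.digitize(density, edges) - 1   # bin index per cell
    return [names[i] for i in idx]
```

A single filament cell at density 3 would here be labelled "wall", illustrating the cross contamination the text describes.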
Figure 11: MMF applied to an N-body simulation. The top row shows a subsample a) consisting of 10% of the total number of particles together with, in panels b) and c), the structures resulting from simple density thresholding using two different thresholds. Panels b) and c) both contain both spherical and elongated structures: there is a large amount of cross contamination between morphologies. Simple density thresholding is not an effective morphological discriminator. The second row shows the results of applying the MMF procedure, with clearly segregated d) blobs, e) filaments and f) walls (for clarity we display only the largest structures). The third row shows the particles associated with the MMF-defined structures.
Figure 12: Comparing blobs found with HOP and with MMF. A) Particles. B) Isosurfaces of the blobs identified with MMF. C) Particles inside the blobs (black), with the background particles in grey. D) The positions of the HOP haloes (circles) and the particles inside the MMF blobs (dark grey); the remaining particles are shown in light grey.
In order to explore the response of the MMF in this complex scenario we performed a cosmological N-body simulation. We give here only a few preliminary results to illustrate how the methodology works with a "real'' galaxy distribution. A more detailed exploration follows in a later paper.
The simulation represents a ΛCDM model with , , h=0.73 in a periodic box of side 150 Mpc containing 256^{3} dark matter particles. We also ran the same simulation with the resolution lowered to 128^{3} particles, according to the prescription given by Klypin et al. (2001), in order to assess the effect of mass resolution on the density field determination. For the scales studied here there is no significant difference between the density fields computed from the two simulations, since the mean interparticle separation is small enough to resolve the intermediate-density structures (Schaap 2007).
In the case of filaments and walls (see Figs. 11e and f) the multiscale nature of the MMF is less obvious, but it is nonetheless present.
It is clear from Fig. 11 that, even though the LSS presents a great challenge, the MMF can successfully recover each morphological component at its characteristic scale.
Figure 13: Particles defining filamentary structures in a slice of an N-body model. The greyscale images show the MMF detection of filamentary features on various filtering scales. Top left-hand frame: the filament volume occupancy (number of sample grid cells with a filament signal) as a function of smoothing scale.
Figure 12A shows a section through the Nbody model and Fig. 12B shows the blobs that are found in that section by the MMF process. Figure 12C shows how these blobs relate to the underlying structure displayed in panel A.
In an attempt to see how these blobs compare with clusters identified via a more traditional method, we have used the HOP algorithm to find clusters. The HOP clusters are shown in Fig. 12D superposed on the MMF blobs. Superficially, the agreement is remarkably good: the MMF blobs are indeed what we would subjectively call clusters. As in Fig. 11 we can appreciate the scale range revealed by MMF.
Making this comparison more precise is rather difficult owing to the vastly different approaches to blobfinding. This will be discussed in detail in a subsequent paper dealing specifically with the application of MMF to Nbody simulations.
When these are stacked, application of Eq. (15) determines whether a given pixel is a part of a filament. The process yields the filamentary map of Fig. 11.
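The stacking step itself is compact: at each pixel the morphology response is taken as the maximum over the full range of smoothing scales, as described in the Methods section. A sketch (the thresholding of Eq. (15) itself is omitted; the function name is our own):

```python
import numpy as np

def stacked_response(response_per_scale):
    """Combine per-scale morphology response maps by taking, at each
    pixel, the maximum response over the full range of smoothing
    scales; a sketch of the scale-space stacking step."""
    stack = np.stack(response_per_scale, axis=0)  # shape (n_scales, ...)
    return stack.max(axis=0)
```

A pixel thus carries the signature of whichever smoothing scale responds most strongly to it, which is what makes the procedure scale independent.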
Figure 14: Occupancy of cosmic web features, by volume (top) and mass (bottom) for a ΛCDM N-body simulation (see text).
The result is hardly surprising: the clusters and filaments occupy about the same mass fraction and together contain more than half the haloes in the simulation. The clusters occupy by far the least volume: they are dense systems, considerably denser than the filaments. Recall, however, that a density threshold alone could not be used to define these structures (see the top row of panels in Fig. 11).
The large volume occupancy of filamentary structures explains why our impression of the cosmic matter distribution is predominantly filamentary, and the fact that they are all long and thin (as illustrated in Fig. 13) emphasises the weblike nature of the structure.
Perhaps the only surprise in this analysis is the relatively low volume occupancy of the walls in comparison with the filaments. This may be in part because most of the walls have broken up by the present epoch. It may also be in part due to the fact that the low number of particles in walls makes it relatively difficult to find them: they may get misclassified as being part of the general field. It is difficult to assess this on the basis of the present experiments alone.
The technique has been illustrated in terms of spatial point data since that is relatively unambiguous. However, the MMF technique we have described is a quite general technique for scalefree feature finding: it only needs a mathematical prescription of what is being looked for, which in general may not be so easy! Bearing that in mind, the following is a list of possible application areas.
The technique can readily be extended to analysis of velocity data of various kinds such as Fingers Of God in cosmological redshift surveys, analysis of dynamical phase spaces, feature detection in solar images, morphological characterization of structure in spiral arms, feature detection in radio datacubes, etc. Finding clusters and their substructures using MMF would provide an important alternative to HOP. Finding small, low surface brightness, galaxies in noisy neutral hydrogen surveys would be another useful application.
Acknowledgements
We are grateful to JeanLuc Starck for his interest in this work and to the referee, Stephane Colombi, for positive and helpful comments. We would like to thank Pablo Araya for providing the Nbody simulations and Erwin Platen for many useful and clarifying discussions.
- Uniform sampling process: the point sample is an unbiased sample of the underlying density field. A typical example is that of N-body simulation particles. For D-dimensional space the density estimate is
\hat{\rho}(\mathbf{x}_i) = (1+D)\,\frac{w_i}{V(\mathcal{W}_i)},
where V(\mathcal{W}_i) is the volume of the contiguous Voronoi cell of point i.
- Systematic non-uniform sampling process: the sampling density follows a specified selection process. The non-uniform sampling process is quantified by an a priori known selection function \psi(\mathbf{x}). This situation is typical for galaxy surveys, where \psi may encapsulate differences in sampling density as a function of sky position, as well as the radial redshift selection function for magnitude- or flux-limited surveys. For D-dimensional space the density estimate is
\hat{\rho}(\mathbf{x}_i) = (1+D)\,\frac{w_i}{\psi(\mathbf{x}_i)\,V(\mathcal{W}_i)}. (A.5)
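In one dimension the density estimate above takes a particularly transparent form: the contiguous Voronoi cell of a point is simply the interval between its two neighbours. A sketch, assuming unit-weight points and an ad hoc one-sided treatment of the two boundary points:

```python
import numpy as np

def dtfe_density_1d(x):
    """1-D DTFE-style estimate (sketch): for D=1 the contiguous
    Voronoi cell of point i spans its two neighbouring points, so
    rho_i = (1+D) / V(W_i) = 2 / (x[i+1] - x[i-1]) for unit weights.
    Boundary points use their single adjacent interval (a choice of
    ours, not prescribed by the text)."""
    x = np.sort(np.asarray(x, dtype=float))
    rho = np.empty_like(x)
    rho[1:-1] = 2.0 / (x[2:] - x[:-2])   # interior: two-sided cell
    rho[0] = 1.0 / (x[1] - x[0])         # boundary: one-sided cell
    rho[-1] = 1.0 / (x[-1] - x[-2])
    return rho
```

Because the cell shrinks where points crowd together, the estimate adapts automatically to the local sampling density, which is the property the MMF relies on.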
We distinguish two different yet complementary approaches (see van de Weygaert 2007). One is the fully heuristic approach of "Voronoi Element models''. They are particularly apt for studying systematic properties of spatial galaxy distributions confined to one or more structural elements of nontrivial geometric spatial patterns. The second, supplementary, approach is that of the Voronoi Evolution models or Voronoi Kinematic models, which attempt to "simulate'' weblike galaxy distributions on the basis of simplified models of the evolution of the Megaparsec scale distribution. The Voronoi clustering models offer flexible templates for cellular patterns, and they are easy to tune towards a particular spatial cellular morphology. To investigate the performance of MMF we use composite Voronoi Element Models, tailormade heuristic "galaxy'' distributions composed of a superposition of particle distributions in and around the walls, edges and vertices of the Voronoi skeleton. A complete composite particle distribution includes particles located in four distinct structural components:
In the case of composite models the fractions of field galaxies, wall galaxies, filament galaxies and blob galaxies, with the constraint that they sum to unity, are key input parameters of the model.
All the Voronoi models are based upon the displacement of a sample of N "model galaxies''. The initial spatial distribution of these N galaxies within the sample volume V is purely random, their initial locations defined by a homogeneous Poisson process. A set of M nuclei within the volume V corresponds to the cell centres, or expansion centres, driving the evolving matter distribution.
Following the specification of the initial positions of all galaxies, the second stage of the procedure consists of the calculation of the complete Voronoi track for each galaxy (Sect. B.2) towards its wall, filament or vertex, or its location within a cell when it is a field galaxy.
Simple Voronoi Element Models place all model galaxies in either walls, edges or vertices. The versatility of the model also allows combinations of element models, in which field (cell), wall, filament and vertex distributions are superimposed. The characteristics of the patterns and spatial distribution in these Mixed Voronoi Element Models can be varied and tuned according to the fractions of wall, filament, vertex and field galaxies.
Figure B.1: Schematic illustration of the galaxy projections in the Voronoi clustering model. See text. 
In the second step the galaxy n is moved from its initial position towards its final destination in a wall, filament or vertex (see Fig. B.1). The first section of the galaxy's displacement is the radial path along the direction defined by the galaxy's initial location with respect to its expansion centre. This direction is defined by the corresponding unit vector.
If the galaxy is a field galaxy it remains at its original location. If it is a wall galaxy it is projected along the radial direction onto the first Voronoi wall with which the radial path intersects. Filament galaxies are subsequently moved along the wall to the location where their path intersects the cell edge. Finally, if it is a cluster galaxy, its path is continued along the edge until it reaches its final destination, the vertex. The identity of the neighbouring nuclei, and therefore the identity of the cell, the wall, the edge and the vertex, depends on the initial location of the galaxy, the position of its closest nucleus and the definition of the galaxy's path within the Voronoi skeleton.
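The radial projection of a wall galaxy can be written down in closed form: on the ray from the nearest nucleus through the galaxy, the bisector wall shared with another nucleus at separation d is reached at parameter t = |d|^2 / (2 u·d). A geometric sketch (function and variable names are our own):

```python
import numpy as np

def project_to_wall(galaxy, nuclei):
    """Move a galaxy radially away from its nearest Voronoi nucleus
    until the ray first crosses a cell wall (sketch of the wall-galaxy
    projection step).

    Along the ray p(t) = z_i + t*u from nucleus z_i, equidistance to
    nucleus z_j gives t = |d|^2 / (2 u.d) with d = z_j - z_i; the cell
    boundary is the smallest positive such t over all other nuclei."""
    galaxy = np.asarray(galaxy, dtype=float)
    nuclei = np.asarray(nuclei, dtype=float)
    i = np.argmin(np.linalg.norm(nuclei - galaxy, axis=1))  # nearest nucleus
    u = galaxy - nuclei[i]
    u /= np.linalg.norm(u)                                  # radial direction
    t_best = np.inf
    for j, zj in enumerate(nuclei):
        if j == i:
            continue
        d = zj - nuclei[i]
        ud = np.dot(u, d)
        if ud > 0:                        # wall lies ahead on this ray
            t_best = min(t_best, np.dot(d, d) / (2.0 * ud))
    return nuclei[i] + t_best * u
```

Taking the minimum over all nuclei is what selects the wall actually bounding the cell, since the Voronoi cell is the intersection of the corresponding half-spaces.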
In summary, the path is codified by:
A finite thickness is assigned to all Voronoi structural elements. The walls, filaments and vertices are assumed to have a Gaussian radial density distribution, specified by the widths of the walls, the filaments and the vertices. Voronoi wall galaxies are displaced according to the specified Gaussian density profile in the direction perpendicular to their wall. A similar procedure is followed for the Voronoi filament galaxies and the Voronoi vertex galaxies. As a result the vertices stand out as three-dimensional Gaussian peaks.
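Imposing the Gaussian radial profile on a wall amounts to displacing its galaxies along the wall normal by Gaussian-distributed offsets. A minimal sketch (the function name and interface are our own):

```python
import numpy as np

def thicken_wall(points, normal, sigma, seed=None):
    """Give wall galaxies a Gaussian radial density profile by
    displacing each point along the wall normal by a random offset
    drawn from N(0, sigma); a sketch of the thickness assignment."""
    rng = np.random.default_rng(seed)
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    offsets = rng.normal(0.0, sigma, size=len(points))
    return np.asarray(points, dtype=float) + offsets[:, None] * normal
```

The same recipe applies to filaments and vertices, with the single normal direction replaced by two and three orthogonal displacement directions respectively.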