Open Access
Issue
A&A
Volume 637, May 2020
Article Number A31
Number of page(s) 26
Section Cosmology (including clusters of galaxies)
DOI https://doi.org/10.1051/0004-6361/201936397
Published online 08 May 2020

© I. Santiago-Bautista et al. 2020

Licence: Creative Commons. Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

The large-scale structure (LSS) of the Universe is composed of a network of groups and clusters of galaxies, elongated filaments, widely spread sheets, and voids (e.g., Peebles 1980; Davis et al. 1982; Bond et al. 1996). Both the ΛCDM cosmological model (e.g., Bond & Szalay 1983; Doroshkevich & Khlopov 1984) and recent numerical N-body simulations (e.g., Millennium, Springel et al. 2005; Bolshoi, Klypin et al. 2011; Illustris, Vogelsberger et al. 2014) reinforce the idea that these structures are assembled under the effect of gravity generated by the total matter content. Since the baryonic matter follows, to first order, the distribution of the dark matter, the galaxies and gas populate these substructures accordingly (e.g., Eisenstein et al. 2005). Moreover, there is increasing evidence that the galaxy properties (e.g., mass, activity, morphology, luminosity, surface brightness, orientation, etc.) correlate with the LSS environment in which they are located (e.g., Smargon et al. 2012; Scoville et al. 2013; Poudel et al. 2016; Kuutma et al. 2017; Chen et al. 2017; Wang et al. 2018) or, more specifically, with the internal structure of the supercluster (e.g., Einasto et al. 2008; Gallazzi et al. 2009; Gavazzi et al. 2010; Cybulski et al. 2014; Guglielmo et al. 2018). Furthermore, theoretical studies (e.g., Cen & Ostriker 1999) suggest that from one-half to two-thirds of the baryonic matter in the Universe is hidden in the filamentary structures of the LSS. Therefore, characterization of the LSS (e.g., topology, density, temperature, dynamical state, matter distribution, and its evolution over time) is an important step in placing constraints on the current cosmological models.

Galaxy clusters are well studied through their gas component since they are the densest regions of the LSS. However, the gas in filaments is most likely in a relatively cool ($T \sim 10^5$–$10^7$ K, or 0.01–1 keV) and relatively low-density phase called the warm-hot intergalactic medium (WHIM). There is already some evidence for such gas from X-ray emission observed within pairs of close clusters (e.g., Ursino et al. 2015; Alvarez et al. 2018). In addition, the WHIM between pairs of clusters has been observed through the Sunyaev–Zel’dovich effect (SZ; e.g., Planck Collaboration VIII 2013). Tanimura et al. (2019) carried out statistical analyses using Planck SZ observations in the regions of superclusters. The results of these latter authors provide evidence for inter-cluster gas at a temperature of $T \sim 8 \times 10^6$ K. Also, Eckert et al. (2017) presented deep X-ray observations of the galaxy cluster Abell 2744, the analysis of which suggests a gas fraction of between 5 and 15% for the filaments that surround the cluster and a plasma temperature of $(1$–$2) \times 10^7$ K. Nevertheless, the characterization of these structures through observables like X-ray emission or the Sunyaev–Zel’dovich effect is still challenging due to the low density and temperature of the WHIM.

An alternative is to analyze the galaxy distribution at large scales. Recently, with the availability of large sky area databases such as the Two Degree Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2001), the 2MASS Redshift Survey (2MRS; Huchra et al. 2012), and the Sloan Digital Sky Survey (SDSS; Albareti et al. 2017), the development of accurate structure-detection algorithms has become an even more important concern for astronomy. Visually, the galaxy distribution shows filamentary ridge-like structures that connect massive clusters and groups. However, the identification of these structures through a computational algorithm is not easy to achieve. A good algorithm should first produce an identification that resembles the human visual perception. It should also deliver quantitative results and be founded in a robust and well-defined numerical theory. All of this must be done in an acceptable amount of time with reasonable computational resources.

Currently, there are several filament-finding algorithms that have been tested on the basis of N-body simulations. For example, Aragón-Calvo et al. (2007) present the multi-scale morphology filter (MMF) method, which divides cosmic structure into nodes (clusters), filaments, and walls using a smoothing over a range of scales (from a Delaunay tessellation field estimator reconstruction, DTFE) and a morphological response filter. Another approach, presented by Aragón-Calvo et al. (2010), makes use of a watershed segmentation technique to trace the spines of the filaments. Alternatively, Cautun et al. (2013) propose an algorithm called NEXUS, which takes into account the density, tidal field, velocity divergence, and velocity shear of the galaxies. Other examples are the algorithm by González & Padilla (2010), which uses the binding energy for selecting the filament members, and the DisPerSE algorithm by Sousbie (2011), which is based on Morse theory; both use density estimations based on Delaunay–Voronoi tessellations.

On the other hand, several attempts have been made to trace the distribution of the real cosmic web using the SDSS database. For example, Sousbie et al. (2008) applied their local skeleton method to samples of Data Release (DR) 4, which allowed them to estimate the mean filament length per unit volume. The algorithm by Bond et al. (2010), called the smoothed Hessian major axis filament finder (SHMAFF), was applied to the SDSS-DR6 after removing the finger-of-God (FoG) effect. Platen et al. (2011) compared three different reconstruction techniques, namely the Delaunay tessellation field estimator (DTFE), the natural neighbor field estimator (NNFE), and a Kriging interpolation, and searched for voids also in DR6. These latter authors found that DTFE works quantitatively better than the others, while Kriging and NNFE perform better than DTFE in producing visually appealing reconstructions. Smith et al. (2012) applied their multi-scale probability mapping (MSPM), which combines probability and scale density information with a friends-of-friends (FoF) algorithm, to the SDSS-DR7 galaxies. This method allowed these latter authors to recover structures from clusters to filaments of up to $\sim 10\,h^{-1}$ Mpc. Tempel et al. (2014) applied a Bisous model to the SDSS-DR8 spectroscopic galaxies to trace the filament spines. This latter method adjusts cylinders to the galaxy positions by applying a stochastic metric. The subspace constrained mean shift (SCMS) approach, which uses a kernel density estimator (KDE), was applied by Chen et al. (2016) to DR7 and by Chen et al. (2015) to DR12. This method allows the identification of high-density regions by smoothing the galaxy distribution. These latter authors applied this technique over slices of 0.05 in redshift for the SDSS sky area. Moreover, Alpaslan et al. (2014) found, for the Galaxy And Mass Assembly (GAMA) survey, that there are fine filaments embedded inside the SDSS voids. These structures, referred to as “tendrils”, have a lower density than the SDSS filaments, appear to be morphologically distinct, are more isolated, and span shorter distances. A comprehensive review and comparative analysis of the above algorithms can be found in Libeskind et al. (2018).

Another approach to analyzing the LSS is to study the superclusters of galaxies. These are traditionally defined as concentrations of galaxy clusters (e.g., Abell 1961; Einasto et al. 2001; Chow-Martinez et al. 2014), that build up the cosmic web from a network of connected high-density nodes; or directly from the distribution of galaxies (e.g., Luparello et al. 2011; Costa-Duarte et al. 2011; Liivamägi et al. 2012). They can also be defined kinematically by mapping galaxy peculiar velocity flows, a technique still restricted to the very nearby Universe (Tully et al. 2014; Dupuy et al. 2019). This last method is the closest to a purely gravitational-potential-based approach, and allows the identification of the “basins of attraction” that partition the Universe into cells or cocoons (e.g., Dupuy et al. 2019; Einasto et al. 2019). For this work we adopted the supercluster second-order clustering definition for determining the superclusters of the sample. These systems are not virialized and the contents of the inter-cluster medium (dark matter halos, gas, and galaxies) dynamically interact and organize by falling through the gravitational potential of the more massive structures, forming walls, filaments, groups, and clusters. As shown in Tanaka et al. (2007), the possibility of finding elongated chain-like structures increases in superclusters. Also, following the classification of superclusters by Einasto et al. (2014) into filament-type and spider-type, both have filaments, in a linear or radial configuration, respectively. Following this approach, Cybulski et al. (2014), for example, applied a combination of Voronoi tessellation and minimum spanning tree (MST) techniques over the Coma supercluster region in order to search for bridges between clusters of galaxies.

Motivated by the above context, we developed a methodology for the identification of structures in the environment of superclusters using the galaxies embedded in them. We restrict our study to the SDSS-DR13 area, and use only galaxies with spectroscopic redshifts for our analysis. The approach we follow seeks to detect structures by using only the geometrical information of the galaxy distribution. Using different pattern-recognition methods, we identify high- to moderate-density galaxy systems, and low-density filaments connecting them. This allows the identification of structures over a wide range of scales (1–100 Mpc), from groups to long filaments. Moreover, the identified structures are validated through comparisons with previously reported catalogs. We also carried out a qualitative validation through a kernel method, which is one of the most commonly used methodologies for the detection of overdensity regions. Our aim is to investigate whether previous filament candidates in the sample, identified from chains of Abell/ACO clusters, are bona fide structures and to characterize their galaxy populations. Finally, we studied the relation between the galaxy properties and the supercluster environment in which they reside (e.g., systems, filaments, and the dispersed component of the superclusters).

This paper is organized as follows: in Sect. 2 we present the data for the sample of superclusters under analysis and the sample of galaxies from the SDSS survey. In Sect. 3 we describe in detail the implementation of mathematical tools and pattern-recognition methods applied for the detection of high-density regions (clusters and groups) and for the skeletonization of the low-density filamentary structures. In Sect. 4 we describe the algorithm for detecting clusters and groups of galaxies inside supercluster boxes, and in Sect. 5 we present the algorithm for finding the filaments and their skeletons. In Sect. 6 we describe the application of the algorithms to one of the superclusters, MSCC 310, as an example of their use. Section 7 is devoted to the validation and evaluation of the methodology and discussion of its results. In Sects. 8 and 9 we present the results concerning the analyses of the galaxy properties as a function of the supercluster environment. We also discuss these results and compare them with previously reported results. Finally, in Sect. 10 we present the conclusions of this work. Throughout this paper we assume the Hubble constant $H_0 = 70\,h_{70}$ km s$^{-1}$ Mpc$^{-1}$, the matter density $\Omega_m = 0.3$, and the dark-energy density $\Omega_\Lambda = 0.7$.

2. The data

2.1. The superclusters and filament candidates

We are interested in unveiling and studying LSS filaments, which can be defined as chains of clusters connected by bridges of galaxies and probably by gas and dark matter. As mentioned previously, these elongated structures should most likely be found in superclusters since they have probably just passed the quasi-nonlinear regime described by the Zel’dovich approximation (Zel’dovich 1970; see also the “sticking model” by Shandarin & Zel’dovich 1989). In the current evolutionary stage of the LSS, superclusters are a network of sheets, filaments, and knots (clusters and groups) of galaxies, gas, and dark matter, just starting a global gravitational collapse process.

We selected a sample of superclusters of galaxies from the Main SuperCluster Catalogue (MSCC; Chow-Martinez et al. 2014) that are inside the SDSS region (in order to have a sample of galaxy data that is as homogeneous as possible). The original MSCC is an all-sky catalog that contains 601 superclusters, identified in a complete sample of rich Abell/ACO clusters with updated redshifts from 0.02 to 0.15 using a tunable FoF algorithm. From these superclusters, 166 are inside the SDSS-DR13 region. For this work we selected those superclusters with five or more clusters with a box volume (see below) inside the SDSS-DR13 survey area. In addition, we used the list of filament candidates for MSCC superclusters by Chow-Martínez et al. (in prep.) as a reference in order to select the superclusters with the most promising filaments. Roughly speaking, these filament candidates were identified as chains of at least three clusters – members of the superclusters – separated by less than 20 Mpc from each other. The present work also intends to validate these filament candidates by searching for the bridges of galaxies that we expect to connect them. It is worth mentioning that some of the filament candidates may turn out to be only chance configurations, with no bridges of galaxies connecting the clusters of a chain. Also, some bridges may exist, but not necessarily along the straight lines connecting the clusters. Our final sample consists of 46 superclusters of galaxies, which are listed in Table 1.

Table 1.

Sample of MSCC superclusters used in the present work.

For the Abell/ACO clusters and for the galaxies in the supercluster box volumes (see Sect. 2.3), we first transformed their radial-angular coordinates to rectangular coordinates as follows:

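$$X = D_C \cos\delta \cos\alpha, \tag{1}$$

$$Y = D_C \cos\delta \sin\alpha, \tag{2}$$

$$Z = D_C \sin\delta, \tag{3}$$

where $\alpha$ and $\delta$ are the right ascension and declination of the object, and $D_C$ is the co-moving distance as obtained using the spectroscopic redshift and the cosmological parameters indicated above.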
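As an illustration of this transformation, the following minimal sketch computes the co-moving distance for a flat ΛCDM model with the parameters adopted in this paper and converts (α, δ, z) into rectangular coordinates. The axis convention (X toward α = 0, δ = 0; Z toward the north celestial pole), the function names, and the use of Simpson's rule are assumptions made for the example and are not taken from the original pipeline.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance(z, h0=70.0, omega_m=0.3, omega_l=0.7, steps=1000):
    """Line-of-sight co-moving distance in Mpc for a flat LCDM model.

    Integrates c / H(z') from 0 to z with the composite Simpson rule.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    dz = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * dz)
    return (C_KM_S / h0) * total * dz / 3.0


def to_rectangular(ra_deg, dec_deg, z):
    """Convert (RA, Dec, z) to rectangular co-moving coordinates in Mpc.

    Assumed convention: X = D cos(dec) cos(ra), Y = D cos(dec) sin(ra),
    Z = D sin(dec).
    """
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    d_c = comoving_distance(z)
    return (
        d_c * math.cos(dec) * math.cos(ra),
        d_c * math.cos(dec) * math.sin(ra),
        d_c * math.sin(dec),
    )


# Example: a galaxy at RA = 180 deg, Dec = 30 deg, z = 0.08
position = to_rectangular(180.0, 30.0, 0.08)
```

For galaxies in groups and clusters, the redshift entering this conversion would first need the FoG correction discussed in the text; the sketch does not attempt that step.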

2.2. The SDSS galaxies

The main galaxy sample of SDSS-DR13 (Albareti et al. 2017) is a suitable database to search for filamentary structures in the LSS because (i) it covers a large sky area (14 555 square degrees) containing various MSCC superclusters; (ii) it contains homogeneous photometric and spectroscopic data for galaxies with an astrometric precision of 0.1 arcsec rms and an uncertainty in radial velocities of about 30 km s$^{-1}$ (Bolton et al. 2012); (iii) it is roughly complete to the magnitude limit of the main galaxy sample ($r_{\rm Pet} = 17.77$), which corresponds to an average $z \sim 0.1$, going (inhomogeneously) deeper for data releases after DR7 (Abazajian et al. 2009); and (iv) at the limit of our sample, $z = 0.15$, the SDSS spectra are complete for galaxies brighter than $M_r \sim -21$.

SDSS-DR7 combines the SDSS-I/II spectra for one million galaxies and quasars. It has ∼6% incompleteness due to fiber collisions (Strauss et al. 2002) and another ∼7% incompleteness attributed to pipeline misclassification (Rines et al. 2007). These spectra are included in the final data release of the SDSS-III (Alam et al. 2015). The Baryon Oscillation Spectroscopic Survey (BOSS) is part of the SDSS-III observations and obtained spectra for another 1.4 million galaxies. The BOSS observations are divided into two main samples, the low-redshift LOWZ ($z < 0.4$) and the high-redshift CMASS ($0.4 < z < 0.7$) galaxy samples. The SDSS-DR13 (Albareti et al. 2017) includes spectra for more than 2.6 million galaxies and quasars.

Although photometric redshifts are available for SDSS galaxies, for this work we selected those objects listed in the SpecObj sample for which spectroscopic redshifts are available (downloaded from the SkyServer web service) and that are described as extragalactic (i.e., galaxies and low-z quasars). The SpecObj table contains the best and unique spectra for the same location within 2 arcsec; these are referred to as “sciencePrimary” objects. We considered galaxies within a redshift range from 0.01 to 0.15 and selected spectra with quality flag “good” or “marginal”. Because the kind of study presented here relies on the galaxy distance measurement, we restricted our analysis to galaxies with spectroscopic redshifts, which provide the highest accuracy. However, galaxies for which only photometric redshifts are available can be included in the sample in further analyses to test whether or not their addition strengthens the filament detection signal.

For the present work we also made use of value-added catalogs such as the Max Planck Institute for Astrophysics and Johns Hopkins University (MPA-JHU) catalog (Brinchmann et al. 2004; Kauffmann et al. 2003; Tremonti et al. 2004). These latter authors calculated different galaxy properties (stellar mass, metallicity, activity type classification, star formation rate, etc.) using the spectra of the SDSS-DR8 galaxies (Aihara et al. 2011). As explained by Tremonti et al. (2004), the galaxy properties in the MPA-JHU catalog are calculated by processing the galaxy spectrum in such a way that even the weaker emission lines are detectable. In order to analyze the morphological distribution of the galaxies in the different supercluster environments we employed the morphological classification provided by Huertas-Company et al. (2011). These latter authors calculate a probabilistic morphological classification for the SDSS-DR7 spectroscopic galaxies by applying machine-learning techniques that make use of their photometry. They also compare their automated classification with a sample of the Galaxy Zoo (Lintott et al. 2008, 2011) visual classification and show that their classification into early and late types is in good agreement with the visual classification.

2.3. The supercluster boxes

For each supercluster in Table 1 we selected all the SDSS galaxies (according to the above criteria) located inside the corresponding box volume. These boxes were defined in rectangular coordinates in a way that their walls were set at a distance of 20 Mpc beyond the center of the farthest clusters in each direction, for each supercluster. This extension was applied in order to guarantee that any connection of the supercluster with external structures could be detected. The box volumes of the superclusters vary from ( Mpc)3 to ( Mpc)3. Compared to the typical sizes of the observed and simulated basins of attraction in Dupuy et al. (2019) [( Mpc)3], the boxes we use here are slightly larger, as expected, implying that we are sampling the LSS in a general way, and are not restricting the analysis to the densest parts of the superclusters. Our sampling of the superclusters may be compared to that by Krause et al. (2013), that is, broader than the sampling done by, for example, Kopylova & Kopylov (2006) and Liivamägi et al. (2012).

The properties of the supercluster boxes along with information about the detected galaxy systems (to be described below) are presented in Table 2. In particular, the superclusters MSCC 236, MSCC 314, and MSCC 317 lie close to the limits of the SDSS region: although all their member clusters are inside, their boxes were reduced to a margin of 10 Mpc in only one direction.

Table 2.

Properties of the supercluster boxes and of the galaxy systems detected inside them using the GSyF algorithm.

Figure 1 shows the decrease of the mean volume density of the boxes with redshift due to Malmquist bias. The fitted function will be used as the selection function for the SDSS galaxies considered in this work. It may be noted that the mean densities of the superclusters MSCC 55 and MSCC 579 lie far below the fit. This is also due to the positions of these superclusters close to the border of the SDSS coverage: their samplings seem sparse and irregular. In fact, for MSCC 579 one can clearly see the shape of the cones of observation through the galaxy distribution. For this reason, the analysis of these superclusters and of the three cited above must be treated with caution.

Fig. 1.

Distribution of mean volume densities (see fourth column of Table 2) for the 46 superclusters in our sample as a function of redshift (blue points). The red line corresponds to the best fit of a power-law function. Residuals of the fit are shown in the bottom panel. MSCC 579 and MSCC 55 were excluded from the fit.

It is worth noting that, since we have only the radial velocity component available (redshift), the transformation from radial-angular coordinates to rectangular coordinates is more complicated for the galaxies. Their peculiar velocities may bias their redshift-space coordinates, especially when they are members of clusters and groups of galaxies that are subject to the FoG effect. Therefore, for the galaxy data used to detect the filaments, we first applied a correction, as described below, which redefined their individual $D_C$ in Eqs. (1)–(3).

3. Mathematical tools

In what follows, we consider the $N$ galaxies in each supercluster volume as a set of points $x_1, x_2, \ldots, x_N$ that form a sample $X$.

3.1. Voronoi tessellation

The Voronoi tessellation (VT; Voronoi 1908) of a sample $X$, Vor($X$), can be defined as the subdivision of a 2D plane or a 3D space into cells with the property that a point $p$ lies in the cell $v_i$ of the seed point $x_i \in X$ if and only if the Euclidean distance satisfies $D_E(p, x_i) < D_E(p, x_j)$ for each $x_j \in X$ with $j \neq i$. In other words, the VT divides the space into convex polygonal (or polyhedral) cells, each containing one seed point (in our case, a galaxy), in a way that the cell walls are equidistant from the two nearest seeds (e.g., Platen et al. 2011). The density at each galaxy position $x_i$ can therefore be estimated as $d_i = 1/v_i$, with $v_i$ being the volume (or area) of the cell enclosing the object $x_i$. Scoville et al. (2013) and Darvish et al. (2015), for example, use VT to find the high-density regions in sky slices, while Cybulski et al. (2014) apply VT to identify the filamentary structures in the Coma cluster region.
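The density estimate $d_i = 1/v_i$ can be sketched with SciPy as follows. This is only an illustrative example, not the implementation used for the paper: the function name is hypothetical, and border cells that extend to infinity are simply flagged as NaN, which is one possible choice among several.

```python
import numpy as np
from scipy.spatial import ConvexHull, Voronoi


def voronoi_densities(points):
    """Return d_i = 1 / v_i for each seed point (works in 2D or 3D).

    Cells on the border of the sample are unbounded, so their density is
    left as NaN here (an assumption of this sketch).
    """
    points = np.asarray(points, dtype=float)
    vor = Voronoi(points)
    densities = np.full(len(points), np.nan)
    for i, region_index in enumerate(vor.point_region):
        region = vor.regions[region_index]
        if len(region) == 0 or -1 in region:
            continue  # open cell: no finite area or volume
        # Voronoi cells are convex, so the hull of the cell vertices is the
        # cell itself; ConvexHull.volume is the area in 2D, the volume in 3D.
        cell_size = ConvexHull(vor.vertices[region]).volume
        densities[i] = 1.0 / cell_size
    return densities


# Example: a 3 x 3 unit grid; only the central cell is bounded.
grid = [[i, j] for i in range(3) for j in range(3)]
print(voronoi_densities(grid))
```

In the unit-grid example the central cell is a unit square, so its density is 1, while the eight border points are left undefined.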

3.2. Hierarchical cluster analysis

Hierarchical clustering (HC) is a machine-learning method whose objective is to group objects with similar properties. It has been used in several different fields such as artificial intelligence, biology, medicine, and business. In general, it can be used to carry out pattern-recognition analysis, allowing the user to regroup, segment, and classify any kind of data. This method is equivalent to a reduction of the dimensionality of the data and reduces the computing time considerably. In astronomy, the most popular application of HC has been in the detection of substructures inside galaxy clusters following the algorithm developed by Serna & Gerbal (1996). This algorithm considers the positions, redshifts, and potential binding energy between pairs of galaxies to detect substructures (see also Guennou et al. 2014).

As we are interested in finding galaxy structures on scales larger than those of substructures, and structures that may be less strongly gravitationally bound, we chose to use an agglomerative hierarchical clustering analysis method that considers only the positions and redshifts or 3D estimated positions of galaxies. A detailed description of the HC algorithm can be found in Theodoridis & Koutroumbas (2009), Theodoridis et al. (2010) and Murtagh & Contreras (2011). For our analysis we chose Ward’s minimum variance clustering criterion, described in detail by Murtagh & Legendre (2014). In general, Ward’s method works by merging the groups following the criterion:

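$$\Delta D(c_1, c_2) = \frac{|c_1|\,|c_2|}{|c_1| + |c_2|}\,\lVert \bar{x}_{c_1} - \bar{x}_{c_2} \rVert^2, \tag{4}$$

where $\Delta D$ is a term that measures the distance between two groups $c_1$ and $c_2$, $|c|$ is the number of members of a group, and $\bar{x}_c$ is its centroid.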

In our case, each point is initially considered as a group, subcluster, or singleton, and then each group can be agglomerated with a neighbor that has the minimum ΔD distance. The agglomeration continues until all points are grouped together.

The results of the HC clusterization can be represented by a dendrogram or hierarchical tree. A dendrogram represents, in graphical form, the connections between elements and groups at different levels of agglomeration. The height of each connection line in the tree corresponds to the distance between the two connected elements or centroids. This representation also allows visualization of the principal branch structures, where the singletons are the final leaves. The number of desired groups, $N_{cut}$, is therefore obtained by cutting the hierarchical tree at a certain level. The exact value of this level depends on the characteristics of the sample or, more precisely, on the underlying physics used to define the groups. Each created group can be represented by a 2D or 3D Gaussian model, $P_j(x)$. This allows the groups to be classified by their Gaussian properties, for example by their centroid (mean position, $C_j$), richness (number of members, $N_j$), and compactness (covariance, $\sigma_j$).
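A minimal sketch of this procedure, assuming SciPy's agglomerative clustering with Ward linkage as a stand-in for the implementation used in the paper, could look as follows. The function name `ward_groups` and the dictionary layout are hypothetical; the tree is cut so that at most `n_cut` groups remain, and each group is summarized by its centroid, richness, and covariance.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def ward_groups(positions, n_cut):
    """Cluster positions with Ward linkage and cut the tree into n_cut groups."""
    positions = np.asarray(positions, dtype=float)
    tree = linkage(positions, method="ward")  # dendrogram as a linkage matrix
    labels = fcluster(tree, t=n_cut, criterion="maxclust")
    groups = {}
    for label in np.unique(labels):
        members = positions[labels == label]
        groups[int(label)] = {
            "centroid": members.mean(axis=0),  # C_j
            "richness": len(members),          # N_j
            # sigma_j; undefined for a singleton group
            "covariance": np.cov(members, rowvar=False) if len(members) > 1 else None,
        }
    return labels, groups


# Example: two well-separated mock groups in 2D
rng = np.random.default_rng(0)
mock = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(10.0, 0.1, (30, 2))])
labels, groups = ward_groups(mock, n_cut=2)
```

Cutting by a fixed number of groups is only one option; `fcluster` also accepts a distance threshold, which would correspond to cutting the dendrogram at a given height.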

3.3. Graphs

Graph theory-based algorithms have proven to be a suitable tool for analyzing complex networks. Some of the most common fields where these algorithms are successfully applied are social networks, computer vision, statistics, business, and transportation networks.

A graph is a representation of the connections in a network. It is composed of “nodes” and “edges”, where each node represents an object, and the edges represent the connections between nodes. Also, the edges can have weights that represent the strength of the connection. An undirected graph has edges that do not have direction. We can define an undirected graph as $G = (U, E, W)$, with $n$ nodes (or vertices) $u_i \in U$, $m$ edges $e_{kl} \in E$, and a weight set $W$ with a weight $w_{kl}$ for each edge $e_{kl}$. The information of a graph can be represented by a square adjacency matrix. The values of the matrix entries indicate the weight of the connection between nodes. Hence, the adjacency matrix $A$ of the graph $G$ is defined as:

$$A_{kl} = \begin{cases} w_{kl} & \text{if } e_{kl} \in E, \\ 0 & \text{otherwise,} \end{cases}$$

where $u_k$ and $u_l$ are nodes in $G$. One can refer to Ueda & Itoh (1997) for a discussion on the use of the graph theory approach for quantifying the LSS of the Universe.

3.4. Minimum spanning tree

A spanning tree connects all nodes of a connected graph without producing cycles. A graph with several unconnected components therefore contains one spanning tree per component. Since the edges in a graph can have weights, the MST algorithm (Graham & Hell 1985) searches for the spanning tree that minimizes the total weight. This algorithm traces a tree-like continuous path through a group of edges and nodes in an optimal way. In particular, the Kruskal MST algorithm analyzes the edges in sequence after sorting them by weight. The shortest edge is analyzed first and becomes the first tree branch. The nodes are then added to the tree under three conditions: (i) only one node is added to the tree at a time; (ii) a node is added based on the number of connected edges; (iii) its edges cannot be connected to another node already in the tree, so that no cycle is created. The process continues with the following edges in the graph until all connected edges are analyzed. Finally, the tree is extracted from the graph and the process begins again with the remaining nodes until all are tested. The result is a forest of optimized independent trees, one for each connected component of the graph.
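For illustration, the textbook form of Kruskal's algorithm can be written with a union-find structure, as in the sketch below. It processes all edges in a single pass and returns the whole minimum spanning forest at once, rather than extracting trees one by one as described above; the function name and the `(weight, k, l)` edge format are assumptions of this example.

```python
def kruskal_forest(n_nodes, edges):
    """Minimum spanning forest of an undirected weighted graph.

    edges is an iterable of (weight, k, l) tuples with node indices in
    range(n_nodes). Returns the selected edges in order of increasing weight.
    """
    parent = list(range(n_nodes))  # union-find: each node starts as its own tree

    def find(u):
        # Follow parent links to the root, halving the path on the way.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    selected = []
    for weight, k, l in sorted(edges):
        root_k, root_l = find(k), find(l)
        if root_k != root_l:  # the edge joins two different trees: no cycle
            parent[root_k] = root_l
            selected.append((weight, k, l))
    return selected


# Example: nodes 0-2 form one component, nodes 3-4 another.
edges = [(1.0, 0, 1), (2.0, 1, 2), (2.5, 0, 2), (1.5, 3, 4)]
forest = kruskal_forest(5, edges)
# forest == [(1.0, 0, 1), (1.5, 3, 4), (2.0, 1, 2)]
```

The edge (2.5, 0, 2) is rejected because nodes 0 and 2 are already connected through node 1, so adding it would close a cycle.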

3.5. Dijkstra’s shortest path

Dijkstra’s algorithm (Dijkstra 1959) is a classical method for finding the shortest path between two nodes in a graph. We define a path between two nodes $u_k$ and $u_l$, with $k \neq l$ and $k, l \in \{1, \ldots, n\}$, as a sequence of connected nodes, its length being the sum of the weights of the edges traversed. In general, Dijkstra’s algorithm works as follows. First, the node at the beginning of the path, $u_0$, is selected as the origin. A tentative distance value is then assigned to all nodes: zero for the origin, $s(u_0) = 0$, and infinity for all the other nodes, $s(u_i) = \infty$. Next, all nodes are marked as unvisited and $u_0$ is set as the current node $a$. The algorithm then calculates the tentative distance from the origin to each unvisited node $u_i$ connected to $a$ by an edge $e_{ai}$ as $s_{\rm new} = s(a) + w_{ai}$, where $s(a)$ is the recorded distance of the current node and $w_{ai}$ is the weight of the edge $e_{ai}$. If $s_{\rm new} < s(u_i)$, the recorded distance of $u_i$ is updated to $s_{\rm new}$. After all neighbors of the current node have been examined, the current node is marked as visited. A visited node is not checked again; its recorded distance is therefore final and minimal. If all nodes have been visited, the algorithm stops. Otherwise, the unvisited node with the smallest recorded distance from $u_0$ becomes the next current node and the procedure is repeated from the distance-update step. A detailed description of the algorithm can be consulted in Santanu (2014).
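The steps of Dijkstra's algorithm can be sketched with a priority queue, which is a common way to pick the unvisited node with the smallest recorded distance. The adjacency-dictionary input format and the helper names below are assumptions made for this example.

```python
import heapq
import itertools


def dijkstra(adjacency, origin):
    """Shortest distances from origin to every node.

    adjacency maps each node to a list of (neighbor, weight) pairs; every
    node must appear as a key. Returns (distance, predecessor) dictionaries.
    """
    distance = {node: float("inf") for node in adjacency}
    predecessor = {}
    distance[origin] = 0.0
    visited = set()
    tie_breaker = itertools.count()  # avoids comparing nodes on equal distances
    queue = [(0.0, next(tie_breaker), origin)]
    while queue:
        dist_a, _, current = heapq.heappop(queue)
        if current in visited:
            continue  # stale queue entry: this node is already final
        visited.add(current)
        for neighbor, weight in adjacency[current]:
            new_distance = dist_a + weight
            if new_distance < distance[neighbor]:
                distance[neighbor] = new_distance
                predecessor[neighbor] = current
                heapq.heappush(queue, (new_distance, next(tie_breaker), neighbor))
    return distance, predecessor


def shortest_path(predecessor, origin, target):
    """Rebuild the node sequence from origin to target (target must be reachable)."""
    path = [target]
    while path[-1] != origin:
        path.append(predecessor[path[-1]])
    return path[::-1]


graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("a", 4.0), ("b", 2.0)],
    "d": [],  # isolated node: its distance stays infinite
}
dist, pred = dijkstra(graph, "a")
# dist["c"] == 3.0 and shortest_path(pred, "a", "c") == ["a", "b", "c"]
```

The direct edge from "a" to "c" has weight 4, so the route through "b" with total weight 3 is preferred.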

3.6. Kernel density estimator

As mentioned before, VT is used to measure the local density at each point position. However, in some cases, it fails to identify large overdensity regions, as mentioned by Cybulski et al. (2014). An alternative to the VT method is to apply KDEs. In general, KDE methods work by centering a kernel function on each observation in the sample. However, the choice of the optimal kernel model and its intrinsic parameters is still under investigation in the pattern-recognition community. Also, there have been several attempts to apply adaptive Gaussian model kernels, in other words, to change the size of the Gaussian model as a function of different parameters, such as the distance to the nearest neighbor (Chen et al. 2016) or a weighting function (Darvish et al. 2015).

For this work we used the results from the VT method (see Sect. 3.1) as the input parameters for the KDE. We start by fitting an ellipsoid inside each VT cell. Thus, instead of choosing a fixed bandwidth for the kernel, we employ the eigenvalues and eigenvectors of the ellipsoids to calculate a Gaussian kernel ϕΣ centered at μ with covariance matrix Σ for each observation. Therefore, each n-dimensional kernel is represented as:

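$$\phi_\Sigma(x - \mu) = \frac{1}{\sqrt{(2\pi)^n\,|\Sigma|}} \exp\left[-\frac{1}{2}\,(x - \mu)^{T}\,\Sigma^{-1}\,(x - \mu)\right]. \tag{5}$$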

The KDE can then be estimated as:

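$$\hat{f}(x) \propto \sum_{i=1}^{N} \alpha_i\,\phi_{\Sigma_i}(x - \mu_i), \tag{6}$$

where $\alpha_i$ is a weight factor calculated from the VT cell volume ($v_i$) as $1/v_i$, and $\mu_i$ and $\Sigma_i$ are the center and covariance matrix of the kernel of the $i$-th observation.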

The identification of the overdensity regions is done by projecting the KDE kernels onto 2D planes and superposing a regular rectangular grid on the data. The density estimation at a given grid intersection is then obtained by calculating the average density of all kernels that overlap at that point. Observations closer to an evaluating point will therefore contribute more to the density estimation than points farther away from it. Consequently, the density will be higher in areas with many observations than in areas with few observations.
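The grid evaluation can be sketched as below for the 2D case. The per-galaxy covariance matrices are taken as given inputs (in the paper they come from ellipsoids fitted inside the VT cells, a step not reproduced here), and the normalization by the sum of the weights is an assumption of this example, as are the function names.

```python
import numpy as np


def gaussian_kernel(x, mu, cov):
    """Evaluate an n-dimensional Gaussian with mean mu and covariance cov
    at each row of x."""
    n = len(mu)
    diff = x - mu
    inv_cov = np.linalg.inv(cov)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    # Quadratic form (x - mu)^T cov^-1 (x - mu), one value per row of x.
    exponent = -0.5 * np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return norm * np.exp(exponent)


def kde_on_grid(centers, covariances, weights, grid_x, grid_y):
    """Weighted sum of Gaussian kernels evaluated on a regular 2D grid.

    Returns an array of shape (len(grid_y), len(grid_x)). The result is
    divided by the sum of the weights (an assumed normalization).
    """
    gx, gy = np.meshgrid(grid_x, grid_y)
    nodes = np.column_stack([gx.ravel(), gy.ravel()])
    density = np.zeros(len(nodes))
    for mu, cov, alpha in zip(centers, covariances, weights):
        density += alpha * gaussian_kernel(
            nodes, np.asarray(mu, dtype=float), np.asarray(cov, dtype=float)
        )
    return (density / np.sum(weights)).reshape(gx.shape)


# Example: two galaxies with different kernel shapes and VT-based weights
centers = [(0.0, 0.0), (2.0, 1.0)]
covariances = [np.eye(2), np.array([[0.5, 0.1], [0.1, 0.3]])]
weights = [1.0, 3.0]  # e.g. alpha_i = 1 / v_i
axis = np.linspace(-3.0, 5.0, 81)
density_map = kde_on_grid(centers, covariances, weights, axis, axis)
```

Because each kernel carries its own covariance matrix, elongated VT cells produce elongated kernels, which is the point of replacing a fixed bandwidth.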

3.7. Transversal profiles

The distribution of galaxy properties in filaments is analyzed by constructing transversal profiles. These profiles are calculated by setting up a series of concentric cylinders with axes oriented along the filament skeletons. A bin is then defined as the volume between two concentric cylinders of radii $R_{cy}$ and $R_{cy} + \Delta R_{cy}$. The occurrence of a galaxy proxy in each bin is determined with respect to the galaxy distance from the filament skeleton, $D_{ske}$. The total count of galaxies per bin is weighted by the bin volume, in a way similar to constructing a normalized histogram. In order to compare samples of different sizes, a normalization is applied by dividing the number of events in a bin by the total number of galaxies in the sample.
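A minimal sketch of such a profile is given below. It assumes that the distances $D_{ske}$ of the galaxies to the skeleton have already been measured and that the filament can be treated as a cylinder of known total length; the function name and argument names are hypothetical.

```python
import math


def transversal_profile(d_ske, filament_length, delta_r, n_bins):
    """Galaxy number density in concentric cylindrical shells around a skeleton.

    d_ske: distances of the galaxies to the filament skeleton.
    Returns a list of (inner_radius, value) pairs, where value is the count
    in the shell divided by the shell volume and by the total sample size.
    """
    counts = [0] * n_bins
    for distance in d_ske:
        bin_index = int(distance // delta_r)
        if 0 <= bin_index < n_bins:
            counts[bin_index] += 1
    total = len(d_ske)
    profile = []
    for bin_index, count in enumerate(counts):
        r_in = bin_index * delta_r
        r_out = r_in + delta_r
        # Volume between two concentric cylinders of radii r_in and r_out.
        shell_volume = math.pi * (r_out ** 2 - r_in ** 2) * filament_length
        value = count / shell_volume / total if total else 0.0
        profile.append((r_in, value))
    return profile


# Example: five galaxies around a 10 Mpc long skeleton, 1 Mpc wide bins
profile = transversal_profile([0.2, 0.5, 1.5, 1.7, 2.5], 10.0, 1.0, 3)
```

Dividing by the shell volume matters because the outer shells are larger: in the example the first two bins both contain two galaxies, but the second shell has three times the volume and therefore a third of the density.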

4. Galaxy System-Finding algorithm (GSyF)

4.1. Detection of high-density regions

We first searched for the high-relative-density regions, clusters, and groups of galaxies (which we refer to generically as galaxy systems) inside the studied superclusters, because these systems are the natural nodes for filaments. This was also necessary for correcting the FoG effect and having the data prepared for the application of the filament-finding algorithm (see following section). Furthermore, the detection of galaxy systems allows the identification of new and possibly previously unknown systems (especially poorer galaxy groups), and the improvement of the membership estimation of the superclusters themselves.

A description of the algorithm, including the strategy we used for optimizing its parameters using simulated mock volumes, is presented in Santiago-Bautista et al. (2019). Here we review the main steps of this algorithm. First we calculate the local surface density at the position of each galaxy in the projected area of the supercluster by applying the VT method (Sect. 3.1). The individual VT area of each galaxy can be directly converted to a surface density estimate (di = 1/ai), in this case in units of deg−2. It is worth noting that the boxes we considered for the superclusters in our sample span slices in redshift space in the range 0.02 ≤ Δz ≤ 0.07.

In order to identify the galaxy systems, we start by applying the HC method (Sect. 3.2), but only to the Ngal galaxies with densities above a baseline density, dbas, which should be analogous to a background density. In a certain sense, this separates supercluster galaxies from void galaxies (i.e., under-dense regions). This baseline is calculated from the mean density by randomizing the galaxies in each sky-projected area. Since the distribution of points in space is not isotropic, it is not possible to directly set a background density from the projected positions of the galaxies. Therefore, it is necessary to simulate an isotropic distribution of the points in order to set the baseline value (see, e.g., Cybulski et al. 2014). A set of 1000 randomizations of the point positions is generated, each with the same sample number over the same area. The mean surface density is then calculated using:

(7)

where corresponds to the inverse of the area of the point for the randomization j.

Since the distribution of galaxies is not homogeneous among the different boxes, we calculate independent baseline values for each supercluster; see Table 2. A density contrast (δi) is then calculated as:

(8)

Here, the Ngal galaxies to which we apply the HC are the ones with a density contrast, δi >  0 (see Santiago-Bautista et al. 2019, for an evaluation of the negligible effect of slightly changing the density threshold).

Subsequently, we apply HC to the set of parameters (RA, Dec, 1000 z) for these galaxies (the factor of 1000 is a weight that makes the z values comparable to the sky coordinate values). The number of groups taken from the analysis is defined by a cut of the HC tree, fixed to Ncut = Ngal/f, where the segmentation parameter f is the expected mean number of elements per group. The selection of the optimal number of groups in clustering methods, including the HC algorithm, is still a topic under investigation in the pattern-recognition community. A specific value of f was calculated for each supercluster (3 ≤ f ≤ 36) according to the optimization process described in Santiago-Bautista et al. (2019). This strategy was adopted because a physically motivated value for f would depend on many parameters, such as the density of galaxies in each box, the sampling of these galaxies with respect to the real distribution, and the redshift, among others, which are difficult to estimate for our data.

Finally, we select only those systems with a number of galaxies, Nj, larger than two. These pre-identified systems are then subjected to the next step of refinement: the iterative estimation of the dynamical parameters, virial mass and radius.

4.2. Finger of God correction

After identifying the galaxy systems, we proceed to refine the galaxy membership and correct the galaxy positions for the FoG effect using a virial approximation. We apply a simplified version of the algorithm presented by Biviano et al. (2006) for the estimation of the virial mass and radius. We do not apply the surface pressure term correction based on the concentration parameter. Omitting this correction can lead to an overestimation of the virial radius, but for this geometric analysis a virial approximation is sufficient.

The virial-parameter-calculation algorithm works as follows: First we take the projected center and mean velocity of the system from the results of HC (Cj). The projected center is then set at the position of the brightest r-band magnitude member galaxy (BMG) within 1 σj from the HC center, while the HC mean velocity is used directly. Those galaxies that are expected to belong to the system are selected among all galaxies in the sample (those with spectroscopic redshift in SDSS-DR13) that are projected inside a cylinder of radius Ra (hereafter referred to as the aperture). Biviano et al. (2006) show that the dynamical analyses are similar for different aperture sizes. We chose an aperture of Ra = Mpc. Subsequently, for the line-of-sight direction, we select galaxies with a difference in velocity of up to Sa = ±3000 km s−1 with respect to the mean cluster velocity. This would correspond to three times the velocity dispersion of a rich cluster. A robust estimation of mean velocity, vLOS, and velocity dispersion, σv, for the galaxies inside the cylinder is obtained using Tukey’s biweight method (Beers et al. 1990). An approximation of the mass Ma in the aperture is computed as:

$$M_a = \frac{3\pi}{2}\,\frac{\sigma_v^{2}\,R_h}{G} \tag{9}$$

where G is the gravitational constant, 3π/2 is the deprojection factor, and Rh is the projected harmonic radius.

We calculate the virial radius, Rvir, by assuming a spherical model for nonlinear collapse, that is, by taking the virialization density as ρvir = 18π²[3H²(z)]/[8πG], and Ma as an estimate of Mvir. We thus obtain

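$$R_{\rm vir}^{3} = \frac{3}{4\pi}\,\frac{M_{\rm vir}}{\rho_{\rm vir}} = \frac{\sigma_v^{2}\,R_h}{6\pi H^{2}(z)} \tag{10}$$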

The aperture Ra is then updated to the calculated Rvir value, the mean velocity to vLOS, and Sa to σv, defining a new cylinder. This process is repeated iteratively until the radius Rvir converges. Mvir is finally calculated at the end of the iteration process.
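For orientation, the mass and radius estimates used at each iteration can be evaluated numerically as in the sketch below. It assumes the virial relation Ma = (3π/2) σv² Rh / G and a sphere of mean density ρvir = 18π² [3H²(z)]/[8πG], as described in the text. The function names, the unit system, and the example values (σv = 1000 km s−1, Rh = 1 Mpc, H = 70 km s−1 Mpc−1) are illustrative assumptions, and the iterative member selection itself is not reproduced.

```python
import math

# Gravitational constant in Mpc (km/s)^2 / Msun.
G = 4.30091e-9

def aperture_mass(sigma_v, r_h):
    """Virial mass estimate (Msun) from the line-of-sight velocity
    dispersion sigma_v (km/s) and the projected harmonic radius r_h (Mpc)."""
    return 1.5 * math.pi * sigma_v ** 2 * r_h / G

def virial_density(hz):
    """Virialization density (Msun / Mpc^3) for a Hubble parameter hz (km/s/Mpc)."""
    rho_crit = 3.0 * hz ** 2 / (8.0 * math.pi * G)
    return 18.0 * math.pi ** 2 * rho_crit

def virial_radius(mass, hz):
    """Radius (Mpc) of the sphere of mass `mass` whose mean density is rho_vir."""
    return (3.0 * mass / (4.0 * math.pi * virial_density(hz))) ** (1.0 / 3.0)

m_a = aperture_mass(1000.0, 1.0)   # about 1.1e15 Msun
r_vir = virial_radius(m_a, 70.0)   # about 2.2 Mpc
```

Substituting the mass expression into the radius expression cancels G, so that Rvir³ = σv² Rh / [6π H²(z)]; the radius therefore depends only on the velocity dispersion, the harmonic radius, and H(z).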

The correction for the FoG effect is carried out by adjusting the position of the Nmem galaxies inside the final cylinder. This is done by scaling their comoving distances along the cylinder to the calculated virial radius. A schematic representation of the GSyF algorithm, including the FoG correction, can be found on the left side of Fig. 2.
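One possible reading of this rescaling is sketched below. It assumes a linear compression of the line-of-sight offsets about the system distance, so that the most distant member ends up at Rvir from the center; the exact scaling used in this work may differ, and the function name is hypothetical.

```python
def fog_correct(distances, d_center, r_vir):
    """Compress line-of-sight comoving distances (Mpc) about d_center so
    that the members span at most +/- r_vir (assumed linear rescaling)."""
    if not distances:
        return []
    offsets = [d - d_center for d in distances]
    extent = max(abs(o) for o in offsets)
    if extent <= r_vir:
        # Already within the virial radius along the line of sight.
        return list(distances)
    scale = r_vir / extent
    return [d_center + o * scale for o in offsets]

# Members stretched over 25 Mpc along the line of sight, Rvir = 1.5 Mpc.
corrected = fog_correct([240.0, 250.0, 265.0], d_center=250.0, r_vir=1.5)
```

The relative ordering of the galaxies along the line of sight is preserved; only the elongation produced by the peculiar velocities is removed.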

Fig. 2.

Flow chart of the GSyF (left side) and GFiF (right side) algorithms.

5. Galaxy Filaments skeleton-Finding algorithm (GFiF)

5.1. Detection of low-density regions

As we are interested in the detection and analysis of elongated and low-relative-density contrast structures, we apply again a combined VT+HC method to the data, but now in the rectangular 3D space, with the positions of the galaxies corrected for the FoG effect. Thus, the VT densities are now volume densities, in units of Mpc−3. At the beginning of this analysis the HC method is applied to all galaxies in the volume without density restrictions, that is, no baseline is applied. Density restrictions are considered later as criteria for the construction of filaments. Another difference between this application of VT+HC and the one used for the GSyF methodology is a relaxed cut in the hierarchical tree. Since we are interested in detecting more elongated representative structures, we tested values for the segmentation parameter f between 10 and 40. The direct effect of relaxing the cut is to allow the detection of groups at lower densities.

Here we need to make some practical definitions in order to describe our strategy. Figure 3 shows the following definitions schematically.

Fig. 3.

Representation of a filament. Graph nodes are represented by white circles and edges by dark lines. The five systems connected are represented by a dotted circle of radius Rvir. A bridge connecting two systems is represented as a bold black line. The distance from galaxies to the filament (bold dashed line) is measured along a line perpendicular to the edges.

– The nodes to which we apply the method correspond, in the context in which we are working, to the HC group centroids.

– An edge is defined as any connection between two nodes.

– The real “links” between the systems are defined as the most promising edges, filtered according to their proximity and density contrast.

– Spanning trees are extracted as described in Sect. 3.4, by cutting the graph into optimal acyclic trees. Some nodes inside a spanning tree may have been detected as galaxy systems by the GSyF algorithm.

– A “bridge” is defined as a sequence of links and nodes between two systems.

– A “filament” is identified if a spanning tree contains three or more systems connected by bridges.

– If the spanning tree contains zero or only one system, it is called a “tendril”.

– The “skeleton” is the medial line of a filament. The method for finding it, which intends to reduce the dimensionality of the objects (in our case, galaxy filaments), is known as “skeletonization”.
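The definitions above amount to a simple classification of each spanning tree by the number of GSyF systems it contains. A minimal sketch follows, assuming that trees and systems are given as sets of node labels; the label "isolated bridge" for a tree with exactly two systems follows the usage in Sect. 5.2, and the function name is hypothetical.

```python
def classify_tree(tree_nodes, system_nodes):
    """Classify a spanning tree by the number of galaxy systems among its nodes."""
    n_systems = len(set(tree_nodes) & set(system_nodes))
    if n_systems >= 3:
        return "filament"
    if n_systems == 2:
        return "isolated bridge"
    return "tendril"  # zero or only one system

label = classify_tree({1, 2, 3, 4, 5}, {2, 4, 5, 9})
```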

5.2. Chaining the filaments

Once we have applied HC, we measure the Euclidean distance DE of the centroid (node) of each group against all its group neighbors. These connections (edges) can be represented by an undirected graph as described in Sect. 3.3. The weights W of the edges are set by the Bhattacharyya coefficient, BC, defined as:

$$BC = \int \sqrt{P_1(x)\,P_2(x)}\;\mathrm{d}x \tag{11}$$

The Bhattacharyya coefficient quantifies the amount of overlap between two distributions P1(x) and P2(x). Thus, the orientation of the two groups weights the connection between them.
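To make the behavior of this coefficient concrete, the sketch below evaluates it for two one-dimensional Gaussian distributions, both with the known closed form and by direct numerical integration of the square root of the product of the two densities. Modeling the groups as 1D Gaussians is an assumption made only for this illustration, and the function names are hypothetical.

```python
import math

def normal_pdf(mu, sigma):
    """Return the density function of a 1D Gaussian."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return lambda x: math.exp(-0.5 * ((x - mu) / sigma) ** 2) / norm

def bc_gaussians_1d(mu1, s1, mu2, s2):
    """Closed-form Bhattacharyya coefficient of two 1D Gaussians."""
    var_sum = s1 * s1 + s2 * s2
    d_b = (0.25 * (mu1 - mu2) ** 2 / var_sum
           + 0.5 * math.log(var_sum / (2.0 * s1 * s2)))
    return math.exp(-d_b)

def bc_numeric(p1, p2, lo, hi, n=20000):
    """Midpoint-rule integral of sqrt(p1 * p2) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += math.sqrt(p1(x) * p2(x))
    return total * h

bc_closed = bc_gaussians_1d(0.0, 1.0, 2.0, 1.5)
bc_integral = bc_numeric(normal_pdf(0.0, 1.0), normal_pdf(2.0, 1.5), -20.0, 25.0)
```

The coefficient equals 1 for identical distributions and tends to 0 as the distributions separate, so it acts as a similarity measure between neighboring groups.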

In the next step, we filter the edges by two criteria: First we select the edges corresponding to a DE smaller than a threshold, Dmax (hereafter, linking length). Secondly, we consider an edge as a real link of galaxies based on the following: (i) we define a cylinder along the edge with a radius of 1 Mpc; (ii) we measure the linear density of galaxies along the cylinder; and (iii) if the mean linear density of the cylinder is above d = N/V (Table 2) we take the cylinder as a link of galaxies connecting the two nodes.
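The two filters can be sketched as follows. The example assumes 3D Cartesian positions in Mpc and compares the number of galaxies per unit cylinder volume with the threshold, since the threshold is quoted as d = N/V; this interpretation, the function names, and the toy values are assumptions.

```python
import math

def in_cylinder(p, a, b, radius):
    """True if point p lies inside the cylinder of given radius with axis a-b."""
    ab = [bc - ac for ac, bc in zip(a, b)]
    ap = [pc - ac for ac, pc in zip(a, p)]
    ab2 = sum(c * c for c in ab)
    t = sum(x * y for x, y in zip(ap, ab)) / ab2
    if t < 0.0 or t > 1.0:
        return False  # beyond one of the two end caps
    foot = [ac + t * c for ac, c in zip(a, ab)]
    return math.dist(p, foot) <= radius

def is_link(a, b, galaxies, d_max, d_min, radius=1.0):
    """Apply both filters to the edge a-b: length <= d_max, then mean
    galaxy density inside the cylinder above d_min (gal / Mpc^3)."""
    length = math.dist(a, b)
    if length == 0.0 or length > d_max:
        return False
    n_in = sum(1 for g in galaxies if in_cylinder(g, a, b, radius))
    density = n_in / (math.pi * radius ** 2 * length)
    return density > d_min

node_a, node_b = (0.0, 0.0, 0.0), (4.0, 0.0, 0.0)
gals = [(1.0, 0.5, 0.0), (2.0, 0.0, 0.9), (3.0, 2.0, 0.0), (5.0, 0.0, 0.0)]
linked = is_link(node_a, node_b, gals, d_max=8.0, d_min=0.008)
```

In this toy case two of the four galaxies fall inside the 1 Mpc cylinder, which gives a density well above the threshold, so the edge is kept as a link.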

Each ensemble of connected links is a tree in the forest graph. We then apply Kruskal’s MST technique (Sect. 3.4) to the forest graph to identify independent trees and their dominant branches. To proceed we need to match the list of detected spanning trees with the list of detected GSyF systems. However, because the number of sampled galaxies decreases with increasing redshift (see Fig. 1), the richness of the detected systems depends on the redshift. In other words, to obtain comparable richness for two similar systems, for instance one at z = 0.03 and another at z = 0.13, a correcting factor has to be applied to the richness of the second one. To overcome this limitation, we apply the following lower limit for the richness of the systems at the supercluster redshift: log10 Nmin = a log10 z + b, with a = −1.0 and b = −0.2. This leads to a lower richness limit that ranges from Nmin = 30 galaxies per system for the nearest supercluster in our sample to Nmin = 5 for the farthest.
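Two steps of this paragraph lend themselves to a short sketch: the extraction of independent trees with Kruskal's algorithm and the redshift-dependent richness limit. The example assumes integer node labels and one cost per edge, where a smaller cost is preferred; how the BC weight is mapped onto such a cost is not specified here, and the redshifts used in the example are illustrative.

```python
import math

def kruskal_forest(n_nodes, edges):
    """Minimum spanning forest with Kruskal's algorithm.

    edges : list of (cost, u, v) with integer node labels in [0, n_nodes).
    Returns one list of kept edges per tree (isolated nodes are omitted).
    """
    parent = list(range(n_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept = []
    for cost, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so no cycle is created
            parent[ru] = rv
            kept.append((cost, u, v))

    trees = {}
    for cost, u, v in kept:
        trees.setdefault(find(u), []).append((cost, u, v))
    return list(trees.values())

def min_richness(z, a=-1.0, b=-0.2):
    """Lower richness limit from log10 Nmin = a log10 z + b."""
    return 10.0 ** (a * math.log10(z) + b)

edges = [(1.0, 0, 1), (2.0, 1, 2), (3.0, 0, 2), (1.5, 3, 4)]
forest = kruskal_forest(6, edges)
n_near, n_far = min_richness(0.02), min_richness(0.13)
```

With the quoted coefficients the limit is about 32 galaxies at z = 0.02 and about 5 at z = 0.13, consistent with the range given in the text.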

Now that we have the systems, and the bridges between them (instead of the nodes and edges in the previous step), we can identify the filaments. As stated above, and following the definition by Chow-Martínez et al. (in prep.), we search for the filaments which have at least three galaxy systems connected by bridges.

Although isolated bridges (i.e., connecting only one pair of systems) and tendrils (connections between nodes with no system embedded) are important and are also a byproduct of the algorithm, we focus our discussion hereafter on the filaments. The connecting edges of these filaments are then refined using Dijkstra’s algorithm (Sect. 3.5). This refinement allows the identification of the filament skeleton, that is, the principal branch connection. According to the pattern-recognition literature, a skeleton represents the principal features of an object such as topology, geometry, orientation, and scale. Figure 4 presents the steps of the GFiF algorithm schematically.
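A minimal sketch of this refinement is given below. It assumes that the skeleton is the longest of the shortest paths between any two systems of the filament, in line with the later description of the skeleton as the path connecting the farthest systems. The graph representation (an adjacency dictionary with edge lengths in Mpc) and the function names are assumptions.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances and predecessors from source.

    adj : dict mapping a node to a list of (neighbor, edge_length) pairs.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def skeleton(adj, systems):
    """Longest shortest path between two systems: (length, node path)."""
    best_len, best_s, best_t, best_prev = -1.0, None, None, None
    for s in systems:
        dist, prev = dijkstra(adj, s)
        for t in systems:
            if t != s and t in dist and dist[t] > best_len:
                best_len, best_s, best_t, best_prev = dist[t], s, t, prev
    if best_s is None:
        return 0.0, []
    path = [best_t]
    while path[-1] != best_s:
        path.append(best_prev[path[-1]])
    return best_len, path[::-1]

# Toy tree with edge lengths in Mpc; nodes 0, 2 and 4 host systems.
adj = {
    0: [(1, 5.0)],
    1: [(0, 5.0), (2, 4.0), (3, 6.0)],
    2: [(1, 4.0)],
    3: [(1, 6.0), (4, 3.0)],
    4: [(3, 3.0)],
}
length, path = skeleton(adj, [0, 2, 4])
```

For this toy tree the skeleton runs through nodes 0, 1, 3, and 4 with a length of 14 Mpc, while the branch toward node 2 is left out of the principal path.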

Fig. 4.

Illustration of the steps of the GFiF algorithm. In the first box (top-left) one can see the distribution of galaxies. In the second one (top-right) the HC groups are marked, with denser red colors representing the richer HC groups. The filtered edges (links) among the groups of the spanning tree are displayed in the third box (bottom-left). The last box (bottom-right) presents the systems (green circles), bridges (brown lines), and other links (blue lines) found among the groups of the preceding step.

The results of the filament-finding algorithm depend on several parameters, in particular the number of HC groups, Ncut (or, equivalently, f), and the linking length Dmax. Therefore, it is necessary to carry out a search for the optimal combination of these parameters. In addition, in order to find the longest filaments possible inside the supercluster volume, we search for the linking length that maximizes the number of filaments in the box, that is, the value just before the filaments begin to percolate. The optimization for these parameters is described in detail in Santiago-Bautista et al. (2019). We found that the optimized parameter f decreases with z from about 20 to 10 galaxies in the range covered by our sample, that is, depending on the sampling, a smaller density is found with increasing z. The Dmax, in turn, increases with z, in such a way as to compensate for the decrease in f. A schematic representation of the GFiF algorithm can be found on the right side of Fig. 2.

6. Detection algorithms in action

In order to illustrate the function of the detection algorithms presented above, we now describe their application to one of the superclusters in our sample, MSCC 310, the Ursa-Majoris Supercluster (see Tables 1 and 2). This supercluster contains 21 Abell clusters with redshifts in the range from 0.05 to 0.08 and is one of the largest in volume in our sample. MSCC 310 occupies an area in the sky of about 1700 deg2, equivalent to a volume of ( Mpc)3 – including the 20 Mpc added to the box limits from the farthest clusters.

The volume contains N = 12 286 SDSS galaxies with spectroscopic redshift. This corresponds to a mean surface density of 76.9 gal.deg−2 or 0.008 gal.Mpc−3; see Table 2. The list of parameter values used is shown in Table 3.

Table 3.

Glossary of parameters used by GSyF and GFiF algorithms.

6.1. Application of GSyF to MSCC 310

First we applied the VT algorithm to the projected distribution of MSCC 310 galaxies and calculated their di. We then ran 1000 randomizations to estimate dbas (15.7 deg−2 in this case). We applied the HC algorithm to the Ngal = 7529 galaxies with δi >  0. After calculating the best f parameter from the 30 mock simulations (f = 6 in this case), we took the Ncut = 1140 groups generated from the HC application. As expected, these groups have approximately six members on average. Of these, we retained 1015 with Nj ≥ 3.

The iterative virial refinement was initialized by assigning the center of each HC group to the brightest r-band galaxy member close to its geometrical centroid (see Table 4). For MSCC 310 groups, the mean difference between the geometrical center and the projected position of the brightest galaxy was found to be about 350 kpc. On average, virial refinement required six iterations in order to produce convergence to the virial radius. This refinement resulted in 122 systems with Nmem ≥ 10 for the MSCC 310 volume. The refinement also detected 113 smaller systems with 5 ≤ Nmem <  10.

Table 4.

Main properties of the 25 richest systems identified in the volume of the supercluster MSCC 310.

In Table 4 we list the properties of the 25 richest systems for the MSCC 310 supercluster. The range of virial radii of the GSyF systems with Nmem ≥ 10 in MSCC 310 was Mpc. For groups with 5 ≤ Nmem <  10, the range of virial radius lies within Mpc. After the refinement, the projected central position of the systems changed by 170 kpc on average, while the redshift was refined in some cases by up to Δz ∼ 0.001 or Δσv ∼ 300 km s−1.

As an example, the richest system in MSCC 310 is the cluster A1291 A. Its HC initial centroid position (set as the position of the brightest galaxy in the HC group: α = 172.73, δ = 56.49 and z = 0.0611) changed by 13 Mpc after 17 iterations of the virial refinement (the final centroid position corresponds to α = 173.01, δ = 56.09, z = 0.0535). This position is at 240 kpc from the system’s brightest galaxy detected for A1291 A which has coordinates (α = 173.05, δ = 56.05, z = 0.0585; Lauer et al. 2014).

Finally, as described in Sect. 4.2, we correct the DC of the member galaxies in each system by re-scaling their dispersion range to the Rvir of the system. An example of the MSCC 310 volume before and after the correction is shown in Fig. 5.

Fig. 5.

Three-dimensional distribution of galaxies for the MSCC 310 supercluster volume. Top: Galaxy positions before the application of the FoG correction. Bottom: Galaxy positions after correction for FoG effects. The colors represent density as calculated from 3D VT. The highest density is represented in red while lower densities are represented in green to blue.

6.2. Application of GFiF to MSCC 310

With the co-moving distances for the MSCC 310 galaxies corrected for the FoG effect, we proceeded to transform their sky coordinates to rectangular ones following Eqs. (1)–(3). The Ngal was now taken to be the total number of galaxies in the box of MSCC 310, N = 12 286, for which we applied the GFiF method. The VT algorithm was then applied to calculate volumetric numerical densities. We used a segmentation parameter of f = 16 for the HC algorithm. From that, we identified 768 low-density groups and for each pair we calculated the DE distance between centers and the BC weight. As expected, the implementation of the HC algorithm over all galaxies detected larger groups (∼15 galaxies on average now) and those were more elongated, with a mean σj of 1.8 Mpc compared with the mean σj of 0.5 Mpc found with the application of GSyF.

In order to filter the connections, a linking length of Dmax = 8 Mpc was used resulting in 334 edges. As described above, Dmax and f were obtained by the optimization process described in Santiago-Bautista et al. (2019). The second filter, the minimum mean linear density along the edge cylinders (in this case 0.008 gal.Mpc−3), left 273 links from the 316 connections smaller than Dmax. This resulted in 34 trees, to which we applied MST. Of these, only 9 are linking three or more systems of galaxies with a richness Nmem above 11 galaxies, and the rest are isolated bridges and tendrils. This result is shown in the dendrogram depicted in Fig. 6 (top panel) which shows nine dominant filaments for the MSCC 310 supercluster.

Fig. 6.

Results of the GFiF algorithm for the MSCC 310 supercluster volume. Upper panel: dendrogram with the nine detected filaments represented by different colors. The y axis of the dendrogram plot indicates the distance at each level of the tree. Lower panel: RA × Z distribution, where SDSS galaxies are represented by gray points and filaments are represented by lines according to the colors in the upper panel. Tendrils are represented by gray lines.

Concerning the systems embedded in the structures, from the 359 HC groups (nodes) in the spanning trees, 116 were found to have a match in the systems with Nmem ≥ 10 identified with GSyF. From these, 61 were found to be in filaments (53%), 26 (22%) in bridges between pairs of systems, and 29 (25%) were found to not be connected by bridges, that is, they were relatively isolated. The filaments detected by the GFiF algorithm in the MSCC 310 supercluster and their main properties are listed in Table 5.

Table 5.

Main properties of the filaments extracted through GFiF for the supercluster MSCC 310.

We can also observe the filaments inside the MSCC 310 volume in the bottom panel of Fig. 6. In this panel, the nine filaments are plotted over the distribution of galaxies in a RA [deg]  ×  Z [Mpc] plane. This projection allows the recognition of structures both in one of the coordinates of the sky plane and depth. Filaments are depicted in the same colors in both panels of this figure. Isolated bridges, which connect two systems alone without forming a filament, are represented only in the bottom panel and by black lines. Tendrils are not represented to avoid crowding. The longest paths for the filament skeletons, that is, those that connect the farthest systems of each filament, range from 18 to 62 Mpc and connect up to 11 systems inside the MSCC 310 volume. Moreover, we measured the paths between pairs of systems chained together by bridges; such distances range from 5 to 24 Mpc.

7. Validation of the methods

7.1. Checking the identified systems of galaxies

In order to validate our GSyF algorithm we compared the list of identified systems to different cluster and group catalogs in the region of SDSS. For MSCC 310, for instance, GSyF detected 122 systems with ten or more galaxies and another 113 systems with 5 ≤ Nmem <  10. A match was considered positive if the projected positions of the system in the two compared catalogs were found to be within 1 Mpc of one another, while in redshift space we considered a difference Δz = 0.007 which corresponds to ±2100 km s−1.
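The matching criterion can be sketched as follows. The example assumes a low-redshift Hubble-law distance, D ≈ cz/H0 with H0 = 70 km s−1 Mpc−1, to convert angular into projected separations; this conversion, the value of H0, the function names, and the catalog entries used in the example are assumptions made for illustration.

```python
import math

C_KMS = 299792.458  # speed of light in km/s

def angular_sep_rad(ra1, dec1, ra2, dec2):
    """Angular separation (radians) from the haversine formula; inputs in degrees."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2.0) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2.0) ** 2)
    return 2.0 * math.asin(min(1.0, math.sqrt(h)))

def is_match(sys_a, sys_b, aperture_mpc=1.0, dz_max=0.007, h0=70.0):
    """True if two (RA, Dec, z) entries lie within the tolerance cylinder."""
    ra1, dec1, z1 = sys_a
    ra2, dec2, z2 = sys_b
    if abs(z1 - z2) > dz_max:
        return False
    # Assumed low-z distance at the mean redshift of the pair (Mpc).
    dist = C_KMS * 0.5 * (z1 + z2) / h0
    sep_mpc = angular_sep_rad(ra1, dec1, ra2, dec2) * dist
    return sep_mpc <= aperture_mpc

matched = is_match((173.00, 56.10, 0.055), (173.05, 56.05, 0.058))
```

The redshift tolerance of 0.007 corresponds to about 2100 km s−1, as stated above, and the aperture can be widened to 2 Mpc by passing aperture_mpc=2.0.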

For the rich clusters, we first compared our results to the original Abell/ACO catalog (Abell et al. 1989), using the most recent parameter measurements for its clusters (e.g., Chow-Martinez et al. 2014). We also compared the detected systems against the central galaxy positions provided by the Brightest Cluster Galaxy catalog (Lauer et al. 2014, hereafter L14). Among the catalogs based on the SDSS spectroscopic sample, we compared with the C4 cluster catalog (Miller et al. 2005), which is based on SDSS-DR2. For less rich clusters and groups we compared our systems with the Multi-scale Probability Mapping clusters/groups catalog (MSPM, Smith et al. 2012) and the Tempel et al. (2012) catalog (hereafter T11), constructed from SDSS-DR7 and SDSS-DR8, respectively. For the comparisons we used all systems detected by GSyF down to a richness of five galaxies.

Using the tolerance cylinder described above, 19 of the 37 Abell/ACO clusters inside the MSCC 310 box were detected as systems of richness above five galaxies with our method (51%), while the equivalent number was 26 (76%) for the 34 clusters in C4. There are 11 clusters in the L14 catalog embedded in the volume and 8 (73%) of them have GSyF counterparts. However, by increasing the aperture to 2 Mpc, we increased the detection of Abell clusters to 29/37 (78%), C4 clusters to 33/34 (97%) and the L14 catalog to 100%; see Table 6. The increase of 20 − 30% in cluster matches obtained with the larger aperture can be related to the fact that the mean separation of member galaxies increases for lower richness systems, and the determination of the cluster center is then subject to this separation. For example, A1452 and A1507 B have a GSyF counterpart located at ∼1.5 Mpc projected distance and Δσv of ∼630 km s−1 and 120 km s−1 respectively (see Table 4, systems No. 12 and 14), while their C4 counterparts are 0.7 and 0.4 Mpc away respectively.

Table 6.

GSyF systems detected by other catalogs for the MSCC 310 supercluster.

Regarding the less massive systems, there are 315 groups detected by T11 and 213 groups listed in the MSPM catalog with richness larger than or equal to five galaxies for the MSCC 310 volume. Our algorithm detected systems that correspond to 61% (79%) of the T11 groups and 67% (78%) of the MSPM groups, within an aperture of 1 Mpc (2 Mpc); see Table 6. This is acceptable for our purposes because we have constructed GSyF to find the clusters that present the FoG effect, although GSyF can clearly be extended towards poorer systems.

The region of the UMa supercluster was studied by Krause et al. (2013). These authors identified 31 galaxy systems in the MSCC 310 area, each with between 15 and 94 galaxies. We found that our GSyF systems match with 24 (77%) of these clusters within an aperture of 3 Mpc, of which 10 are Abell clusters.

The systems detected in the main portion of the MSCC 310 supercluster are depicted on a sky projected distribution in Fig. 7 (top panel) and are represented by black circles with radii equal to the measured virial radius. The system positions from the Abell, C4, L14, MSPM, and T11 catalogs are depicted as red, pink, cyan, blue, and green points, respectively. We also observe that the system membership number detected by GSyF is in agreement, for most of the cases, with the number of members for the same systems detected by T11, C4, and MSPM (see Fig. 8). Qualitatively one can observe in this figure that the richness from T11 is in better agreement with our measurements, while MSPM estimates a richness slightly lower than both ours and T11.

Fig. 7.

Projected distribution in the sky of systems detected by GSyF. Top: systems detected for the MSCC 310 (UMa supercluster). Bottom: systems detected for the MSCC 295 (Coma supercluster). The system radii are shown as circles of r = Rvir. For comparison, the positions of systems reported by MSPM, T11, C4, L14, and Abell catalogs are depicted by color points: blue, green, pink, cyan, and red, respectively.

Fig. 8.

Comparison of MSCC 310 supercluster GSyF system richness against the richness measured by other catalogs for the matching systems. Symbol colors are the same as the ones in Fig. 7. The dashed line represents the identity.

A similar analysis can be done for the other superclusters in our sample. For example, for the Coma supercluster (MSCC 295, Fig. 7, bottom panel), the GSyF algorithm detected 115 systems in total. Of these, we found that A1656, the richest one, is composed of 579 galaxies. The estimated virial radius and mass are Mpc and 7.7 × 10¹⁴ M⊙, respectively. The second-richest cluster, A1367, has 243 galaxies, while its radius and mass are Mpc and 5.3 × 10¹⁴ M⊙, respectively. These estimations are in good agreement with those measured by Rines et al. (2003). The complete catalog of systems for each volume is available online2.

7.2. Checking the filament skeletons

We compared the filaments obtained using the GFiF algorithm for the MSCC 310 volume with those presented by Tempel et al. (2014, hereafter T14) as extracted from their Table 2. We transformed the T14 filament positions in survey coordinates (see T14, Eq. (1)) to our rectangular space and cosmology. There are about 630 T14 filaments that lie in the sampled volume of the MSCC 310 supercluster. These filaments have a mean length of 9 Mpc while the largest one has a length of 48 Mpc. As a comparison, the filament skeletons detected by GFiF have a mean length of 42 Mpc and the largest one has a length of 62 Mpc. We found a 40% match between our detected filaments and T14 and an 80% match with our isolated bridges and tendrils. The mean difference between the medial axis of the T14 filaments matching the nearest filament or tendril detected by us is ∼1.5 Mpc. In Fig. 9, T14 filaments are represented by a sequence of points forming a line. The calculated separation was taken to be the distance from the T14 filament points to the edges of our filaments. Our filaments are depicted over T14 filaments in this figure. As can be seen, GFiF detects the most prominent (dense) filaments among the ones in T14.

Fig. 9.

Comparison of GFiF filaments for the MSCC 310 supercluster to the T14 filaments in the same region for the SDSS-DR8. Gray lines are T14 filaments. Colored lines depict filaments identified in this work.

7.3. Comparison with KDE density maps

In order to validate the results from the GSyF and GFiF algorithms, and specifically to corroborate the densities, we performed an independent analysis of the galaxies in the MSCC 310 volume using the KDE method, as described in Sect. 3.6. A quantitative comparison in 3D space between the KDE and the skeleton structures is left for future work. Here, we restricted our analysis to 2D projections (density maps) of the 3D KDE (XY, XZ, YZ). For this analysis we used kernels of 1 Σ in size (see Sect. 3.6). Since each kernel is created based on the VT cell, we used dbas as a baseline density. We selected those regions for which dKDE >  dbas in the RA × Dec projected density map. Subsequently, we compared the position of the density peak of each region against the centroids of the 122 GSyF systems. We found that 93 GSyF systems with Nj ≥ 10 (76%) match density peaks above 3 dbas. The remaining 29 GSyF systems (24%) are identified with density peaks in the range (1 − 3) dbas. Moreover, we observe that the filament edges connect these density peaks, forming chains of overdensity regions. Figure 10 (top panel) shows the systems detected by GSyF represented by circles of r = Rvir over the galaxy density distribution as measured using KDE in a RA × Dec projection. The bottom panel of Fig. 10 shows the filaments overlaid on the KDE density map for the MSCC 310 volume. The density maps are set in terms of the mean number density.

Fig. 10.

Top: RA × Dec projected density map as measured from 3D-KDE with 1Σ in terms of the density contrast. The GSyF systems are represented by white circles with radius scaled to the estimated Rvir. Bottom: RA × Z projection. The filaments detected in this work are overlaid in color. Density is represented following the color scale displayed on the right, where denser regions are redder and less dense zones are bluer.

8. Filament properties

8.1. Main properties of the filaments

In a similar way as described for MSCC 310, we applied the GFiF algorithm to the 46 superclusters of our sample, detecting a total of 144 filaments in 40 superclusters which are listed in Table 7. This table also lists the parameters used or measured by GFiF. The process of filtering the connections can be followed through columns 5–10. The list of detected filaments for all studied supercluster volumes can be consulted in Table A.1, in the same format as in Table 5. The MSCC 75, MSCC 76, MSCC 264, MSCC 441, MSCC 579 and MSCC 586 superclusters have not been evaluated with GFiF due to the sparseness of the SDSS coverage in these sky areas.

Table 7.

Summary of the properties of the filaments detected by GFiF for the superclusters in Table 1.

The filament skeletons detected by GFiF have lengths of between 9 and 130 Mpc. Figure 11 depicts the length distribution for all the detected filaments. The distribution shows that the majority of the structure lengths range from 10 to 40 Mpc. There are two structures longer than 100 Mpc. The first is the filament of 130 Mpc in length located in MSCC 323, which contains the Abell clusters A1449 B and A1532 A, and the second, of 105 Mpc, is located in MSCC 335, which contains A1478 A, A1480 B, and A1486 A. Excluding these two particular cases, we observe that the mean length of the filaments is about 37 Mpc while the median corresponds to 29 Mpc.

Fig. 11.

Distribution of filament skeleton length for the 144 filaments detected by GFiF. The length used corresponds to the longest path between the systems at the extremity of the filament. See Table A.1.

8.2. Distribution of galaxies along the filaments

In order to evaluate the environment within the filaments, we extracted longitudinal profiles of number density. Figure 12 shows the longitudinal distribution of galaxies for all bridges, from one extreme to the other (ending systems), in the supercluster MSCC 310. We observe that the density of galaxies is higher near the ends of the bridges, as expected, and decreases toward the midpoint between systems. We then proceeded to extract density profiles for bridges, from the systems to the midpoint, by counting the galaxies that lie within a cylinder of radius 1 Mpc whose axis is set by the bridge skeleton. The galaxies are counted in slices of size Δd = 0.5 Mpc along the skeleton axes. We also calculated the longitudinal profiles after excluding the galaxies belonging to systems (considered at 1.5 Rvir) from their bridges in order to determine pure filament profiles. These profiles allow us to evaluate the mean density contrast of the filaments as compared with the background density. In Fig. 13 we show the longitudinal number density profile for all filaments detected in our sample. The stacked longitudinal profile including galaxies in systems is depicted by a blue line. The dispersion about the stacked profile is represented by a blue shaded area. The pure profile (excluding the galaxies of the systems) is represented by the red line, and its corresponding dispersion by a red shaded area. As can be seen, the mean density contrast along the filament is ∼10, that is, the filament is about ten times denser than the background.

Fig. 12.

Distribution of galaxies along bridges connecting pairs of systems for the nine MSCC 310 filaments. All bridges are scaled to length 1.0.

Fig. 13.

Longitudinal VT density distribution for galaxies in all bridges of filaments detected by GFiF. Profiles are considered from the system center to the middle of the bridge. The thick blue line depicts the mean longitudinal profile for bridges including galaxies in systems. The thick red line corresponds to the mean longitudinal profile for all filaments excluding galaxies belonging to systems within 1.5 Rvir. Blue and red shaded areas are the dispersion around the stacked profile.

8.3. Transversal density profiles

For the calculation of transversal density profiles, we excluded the galaxies located in systems, that is, within a radius of 1.5 Rvir of them. The density profile is calculated as described in Sect. 3.7. The cylinder radius Rcy was varied from 0 to 10 Mpc in steps of ΔRcy = 0.5 Mpc.

We computed the galaxy number density profile for filaments in two ways. First, we counted the number of galaxies within concentric cylinders and divided it by the volume within the cylinders. We refer to this as the local number density profile. Second, we employed the number densities d_i calculated using the VT, as described in Sect. 3.1, and measured the mean VT number density within concentric cylinders. The local number density and VT number density profiles are scaled to density contrast and stacked together. Figure 14 shows the stacked profile for all filaments detected by GFiF, for local densities (top panel) and VT densities (bottom panel). The first aspect to note is that the local number density profiles are smoother, although in general both mean profiles are similar. In both the local and VT density profiles, the overdensity extends up to 5 Mpc. At about 3 Mpc, the overdensity reaches a value of around 3, while the typical density contrast of 10 is reached closer to 2 Mpc.
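As a rough illustration of the local number density profile, the sketch below counts galaxies inside cylinders of growing radius and divides by the enclosed volume. It assumes that "within concentric cylinders" means cumulative cylinders rather than shells, that the perpendicular distance of each galaxy to the skeleton is already known, and that the background density of the box is given. The name `local_density_profile` and all numbers are hypothetical.

```python
import math


def local_density_profile(distances, skeleton_length, background_density,
                          r_max=10.0, step=0.5):
    """Density contrast inside cylinders of radius step, 2*step, ..., r_max.

    `distances` are perpendicular distances (Mpc) of galaxies to the skeleton,
    `skeleton_length` is in Mpc, `background_density` in galaxies per Mpc^3.
    Returns a list of (radius, density_contrast) pairs.
    """
    n_steps = round(r_max / step)
    profile = []
    for i in range(1, n_steps + 1):
        r = i * step
        inside = sum(1 for d in distances if d <= r)  # cumulative count within r
        volume = math.pi * r * r * skeleton_length    # volume of the cylinder
        profile.append((r, inside / volume / background_density))
    return profile


# Illustrative values only.
distances_mpc = [0.2, 0.4, 0.9, 1.4, 3.0]
profile = local_density_profile(distances_mpc, skeleton_length=10.0,
                                background_density=0.01)
print(profile[0])
```

The VT variant would replace the count-over-volume ratio with the mean of the per-galaxy VT densities inside each cylinder.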

Fig. 14.

Stacked number density profiles for the 144 filaments identified by GFiF. Individual profiles are represented by thin gray lines. Top: the red line corresponds to the mean local density (stacked) profile. Bottom: mean VT density stacked profile. The solid line indicates the mean profile while the shaded area represents the dispersion of the profile. The solid black line depicts the density contrast of 10 × d.

Finally, we used the density profiles to estimate the mean radius of the filaments Rfil. This was achieved by considering the intersection point at which the local density profile crosses the 10 × d line, as indicated in Fig. 14 by the black solid line. The mean radius as well as the mean density of each filament is noted in Table A.1.
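The radius estimate amounts to locating where a decreasing profile crosses a fixed threshold. A minimal sketch, assuming the profile is sampled on a radial grid in units of density contrast and that linear interpolation between neighboring samples is acceptable (the interpolation choice and the name `crossing_radius` are assumptions, not taken from the text):

```python
def crossing_radius(radii, contrast, threshold=10.0):
    """Radius at which `contrast` first drops below `threshold`.

    Uses linear interpolation between the two samples that bracket the
    crossing; returns None if the profile never crosses the threshold.
    """
    for i in range(1, len(radii)):
        above, below = contrast[i - 1], contrast[i]
        if above >= threshold > below:
            frac = (above - threshold) / (above - below)
            return radii[i - 1] + frac * (radii[i] - radii[i - 1])
    return None


# Illustrative profile only: radii in Mpc, values in density contrast.
radii = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
contrast = [40.0, 25.0, 16.0, 12.0, 8.0, 3.0]
r_fil = crossing_radius(radii, contrast)
print(r_fil)
```

Returning `None` for profiles that never reach the threshold keeps such filaments from being assigned a spurious radius.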

Figure 15 (top panel) presents the radii distribution for all the filaments. The filament radii range from 0.6 to 4.5 Mpc with a mean value of 2.4 Mpc. The bottom panel of the figure depicts the filament radius as a function of the filament length. We observe that the filament length does not correlate with the filament radius. However, it is important to note that the radius varies slightly around the mean value along the filament path.

Fig. 15.

Top: distribution of radius of filaments in our sample. Bottom: comparison of filament length and radius for the 144 filaments detected by GFiF. The length used corresponds to the longest path between a pair of systems, that is, the skeleton length. See Table A.1.

9. Properties of galaxies in filaments

9.1. Stellar mass profile

We constructed a galaxy stellar mass profile for all filaments by using the masses from the MPA-JHU group (Brinchmann et al. 2004; Kauffmann et al. 2003; Tremonti et al. 2004) described in Sect. 2.2. First, we weighted the mass by the average mass of the volume under analysis to remove the redshift dependence of the stellar mass (Chen et al. 2017). This weighting is equivalent to a normalization of the stellar mass and allows us to carry out a stacking procedure in order to increase the signal of the profiles. This mass profile is extracted as described in Sect. 3.7.
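The normalization and stacking steps can be sketched as below. The sketch assumes linear (not logarithmic) stellar masses, takes the galaxies passed in as the whole "volume under analysis", and bins by distance to the skeleton; the function names and the sample values are hypothetical.

```python
from statistics import mean


def normalized_mass_profile(galaxies, step=1.0, n_bins=3):
    """Mean normalized stellar mass per bin of distance to the skeleton.

    `galaxies` is a list of (distance_mpc, stellar_mass) pairs. Each mass is
    divided by the mean mass of the whole sample, so a value of 1.0 means
    "average for this volume". Empty bins are returned as None.
    """
    volume_mean = mean(mass for _, mass in galaxies)
    bins = [[] for _ in range(n_bins)]
    for distance, mass in galaxies:
        idx = int(distance // step)
        if idx < n_bins:
            bins[idx].append(mass / volume_mean)
    return [mean(b) if b else None for b in bins]


def stack_profiles(profiles):
    """Average several normalized profiles bin by bin, skipping empty bins."""
    stacked = []
    for values in zip(*profiles):
        valid = [v for v in values if v is not None]
        stacked.append(mean(valid) if valid else None)
    return stacked


# Illustrative values only: (distance in Mpc, stellar mass in solar masses).
filament_1 = [(0.5, 4e10), (1.5, 2e10), (2.5, 1e10), (0.2, 5e10)]
filament_2 = [(0.3, 2e10), (1.2, 2e10)]
profile_1 = normalized_mass_profile(filament_1)
profile_2 = normalized_mass_profile(filament_2)
stacked = stack_profiles([profile_1, profile_2])
print(stacked)
```

Because each filament is normalized by the mean mass of its own volume, filaments at different redshifts can be averaged on the same scale.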

Figure 16 shows the stacked stellar mass profile for all filaments. The variance of the stacked profile is depicted by the error bars. We observe that, statistically, the stellar masses of the filament galaxies are larger than the average mass up to about 2 Mpc, while beyond 3 Mpc they tend to be 10% smaller. This region, farther than 3 Mpc, probably represents the dispersed population of the supercluster associated with the more extended sheet component. Therefore, our results indicate that the stellar mass correlates with the distance to the filament skeleton, and is greater (by up to 25%) near the skeleton than far from it. These results are in good agreement with the results presented by Chen et al. (2017) for the MGS sample from DR7 (Abazajian et al. 2009). Our results are also compatible with those presented by Alpaslan et al. (2016) and Kraljic et al. (2018), for the GAMA spectroscopic survey, who find similar trends for the filaments found at redshifts z < 0.09 and 0.03 ≤ z ≤ 0.25, respectively.

Fig. 16.

Stacked transversal stellar mass profile for the 144 filaments detected by GFiF. Errors correspond to the variance of the stacked profiles.

9.2. Morphological type

To determine whether there is a morphological trend in the population of filament galaxies (as may be expected from the morphology–density relation), we also constructed morphology profiles based on the morphological classifications by Huertas-Company et al. (2011). These classifications sort the galaxies into four morphological types. For our analysis, we used the probability p(Early) = p(E) + p(S0), which classifies galaxies as early type if p(Early) > 0.5 and as late type if p(Early) < 0.5. We then computed the distribution of both galaxy types as a function of the distance to the filament skeleton. The distributions were normalized so that they can be compared and stacked for all filaments in our sample in a similar way to a profile extraction, again excluding galaxies in systems. The result is shown in Fig. 17.
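The classification rule and the resulting profiles can be sketched as follows. The sketch assumes each galaxy comes with its distance to the skeleton and the two probabilities p(E) and p(S0), normalizes each type's distribution to unit sum, and takes the ratio of the normalized fractions; these normalization choices, the name `early_late_profile`, and the sample values are assumptions made for illustration.

```python
def early_late_profile(galaxies, step=1.0, n_bins=4):
    """Normalized early- and late-type distributions versus skeleton distance.

    `galaxies` is a list of (distance_mpc, p_E, p_S0) tuples. A galaxy is
    early type if p_E + p_S0 > 0.5 and late type if p_E + p_S0 < 0.5.
    Returns (early_frac, late_frac, ratio); ratio is None where late_frac is 0.
    """
    early = [0] * n_bins
    late = [0] * n_bins
    for distance, p_e, p_s0 in galaxies:
        idx = int(distance // step)
        if idx >= n_bins:
            continue
        p_early = p_e + p_s0
        if p_early > 0.5:
            early[idx] += 1
        elif p_early < 0.5:
            late[idx] += 1
    n_early, n_late = sum(early), sum(late)
    early_frac = [c / n_early if n_early else 0.0 for c in early]
    late_frac = [c / n_late if n_late else 0.0 for c in late]
    ratio = [e / l if l > 0 else None for e, l in zip(early_frac, late_frac)]
    return early_frac, late_frac, ratio


# Illustrative values only: (distance in Mpc, p(E), p(S0)).
sample = [
    (0.3, 0.6, 0.2), (0.6, 0.3, 0.3), (0.8, 0.1, 0.2),
    (1.5, 0.5, 0.1), (1.7, 0.2, 0.1),
    (2.4, 0.1, 0.1), (2.9, 0.7, 0.0),
    (3.1, 0.2, 0.2),
]
early_frac, late_frac, ratio = early_late_profile(sample)
print(ratio)
```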

Fig. 17.

Top: stacked transversal morphological type profiles for the 144 filaments detected by GFiF. The error bars correspond to the variance of the stacked profiles. Bottom: early-to-late-type ratio as a function of the distance to the filament skeleton.

Our results show that the fraction of early-type galaxies is higher than that of late types near the filament skeleton up to ∼2 Mpc. This effect is more discernible when computed as an early-to-late-type ratio (Fig. 17, bottom panel). We observe that at distances shorter than 2 Mpc, the fraction of early types reaches almost twice the fraction of late types. At greater distances (i.e., towards the dispersed supercluster population) the fractions tend to be similar (E/S ratio ∼1). A two-sample Kolmogorov-Smirnov test applied to the distributions of early and late types in Fig. 17 reveals that, for the first bins, they are significantly different (p-value lower than 0.1). Our results are consistent with those presented by Kuutma et al. (2017) for the Huertas-Company et al. (2011) sample – these latter authors also observe that early-type galaxies are more abundant near the filament skeleton.
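For readers unfamiliar with the two-sample Kolmogorov-Smirnov test, the sketch below computes the statistic (the largest gap between the two empirical cumulative distributions) and an asymptotic p-value. It is a generic illustration, not the exact procedure applied to Fig. 17: the inputs here are hypothetical skeleton distances of early- and late-type galaxies, and the p-value uses a standard large-sample approximation with a small-sample correction factor. In practice a library routine such as `scipy.stats.ks_2samp` would normally be used.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b):
    """Two-sample KS statistic and an asymptotic (approximate) p-value."""
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    d_max = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / n_a  # empirical CDF of sample a at x
        cdf_b = bisect_right(b, x) / n_b  # empirical CDF of sample b at x
        d_max = max(d_max, abs(cdf_a - cdf_b))
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d_max
    if lam < 0.2:
        # The Kolmogorov tail probability is 1 to machine precision here.
        return d_max, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d_max, min(max(2.0 * series, 0.0), 1.0)


# Illustrative distances (Mpc) only.
early_distances = [0.2, 0.4, 0.5, 0.9, 1.1, 1.3]
late_distances = [1.8, 2.2, 2.9, 3.4, 4.1, 4.6]
d_stat, p_value = ks_two_sample(early_distances, late_distances)
print(d_stat, p_value)
```

A small p-value indicates that the two samples are unlikely to be drawn from the same parent distribution.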

9.3. Activity type

We used the activity classification from the MPA-JHU group (Brinchmann et al. 2004; Kauffmann et al. 2003; Tremonti et al. 2004) described in Sect. 2.2 for the analysis with respect to activity type. We computed the distribution of the different galaxy activity populations as a function of the filament skeleton distance. All distributions are normalized for all filaments and stacked together.
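One way to picture the normalization and stacking is sketched below: for each filament, the fraction of each activity class is computed per distance bin, and the fractions are then averaged over filaments together with their dispersion. Normalizing per bin (rather than per class) is an assumption, as are the function names and the sample labels.

```python
from statistics import mean, pstdev

CLASSES = ("AGN", "SFG", "LINER", "unclassified")


def activity_fractions(galaxies, step=1.0, n_bins=2):
    """Per-bin fraction of each activity class for one filament.

    `galaxies` is a list of (distance_mpc, activity_label) pairs. Bins with
    no galaxies get None for every class.
    """
    counts = [{c: 0 for c in CLASSES} for _ in range(n_bins)]
    for distance, label in galaxies:
        idx = int(distance // step)
        if idx < n_bins and label in counts[idx]:
            counts[idx][label] += 1
    fractions = []
    for bin_counts in counts:
        total = sum(bin_counts.values())
        fractions.append({c: (n / total if total else None)
                          for c, n in bin_counts.items()})
    return fractions


def stack_fractions(per_filament):
    """Mean and dispersion of each class fraction across filaments, per bin."""
    stacked = []
    for i in range(len(per_filament[0])):
        row = {}
        for c in CLASSES:
            values = [f[i][c] for f in per_filament if f[i][c] is not None]
            row[c] = (mean(values), pstdev(values)) if values else (None, None)
        stacked.append(row)
    return stacked


# Illustrative values only.
filament_a = [(0.2, "LINER"), (0.5, "unclassified"), (0.7, "unclassified"),
              (0.9, "SFG"), (1.5, "SFG"), (1.6, "AGN")]
filament_b = [(0.1, "unclassified"), (0.4, "LINER"), (1.2, "SFG"), (1.8, "SFG")]
stacked = stack_fractions([activity_fractions(filament_a),
                           activity_fractions(filament_b)])
print(stacked[1]["SFG"])
```

The dispersion returned alongside each mean plays the role of the error bar on the stacked profile.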

Figure 18 divides galaxies into four activity groups: active galactic nuclei (AGNs), star-forming galaxies (SFGs), low-ionization nuclear emission-line region galaxies (LINERs), and inactive galaxies (unclassified). The error bars are not displayed over the lines for clarity; they are very large, so this figure must be interpreted with caution. Another effect to take into account is that the fractions are averaged over all the filaments (at different redshifts). The most evident tendency in these distributions is a decrease in activity as the galaxies “approach” the filament, although the fractions for the dispersed component are particularly noisy. Inside the filaments, the tendency is to have more passive galaxies, implying again smaller fractions of AGNs and SFGs. However, the fraction of LINERs also increases towards the filament skeletons, possibly indicating a post-activity phase for the galaxies. Deeper analyses are necessary to give a clear picture of the effect of the filament environment on the activity of galaxies.

Fig. 18.

Stacked transversal activity type profiles for the 144 filaments detected by GFiF. The black bar on the left represents the typical errors on the stacked profiles, not overlaid for clarity.

10. Conclusions

In this paper we studied the bridges and filaments of galaxies in the environment of superclusters of galaxies. We developed two algorithms, the Galaxy System-Finding algorithm, GSyF, and the Galaxy Filament-Finding algorithm, GFiF, to detect systems of galaxies (clusters and groups), aiming especially to correct for the FoG effect, and to identify the elongated bridges and filaments mentioned above. These algorithms were applied to a sample of SDSS galaxies with spectroscopic redshifts in rectangular boxes enclosing 46 superclusters of galaxies selected from the MSCC catalog in a redshift range from 0.02 to 0.15.

GSyF and GFiF employ a set of different classic pattern-recognition methods. Both of them are probabilistic in the sense that they define systems and filaments as a function of the relative position and orientation of the Gaussian groups, which are detected with a hierarchical clustering method. For GSyF, the membership of the Gaussian groups is refined using a virial approximation, allowing us to discern gravitationally bound systems of galaxies from misdetections. For GFiF, these measurements are used to define a general tree from which we extract independent structures based on density criteria. Although the HC algorithm needs to be optimized for the number of clustering groups, this can be automated based on the density function characterizing the survey. The structures are represented by a filament skeleton that allows us to measure and quantitatively trace the filament path.

We show (Sect. 7.1) that the systems detected by our methodology are in good agreement with those reported in the literature. Specifically, our comparisons of the systems sample with other cluster and group catalogs (Abell, C4, L14, T11 and MSPM) showed a match rate above 78% for groups with richness above five galaxies at redshifts z < 0.11. For systems with richness above ten galaxies the coincidences were slightly higher for the group catalogs (T11 and MSPM) and were slightly lower for the cluster catalogs (Abell and C4). Moreover, the richness, velocity dispersion and virial radius of systems measured by the GSyF algorithm are in good agreement with those reported in other system catalogs. Our GSyF algorithm detected a total of 2705 systems in the rectangular boxes enclosing the volumes of 45 of the superclusters in our sample. Of these, 159 systems with richness above ten galaxies have not yet been reported in the literature (see footnote 3).

We also compared, in Sect. 7.2, the results of our filament-finding algorithm with those of Tempel et al. (2014) for the same regions. We observe that T14 filaments are shorter, more numerous, and describe sparser and finer structures while GFiF detects larger and denser elongated structures that bridge galaxy systems. Our filaments, in some sense, link several T14 threads, forming one larger structure, providing a broader picture of the filament. The comparison with isolated bridges and tendrils (a byproduct of our algorithm) shows a match of 80% with T14 filaments and comparable filament lengths.

The GFiF algorithm detected a total of 144 filaments and 63 isolated bridges in the rectangular boxes enclosing the volumes of 40 of the superclusters in our sample. The supercluster filaments we detected have lengths from 9 up to 130 Mpc (mean 37 Mpc, median 29 Mpc) while the isolated bridges have lengths of between 5 and 15 Mpc. These values are consistent with the median bridge length value from Kraljic et al. (2019), 7.9 Mpc, for the HORIZON-AGN simulation.

For most of the cases, the numerical density inside the filaments was found to be between 5 and 15 times the mean density. The radii of the filament skeletons range from 0.6 up to 4.5 Mpc, with most being between 2 and 3 Mpc. These values are consistent with those found by Cautun et al. (2013) by applying the NEXUS algorithm to an N-body simulation.

We also compared the properties of the galaxies that inhabit the filament as a function of the distance from its skeleton. Our conclusions can be summarized as follows: (i) The transversal local and VT number density profiles for pure filaments show that, at distances of up to 5 Mpc, the filaments have a positive overdensity with respect to the background density inside the boxes. (ii) At distances of about 3 Mpc, the density contrast reaches a value of 3, a limit that matches the range where typically the environmental effects studied in Sect. 9 seem to apply. (iii) The mean density contrast of the filaments, 10, is reached closer to 2 Mpc, a limit that we used as reference for estimating the radius of the filaments.

Our analyses regarding the stellar masses, morphological type, and activity type show that these galaxy properties correlate with the distance from the filament skeleton. We arrive at the following conclusions: (i) Inside 3 Mpc from the filament skeleton the galaxy stellar masses increase up to about 25%. This result leads to two hypotheses: (a) the mass growth of the galaxies is sensitive to the environment, or (b) the dynamical evolution brings massive galaxies into the potential well of the filaments. This result confirms several analyses which suggest that stellar masses are sensitive to the environment (Alpaslan et al. 2015, 2016; Poudel et al. 2016; Chen et al. 2017; Malavasi et al. 2017; Kraljic et al. 2018; Musso et al. 2018). (ii) The early-to-late-type-galaxy ratio has its maximum at the center of the filament and remains above 1:1 up to a distance of 1.5 Mpc. This result is in close agreement with a similar study by Kuutma et al. (2017) for the SDSS. (iii) Concerning the activity type, we observe that the fractions of AGNs and SFGs seem to be higher outside the filaments (in the supercluster dispersed component), showing a decrease as the galaxies approach these structures. Inside the filaments, the fractions of inactive galaxies and LINERs increase, indicating a possible post-activity phase. A similar result for the star-forming galaxies was observed by Kraljic et al. (2018) for the GAMA spectroscopic survey.

The GSyF and GFiF algorithms can be used to search for these kinds of structures in different surveys, using spectroscopic or photometric redshifts. We plan to apply them to other galaxy databases, like the ones that are becoming available for the southern celestial hemisphere, and also, to galaxy surveys that reach deeper redshifts. Both algorithms and catalogs can be obtained electronically upon request.


3 Data on these systems are available at the CDS.

Acknowledgments

I.S.-B. thanks CONACyT and DAIP for funding this research. This project was partially financed by DAIP funding CIIC 205/2019. The authors are grateful to the anonymous referee for the important comments that helped to improve the paper. All the pattern recognition computing, statistics and graphics have been made using MATLAB©. Part of this work was carried out with the computational facility TITAN at the Institut de Recherche en Astrophysique et Planétologie, Toulouse, France. This work has made use of NASA’s Astrophysics Data System Bibliographic Services. Funding for SDSS-III has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, and the US Department of Energy Office of Science. The SDSS-III web site is http://www.sdss3.org/. SDSS-III is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS-III Collaboration including the University of Arizona, the Brazilian Participation Group, Brookhaven National Laboratory, Carnegie Mellon University, University of Florida, the French Participation Group, the German Participation Group, Harvard University, the Instituto de Astrofisica de Canarias, the Michigan State/Notre Dame/JINA Participation Group, Johns Hopkins University, Lawrence Berkeley National Laboratory, Max Planck Institute for Astrophysics, Max Planck Institute for Extraterrestrial Physics, New Mexico State University, New York University, Ohio State University, Pennsylvania State University, University of Portsmouth, Princeton University, the Spanish Participation Group, University of Tokyo, University of Utah, Vanderbilt University, University of Virginia, University of Washington, and Yale University.

References

  1. Abazajian, K. N., Adelman-McCarthy, J. K., Agüeros, M. A., et al. 2009, ApJS, 182, 543
  2. Abell, G. O. 1961, AJ, 66, 607
  3. Abell, G. O., Corwin, H. G., Jr, & Olowin, R. P. 1989, ApJS, 70, 1
  4. Aihara, H., Allende Prieto, C., An, D., et al. 2011, ApJS, 193, 29
  5. Alam, S., Albareti, F. D., Allende Prieto, C., et al. 2015, ApJS, 219, 12
  6. Albareti, F. D., Allende Prieto, C., Almeida, A., et al. 2017, ApJS, 233, 25
  7. Alpaslan, M., Robotham, A. S. G., Obreschkow, D., et al. 2014, MNRAS, 440, L106
  8. Alpaslan, M., Driver, S., Robotham, A. S. G., et al. 2015, MNRAS, 451, 3249
  9. Alpaslan, M., Grootes, M., Marcum, P. M., et al. 2016, MNRAS, 457, 2287
  10. Alvarez, G. E., Randall, S. W., Bourdin, H., Jones, C., & Holley-Bockelmann, K. 2018, ApJ, 858, 44
  11. Aragón-Calvo, M. A., Jones, B. J. T., van de Weygaert, R., & van der Hulst, J. M. 2007, A&A, 474, 315
  12. Aragón-Calvo, M. A., Platen, E., van de Weygaert, R., & Szalay, A. S. 2010, ApJ, 723, 364
  13. Beers, T. C., Flynn, K., & Gebhardt, K. 1990, AJ, 100, 32
  14. Biviano, A., Murante, G., Borgani, S., et al. 2006, A&A, 456, 23
  15. Bolton, A. S., Schlegel, D. J., Aubourg, É., et al. 2012, AJ, 144, 144
  16. Bond, J. R., & Szalay, A. S. 1983, ApJ, 274, 443
  17. Bond, J. R., Kofman, L., & Pogosyan, D. 1996, Nature, 380, 603
  18. Bond, N. A., Strauss, M. A., & Cen, R. 2010, MNRAS, 409, 156
  19. Brinchmann, J., Charlot, S., White, S. D. M., et al. 2004, MNRAS, 351, 1151
  20. Cautun, M., van de Weygaert, R., & Jones, B. J. T. 2013, MNRAS, 429, 1286
  21. Cen, R., & Ostriker, J. P. 1999, ApJ, 514, 1
  22. Chen, Y.-C., Ho, S., Freeman, P. E., Genovese, C. R., & Wasserman, L. 2015, MNRAS, 454, 1140
  23. Chen, Y.-C., Ho, S., Brinkmann, J., et al. 2016, MNRAS, 461, 3896
  24. Chen, Y.-C., Ho, S., Mandelbaum, R., et al. 2017, MNRAS, 466, 1880
  25. Chow-Martinez, M., Andernach, H., Caretta, C. A., & Trejo-Alonso, J. J. 2014, MNRAS, 445, 4073
  26. Colless, M., Dalton, G., Maddox, S., et al. 2001, MNRAS, 328, 1039
  27. Costa-Duarte, M. V., Sodré, L., Jr, & Durret, F. 2011, MNRAS, 411, 1716
  28. Cybulski, R., Yun, M. S., Fazio, G. G., & Gutermuth, R. A. 2014, MNRAS, 439, 3564
  29. Darvish, B., Mobasher, B., Sobral, D., Scoville, N., & Aragon-Calvo, M. 2015, ApJ, 805, 121
  30. Davis, M., Huchra, J., Latham, D. W., & Tonry, J. 1982, ApJ, 253, 423
  31. Dijkstra, E. W. 1959, Numer. Math., 1, 269
  32. Doroshkevich, A. G., & Khlopov, M. I. 1984, MNRAS, 211, 277
  33. Dupuy, A., Courtois, H. M., Dupont, F., et al. 2019, MNRAS, 489, L1
  34. Eckert, D., Ettori, S., Pointecouteau, E., et al. 2017, Astron. Nachr., 338, 293
  35. Einasto, M., Einasto, J., Tago, E., Müller, V., & Andernach, H. 2001, AJ, 122, 2222
  36. Einasto, M., Saar, E., Martínez, V. J., et al. 2008, ApJ, 685, 83
  37. Einasto, M., Lietzen, H., Tempel, E., et al. 2014, A&A, 562, A87
  38. Einasto, J., Suhhonenko, I., Liivamägi, L. J., & Einasto, M. 2019, A&A, 623, A97
  39. Eisenstein, D. J., Zehavi, I., Hogg, D. W., et al. 2005, ApJ, 633, 560
  40. Gallazzi, A., Bell, E. F., Wolf, C., et al. 2009, ApJ, 690, 1883
  41. Gavazzi, G., Fumagalli, M., Cucciati, O., & Boselli, A. 2010, A&A, 517, A73
  42. González, R. E., & Padilla, N. D. 2010, MNRAS, 407, 1449
  43. Graham, R. L., & Hell, P. 1985, Ann. History Comput., 7, 43
  44. Guennou, L., Adami, C., Durret, F., et al. 2014, A&A, 561, A112
  45. Guglielmo, V., Poggianti, B. M., Vulcani, B., et al. 2018, A&A, 620, A7
  46. Huchra, J. P., Macri, L. M., Masters, K. L., et al. 2012, ApJS, 199, 26
  47. Huertas-Company, M., Aguerri, J. A. L., Bernardi, M., Mei, S., & Sánchez Almeida, J. 2011, A&A, 525, A157
  48. Kauffmann, G., Heckman, T. M., White, S. D. M., et al. 2003, MNRAS, 341, 33
  49. Klypin, A. A., Trujillo-Gomez, S., & Primack, J. 2011, ApJ, 740, 102
  50. Kopylova, F. G., & Kopylov, A. I. 2006, Astron. Lett., 32, 84
  51. Kraljic, K., Arnouts, S., Pichon, C., et al. 2018, MNRAS, 474, 547
  52. Kraljic, K., Pichon, C., Dubois, Y., et al. 2019, MNRAS, 483, 3227
  53. Krause, M. O., Ribeiro, A. L. B., & Lopes, P. A. A. 2013, A&A, 551, A143
  54. Kuutma, T., Tamm, A., & Tempel, E. 2017, A&A, 600, L6
  55. Lauer, T. R., Postman, M., Strauss, M. A., Graves, G. J., & Chisari, N. E. 2014, ApJ, 797, 82
  56. Libeskind, N. I., van de Weygaert, R., Cautun, M., et al. 2018, MNRAS, 473, 1195
  57. Liivamägi, L. J., Tempel, E., & Saar, E. 2012, A&A, 539, A80
  58. Lintott, C. J., Schawinski, K., Slosar, A., et al. 2008, MNRAS, 389, 1179
  59. Lintott, C., Schawinski, K., Bamford, S., et al. 2011, MNRAS, 410, 166
  60. Luparello, H., Lares, M., Lambas, D. G., & Padilla, N. 2011, MNRAS, 415, 964
  61. Malavasi, N., Arnouts, S., Vibert, D., et al. 2017, MNRAS, 465, 3817
  62. Miller, C. J., Nichol, R. C., Reichart, D., et al. 2005, AJ, 130, 968
  63. Murtagh, F., & Contreras, P. 2011, Wiley Interdisciplinary Rev.: Data Mining Knowl. Discovery, 2, 86
  64. Murtagh, F., & Legendre, P. 2014, J. Classification, 31, 274
  65. Musso, M., Cadiou, C., Pichon, C., et al. 2018, MNRAS, 476, 4877
  66. Peebles, P. J. E. 1980, The Large-scale Structure of the Universe (Princeton University Press)
  67. Planck Collaboration VIII. 2013, A&A, 550, A134
  68. Platen, E., van de Weygaert, R., Jones, B. J. T., Vegter, G., & Calvo, M. A. A. 2011, MNRAS, 416, 2494
  69. Poudel, A., Heinämäki, P., Nurmi, P., et al. 2016, A&A, 590, A29
  70. Rines, K., Geller, M. J., Kurtz, M. J., & Diaferio, A. 2003, AJ, 126, 2152
  71. Rines, K., Diaferio, A., & Natarajan, P. 2007, ApJ, 657, 183
  72. Santanu, R. S. 2014, Graph Theory with Algorithms and Its Applications: In Applied Science and Technology (Springer Publishing Company, Incorporated)
  73. Santiago-Bautista, I., Caretta, C. A., Bravo-Alfaro, H., Pointecouteau, E., & Madrigal, F. 2019, ArXiv e-prints [arXiv:2001.03209]
  74. Scoville, N., Arnouts, S., Aussel, H., et al. 2013, ApJS, 206, 3
  75. Serna, A., & Gerbal, D. 1996, A&A, 309, 65
  76. Shandarin, S. F., & Zel’dovich, Y. B. 1989, Rev. Mod. Phys., 61, 185
  77. Smargon, A., Mandelbaum, R., Bahcall, N., & Niederste-Ostholt, M. 2012, MNRAS, 423, 856
  78. Smith, A. G., Hopkins, A. M., Hunstead, R. W., & Pimbblet, K. A. 2012, MNRAS, 422, 25
  79. Sousbie, T. 2011, MNRAS, 414, 350
  80. Sousbie, T., Pichon, C., Courtois, H., Colombi, S., & Novikov, D. 2008, ApJ, 672, L1
  81. Springel, V., White, S. D. M., Jenkins, A., et al. 2005, Nature, 435, 629
  82. Strauss, M. A., Weinberg, D. H., Lupton, R. H., et al. 2002, AJ, 124, 1810
  83. Tanaka, M., Hoshi, T., Kodama, T., & Kashikawa, N. 2007, MNRAS, 379, 1546
  84. Tanimura, H., Aghanim, N., Douspis, M., Beelen, A., & Bonjean, V. 2019, A&A, 625, A67
  85. Tempel, E., Tago, E., & Liivamägi, L. J. 2012, A&A, 540, A106
  86. Tempel, E., Stoica, R. S., Martínez, V. J., et al. 2014, MNRAS, 438, 3465
  87. Theodoridis, S., & Koutroumbas, K. 2009, in Pattern Recognition, 4th edn., eds. T. Sergios, & K. Konstantinos (Boston: Academic Press), 595
  88. Theodoridis, S., Pikrakis, A., Koutroumbas, K., & Cavouras, D. 2010, Introduction to Pattern Recognition: A Matlab Approach (Boston: Academic Press)
  89. Tremonti, C. A., Heckman, T. M., Kauffmann, G., et al. 2004, ApJ, 613, 898
  90. Tully, R. B., Courtois, H., Hoffman, Y., & Pomarède, D. 2014, Nature, 513, 71
  91. Ueda, H., & Itoh, M. 1997, PASJ, 49, 131
  92. Ursino, E., Galeazzi, M., Gupta, A., et al. 2015, ApJ, 806, 211
  93. Vogelsberger, M., Genel, S., Springel, V., et al. 2014, Nature, 509, 177
  94. Voronoi, G. 1908, J. Reine Angew. Math., 134, 198
  95. Wang, P., Luo, Y., Kang, X., et al. 2018, ApJ, 859, 115

Appendix A: Additional table

Table A.1.

Main properties of the filaments extracted through GFiF.


All Figures

thumbnail Fig. 1.

Distribution of mean volume densities (see fourth column of Table 2) for the 46 superclusters in our sample as function of redshift (blue points). The red line corresponds to the best fit of a power-law function. Residuals of the fitting are shown in the bottom panel. MSCC 579 and MSCC 55 were excluded from the fitting.

In the text
thumbnail Fig. 2.

Flow chart of the GSyF (left side) and GFiF (right side) algorithms.

In the text
thumbnail Fig. 3.

Representation of a filament. Graph nodes are represented by white circles and edges by dark lines. The five systems connected are represented by a dotted circle of radius Rvir. A bridge connecting two systems is represented as a bold black line. The distance from galaxies to the filament (bold dashed line) is measured along a line perpendicular to the edges.

In the text
thumbnail Fig. 4.

Illustration of the steps of the GFiF algorithm. In the first box (top-left) one can see the distribution of galaxies. In the second one (top-right) the HC groups are marked, with denser red colors representing the richer HC groups. The filtered edges (links) among the groups of the spanning tree are displayed in the third box (bottom-left). The last box (bottom-right) presents the systems (green circles), bridges (brown lines), and other links (blue lines) found among the groups of the preceding step.

In the text
thumbnail Fig. 5.

Three-dimensional distribution of galaxies for the MSCC 310 supercluster volume. Top: Galaxy positions before the application of the FoG correction. Bottom: Galaxy positions after correction for FoG effects. The colors represent density as calculated from 3D VT. The highest density is represented in red while lower densities are represented in green to blue.

In the text
thumbnail Fig. 6.

Results of the GFiF algorithm for the MSCC 310 supercluster volume. Upper panel: dendrogram with the nine detected filaments represented by different colors. The y axis of the dendrogram plot indicates the distance at each level of the tree. Lower panel: RA × Z distribution, where SDSS galaxies are represented by gray points and filaments are represented by lines according to the colors in the upper panel. Tendrils are represented by gray lines.

In the text
thumbnail Fig. 7.

Projected distribution in the sky of systems detected by GSyF. Top: systems detected for the MSCC 310 (UMa supercluster). Bottom: systems detected for the MSCC 295 (Coma supercluster). The system radii are shown as circles of r = Rvir. For comparison, the positions of systems reported by MSPM, T11, C4, L14, and Abell catalogs are depicted by color points: blue, green, pink, cyan, and red, respectively.

Fig. 8.

Comparison of MSCC 310 supercluster GSyF system richness against the richness measured by other catalogs for the matching systems. Symbol colors are the same as the ones in Fig. 7. The dashed line represents the identity.

Fig. 9.

Comparison of GFiF filaments for the MSCC 310 supercluster to the T14 filaments in the same region for the SDSS-DR8. Gray lines are T14 filaments. Colored lines depict filaments identified in this work.

Fig. 10.

Top: RA × Dec projected density map as measured from 3D-KDE with 1Σ in terms of the density contrast. The GSyF systems are represented by white circles with radius scaled to the estimated Rvir. Bottom: RA × Z projection. The filaments detected in this work are overlaid in color. Density is represented following the color scale displayed on the right, where denser regions are redder and less dense zones are bluer.

Fig. 11.

Distribution of filament skeleton length for the 144 filaments detected by GFiF. The length used corresponds to the longest path between the systems at the extremity of the filament. See Table A.1.
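The skeleton length defined in this caption, the longest path between the systems at the extremities of a filament, can be sketched as the diameter of a weighted tree. This is an illustrative sketch under stated assumptions, not the GFiF implementation: the names `farthest` and `skeleton_length` are hypothetical, the skeleton is assumed to be a connected tree (no cycles), and each edge is given as `(node_u, node_v, length)`.

```python
def farthest(adj, start):
    """Return (node, distance) for the node farthest from `start` in a tree."""
    best = (start, 0.0)
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if dist > best[1]:
            best = (node, dist)
        for nxt, weight in adj[node]:
            if nxt != parent:  # sufficient in a tree, where there are no cycles
                stack.append((nxt, node, dist + weight))
    return best

def skeleton_length(edges):
    """Longest path length in a weighted tree, found with two traversals."""
    adj = {}
    for u, v, weight in edges:
        adj.setdefault(u, []).append((v, weight))
        adj.setdefault(v, []).append((u, weight))
    # The node farthest from any start node is one end of the longest path.
    end, _ = farthest(adj, next(iter(adj)))
    _, length = farthest(adj, end)
    return length
```

The two-pass traversal is only valid for trees; a skeleton containing loops would need a different approach.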

Fig. 12.

Distribution of galaxies along bridges connecting pairs of systems for the nine MSCC 310 filaments. All bridges are scaled to length 1.0.

Fig. 13.

Longitudinal VT density distribution for galaxies in all bridges of filaments detected by GFiF. Profiles are considered from the system center to the middle of the bridge. The thick blue line depicts the mean longitudinal profile for bridges including galaxies in systems. The thick red line corresponds to the mean longitudinal profile for all filaments excluding galaxies belonging to systems within 1.5 Rvir. Blue and red shaded areas are the dispersion around the stacked profile.

Fig. 14.

Stacked number density profiles for the 144 filaments identified by GFiF. Individual profiles are represented by thin gray lines. Top: the red line corresponds to the mean local density (stacked) profile. Bottom: mean VT density stacked profile. The solid line indicates the mean profile, while the shaded area represents the dispersion of the profile. The solid black line depicts the density contrast of 10 × d.

Fig. 15.

Top: distribution of radius of filaments in our sample. Bottom: comparison of filament length and radius for the 144 filaments detected by GFiF. The length used corresponds to the longest path between a pair of systems, that is, the skeleton length. See Table A.1.

Fig. 16.

Stacked transversal stellar mass profile for the 144 filaments detected by GFiF. Errors correspond to the variance of the stacked profiles.

Fig. 17.

Top: stacked transversal morphological type profiles for the 144 filaments detected by GFiF. The error bars correspond to the variance of the stacked profiles. Bottom: early-to-late-type ratio as a function of the distance to the filament skeleton.

Fig. 18.

Stacked transversal activity type profiles for the 144 filaments detected by GFiF. The black bar on the left represents the typical errors on the stacked profiles, not overlaid for clarity.
